1 comment

  • dengolius 1 day ago
    This article discusses why TiDB, a distributed SQL database, migrated its observability platform from Prometheus to VictoriaMetrics.

    The Problem with Prometheus

    At scale, Prometheus started showing limitations, especially for large enterprise customers like Pinterest. The main issues were:

    - High resource consumption: Prometheus used a lot of CPU and memory, leading to frequent out-of-memory (OOM) crashes.
    - Long recovery times: After a crash, Prometheus needed a long time to recover, sometimes failing altogether.
    - Limited query performance: Large queries would often fail or be very slow.

    The Solution: VictoriaMetrics

    TiDB switched to VictoriaMetrics and saw significant improvements:

    - Better resource utilization: CPU and memory usage dropped significantly, eliminating OOM crashes.
    - Improved query performance: Large queries that previously failed in Prometheus now run efficiently in VictoriaMetrics.
    - Lower costs: Reduced resource consumption and better storage efficiency led to lower operational costs.