Tokyo, Apr 19th, 03:00 - 03:20 PM JST, Room C

Prometheus Remote Write is the protocol used to send metrics from Prometheus (and friends) to compatible remote storage endpoints. Remote Write is generally used for long-term storage, centralization, and cloud services. It also allows users to run Prometheus in agent mode, reducing local storage requirements.

At Grafana Labs we host metrics storage for thousands of customers using Grafana Mimir. We also run a Mimir cluster for ourselves as a central operations tool. We write all Prometheus metrics from our various production Kubernetes clusters to this operations Mimir cluster, which amounts to roughly a gigabyte per second of compressed data over the network. This data is sent across regions and cloud providers to the central cluster. At our scale, network egress can become a substantial cost factor.
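To give a feel for the scale, here is a rough back-of-the-envelope sketch in Python. It takes the "roughly a gigabyte per second" figure at face value, assumes a 30-day month and decimal gigabytes, and uses a placeholder price per gigabyte. That price is purely hypothetical and is not a quoted rate from any cloud provider; the function names are illustrative as well.

```python
SECONDS_PER_DAY = 86_400


def monthly_egress_gb(rate_gb_per_s: float, days: int = 30) -> float:
    """Total volume in GB sent over `days` days at a constant rate."""
    return rate_gb_per_s * SECONDS_PER_DAY * days


def after_reduction(volume_gb: float, reduction: float) -> float:
    """Volume left after cutting egress by `reduction` (0.6 means 60%)."""
    return volume_gb * (1.0 - reduction)


def monthly_cost(volume_gb: float, price_per_gb: float) -> float:
    """Egress cost for a given volume at a flat per-GB price."""
    return volume_gb * price_per_gb


if __name__ == "__main__":
    HYPOTHETICAL_PRICE_PER_GB = 0.02  # placeholder, not a real provider rate

    baseline = monthly_egress_gb(1.0)
    reduced = after_reduction(baseline, 0.6)

    print(f"baseline volume: {baseline:,.0f} GB/month")
    print(f"reduced volume:  {reduced:,.0f} GB/month")
    print(f"baseline cost:   {monthly_cost(baseline, HYPOTHETICAL_PRICE_PER_GB):,.0f}")
    print(f"reduced cost:    {monthly_cost(reduced, HYPOTHETICAL_PRICE_PER_GB):,.0f}")
```

Under these assumptions the baseline is about 2.6 petabytes per month, so even a small per-gigabyte charge adds up quickly, and any percentage saved on the wire translates directly into the bill.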

In this talk you will learn about the Prometheus Remote Write format and the Prometheus storage format. We will use the Remote Write format as an example of how structuring a wire format can reduce the required network egress bandwidth even when compression is already in use. More specifically, you'll see how you can reuse concepts from database design, in particular the Prometheus TSDB index, to reduce egress bytes by as much as 60%.
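As a rough illustration of the kind of idea involved, the sketch below assumes the relevant concept is a symbol table like the one in the Prometheus TSDB index: each distinct label name or value is stored once, and series refer to it by position. This is not the actual Remote Write wire format; the function names and the sample series are hypothetical.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def intern_series(series: List[Labels]) -> Tuple[List[str], List[List[int]]]:
    """Replace label strings with indexes into a shared symbol table.

    Returns (symbols, refs). Each entry in refs is a flat list of
    [name_idx, value_idx, name_idx, value_idx, ...] for one series.
    """
    symbols: List[str] = []
    index_of: Dict[str, int] = {}

    def ref(s: str) -> int:
        # Store each distinct string once and hand back its position.
        if s not in index_of:
            index_of[s] = len(symbols)
            symbols.append(s)
        return index_of[s]

    refs: List[List[int]] = []
    for labels in series:
        row: List[int] = []
        for name, value in sorted(labels.items()):
            row.append(ref(name))
            row.append(ref(value))
        refs.append(row)
    return symbols, refs


def restore_series(symbols: List[str], refs: List[List[int]]) -> List[Labels]:
    """Inverse of intern_series: rebuild the label sets from the indexes."""
    return [
        {symbols[row[i]]: symbols[row[i + 1]] for i in range(0, len(row), 2)}
        for row in refs
    ]


if __name__ == "__main__":
    series = [
        {"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.1:9090"},
        {"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.2:9090"},
    ]
    symbols, refs = intern_series(series)
    print(symbols)
    print(refs)
    assert restore_series(symbols, refs) == series
```

In this toy input the two series share every string except the instance value, so the second series adds only one new symbol. A general-purpose compressor also removes repetition, but a structured table like this avoids sending the repeated strings to the compressor in the first place.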


Outline/Structure of the Talk

We'll cover:

  • What Prometheus is
  • What Prometheus Remote Write is
  • The design goals of Remote Write and how they influenced the current wire format
  • Some details about Grafana Labs' Remote Write usage, and a look at how cloud providers charge for network egress
  • Why the current wire format is inefficient
  • An overview of Prometheus' storage format and its index, and how that can be applied to Remote Write's wire format

Learning Outcome

From the abstract: In this talk you will learn about the Prometheus Remote Write format and the Prometheus storage format. We will use the Remote Write format as an example of how structuring a wire format can reduce the required network egress bandwidth even when compression is already in use. More specifically, you'll see how you can reuse concepts from database design, in particular the Prometheus TSDB index, to reduce egress bytes by as much as 60%.

Target Audience

Developers and operators who are using, or are interested in using, Prometheus.

Prerequisites for Attendees

Knowledge of what metrics and Prometheus are, and of how a time series database stores data, would be useful but is not necessary.
