Myrouter provides comprehensive monitoring metrics for LLM API usage, giving you in-depth insight into the availability and performance of your LLM API requests. You can view these metrics on the LLM Monitoring page.

Metric Descriptions

All metrics below are broken down by model and sampled at one-minute granularity. Depending on the time interval you select, a data point may not be displayed for every minute; in that case, the minute-level data points within each displayed interval are averaged.
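The averaging described above can be sketched as follows. This is a hypothetical illustration of the downsampling behavior, not Myrouter's actual implementation; the function name and sample values are invented for demonstration.

```python
# Illustrative sketch: minute-level samples averaged into wider
# display intervals when the selected time range spans many minutes.

def downsample(minute_points, bucket_minutes):
    """Average minute-level samples into buckets of `bucket_minutes`."""
    buckets = []
    for i in range(0, len(minute_points), bucket_minutes):
        chunk = minute_points[i:i + bucket_minutes]
        buckets.append(sum(chunk) / len(chunk))
    return buckets

# Six minute-level RPM samples displayed at a 3-minute interval:
rpm_per_minute = [120, 130, 110, 90, 100, 95]
print(downsample(rpm_per_minute, 3))  # [120.0, 95.0]
```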
  • Requests Per Minute (RPM) Shows the number of API requests made per minute, helping you understand usage patterns and API concurrency levels.
  • Request Success Rate Shows the percentage of successful API responses (non-5xx status codes) per minute, reflecting API availability.
  • Average Tokens Per Request Shows the average number of input and output tokens per request per minute, helping you understand token consumption patterns.
  • End-to-End (E2E) Latency Shows the total time for the model to generate a complete response, for requests in each minute. Includes p99, p95, and average latency metrics.
  • Time to First Token (TTFT)
    Tracked only for streaming requests (those with the stream=true parameter enabled).
    Shows the time required to process the prompt and generate the first output token, for requests in each minute. Includes p99, p95, and average latency metrics.
  • Time Per Output Token (TPOT)
    Tracked only for streaming requests (those with the stream=true parameter enabled).
    Shows the average time between consecutive output tokens, for requests in each minute. Includes p99, p95, and average latency metrics.
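To make the latency metrics above concrete, here is a minimal client-side sketch of how TTFT and TPOT can be derived from token arrival timestamps in a streaming response, and how average/p95/p99 summaries like those shown on the monitoring page can be computed. The function names, timestamps, and percentile method are illustrative assumptions, not Myrouter's actual aggregation code.

```python
import statistics

def ttft_and_tpot(request_start, token_timestamps):
    """TTFT = time from request start to the first token;
    TPOT = average gap between consecutive output tokens."""
    ttft = token_timestamps[0] - request_start
    gaps = [b - a for a, b in zip(token_timestamps, token_timestamps[1:])]
    tpot = sum(gaps) / len(gaps) if gaps else 0.0
    return ttft, tpot

def summarize(latencies):
    """Average, p95, and p99 over one minute of request latencies."""
    qs = statistics.quantiles(latencies, n=100)  # qs[94] ~ p95, qs[98] ~ p99
    return {"avg": statistics.mean(latencies), "p95": qs[94], "p99": qs[98]}

# Hypothetical token arrival times (seconds since request start):
ttft, tpot = ttft_and_tpot(0.0, [0.42, 0.47, 0.52, 0.58])
print(f"TTFT={ttft:.2f}s TPOT={tpot * 1000:.0f}ms")
```

Because TTFT and TPOT depend on observing individual token arrivals, they can only be measured when the response is streamed, which matches the stream=true restriction noted above.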