Start here if traces feel confusing. This page explains the minimum you need to answer questions like “who is calling this service?”, “which endpoints are busiest?”, and “what is failing?”.
The 30-second version
- A trace is one request or one network call.
- client.service.name is the service that made the call.
- server.service.name is the service that received the call.
- http.path is the endpoint or route.
- count(traces) means “how many calls happened?”
- by (...) means “split that total into buckets.”
First: switch to MetoroQL mode
Most of the examples below are written in MetoroQL first, because it is the clearest way to express trace questions.
- In Trace Search, click the <> icon at the right end of the search bar. The tooltip says Switch to MetoroQL mode.
- In Metric Explorer, open the mode selector at the top-right of the query row and switch from Standard mode to MetoroQL mode.
- If you want to come back later, use the same controls to switch back to Standard mode. Switching between Standard and MetoroQL modes converts the query automatically.
The “Standard Mode” examples later on refer to Metric Explorer with trace data selected. Trace Search standard mode is for browsing individual traces, not for grouped time-series questions.
One request, pictured simply
Keep this picture in your head for almost every traces question:
Incoming vs outgoing, in plain English
If service-a calls service-b on /api/dev:
- From service-a’s point of view, that is an outgoing call.
- From service-b’s point of view, that is an incoming call.
- In Metoro, the caller is the client.service.name.
- In Metoro, the receiver is the server.service.name.
client does not necessarily mean a browser or end user. It means “the thing that started the request.”
server does not necessarily mean a VM or physical server. It means “the thing that received the request.”
- Want outgoing calls from a service: filter on client.service.name
- Want incoming calls to a service: filter on server.service.name
Which field do I filter?
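
To make the rule concrete, here is a minimal Python sketch that applies the same two filters to a handful of in-memory traces. The trace records and the helper names `outgoing_calls` and `incoming_calls` are hypothetical and exist only for illustration; they are not part of Metoro. Only the field names come from this page.

```python
# Hypothetical in-memory traces. The field names mirror the trace attributes
# described on this page; the values are invented for illustration.
traces = [
    {"client.service.name": "service-a", "server.service.name": "service-b", "http.path": "/api/dev"},
    {"client.service.name": "service-a", "server.service.name": "service-c", "http.path": "/health"},
    {"client.service.name": "service-c", "server.service.name": "service-a", "http.path": "/api/orders"},
]


def outgoing_calls(traces, service):
    """Calls that `service` made: it appears as the client."""
    return [t for t in traces if t["client.service.name"] == service]


def incoming_calls(traces, service):
    """Calls that `service` received: it appears as the server."""
    return [t for t in traces if t["server.service.name"] == service]


print(len(outgoing_calls(traces, "service-a")))  # calls service-a started
print(len(incoming_calls(traces, "service-a")))  # calls service-a received
```

The same service name gives different answers depending on which side you filter, which is why picking the field comes before anything else.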
What the main trace fields mean
| Field | Plain English |
|---|---|
| client.service.name | Who made the call |
| server.service.name | Who received the call |
| http.method | GET, POST, PUT, DELETE, etc. |
| http.path | Which endpoint was hit |
| http.status_code | The exact response code |
| http.status_code.bucket | The response code family, like 2XX or 5XX |
| metoro.is_server_span | Whether Metoro is showing the server side of the request |
Other useful trace attributes
| Field | Plain English |
|---|---|
| client.namespace | Kubernetes namespace that made the call |
| server.namespace | Kubernetes namespace that received the call |
| client.container.id | Exact caller pod/container |
| server.container.id | Exact receiver pod/container |
| server.net.host.name | Destination host or IP, especially useful for external calls |
| server.external | Whether the destination is outside your Kubernetes cluster |
| environment | Metoro environment |
| traceId | Unique ID for one trace |
| client.host.availability_zone | Availability zone the request came from |
| server.host.availability_zone | Availability zone that handled the request |
Which screen to use
| If you want to… | Use… |
|---|---|
| See who talks to who | Service Map |
| Inspect individual requests | The Traces page |
| Count requests, split by endpoint, or chart traffic | Metric Explorer in Standard Mode, or a MetoroQL query using count(traces) |
Group by, without the jargon
group by just means:
Do not give me one big total. Split the total into smaller buckets.
No group by: one total
Group by http.path: split by endpoint
http.path in the Group by control does the same thing as writing by (http.path) in MetoroQL.
Add more fields when you need narrower buckets. For example, by (http.method, http.path) splits GET /api/dev and POST /api/dev into separate rows.
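
If the idea of buckets is still abstract, the following Python sketch mimics it on a few made-up traces. It is an illustration of the concept under simple assumptions, not how Metoro evaluates queries; the `count_by` helper and the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical traces, reduced to the two fields used for grouping.
traces = [
    {"http.method": "GET", "http.path": "/api/dev"},
    {"http.method": "POST", "http.path": "/api/dev"},
    {"http.method": "GET", "http.path": "/api/dev"},
    {"http.method": "GET", "http.path": "/health"},
]


def count_by(traces, *fields):
    """Count traces per bucket, where a bucket is the tuple of the given field values."""
    return Counter(tuple(t[f] for f in fields) for t in traces)


# No group by: one total.
total = len(traces)

# Group by http.path: one row per endpoint.
by_path = count_by(traces, "http.path")

# Group by http.method and http.path: GET and POST on the same path are separate rows.
by_method_and_path = count_by(traces, "http.method", "http.path")

print(total)
print(dict(by_path))
print(dict(by_method_and_path))
```

Each extra field makes the buckets narrower, but the bucket counts always add up to the same total.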
Examples:
by (...) clause too.
Copy-paste questions and answers
Each example below has:
- a MetoroQL version you can paste directly
- a Standard Mode version you can build in Metric Explorer without writing the query yourself
Number of incoming calls to a service
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: leave empty
Number of incoming calls to a service by endpoint
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: http.path
Number of incoming calls to a service by method and endpoint
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: http.method, http.path
Number of outgoing calls from a service
- Timeseries data: trace
- Stat: request count
- Filters: client.service.name=<service>
- Group by: leave empty
Number of outgoing calls from a service by destination service
- Timeseries data: trace
- Stat: request count
- Filters: client.service.name=<service>
- Group by: server.service.name
Number of outgoing calls from a service by endpoint
- Timeseries data: trace
- Stat: request count
- Filters: client.service.name=<service>
- Group by: server.service.name, http.path
Which services are calling my service?
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: client.service.name
Which namespaces are calling my service?
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: client.namespace
All endpoint call counts in a namespace
- Timeseries data: trace
- Stat: request count
- Filters: server.namespace=<namespace>
- Group by: server.service.name, http.path
Which endpoints on my service are failing?
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>, http.status_code.bucket=5XX
- Group by: http.path
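
The failing-endpoints question combines two filters with one group by. As a rough illustration of that logic, the Python sketch below runs it over invented traces. The `status_bucket` and `failing_endpoints` helpers and the service names are hypothetical, and deriving the bucket by integer-dividing the status code is an assumption about how families like 5XX are formed.

```python
from collections import Counter


def status_bucket(status_code):
    """Map an exact status code to its family, e.g. 503 becomes "5XX" (assumed rule)."""
    return f"{status_code // 100}XX"


# Hypothetical traces for two services.
traces = [
    {"server.service.name": "checkout", "http.path": "/pay", "http.status_code": 500},
    {"server.service.name": "checkout", "http.path": "/pay", "http.status_code": 503},
    {"server.service.name": "checkout", "http.path": "/cart", "http.status_code": 200},
    {"server.service.name": "checkout", "http.path": "/cart", "http.status_code": 502},
    {"server.service.name": "search", "http.path": "/query", "http.status_code": 500},
]


def failing_endpoints(traces, service):
    """Count 5XX responses per endpoint for calls received by `service`."""
    return Counter(
        t["http.path"]
        for t in traces
        if t["server.service.name"] == service
        and status_bucket(t["http.status_code"]) == "5XX"
    )


failures = failing_endpoints(traces, "checkout")
print(failures.most_common())  # endpoints ordered from most to fewest failures
```

Successful calls and calls to other services drop out at the filter step, so the remaining buckets contain only failures on the service you asked about.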
Which endpoints on my service are slow?
- Timeseries data: trace
- Stat: p95 latency
- Filters: server.service.name=<service>
- Group by: http.path
Which containers in my service are handling traffic?
- Timeseries data: trace
- Stat: request count
- Filters: server.service.name=<service>
- Group by: server.container.id
Which external APIs is my service calling?
- Timeseries data: trace
- Stat: request count
- Filters: client.service.name=<service>, server.namespace=External Service
- Group by: server.service.name, http.path
Which external hosts is my service calling?
- Timeseries data: trace
- Stat: request count
- Filters: client.service.name=<service>, server.namespace=External Service
- Group by: server.net.host.name, http.path
A real mental model
If you are looking at your API service:
- server.service.name="<your-api>" means “requests coming into my API”
- client.service.name="<your-api>" means “requests my API made to something else”
If you do not know the exact service name
Service names in Metoro are often full Kubernetes-style names such as /k8s/default/checkout-service.
If you are not sure of the exact value, use a regex match:
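
To show why a regex helps with full Kubernetes-style names, here is a small Python sketch that matches a short fragment against a list of service names. The names other than /k8s/default/checkout-service and the `matching_services` helper are hypothetical. Whether Metoro anchors its regex matches at both ends is not stated on this page, so the pattern is written as `.*checkout.*`, which behaves the same either way.

```python
import re

# Hypothetical full service names in the Kubernetes-style format.
service_names = [
    "/k8s/default/checkout-service",
    "/k8s/payments/checkout-worker",
    "/k8s/default/search-service",
]


def matching_services(names, pattern):
    """Return the names that the pattern matches in full."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Knowing only the fragment "checkout" is enough to find the full names.
matches = matching_services(service_names, r".*checkout.*")
print(matches)
```

Once the regex shows you the exact name, switching back to an exact match keeps later queries unambiguous.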
When http.path is not enough
http.path is best for HTTP traffic. For other protocols, use a more useful field for that protocol:
- span.name for a generic span name
- db.operation for database calls
- server.service.name to see which downstream service was hit
Beginner workflow
Start with the service you care about
Decide whether you care about traffic coming into the service or going out of the service.
Pick the right side
Use server.service.name for incoming traffic. Use client.service.name for outgoing traffic.
Add one split at a time
Start with by (http.path). If that is too broad, use by (http.method, http.path) or by (server.service.name, http.path).
Related docs
Traces Overview
Understand how trace data is collected and shown in Metoro
Service Map
See who talks to who before drilling into specific requests
MetoroQL
Learn the query syntax behind the examples on this page
