General Documentation

Overview

This document explains how the various parts of Metrist's offering fit together, especially the shared code that is run on-premises.

Terminology

Metrist Agent

The Metrist Agent (MA), referred to as the CMA in the rest of this document, is the cornerstone of our on-premises offering. It has two main functions:

Private synthetic monitoring

Using an API key, the Metrist Agent fetches customer-specific scheduling configuration from the Metrist back-end and uses it to decide which synthetic monitors to run. This happens roughly every ten seconds. For each monitor that is scheduled to run, the agent looks up the last time the monitor reported back (again fetching this from the Metrist back-end) and decides whether enough time has elapsed to warrant a new run. If so, the monitor is run. Measurements collected by the monitor are then sent to the Metrist back-end for further analysis.
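The "has enough time elapsed" decision described above can be sketched as follows. This is a minimal illustration, not the agent's actual code: the `MonitorSchedule` shape, the `interval` field, and the rule that a monitor which has never reported runs immediately are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class MonitorSchedule:
    """Hypothetical shape of one entry in the fetched scheduling configuration."""
    monitor_id: str
    interval: timedelta  # assumed: minimum time between two runs of this monitor


def is_due(schedule: MonitorSchedule,
           last_report: Optional[datetime],
           now: datetime) -> bool:
    """Return True when enough time has elapsed to warrant a new run."""
    if last_report is None:
        # Assumption: a monitor that has never reported back is run right away.
        return True
    return now - last_report >= schedule.interval


def monitors_to_run(schedules: List[MonitorSchedule],
                    last_reports: Dict[str, datetime],
                    now: datetime) -> List[str]:
    """Select the monitors that should run during this polling cycle."""
    return [s.monitor_id for s in schedules
            if is_due(s, last_reports.get(s.monitor_id), now)]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    schedules = [MonitorSchedule("vendor-a", timedelta(minutes=2)),
                 MonitorSchedule("vendor-b", timedelta(minutes=2))]
    # In the real agent, these timestamps would come from the back-end.
    last_reports = {"vendor-a": now - timedelta(minutes=5),
                    "vendor-b": now - timedelta(seconds=30)}
    print(monitors_to_run(schedules, last_reports, now))
```

Because the last-report time is read from the back-end rather than kept locally, several agent instances sharing one configuration would reach the same decision for a given monitor.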

The goals of private synthetic monitoring are three-fold:

  1. To provide insight into how vendor APIs behave from the customer's premises. The shared monitors run from a number of geographically distributed locations, but these locations will not necessarily match where customer interactions with the vendor happen, and network issues can often result in highly local outages.
  2. To provide insight into how vendor APIs react to the customer's exact data. Shared monitors run with "dummy" data, often on vendor accounts dedicated to monitoring. It is quite possible that data sizes influence how a vendor API behaves, so using a production API key for monitoring may reveal different behaviour. By employing private monitoring, control of production API keys and production data can stay where it should be: on-prem. Note that the monitor is still synthetic (it runs "fake" transactions), but it operates on "real" data, which can have a large performance impact.
  3. To facilitate monitoring for vendor APIs that are not supported by Metrist. Using the shared source for the CMA, customers can build their own monitors and run them through the agent.

The CMA comes bundled with all monitors that Metrist supports for private monitoring. It is the configuration document, however, that decides which monitors are run. By storing this configuration document centrally, multiple instances of the customer agent can run with the same configuration. By sending measurements back to the Metrist back-end, measurements can be aggregated, compared with public measurements, and trigger notifications in the same way as notifications for shared monitors are triggered.

In-process monitoring

For certain production processes, Metrist supplies "In-Process Agents" (IPAs) that intercept outgoing API calls (or, more often, outgoing HTTP calls). How this happens depends on the actual stack; two current examples are:

More agents may follow, but they will all share the characteristics of the current agents:

Note that with this setup, customer-operated software is still in full control over what is sent to Metrist's backend: while the IPA typically operates in "firehose" mode, observing and forwarding all outgoing API calls to the CMA, the CMA acts as a filter to only send data that is clean, free of sensitive information, and expected to be sent to Metrist's backend.

IPA/CMA protocol

Data flows from IPA to CMA over UDP, by default on port 51712 (this is configurable). As indicated above, UDP was chosen because of the "fire-and-forget" characteristics of sending datagrams: there is no wait for acknowledgement, not even a check whether there is actually a process to receive them, so there is no chance that the sending process will block or otherwise be delayed. While data can be lost, we consider that a small price to pay for predictable performance of the IPA.
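To illustrate the fire-and-forget behaviour, here is a minimal sketch of how an in-process sender could emit one datagram. It is not taken from any Metrist IPA: the function name, the `127.0.0.1` default host, and the choice to swallow socket errors are assumptions for the example. Only the default port comes from the text above.

```python
import socket

DEFAULT_PORT = 51712  # default port named in the text; configurable


def send_fire_and_forget(payload: bytes,
                         host: str = "127.0.0.1",  # assumed: agent runs on the same host
                         port: int = DEFAULT_PORT) -> bool:
    """Send one UDP datagram without waiting for any acknowledgement.

    Returns False instead of raising, so that a telemetry failure can
    never break or delay the calling production code.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.setblocking(False)  # never wait, even if the send buffer is full
            sock.sendto(payload, (host, port))
        return True
    except OSError:
        # Covers BlockingIOError and other socket errors; the datagram is dropped.
        return False


if __name__ == "__main__":
    # Succeeds whether or not anything is listening on the port.
    send_fire_and_forget(b"example payload\n")
```

The payload is treated as opaque bytes here; the sender does no parsing and keeps no state between calls, which matches the goal of keeping the IPA as simple as possible.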

Again, to keep the IPA code simple, the expected format is trivial. Currently two formats are supported: the Ruby and PHP IPAs "see" differently formatted data, and rather than burdening them with parsing or combining URL fragments, the data is shipped "as is" to the CMA so that the IPA code can stay as simple as possible. This trend may continue, with more format variations added to accommodate specific use cases.

An IPA message consists of fields, separated by tab characters (ASCII code 9) and terminated by a line feed (ASCII code 10). This format is simple to construct, simple to parse, and human readable. The first field is always a version code, and it determines the rest of the payload. The current variants are: