Architecture

Simply stated, Hotrod is a platform for creating and running data processing pipelines. There are five components at the core of Hotrod, and this section explains the role of each.

Pipes

Pipes are the foundation of the platform. A Pipe is a data processing pipeline, which represents a specific workload or job.

A Pipe reads data from a source, optionally applies a series of transformations to that data, and finally writes the result to a destination. Pipes are defined with the Pipe Language.

The Pipe Language

The Pipe Language is a YAML-based syntax and grammar for defining Pipes. A simple Pipe may look like this when defined in the Pipe Language:

name: simple-HTTP-example

input:
  http-poll:
    address: "https://httpbin.org/get"
    raw: true

actions:
  - rename:
      fields:
        - origin: source_ip

output:
  file:
    path: "/tmp/output.log"

This Pipe will:

  1. Read a JSON response from a URI, via HTTP
  2. Rename the response field origin to source_ip
  3. Write the response to a file
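The three steps above can be sketched in ordinary code to show what the Pipe definition abstracts away. This is only an illustration, not how the Pipe Runtime is implemented: the names `rename_fields` and `run_pipe` are hypothetical, and a hard-coded sample response stands in for the HTTP request so that the sketch runs without network access.

```python
import json
from typing import Callable, Iterable, List

Event = dict


def rename_fields(event: Event, mapping: dict) -> Event:
    """Return a copy of the event with keys renamed according to mapping."""
    return {mapping.get(key, key): value for key, value in event.items()}


def run_pipe(source: Iterable[Event],
             actions: List[Callable[[Event], Event]],
             sink: Callable[[str], None]) -> None:
    """Read each event, apply every action in order, then write one JSON line."""
    for event in source:
        for action in actions:
            event = action(event)
        sink(json.dumps(event))


# Stand-in for the HTTP response (hypothetical values); a real Pipe would
# poll the configured address instead.
sample_response = [{"origin": "203.0.113.7", "url": "https://httpbin.org/get"}]

# Stand-in for the output file: collect the written lines in a list.
written: List[str] = []

run_pipe(
    sample_response,
    [lambda event: rename_fields(event, {"origin": "source_ip"})],
    written.append,
)
print(written[0])
```

The Pipe definition expresses the same intent in a dozen lines of YAML, with no explicit loop, serialization, or I/O handling.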

All Pipes follow this basic pattern. In the Pipe Language, the emphasis is on the outcome, not low-level implementation details.

For detailed documentation of the Pipe Language, see the Pipe Language Reference.

The Pipe Runtime

While the Pipe Language provides a concise, consistent means of defining Pipes, the Pipe Runtime is responsible for executing those Pipe definitions in a robust manner. Low-level details like reading data from files, databases, queues and web servers are handled internally. As such, the Pipe Language is the "user interface" to the Pipe Runtime.

While Users work with the Pipe Language to define Pipes, the Pipe Runtime itself is transparent, and should be considered part of Hotrod's internals. However, this Guide covers some pertinent details about its operation.

Agents

An Agent is a lightweight process responsible for executing one or more Pipes. It coordinates with the Server to learn which Pipes it should run and to report logs and metrics data.

Agents are the unit of scale in the platform. Since the Server manages Agents, they require little to no maintenance beyond minimal initial configuration.
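One way to picture the coordination between an Agent and the Server is as a reconciliation step: the Agent compares the Pipes it has been assigned with the Pipes it is currently running, then starts or stops Pipes to close the gap. The sketch below is an assumption made for illustration; the function `reconcile` and the Pipe names other than `simple-HTTP-example` are hypothetical, and the actual protocol between Agent and Server is not described in this section.

```python
from typing import List, Set, Tuple


def reconcile(assigned: Set[str], running: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (pipes to start, pipes to stop) for one Agent.

    assigned: names of Pipes the Server wants this Agent to run.
    running:  names of Pipes the Agent is currently executing.
    """
    to_start = sorted(assigned - running)  # assigned but not yet running
    to_stop = sorted(running - assigned)   # running but no longer assigned
    return to_start, to_stop


# Hypothetical state: one Pipe was newly deployed, one was removed.
to_start, to_stop = reconcile(
    assigned={"simple-HTTP-example", "syslog-archive"},
    running={"syslog-archive", "old-pipe"},
)
print("start:", to_start)
print("stop:", to_stop)
```

Under this model, the Agent holds no configuration of its own beyond how to reach the Server, which is consistent with the low maintenance burden described above.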

Server

The Server is the central component of the platform, and is responsible for:

  • Providing management interfaces (Web UI, CLI, HTTP API)
  • Managing Pipes and Agents
  • Deploying Pipes to (and removing them from) Agents
  • Aggregating metrics, logs and other runtime-related data

The number of Agents that a Server can support is primarily constrained by two factors:

  • The Server's available network bandwidth
  • The Server's available storage for logs and metrics data for the desired retention period
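The storage constraint lends itself to a back-of-the-envelope estimate. The sketch below assumes that each Agent reports a roughly constant volume of logs and metrics per day. The function names and all figures are hypothetical and should be replaced with measurements from a real deployment.

```python
def storage_needed_gb(agents: int,
                      mb_per_agent_per_day: float,
                      retention_days: int) -> float:
    """Estimate Server storage (in GB) for logs and metrics over the retention period."""
    return agents * mb_per_agent_per_day * retention_days / 1024


def max_agents(storage_gb: float,
               mb_per_agent_per_day: float,
               retention_days: int) -> int:
    """Estimate how many Agents fit within the available storage."""
    per_agent_mb = mb_per_agent_per_day * retention_days
    return int(storage_gb * 1024 // per_agent_mb)


# Hypothetical figures: 200 Agents, 50 MB per Agent per day, 30-day retention.
print(storage_needed_gb(200, 50, 30))

# Hypothetical figures: 500 GB available, same per-Agent volume and retention.
print(max_agents(500, 50, 30))
```

Network bandwidth can be estimated the same way, by multiplying the number of Agents by the average reporting rate per Agent and comparing the result with the Server's available bandwidth.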