Architecture
Simply stated, Hotrod is a platform for creating and running data processing pipelines. There are five components at the core of Hotrod, and this section explains the role of each.
Pipes
Pipes are the foundation of the platform. A Pipe is a data processing pipeline, which represents a specific workload or job.
A Pipe reads data from a source, optionally performs a series of transformations on that data, and finally writes the result to a destination. Pipes are defined with the Pipe Language.
The Pipe Language
The Pipe Language is a YAML-based syntax and grammar for defining Pipes. A simple Pipe may look like this when defined in the Pipe Language:
name: simple-HTTP-example
input:
  http-poll:
    address: "https://httpbin.org/get"
    raw: true
actions:
- rename:
    fields:
    - origin: source_ip
output:
  file:
    path: "/tmp/output.log"
This Pipe will:
- Read a JSON response from a URI, via HTTP
- Rename the response field origin to source_ip
- Write the response to a file
All Pipes follow this basic pattern. In the Pipe Language, the emphasis is on the outcome, not low-level implementation details.
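To make the three steps above concrete, the sketch below mirrors the same read, rename, write flow in plain Python. It does not show how the Pipe Runtime is implemented. The function names (rename_fields, run_pipe) and the sample record are hypothetical, and the HTTP read and the output file are replaced by in-memory stand-ins so the example runs offline.

```python
import io
import json


def rename_fields(record, mapping):
    """Return a copy of record with keys renamed per mapping (old -> new)."""
    return {mapping.get(key, key): value for key, value in record.items()}


def run_pipe(records, mapping, sink):
    """Read each record, apply the rename action, write one JSON line per record."""
    count = 0
    for record in records:
        sink.write(json.dumps(rename_fields(record, mapping)) + "\n")
        count += 1
    return count


# Stand-in for the JSON body returned by the HTTP source (hypothetical values).
sample = [{"origin": "203.0.113.7", "url": "https://httpbin.org/get"}]
sink = io.StringIO()  # stand-in for the file at /tmp/output.log
written = run_pipe(sample, {"origin": "source_ip"}, sink)
print(sink.getvalue(), end="")
```

The Pipe definition expresses the same outcome in a dozen declarative lines, leaving the reading, iteration and writing to the runtime.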
For detailed documentation of the Pipe Language, see the Pipe Language Reference.
The Pipe Runtime
While the Pipe Language provides a concise, consistent means of defining Pipes, the Pipe Runtime is responsible for executing those Pipe definitions in a robust manner. Low-level details like reading data from files, databases, queues and web servers are handled internally. As such, the Pipe Language is the "user interface" to the Pipe Runtime.
While users work with the Pipe Language to define Pipes, the Pipe Runtime itself is transparent and should be considered part of Hotrod's internals. However, this Guide covers some pertinent details about its operation.
Agents
An Agent is a lightweight process responsible for executing one or more Pipes. It coordinates with the Server to learn which Pipes it should run and to report logs and metrics data.
Agents are the unit of scale in the platform. Since the Server manages Agents, they require little to no maintenance beyond minimal initial configuration.
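This section does not describe how an Agent and the Server actually exchange information. As a purely illustrative sketch, one common way to model "which Pipes should this Agent run" is to compare a desired set with the set currently running. The reconcile function and the Pipe names below are hypothetical and are not part of Hotrod's API.

```python
def reconcile(desired, running):
    """Compare the Pipes the Server wants with those currently running.

    Returns (to_start, to_stop) as sorted lists of Pipe names.
    """
    desired, running = set(desired), set(running)
    return sorted(desired - running), sorted(running - desired)


# Hypothetical Pipe names, for illustration only.
to_start, to_stop = reconcile(
    desired=["simple-HTTP-example", "syslog-to-file"],
    running=["syslog-to-file", "old-report"],
)
print("start:", to_start)
print("stop:", to_stop)
```

Under this model, adding capacity means starting another Agent and letting the Server assign Pipes to it.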
Server
The Server is the central component of the platform, and is responsible for:
- Providing management interfaces (Web UI, CLI, HTTP API)
- Managing Pipes and Agents
- Deploying (and removing) Pipes from Agents
- Aggregating metrics, logs and other runtime-related data
The number of Agents that a Server can support is primarily constrained by two factors:
- The Server's available network bandwidth
- The Server's available storage for logs and metrics data for the desired retention period
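The storage constraint can be estimated with simple arithmetic. The sketch below assumes that every Agent produces a roughly constant daily volume of logs and metrics. All figures are hypothetical placeholders, not measured Hotrod values, and the bandwidth constraint is not modelled.

```python
def storage_needed_bytes(agents, bytes_per_agent_per_day, retention_days):
    """Estimate Server storage for logs and metrics over the retention period."""
    return agents * bytes_per_agent_per_day * retention_days


def max_agents_for_storage(storage_bytes, bytes_per_agent_per_day, retention_days):
    """Largest whole number of Agents whose data fits in the given storage."""
    return storage_bytes // (bytes_per_agent_per_day * retention_days)


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical inputs: 200 Agents, 50 MiB/day each, 30 days of retention.
needed = storage_needed_bytes(200, 50 * MIB, 30)
print(f"{needed / GIB:.1f} GiB")

# Hypothetical inputs: 500 GiB of storage at the same per-Agent rate and retention.
print(max_agents_for_storage(500 * GIB, 50 * MIB, 30))
```

Replacing the placeholder rate with a value measured from a running deployment gives a first-order estimate of how retention period trades off against the number of Agents.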