Skip to main content
Version: Next

Logging

Structured Audit Logging

The Server aggregates structured logs from the various platform components, which are stored in the Server DB. Various logs commands are available to inspect the behaviour of the Server, Agents, and Pipes.

Let’s start with a simple Pipe:

name: uptimex

context:
interval: 1s

input:
exec:
command: uptime
interval: '{{interval}}'

actions:
- extract:
input-field: _raw
remove: true
pattern: 'load average: (\S+), (\S+), (\S+)'
output-fields: [m1, m5, m15]
- convert:
- m1: num
- m5: num
- m15: num
- script:
let:
- k: count()
- filter:
condition: k % 2 == 0

output:
exec:
command: cat

Assume we have an Agent, have loaded this Pipe, and have attached it to the Agent:

$> hotrod agents add test0 --id 0

$> hotrod pipes add --file uptimex.yml

$> hotrod agents update test0 --add-pipe uptimex

[INFO] user=admin adding pipe=uptimex to agent=test0

An example of the Server logs is below and there is an audit trail detailing the operations and the responsible user:

$> hotrod server logs

timestamp | level | message
---------------------+-------+--------------------------------------------------
2020-06-16T14:17:50 | INFO | user=admin successful login from 127.0.0.1:38212
2020-06-16T14:17:59 | INFO | user=admin assigning id=0 to agent='test0'
2020-06-16T14:18:22 | INFO | user=admin adding pipe=uptimex
2020-06-16T14:18:22 | INFO | user=admin rebuilding agents: []
2020-06-16T14:18:22 | INFO | user=admin rebuilding agents: []
2020-06-16T14:18:36 | INFO | user=admin adding pipe=uptimex to agent=test0

Agent logs displays the Pipe being attached:

$> hotrod agents logs

timestamp | agent | level | message
---------------------+--------+-------+----------------------------------------
2020-06-16T14:18:13 | test0 | INFO | current []
2020-06-16T14:18:13 | test0 | INFO | allowed []
2020-06-16T14:18:13 | test0 | INFO | agent connected to hotrodd
2020-06-16T14:18:22 | test0 | INFO | current []
2020-06-16T14:18:22 | test0 | INFO | allowed []
2020-06-16T14:18:38 | test0 | INFO | New Pipe detected: hotrod-pipe-uptimex
2020-06-16T14:18:39 | test0 | INFO | current ["hotrod-pipe-uptimex"]
2020-06-16T14:18:39 | test0 | INFO | allowed ["hotrod-pipe-uptimex"]
2020-06-16T14:18:39 | test0 | INFO | Pipes Updated...

Pipe logs displays the Pipe being launched:

$> hotrod pipes logs

timestamp | agent | pipe | level | step | function | message
---------------------+--------+---------+-------+------+----------+----------------------
2020-06-16T14:18:38 | test0 | uptimex | WARN | 0 | | pipe started
2020-06-16T14:18:39 | test0 | uptimex | INFO | 0 | | systemd pipe started

Pipe Logging

Logging from Pipes is not enabled by default. You need to enable logging on an Agent-by-Agent basis.

$> hotrod agents settings --enable-pipe-logging test0

You can switch on debug mode for the Pipe using:

$> hotrod pipes trace uptimex --enable-debugging --agent test0

The Pipe may only be scheduled to run in a few minutes, but a restart can be forced:

$> hotrod pipes trace uptimex --command restart --agent test0

After waiting a few seconds for the samples to collect, you can inspect the Pipe logs:

$> hotrod pipes logs
timestamp | agent | pipe | level | step | function | message
---------------------+--------+---------+-------+------+-------------+-------------------------------------------
2020-06-16T14:18:38 | test0 | uptimex | WARN | 0 | | pipe started
2020-06-16T14:18:39 | test0 | uptimex | INFO | 0 | | systemd pipe started
2020-06-16T14:42:49 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.29,"m5":0.38,"m15":0.44,"k":1452}
2020-06-16T14:42:51 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.29,"m5":0.38,"m15":0.44,"k":1454}
2020-06-16T14:42:53 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.27,"m5":0.37,"m15":0.44,"k":1456}

DEBUG mode prints out the event that will be written out. In this case, events are sent to systemd-journald using HTTP POST, plain TCP, etc. Inspecting Pipe data should not require shell access to remote devices. step is the zero-based index of the Pipe step, where the input is 0 and the output is 5.

The field "k" comes from the Lua function count() which increments endlessly and is even — we use a filter to clear out odd values.

Debugging needs to be switched off explicitly using --disable-debugging.

trace provides further insights into the operation of a Pipe. It prints out the trace id which then references the result of this trace.

$> hotrod pipes trace --enable-tracing uptimex -t test0
trace id 1

By default, this will run trace for 5 samples. After a few seconds, pass the trace id to view the resulting Pipe logs:

$> hotrod pipes logs --trace-id 1

timestamp | agent | pipe | level | step | function | message
---------------------+--------+---------+-------+------+----------------+------------------------------------------------------------------------------------------------------------------------------
2020-06-16T15:41:14 | test0 | uptimex | TRACE | 1 | action-extract | {"_raw":" 15:41:14 up 50 days, 21:20, 1 user, load average: 0.42, 0.44, 0.44"} -> [{"m1":"0.42","m5":"0.44","m15":"0.44"}]
2020-06-16T15:41:14 | test0 | uptimex | TRACE | 2 | action-convert | {"m1":"0.42","m5":"0.44","m15":"0.44"} -> [{"m1":0.42,"m5":0.44,"m15":0.44}]
2020-06-16T15:41:14 | test0 | uptimex | TRACE | 3 | action-script | {"m1":0.42,"m5":0.44,"m15":0.44} -> [{"m1":0.42,"m5":0.44,"m15":0.44,"k":4957}]
2020-06-16T15:41:14 | test0 | uptimex | TRACE | 4 | action-filter | {"m1":0.42,"m5":0.44,"m15":0.44,"k":4957} -> []
2020-06-16T15:41:15 | test0 | uptimex | TRACE | 1 | action-extract | {"_raw":" 15:41:15 up 50 days, 21:20, 1 user, load average: 0.39, 0.43, 0.44"} -> [{"m1":"0.39","m5":"0.43","m15":"0.44"}]
2020-06-16T15:41:15 | test0 | uptimex | TRACE | 2 | action-convert | {"m1":"0.39","m5":"0.43","m15":"0.44"} -> [{"m1":0.39,"m5":0.43,"m15":0.44}]
2020-06-16T15:41:15 | test0 | uptimex | TRACE | 3 | action-script | {"m1":0.39,"m5":0.43,"m15":0.44} -> [{"m1":0.39,"m5":0.43,"m15":0.44,"k":4958}]
2020-06-16T15:41:15 | test0 | uptimex | TRACE | 4 | action-filter | {"m1":0.39,"m5":0.43,"m15":0.44,"k":4958} -> [{"m1":0.39,"m5":0.43,"m15":0.44,"k":4958}]
2020-06-16T15:41:15 | test0 | uptimex | TRACE | 5 | output-exec | {"m1":0.39,"m5":0.43,"m15":0.44,"k":4958}
...

This provides a step-by-step view of the data flow. Here, at line four, the value "k":4957, is not passed through by action-filter. In the next sample, data arrives at output.

Traces can be enabled for Pipes running on multiple Agents. To distinguish the outputs, hotrod agents logs and hotrod pipes logs both have an --agent option to specify one or more Agents to search for.

Pipe Errors

The general rule of the Pipe runtime is to not discard data; this leads to a tolerant way of viewing errors. But without good diagnostics, it is difficult to troubleshoot Pipe problems, since we are usually only one typo away from success.

Consider this Pipe, containing a flaw. The intention of this is to extract the number of days that this computer has been running for:

name: err

input:
exec:
command: uptime
interval: 2s

actions:
- extract:
warning: true
pattern: UP (\d+)
output-fields: [up]

- convert:
- up: num

output:
write: console

You will notice actions['extract'].warning: true — this is because not matching a pattern is not always an error. However, we need to be sure of this.

After adding this Pipe and attaching it, here are the logs (with --pipe err to exclude uptimex logs):

$> hotrod pipes logs --pipe err

timestamp | agent | pipe | level | step | function | message
---------------------+--------+------+-------+------+----------------+----------------------------------
2020-06-16T16:10:19 | test0 | err | INFO | 0 | | systemd pipe started
2020-06-16T16:10:20 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:10:22 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:10:24 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:10:26 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:10:28 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex

To cover all the issues, switch on debugging as before:

$> hotrod pipes logs --pipe err

timestamp | agent | pipe | level | step | function | message
---------------------+--------+------+-------+------+----------------+----------------------------------------------------------------------------------
2020-06-16T16:06:14 | test0 | err | DEBUG | 3 | output-exec | {"_raw":" 16:06:14 up 50 days, 21:45, 1 user, load average: 0.59, 0.57, 0.51"}
2020-06-16T16:06:16 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:06:16 | test0 | err | WARN | 2 | action-convert | fields not found up
2020-06-16T16:06:16 | test0 | err | DEBUG | 3 | output-exec | {"_raw":" 16:06:16 up 50 days, 21:45, 1 user, load average: 0.94, 0.64, 0.53"}
2020-06-16T16:06:18 | test0 | err | WARN | 1 | action-extract | extract: no captures with regex
2020-06-16T16:06:18 | test0 | err | WARN | 2 | action-convert | fields not found up
2020-06-16T16:06:18 | test0 | err | DEBUG | 3 | output-exec | {"_raw":" 16:06:18 up 50 days, 21:45, 1 user, load average: 0.94, 0.64, 0.53"}

Here we have two problems, flowing from the fact that UP does not match up:

  1. The pattern extraction.

  2. The failed conversion.

Again, a field not found in convert might be fine, when dealing with events following different schemas.

So, the raw data — which may be valuable — will always flow through.

Controlling Log Output

Logs have a level: trace, debug, warn, and error.

The minimum logging level is set with --level (default is trace).

Use --level error to only show errors.

It is always possible to display logs in csv format — this is one of the strengths of structured logging.

$> hotrod pipes logs --format csv

timestamp,agent,pipe,level,step,function,message
2020-06-16T16:32:56,test0,err,WARN,1,action-extract,extract: no captures with regex
2020-06-16T16:32:58,test0,err,WARN,1,action-extract,extract: no captures with regex
2020-06-16T16:33:00,test0,err,WARN,1,action-extract,extract: no captures with regex
2020-06-16T16:33:02,test0,err,WARN,1,action-extract,extract: no captures with regex
...

By default, you will receive the last 25 records — use --num-records to change this.

A start time can be specified, in a fairly relaxed format similar to that used by the date (see man date) command:

$> hotrod pipes logs --start-time '16 June' --num-records 10

timestamp | agent | pipe | level | step | function | message
---------------------+--------+---------+-------+------+-------------+-------------------------------------------
2020-06-16T14:18:38 | test0 | uptimex | WARN | 0 | | pipe started
2020-06-16T14:18:39 | test0 | uptimex | INFO | 0 | | systemd pipe started
2020-06-16T14:42:49 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.29,"m5":0.38,"m15":0.44,"k":1452}
2020-06-16T14:42:51 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.29,"m5":0.38,"m15":0.44,"k":1454}
2020-06-16T14:42:53 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.27,"m5":0.37,"m15":0.44,"k":1456}
2020-06-16T14:42:55 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.27,"m5":0.37,"m15":0.44,"k":1458}
2020-06-16T14:42:57 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.32,"m5":0.38,"m15":0.44,"k":1460}
2020-06-16T14:42:59 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.32,"m5":0.38,"m15":0.44,"k":1462}
2020-06-16T14:43:01 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.32,"m5":0.38,"m15":0.44,"k":1464}
2020-06-16T14:43:03 | test0 | uptimex | DEBUG | 5 | output-exec | {"m1":0.3,"m5":0.37,"m15":0.44,"k":1466}

Specifying a date implies the time starts from midnight — in the above case the data was first created at 14:00.

One can also specify a time-range explicitly. Times without dates are assumed to refer to the current day:

$> hotrod pipes logs --start-time '14:00' --end-time '17:00' --file csv >logs.csv

Metrics and Volumes

$> hotrod server traffic

metrics 43.47 kiB
logs 2.80 MiB
unsent logs 0.00 B
tarballs sent 599.50 kiB
$> hotrod volumes
date | agents | bboxes | pipes | lines | volume
------------+---------+--------+-------+-------+------------
2020.06.17 | 2 | 0 | 3 | 3629 | 287.11 kiB
2020.06.16 | 1 | 0 | 2 | 22992 | 1.78 MiB