Input Modes
The nature, characteristics, and frequency of incoming data depend on the type of input used in a Pipe.
Streaming Inputs
These inputs keep running once started, and produce data either continuously or as it becomes available. Pipes with streaming inputs are long-running processes that stop only when stopped manually or when a non-recoverable error occurs.
Non-streaming Inputs
These inputs produce data once, or until they have processed all input data. They stop their associated Pipe once all data is fully processed. Note that when a non-streaming input is scheduled with cron or interval, it effectively becomes a streaming input.
Mixed-mode Inputs
These inputs can act as either streaming or non-streaming inputs, depending on their configuration. For example, the file input by default tails a file (or many files) continuously, making it a streaming input. The file input can also be configured to stop after it reaches the end of its specified input file(s). It then becomes a non-streaming input that exits once done.
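The difference between the two modes can be illustrated with a small Python sketch. It is not the file input's implementation; the function name `read_lines` and its parameters (`tail`, `poll_interval`, `max_idle_polls`) are hypothetical and exist only for this illustration.

```python
import time
from typing import Iterator, Optional


def read_lines(
    path: str,
    tail: bool = True,
    poll_interval: float = 0.5,
    max_idle_polls: Optional[int] = None,
) -> Iterator[str]:
    """Yield lines from a file (illustrative only, not the real file input).

    tail=True  -> streaming behaviour: keep polling for new lines after EOF.
    tail=False -> non-streaming behaviour: stop at the first EOF.
    max_idle_polls is a hypothetical escape hatch so the sketch can
    terminate in tests; a real tailing input would run until stopped.
    """
    with open(path, "r", encoding="utf-8") as handle:
        idle_polls = 0
        while True:
            line = handle.readline()
            if line:
                idle_polls = 0
                yield line.rstrip("\n")
                continue
            if not tail:
                return  # end of file reached: the input is done
            idle_polls += 1
            if max_idle_polls is not None and idle_polls >= max_idle_polls:
                return
            time.sleep(poll_interval)  # wait for more data to be appended
```

With `tail=False` the generator is exhausted at the end of the file, which corresponds to a Pipe that exits once done. With `tail=True` it keeps waiting for appended lines, which corresponds to a long-running Pipe.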
Back Pressure
In a Pipe, back pressure refers to the situation wherein a Pipe receives data faster than it can write data. In cases where back pressure may arise, Pipe authors must give special attention to the handling of such scenarios.
Pipes do not store data internally, except in limited scenarios such as output batching. Some outputs also support a retry mechanism, which has a forever option. Even so, the output buffer can eventually fill up, resulting in discarded outbound events.
Consider the following example:
- Data is streaming into a Pipe via an AMQP input
- The Pipe writes data to an HTTP API
- The receiving web server becomes unavailable due to a network partition
If the output has no retry configured, the http-post output discards outbound events after HTTP requests fail.
With retry configured, outbound events are eventually discarded once the output buffer reaches its limit. How quickly this happens depends on the rate of input events and the available memory on the Pipe's host machine.
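The effect of a bounded output buffer can be sketched in Python. This is a simplified model, not the actual output implementation: the class `BoundedRetryBuffer`, its `capacity` parameter, and the choice to drop the newest event when the buffer is full are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque


class BoundedRetryBuffer:
    """Toy model of an output that retries but has limited buffer space."""

    def __init__(self, send: Callable[[str], bool], capacity: int) -> None:
        self.send = send            # returns True when delivery succeeds
        self.capacity = capacity    # hypothetical limit on buffered events
        self.pending: Deque[str] = deque()
        self.discarded = 0

    def publish(self, event: str) -> None:
        self.flush()  # retry anything still waiting before accepting more
        if len(self.pending) >= self.capacity:
            # Assumption: a full buffer drops the incoming event.
            self.discarded += 1
            return
        self.pending.append(event)
        self.flush()

    def flush(self) -> None:
        # Deliver in order; stop at the first failure and keep the rest.
        while self.pending:
            if not self.send(self.pending[0]):
                return
            self.pending.popleft()
```

While the destination is down, every `publish` call leaves one more event in `pending` until `capacity` is reached; from then on, events are counted in `discarded`. A faster input or a smaller buffer reaches that point sooner.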
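There are cases where back pressure may arise, and event loss is not acceptable. A common solution is to use two Pipes with a persistent storage mechanism between them: the first Pipe writes incoming data to the persistent store, and the second Pipe reads from that store and writes to the final destination.
Such approaches provide Pipe authors more control and reliability in cases where outbound streams might temporarily fail.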
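A minimal Python sketch of the two-Pipe pattern follows, using SQLite as a stand-in for the persistent storage mechanism. The functions `open_store`, `ingest`, and `forward` are hypothetical names for this illustration and do not correspond to any Pipe configuration or API.

```python
import sqlite3
from typing import Callable


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the durable event store shared by both stages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def ingest(conn: sqlite3.Connection, event: str) -> None:
    """Stage one: persist each incoming event before doing anything else."""
    conn.execute("INSERT INTO events (body) VALUES (?)", (event,))
    conn.commit()


def forward(conn: sqlite3.Connection, send: Callable[[str], bool]) -> int:
    """Stage two: deliver stored events in order.

    An event is deleted only after send() reports success, so a failed
    destination leaves the backlog in storage instead of in memory.
    Returns the number of events delivered in this pass.
    """
    delivered = 0
    rows = conn.execute("SELECT id, body FROM events ORDER BY id").fetchall()
    for event_id, body in rows:
        if not send(body):
            break  # destination unavailable; try again on the next pass
        conn.execute("DELETE FROM events WHERE id = ?", (event_id,))
        conn.commit()
        delivered += 1
    return delivered
```

Because the first stage only writes to local storage, it keeps up with the input even while the destination is down. The backlog is then limited by disk space rather than by the output buffer, and the second stage drains it once the destination recovers.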
Input Mode Reference
The following table lists the input modes of the available inputs.
Input | Streaming | Can Schedule |
---|---|---|
amqp | yes | |
http-server | yes | |
kafka | yes | |
tcp | yes | |
udp | yes | |
azure-blob | yes | |
echo | yes | |
exec | yes | |
http-poll | yes | |
s3 | yes | |
scuba | yes | |
sql | yes | |
files * | mixed | |
redis ** | mixed | yes |
*Streaming by default, unless file tailing is disabled
**Streaming when subscribing to channels