Version: 3.3.1

s3

Stream data from an S3 object

Available from Hotrod: 3.1

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| interval | How often to run the command | duration | - |
| cron | How often to run the command. Note that Hotrod's cron format differs from standard cron: it includes a leading column for seconds. See full discussion | cron | - |
| immediate | Run as soon as invoked, instead of waiting for the specified cron interval | bool | false |
| random-offset | Sets a random offset to the schedule, then sticks to it | duration | 0s |
| window | For resources that need a time window to be specified | Window | - |
| block | Block further input schedules from triggering if the pipe output is retrying | bool | false |
| bucket-name | The storage service container for created blobs | string | - |
| object-names | The name for the blob | array of strings | - |
| object-name-field | The field that a blob name from an operation should be stored in | field | - |
| creation-time-field | The field that the blob creation time should be stored in | field | - |
| last-modified-field | The field that the blob last modified time should be stored in | field | - |
| content-length-field | The field that the blob content length information should be stored in | field | - |
| content-type-field | The field that the blob content type information should be stored in | field | - |
| etag-field | The field that the object ETag should be stored in | field | - |
| data-field | A field that the blob data should be nested in | field | - |
| region | Region | string | - |
| endpoint | S3 Endpoint | string | - |
| access-key | Access Key ID | string | - |
| secret-key | Secret Access Key | string | - |
| security-token | Security Token | string | - |
| session-token | Session Token | string | - |
| timestamp-mode | Derive a timestamp for this blob for filtering purposes based on the selected strategy. | S3ObjectTimestampMode | - |
| maximum-age | Remove any object older than this age from the candidate list | MaxAgeSpecifier | - |
| mode | The operating mode for this input | S3BlockInputMode | - |
| fingerprinting | Enable object fingerprinting, which will cause an object to only be downloaded once | bool | false |
| maximum-fingerprint-age | Remove any object fingerprints older than this from the tracker | MaxAgeSpecifier | 30 days |

interval

How often to run the command

The default is interval: 0s, which means the command runs once. Note that scheduled inputs set document markers. See full discussion

Type: duration

Example

action:
  exec:
    command: echo 'once a day'
    interval: 1d

cron

How often to run the command. Note that Hotrod's cron format differs from standard cron: it includes a leading column for seconds. See full discussion

Type: cron

Example: Once a day

action:
  exec:
    command: echo 'once a day'
    cron: '0 0 0 * * *'

Example: Once a day, using a convenient shortcut

action:
  exec:
    command: echo 'once a day'
    cron: '@daily'

immediate

Run as soon as invoked, instead of waiting for the specified cron interval

Type: bool

Example: Run immediately on invocation, and thereafter at 10h every morning

action:
  exec:
    command: echo 'hello'
    immediate: true
    cron: '0 0 10 * * *'

random-offset

Sets a random offset to the schedule, then sticks to it

This can help avoid the thundering herd problem, where many schedules fire at the same instant and overload some service, for example at 00:00:00.

Type: duration

Example: Would fire up to a minute after every hour

action:
  exec:
    command: echo 'hello'
    random-offset: 1m
    cron: '0 0 * * * *'

window

For resources that need a time window to be specified

Type: Window

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| size | Window size | duration | - |
| offset | Window offset | duration | 0s |
| start-time | Allows the windowing to start at a specified time | time | - |
| highwatermark-file | File in which the timestamp is stored, so that windowing can resume after the pipe has been restarted | path | - |

size

Window size

Type: duration

Example

action:
  exec:
    command: echo 'one two'
    window:
      size: 1m

offset

Window offset

Type: duration

Example

action:
  exec:
    command: echo 'one two'
    window:
      size: 1m
      offset: 10s

start-time

Allows the windowing to start at a specified time

It should be in the following format: 2019-07-10 18:45:00.000 +0200

Type: time

Example

action:
  exec:
    command: echo 'one two'
    window:
      size: 1m
      start-time: 2019-07-10 18:45:00.000 +0200

highwatermark-file

File in which the timestamp is stored, so that windowing can resume after the pipe has been restarted

Type: path

Example

action:
  exec:
    command: echo 'one two'
    window:
      size: 1m
      highwatermark-file: /tmp/mark.txt

block

Block further input schedules from triggering if the pipe output is retrying

Type: bool
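
As a minimal sketch of how block might be combined with a schedule: the fragment below assumes the s3 block is configured under a pipe's input: key, as with other Hotrod inputs, and uses a hypothetical bucket name. Neither is confirmed by this page.

```yaml
# Sketch only: the `input:` placement and the bucket name are assumptions.
input:
  s3:
    bucket-name: example-bucket   # hypothetical bucket
    interval: 5m
    block: true                   # hold back new scheduled runs while the output is retrying
```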

bucket-name

The storage service container for created blobs

Type: string

object-names

The name for the blob

Type: array of strings

object-name-field

The field that a blob name from an operation should be stored in

Type: field

creation-time-field

The field that the blob creation time should be stored in

Type: field

last-modified-field

The field that the blob last modified time should be stored in

Type: field

content-length-field

The field that the blob content length information should be stored in

Type: field

content-type-field

The field that the blob content type information should be stored in

Type: field

etag-field

The field that the object ETag should be stored in

Type: field

data-field

A field that the blob data should be nested in

Type: field
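
The metadata fields above can be set together. The following sketch assumes the s3 block sits under input: and uses hypothetical output field names; only the option names themselves come from this page.

```yaml
# Sketch only: placement under `input:` and all values are assumptions.
input:
  s3:
    bucket-name: example-bucket            # hypothetical bucket
    object-name-field: object_name         # hypothetical output field names
    creation-time-field: created
    last-modified-field: last_modified
    content-length-field: content_length
    content-type-field: content_type
    etag-field: etag
    data-field: data                       # blob data is nested under this field
```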

region

Region

Type: string

endpoint

S3 Endpoint

Type: string

access-key

Access Key ID

Type: string

secret-key

Secret Access Key

Type: string

security-token

Security Token

Type: string

session-token

Session Token

Type: string
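
A connection block might look like the sketch below. All values are placeholders, the input: placement is an assumption, and this page does not say which of these options are required together, so treat it as illustrative rather than a known-good configuration.

```yaml
# Sketch only: placeholder credentials, never commit real keys to a pipe file.
input:
  s3:
    bucket-name: example-bucket                   # hypothetical bucket
    region: eu-west-1                             # example AWS region
    endpoint: https://s3.example.internal         # hypothetical, e.g. for an S3-compatible store
    access-key: EXAMPLE_ACCESS_KEY_ID             # placeholder
    secret-key: EXAMPLE_SECRET_ACCESS_KEY         # placeholder
    session-token: EXAMPLE_SESSION_TOKEN          # placeholder, for temporary credentials
```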

timestamp-mode

Derive a timestamp for this blob for filtering purposes based on the selected strategy.

Type: S3ObjectTimestampMode

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| none | The default mode, do not filter objects based on timestamps | - | - |
| last-modified | Filter objects on the last-modified timestamp reported by the service | - | - |
| blob-name-pattern | Filter blobs on the timestamp derived from the object name, for example: object-name-pattern: =(?P<Y>[\\d]{4,4})-(?P<m>[\\d]{2,2})-(?P<d>[\\d]{2,2})/ | string | - |

none

The default mode, do not filter objects based on timestamps

last-modified

Filter objects on the last-modified timestamp reported by the service

blob-name-pattern

Filter blobs on the timestamp derived from the object name, for example: object-name-pattern: =(?P<Y>[\\d]{4,4})-(?P<m>[\\d]{2,2})-(?P<d>[\\d]{2,2})/

Type: string
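
As a sketch of the pattern-based strategy: the fragment below assumes that a strategy carrying a value is written as a nested mapping, and that the key is the strategy name listed above. The description's own example spells the key object-name-pattern, so check which spelling your Hotrod version accepts. The input: placement and bucket name are also assumptions.

```yaml
# Sketch only: the nesting and the key spelling are assumptions.
input:
  s3:
    bucket-name: example-bucket   # hypothetical bucket
    timestamp-mode:
      # Pattern copied from the description above. In a double-quoted YAML
      # string, \\d is read as the regex class \d.
      blob-name-pattern: "=(?P<Y>[\\d]{4,4})-(?P<m>[\\d]{2,2})-(?P<d>[\\d]{2,2})/"
```

Under the same assumption, a strategy without a value would presumably be written as a plain string, for example timestamp-mode: last-modified.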

maximum-age

Remove any object older than this age from the candidate list

Type: MaxAgeSpecifier

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| seconds | Specify the maximum age in number of seconds | integer | - |
| duration | Specify the maximum age as a human readable duration (example: 1 hour) | string | - |

seconds

Specify the maximum age in number of seconds

Type: integer

duration

Specify the maximum age as a human readable duration (example: 1 hour)

Type: string
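
A maximum-age setting might be written as in the sketch below, assuming that exactly one of seconds or duration is given and that the s3 block sits under input:. The bucket name is hypothetical.

```yaml
# Sketch only: assumes one of `seconds` / `duration` is set, not both.
input:
  s3:
    bucket-name: example-bucket   # hypothetical bucket
    maximum-age:
      duration: 1 hour            # alternatively: seconds: 3600
```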

mode

The operating mode for this input

Type: S3BlockInputMode

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| list-objects | List Objects | - | - |
| download-objects | Download Given Objects | - | - |
| list-and-download-objects | List Objects and Download | - | - |

list-objects

List Objects

download-objects

Download Given Objects

list-and-download-objects

List Objects and Download
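
To illustrate choosing a mode: the sketch below assumes that download-objects fetches the objects named in object-names, which this page implies ("Download Given Objects") but does not state outright. The bucket, object keys, and input: placement are hypothetical.

```yaml
# Sketch only: the link between `download-objects` and `object-names` is an assumption.
input:
  s3:
    bucket-name: example-bucket   # hypothetical bucket
    mode: download-objects
    object-names:                 # hypothetical object keys
    - logs/2024-01-01.json
    - logs/2024-01-02.json
```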

fingerprinting

Enable object fingerprinting, which will cause an object to only be downloaded once

Type: bool

maximum-fingerprint-age

Remove any object fingerprints older than this from the tracker

Type: MaxAgeSpecifier

| Field Name | Description | Type | Default |
| --- | --- | --- | --- |
| seconds | Specify the maximum age in number of seconds | integer | - |
| duration | Specify the maximum age as a human readable duration (example: 1 hour) | string | - |

seconds

Specify the maximum age in number of seconds

Type: integer

duration

Specify the maximum age as a human readable duration (example: 1 hour)

Type: string
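
Putting fingerprinting together with a schedule, a polling configuration might resemble the following sketch. The input: placement, the bucket name, and the "7 days" duration spelling (modelled on the documented "1 hour" and "30 days") are assumptions.

```yaml
# Sketch only: values are illustrative, not recommended settings.
input:
  s3:
    bucket-name: example-bucket       # hypothetical bucket
    mode: list-and-download-objects
    interval: 10m                     # re-list the bucket every ten minutes
    fingerprinting: true              # download each object only once
    maximum-fingerprint-age:
      duration: 7 days                # forget fingerprints older than this
```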