Remap with VRL
Modify your observability data as it passes through your topology using Vector Remap Language (VRL)
The remap transform is the recommended transform for parsing, shaping, and transforming data in Vector. It implements the Vector Remap Language (VRL), an expression-oriented language designed for processing observability data (logs and metrics) in a safe and performant manner.
Please refer to the VRL reference when writing VRL scripts.
Configuration
Example configurations
Common
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". = parse_json!(.message)\n.new_field = \"new value\"\n.status = to_int!(.status)\n.duration = parse_duration!(.duration, \"s\")\n.new_name = del(.old_name)",
"file": "./my/program.vrl"
}
}
}
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)"""
file = "./my/program.vrl"
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: |-
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)
file: ./my/program.vrl
Advanced
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"timezone": "local",
"source": ". = parse_json!(.message)\n.new_field = \"new value\"\n.status = to_int!(.status)\n.duration = parse_duration!(.duration, \"s\")\n.new_name = del(.old_name)",
"file": "./my/program.vrl",
"drop_on_error": null,
"drop_on_abort": true,
"reroute_dropped": null
}
}
}
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
timezone = "local"
source = """
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)"""
file = "./my/program.vrl"
drop_on_abort = true
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
timezone: local
source: |-
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)
file: ./my/program.vrl
drop_on_error: null
drop_on_abort: true
reroute_dropped: null
drop_on_abort
optional bool
Whether or not to drop events that are manually aborted via an abort statement. These events will instead be written to the dropped output.
Default: true
drop_on_error
optional bool
Whether or not to drop events that caused an error during processing. These events will instead be written to the dropped output.
Default: false
file
common optional string literal
File path to the Vector Remap Language (VRL) program to execute for each event. If a relative path is provided, its root is the current working directory.
Required if source is missing.
inputs
required [string]
A list of upstream source or transform IDs. Wildcards (*) are supported.
See configuration for more info.
reroute_dropped
optional bool
Whether or not to route events that failed processing (due to drop_on_error and drop_on_abort) to the dropped output instead of dropping them entirely.
Default: false
source
common optional string remap_program
The Vector Remap Language (VRL) program to execute for each event.
Required if file is missing.
timezone
optional string literal
The name of the time zone to apply to timestamp conversions that do not contain an explicit time zone. This overrides the global timezone option. The time zone name may be any name in the TZ database, or local to indicate system local time.
Default: local
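For example, a transform configured like the sketch below (the transform ID, field name, and timestamp format are illustrative, not prescribed by this page) would interpret timestamps that carry no zone information as US Eastern time rather than the host's local zone:

[transforms.parse_app_time]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
# Apply this zone to timestamp conversions that lack an explicit zone.
timezone = "America/New_York"
source = """
# The format below has no %z component, so the timezone option above applies.
.timestamp = parse_timestamp!(.timestamp, "%Y-%m-%d %H:%M:%S")
"""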
Outputs
<component_id>
Default output stream of the component. Use this component's ID as an input to downstream transforms and sinks.
dropped
This component provides an additional dropped output. When the drop_on_error or drop_on_abort configuration values are set to true and reroute_dropped is also set to true, events that result in runtime errors or aborts will be dropped from the default output stream and sent to the dropped output instead. For a transform component named foo, this output can be accessed by specifying foo.dropped as the input to another component. Events sent to this output will be in their original form, omitting any partial modification that took place before the error or abort.
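For example, the sketch below (the transform and sink IDs are illustrative, and the console sink is used only for inspection) routes events that fail JSON parsing to a separate sink via the dropped output:

[transforms.parse_json_logs]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
# Send failed events to the dropped output rather than discarding them.
drop_on_error = true
reroute_dropped = true
source = ". = parse_json!(.message)"

# Failed events arrive here in their original, unmodified form.
[sinks.debug_dropped]
type = "console"
inputs = [ "parse_json_logs.dropped" ]
encoding.codec = "json"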
Telemetry
Metrics
component_received_event_bytes_total (counter)
component_received_events_count (histogram)
component_received_events_total (counter)
component_sent_event_bytes_total (counter)
component_sent_events_total (counter)
events_in_total (counter, deprecated; use component_received_events_total instead)
events_out_total (counter, deprecated; use component_sent_events_total instead)
processed_bytes_total (counter)
processed_events_total (counter, deprecated in favor of the component_received_events_total and component_sent_events_total metrics)
processing_errors_total (counter)
utilization (gauge)
Examples
Parse Syslog logs
Given this event...
{
"log": {
"message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". |= parse_syslog!(.message)"
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: . |= parse_syslog!(.message)
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". |= parse_syslog!(.message)"
}
}
}
...this Vector event is output:
{
"appname": "su",
"facility": "ntp",
"hostname": "vector-user.biz",
"message": "Something went wrong",
"msgid": "ID389",
"procid": 2666,
"severity": "info",
"timestamp": "2020-12-22T15:22:31.111Z",
"version": 1
}
Parse key/value (logfmt) logs
Given this event...
{
"log": {
"message": "@timestamp=\"Sun Jan 10 16:47:39 EST 2021\" level=info msg=\"Stopping all fetchers\" tag#production=stopping_fetchers id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager"
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_key_value!(.message)"
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: . = parse_key_value!(.message)
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". = parse_key_value!(.message)"
}
}
}
...this Vector event is output:
{
"@timestamp": "Sun Jan 10 16:47:39 EST 2021",
"id": "ConsumerFetcherManager-1382721708341",
"level": "info",
"module": "kafka.consumer.ConsumerFetcherManager",
"msg": "Stopping all fetchers",
"tag#production": "stopping_fetchers"
}
Parse custom logs
Given this event...
{
"log": {
"message": "2021/01/20 06:39:15 +0000 [error] 17755#17755: *3569904 open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory), client: xxx.xxx.xxx.xxx, server: localhost, request: \"GET /test.php HTTP/1.1\", host: \"yyy.yyy.yyy.yyy\""
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
. |= parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+ \\+\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')
# Coerce parsed fields
.timestamp = parse_timestamp(.timestamp, "%Y/%m/%d %H:%M:%S %z") ?? now()
.pid = to_int!(.pid)
.tid = to_int!(.tid)
# Extract structured data
message_parts = split(.message, ", ", limit: 2)
structured = parse_key_value(message_parts[1], key_value_delimiter: ":", field_delimiter: ",") ?? {}
.message = message_parts[0]
. = merge(., structured)"""
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: >-
. |= parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+
\+\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?:
\*(?P<connid>\d+))? (?P<message>.*)$')
# Coerce parsed fields
.timestamp = parse_timestamp(.timestamp, "%Y/%m/%d %H:%M:%S %z") ?? now()
.pid = to_int!(.pid)
.tid = to_int!(.tid)
# Extract structured data
message_parts = split(.message, ", ", limit: 2)
structured = parse_key_value(message_parts[1], key_value_delimiter: ":", field_delimiter: ",") ?? {}
.message = message_parts[0]
. = merge(., structured)
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". |= parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+ \\+\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')\n\n# Coerce parsed fields\n.timestamp = parse_timestamp(.timestamp, \"%Y/%m/%d %H:%M:%S %z\") ?? now()\n.pid = to_int!(.pid)\n.tid = to_int!(.tid)\n\n# Extract structured data\nmessage_parts = split(.message, \", \", limit: 2)\nstructured = parse_key_value(message_parts[1], key_value_delimiter: \":\", field_delimiter: \",\") ?? {}\n.message = message_parts[0]\n. = merge(., structured)"
}
}
}
...this Vector event is output:
{
"client": "xxx.xxx.xxx.xxx",
"connid": "3569904",
"host": "yyy.yyy.yyy.yyy",
"message": "open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory)",
"pid": 17755,
"request": "GET /test.php HTTP/1.1",
"server": "localhost",
"severity": "error",
"tid": 17755,
"timestamp": "2021-01-20T06:39:15Z"
}
Multiple parsing strategies
Given this event...
{
"log": {
"message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
structured =
parse_syslog(.message) ??
parse_common_log(.message) ??
parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')
. = merge(., structured)"""
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: >-
structured =
parse_syslog(.message) ??
parse_common_log(.message) ??
parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?: \*(?P<connid>\d+))? (?P<message>.*)$')
. = merge(., structured)
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": "structured =\n parse_syslog(.message) ??\n parse_common_log(.message) ??\n parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')\n. = merge(., structured)"
}
}
}
...this Vector event is output:
{
"appname": "su",
"facility": "ntp",
"hostname": "vector-user.biz",
"message": "Something went wrong",
"msgid": "ID389",
"procid": 2666,
"severity": "info",
"timestamp": "2020-12-22T15:22:31.111Z",
"version": 1
}
Modify metric tags
Given this event...
{
"metric": {
"counter": {
"value": 102
},
"kind": "incremental",
"name": "user_login_total",
"tags": {
"email": "vic@vector.dev",
"host": "my.host.com",
"instance_id": "abcd1234"
}
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
.tags.environment = get_env_var!("ENV") # add
.tags.hostname = del(.tags.host) # rename
del(.tags.email)"""
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: |-
.tags.environment = get_env_var!("ENV") # add
.tags.hostname = del(.tags.host) # rename
del(.tags.email)
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ".tags.environment = get_env_var!(\"ENV\") # add\n.tags.hostname = del(.tags.host) # rename\ndel(.tags.email)"
}
}
}
...this Vector event is output (with the ENV environment variable set to "production"):
{
"counter": {
"value": 102
},
"kind": "incremental",
"name": "user_login_total",
"tags": {
"environment": "production",
"hostname": "my.host.com",
"instance_id": "abcd1234"
}
}
Emitting multiple logs from JSON
Given this event...
{
"log": {
"message": "[{\"message\": \"first_log\"}, {\"message\": \"second_log\"}]"
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_json!(.message) # sets `.` to an array of objects"
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: ". = parse_json!(.message) # sets `.` to an array of objects"
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". = parse_json!(.message) # sets `.` to an array of objects"
}
}
}
...these Vector events are output:
[{"log":{"message":"first_log"}},{"log":{"message":"second_log"}}]
Emitting multiple non-object logs from JSON
Given this event...
{
"log": {
"message": "[5, true, \"hello\"]"
}
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_json!(.message) # sets `.` to an array"
---
transforms:
my_transform_id:
type: remap
inputs:
- my-source-or-transform-id
source: ". = parse_json!(.message) # sets `.` to an array"
{
"transforms": {
"my_transform_id": {
"type": "remap",
"inputs": [
"my-source-or-transform-id"
],
"source": ". = parse_json!(.message) # sets `.` to an array"
}
}
}
...these Vector events are output:
[{"log":{"message":5}},{"log":{"message":true}},{"log":{"message":"hello"}}]
How it works
Emitting multiple log events
Multiple log events can be emitted from remap by assigning an array to the root path (.). One log event is emitted for each element of the array.
If any of the array elements isn’t an object, a log event is created that uses the element’s value as the message key. For example, 123 is emitted as:
{
"message": 123
}
Event Data Model
You can use the remap transform to handle both log and metric events.
Log events in the remap transform correspond directly to Vector’s log schema, which means that the transform has access to the whole event, with no restrictions on how the event can be modified.
With metric events, VRL is much more restrictive. Below is a field-by-field breakdown of VRL’s access to metrics:
Field | Access | Specific restrictions (if any) |
---|---|---|
type | Read only | |
kind | Read/write | You can set kind to either incremental or absolute but not to an arbitrary value. |
name | Read/write | |
timestamp | Read/write/delete | You can assign only a valid VRL timestamp value, not a VRL string. |
namespace | Read/write/delete | |
tags | Read/write/delete | The tags field must be a VRL object in which all keys and values are strings. |
It’s important to note that if you try to perform a disallowed action, such as deleting the type field using del(.type), Vector doesn’t abort the VRL program or throw an error. Instead, it ignores the disallowed action.
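As a brief sketch of what these rules allow (the metric and tag names here are illustrative), the program below renames a metric, changes its kind, and edits its tags; the final del(.type) is a disallowed action and is simply ignored:

[transforms.adjust_metric]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
.name = "user_login_count"   # name is read/write
.kind = "absolute"           # kind accepts only "incremental" or "absolute"
.tags.region = "us-east-1"   # tag keys and values must be strings
del(.namespace)              # namespace may be deleted
del(.type)                   # disallowed: ignored rather than raising an error
"""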
Lazy Event Mutation
When you make changes to an event through VRL’s path assignment syntax, the change isn’t immediately applied to the actual event. If the program fails to run to completion, any changes made until that point are dropped and the event is kept in its original state.
If you want to make sure your event is changed as expected, you have to rewrite your program to never fail at runtime (the compiler can help you with this).
Alternatively, if you want to ignore/drop events that caused the program to fail, you can set the drop_on_error configuration value to true.
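A minimal sketch of both approaches, assuming the incoming message field may or may not contain valid JSON (the transform ID is illustrative):

[transforms.safe_parse]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
# Setting this to true would instead drop (or, with reroute_dropped, reroute)
# any event whose program errors at runtime.
drop_on_error = false
source = """
# Handle the fallible call explicitly so the program never fails at runtime.
structured, err = parse_json(.message)
if err == null {
  .parsed = structured
} else {
  .parse_error = err
}
"""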
Learn more about runtime errors in the Vector Remap Language reference.
Vector Remap Language
The Vector Remap Language (VRL) is a restrictive, fast, and safe language we designed specifically for mapping observability data. It avoids the need to chain together many fundamental Vector transforms to accomplish rudimentary reshaping of data.
The intent is to offer the same robustness as a full language runtime (for example, Lua) without paying the performance or safety penalty.
Learn more about Vector’s Remap Language in the Vector Remap Language reference.