brief introduction to the RFC5424 syslog message format
outline
- definition
- format overview
- header
- PRI
- version
- timestamp
- hostname
- app-name
- procid
- msgid
- STRUCTURED-DATA
- SD-ELEMENT
- SD-ID
- SD-PARAM
- IANA-registered SD-IDs and PARAM-NAMEs
- MSG
- syslog messages examples
definition
- An "originator" generates syslog content to be carried in a message.
- A "collector" gathers syslog content for further analysis.
- A "relay" forwards messages, accepting messages from originators or other relays and sending them to collectors or other relays.
The syslog message has the following ABNF [RFC5234] definition:
- format:
HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME SP APP-NAME SP PROCID SP MSGID
- The character set used in the HEADER must be seven-bit ASCII in an eight-bit field.
PRI
- format
- 1*3 means "from 1 to 3", so 1*3DIGIT allows a 1-digit, 2-digit, or 3-digit number. If the number on the left side of * is not given, the default value is 0; if the number on the right side of * is not given, the default value is infinity.
- the content after ; on a line is a comment
- / means "or"
- %d49-57 is equivalent to %d49 / %d50 / ... / %d57, which is also equivalent to "1" / "2" / ... / "9"
- PRIVAL = facility number * 8 + severity number
severity number = PRIVAL % 8
facility number = PRIVAL / 8, rounded down (integer division)
(the severity number ranges from 0 to 7)
- The facility number specifies the subsystem that produced the message, and the severity number defines the severity of the message.
- facility table:
- severity table:
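The PRIVAL arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names decode_pri and encode_pri are made up for this note, and the limits used (facility 0 to 23, hence a largest PRIVAL of 191) come from RFC 5424 itself rather than from the text above.

```python
import re

# PRI is "<" PRIVAL ">", where PRIVAL is a 1 to 3 digit decimal number.
_PRI = re.compile(r"<([0-9]{1,3})>")


def decode_pri(pri):
    """Split a PRI string such as "<34>" into (facility, severity)."""
    match = _PRI.fullmatch(pri)
    if match is None:
        raise ValueError(f"not a PRI field: {pri!r}")
    prival = int(match.group(1))
    # Assumption taken from RFC 5424: facilities run from 0 to 23,
    # so the largest valid PRIVAL is 23 * 8 + 7 = 191.
    if prival > 191:
        raise ValueError(f"PRIVAL out of range: {prival}")
    # facility = PRIVAL // 8, severity = PRIVAL % 8
    return divmod(prival, 8)


def encode_pri(facility, severity):
    """Build a PRI string from a facility number and a severity number."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0..23 and severity 0..7")
    return f"<{facility * 8 + severity}>"
```

For instance, decode_pri("<34>") yields facility 4 and severity 2, because 34 = 4 * 8 + 2.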
VERSION
- format:
- this field denotes the version of the syslog protocol specification
TIMESTAMP
- format:
- TIME-SECFRAC enclosed by [], such as [TIME-SECFRAC], is equivalent to *1TIME-SECFRAC, which means it is an optional element.
- elements enclosed by () are treated as a single element.
(e.g. ("+" / "-") TIME-HOUR ":" TIME-MINUTE means ("+" or "-") followed by TIME-HOUR ":" TIME-MINUTE, not "+" or ("-" TIME-HOUR ":" TIME-MINUTE))
- examples
2003-10-11T22:14:15.003Z represents 11 October 2003 at 10:14:15 pm, 3 milliseconds into the next second, in UTC.
2003-08-24T05:14:15+08:00 represents 24 August 2003 at 05:14:15 am in NST, which is UTC+8.
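As a rough sketch, both example timestamps can be parsed with Python's standard library. The helper name parse_timestamp is hypothetical, and the sketch assumes Python 3.7 or later, where the %z directive accepts both "Z" and offsets written with a colon.

```python
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse a TIMESTAMP such as 2003-10-11T22:14:15.003Z into an aware datetime."""
    # TIME-SECFRAC is optional, so pick the format by looking for the dot.
    if "." in text:
        fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(text, fmt)


def to_utc(stamp):
    """Convert an aware datetime to UTC for comparison across originators."""
    return stamp.astimezone(timezone.utc)
```

Converting the second example with to_utc gives 21:14:15 on 23 August 2003, since UTC+8 is eight hours ahead of UTC.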
HOSTNAME
- format:
- this field contains the hostname and the domain name of the originator in FQDN (fully qualified domain name) format (specified in RFC1034)
- not all syslog applications are able to provide an FQDN, so other values may also be present in HOSTNAME. In order of preference, the value can be an FQDN, a static IP address, a hostname, a dynamic IP address, or NILVALUE.
APP-NAME
- format:
APP-NAME = NILVALUE / 1*48PRINTUSASCII
- this field identifies the device or application that originated the message.
- it can be used for filtering messages on a relay or collector.
PROCID
- format:
PROCID = NILVALUE / 1*128PRINTUSASCII
- this field is often used to provide the process name or process ID associated with a syslog system.
- PROCID can also be used to identify which messages belong to a group of messages.
(For example, an SMTP mail transfer agent might put its SMTP transaction ID into PROCID.)
MSGID
- format:
MSGID = NILVALUE / 1*32PRINTUSASCII
- this field identifies the type of message.
(For example, a firewall might use the MSGID "TCPIN" for incoming TCP traffic and the MSGID "TCPOUT" for outgoing TCP traffic.)
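The length limits in the APP-NAME, PROCID, and MSGID rules above could be checked along the following lines. This is a minimal sketch: the names MAX_LENGTH and is_valid_field are invented for illustration, and the definitions NILVALUE = "-" and PRINTUSASCII = %d33-126 are taken from RFC 5424 rather than from the notes above.

```python
# Maximum lengths copied from the ABNF lines above (1*48, 1*128, 1*32).
MAX_LENGTH = {"APP-NAME": 48, "PROCID": 128, "MSGID": 32}


def is_valid_field(name, value):
    """Check a header field against NILVALUE / 1*nPRINTUSASCII."""
    if value == "-":  # NILVALUE means "no value provided"
        return True
    if not 1 <= len(value) <= MAX_LENGTH[name]:
        return False
    # PRINTUSASCII covers codes 33..126: visible ASCII, so no spaces.
    return all(33 <= ord(ch) <= 126 for ch in value)
```

Spaces are rejected because SP is the separator between header fields.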
STRUCTURED-DATA
- format:
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
- STRUCTURED-DATA provides a mechanism to express information in a well defined, easily parseable and interpretable data format.
- The character set used in STRUCTURED-DATA must be seven-bit ASCII in an eight-bit field.
SD-ELEMENT
- format:
SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"
- STRUCTURED-DATA can contain zero, one, or multiple SD-ELEMENTs (zero elements is written as NILVALUE).
- An SD-ELEMENT consists of a name and name-value pairs. The name is referred to as "SD-ID". The name-value pairs are referred to as "SD-PARAM".
SD-ID
- format:
- identifies the type and purpose of the SD-ELEMENT
- case-sensitive
- there are two formats for SD-ID names:
- IANA-registered names (discussed below, in IANA-registered SD-IDs and PARAM-NAMEs)
- names in the format name@<private enterprise number>, which can be defined by anyone, e.g., example@32473.
Both the first type and the part preceding the at-sign in the second type of SD-ID must not contain an at-sign (@), an equal-sign (=), a closing bracket (]), a quote-character ("), whitespace, or control characters (ASCII code 127 and codes 32 or less).
- example (discussed below, in SD-PARAM)
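The two SD-ID formats and the forbidden characters listed above can be told apart with a short check such as the one below. It is only a sketch: classify_sd_id and its return labels are hypothetical, it cannot tell whether a name is actually registered with IANA, and accepting dot-separated sub-identifiers after the at-sign is an assumption based on how RFC 5424 describes private enterprise numbers.

```python
import re

# Characters that may not appear in the name part of an SD-ID.
FORBIDDEN = set('@=]"')


def _is_name(part):
    """True if part is non-empty visible ASCII without forbidden characters."""
    return bool(part) and all(
        33 <= ord(ch) <= 126 and ch not in FORBIDDEN for ch in part
    )


def classify_sd_id(sd_id):
    """Return "registered-style", "private", or None for an invalid SD-ID."""
    name, at_sign, number = sd_id.partition("@")
    if not _is_name(name):
        return None
    if not at_sign:
        # No at-sign: this has the shape of an IANA-registered name.
        return "registered-style"
    # Assumption: an enterprise number is digits, optionally followed by
    # dot-separated sub-identifiers (for example 32473.1.2).
    if re.fullmatch(r"[0-9]+(\.[0-9]+)*", number):
        return "private"
    return None
```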
SD-PARAM
- format:
- Each SD-PARAM consists of a name, referred to as PARAM-NAME, and a value, referred to as PARAM-VALUE.
- PARAM-NAME is case-sensitive.
- IANA controls all PARAM-NAMEs, with the exception of those in SD-IDs whose names contain an at-sign (registered names are discussed below, in IANA-registered SD-IDs and PARAM-NAMEs).
- An SD-PARAM MAY be repeated multiple times inside an SD-ELEMENT.
- example
[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
- It is a STRUCTURED-DATA with 2 SD-ELEMENTs, each enclosed in brackets [].
- In the first element, the SD-ID is exampleSDID@32473, and there are three SD-PARAMs with PARAM-NAMEs iut, eventSource, and eventID, and PARAM-VALUEs 3, Application, and 1011, respectively.
- In the second element, the SD-ID is examplePriority@32473, and there is one SD-PARAM with PARAM-NAME class and PARAM-VALUE high.
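To make the element and parameter structure concrete, here is a minimal regular-expression sketch that splits a STRUCTURED-DATA string into SD-IDs and SD-PARAMs. The function name parse_structured_data is hypothetical. The handling of the escapes \", \\ and \] inside PARAM-VALUE follows RFC 5424 and is not covered in the notes above. Parameters are returned as a list of pairs instead of a dict because an SD-PARAM may be repeated.

```python
import re

# One SD-ELEMENT: "[" SD-ID *(SP SD-PARAM) "]".
_ELEMENT = re.compile(r'\[([^\s=\]"]+)((?: [^\s=\]"]+="(?:[^"\\\]]|\\.)*")*)\]')
# One SD-PARAM: SP PARAM-NAME "=" quoted PARAM-VALUE.
_PARAM = re.compile(r' ([^\s=\]"]+)="((?:[^"\\\]]|\\.)*)"')


def parse_structured_data(text):
    """Return a list of (sd_id, [(param_name, param_value), ...]) tuples."""
    if text == "-":  # NILVALUE: no structured data at all
        return []
    elements = []
    pos = 0
    while pos < len(text):
        match = _ELEMENT.match(text, pos)
        if match is None:
            raise ValueError(f"malformed SD-ELEMENT at offset {pos}")
        sd_id, params_text = match.group(1), match.group(2)
        params = [
            # Undo the escapes \" \\ and \] inside the value.
            (name, re.sub(r'\\(["\\\]])', r"\1", value))
            for name, value in _PARAM.findall(params_text)
        ]
        elements.append((sd_id, params))
        pos = match.end()
    return elements
```

Applied to the two-element example described above, the result is one entry for exampleSDID@32473 with three pairs and one entry for examplePriority@32473 with a single pair.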
IANA-registered SD-IDs and PARAM-NAMEs
- registered SD-IDs and PARAM-NAMEs table (these PARAM-NAMEs should be used with the associated SD-ID)
SD-ID | PARAM-NAME
timeQuality | tzKnown, isSynced, syncAccuracy
origin | ip, enterpriseId, software, swVersion
meta | sequenceId, sysUpTime, language
- timeQuality
It is used by the originator to describe its system time when it is not properly synchronized with a reliable external time source.
- tzKnown
- It indicates whether the originator knows its time zone (1 for yes, 0 for no).
- isSynced
- It indicates whether the originator is synchronized to a reliable external time source (1 for yes, 0 for no).
- syncAccuracy
- It indicates how accurate the originator thinks its time synchronization is.
- It is an integer describing the maximum number of microseconds that its clock may be off between synchronization intervals.
- It can be used only when the value of isSynced is 1.
- example:
[timeQuality tzKnown="1" isSynced="1" syncAccuracy="60000000"]
The originator's system time is synchronized, but it may be up to 1 minute (60,000,000 microseconds) ahead of or behind the official time.
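The rule that syncAccuracy only applies when isSynced is 1, together with the microsecond unit, can be sketched as follows. The helper name max_clock_error is made up, and it assumes the parameters of the timeQuality element have already been collected into a dict of strings.

```python
from datetime import timedelta


def max_clock_error(params):
    """Return the syncAccuracy bound as a timedelta, or None if unusable."""
    # syncAccuracy is only meaningful when the originator reports isSynced="1".
    if params.get("isSynced") != "1" or "syncAccuracy" not in params:
        return None
    # The value is a count of microseconds.
    return timedelta(microseconds=int(params["syncAccuracy"]))
```

For the example element, the bound works out to 60 seconds.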
- origin
It may be used to indicate the origin of a syslog message.
- ip
- It denotes an IP address of the originator.
- It can be used to provide identifying information in addition to what is present in the HOSTNAME field.
- If an originator has multiple IP addresses, it may either list one of its IP addresses in the "ip" parameter or include multiple "ip" parameters in a single "origin" structured data element.
- example:
[origin ip="192.0.2.1" ip="192.0.2.129"]
- enterpriseId
- It must be a 'SMI Network Management Private Enterprise Code', registered and maintained by IANA, whose prefix is iso.org.dod.internet.private.enterprise (1.3.6.1.4.1).
- example:
The enterprise number of UNIX registered in PRIVATE ENTERPRISE NUMBERS is 4, so the full OID of UNIX is 1.3.6.1.4.1.4. In general, only the number that follows the prefix needs to be written in the parameter (here, 4).
- software
- It identifies the software that generated the message.
- If it is used, "enterpriseId" should also be specified.
- It is a string no longer than 48 characters.
- It is not the same as the APP-NAME header field. It MUST always contain the name of the generating software, whereas APP-NAME can contain anything else, including an operator-configured value.
- swVersion
- It identifies the version of the software that generated the message.
- If it is used, the "software" and "enterpriseId" parameters should be provided, too.
- It is a string no longer than 32 characters.
- meta
It is used to provide meta-information about the message. If the "meta" SD-ID is used, at least one parameter should be specified.
- sequenceId
- It tracks the order in which the originator submitted its messages.
- It is an integer that is set to 1 when the syslog function is started and increased with every message up to a maximum value of 2147483647. Once the maximum is reached, the next message starts again at 1.
- sysUpTime
- It is used to include the SNMP "sysUpTime" parameter in the message, and the value is represented as a decimal integer.
- The semantics of the SNMP "sysUpTime" are defined in RFC3418 as "The time (in hundredths of a second) since the network management portion of the system was last re-initialized."
- language
- It specifies the natural language used inside MSG (language identifiers are defined in BCP 47).
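The sequenceId wrap-around and the sysUpTime unit described above can be illustrated with two small helpers. Both function names are hypothetical, and the sketch assumes sysUpTime arrives as the decimal string found in the PARAM-VALUE.

```python
from datetime import timedelta

MAX_SEQUENCE_ID = 2147483647  # 2**31 - 1, the largest allowed sequenceId


def next_sequence_id(current):
    """Return the sequenceId for the next message, wrapping back to 1."""
    return 1 if current >= MAX_SEQUENCE_ID else current + 1


def uptime_from_sysuptime(value):
    """Convert a sysUpTime value (hundredths of a second) to a timedelta."""
    # One hundredth of a second is 10 milliseconds.
    return timedelta(milliseconds=int(value) * 10)
```

A sysUpTime of 360000, for example, corresponds to one hour.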
MSG
- format:
- MSG contains a free-form message that provides information about the event.
- The character set used in MSG should be Unicode, encoded using UTF-8. If the syslog application cannot encode the MSG in Unicode, it may use any other encoding.
- If a syslog application encodes MSG in UTF-8, the string must start with the Unicode byte order mark (BOM), which for UTF-8 is ABNF %xEF.BB.BF.
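A receiver can use the BOM rule above to decide how to decode MSG. The sketch below is one possible approach, not a prescribed one: decode_msg is a made-up name, and the latin-1 fallback for messages without a BOM is purely an assumption, since the encoding is unspecified in that case.

```python
import codecs


def decode_msg(raw, fallback="latin-1"):
    """Decode the MSG bytes, using the UTF-8 BOM as the signal for UTF-8."""
    if raw.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No BOM: the encoding is unknown, so fall back to an assumed one.
    return raw.decode(fallback)
```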
syslog messages examples
- with no STRUCTURED-DATA
PRI: <34>, Facility: 4, Severity: 2
VERSION: 1
TIMESTAMP: 2003-10-11T22:14:15.003Z
HOSTNAME: mymachine.example.com
APP-NAME: su
PROCID: unknown, displayed as NILVALUE -
MSGID: ID47
STRUCTURED-DATA: none, displayed as NILVALUE -
MSG: 'su root' failed for lonvick..., encoded in UTF-8
- with STRUCTURED-DATA
- with no MSG
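Putting the pieces together, the field breakdown of the first example (with no STRUCTURED-DATA) can be reproduced by a small splitter. This is a sketch under stated assumptions, not a full RFC 5424 parser: split_message and _structured_data_end are hypothetical names, no field validation is done, the input is assumed to be an already decoded string, and the overall layout HEADER SP STRUCTURED-DATA [SP MSG] is taken from RFC 5424.

```python
def _structured_data_end(text):
    """Return the index just past the STRUCTURED-DATA at the start of text."""
    if text[:1] == "-":  # NILVALUE
        return 1
    pos = 0
    while text[pos:pos + 1] == "[":  # one SD-ELEMENT per iteration
        in_value = False
        pos += 1
        while True:
            ch = text[pos]  # raises IndexError if the element never closes
            if in_value and ch == "\\":
                pos += 1  # skip the escaped character
            elif ch == '"':
                in_value = not in_value
            elif ch == "]" and not in_value:
                break
            pos += 1
        pos += 1  # step past the closing "]"
    return pos


def split_message(line):
    """Split one syslog line into its header fields, STRUCTURED-DATA and MSG."""
    pri_end = line.index(">")
    facility, severity = divmod(int(line[1:pri_end]), 8)
    # Six SP-separated fields follow PRI; the rest is STRUCTURED-DATA [SP MSG].
    fields = line[pri_end + 1:].split(" ", 6)
    if len(fields) != 7:
        raise ValueError("incomplete HEADER")
    version, timestamp, hostname, app_name, procid, msgid, rest = fields
    end = _structured_data_end(rest)
    if end == 0:
        raise ValueError("missing STRUCTURED-DATA")
    return {
        "facility": facility, "severity": severity, "version": version,
        "timestamp": timestamp, "hostname": hostname, "app_name": app_name,
        "procid": procid, "msgid": msgid,
        "structured_data": rest[:end],
        # MSG is optional; when present it is separated by a single SP.
        "msg": rest[end + 1:] if len(rest) > end else None,
    }
```

A possible call, with the MSG text kept in the truncated form used in the notes above and the BOM left out for readability:

```python
fields = split_message(
    "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
    "'su root' failed for lonvick..."
)
```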