# RFC5424 syslog Message Format introduction

A brief introduction to the [RFC5424](https://tools.ietf.org/html/rfc5424) syslog message format.

## outline

- definition
- format overview
- header
    - PRI
    - version
    - timestamp
    - hostname
    - app-name
    - procid
    - msgid
- STRUCTURED-DATA
    - SD-ELEMENT
    - SD-ID
    - SD-PARAM
    - IANA-registered SD-IDs and PARAM-NAMEs
- MSG
- syslog messages examples

## definition

- An "originator" generates syslog content to be carried in a message.
- A "collector" gathers syslog content for further analysis.
- A "relay" forwards messages, accepting messages from originators or other relays and sending them to collectors or other relays.

## format overview

The syslog message has the following ABNF [[RFC5234]](https://tools.ietf.org/html/rfc5234#section-3.6) definition:

```
SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]

HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                  SP APP-NAME SP PROCID SP MSGID
PRI             = "<" PRIVAL ">"
PRIVAL          = 1*3DIGIT ; range 0 .. 191
VERSION         = NONZERO-DIGIT 0*2DIGIT
HOSTNAME        = NILVALUE / 1*255PRINTUSASCII

APP-NAME        = NILVALUE / 1*48PRINTUSASCII
PROCID          = NILVALUE / 1*128PRINTUSASCII
MSGID           = NILVALUE / 1*32PRINTUSASCII

TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR   = 4DIGIT
DATE-MONTH      = 2DIGIT  ; 01-12
DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                          ; month/year
FULL-TIME       = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                  [TIME-SECFRAC]
TIME-HOUR       = 2DIGIT  ; 00-23
TIME-MINUTE     = 2DIGIT  ; 00-59
TIME-SECOND     = 2DIGIT  ; 00-59
TIME-SECFRAC    = "." 1*6DIGIT
TIME-OFFSET     = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE

STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID           = SD-NAME
PARAM-NAME      = SD-NAME
PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                               ; ']' MUST be escaped.
SD-NAME         = 1*32PRINTUSASCII
                  ; except '=', SP, ']', %d34 (")

MSG             = MSG-ANY / MSG-UTF8
MSG-ANY         = *OCTET ; not starting with BOM
MSG-UTF8        = BOM UTF-8-STRING
BOM             = %xEF.BB.BF

UTF-8-STRING    = *OCTET ; UTF-8 string as specified
                  ; in RFC 3629

OCTET           = %d00-255
SP              = %d32
PRINTUSASCII    = %d33-126
NONZERO-DIGIT   = %d49-57
DIGIT           = %d48 / NONZERO-DIGIT
NILVALUE        = "-"
```

## HEADER

- format: `HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME SP APP-NAME SP PROCID SP MSGID`
- The character set used in the HEADER must be seven-bit ASCII in an eight-bit field.

#### PRI

- format
    ```
    PRI             = "<" PRIVAL ">"
    PRIVAL          = 1*3DIGIT ; range 0 .. 191
    DIGIT           = %d48 / NONZERO-DIGIT
    NONZERO-DIGIT   = %d49-57
    ```
    - `1*3` means "from 1 to 3", so `1*3DIGIT` allows a 1-digit, 2-digit, or 3-digit number. If the number on the left side of `*` is not given, the default value is 0; if the number on the right side of `*` is not given, the default value is infinity.
    - the content after `;` in a line is a comment
    - `/` means "or"
    - `%d49-57` is equivalent to `%d49 / %d50 / ... / %d57`, which is also equivalent to `"1" / "2" / ... / "9"`
- `PRIVAL = facility number * 8 + severity number`, therefore:
    - `severity number = PRIVAL % 8` (possible values are 0 to 7)
    - `facility number = PRIVAL / 8`, using integer division (possible values are 0 to 23)
    - for example, `<165>` gives facility number 20 and severity number 5, because 165 = 20 * 8 + 5
- The facility number specifies the subsystem that produced the message, and the severity number defines the severity of the message.
- facility table:
    ```
    Numerical Code    Facility
           0          kernel messages
           1          user-level messages
           2          mail system
           3          system daemons
           4          security/authorization messages
           5          messages generated internally by syslogd
           6          line printer subsystem
           7          network news subsystem
           8          UUCP subsystem
           9          clock daemon
          10          security/authorization messages
          11          FTP daemon
          12          NTP subsystem
          13          log audit
          14          log alert
          15          clock daemon (note 2)
          16          local use 0  (local0)
          17          local use 1  (local1)
          18          local use 2  (local2)
          19          local use 3  (local3)
          20          local use 4  (local4)
          21          local use 5  (local5)
          22          local use 6  (local6)
          23          local use 7  (local7)
    ```
- severity table:
    ```
    Numerical Code    Severity
           0          Emergency: system is unusable
           1          Alert: action must be taken immediately
           2          Critical: critical conditions
           3          Error: error conditions
           4          Warning: warning conditions
           5          Notice: normal but significant condition
           6          Informational: informational messages
           7          Debug: debug-level messages
    ```

#### VERSION

- format:
    ```
    VERSION         = NONZERO-DIGIT 0*2DIGIT
    ```
- this field denotes the version of the syslog protocol specification; messages that follow RFC5424 use the value `1`

#### TIMESTAMP

- format:
    ```
    TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
    FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
    DATE-FULLYEAR   = 4DIGIT
    DATE-MONTH      = 2DIGIT  ; 01-12
    DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                              ; month/year
    FULL-TIME       = PARTIAL-TIME TIME-OFFSET
    PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                      [TIME-SECFRAC]
    TIME-HOUR       = 2DIGIT  ; 00-23
    TIME-MINUTE     = 2DIGIT  ; 00-59
    TIME-SECOND     = 2DIGIT  ; 00-59
    TIME-SECFRAC    = "." 1*6DIGIT
    TIME-OFFSET     = "Z" / TIME-NUMOFFSET
    TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE
    NILVALUE        = "-"
    ```
    - an element enclosed by `[]`, such as `[TIME-SECFRAC]`, is equivalent to `*1TIME-SECFRAC`, which means it is an optional element.
    - elements enclosed by `()` are treated as a single element. (e.g. `("+" / "-") TIME-HOUR ":" TIME-MINUTE` means `"+" or "-" ...`, but not `"+" or "- ..."`)
- examples
    - `2003-10-11T22:14:15.003Z` represents 11 October 2003 at 10:14:15 pm, 3 milliseconds into the next second, in UTC.
    - `2003-08-24T05:14:15+08:00` represents 24 August 2003 at 05:14:15 am in NST, which is UTC+8.

#### HOSTNAME

- format:
    ```
    HOSTNAME        = NILVALUE / 1*255PRINTUSASCII
    PRINTUSASCII    = %d33-126
    ```
- this field contains the hostname and the domain name of the originator in FQDN (fully qualified domain name) format (specified in [RFC1034](https://tools.ietf.org/html/rfc1034))
- not all syslog applications are able to provide an FQDN, so other values may also be present in HOSTNAME. In order of preference, the value should be an FQDN, a static IP address, a hostname, a dynamic IP address, or the NILVALUE.

#### APP-NAME

- format: `APP-NAME = NILVALUE / 1*48PRINTUSASCII`
- this field identifies the device or application that originated the message.
- it can be used for filtering messages on a relay or collector.

#### PROCID

- format: `PROCID = NILVALUE / 1*128PRINTUSASCII`
- this field is often used to provide the process name or process ID associated with a syslog system.
- PROCID can also be used to identify which messages belong to a group of messages. (For example, an SMTP mail transfer agent might put its SMTP transaction ID into PROCID.)

#### MSGID

- format: `MSGID = NILVALUE / 1*32PRINTUSASCII`
- this field identifies the type of message. (For example, a firewall might use the MSGID "TCPIN" for incoming TCP traffic and the MSGID "TCPOUT" for outgoing TCP traffic.)

## STRUCTURED-DATA

- format: `STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT`
- STRUCTURED-DATA provides a mechanism to express information in a well-defined, easily parseable and interpretable data format.
- The character set used in STRUCTURED-DATA must be seven-bit ASCII in an eight-bit field. The exception is PARAM-VALUE, which must be encoded in UTF-8.

#### SD-ELEMENT

- format: `SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"`
- STRUCTURED-DATA can contain zero, one, or multiple SD-ELEMENTs.
- An SD-ELEMENT consists of a name and name-value pairs. The name is referred to as "SD-ID". The name-value pairs are referred to as "SD-PARAM".

#### SD-ID

- format:
    ```
    SD-ID           = SD-NAME
    SD-NAME         = 1*32PRINTUSASCII
                      ; except '=', SP, ']', %d34 (")
    SP              = %d32
    ```
- it identifies the type and purpose of the SD-ELEMENT
- it is case-sensitive
- there are two formats for SD-ID names:
    1. IANA-registered names (discussed below, in [IANA-registered SD-IDs and PARAM-NAMEs](#IANA-registered-SD-IDs-and-PARAM-NAMEs))
    2. names in the format `name@<private enterprise number>`, which can be defined by anyone, e.g., `example@32473`.

    Both the first type and the part preceding the at-sign in the second type of SD-ID must not contain an at-sign (@), an equal-sign (=), a closing bracket (]), a quote-character ("), whitespace, or control characters (ASCII code 127 and codes 32 or less).
- example (discussed below, in [SD-PARAM](#SD-PARAM))

#### SD-PARAM

- format:
    ```
    SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
    PARAM-NAME      = SD-NAME
    PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and ']' MUST be escaped.
    ```
- Each SD-PARAM consists of a name, referred to as PARAM-NAME, and a value, referred to as PARAM-VALUE.
- PARAM-NAME is case-sensitive.
- Inside PARAM-VALUE, the characters `"`, `\` and `]` are escaped by preceding them with a backslash (`\"`, `\\`, `\]`).
- IANA controls all PARAM-NAMEs, with the exception of those in SD-IDs whose names contain an at-sign. (registered names are discussed below, in [IANA-registered SD-IDs and PARAM-NAMEs](#IANA-registered-SD-IDs-and-PARAM-NAMEs))
- An SD-PARAM MAY be repeated multiple times inside an SD-ELEMENT.
- example
    ```
    [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
    ```
    - It is a STRUCTURED-DATA with 2 SD-ELEMENTs, each enclosed in brackets `[]`.
    - In the first element, the SD-ID is `exampleSDID@32473`, and there are three SD-PARAMs with the PARAM-NAMEs `iut`, `eventSource`, `eventID` and the PARAM-VALUEs `3`, `Application`, `1011` respectively.
    - In the second element, the SD-ID is `examplePriority@32473`, and there is one SD-PARAM with the PARAM-NAME `class` and the PARAM-VALUE `high`.

#### IANA-registered SD-IDs and PARAM-NAMEs

- registered SD-IDs and PARAM-NAMEs table (these PARAM-NAMEs should be used with the associated SD-ID)

    | SD-ID | PARAM-NAME |
    | -------- | -------- |
    | timeQuality | tzKnown, isSynced, syncAccuracy |
    | origin | ip, enterpriseId, software, swVersion |
    | meta | sequenceId, sysUpTime, language |

- timeQuality

    It is used by the originator to describe its notion of system time, typically when it is not properly synchronized with a reliable external time source.
    - tzKnown
        - It indicates whether the originator knows its time zone (`1` for yes, `0` for no).
    - isSynced
        - It indicates whether the originator is synchronized to a reliable external time source (`1` for yes, `0` for no).
    - syncAccuracy
        - It indicates how accurate the originator thinks its time synchronization is.
        - It is an integer describing the maximum number of microseconds that its clock may be off between synchronization intervals.
        - It can be used only when the value of isSynced is `1`.
    - example: `[timeQuality tzKnown="1" isSynced="1" syncAccuracy="60000000"]`

        The originator's system time is synchronized, but it may be up to 1 minute (60000000 microseconds) earlier or later than the official time.
- origin

    It may be used to indicate the origin of a syslog message.
    - ip
        - It denotes an IP address of the originator.
        - It can be used to provide identifying information in addition to what is present in the HOSTNAME field.
        - If an originator has multiple IP addresses, it may either list one of its IP addresses in the "ip" parameter or include multiple "ip" parameters in a single "origin" structured data element.
        - example: `[origin ip="192.0.2.1" ip="192.0.2.129"]`
    - enterpriseId
        - It must be a 'SMI Network Management Private Enterprise Code', registered and maintained by IANA, whose prefix is iso.org.dod.internet.private.enterprise (1.3.6.1.4.1).
        - In general only the number that follows the prefix is written as the value; sub-identifiers, if used, are appended as decimal numbers separated by periods.
        - example: The enterprise number of UNIX registered in [PRIVATE ENTERPRISE NUMBERS](https://www.iana.org/assignments/enterprise-numbers/enterprise-numbers) is 4, so the enterpriseId of UNIX would be `4`, which corresponds to the OID `1.3.6.1.4.1.4`.
    - software
        - It identifies the software that generated the message.
        - If it is used, "enterpriseId" should also be specified.
        - It is a string no longer than 48 characters.
        - It is not the same as the APP-NAME header field. It MUST always contain the name of the generating software, whereas APP-NAME can contain anything else, including an operator-configured value.
    - swVersion
        - It identifies the version of the software that generated the message.
        - If it is used, the "software" and "enterpriseId" parameters should be provided, too.
        - It is a string no longer than 32 characters.
- meta

    It is used to provide meta-information about the message. If the "meta" SD-ID is used, at least one parameter should be specified.
    - sequenceId
        - It tracks the sequence in which the originator submits messages.
        - It is an integer that is set to 1 when the syslog function is started and increased with every message, up to a maximum value of 2147483647. Once the maximum is reached, the next message is sent with a sequenceId of 1 again.
    - sysUpTime
        - It is used to include the SNMP "sysUpTime" parameter in the message, and the value is represented as a decimal integer.
        - The semantics of the SNMP "sysUpTime" are defined in [RFC3418](https://tools.ietf.org/html/rfc3418) as "The time (in hundredths of a second) since the network management portion of the system was last re-initialized."
    - language
        - It specifies the natural language used inside MSG (language identifiers are defined in [BCP 47](https://tools.ietf.org/html/bcp47)).
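
The SD-ELEMENT, SD-ID and SD-PARAM rules described in this section can be sketched as a small parser. This is only an illustrative sketch, not a reference implementation: the function name `parse_structured_data` is hypothetical, the input is assumed to be an already decoded string holding just the STRUCTURED-DATA field, and the 32-character limit and the forbidden characters of SD-NAME are not checked.

```python
def parse_structured_data(sd):
    """Parse STRUCTURED-DATA into [(sd_id, [(param_name, param_value), ...]), ...].

    Hypothetical helper for illustration; it is lenient and does not
    enforce the SD-NAME length or character restrictions.
    """
    if sd == "-":  # NILVALUE: no structured data at all
        return []
    elements = []
    i = 0
    while i < len(sd):
        if sd[i] != "[":
            raise ValueError(f"expected '[' at offset {i}")
        i += 1
        # SD-ID runs up to the first SP or ']'
        start = i
        while i < len(sd) and sd[i] not in " ]":
            i += 1
        sd_id = sd[start:i]
        if not sd_id:
            raise ValueError("empty SD-ID")
        params = []
        # Each SD-PARAM is introduced by a single SP
        while i < len(sd) and sd[i] == " ":
            i += 1
            start = i
            while i < len(sd) and sd[i] != "=":
                i += 1
            name = sd[start:i]
            if sd[i:i + 2] != '="':
                raise ValueError(f"expected '=\"' after PARAM-NAME {name!r}")
            i += 2
            chars = []
            while i < len(sd) and sd[i] != '"':
                # A backslash before '"', '\' or ']' is an escape: drop it
                # and keep the next character. Any other backslash is kept.
                if sd[i] == "\\" and sd[i + 1:i + 2] in ('"', "\\", "]"):
                    i += 1
                chars.append(sd[i])
                i += 1
            if i >= len(sd):
                raise ValueError("unterminated PARAM-VALUE")
            i += 1  # skip the closing quote
            params.append((name, "".join(chars)))
        if i >= len(sd) or sd[i] != "]":
            raise ValueError("expected ']' at end of SD-ELEMENT")
        i += 1
        elements.append((sd_id, params))
    return elements


example = '[origin ip="192.0.2.1" ip="192.0.2.129"]'
print(parse_structured_data(example))
```

Parameters are kept as a list of pairs rather than a dictionary because an SD-PARAM may be repeated inside one SD-ELEMENT, as in the `origin` example with two `ip` parameters.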

## MSG

- format:
    ```
    MSG             = MSG-ANY / MSG-UTF8
    MSG-ANY         = *OCTET ; not starting with BOM
    MSG-UTF8        = BOM UTF-8-STRING
    BOM             = %xEF.BB.BF
    UTF-8-STRING    = *OCTET ; UTF-8 string as specified in RFC 3629
    OCTET           = %d00-255
    SP              = %d32
    ```
- MSG contains a free-form message that provides information about the event.
- The character set used in MSG should be Unicode, encoded using UTF-8. If the syslog application cannot encode the MSG in Unicode, it may use any other encoding.
- If a syslog application encodes MSG in UTF-8, the string must start with the Unicode byte order mark (BOM), which for UTF-8 is ABNF `%xEF.BB.BF`.

## syslog messages examples

In the examples below, `BOM` stands for the three-byte UTF-8 byte order mark; it is not the literal text "BOM".

1. with no STRUCTURED-DATA
    ```
    <34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8
    ```
    - PRI: `<34>`, Facility: `4`, Severity: `2`
    - VERSION: `1`
    - TIMESTAMP: `2003-10-11T22:14:15.003Z`
    - HOSTNAME: `mymachine.example.com`
    - APP-NAME: `su`
    - PROCID: unknown, displayed as NILVALUE `-`
    - MSGID: `ID47`
    - STRUCTURED-DATA: none, displayed as NILVALUE `-`
    - MSG: `'su root' failed for lonvick...`, encoded in UTF-8
2. with STRUCTURED-DATA
    ```
    <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] BOMAn application event log entry...
    ```
3. with no MSG
    ```
    <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
    ```
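
The field-by-field breakdown of the examples above can be reproduced with a short sketch that splits a message into its HEADER fields, decodes PRI into facility and severity, and separates STRUCTURED-DATA from MSG. The names `parse_timestamp`, `split_sd_and_msg` and `parse_syslog` are hypothetical, the input is assumed to be a single message already decoded to text (so the BOM appears as U+FEFF), and field lengths and character sets are not validated.

```python
import re
from datetime import datetime, timedelta, timezone

TS_RE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})"
    r"(?:\.([0-9]{1,6}))?(Z|[+-][0-9]{2}:[0-9]{2})"
)


def parse_timestamp(text):
    """Return a timezone-aware datetime, or None for NILVALUE."""
    if text == "-":
        return None
    m = TS_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"bad TIMESTAMP: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    micro = int((m.group(7) or "").ljust(6, "0"))  # pad TIME-SECFRAC to microseconds
    if m.group(8) == "Z":
        tz = timezone.utc
    else:
        sign = -1 if m.group(8)[0] == "-" else 1
        tz = timezone(sign * timedelta(hours=int(m.group(8)[1:3]),
                                       minutes=int(m.group(8)[4:6])))
    return datetime(year, month, day, hour, minute, second, micro, tz)


def split_sd_and_msg(rest):
    """Split 'STRUCTURED-DATA [SP MSG]' into (sd, msg); msg is None if absent."""
    if rest.startswith("-"):
        sd, tail = "-", rest[1:]
    else:
        in_value = False
        i = 0
        while i < len(rest):
            c = rest[i]
            if in_value and c == "\\":
                i += 1  # skip the character following a backslash
            elif c == '"':
                in_value = not in_value
            elif c == "]" and not in_value and rest[i + 1:i + 2] != "[":
                break  # last SD-ELEMENT closed
            i += 1
        else:
            raise ValueError("unterminated STRUCTURED-DATA")
        sd, tail = rest[:i + 1], rest[i + 1:]
    if tail and not tail.startswith(" "):
        raise ValueError("expected SP before MSG")
    return sd, (tail[1:] if tail else None)


def parse_syslog(line):
    # The six HEADER fields never contain SP, so splitting on SP is safe.
    pri_version, timestamp, hostname, app_name, procid, msgid, rest = line.split(" ", 6)
    m = re.fullmatch(r"<([0-9]{1,3})>([1-9][0-9]{0,2})", pri_version)
    if m is None or int(m.group(1)) > 191:
        raise ValueError(f"bad PRI/VERSION: {pri_version!r}")
    facility, severity = divmod(int(m.group(1)), 8)  # PRIVAL = facility * 8 + severity
    sd, msg = split_sd_and_msg(rest)
    if msg is not None and msg.startswith("\ufeff"):
        msg = msg[1:]  # drop the BOM that marks a UTF-8 MSG

    def nil(value):
        return None if value == "-" else value

    return {
        "facility": facility, "severity": severity, "version": int(m.group(2)),
        "timestamp": parse_timestamp(timestamp), "hostname": nil(hostname),
        "app_name": nil(app_name), "procid": nil(procid), "msgid": nil(msgid),
        "structured_data": sd, "msg": msg,
    }


line = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
        "\ufeff'su root' failed for lonvick on /dev/pts/8")
print(parse_syslog(line))
```

The sketch leaves STRUCTURED-DATA as a raw string; a real collector would parse it further into SD-ELEMENTs and would usually work on the raw bytes so that a MSG in a non-UTF-8 encoding can still be handled.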