In a previous post I plugged Docker’s logs into an ELK system using the brand new GELF plugin. The setup was simple and worked great, except for a couple of issues. This article shows how to improve the basic setup to better cope with real-life logs, in my case the ones created by Spring Boot applications.
While this post focuses mainly on Spring Boot logs, the approach can easily be adapted to other logging systems, in Java or any other language, at the cost of changing the regular expressions.
Log Issues
Docker container logging works by capturing the container console output (stdout & stderr) and sending each line to the log driver. This works well, as many applications print one line per logging event, at least most of the time. From here, there are a couple of items that could be improved.
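As a quick reminder from the previous post, the containers are started with the GELF log driver pointing at Logstash, along these lines (the address and image name below are just placeholders):
docker run --log-driver=gelf --log-opt gelf-address=udp://logstash-host:12201 my-spring-boot-app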
Log Message Semantics
The log messages often carry some semantics. For instance in Spring Boot the default log format is:
TIME_STAMP LOG_LEVEL PID --- [THREAD_NAME] LOGGER_NAME : LOG_MESSAGE
For example:
2014-03-05 10:57:51.112 INFO 45469 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet Engine: Apache Tomcat/7.0.52
2014-03-05 10:57:51.253 INFO 45469 --- [ost-startStop-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2014-03-05 10:57:51.253 INFO 45469 --- [ost-startStop-1] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 1358 ms
2014-03-05 10:57:51.698 INFO 45469 --- [ost-startStop-1] o.s.b.c.e.ServletRegistrationBean : Mapping servlet: 'dispatcherServlet' to [/]
2014-03-05 10:57:51.702 INFO 45469 --- [ost-startStop-1] o.s.b.c.embedded.FilterRegistrationBean : Mapping filter: 'hiddenHttpMethodFilter' to: [/*]
With the basic setup, each line is stored as a single string in the short_message field. While the ELK stack features full-text search capabilities, it would be much better to have this information in dedicated fields.
Multi-line Log Messages
In some cases, a logging event translates into several lines, for instance when printing a stack trace. When this happens, ELK stores each line as a separate event. Here again, it would be better to regroup those lines into a single ELK event.
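For instance, a failed startup could produce something like the following (made-up) trace, where only the first line matches the Spring Boot log format:
2014-03-05 10:57:52.273 ERROR 45469 --- [ main] o.s.boot.SpringApplication : Application startup failed
java.lang.IllegalStateException: Something went wrong
    at demo.Application.main(Application.java:42)
    ...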
Improved Setup
Starting from the previous article, all changes will be made in the Logstash configuration, more precisely by adding a filter section:
input {
  [...]
}
filter {
}
output {
  [...]
}
Merging Multiline Logs
Logstash provides a mechanism to merge several log events through the multiline filter. This filter works by matching one of the event’s fields against a regular expression.
For Spring Boot, all log messages start with a timestamp in ISO format; this will be the indicator of a new log event. In addition, we need to configure Logstash to look for the log message in the short_message field, while the default configuration looks in the message field.
Multi-line logs are merged back together by adding the following lines to the filter section:
multiline {
  pattern => "^%{TIMESTAMP_ISO8601}"
  negate => true
  what => "previous"
  source => "short_message"
}
Filter vs input
The multiline feature is available either as part of the gelf input or as a separate filter. Using the input version would be better, as the filter prevents us from using multithreaded filtering. However, the input multiline can only use the message field, which is always filled with an empty string by the gelf input. Follow the status of PR #18, which should fix this issue in a forthcoming release.
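In the meantime, this means keeping Logstash on a single filter worker, for instance like this (the exact flag spelling depends on the Logstash version):
bin/logstash -f logstash.conf -w 1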
Semantics
Extracting data from the logs is performed using Logstash’s grok filter. From the log messages we’re going to extract:
- the log level
- the PID
- the thread name
- the logger name
- the message (first line)
- the stack trace (lines after the first)
We won’t use the timestamp, as the one provided by Docker’s logging driver is already reliable enough. On the contrary, we will use the log level from the messages rather than the one provided by the log driver, as the driver only distinguishes between messages printed on stdout (log level 6) and those printed on stderr (log level 3).
Grok’s configuration is pretty simple, well, at least if you don’t look at the regular-expression-ish part of it. Grok uses a pattern system based on regular expressions. In addition to plain regular expressions, Grok features named patterns that we will use to extract parts of the log messages. The basic syntax is %{PATTERN_NAME:field_name}. If you’re interested in Grok’s patterns, you should have a look at this nice tool.
The filter configuration for grok will be:
grok {
  match => { "short_message" => "^%{TIMESTAMP_ISO8601}\s+%{LOGLEVEL:log_level}\s+%{NUMBER:pid}\s+---\s+\[\s*%{USERNAME:thread}\s*\]\s+%{DATA:class}\s*:\s*%{DATA:log_message}(?:\n%{GREEDYDATA:stack})?\n*$" }
}
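Applied to the first sample line above, this pattern should yield fields along these lines:
log_level   => "INFO"
pid         => "45469"
thread      => "main"
class       => "org.apache.catalina.core.StandardEngine"
log_message => "Starting Servlet Engine: Apache Tomcat/7.0.52"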
Mutations
At this point, with the two filters above, multi-line messages are merged into single logging events and fields are extracted from the messages. This section does some cleanup by replacing the message field with the extracted log_message, and the numeric level field with the extracted log_level. This is done using the mutate filter with the following configuration:
mutate {
  replace => { "message" => "%{log_message}" }
  replace => { "level" => "%{log_level}" }
  remove_field => [ "log_level", "log_message" ]
}
The real configuration also has to take care of the cases where the grok pattern isn’t matched. It is available in this gist.
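For the record, one way to do it is to rely on the _grokparsefailure tag that grok adds to events it fails to parse, and to skip the mutate for those (a sketch; the gist may handle it slightly differently):
if "_grokparsefailure" not in [tags] {
  mutate {
    replace => { "message" => "%{log_message}" }
    replace => { "level" => "%{log_level}" }
    remove_field => [ "log_level", "log_message" ]
  }
}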
Afterthoughts
I installed this system on the test platforms at my client’s a couple of weeks ago and, to put it shortly, it rocks: no need to give testers SSH access just so they can read the logs, log messages from all servers available in a single place, permalinks to log events, and of course the power of Elasticsearch behind it all.
The multiline filter has worked flawlessly since the beginning. The grok filter, however, was a little more troublesome: while working 99% of the time, some log messages gave it a hard time. The current version should work fine, but it might still need some tweaks at some point.