Kibana and Elasticsearch over HAProxy logs

One of the most interesting things about open source is the ability to deploy software that can compete with more established or “enterprise”-grade appliances.

One such piece of software is HAProxy which, with some tuning and documentation reading, can easily sustain 20k-30k connections per second on a 1 GB dual-core virtual machine.

Of course, during the testing phase and in production you will need to see the HAProxy logs, the error logs in particular but also the access logs. Since the amount of data is rather large, you'd prefer to have another server handle the disk writes and let HAProxy deal only with load balancing.

This is easy to do with HAProxy thanks to its built-in ability to use a remote logging daemon instead of the local one.

This is achieved in the configuration file by adding, under the global section:

log "syslogip" local2
log-send-hostname "haproxy hostname"

This will forward every HAProxy log line to facility local2 on the remote syslog server.
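To give an idea of what actually arrives on the remote side, here is a minimal Python sketch that pulls a few fields out of an HTTP-mode log line. The sample line and the regex are illustrative only (addresses and names are made up, and a real parser would cover every field described in the HAProxy documentation):

```python
import re

# A sample HTTP-mode log line, loosely following the format documented
# in the HAProxy manual (values here are made up for illustration):
line = ('haproxy[14389]: 10.0.1.2:33317 [06/Feb/2016:12:14:14.655] http-in '
        'static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 '
        '"GET /index.html HTTP/1.1"')

# Minimal pattern extracting a few useful fields: client address,
# accept date, frontend, backend/server, timings, status and bytes.
pattern = re.compile(
    r'(?P<client>\S+) \[(?P<date>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/]+)/(?P<server>\S+) (?P<timings>[\d/+-]+) '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)
m = pattern.search(line)
print(m.group('status'), m.group('backend'), m.group('client'))
```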

After you have done this simple step, one question arises: how do you parse those logs and get some useful statistics?

What “database” should I use for writing all that data?

This is where tools like Elasticsearch come in.

There is an rsyslog module, omelasticsearch, that makes rsyslog talk to Elasticsearch, so it is all pretty trivial to install and configure.
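On most distributions the module ships as a separate package; the package names below are what Debian/Ubuntu and RHEL/CentOS typically use, but they may differ on your system:

```shell
# Debian/Ubuntu (package name may vary by release):
apt-get install rsyslog-elasticsearch

# RHEL/CentOS equivalent (from EPEL):
# yum install rsyslog-elasticsearch
```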

Start from a template like this one to configure how rsyslog sends your data to Elasticsearch (this one assumes Elasticsearch is running on localhost):

module(load="omelasticsearch") # for outputting to Elasticsearch

# this is for index names to be like: logstash-YYYY.MM.DD
template(name="logstash-index" type="list") {
    constant(value="logstash-")
    property(name="timereported" dateFormat="rfc3339" position.from="1" position.to="4")
    constant(value=".")
    property(name="timereported" dateFormat="rfc3339" position.from="6" position.to="7")
    constant(value=".")
    property(name="timereported" dateFormat="rfc3339" position.from="9" position.to="10")
}

# this is for formatting our syslog in JSON with @timestamp
template(name="plain-syslog" type="list") {
    constant(value="{")
    constant(value="\"@timestamp\":\"")     property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"host\":\"")        property(name="hostname")
    constant(value="\",\"severity\":\"")    property(name="syslogseverity-text")
    constant(value="\",\"facility\":\"")    property(name="syslogfacility-text")
    constant(value="\",\"tag\":\"")         property(name="syslogtag" format="json")
    constant(value="\",\"message\":\"")     property(name="msg" format="json")
    constant(value="\"}")
}

# this is where we actually send the logs to Elasticsearch (localhost:9200 by default)
local2.* action(type="omelasticsearch" template="plain-syslog" searchIndex="logstash-index" dynSearchIndex="on")

# this is to avoid local2 messages also being processed by other log rules
& ~

Why do we use this template? Because we want to adhere to the Logstash format, in case we later want to play with that tool or use one of its output plugins.
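To see what the template produces, here is a hypothetical document as the plain-syslog template would render it (field values made up for illustration). Note how format="json" on the tag and msg properties matters: it escapes the quotes inside the request line so the resulting document stays valid JSON:

```python
import json

# Hypothetical document rendered by the "plain-syslog" template
# (values are made up; the \" sequences come from format="json"):
doc = ('{"@timestamp":"2016-03-01T12:00:00+01:00",'
       '"host":"lb1",'
       '"severity":"info",'
       '"facility":"local2",'
       '"tag":"haproxy[14389]:",'
       '"message":" 10.0.1.2:33317 [01/Mar/2016:12:00:00.655] http-in '
       'static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 '
       '\\"GET /index.html HTTP/1.1\\""}')

event = json.loads(doc)
print(sorted(event.keys()))
```

This is exactly the shape of event Logstash itself would ship, which is why Kibana and the Logstash output plugins can consume it directly.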

At this point you can use Kibana to browse and explore those logs.