Fast ETL with Golang, MongoDB and Elasticsearch

June 30, 2015

I recently had the chance to write an ETL job that imports data from MongoDB into Elasticsearch, so that it can be charted in Kibana.
I must say I found Golang to be a very handy tool for this: blazing-fast execution and quite easy to write.

The ETL had to retrieve records from a time series stored across multiple MongoDB collections.
The data is transformed and then loaded into Elasticsearch with a bulk insert.
The ETL must run every 4 seconds and the amount of data retrieved can be very large, so I made it import in chunks, with the chunk size configurable through a command-line parameter.
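
The original code is not shown in this post, so here is only a minimal sketch of how chunked bulk loading could look, using nothing but the standard library. The `doc` type, the `chunksize` flag name, its default of 500 and the `metrics` index name are all assumptions made for illustration. The body follows the Elasticsearch bulk format (one action line, then one source line, each newline-terminated); older Elasticsearch versions may also expect a `_type` in the action line or in the URL.

```go
package main

import (
	"bytes"
	"encoding/json"
	"flag"
	"fmt"
)

// doc is a hypothetical, already-transformed record ready for indexing.
type doc struct {
	ID    string  `json:"-"` // used as the document _id, not part of the source
	Value float64 `json:"value"`
}

// chunks splits docs into slices of at most size elements.
func chunks(docs []doc, size int) [][]doc {
	if size < 1 {
		size = 1
	}
	var out [][]doc
	for start := 0; start < len(docs); start += size {
		end := start + size
		if end > len(docs) {
			end = len(docs)
		}
		out = append(out, docs[start:end])
	}
	return out
}

// bulkBody renders one chunk as a bulk request body: for each document,
// an action line followed by the source line, each ending in a newline.
func bulkBody(index string, chunk []doc) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends "\n" after every value
	for _, d := range chunk {
		action := map[string]map[string]string{
			"index": {"_index": index, "_id": d.ID},
		}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(d); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	// Hypothetical flag name and default; the post only says the size is configurable.
	chunkSize := flag.Int("chunksize", 500, "documents per bulk request")
	flag.Parse()

	docs := []doc{{"a", 1}, {"b", 2}, {"c", 3}}
	for i, c := range chunks(docs, *chunkSize) {
		body, err := bulkBody("metrics", c)
		if err != nil {
			panic(err)
		}
		// In a real run, body would be POSTed to the cluster's /_bulk endpoint.
		fmt.Printf("chunk %d: %d docs, %d bytes\n", i, len(c), len(body))
	}
}
```

Sending one request per chunk keeps memory use bounded no matter how many records a run retrieves, and the chunk size can be tuned from the command line without rebuilding.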

Golang structs (types) came in very handy for this job.
This way I was able to perform almost 2k read-transform-load operations per second on a low-end box.
The number reached ~5k on my dev computer (i7 + 8GB + SSD).
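
To illustrate why structs fit this kind of job, here is a small sketch of a struct-to-struct transform. The field names, the `bson` and `json` tags and the `@timestamp` field are assumptions, not the schema actually used; the `bson` tags are simply what a MongoDB driver would typically read when decoding a document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sample mirrors a hypothetical MongoDB document from the time series.
type sample struct {
	Sensor    string    `bson:"sensor"`
	Value     float64   `bson:"value"`
	Timestamp time.Time `bson:"ts"`
}

// point is the hypothetical shape indexed in Elasticsearch;
// the json tags control the field names in the serialized output.
type point struct {
	Sensor    string  `json:"sensor"`
	Value     float64 `json:"value"`
	Timestamp string  `json:"@timestamp"`
}

// transform converts a source record into its indexed form,
// normalising the timestamp to a UTC RFC 3339 string.
func transform(s sample) point {
	return point{
		Sensor:    s.Sensor,
		Value:     s.Value,
		Timestamp: s.Timestamp.UTC().Format(time.RFC3339),
	}
}

func main() {
	s := sample{"cpu-1", 0.42, time.Date(2015, 6, 30, 12, 0, 0, 0, time.UTC)}
	out, err := json.Marshal(transform(s))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With one type per side, the transform step is a plain function from `sample` to `point`, and the compiler catches a renamed or mistyped field before the job ever runs.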

I then wrote a simple bash script to start an infinite import loop, importing the last 5 seconds of data every 4 seconds.
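
The scheduling itself lived in a bash script that is not reproduced here. As an alternative sketch of the same idea in Go, the loop below computes a 5-second lookback window every 4 seconds. The `window` type, `lastWindow`, `overlap` and the `runs` limit are hypothetical names; the real loop is infinite and would call the import step where the comment indicates.

```go
package main

import (
	"fmt"
	"time"
)

const (
	interval = 4 * time.Second // how often an import starts
	lookback = 5 * time.Second // how far back each import reads
	runs     = 2               // kept small so the sketch terminates
)

// window is the time range [From, To] covered by one import run.
type window struct{ From, To time.Time }

// lastWindow returns the lookback-long range ending at now.
func lastWindow(now time.Time, lookback time.Duration) window {
	return window{From: now.Add(-lookback), To: now}
}

// overlap reports how much of each window was already covered by the previous run.
func overlap(interval, lookback time.Duration) time.Duration {
	if lookback <= interval {
		return 0
	}
	return lookback - interval
}

func main() {
	fmt.Println("overlap between consecutive runs:", overlap(interval, lookback))

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for i := 0; i < runs; i++ {
		if i > 0 {
			<-ticker.C // wait for the next tick before every run but the first
		}
		w := lastWindow(time.Now(), lookback)
		// A hypothetical import step would read records in [w.From, w.To] here.
		fmt.Printf("run %d: %s -> %s\n", i,
			w.From.Format(time.RFC3339), w.To.Format(time.RFC3339))
	}
}
```

Because the window is one second longer than the interval, consecutive runs read some of the same records. That presumably guards against gaps when a run starts late, and re-importing is harmless as long as each record maps to a stable document ID, so a repeated index operation overwrites the document instead of duplicating it.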

