Fast ETL with Golang, MongoDB and Elasticsearch

June 30, 2015

I recently had the chance to write an ETL job that imports data from MongoDB into Elasticsearch, so that it could then be charted in Kibana.
I must say I found Golang to be a very handy tool: blazing fast execution, and quite easy to write.

The ETL had to retrieve records from a time-based series stored in multiple MongoDB collections.
The data is transformed and then everything is put into Elasticsearch via a bulk insert.
The ETL must run every 4 seconds and the amount of data retrieved can be very large, so I made it import in chunks, with the chunk size configurable through a command-line parameter.
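
The chunking idea can be sketched roughly as follows. This is not the original code: the flag name `chunksize`, its default of 1000, and the helper `splitIntoChunks` are assumptions made for illustration, and a plain string slice stands in for the documents read from MongoDB.

```go
package main

import (
	"flag"
	"fmt"
)

// splitIntoChunks cuts records into consecutive slices of at most size
// elements. A non-positive size falls back to 1 so the loop always advances.
func splitIntoChunks(records []string, size int) [][]string {
	if size <= 0 {
		size = 1
	}
	var chunks [][]string
	for start := 0; start < len(records); start += size {
		end := start + size
		if end > len(records) {
			end = len(records)
		}
		chunks = append(chunks, records[start:end])
	}
	return chunks
}

func main() {
	// Hypothetical flag name and default; the post only says the chunk size
	// is a command-line parameter.
	chunkSize := flag.Int("chunksize", 1000, "records per bulk insert")
	flag.Parse()

	// Stand-in for the documents retrieved from MongoDB.
	records := []string{"a", "b", "c", "d", "e"}

	for i, chunk := range splitIntoChunks(records, *chunkSize) {
		// In a real job, each chunk would become one bulk request.
		fmt.Printf("chunk %d: %d records\n", i, len(chunk))
	}
}
```

Running it with `-chunksize=2` would split the five stand-in records into chunks of 2, 2 and 1, so no single bulk request grows beyond the configured size.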

Golang structs (types) came in very handy for this job.
This way I was able to perform almost 2k read-transform-load operations per second on a low-end box.
The number reached ~5k on my dev computer (i7 + 8GB + SSD).
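
To illustrate how structs can carry the read-transform-load steps, here is a minimal sketch using only the standard library. The `Sample` and `Point` types, their field names, and the `metrics` index name are all hypothetical, since the post does not show the real schema. The `bson` tags only show where a MongoDB driver would map fields. The output follows the newline-delimited format of the Elasticsearch bulk API: one action line, then one document line.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"time"
)

// Sample mirrors a hypothetical MongoDB document from the time series.
type Sample struct {
	ID     string    `bson:"_id"`
	Sensor string    `bson:"sensor"`
	Value  float64   `bson:"value"`
	At     time.Time `bson:"at"`
}

// Point is the hypothetical shape indexed into Elasticsearch.
type Point struct {
	Sensor    string    `json:"sensor"`
	Value     float64   `json:"value"`
	Timestamp time.Time `json:"@timestamp"`
}

// bulkAction is the metadata line that precedes each document in a bulk body.
type bulkAction struct {
	Index bulkMeta `json:"index"`
}

type bulkMeta struct {
	Index string `json:"_index"`
	ID    string `json:"_id"`
}

// transform converts a stored sample into the document to be indexed.
func transform(s Sample) Point {
	return Point{Sensor: s.Sensor, Value: s.Value, Timestamp: s.At.UTC()}
}

// bulkBody renders samples as a newline-delimited bulk request body.
func bulkBody(index string, samples []Sample) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing newline per value
	for _, s := range samples {
		if err := enc.Encode(bulkAction{Index: bulkMeta{Index: index, ID: s.ID}}); err != nil {
			return nil, err
		}
		if err := enc.Encode(transform(s)); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// demoSamples returns made-up data for the example.
func demoSamples() []Sample {
	return []Sample{
		{ID: "1", Sensor: "cpu", Value: 0.42, At: time.Date(2015, 6, 30, 12, 0, 0, 0, time.UTC)},
	}
}

func main() {
	body, err := bulkBody("metrics", demoSamples())
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body)) // this body would be POSTed to the _bulk endpoint
}
```

Keeping the source and target shapes as separate types makes the transform step an ordinary function from one struct to the other, which is easy to test without either database running.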

I then wrote a simple bash script to start an infinite import loop, importing the last 5 seconds of data every 4 seconds.

