This version of Vestibule is read-only. It is an archive of the community effort to produce content for Ruby Manor 4.

Big Data and Ruby

updated over 4 years ago; latest suggestion over 4 years ago

This proposal has been withdrawn...

Update: we've manually withdrawn this proposal, so there is no confusion about voting for it. If you're willing to give a presentation about 'big data', propose one now!

Anybody out there doing serious Big Data work with Ruby?

Suggestions

  • James Adam suggests over 4 years ago

    Vestibule isn't really suited to discussion of this kind; instead I suggest that you use the mailing list to gather some support and find someone willing to propose a talk.

  • Gavin Heavyside suggests over 4 years ago

    I've had good results in the past using Wukong from Infochimps (https://github.com/infochimps-labs/wukong) which uses Hadoop streaming.

  • David Salgado suggests over 4 years ago

    I tried.

    We generate about 1TB of JSON log data per day, and we're starting to use a Hadoop cluster to analyse it.

    I wrote some map/reduce jobs using the Mandy gem, but ended up re-implementing them in Java (which was exactly as much fun as you'd expect) because it was somewhere between 5 and 7 times faster, IIRC.

    To be fair, I think the difference is mainly because Mandy depends on Hadoop streaming, and the Java code doesn't.

    JRuby would probably have been almost as good, if not just as good.
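
For readers unfamiliar with the Hadoop streaming approach mentioned in the suggestions above: streaming jobs exchange records with Hadoop over stdin and stdout as tab-separated key/value lines, and the reducer receives its input sorted by key. The following is a minimal sketch of that idea in plain Ruby. It does not use the Wukong or Mandy APIs, and the method names and the "status" field are hypothetical, chosen only for illustration.

```ruby
require "json"

# Map step: turn each JSON log line into a "key\t1" record.
# The "status" field is a hypothetical example, not taken from the thread.
def map_lines(lines, field = "status")
  lines.each_with_object([]) do |line, out|
    begin
      record = JSON.parse(line)
    rescue JSON::ParserError
      next # skip malformed lines rather than failing the whole job
    end
    next unless record.is_a?(Hash) && !record[field].nil?
    out << "#{record[field]}\t1"
  end
end

# Reduce step: sum the counts for each run of identical keys.
# Assumes the input is already sorted by key, as it is in a streaming job.
def reduce_lines(sorted_lines)
  totals = []
  sorted_lines.each do |line|
    key, count = line.chomp.split("\t", 2)
    if totals.last && totals.last[0] == key
      totals.last[1] += count.to_i
    else
      totals << [key, count.to_i]
    end
  end
  totals.map { |key, total| "#{key}\t#{total}" }
end

# Local simulation of map -> sort -> reduce on a few sample lines.
if __FILE__ == $PROGRAM_NAME
  sample = ['{"status":200}', 'not json', '{"status":404}', '{"status":200}']
  puts reduce_lines(map_lines(sample).sort)
end
```

In a real job the two methods would live in separate mapper and reducer scripts that read from STDIN and write to STDOUT. Every record crosses a process boundary as text in this model, which is one plausible source of the overhead described above.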