Introduced at 2011, and happening for the forth time, the grand challenge poses a real world data set and a set of problems that would keep distributed event enthusiastic busy for few months. Selected solutions are invited to the conference, and final showdown and announcement happen at the conference.
2013 challenge includes a football (soccer) game, which I blogged earlier. 2014 challenge is an smart metering use case, timely one given the advent of IoT, that included 4B (yes with a capital B) events collected over 6 weeks from 40 houses and 2000 sensors. You can find more details from http://www.cse.iitb.ac.in/debs2014/?page_id=42. (also Zbigniew Jerzak, The DEBS 2014 Grand Challenge, ACM Distributed Event based System, 2014)
In 2014 challenge, four solutions were accepted out of about 25 submissions: from Dresden University of Technology (Germany), Imperial College London, Fraunhofer Institute (Germany), and WSO2.
Problem involves two queries: predicting load and finding outlier sensors. The predicting algorithm is given, as this is a event-processing challenge, not an machine learning challenge. Both queries needed a single node solution as well as a distributed solution.
First query was easy to parallelize, and all four solutions had completed this successfully, posting single node throughputs of few thousands up to 300k and 400K events per second. Scaling first query turns out to be much trickier, and solutions had posted throughput of 0.85M with 4 nodes, 1.7 with 6 nodes, 1.1 with 50 nodes (events per second).
Second query includes a median calculated over 24 hour sliding window, with about 1000 events per second. This means about 74million events and calculated over sliding window sliding in 1 second. This had everyone swearing. Single node solutions gave throughputs ranging from few 100k to few millions.
Distributed solution for query two, well no one had it solved. All distributed solutions were slower than the single node solution. Obviously, there are lot more work to be done by the community at large.
There was stiff competition and everyone was at their toes though the presentations. Overall winner was solution from Fraunhofer Institute (#3) and audience award went to Imperial College London (#2).
It was lot of fun, and among my parting thought was “If 40 houses generate 4B events, how we are going to handle millions of houses?”
You can find our presentation from slideshare and paper from “Solving the grand challenge using an opensource CEP engine“. I will write a detailed blog about our solution soon. Other solutions presentations and papers are available in ACM library.