This is the slide deck for the introduction to Open Source and the Apache Way talk I gave at Apache Bar Camp 2012 at the Engineering Faculty, University of Peradeniya. More info at http://readme.lk/apache-meetup-kandy/
There are several options.
- Using Zookeeper: The following two threads talk about this. It should be reasonably fast, although the Twitter guys have tried it and say it was a bit slow.
- Cassandra: This has been raised several times, and the answer was to use UUIDs (which does not work for us).
Cassandra later introduced counters, but they do not support incrementAndGet(), and there is no plan to add it in the future. So that does not work either.
- Write a custom server: This is easy; basically, create a service that gives out an increasing ID. But it is very hard to cluster, and its behavior in case of a failure is complicated.
- “A timestamp, worker number and sequence number”: The Twitter guys created a solution based on “a timestamp, worker number and sequence number” (this is more or less what we use as well, except that they ran a few dedicated servers for this). http://engineering.twitter.com/2010/06/announcing-snowflake.html
- Other Algos: I only looked at these briefly, but they are complicated.
Using DHTs: http://horicky.blogspot.com/2007/11/distributed-uuid-generation.html
A Fault-Tolerant Protocol for Generating Sequence Numbers for Total Ordering Group Communication in Distributed System, http://www.iis.sinica.edu.tw/page/jise/2006/200609_16.pdf
IMHO, “a timestamp, worker number and sequence number” is the best option. The only downside is that it assumes the broker nodes are loosely synced in time. The only other option I see is Zookeeper.
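To make the “timestamp, worker number and sequence number” idea concrete, here is a minimal Python sketch. It is not Twitter's code and not our implementation. The bit widths (10 bits for the worker, 12 for the sequence), the custom epoch, and the names IdGenerator and next_id are all assumptions made for illustration.

```python
import threading
import time

# Assumed layout (illustrative only): timestamp | worker number | sequence number.
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1325376000000  # arbitrary custom epoch (2012-01-01 UTC), an assumption


class IdGenerator:
    """Hypothetical per-node generator; each node needs a distinct worker_id."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id <= MAX_WORKER:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock is injectable so the behavior can be checked deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # IDs would stop being increasing, so refuse to generate.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


if __name__ == "__main__":
    gen = IdGenerator(worker_id=3)
    print(gen.next_id(), gen.next_id())
```

Within one node the IDs are strictly increasing. Across nodes they are only roughly ordered by time, which is why the nodes need to be loosely synced. No coordination is needed at generation time, as long as each node gets a distinct worker number.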