Samisa has written a book on RESTful PHP Web services. That is the second book from the Axis2 team and WSO2, the first being Quickstart Apache Axis2 by Deepal. Congratulations Samisa; having worked on a few papers, I can imagine what it takes to write a book!!
Could we use the idea of the Wisdom of the Crowd (e.g. tagging, comments, ranking, etc.) with research papers? In a way, we already do so, using citations. However, citation is a slow process, and at best it takes around six months for a citation to appear. I believe it would be interesting to enable comments, tags, recommendations, etc. to allow more involved discussion. Like it or not, there is a lack of discussion between academia and industry (research labs of companies do not qualify as industry), and one reason is that people from industry do not have the time, energy, or incentive to write a paper and go through the process of publishing it, even though they do have a comment on, or an improvement to, an idea. However, if comments were enabled, there is a better chance that more people would comment. Of course, there will be concerns about the quality of comments, but just as with Wikipedia, quality will prevail in the long run.
We have seen many blog posts lead to lengthy discussions on various aspects, and more often than not, research work has more than one aspect and could benefit from more involved discussion. Therefore, features like comments and tags would help. Maybe one day we could augment peer review using similar ideas, making a paper really peer reviewed, by all the peers.
Furthermore, in my opinion we should be ashamed that CS papers, despite being the state of the art of information processing, are very hard to search and categorize. When you think about it, papers do have well-defined relationships in terms of citations. But if I picked a paper now, how hard would it be to trace the provenance of its ideas? If we create a graph linked by citations and weight it using something like the PageRank algorithm (Google's algorithm to rank web pages), we can easily identify hubs (authoritative papers, authors, and maybe even groups) and important paths of development (the provenance of ideas). I am sure this has already been proposed somewhere else, and maybe some tools already have it. But I think it is a shame that the ACM and IEEE sites do not support it. We should use the results of our own research before expecting other people to use them.
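To make the idea concrete, here is a toy sketch of PageRank run over a citation graph. The paper names and the simple power-iteration below are my own illustration under stated assumptions, not an existing tool; it just shows how authoritative papers float to the top once you weight the graph.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Toy PageRank over a citation graph.

    citations: dict mapping every paper to the list of papers it cites
    (every paper must appear as a key, even if it cites nothing).
    Returns a dict of paper -> rank, summing to 1.
    """
    papers = set(citations) | {p for cited in citations.values() for p in cited}
    rank = {p: 1.0 / len(papers) for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(papers) for p in papers}
        for paper, cited in citations.items():
            if cited:
                # A paper passes its rank to the papers it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper with no outgoing citations spreads its rank evenly.
                for target in papers:
                    new_rank[target] += damping * rank[paper] / len(papers)
        rank = new_rank
    return rank

# Made-up papers: a seminal paper cited directly and via a survey
# ends up ranked highest, i.e. it is a "hub" in the sense above.
graph = {
    "new-paper-a": ["seminal-paper"],
    "new-paper-b": ["seminal-paper"],
    "seminal-paper": [],
    "survey": ["new-paper-a", "new-paper-b", "seminal-paper"],
}
ranks = pagerank(graph)
```

With this graph, "seminal-paper" gets the highest rank, because it is cited both directly and by papers that are themselves cited.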
Acceptance Rate and Conference Impact Rank are two measures of a conference, and lists of those rankings can be found in Networking Conferences Statistics and at http://citeseer.ist.psu.edu/impact.html respectively. A few good lists of Calls for Papers can also be found on the following sites. Some are old, but one can Google for newer editions of those conferences.
This site ( http://highscalability.com) links to whatever data is available on large-scale systems, starting from Google but covering a lot of others. I found it interesting, as there is nothing like “really doing it!!”, and I love to hear about first-hand experience of scaling up!!
These days, I am playing with a large-scale messaging system (a broker network). Things are great when they work, but when things start to go wrong with one of these systems, you really do not want to be there.
One big downside of asynchronous messaging is that it is so hard to debug, especially when messages jump from node to node and you are not the author of the code (so you do not know the magic places to put stdouts 😦 ).
The following is a little trick that helped me. It is not a silver bullet, but it does give some comfort. You need to start with the following:
1. Log the message ID, or some unique ID, with every log message. With luck, the developers have usually already done this.
2. Set log4j to print the timestamp as the first thing in each log statement.
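For completeness, step 2 can look something like this in a log4j 1.x properties file (the appender name “file” here is just a placeholder for whatever your setup uses):

```properties
# Assumed appender name; replace "file" with your own appender.
log4j.appender.file.layout=org.apache.log4j.PatternLayout
# %d{ISO8601} puts the timestamp first, so sorted lines come out in time order.
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c - %m%n
```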
Assuming you can get all the logs into the same directory (in my case I had NFS mounted across all the nodes, so that was easy), the following command will list everything related to a given message, in real-time order, so you can walk through it and tell what happened (sort simply sorts the lines by their log4j timestamps):
grep message-id *.log | sed 's/.*\.log://' | sort
It is a simple command, but it can be very useful. Also, if you need to merge all the logs in time order, do cat *.log | sort. As you can guess, there are many variations of this. Actually, maybe messaging system developers should put down a standard log format for message sending, receiving, and routing, which would let people write log-mining code/scripts that can uncover problems.
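The same trick can also be scripted. Here is a minimal Python sketch of it (the function name and file layout are my own, and it assumes each log line starts with a timestamp as in step 2 above, so a plain sort gives real-time order):

```python
import glob
import sys

def trace_message(message_id, log_dir="."):
    """Collect every log line mentioning message_id from *.log files
    in log_dir, and return them sorted into time order (this relies on
    each line starting with a sortable timestamp, e.g. ISO8601)."""
    lines = []
    for path in glob.glob(f"{log_dir}/*.log"):
        with open(path) as f:
            for line in f:
                if message_id in line:
                    lines.append(line.rstrip("\n"))
    return sorted(lines)

if __name__ == "__main__":
    # Usage: python trace.py <message-id> [log-dir]
    for line in trace_message(sys.argv[1], *sys.argv[2:3]):
        print(line)
```

This is the grep-sed-sort pipeline above, just in a form you can grow into something smarter (e.g. grouping by node, or flagging gaps between hops).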
If anyone knows a useful log mining tool, please! please! drop me a note :).
With the advent of the cloud, learning to do large-scale parallel programming is becoming a useful skill. For example, Google wants graduates to have learned those things at university. This is an old field, but it has resurfaced with the cloud. NSF ran a workshop on the topic, the 2008 NSF Data-Intensive Scalable Computing in Education Workshop. There is some material, and some pointers, there.
As you would guess, Map-Reduce is the natural starting point, but there are many others.
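To make the Map-Reduce idea concrete, here is the classic word-count example as a toy in plain Python, no framework involved; the three phases (map, shuffle, reduce) are just ordinary functions here, whereas a real system runs them distributed across machines:

```python
from collections import defaultdict

def map_phase(document):
    # Map: each document becomes a list of (word, 1) pairs.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine each key's values; for word count, just sum them.
    return {key: sum(values) for key, values in groups.items()}

documents = ["the cloud scales", "the cloud"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
# counts == {"the": 2, "cloud": 2, "scales": 1}
```

The point of the structure is that map and reduce have no shared state, so each phase parallelizes trivially; that separation is what the cloud frameworks exploit.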