(Update: I finally got around to uploading the conference posters in their original, higher resolution. These weren't taken with my usual Canon, so sorry for the slightly grainy result, but they should be kinda readable.)
KDD 2008 was a great learning experience, providing opportunities for life-long learning, career development, and professional networking. It helped a great deal in getting to know the data mining community and the current trends in knowledge discovery and machine learning, and in realizing that all these prolific authors and researchers are actually humans, not robots as previously thought.
Social networking was, without a doubt, the dominant theme of the conference and of the research areas covered. The key sessions were as follows.
- Trevor Hastie of Stanford University on “Regularization Paths and Coordinate Descent”
- Thore Graepel of Microsoft Research on “Large Scale Data Analysis and Modeling in Online Services and Advertising”
- Michael Schwarz of Yahoo! Research on “Internet Advertising and Optimal Auction Design”
- Jitendra Malik of the University of California, Berkeley on “The Future of Image Search”
One of my personal favorites was Foster Provost and Jennifer Neville's tutorial session on predictive modeling in social networks. I also got a chance to meet and talk to the following luminaries of the field.
- Professor Foster Provost, Editor-in-Chief of the journal Machine Learning
- Ron Kohavi (GM for Microsoft's Experimentation Platform)
- Gregory Piatetsky-Shapiro (Chair, ACM SIGKDD)
And to see the following
However, I missed the chance to meet Dr. Jiawei Han, who had to leave early.
It was a well-organized event with breaks. I have a few suggestions for improvement.
1. Provide a voting mechanism so attendees can indicate in advance which sessions they plan to attend, and allocate room sizes according to interest. This might not be perfect, but it would give a good estimate. Some rooms were so packed that people were sitting on the floor in the aisles, while others were half empty.
2. Full disclosure and reproducibility are important in academia and research. Some of the data used in the papers and presentations was unavailable for verifying the claims because it was proprietary, especially in some of the vendor-specific presentations (Yahoo, Microsoft, Orkut, etc.). There are very effective anonymization and privacy-preserving techniques that would allow such data to be shared.
3. Slides of the presentations should be made available to the attendees.
And next time, J'aime la vie à Paris!