KDD 2010 Educational Data Mining Challenge

The organizers of the KDD data mining challenge have pushed back the challenge start date to Monday, April 19 at 2pm EDT in response to feedback on the development data sets. They are working to validate the challenge data sets and have also pushed back the competition end date to Tuesday, June 8 at 2pm EDT.

The following details of the KDD Data Mining Challenge are taken from its homepage: https://pslcdatashop.web.cmu.edu/KDDCup/

This year's challenge

How generally or narrowly do students learn? How quickly or slowly? Will the rate of improvement vary between students? What does it mean for one problem to be similar to another? It might depend on whether the knowledge required for one problem is the same as the knowledge required for another. But is it possible to infer the knowledge requirements of problems directly from student performance data, without human analysis of the tasks?

This year's challenge asks you to predict student performance on mathematical problems from logs of student interaction with Intelligent Tutoring Systems. This task presents interesting technical challenges, has practical importance, and is scientifically interesting.

Task description

At the start of the competition, we will provide 5 data sets: 3 development data sets and 2 challenge data sets. Each of the data sets will be divided into a training portion and a test portion. Student performance labels will be withheld for the test portion of the challenge data sets but available for the development data sets. The competition task will be to develop a learning model based on the challenge and/or development data sets, use this model to learn from the training portion of the challenge data sets, and then accurately predict student performance in the test portions. At the end of the competition, the winner will be determined based on their model's performance on an unseen portion of the challenge test sets. We will only evaluate each team's last submission for the challenge sets.
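
To make the train-then-predict workflow concrete, here is a minimal Python sketch of a naive baseline: it learns each student's average correctness from a labeled training portion and uses it to predict the unlabeled test portion, falling back to the overall average for students it has not seen. The field names ("student", "step", "correct") and the toy rows are hypothetical and chosen only for illustration; they are not the actual challenge data format, and this is not a method suggested by the organizers.

```python
from collections import defaultdict


def fit_student_means(train_rows):
    """Learn per-student mean correctness and the global mean from labeled rows."""
    totals = defaultdict(lambda: [0, 0])  # student -> [sum of labels, row count]
    for row in train_rows:
        entry = totals[row["student"]]
        entry[0] += row["correct"]
        entry[1] += 1
    n_rows = sum(count for _, count in totals.values())
    # With no training rows at all, fall back to an uninformative 0.5.
    global_mean = sum(s for s, _ in totals.values()) / n_rows if n_rows else 0.5
    student_means = {student: s / count for student, (s, count) in totals.items()}
    return student_means, global_mean


def predict(test_rows, student_means, global_mean):
    """Predict a probability of a correct answer for each unlabeled test row."""
    return [student_means.get(row["student"], global_mean) for row in test_rows]


# Hypothetical toy data: labels are present in training rows, withheld in test rows.
train = [
    {"student": "s1", "step": "a", "correct": 1},
    {"student": "s1", "step": "b", "correct": 0},
    {"student": "s2", "step": "a", "correct": 1},
]
test = [
    {"student": "s1", "step": "c"},  # student seen in training
    {"student": "s3", "step": "a"},  # unseen student, uses the global mean
]

student_means, global_mean = fit_student_means(train)
predictions = predict(test, student_means, global_mean)
print(predictions)
```

On a development data set, where test labels are available, predictions like these could be compared against the true labels before submitting anything for the challenge sets.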