KDD Cup 2000

Held in conjunction with the Sixth ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Co-Chairs:

Carla Brodley, School of Electrical and Computer Engineering, Purdue University
Ronny Kohavi, Blue Martini Software
Special thanks to Brian Frasca, Llew Mason, and Zijian Zheng from Blue Martini Software and Ben Bernstein from Gazelle.com
Thanks to Acxiom for providing data enhancements.

Email: kddcup2000@bluemartini.com

Summary talk presented at KDD (8/20/2000)
KDD-Cup 2000 organizers' report: Peeling the onion. SIGKDD Explorations, 2(2):86-98, 2000

Cups in previous years: KDD Cup 99, KDD Cup 98 (data)

General Information (updated Apr 2002)

The KDD Cup 2000 domain contains clickstream and purchase data from Gazelle.com, a legwear and legcare web retailer that closed its online store on 8/18/2000.

You are required to sign a non-disclosure agreement in order to receive a password to access the data, although the original restrictions were dramatically relaxed in April 2002 to allow wider use of the data. Basically, any use of the data is allowed as long as the proper acknowledgment is provided and a copy of the work is provided to Blue Martini Software.

In order to access the data, you must fill out the form on this page. Your username and password will be emailed to you.

When you have received a username and password (see above), you can go to the confidential section of the site, which contains a description of the tasks, the data, background information, and more.

The reference to the KDD Cup 2000 is as follows (a PDF is available here):

Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian Zheng.  KDD-Cup 2000 organizers' report: Peeling the onion. SIGKDD Explorations, 2(2):86-98, 2000. http://robotics.stanford.edu/users/ronnyk/kddOrganizerReport.pdf

The bibtex entry is:

    @Article{kddcup2000,
    author = {Ron Kohavi and Carla Brodley and Brian Frasca and Llew Mason and Zijian Zheng},
    title = {{KDD-Cup} 2000 Organizers' Report:  Peeling the Onion},
    journal = {SIGKDD Explorations},
    volume = {2},
    number = {2},
    pages = {86--98},
    url = {http://robotics.stanford.edu/users/ronnyk/kddOrganizerReport.pdf},
    year = 2000}

A paper describing the Blue Martini architecture is available here:
Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng. Integrating E-Commerce and Data Mining: Architecture and Challenges. ICDM 2001.
Please remember the restrictions on the data.

Real Datasets for Association Rule Discovery (updated Oct 2002):

Three real-world datasets are available. You are required to sign a simple non-disclosure agreement in order to receive a password to access the data. Basically, any use of the data is allowed as long as the proper acknowledgment to Blue Martini Software is provided and a copy of the work is sent (e-mail is fine). When citing these datasets, please reference the following article instead of the KDD Cup paper:
Zijian Zheng, Ron Kohavi, and Llew Mason. Real World Performance of Association Rule Algorithms. KDD 2001.

The bibtex entry is:

    @inproceedings{ zheng-kohavi-mason-real-assoc,
    author = "Zijian Zheng and Ron Kohavi and Llew Mason",
    title = "Real World Performance of Association Rule Algorithms",
    booktitle = "Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
    editor={Foster Provost and Ramakrishnan Srikant},
    pages={401--406},
    year = 2001,
    url = {http://robotics.Stanford.EDU/users/ronnyk/realWorldAssoc.pdf}}

Note: a long version of the original paper is available, as well as the slides. Please remember the restrictions on the data.

KDD Cup Winners: click here for more information

There were five questions at the KDD Cup 2000. The results for Question 2 have been revised. Our original calculation of the results at Purdue had a subtle bug. The bug was uncovered thanks to Yoshinori Yaginuma, who calculated his own score using the posted test data. We have corrected this bug and have posted the new results for Question 2 (11/20/00).

Question 1 Winner: Amdocs (Paper, Poster)

Given a set of page views, will the visitor view another page on the site or will the visitor leave?

Honorable Mentions: Mui Seng Martin Lee, Chong Jin Ong and S. Sathiya Keerthi of Mechanical and Production Engineering Department, National University of Singapore
 

Question 2 Winner: Salford Systems, Inc

Given a set of page views, which product brand will the visitor view in the remainder of the session?

Honorable Mentions: MP13 team of Alexei Vopilov, Ivan Shabalin and Vladimir Mikheyev, and the team of Mukund Deshpande and George Karypis, Department of Computer Science and Engineering, University of Minnesota

Question 3 Winner: Salford Systems, Inc

Given a set of purchases over a period of time, characterize visitors who spend more than $12 (order amount) on an average order at the site.

Honorable Mentions: Orit Rafaely, Tel-Aviv University and Amdocs

Question 4 Winner: e-steam (Poster)

Given a set of page views, characterize killer pages, i.e., pages after which users leave the site.

Honorable Mentions: SAS, Amdocs, and LLSoft, Ltd

Question 5 Winner: Amdocs (Paper, Poster)

Given a set of page views, characterize which product brand a visitor will view in the remainder of the session.
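
As an illustration only, the Question 3 target (visitors who spend more than $12 on an average order) could be computed along the following lines. This is a minimal sketch: the (visitor_id, order_amount) record layout, the function name heavy_spenders, and the toy data are all hypothetical and do not reflect the actual KDD Cup 2000 data format, which is described in the confidential section of the site.

```python
from collections import defaultdict


def heavy_spenders(orders, threshold=12.0):
    """Label each visitor by whether their average order amount exceeds threshold.

    `orders` is an iterable of (visitor_id, order_amount) pairs.
    This layout is an assumption made for illustration, not the real schema.
    """
    totals = defaultdict(float)  # sum of order amounts per visitor
    counts = defaultdict(int)    # number of orders per visitor
    for visitor_id, amount in orders:
        totals[visitor_id] += amount
        counts[visitor_id] += 1
    # Strictly greater than the threshold, matching "more than $12".
    return {v: totals[v] / counts[v] > threshold for v in totals}


# Hypothetical toy data: visitor "a" averages $15, "b" $8, "c" exactly $12.
orders = [("a", 10.0), ("a", 20.0), ("b", 8.0), ("c", 12.0)]
labels = heavy_spenders(orders)
```

Here only visitor "a" is labeled as a heavy spender; an average of exactly $12 does not qualify. Characterizing these visitors, as the question asks, would then mean relating this label to the other attributes in the data.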

Schedule (past):

Summary talk presented at KDD 8/20/2000
KDD-Cup 2000 organizers' report: Peeling the onion. SIGKDD Explorations, 2(2):86-98, 2000