The Interactive Data Exploration and Analytics (IDEA) workshop addresses the development of data mining techniques that allow users to interactively explore their data. We emphasize interactivity and the effective integration of techniques from data mining, visualization and human-computer interaction (HCI). In other words, we explore how the best of these different but related domains can be combined such that the sum is greater than the parts. The previous editions of IDEA, at KDD 2013 in Chicago and KDD 2014 in New York City, were great successes.
IDEA will be a full-day workshop on Monday, Aug 10, at KDD 2015 at the Hilton Sydney, on Level 4, Room 1. Register and book hotel rooms through KDD's registration site.
We are proud to have Microsoft Research as the Headline supporter of IDEA 2015!
You are cordially invited to join the Microsoft Research-supported Poster + Interactive Demo + Networking session at 4:10 P.M.!
In total, 9 papers were accepted at IDEA 2015 for oral presentation over the day and for interactive discussion at the poster + demo + networking session.
8:50 | Welcome |
9:00 |
Keynote 1
Geoff Webb, Monash University. The Knowledge Factory: A Retrospective
Geoff Webb is a Professor of Information Technology Research in the Faculty of Information Technology at Monash University, where he heads the Centre for Data Science. His primary research areas are machine learning, data mining, user modeling and computational structural biology. Many of his learning algorithms are included in the widely used Weka machine learning workbench. A commercial implementation of his association discovery techniques, Magnum Opus, has been acquired by BigML, Inc. for inclusion in their cloud-based data mining solution. He was editor-in-chief of the highest-impact data mining journal, Data Mining and Knowledge Discovery, from 2005 to 2014. He is co-editor of the Springer Encyclopedia of Machine Learning, a member of the advisory board of Statistical Analysis and Data Mining, a member of the editorial board of Machine Learning, and was a foundation member of the editorial board of ACM Transactions on Knowledge Discovery from Data. He has been Program Committee Co-Chair of the two top data mining conferences, the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015) and the IEEE International Conference on Data Mining (2010), and General Co-Chair of the 2012 IEEE International Conference on Data Mining. He is a technical advisor to BigML, Inc. He is an IEEE Fellow and received the 2013 IEEE ICDM Service Award and a 2014 Australian Research Council Discovery Outstanding Researcher Award.
Abstract
This talk revisits the first major program of research into interactive rule discovery. The Knowledge Factory is an interactive rule learning system developed in the 1990s. It has many novel features that remain relevant today. The talk will cover the key techniques that the research developed, interpreting them in the light of subsequent developments in the field. |
9:50 |
Research Talks (time allocation: 15+5 each)
Check www.realkd.org for the free open source implementation. |
10:30 | Coffee |
11:00 |
Research Talks (time allocation: 15+5 each)
|
12:20 | Lunch |
1:45 | Re-welcome |
1:50 | Keynote 2
Jure Leskovec, Stanford University. Machine Learning for Human Decision Making
Jure Leskovec is an assistant professor of Computer Science at Stanford University and chief scientist at Pinterest. His research focuses on mining large social and information networks. The problems he investigates are motivated by large-scale data, the Web and online media. This research has won several awards, including a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship and numerous best paper awards. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, and his PhD in machine learning from Carnegie Mellon University, followed by postdoctoral training at Cornell University. Jure also co-founded a machine learning startup, Kosei, which was recently acquired by Pinterest. You can follow him on Twitter @jure.
Abstract
In many real-life settings, human judges make decisions and choose among many alternatives in order to label or classify items: a medical doctor diagnosing a patient, a criminal court judge making a decision, a crowd-worker labeling an image, or a student answering a multiple-choice question. Gaining insights into human decision making is important for determining the quality of individual decisions as well as for identifying mistakes and biases. In this talk we discuss the question of developing machine learning methodology for estimating the quality of individual judges and obtaining diagnostic insights into how various judges decide on different kinds of items. We develop a series of increasingly powerful hierarchical Bayesian models which infer latent groups of judges and items, with the goal of obtaining insights into the underlying decision process. We apply our framework to a wide range of real-world domains and demonstrate that our approach can accurately predict judges' decisions, diagnose the types of mistakes judges tend to make, and infer the true labels of items. |
2:40 |
Research Talks (time allocation: 15+5 each)
|
3:00 | Coffee |
3:30 | Talks (time allocation: 15+5 each)
|
4:10 |
Poster + Interactive Demo + Networking session, with teacakes, croissants, appetizers and drinks |
5:20 | Closing |
Submission | |
Notification | |
Camera-ready | |
Workshop | Mon, August 10, 2015 |
All papers will be peer reviewed in a single-blind process. We welcome many kinds of papers, such as (but not limited to):
Authors should clearly indicate in their abstracts which kind of submission their paper belongs to, to help reviewers better understand its contributions. Submissions must be in PDF, written in English, no more than 10 pages long (shorter papers are welcome), and formatted according to the standard double-column ACM Proceedings Style (Tighter Alternate style).
For accepted papers, at least one author must attend the workshop to present the work.
For paper submission, proceed to the IDEA 2015 submission website.
We have entered the era of big data. Massive datasets, surpassing terabytes and petabytes, are now commonplace. They arise in numerous settings in science, government, and enterprises. Today, technology exists by which we can collect and store such massive amounts of information. Yet, making sense of these data remains a fundamental challenge. We lack the means to exploratively analyze databases of this scale. Currently, few technologies allow us to freely "wander" around the data and make discoveries by following our intuition or through serendipity. While standard data mining aims at finding highly interesting results, it is typically computationally demanding and time-consuming, and thus may not be well suited for interactive exploration of large datasets.
Interactive data mining techniques that aptly integrate human intuition, by means of visualization and intuitive human-computer interaction (HCI) techniques, with machine computation support have been shown to help people gain significant insights into a wide range of problems. However, as datasets are generated in larger volumes, at higher velocity, and in greater variety, creating effective interactive data mining techniques becomes a much harder task.
Our focus is on interactivity and the effective integration of techniques from data mining, visualization and human-computer interaction. In other words, we intend to explore how the best of these different but related domains can be combined such that the sum is greater than the parts.