| Name | Office hours | Location |
|---|---|---|
| Polo Chau | Tue, 3-4pm | Klaus 1324 |
| Meera Manohar Kamath | Wed, 10:30-11:30am | Klaus 2126 |
| Gopi Krishnan Nambiar | Tue, 12-1pm | CCB common area (1st floor) |
| Siddharth Rajendra Raja | Wed, 3-4pm | TSRB 243 (tentative) |
| Ramakrishnan (Ramki) Kannan | Mon, 3-4pm | Klaus 1305 |
| Akanksha | Thu, 3-4pm | CCB common area (1st floor) |

| Month | Date | Topic | Tue | Thu | Events |
|---|---|---|---|---|---|
| Aug | 18, 20 | * Course introduction * Big data analytics building blocks, data collection, and simple storage (SQLite) | slides | slides | |
| | 25, 27 | * Data cleaning & integration * Data mining concepts & tasks | slides | slides | HW1 out |
| Sept | 1, 3 | * Dimensionality reduction: techniques, visualization, practitioner's guide -- by Ramakrishnan Kannan * Visualization DOs and DON'Ts; Heilmeier Questions | slides | slides | |
| | 8, 10 | * Example project: Wenwen Chang on Predicting Fire Risks in Atlanta * Visualization fundamentals by Chad Stolper | Wenwen's slides, slides | Chad's vis101 slides | HW1 due (Fri, 11:55pm) |
| | 15, 17 | * Data visualization for the web (D3) by Chad Stolper * Graph analytics | Chad's D3 slides | slides | Form project teams by Friday; HW2 out |
| | 22, 24 | * Continuing with graphs * Scaling up: Hadoop, Pig | slides | slides | |
| | 29, 1 | * Scaling up: HBase, Hive | slides | Q&A | HW2 due (Fri, 10/2, 11:55pm) |
| Oct | 6, 8 | Project proposal presentations | Show time! | Show time! | Project proposal & slides due (Mon, 10/5, 11:55pm) |
| | 13, 15 | * Scaling up: Spark, Spark SQL | Student recess; no class | slides | |
| | 20, 22 | * Classification concepts, cross validation, k nearest neighbors, decision tree | slides | Flavio Villanustre, VP, HPCC Systems & LexisNexis | HW3 out |
| | 27, 29 | * Ensemble method, bagging, random forests * Classification (visualization & interaction) * Recommender Systems by Ramakrishnan Kannan | slides | Ramki's slides | |
| Nov | 3, 5 | * Analytics in practice: Nikolaos Vasiloglou II * Clustering | Nick's slides | slides | HW3 due (Fri, 11/6, 11:55pm) |
| | 10, 12 | * Text analytics: concepts * Text analytics: algorithms (LSI=SVD) * Time series: algorithms, visualization, & applications | slides | slides | HW4 out; project progress report due (Fri, 11:55pm EST) |
| | 17, 19 | * Time series: algorithms, visualization, & applications | slides | | |
| | 24, 26 | Thanksgiving | X | X | |
| Dec | 1, 3 | * Closing words and course overview * Project poster presentations | Poster presentation. 5pm to 6pm-ish. Klaus 1116. Pizza + drinks served! | | Project final report due (Fri, 11:55pm EST) |

We use Piazza for discussion and all announcements.
Post your questions there. Our teaching staff and your fellow classmates will help answer them quickly. You can also use Piazza to find project teammates.
T-square will only be used for submission of assignments and projects.
We welcome everyone to share their experiences in tackling issues and to help each other out, but please do not post your answers, as that may affect the learning experience of your fellow classmates.
Some assignments may involve web programming (e.g., JavaScript, CSS) and D3.
You are expected to quickly learn many new things. For example, an assignment on Hadoop programming may require you to learn some basic Java and Scala quickly, which should not be too challenging if you already know another high-level language like Python or C++. Please make sure you are comfortable with this.
Please take a look at the assignments (homework and project) of the previous offerings of this course, which will give you some idea about the difficulty level of the assignments.
Basic knowledge of linear algebra and probability is expected.
Prof. John Stasko - Information Visualization - Fall 2012
Prof. Jeff Heer - Research Topics in Interactive Data Analysis - Spring 2011
Prof. Christos Faloutsos - Multimedia Databases and Data Mining - Fall 2012