Name | Office hours | Location |
---|---|---|
Polo Chau | Tue, 3:30-4:00pm (+ 30min after Tue's class at Clough Starbucks) | Klaus 1324 |
Gopi Krishnan Nambiar | Mon, 9-10am | common area between Klaus 3201 and 3217 near the stairwell |
Nilaksh Das | Tue, 2-3pm | CCB common area (1st floor) |
Pradeep Vairamani Rajendran | Wed, 2-3pm | common area between Klaus 3201 and 3217 near the stairwell |
Ajitesh Jain | Mon, 12:30-1:30pm | outside Klaus 1324 |
Vishakha Singh | Wed, 11:30am-12:30pm | CCB common area (1st floor) |
Month | Dates | Topic | Tue | Thu | Events |
---|---|---|---|---|---|
Jan | 12, 14 | * Course introduction * Big data analytics building blocks | intro | building blocks | |
Jan | 19, 21 | * Data collection and simple storage (SQLite) * Data cleaning | collection, cleaning | Canceled | HW1 out |
Jan | 26, 28 | * Data integration * Project showcase: Wenwen Chang on Predicting Fire Risks in Atlanta * Project showcase: PASSAGE: A Travel Safety Assistant | integration | Fire Project slides, Passage Project slides | |
Feb | 2, 4 | * Data integration; similarity functions * Data Mining Concepts & Tasks * Heilmeier Questions * Visualization 101 | concepts, heilmeier | vis101 | HW1 due (Fri, 11:55pm) |
Feb | 9, 11 | * Data visualization for the web (D3) * Digital Advertising and Analytics by Dr. Sam Franklin, VP of data science at 360i | d3 | | Form project teams by Friday; HW2 out |
Feb | 16, 18 | * Fixing common visualization issues * Intro to classification: cross validation, k nearest neighbor | fix-vis | classification-intro, graph-basics | |
Feb | 23, 25 | * Graph analytics | graph basics, centrality | graph algorithms, applications | HW2 due (Fri, 11:55pm) |
Mar | 1, 3 | * Scaling up: Hadoop, Pig, Hive * Scaling up: Spark, Spark SQL * PANDAS | hadoop | pig, hive, spark, pandas | HW3 out |
Mar | 8, 10 | Project proposal presentations | Show time! | Show time! | Project proposal & slides due (Mon, 11:55pm) |
Mar | 15, 17 | * Analytics in practice II: Trey Grainger, CareerBuilder * Scaling up: SPARK stack * Scaling up: HBase | hbase | | |
Mar | 22, 24 | Spring Break | X | X | |
Mar | 29, 31 | * Decision tree * Ensemble method, bagging, random forests * Classification (visualization & interaction) | tree, bagging, random forests, vis | Canceled | HW3 due (Fri, 11:55pm) |
Apr | 5, 7 | * Clustering * Text analytics: concepts * Text analytics: algorithms (LSI=SVD) | clustering | text analytics | HW4 out; Project progress report due (Fri, 11:55pm EST) |
Apr | 12, 14 | * Time series: algorithms, visualization, & applications | time series concepts, algorithms | time series algorithms, vis | |
Apr | 19, 21 | Dimension reduction (PCA, MDS, LDA, IsoMap) | dimension reduction | | HW4 due (Fri, 11:55pm) |
Apr | 26 | Project poster presentations | Poster presentation. 4:30pm to 6pm-ish. Klaus Atrium. Pizza + drinks served! | X | Proj final report due (Tue, 11:55pm EST) |
We use Piazza for discussion and all announcements.
Post your questions there. Our teaching staff and your fellow classmates will help answer them quickly. You can also use Piazza to find project teammates.
T-square will only be used for submission of assignments and projects.
We welcome everyone to share their experiences in tackling issues and to help each other out, but please do not post your answers, as that may affect the learning experience of your fellow classmates.
Some assignments may involve web programming (e.g., JavaScript, CSS) and D3.
You are expected to quickly learn many new things. For example, an assignment on Hadoop programming may require you to learn some basic Java and Scala quickly, which should not be too challenging if you already know another high-level language like Python or C++. Please make sure you are comfortable with this.
Please take a look at the assignments (homework and project) of the previous offerings of this course, which will give you some idea about the difficulty level of the assignments.
Basic knowledge of linear algebra and probability is expected.
Prof. John Stasko - Information Visualization - Fall 2012
Prof. Jeff Heer - Research Topics in Interactive Data Analysis - Spring 2011
Prof. Christos Faloutsos - Multimedia Databases and Data Mining - Fall 2012