This course has concluded. See https://poloclub.github.io/#cse6242 for all past course offerings.

This course will introduce you to broad classes of techniques and tools for analyzing and visualizing data at scale. It emphasizes how to combine computation and visualization to perform effective analysis. We will cover methods from each side, as well as hybrid ones that combine the best of both worlds. Students will work in small teams to complete a research project exploring novel approaches for interactive data & visual analytics.


Course Information

Instructor
Polo Chau      Thu 3-4pm, Klaus 1324
TAs
Parikshit Ram    Mon 4-5pm, Klaus 1315
Sooraj Bhat
Class meets
Tue, Thu 1:35 - 2:55, Klaus 2456
Q&A and discussion at
Piazza

Schedule (tentative)

Wk Date Topic
1 Jan 8 Course Introduction
10 Big data analytics process & building blocks
2 15 Data Collection, Simple Storage (SQLite) & Cleaning
17 Data Integration HW1 out
3 22 Visualization fundamentals
24 How to present your analysis (to your boss, or for research)
4 29 Classification (techniques) HW1 due
31 Classification (visualization & interaction)
5 Feb 5 Canceled
7 Clustering
6 12 Dimensionality Reduction (techniques)
14 Dimensionality Reduction (more techniques, visualization, practitioner's guide)
7 19 Graphs I (basics, how to build and store graphs, laws, etc.) HW 1 grades out
21 Graphs II (centrality, algorithms)
8 26 Graphs III (Interactive tools, applications) HW2 out
28 Scaling up (Hadoop)
9 Mar 5 Scaling up (Pig, HBase)
7 Scaling up (HBase, Hive, Pegasus) Proposal due, 1:30pm
10 12 Project proposal presentations
14 Project proposal presentations
11 19 Spring break
21 Spring break
12 26 Human Computation 1  
28 Human Computation 2 HW2 due
13 Apr 2 Time series (algorithms)  
4 Time series (visualization & applications) AWS Setup Guide, HW3 out;
Progress report due 4/5, 5pm
14 9 Text analytics (algorithms and concepts)
11 Text analytics (LSI=SVD, visualization)
15 16 Canceled
18 Review
16 23 Project presentations
25 Project presentations Final report due 4/26, 5pm;
HW3 due 4/26, 1:30pm;
bonus HW due 4/28, 11:59pm

Grading

Homework

HW1 8% solution
HW2 16%
HW3 14%
Bonus HW 8%

Late policy for deliverables

Project

Team project: 2-4 people. Description and grading policy (proposal + presentation, progress report, final report + presentation).

Textbooks and reading materials

Prerequisites

No formal prerequisites. However, students are expected to complete significant programming assignments (homework, project) that may involve higher-level languages or scripting (e.g., Java, R, Matlab). Basic knowledge of algebra and probability is expected.

Acknowledgements & Related Classes