Polo Chau | Tue, 3:30PM-4:00PM + FREE after-class coffee, at Clough Starbucks | Klaus 1324 |
Kiran Sudhir (Head TA) | Thu, 3:05PM-4:05PM | Klaus, open area next to Polo’s office |
Varun Bezzam | Wed, 12:30PM-1:30PM | Klaus, open area next to Polo’s office |
Yuyu Zhang | Wed, 12:30PM-1:30PM | Klaus, open area next to Polo’s office |
Akanksha Bindal | Mon, 11:00AM-12:00PM | Klaus, open area next to Polo’s office |
Vishal Bhatnagar | Mon, 11:00AM-12:00PM | Klaus, open area next to Polo’s office |
Vivek Iyer | Thu, 3:05PM-4:05PM | Klaus, open area next to Polo’s office |
Everyone must join this class's Piazza, at https://piazza.com/gatech/fall2017/cse6242aqcx4242a/.
Double-check that you are joining the right Piazza! When you have questions about the class, homework, project, etc., post them there. Our teaching staff and your fellow classmates will help answer them quickly. You can also use Piazza to find project teammates.
T-square will only be used for submission of assignments and projects.
We welcome everyone to share their experiences in tackling issues and to help each other out, but please do not post your answers, as that may affect the learning experience of your fellow classmates.
Wk | Month | Dates | Topics | Tue | Thu | Events (eastern time; EST) |
---|---|---|---|---|---|---|
1 | Aug | 22, 24 | Course introduction; Big data analytics building blocks | intro | building blocks, data collection | |
2 | | 29, 31 | Data collection and simple storage (SQLite); Data cleaning; Class project overview and Heilmeier questions; Data Cleaning: Starting with the End in Mind (VantagePoint) by Prof. Alan Porter, Denise Chiavetta, and Stephen Carley | SQLite, data cleaning | VantagePoint lecture, project overview | HW1 out |
3 | Sept | 5, 7 | Example projects: (1) Firebird: Predicting Fire Risks in Atlanta, by Shang-Tse Chen, (2) PASSAGE: A Travel Safety Assistant, by Nilaksh Das; GT Github and OneDrive; Data integration: knowledge graph, data reconciliation/de-duplication, similarity functions | Firebird, PASSAGE | Github, data integration, analytics concepts | |
4 | | 12, 14 | Visualization 101 | Irma | vis101 | HW1 due (Fri, 9/15, 11:55pm) |
5 | | 19, 21 | Data visualization for the web (D3); Fixing common visualization issues (fixing presentation issues) | vis fix | D3 | Form project teams by Fri, 9/22; HW2 out |
6 | | 26, 28 | Scaling up: Hadoop, Pig, Hive; Data analytics concepts & tasks; Overview of project proposal and presentation | Hadoop, Pig, Hive | analytics tasks, project proposal and presentation | |
7 | Oct | 3, 5 | Scaling up: Spark, Spark SQL; Scaling up: HBase | spark | hbase | |
8 | | 10, 12 | Classification key concepts, k-NN, cross validation; Clustering: k-means, hierarchical clustering, DBSCAN | X (student recess) | classification, clustering | HW2 due (Wed, 10/11, 11:55pm); HW3 out |
9 | | 17, 19 | Project proposal presentations | Show time! | Show time! | Project proposal & slides due (Mon, 10/16, 11:55pm) |
10 | | 24, 26 | Classification: decision tree, vis (ROC, AUC, confusion matrix); Clustering vis; Ensemble method, bagging, random forests; Graph analytics | classification vis, clustering vis, ensemble, random forests | graph basics | |
11 | Nov | 31, 2 | Graph analytics | graph centrality & algorithms | mmap | HW3 due (Fri, 11/3, 11:55pm); HW4 out |
12 | | 7, 9 | Text analytics: concepts; Text analytics: algorithms (LSI=SVD) | text analytics | cont'd | Proj progress report due (Fri, 11/10, 11:55pm) |
13 | | 14, 16 | Time series: algorithms, visualization, & applications | time series: basics, linear forecast | time series: non-linear forecast, vis | |
14 | | 21, 23 | Thanksgiving | X | X | |
15 | | 28, 30 | Publication-quality figures; Project poster presentations | pub-quality figures, lessons learned | Poster presentation. 4:30pm to 6pm-ish. Klaus Atrium. Pizza + drinks served! | HW4 due (Fri, 12/1, 11:55pm) |
16 | Dec | 5 | Closing words and course review | lessons learned | X | Proj final report due (Tue, 12/5, 11:55pm) |