Polo Chau | Tue, 3:30PM-4PM + FREE after-class coffee, at Clough Starbucks | Klaus 1324 (Polo's office) |
Neetha Ravishankar | Tue, 12-1pm | All TA office hours are held in the open area outside Polo's office |
Jennifer Ma (Head TA) | Mon, 1-2pm | |
Mansi Mathur | Mon, 1-2pm | |
Arathi Arivayutham | Wed, 11am-12pm | |
Vineet Vinayak Pasupulety | Wed, 11am-12pm | |
Siddharth Gulati | Tue, 12-1pm | |
Everyone must join this class's Piazza, at https://piazza.com/gatech/spring2018/cse6242aqcx4242a.
Double check that you are joining the right Piazza! When you have questions about the class, homework, project, etc., post them there. Our teaching staff and your fellow classmates will help answer them quickly. You can also use Piazza to find project teammates.
T-square will only be used for submission of assignments and projects.
While we welcome everyone to share their experiences in tackling issues and helping each other out, please do not post your answers, as that may affect the learning experience of your fellow classmates.
Wk | Month | Dates | Topics | Tue | Thu | Events (eastern time) |
---|---|---|---|---|---|---|
1 | Jan | 9, 11 | * Course introduction * Big data analytics building blocks * Data Collection | intro | building blocks, buzz words, data collection | |
2 | | 16, 18 | * simple storage (SQLite) * Data cleaning * Class Project overview; Heilmeier questions * GT Github; one drive | SQLite, Data cleaning | project overview; Github | HW1 out |
3 | | 23, 25 | * Example projects: (1) Firebird: Predicting Fire Risks in Atlanta, by Shang-Tse Chen (2) PASSAGE: A Travel Safety Assistant, by Nilaksh Das * Data integration: knowledge graph; data reconciliation/de-duplication; similarity functions | Firebird, PASSAGE | Data integration | |
4 | Feb | 30, 1 | * Visualization 101 * Fixing common visualization issues | vis101 | vis fix | HW1 due (Fri, 2/2, 11:55pm) |
5 | | 6, 8 | * Fixing presentation issues * Data visualization for the web (D3) * Publication-quality figures | D3 | Dr. Kevin Roundy, Symantec Research Labs; pub-quality figures | Form project teams by Fri, 2/9; HW2 out |
6 | | 13, 15 | * Data analytics concepts & tasks * Overview of project proposal and presentation * Scaling up: Hadoop, Pig, Hive | analytics concepts, project proposal and presentation | Hadoop, Pig, Hive | |
7 | | 20, 22 | * Scaling up: Spark, Spark SQL * Scaling up: HBase | spark | hbase | |
8 | Mar | 27, 1 | * Classification key concepts, k-NN, cross validation | classification | cont'd | HW2 due (Wed, 2/28, 11:55pm); HW3 out |
9 | | 6, 8 | Project proposal presentations | Show time! | Show time! | Project proposal & slides due (Mon, 3/5, 11:55pm) |
10 | | 13, 15 | * Ensemble method, bagging, random forests * Classification: decision tree, vis (ROC, AUC, confusion matrix) * Clustering: k-means, hierarchical clustering, DBSCAN * Clustering vis * Graph analytics | classification vis, clustering vis, random forests | graph basics | |
11 | | 20, 22 | Spring break | X | X | |
12 | | 27, 29 | * Graph analytics | graph centrality & algorithms | mmap | HW3 due (Fri, 3/30, 11:55pm); HW4 out |
13 | Apr | 3, 5 | * Text analytics: concepts | text analytics | canceled | Proj progress report due (Fri, 4/6, 11:55pm) |
14 | | 10, 12 | * Text analytics: algorithms (LSI=SVD) * Time series: algorithms, visualization, & applications | cont'd | time series: basics, linear forecast | |
15 | | 17, 19 | * Closing words * Lessons learned | time series: non-linear forecast, vis | course review; 10 lessons learned | HW4 due (Fri, 4/20, 11:55pm) |
16 | | 24 | * Project poster presentations | Poster presentation. 4:30pm to 5:45pm-ish. Klaus Atrium. Pizza + drinks served! | X | Proj final report due (Tue, 4/24, 11:55pm) |
The Office of Disability Services offers accommodations for students with disabilities. Please contact the office should you need help.