CMSC 491/691, Introduction to Data Science. Fall 2017
Computer Science & Electrical Engineering Department
University of Maryland Baltimore County


Schedule of Lectures

This page will be updated during the semester.


Topics Readings Notes Final exam reading list
Course overview and logistics
What is data science?
Data Science process
Chapter 1 of O'Neil and Schutt
There's More Than One Kind of Data Scientist by Harlan Harris.
CRISP-DM
Intro to data science (PPT) slides 1-10
Databases - SQL SQL PPT Slides slides 1-41
Databases - NoSQL PPT slides 1-22
A Crash Course in Python Chapter 2 of Grus
Appendix A of McKinney
4 min overview of jupyter
Try jupyter on your browser
Python Crash course
Pandas Pandas Library Documentation
McKinney's book on pandas
10 Minutes to pandas
Pandas Cookbook
Pandas I Pandas II
Getting to Know Your Data Chapter 2 of DMCT PPT PDF slides 1-41, 55-68
Data Preprocessing Chapter 3 of DMCT PPT PDF slides 1-15, 23, 25, 26, 31, 34, 35, 42-48, 54-61
Clustering Chapter 10 of DMCT PPT PDF slides 1-27
Linear regression Chapter 3 of ISLA PDF slides 1-26
Classification I
Decision Trees
Naive Bayes
Logistic regression
Model evaluation and Selection
Bagging and Boosting
Random Forests
Chapter 4 of ISLA
Chapter 8 of DMCT
Logistic regression
PPT PDF
slides 1-11 of the logistic regression notes
slides 1-13, 28-30, 48-55, 58-62, 68-70 of the PPT
Classification II
Bayesian Belief Nets
Neural Nets and Backpropagation
Support Vector Machines
k-Nearest Neighbors
Active Learning
Transfer Learning
Chapter 9 of DMCT PPT PDF slides 1-6, 12-17, 26-30, 35-40, 66-67
Outlier Detection Chapter 12 of DMCT PPT PDF slides 1-13
Neural Nets and Deep Learning
Dimensionality reduction (SVD and CUR) Chapter 11 of MMDS PPT PDF slides 7-18, 46-54
Recommendation Systems Chapter 9 of MMDS PPT PDF slides 9-23
Mining Social-Network Graphs Chapter 10 of MMDS PPT PDF slides 7-14, 19-30, 49-52
Distributed/Cloud computing, scaling up Amazon EC2 Tutorial
Hadoop and Map-Reduce Chapter 2 of MMDS
MapReduce Tutorial by Yahoo
Hadoop Streaming Framework
"Writing an Hadoop MapReduce Program in Python" by Michael Noll
"A Guide to Python Frameworks for Hadoop" by Uri Laserson
Making Python on Apache Hadoop Easier with Anaconda and CDH
PPT PDF slides 1-20, 23-25
Spark Apache Spark Tutorial: Machine Learning with PySpark
An Overview of Spark by Jim Scott
Intro to Spark by Matei Zaharia
slides 1-13, 15-26, 29-31, 38-51, 58 from Zaharia's PPT


© Copyright 2017- Dr. K. Kalpakis