CMSC 491/691, Introduction to Data Science. Fall 2017
Computer Science & Electrical Engineering Department
University of Maryland Baltimore County
Course References
Books
Data Mining: Concepts and Techniques
by J. Han, M. Kamber, and J. Pei (3rd edition, 2011).
Alias: DMCT11
Mining of Massive Datasets
by Jure Leskovec, Anand Rajaraman, and Jeff Ullman (2014).
Alias: MMDS14
An Introduction to Statistical Learning with Applications in R
by G. James, D. Witten, T. Hastie and R. Tibshirani (2013).
Alias: ISLA13
Slides and videos are also available.
The Elements of Statistical Learning
by T. Hastie, R. Tibshirani, and J. Friedman (2009).
Alias: ESL09
A First Course in Design and Analysis of Experiments
Doing Data Science
by Cathy O'Neil and Rachel Schutt (2013).
Data Science from Scratch: First Principles with Python
by Joel Grus (2015).
Code samples
Python for Data Analysis (Data Wrangling with Pandas, NumPy, and IPython)
by Wes McKinney (2012).
Interactive Data Visualization for the Web
by S. Murray (2013).
Python and Software resources
VirtualBox VM and Ubuntu Linux
Python
Anaconda
Jupyter
scikit-learn (Machine Learning in Python)
Statsmodels (Statistics Python Library)
seaborn: statistical data visualization (Python Library)
pandas (Python Data Analysis Library)
matplotlib (2-D plotting library)
Installing Jupyter (IPython Notebook) on HDP 2.4
Spark
Hortonworks Apache Hadoop Sandbox Virtual Machine
Training Apache Spark Essentials
Datasets
OpenBaltimore
DataSF
Kaggle Datasets
IARPA Mapping of the World Challenge
IARPA MORGOTH'S CROWN Challenge
Satellite Imagery Feature Detection Challenge
SpaceNet Challenges
Other
IARPA Prize Challenges
Planet: Understanding the Amazon from Space Challenge
UCI Machine Learning Repository
Reading Assignments
Please refer to the Schedule of Lectures.
© Copyright 2017- Dr. K. Kalpakis