E0 259 : Data Analytics
Ramesh Hariharan (Strand Genomics and CSA, IISc) and Rajesh Sundaresan (ECE, IISc)
Lectures: (Most likely) Tuesdays and Thursdays, Time: TBD, Location: TBD
2017 Lectures and assignments
Links will be put up as the course progresses.
2016 Lectures and assignments
2015 complementary lectures and assignments
Will be scheduled towards the end of the course
Data sets from astronomy, genomics, neuroscience, sports, surveillance cameras, and social networks will be analysed to answer specific scientific questions. Statistical tools and modeling techniques will be introduced as needed to analyse the data and eventually address the scientific question.
Data analytics has assumed increasing importance in recent times. Several industries are now built around the use of data for decision making. Several research areas, genomics and neuroscience being notable examples, increasingly rely on large-scale data generation, rather than small-scale experimentation, to generate initial hypotheses. Both trends create a need for data analytics. This course will develop modern statistical tools and modeling techniques through hands-on data analysis in a variety of application domains.
The course will illustrate the principles of hands-on data analytics through ten case studies. For each case study, we will introduce a scientific question and discuss why it should be addressed. Next, we will present the available data and how it was collected. We will then discuss models, provide analyses, and finally touch upon how to address the scientific question using the analyses.
In 2016, we covered the following case studies.
- Astronomy: From Tycho Brahe's observations to the conclusion that Mars moves in an elliptical orbit.
- Visual Neuroscience: Neural correlates predict search difficulty.
- Genomics: Understanding the causes of cancer.
- Sports: The Duckworth-Lewis-Stern method for setting targets in shortened limited-overs cricket matches.
- Genomics: The basis for red-green colour blindness.
- Genomics: Evolutionary history of Indian caste populations.
- Signal Processing: Video background separation.
- Networks: Community detection.
- Recommendation systems.
- Networks: Functional connectivity patterns of the brain.
- Random Processes (E2 202) OR Probability and Statistics (E0 232) OR equivalent.
There will be about eight assignments, one on each of the first eight modules. A fair amount of hands-on work is expected. Students will use Python.
- 50/100 : Assignments
- 20/100 : Final examination
- 30/100 : Course project and presentation
- There is no textbook for this course. Handouts from various sources will be provided.