Data Science Roadmap

A curriculum covering Data Science topics relevant to Machine Learning, Bioinformatics, and Artificial Intelligence. Includes necessary background in mathematics and the sciences.
Suggestions:

  • Do all courses/books in order
  • Don’t skip anything. If you’re tempted to skip a course because you think you already know the material, take the final; if you pass, skip it.
  • Projects are listed at the end of each Level. Consider them assigned at the beginning of the Level and due at the end, so work on them alongside the books and courses as you progress.
  • For any project, feel free to use Python or R. Python is introduced earlier in the guide, so you’ll likely use it almost exclusively for the earlier projects. Lean more heavily on R in the later projects to ensure you get substantial practice in both.
  • Make a point of building your GitHub profile during your study so that over time you show a history of data science projects.

If you haven’t already, begin blogging about what you’re learning on Medium (or elsewhere). Try to focus on tutorials for topics you’ve learned well that might be helpful to others.

Blogs

Bookmark these blogs and try to read at least one article from one of them per week.

Level 0 – Prep

  1. Course: Question Everything: Scientific Thinking in Real Life
  2. Book: College Algebra or Course: College Algebra and Problem Solving or thorough knowledge of Algebra.
  3. Book: Precalculus or Course: Precalculus or Khan Academy or thorough knowledge of Precalculus
  4. Course: Introduction to Computer Science and Programming Using Python
  5. Course Series: Data Science Path on Cognitive Class
  6. Book: Think Python
  7. Project: Do all problems in Rosalind’s Python Village
  8. Project: Titanic: Machine Learning from Disaster
  9. Project: Complete one Bot Programming Competition on CodinGame

Level 1 – Foundations

  1. Course: M001: MongoDB Basics – note: this course is offered only intermittently. Sign up at the start of the Level, then proceed to the other courses and return to this one when it’s in session.
  2. Course: M220P: MongoDB for Python Developers – note: this course is offered only intermittently. Sign up at the start of the Level, then proceed to the other courses and return to this one when it’s in session.
  3. Course: Introduction to Biology – The Secret of Life
  4. Course: Single Variable Calculus
  5. Course: Introduction to Probability and Statistics
  6. Course Series: Deep Learning Path on Cognitive Class
  7. Course: Introduction to Computational Thinking and Data Science
  8. Course: Data Science
  9. Book: Think Stats
  10. Course: Programming for the Web with JavaScript
  11. Course: Introduction to Solid State Chemistry
  12. Course: Multivariable Calculus
  13. Book: An Introduction to Statistical Learning
  14. Course: Principles of Biochemistry
  15. Book: The Elements of Statistical Learning
  16. Course: Intro to Machine Learning
  17. Project: Complete the Hackerrank Python Track
  18. Project: Do 10 problems (of your choice) on Rosalind
  19. Project: House Prices: Advanced Regression Techniques
  20. Project: Complete one competition of your choice from Crowd Analytix
  21. Project: Complete one Bot Programming Competition on CodinGame
  22. Project: Complete Deep Learning – TensorFlow on CodinGame

Level 2 – Develop more expertise

  1. Course: MongoDB Performance – note: this course is offered only intermittently. Sign up at the start of Level 2, then proceed to the other courses and return to this one when the next session opens.
  2. Book: Biology
  3. Course: Linear Algebra
  4. Book: Think Bayes
  5. Course: Proteins: Biology’s Workforce
  6. Course: Mathematics for Computer Science
  7. Course: Python for Data Science
  8. Course: Data Visualization and D3.js
  9. Course: Statistics and Probability in Data Science using Python
  10. Book: Think DSP
  11. Course: DNA: Biology’s Genetic Code
  12. Course: Machine Learning
  13. Book: Think Complexity
  14. Course: Database Mini-Courses – take all mini-courses
  15. Course: Deep Learning
  16. Project: Do 20 problems (of your choice) on Rosalind
  17. Project: Digit Recognizer
  18. Project: Complete the Hackerrank Probability Challenges
  19. Project: Complete the Hackerrank Linear Algebra Foundations Challenges
  20. Project: Complete one competition of your choice from Crowd Analytix
  21. Project: Complete one Bot Programming Competition on CodinGame

Level 3 – Add more rigor

  1. Course: Introduction to Algorithms (Python – requires book purchase) or Algorithms, Part I and Algorithms, Part II (Java)
  2. Course: Design and Analysis of Algorithms (Python – requires book purchase) or Analysis of Algorithms (Java)
  3. Book: The Art of R Programming
  4. Course: Mathematical Biostatistics Boot Camp 1
  5. Course: Mathematical Biostatistics Boot Camp 2
  6. Course Series: Big Data Path on Cognitive Class
  7. Course: Convex Optimization
  8. Book: R for Data Science
  9. Course: Probability: Basic Concepts & Discrete Random Variables
  10. Course Series: Hadoop Path on Cognitive Class
  11. Course: Probability: Distribution Models & Continuous Random Variables
  12. Course: Introduction to Mechanics, Part 1
  13. Course: Electricity & Magnetism, Part 2
  14. Course: Statistics for Applications
  15. Course: Intro to Hadoop and MapReduce
  16. Course: Differential Equations
  17. Project: Do 30 problems (of your choice) on Rosalind
  18. Project: Complete one competition of your choice on Kaggle
  19. Project: Complete the Hackerrank Algorithms Challenges
  20. Project: Complete one competition of your choice from Crowd Analytix
  21. Project: Complete one competition of your choice from Analytics Vidhya

Level 4 – Practical applications

  1. Book: Machine Learning with R
  2. Course: Statistics and R for the Life Sciences
  3. Course: Introduction to Linear Models and Matrix Algebra
  4. Book: The Quest for Artificial Intelligence
  5. Course: Statistical Inference and Modeling for High-throughput Experiments
  6. Course: High-Dimensional Data Analysis
  7. Course: Introduction to Bioconductor: Annotation and Analysis of Genomes and Genomic Assays
  8. Course: High-performance Computing for Reproducible Genomics
  9. Course: Case Studies in Functional Genomics
  10. Course: Quantum Mechanics for Everyone
  11. Course: Artificial Intelligence (AI)
  12. Course: Machine Learning
  13. Book: Multiagent Systems
  14. Course: Robotics
  15. Course: Animation and CGI Motion
  16. Project: Do 30 problems (of your choice) on Rosalind
  17. Project: Complete one competition of your choice on Kaggle
  18. Project: Complete one competition of your choice on Kaggle
  19. Project: Complete the Hackerrank Artificial Intelligence Challenges
  20. Project: Complete one competition of your choice from Crowd Analytix
  21. Project: Complete one competition of your choice from Analytics Vidhya
  22. Project: Complete one competition of your choice from Driven Data

Level 5 – Bonus Round – Advanced

  1. Course: Topics in Mathematics of Data Science
  2. Book: Bayesian Methods for Hackers
  3. Book: Bayesian Methods in the Search for MH370
  4. Course: Mathematics of Machine Learning
  5. Book: Mining of Massive Datasets
  6. Book: Informatics in the Future
  7. Course: Discrete Stochastic Processes
  8. Book: Bisociative Knowledge Discovery
  9. Course: Dynamic Systems and Control
  10. Book: New Horizons for a Data-Driven Economy
  11. Book: The Challenge of Chance
