Minimum Viable Study Plan for Machine Learning Interview
This repo is based on REAL interview questions from big companies, and the study materials come from recognized experts, e.g. Andrew Ng, Yoshua Bengio, etc.
I have 5 YOE in Machine Learning and have interviewed with more than a dozen big companies. This is the minimum viable study plan that covers actual interview questions from Facebook, Amazon, Apple, Google, Microsoft, Snapchat, LinkedIn, Intuit, etc.
If you're interested in learning more about the paid ML system design course, subscribe here. The course provides 6-7 practical use cases with proven solutions. After the course, you will be able to solve new problems with a systematic approach.
|Section|Description|
|---|---|
|Prepare for interview|Common questions about the Machine Learning interview process.|
|Study guide|The study guide contains the minimum set of focus areas to ace your interview.|
|Design ML system|ML system design includes actual ML system design use cases.|
|Test your ML knowledge|The Machine Learning quizzes are based on actual interview questions from dozens of big companies.|
|Practice coding|LeetCode questions by category for MLE|
|Advanced topics|Read advanced topics|
|Mock interview|Contact [email protected]|
I use LC time tracking to keep track of how many times I solve a question and how long I spend each time. Once I have finished non-trivial medium LC questions 3 times, I have absolutely no issues solving them in actual interviews (sometimes within 8-10 minutes). It makes a big difference.
LeetCode questions by category
- Know SQL joins: self join, inner, left, right, etc.
- Use HackerRank to practice SQL.
- Revise/learn SQL window functions.
- Java garbage collection
- Python pass-by-object-reference
- Python GIL, Fluent Python, chapter 17
- Python multithreading
- Python concurrency, Fluent Python, chapter 18
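As a quick illustration of the pass-by-object-reference item above, here is a minimal sketch. The function names `append_item` and `rebind` are made up for this example. It shows that mutating the object a parameter refers to is visible to the caller, while rebinding the parameter name inside the function is not.

```python
def append_item(items):
    # The parameter refers to the same list object as the caller's variable,
    # so an in-place mutation is visible outside the function.
    items.append(4)


def rebind(items):
    # Assigning to the parameter only rebinds the local name;
    # the caller's list is left untouched.
    items = [0]
    return items


nums = [1, 2, 3]

append_item(nums)
after_mutation = list(nums)  # snapshot after the in-place mutation

rebind(nums)
after_rebind = list(nums)    # snapshot after the rebinding call

print(after_mutation)
print(after_rebind)
```

Both snapshots show the appended element, and the call to `rebind` changes nothing for the caller. A common interview follow-up is to ask what happens with immutable arguments such as tuples or strings.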
Statistics and probability
- Learn Bayesian statistics and practice Bayesian problems
- Let A and B be events on the same sample space, with P(A) = 0.6 and P(B) = 0.7. Can these two events be disjoint?
- Given that Alice has 2 kids, at least one of which is a girl, what is the probability that both kids are girls? (credit swierdo)
- A group of 60 students is randomly split into 3 classes of equal size. All partitions are equally likely. Jack and Jill are two students belonging to that group. What is the probability that Jack and Jill will end up in the same class?
- Given an unfair coin with a probability of heads not equal to 0.5, what algorithm could you use to create a list of random 1s and 0s?
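For the unfair coin question, one well-known answer is the von Neumann trick: flip the coin twice, keep the first flip if the two results differ, and discard the pair otherwise. Since P(heads, tails) = p(1 - p) = P(tails, heads), the kept bit is fair. Below is a minimal sketch, assuming independent flips; the names `biased_flip`, `fair_bit` and `fair_bits` are hypothetical helpers, and the biased coin is simulated with Python's `random` module.

```python
import random


def biased_flip(p_heads, rng):
    """Simulate one flip of an unfair coin: 1 (heads) with probability p_heads."""
    return 1 if rng.random() < p_heads else 0


def fair_bit(p_heads, rng):
    """Von Neumann trick: flip twice, keep the first flip only if the two differ."""
    if not 0.0 < p_heads < 1.0:
        # With p_heads equal to 0 or 1 the two flips never differ.
        raise ValueError("p_heads must be strictly between 0 and 1")
    while True:
        first = biased_flip(p_heads, rng)
        second = biased_flip(p_heads, rng)
        if first != second:
            return first


def fair_bits(n, p_heads, seed=0):
    """Return a list of n unbiased bits generated from the biased coin."""
    rng = random.Random(seed)
    return [fair_bit(p_heads, rng) for _ in range(n)]


if __name__ == "__main__":
    bits = fair_bits(10_000, p_heads=0.8)
    print(sum(bits) / len(bits))  # should be close to 0.5
```

The expected number of flips per output bit is 1 / (p(1 - p)), so the method gets slow when the coin is heavily biased. That trade-off is a natural follow-up discussion.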
- Spark architecture and Spark lessons learned (outdated since Spark 3.0 release)
- Spark OOM
- Cassandra best practice and here
- Collinearity and read more
- Feature scaling
- Random forest vs GBDT
- SMOTE (Synthetic Minority Over-sampling Technique)
- Compare discriminative vs. generative models and extra read
- Logistic regression. Try to implement logistic regression from scratch. Bonus points for a vectorized version in NumPy completed in 20 minutes (sample code from martinpella). Follow up with a MapReduce version.
- Quantile regression
- L1/L2 intuition
- Decision tree and Random Forest fundamentals
- Explain boosting
- Least Squares as a Maximum Likelihood Estimator
- Maximum Likelihood Estimator introduction
- K-means. Try to implement K-means from scratch (sample code from flothesof.github.io). Bonus points for a vectorized version in NumPy completed in 20 minutes. Follow up with the worst-case time complexity and improvements to initialization.
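For the K-means exercise above, here is a minimal vectorized sketch in NumPy. It is not the linked sample code. It assumes Euclidean distance and uses k-means++ seeding as one common answer to the initialization follow-up. The function names `kmeans_pp_init` and `kmeans` and the parameters `n_iters` and `seed` are choices made for this example.

```python
import numpy as np


def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: pick each new center with probability
    proportional to its squared distance from the nearest chosen center."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(1, k):
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        total = d2.sum()
        if total == 0:
            # All points coincide with a chosen center; fall back to uniform.
            idx = rng.integers(n)
        else:
            idx = rng.choice(n, p=d2 / total)
        centers.append(X[idx])
    return np.array(centers)


def assign(X, centers):
    """Return the index of the nearest center for every point."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return d2.argmin(axis=1)


def kmeans(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm with k-means++ initialization."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(X, k, rng)
    for _ in range(n_iters):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(X, centers)


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    data = np.vstack([demo_rng.normal(0, 0.5, (100, 2)),
                      demo_rng.normal(4, 0.5, (100, 2))])
    centers, labels = kmeans(data, k=2)
    print(centers)
```

Each iteration costs O(n * k * d) time for n points in d dimensions, and the broadcasted distance array takes O(n * k * d) memory, so a chunked distance computation may be needed for large inputs.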
- I didn't use flashcards, but I'm sure they help to a certain extent.
- The Deep Learning book. Read Part II.
- Machine Learning Yearning. Read from section 5 to section 27.
- Neural network and backpropagation
- Activation functions
- Loss and optimization
- Convolutional Neural Network notes
- Recurrent Neural Networks
ML system design
ML classic papers
- Technical debt in ML
- Rules of ML
- An Opinionated Guide to ML Research. There is valuable advice in the Personal development section at the bottom.
- Scaling ML at Uber
- DL in production
- Uber eats trip optimization
- Uber food discovery
- Personalized store feed
- Doordash dispatch optimization
Fraud detection (TBD)
- Ad click prediction trend
- Ad Clicks CTR
- Delayed feedback
- Entity embedding
- StarSpace: Embed All The Things!
- Twitter timeline ranking
- Instagram explore
- TikTok recommendation
- Deep Neural Networks for YouTube Recommendations
- Wide & Deep Learning for Recommender Systems
Acknowledgements and contributing
- Thanks for the early feedback and contributions from Vivian, aragorn87, and others. You can create an Issue or Pull Request on this repo.
- Thanks to this community, we have donated about $200 to HopeForPaws. If you want to support them, you can also contribute on their website.
- If you want to contribute and stay anonymous, send an email to [email protected]