Machine Learning Resources
Tutorials
Machine Learning for Developers
http://xyclade.github.io/MachineLearning/
Logistic Regression Vs Decision Trees Vs SVM
- Part I: http://www.edvancer.in/logistic-regression-vs-decision-trees-vs-svm-part1/
- Part II: http://www.edvancer.in/logistic-regression-vs-decision-trees-vs-svm-part2/
Machine learning: A practical introduction
- blog: http://www.infoworld.com/article/3010401/big-data/machine-learning-a-practical-introduction.html
Tutorials on Machine Learning (Tom Dietterich)
http://web.engr.oregonstate.edu/~tgd/projects/tutorials.html
Machine Learning Tutorials
- intro: “This repository contains a topic-wise curated list of Machine Learning and Deep Learning tutorials, articles and other resources. Other awesome lists can be found in this list.”
- homepage: http://ujjwalkarn.github.io/Machine-Learning-Tutorials/
- github: https://github.com/ujjwalkarn/Machine-Learning-Tutorials/blob/master/README.md
A Visual Introduction to Machine Learning
Machine Learning – A gentle & structured introduction
- blog: http://blog.cambridgecoding.com/2016/02/14/machine-learning-a-gentle-structured-introduction/
- slides: http://pan.baidu.com/s/1hqVGAl2
A Comparison of Supervised Learning Algorithms
Statistical Learning and Kernel Methods
Getting Started with Machine Learning
https://www.infoq.com/articles/getting-started-ml
Getting Started with Machine Learning: For the absolute beginners and fifth graders
https://medium.com/@suffiyanz/getting-started-with-machine-learning-f15df1c283ea#.fqipdiyyn
Machine Learning Crash Course
- part 1: https://ml.berkeley.edu/blog/2016/11/06/tutorial-1/
- part 2: https://ml.berkeley.edu/blog/2016/12/24/tutorial-2/
Rules of Machine Learning: Best Practices for ML Engineering
http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
Machine Learning is Fun!
Machine Learning is Fun! - The world’s easiest introduction to Machine Learning
Machine Learning is Fun! Part 2 - Using Machine Learning to generate Super Mario Maker levels
Machine Learning is Fun! Part 3: Deep Learning and Convolutional Neural Networks
Machine Learning is Fun! Part 4: Modern Face Recognition with Deep Learning
Machine Learning Theory
Machine Learning Theory - Part 1: Introduction
https://mostafa-samir.github.io/ml-theory-pt1/
Machine Learning Theory - Part 2: Generalization Bounds
https://mostafa-samir.github.io/ml-theory-pt2/
Boosting
“Quick Introduction to Boosting Algorithms in Machine Learning”
http://www.analyticsvidhya.com/blog/2015/11/quick-introduction-boosting-algorithms-machine-learning/
An Empirical Comparison of Three Boosting Algorithms on Real Data Sets with Artificial Class Noise (AdaBoost vs. LogitBoost vs. BrownBoost)
A (small) introduction to Boosting
Boosting and AdaBoost for Machine Learning
Gradient Boosting
Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python
Understanding Gradient Boosting, Part 1
Gradient Boosting explained [demonstration]
- blog: https://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html
A Kaggle Master Explains Gradient Boosting
http://blog.kaggle.com/2017/01/23/a-kaggle-master-explains-gradient-boosting/
Performance of various open source GBM implementations
- intro: h2o VS. xgboost VS. lightgbm
- github: https://github.com/szilard/GBM-perf
arboretum - Gradient Boosting on GPU
- intro: Gradient Boosting powered by GPU (NVIDIA CUDA)
- github: https://github.com/sh1ng/arboretum
Gradient Boosting from scratch
https://medium.com/mlreview/gradient-boosting-from-scratch-1e317ae4587d
XGBoost
XGBoost: A Scalable Tree Boosting System
XGBoost: eXtreme Gradient Boosting
- intro: Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow
- github: https://github.com/dmlc/xgboost
GPU Accelerated XGBoost
Awesome XGBoost
- intro: This page contains a curated list of examples, tutorials, and blogs about XGBoost use cases.
- github: https://github.com/dmlc/xgboost/blob/master/demo/README.md
Complete Guide to Parameter Tuning in XGBoost (with codes in Python)
- blog: https://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/
- zh-blog: http://blog.csdn.net/u010657489/article/details/51952785
LinXGBoost: Extension of XGBoost to Generalized Local Linear Models
Tree Boosting With XGBoost - Why Does XGBoost Win “Every” Machine Learning Competition?
- intro: Master thesis
- thesis page: https://brage.bibsys.no/xmlui/handle/11250/2433761
XGBoost: Scalable GPU Accelerated Learning
- intro: Describes the multi-GPU gradient boosting algorithm implemented in the XGBoost library
- arxiv: https://arxiv.org/abs/1806.11248
LightGBM
LightGBM, Light Gradient Boosting Machine
- intro: LightGBM is a fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
- github: https://github.com/Microsoft/LightGBM
pyLightGBM: Python binding for Microsoft LightGBM
Benchmarking LightGBM: how fast is LightGBM vs xgboost?
GPU-acceleration for Large-scale Tree Boosting
- intro: University of California, Davis & Google Research
- intro: GPU Accelerated LightGBM for Histogram-based GBDT Training
- arxiv: https://arxiv.org/abs/1706.08359
- github: https://github.com/huanzhang12/lightgbm-gpu
Lessons Learned From Benchmarking Fast Machine Learning Algorithms
- intro: XGBoost and LightGBM
- blog: https://blogs.technet.microsoft.com/machinelearning/2017/07/25/lessons-learned-benchmarking-fast-machine-learning-algorithms/
CatBoost
CatBoost is an open-source gradient boosting library with categorical features support
- intro: CatBoost is a machine learning method based on gradient boosting over decision trees.
- homepage: https://catboost.yandex/
- github: https://github.com/catboost/catboost
Bootstrap
Coding, Visualizing, and Animating Bootstrap Resampling
http://minimaxir.com/2015/09/bootstrap-resample/
Can we trust the bootstrap in high-dimension?
Cascades
Making faces with Haar cascades and mixed integer linear programming
- blog: http://matthewearl.github.io/2016/01/14/inverse-haar/
- github: https://github.com/matthewearl/inversehaar
Classifiers
Measuring Performance of Classifiers
Convex Optimization
Convex Optimization: Algorithms and Complexity
- arxiv: http://arxiv.org/abs/1405.4980
- blog: https://blogs.princeton.edu/imabandit/2015/11/30/convex-optimization-algorithms-and-complexity/
cvx-optim.torch: Torch library for convex optimization
Decision Tree
Soft Decision Trees
- paper: http://www.cmpe.boun.edu.tr/~ethem/files/papers/icpr2012_softtree.pdf
- project page: http://www.cs.cornell.edu/~oirsoy/softtree.html
- github: https://github.com/oir/soft-tree
Canonical Correlation Forests
Decision Trees Tutorial
End-to-end Learning of Deterministic Decision Trees
- intro: Heidelberg University
- arxiv: https://arxiv.org/abs/1712.02743
Extremely Fast Decision Tree
Generative Models
A note on the evaluation of generative models
Markov Networks
Markov Logic Networks
Markov Chains
Evolution, Dynamical Systems and Markov Chains
http://www.offconvex.org/2016/03/07/evolution-markov-chains/
Markov Chains: Explained Visually
Matrix Computations
Randomized Numerical Linear Algebra for Large Scale Data Analysis
http://researcher.watson.ibm.com/researcher/view_group.php?id=5131
Sketching-based Matrix Computations for Machine Learning
http://xdata-skylark.github.io/libskylark/
Matrix Factorization
Neural Network Matrix Factorization
Beyond Low Rank + Sparse: Multi-scale Low Rank Matrix Decomposition
k-Means Clustering Is Matrix Factorization
CuMF_SGD: Fast and Scalable Matrix Factorization
- arxiv: https://arxiv.org/abs/1610.05838
- github: https://github.com/CuMF/cumf_sgd
Gaussian Processes
The Gaussian Processes Web Site
Chained Gaussian Processes
- jmlr: http://jmlr.org/proceedings/papers/v51/saul16.html
- arxiv: http://arxiv.org/abs/1604.05263
- github: https://github.com/SheffieldML/ChainedGP
Introduction to Gaussian Processes
Multi-label Learning
Neural Network Models for Multilabel Learning
Conditional Bernoulli Mixtures for Multi-label Classification
- homepage: http://www.chengli.io/publications/li2016conditional.html
- paper: http://www.chengli.io/publications/li2016conditional.pdf
- slides: http://www.chengli.io/publications/li2016conditional_slides.pdf
- github: https://github.com/cheng-li/pyramid
- wiki: https://github.com/cheng-li/pyramid/wiki/CBM
Multi-Label Learning with Label Enhancement
https://arxiv.org/abs/1706.08323
Multi-Task Learning
Multitask Learning
- intro: 1997
- paper: http://www.cs.cornell.edu/~caruana/mlj97.pdf
Multi-Task Learning: Theory, Algorithms, and Applications (2012)
Nearest Neighbors
Annoy: Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
- github: https://github.com/spotify/annoy
Hidden Markov Models (HMM)
tensorflow_hmm: A tensorflow implementation of an HMM layer
- intro: Tensorflow and numpy implementations of the HMM viterbi and forward/backward algorithms
- github: https://github.com/dwiel/tensorflow_hmm
Online Learning
Lecture Notes on Online Learning
Scale-Free Online Learning
Online Learning with Expert Advice
Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent (v.2)
- author: Leon Bottou
- intro: SGD, ASGD, Stochastic Gradient SVM, Stochastic Gradient CRFs
- homepage: http://leon.bottou.org/projects/sgd
Gradient descent with Python
Stochastic Gradient Descent (SGD) with Python
Gradient Descent Learns Linear Dynamical Systems
Why is gradient descent robust to non-linearly separable data?
Boosted Regression Trees
DART: Dropouts meet Multiple Additive Regression Trees
- paper: http://www.jmlr.org/proceedings/papers/v38/korlakaivinayak15.html
- github: https://github.com/dmlc/xgboost/blob/master/doc/tutorials/dart.md
Visualization
Visualising High-Dimensional Data
- blog: http://blog.applied.ai/visualising-high-dimensional-data/
- ipn(“t-SNE Demo”): https://s3-eu-west-1.amazonaws.com/appliedai.static/tsnedemo/htmlrenders/01_EndToEnd_DataViz.html
Interactive demonstrations for ML courses
Comprehensive Guide on t-SNE algorithm with implementation in R & Python
https://www.analyticsvidhya.com/blog/2017/01/t-sne-implementation-r-python/
Tricks
Machine Learning Trick of the Day
- (1): Replica Trick: http://blog.shakirm.com/2015/07/machine-learning-trick-of-the-day-1-replica-trick/
- (2): Gaussian Integral Trick: http://blog.shakirm.com/2015/08/machine-learning-trick-of-the-day-2-gaussian-integral-trick/
- (3): Hutchinson’s Trick: http://blog.shakirm.com/2015/09/machine-learning-trick-of-the-day-3-hutchinsons-trick/
- (4): Reparameterisation Tricks: http://blog.shakirm.com/2015/10/machine-learning-trick-of-the-day-4-reparameterisation-tricks/
- (5): Log Derivative Trick: http://blog.shakirm.com/2015/11/machine-learning-trick-of-the-day-5-log-derivative-trick/
Debug Machine Learning
Debugging Machine Learning Tasks
Tackle Unbalanced Classes
Classic strategies:
- class re-sampling
- cost-sensitive training
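The two classic strategies above can be illustrated with a minimal, standard-library-only sketch. The function names `oversample_minority` and `balanced_class_weights` are hypothetical helpers written for this list, not part of any linked resource. The weighting formula `n_samples / (n_classes * class_count)` is one common heuristic (the same one scikit-learn uses for `class_weight="balanced"`); other weighting schemes are equally valid.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Class re-sampling: randomly duplicate rows of the smaller classes
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def balanced_class_weights(labels):
    """Cost-sensitive training: per-class weights that make each class
    contribute equally to a weighted loss (rare classes weigh more)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


# Toy data (made up for illustration): 8 negatives, 2 positives.
labels = [0] * 8 + [1] * 2
samples = [[float(i)] for i in range(10)]

new_samples, new_labels = oversample_minority(samples, labels)
weights = balanced_class_weights(labels)

print(Counter(new_labels))  # both classes now have 8 rows
print(weights)              # the minority class gets the larger weight
```

Re-sampling changes the training data itself, while class weights leave the data untouched and rescale each example's contribution to the loss; which one works better depends on the model and dataset.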
Dealing with Unbalanced Classes, SVM, Random Forests and Decision Trees in Python
- blog: http://bigdataexaminer.com/data-science/dealing-with-unbalanced-classes-svm-random-forests-and-decision-trees-in-python/
- blog: http://www.kdnuggets.com/2016/04/unbalanced-classes-svm-random-forests-python.html
Fighting Class Unbalance Supervised ML Problem
http://www.erogol.com/fighting-class-unbalance-supervised-ml-problem/
Survey of resampling techniques for improving classification performance in unbalanced datasets
Learning from Imbalanced Classes
- blog: http://www.svds.com/learning-imbalanced-classes/
- github: https://github.com/silicon-valley-data-science/learning-from-imbalanced-classes
Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores
This Machine Learning Project on Imbalanced Data Can Add Value to Your Resume
Dealing with unbalanced data: Generating additional data by jittering the original image
- blog: https://medium.com/@vivek.yadav/dealing-with-unbalanced-data-generating-additional-data-by-jittering-the-original-image-7497fe2119c3
- ipynb: https://nbviewer.jupyter.org/github/vxy10/SCND_notebooks/blob/master/preprocessing_stuff/img_transform_NB.ipynb
7 Techniques to Handle Imbalanced Data
http://www.kdnuggets.com/2017/06/7-techniques-handle-imbalanced-data.html
Mathematics
Some Notes on Applied Mathematics for Machine Learning
An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation
Probability Cheatsheet
- homepage: http://www.wzchen.com/probability-cheatsheet
- github: https://github.com/wzchen/probability_cheatsheet
Probability Cheatsheet v2.0
http://static1.squarespace.com/static/54bf3241e4b0f0d81bf7ff36/t/55e9494fe4b011aed10e48e5/1441352015658/probability_cheatsheet.pdf
Kalman Filter
How Kalman Filters Work
- part 1: http://www.anuncommonlab.com/articles/how-kalman-filters-work/
- part 2: http://www.anuncommonlab.com/articles/how-kalman-filters-work/part2.html
- part 3: http://www.anuncommonlab.com/articles/how-kalman-filters-work/part3.html
Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation
- paper: https://www.cl.cam.ac.uk/~rmf25/papers/Understanding%20the%20Basis%20of%20the%20Kalman%20Filter.pdf
L-BFGS
Code Stylometry
De-anonymizing Programmers via Code Stylometry
- keywords: source code authorship, random forests
- paper: http://www.princeton.edu/~aylinc/papers/caliskan-islam_deanonymizing.pdf
Recommendation / Recommender System
Master Recommender Systems
- intro: Learn how to design, build, and evaluate recommender systems for commerce and content.
- course page: https://www.coursera.org/specializations/recommender-systems
Human Curation and Convnets: Powering Item-to-Item Recommendations on Pinterest
Top-N Recommendation with Novel Rank Approximation
- arxiv: http://arxiv.org/abs/1602.07783
- github: https://github.com/sckangz/SDM16
On the Effectiveness of Linear Models for One-Class Collaborative Filtering
- paper: http://www.cs.toronto.edu/~darius/papers/SedhainEtAl-AAAI2016.pdf
- github: https://github.com/mesuvash/LRec
An Adaptive Matrix Factorization Approach for Personalized Recommender Systems
Implementing your own Recommender Systems in Python using Stochastic Gradient Descent
How to Write Your Own Recommendation System
- blog(part 1): http://elliot.land/how-to-write-your-own-recommendation-system-part-1
- blog(part 2): http://elliot.land/how-to-write-your-own-recommendation-system-part-2
Addressing Cold Start for Next-song Recommendation
- intro: ACM Recsys 2016
- paper: http://mac.citi.sinica.edu.tw/~yang/pub/chou16recsys.pdf
- github: https://github.com/fearofchou/ALMM
Using Navigation to Improve Recommendations in Real-Time
Local Item-Item Models For Top-N Recommendation
Lessons learned from building real-life recommender systems
- intro: Recsys 2016 tutorial
- slides: http://www.slideshare.net/xamat/recsys-2016-tutorial-lessons-learned-from-building-reallife-recommender-systems
- mirror: https://pan.baidu.com/s/1eSdWcue
Algorithms Aside: Recommendation As The Lens Of Life
Pairwise Preferences Based Matrix Factorization and Nearest Neighbor Recommendation Techniques
Mendeley: Recommendations for Researchers
- intro: RecSys 2016
- slides: http://saulvargas.es/slides/recsys2016/#/
Past, Present and Future of Recommender Systems: an Industry Perspective
- intro: RecSys 2016
- slides: http://www.slideshare.net/xamat/past-present-and-future-of-recommender-systems-and-industry-perspective
- mirror: https://pan.baidu.com/s/1kVQ4SKZ
TF-recomm: Tensorflow-based Recommendation systems
List of Recommender Systems
Related Pins at Pinterest: The Evolution of a Real-World Recommender System
- intro: Pinterest, Inc.
- arxiv: https://arxiv.org/abs/1702.07969
Lifelong Learning
Lifelong Machine Learning
- book: https://www.cs.uic.edu/~liub/lifelong-machine-learning.html
- pdf: https://vk.com/doc-44016343_439142620?hash=a96978fe024d79e455&dl=2e154ea5883bbc8fd6
NELL (Never Ending Language Learner)
Toward an architecture for neverending language learning
NEIL (Never Ending Image Learner)
NEIL: Extracting Visual Knowledge from Web Data
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2013/papers/Chen_NEIL_Extracting_Visual_2013_ICCV_paper.pdf
- slides: http://web.cs.hacettepe.edu.tr/~nazli/courses/bil722/slides/week10_1.pdf
- slides: http://sglab.kaist.ac.kr/~sungeui/IR/Presentation/first/20141104%EC%9D%B4%EC%9C%A4%EC%84%9D.pdf
- talk: http://techtalks.tv/talks/neil-extracting-visual-knowledge-from-web-data/59408/
- poster: http://www.cs.cmu.edu/~xinleic/docs/neil/NEIL_poster.pdf
Expert Gate: Lifelong Learning with a Network of Experts
Lifelong Machine Learning and Computer Reading the Web
- intro: KDD 2016 Tutorial
- paper: https://www.cs.uic.edu/~liub/Lifelong-Machine-Learning-Tutorial-KDD-2016.pdf
Lifelong Machine Learning for Natural Language Processing
- intro: EMNLP 2016 Tutorial
- slides: http://www.emnlp2016.net/tutorials/chen-liu-t3.pdf
Zero-Shot Learning
An embarrassingly simple approach to zero-shot learning
- paper: http://jmlr.org/proceedings/papers/v37/romera-paredes15.html
- github: https://github.com/MLWave/extremely-simple-one-shot-learning
Zero-Shot Learning - The Good, the Bad and the Ugly
One Shot Learning
Matching Networks for One Shot Learning
Maximum Entropy
Maximum entropy probability distribution
https://www.wikiwand.com/en/Maximum_entropy_probability_distribution
Metric Learning
Distance Metric Learning: A Comprehensive Survey
- intro: 2006
- paper: https://www.cs.cmu.edu/~liuy/frame_survey_v2.pdf
Large Scale Metric Learning from Equivalence Constraints
- intro: CVPR 2012. KISSME
- paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.384.2335&rep=rep1&type=pdf
Large Scale Strongly Supervised Ensemble Metric Learning, with Applications to Face Verification and Retrieval
- intro: NEC Laboratories America
- arxiv: https://arxiv.org/abs/1212.6094
Finance and Trading
Efficient Portfolio optimisation by Hybridised Machine Learning
- intro: Thesis 2014
- mirror: http://pan.baidu.com/s/1eQvSyZ4
Feature Selection for Portfolio Optimization
The Efficient Frontier: Markowitz portfolio optimization in Python
Self-Study Plan for Becoming a Quantitative Trader
- part 1: https://www.quantstart.com/articles/Self-Study-Plan-for-Becoming-a-Quantitative-Trader-Part-I
- part 2: https://www.quantstart.com/articles/Self-Study-Plan-for-Becoming-a-Quantitative-Trader-Part-II
Pyfolio – a new Python library for performance and risk analysis
Application of Machine Learning: Automated Trading Informed by Event Driven Data
- intro: MIT master thesis
- paper: https://dspace.mit.edu/bitstream/handle/1721.1/105982/965785890-MIT.pdf
Python Programming for Finance
Algorithmic trading in less than 100 lines of Python code
https://www.oreilly.com/learning/algorithmic-trading-in-less-than-100-lines-of-python-code
Designing an Algorithmic Trading Strategy with Python
https://www.youtube.com/watch?v=9XYjR6ge73M
Different Interpretation about Same Model
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
- intro: ICML 2016
- arxiv: https://arxiv.org/abs/1506.02142
Dropout as a Bayesian Approximation: Insights and Applications
http://mlg.eng.cam.ac.uk/yarin/PDFs/Dropout_as_a_Bayesian_approximation.pdf
k-Means Clustering Is Matrix Factorization
https://arxiv.org/abs/1512.07548
word embedding as matrix factorization
Neural Word Embedding as Implicit Matrix Factorization
Deformable Part Models are Convolutional Neural Networks
- intro: CVPR 2015
- arxiv: https://arxiv.org/abs/1409.5403
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Girshick_Deformable_Part_Models_2015_CVPR_paper.pdf
k-Means is a Variational EM Approximation of Gaussian Mixture Models
https://arxiv.org/abs/1704.04812
Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method
http://www.sciencedirect.com/science/article/pii/S0893608003001709
On the momentum term in gradient descent learning algorithms
EM as a coordinate descent
Backprop as Functor: A compositional perspective on supervised learning
- intro: MIT
- arxiv: https://arxiv.org/abs/1711.10455
Papers
Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?
- intro: Evaluates 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, including all the relevant classifiers available today
- intro: “The random forest is clearly the best family of classifiers”
- paper: http://www.jmlr.org/papers/volume15/delgado14a/delgado14a.pdf
Are Random Forests Truly the Best Classifiers?
- intro: Questions the conclusion that random forests are the best classifiers
- paper: http://jmlr.org/papers/volume17/15-374/15-374.pdf
- notes: http://weibo.com/ttarticle/p/show?id=2309404007876694808654
- my notes: jeez, I love the above two papers..
An Empirical Evaluation of Supervised Learning in High Dimensions
Machine learning: Trends, perspectives, and prospects
- intro: M. I. Jordan and T. M. Mitchell. Science
- paper: http://www.cs.cmu.edu/~tom/pubs/Science-ML-2015.pdf
Debugging Machine Learning Tasks
LIME
“Why Should I Trust You?”: Explaining the Predictions of Any Classifier
- intro: Local Interpretable Model-Agnostic Explanations (LIME)
- homepage: http://homes.cs.washington.edu/~marcotcr/blog/lime/
- arxiv: http://arxiv.org/abs/1602.04938
- github: https://github.com/marcotcr/lime
- github: https://github.com/marcotcr/lime-experiments
- blog: https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime
- blog: http://dataskeptic.com/epnotes/trusting-machine-learning-models-with-lime.php
- notes: https://blog.acolyer.org/2016/09/22/why-should-i-trust-you-explaining-the-predictions-of-any-classifier/
Datasets
Datasets for Machine Learning
Books
Machine Learning plus Intelligent Optimization: THE LION WAY, VERSION 2.0
- book: http://intelligent-optimization.org/LIONbook/
- slides: http://intelligent-optimization.org/LIONbook/LIONway-slides-chapter3.pdf
Level-Up Your Machine Learning
https://www.metacademy.org/roadmaps/cjrd/level-up-your-ml
An Introduction to the Science of Statistics: From Theory to Implementation (Preliminary Edition)
Python Machine Learning
Machine Learning for Hackers
A Course in Machine Learning
- homepage: http://ciml.info/
- github: https://github.com/hal3/ciml
An Introduction to Statistical Learning: with Applications in R
- homepage: http://www-bcf.usc.edu/~gareth/ISL/
- course page: https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/about
- unofficial solutions: http://blog.princehonest.com/stat-learning/
- github: https://github.com/asadoughi/stat-learning
Introduction to Machine Learning with Python
- github(Notebooks and code): https://github.com/amueller/introduction_to_ml_with_python
Introduction to Machine Learning (Second Edition)
- author: Ethem Alpaydin
- book: https://static.aminer.org/upload/pdf/1821/326/1262/53e99a91b7602d9702304e89.pdf
Videos
Video resources for machine learning
http://dustintran.com/blog/video-resources-for-machine-learning/
Blogs
10 More lessons learned from building real-life Machine Learning systems — Part I
Machine Learning: classifier comparison using Plotly
Fitting a model via closed-form equations vs. Gradient Descent vs Stochastic Gradient Descent vs Mini-Batch Learning. What is the difference?
A Friendly Introduction to Cross-Entropy Loss
https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/
How to choose algorithms for Microsoft Azure Machine Learning
New to Machine Learning? Avoid these three mistakes
Machine Learning Exercises In Python
- part 1: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-1/
- part 2: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-2/
- part 3: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-3/
- part 4: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-4/
- part 5: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-5/
- part 6: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-6/
- part 7: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-7/
- part 8: http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-8/
- github: https://github.com/jdwittenauer/ipython-notebooks
- reddit: https://www.reddit.com/r/MachineLearning/comments/4xgkoa/all_of_andrew_ngs_machine_learning_class_in_python/
Assessing Stability of K-Means Clusterings
Cross-Validation Gone Wrong
- blog: http://betatim.github.io/posts/cross-validation-gone-wrong/
- ipn: http://nbviewer.jupyter.org/url/betatim.github.io//downloads/notebooks/cross_validation.ipynb
Probabilistic Machine Learning in PyMC3
- blog: http://twiecki.github.io/ODSC_London_2016_Probabilistic_ML_Wiecki.slides.html#/
- slides: https://docs.google.com/presentation/d/1puj4iN70MRVauUmIMAZS0pfANktjdQ5uCP7H8OLPKFk/edit#slide=id.p
- mirror: https://pan.baidu.com/s/1pLJCya3
- ipn: https://twiecki.github.io/probabilistic_ml.ipynb
Bias in ML, and Teaching AI
- blog: http://nlpers.blogspot.ru/2016/11/bias-in-ml-and-teaching-ai.html
- slides: http://www.umiacs.umd.edu/~hal/talks/16-11-diversity-bias.odp
- mirror: https://pan.baidu.com/s/1bpqWIkB
Solutions for Skilltest Machine Learning : Revealed
https://www.analyticsvidhya.com/blog/2016/11/solution-for-skilltest-machine-learning-revealed/
Machine Learning Performance Improvement Cheat Sheet
- intro: 32 Tips, Tricks and Hacks That You Can Use To Make Better Predictions.
- blog: http://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/
What is better: gradient-boosted trees, or a random forest?
http://fastml.com/what-is-better-gradient-boosted-trees-or-random-forest/
A Practical Guide to Tree Based Learning Algorithms
https://sadanand-singh.github.io/posts/treebasedmodels/
Model evaluation, model selection, and algorithm selection in machine learning
Part I - The basics
http://sebastianraschka.com/blog/2016/model-evaluation-selection-part1.html
Part II - Bootstrapping and uncertainties
http://sebastianraschka.com/blog/2016/model-evaluation-selection-part2.html
Part III - Cross-validation and hyperparameter tuning
http://sebastianraschka.com/blog/2016/model-evaluation-selection-part3.html
ROC / AUC
ROC: Receiver Operating Characteristic
AUC: Area Under the Curve
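As a rough illustration of the two terms above, the following standard-library sketch builds ROC points by sweeping a decision threshold over the scores and then integrates them with the trapezoidal rule to get the AUC. The helpers `roc_curve_points` and `auc` are hypothetical names written for this list; the sketch assumes binary labels in {0, 1} with at least one example of each class.

```python
def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by lowering the threshold
    from the highest score to the lowest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Indices sorted by descending score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    prev = None
    for i in order:
        # Emit a point only when the score changes, so tied scores
        # are treated as a single threshold step.
        if prev is not None and scores[i] != prev:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev = scores[i]
    fpr.append(fp / neg)
    tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Toy labels and scores, made up for illustration.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_curve_points(y_true, scores)
area = auc(fpr, tpr)
print(list(zip(fpr, tpr)))
print(area)  # 0.75 for this toy input
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.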
Tutorials: Plotting AP and ROC curves
http://www.vlfeat.org/overview/plots-rank.html
Beautiful Properties Of The ROC Curve
http://jxieeducation.com/2016-09-27/Beautiful-Properties-Of-The-ROC-Curve/
On calculating AUC
ROC to precision-recall curve translator
https://rafalab.shinyapps.io/roc-precision-recall/
t-SNE
How to Use t-SNE Effectively
Libraries
LambdaNet: Purely functional artificial neural network library implemented in Haskell
rustlearn: Machine learning crate for Rust
MILJS : Brand New JavaScript Libraries for Matrix Calculation and Machine Learning
- arxiv: http://arxiv.org/abs/1503.05743v1
- github: https://github.com/mil-tokyo
- homepage: http://mil-tokyo.github.io/
machineJS: Automated machine learning- just give it a data file!
Machine Learning for iOS: Tools and resources to create really smart iOS applications
DynaML: Scala Library/REPL for Machine Learning Research
- homepage: http://mandar2812.github.io/DynaML/
- github: https://github.com/mandar2812/DynaML/
Smile - Statistical Machine Intelligence and Learning Engine
- intro: Smile is a fast and comprehensive machine learning system.
- homepage: http://haifengl.github.io/smile/index.html
- github: https://github.com/haifengl/smile
benchm-ml
- intro: A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
- github: https://github.com/szilard/benchm-ml
KeystoneML: Simplifying robust end-to-end machine learning on Apache Spark
- intro: a software framework, written in Scala, from the UC Berkeley AMPLab designed to simplify the construction of large scale, end-to-end, machine learning pipelines with Apache Spark.
- homepage: http://keystone-ml.org/
- github: https://github.com/amplab/keystone
Talisman: A straightforward & modular NLP, machine learning & fuzzy matching library for JavaScript
- homepage: http://yomguithereal.github.io/talisman/
- github: https://github.com/Yomguithereal/talisman
PRMLT: Pattern Recognition and Machine Learning Toolbox
- homepage: http://prml.github.io/
- github: https://github.com/PRML/PRMLT
The Fido Project: An open source C++ machine learning library targeted towards embedded electronics and robotics
- homepage: https://fidoproject.github.io/
- github: https://github.com/FidoProject/Fido
rusty-machine: Machine Learning library for Rust
- homepage: https://crates.io/crates/rusty-machine/
- github: https://github.com/AtheMathmo/rusty-machine
RoBO - a Robust Bayesian Optimization framework
Dlib: A toolkit for making real world machine learning and data analysis applications in C++
- intro: Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems.
- homepage: http://dlib.net/
- github: https://github.com/davisking/dlib
Bayesian Networks and Bayesian Classifier Software
ML-lib: An extensive machine learning library, made from scratch (Python)
Top Machine Learning Projects for Julia
Helit: My machine learning/computer vision library for all of my recent papers, plus algorithms that I just like.
- github: https://github.com/thaines/helit
Gorgonia: a library that helps facilitate machine learning in Go
GoLearn: Machine Learning for Go
Cortex: Machine learning in Clojure
- intro: Neural networks, regression and feature learning in Clojure.
- github: https://github.com/thinktopic/cortex
ELI5: A library for debugging machine learning classifiers and explaining their predictions
PHP-ML - Machine Learning library for PHP
- github: https://github.com/php-ai/php-ml
- github: https://github.com/php-ai/php-ml-examples
- docs: http://php-ml.readthedocs.io/en/latest/
ml.js - Machine learning tools in JavaScript
Propel
- intro: A Machine Learning Framework for JavaScript / Differential Programming in JavaScript
- homepage: http://propelml.org/
- github: https://github.com/propelml/propel
Resources
Machine Learning Surveys: A list of literature surveys, reviews, and tutorials on Machine Learning and related topics
machine learning classifier gallery
http://home.comcast.net/~tom.fawcett/public_html/ML-gallery/pages/
Machine Learning and Computer Vision Resources
http://zhengrui.github.io/zerryland/ML-CV-Resource.html
A Huge List of Machine Learning And Statistics Repositories
http://blog.josephmisiti.com/a-huge-list-of-machine-learning-repositories/
Machine Learning in Python Course
https://www.springboard.com/learning-paths/machine-learning-python/
机器学习(Machine Learning)&深度学习(Deep Learning)资料(Chapter 1)
https://github.com/ty4z2008/Qix/blob/master/dl.md
The Spectator: Shakir’s Machine Learning Blog
Useful Inequalities
http://www.lkozma.net/inequalities_cheat_sheet/ineq.pdf
Math for Machine Learning
http://www.umiacs.umd.edu/~hal/courses/2013S_ML/math4ml.pdf
Cheat Sheet: Algorithms for Supervised- and Unsupervised Learning
Annalyzin: Analytics For Layman, with Tutorials & Experiments
https://annalyzin.wordpress.com/
ALGORITHMS: AI, Data Mining, Clustering, Data Structures, Machine Learning, Neural, NLP, …
Awesome Machine Learning: A curated list of awesome machine learning frameworks, libraries and software (by language)
awesome-machine-learning-cn: 机器学习资源大全中文版
- intro: 机器学习资源大全中文版,包括机器学习领域的框架、库以及软件
- github: https://github.com/jobbole/awesome-machine-learning-cn
Machine and Deep Learning with Python
useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive
- homepage: http://user2016.org/tutorials/10.html
- github: https://github.com/ledell/useR-machine-learning-tutorial
Top-down learning path: Machine Learning for Software Engineers
- intro: A complete daily plan for studying to become a machine learning engineer.
- github: https://github.com/ZuzooVn/machine-learning-for-software-engineers
30 Top Videos, Tutorials & Courses on Machine Learning & Artificial Intelligence from 2016
https://www.analyticsvidhya.com/blog/2016/12/30-top-videos-tutorials-courses-on-machine-learning-artificial-intelligence-from-2016/
Machine Learning Problem Bible (MLPB)
- github: https://github.com/ben519/MLPB
The most shared Machine Learning content on Twitter from the past 7 days
- Based on the millions of #machinelearning tweets already processed by The Herd Locker, noise is a little over 94% of the conversation. Tracking the 8,000 daily tweets that are tagged #machineLearning, the platform filters and ranks the most popular shared content in realtime. Machine learning’s zeitgeist, you might say. It’s been running for over a year, monitoring half a billion tweets a day, and will always be free to use. No ads. No BS. http://theherdlocker.com/tweet/popularity/machinelearning
Projects
Machine learning algorithms: Minimal and clean examples of machine learning algorithms
- intro: A collection of minimal and clean implementations of machine learning algorithms.
- github: https://github.com/rushter/MLAlgorithms
Plotting high-dimensional decision boundaries
Flappy Learning: Program that learns to play Flappy Bird by machine learning (Neuroevolution)
- blog: https://xviniette.github.io/FlappyLearning/
- github: https://github.com/xviniette/FlappyLearning
Readings / Questions / Discussions
A Super Harsh Guide to Machine Learning
https://www.reddit.com/r/MachineLearning/comments/5z8110/d_a_super_harsh_guide_to_machine_learning/
(Quora): What are the top 10 data mining or machine learning algorithms?
(Quora): What are the must read papers on data mining and machine learning?
https://www.quora.com/What-are-the-must-read-papers-on-data-mining-and-machine-learning
(Quora): What would be your advice to a software engineer who wants to learn machine learning?
https://www.quora.com/What-would-be-your-advice-to-a-software-engineer-who-wants-to-learn-machine-learning-3/answer/Alex-Smola-1
Machine Learning FAQ
MLNotes: Very concise notes on machine learning and statistics
List of machine learning concepts
What is the relation between Logistic Regression and Neural Networks and when to use which?