class: center, middle, inverse, title-slide

# Introduction to Machine Learning
### Alexander Goncearenco
### Martin Skarzynski
### 2019-02-07

---

# Problem formulation

+ Data set
  + **X** [N, M]: N samples, M features
  + **y**: N labels (outcomes / responses)
+ Machine Learning:
  + Type of task
  + Batch vs online
  + Supervision

---

# By task:

+ In **classification** the algorithm must assign each input to one of two (or more) classes
+ In **regression** the algorithm returns a continuous value for each set of inputs received
+ In **clustering** the algorithm divides the inputs supplied into two or more subgroups
+ In **density estimation** the algorithm constructs an estimate of the population distribution from a small sample of it
+ Finally, **dimensionality reduction** maps inputs onto a lower-dimensional space

---

# Batch vs Online

+ In **batch learning** all the data is available and can be processed at one time
+ In **online learning** only part of the data is available at any one time, so the model is updated incrementally as data arrives

---

# Supervised vs Unsupervised

+ In **supervised learning** each of the supplied inputs is labeled with the desired output
+ In **unsupervised learning** no labels are available and the algorithm must find the underlying structure in the data itself
+ Somewhere in the middle is **semi-supervised learning**, in which only some of the supplied inputs are labeled with the desired output
+ In **reinforcement learning** the algorithm must interact with an environment and choose actions that maximize reward

---

# Classification

---

# Regression

+ Predicted variable
+ Observed variable

---

# Clustering

---

# Latent variable models, density estimation

---

# Dimensionality reduction, feature selection

<img src='assets/img/image13.png' width='500px'></img>

---

# Dimensionality reduction for visualization

---

# Simple neural networks and Deep Learning

---

# When to use Machine Learning?

+ **You cannot code the rules:** Many tasks cannot be adequately solved with a deterministic rule-based solution, because a large number of factors could influence the answer (the feature space is large)
+ **You cannot scale:** ML solutions are effective at handling large-scale problems (the sample space is large)
+ The benefits of machine learning come at a price (computation, design effort)
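---

# Sketch: batch vs online classification

The problem formulation and the batch vs online distinction from the earlier slides can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended implementation: the toy data and the names `fit_batch`, `OnlineCentroids`, `partial_fit`, and `predict` are assumptions made up for this example, and a nearest-centroid rule stands in for a real classifier.

```python
# Toy data set: X has shape [N, M] (N samples, M features), y has N labels.
# The numbers are invented for illustration only.
X = [[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
y = ["a", "a", "b", "b"]


def fit_batch(X, y):
    """Batch learning: compute each class centroid from all samples at once."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        # zip(*rows) iterates over feature columns of the selected samples.
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


class OnlineCentroids:
    """Online learning: update the centroids one sample at a time."""

    def __init__(self):
        self.centroids = {}
        self.counts = {}

    def partial_fit(self, x, label):
        n = self.counts.get(label, 0) + 1
        old = self.centroids.get(label, [0.0] * len(x))
        # Running-mean update: new = old + (x - old) / n
        self.centroids[label] = [o + (xi - o) / n for o, xi in zip(old, x)]
        self.counts[label] = n


def predict(centroids, x):
    """Classification: assign x to the class with the nearest centroid."""
    def sq_dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Batch: all labeled samples are seen in one call.
batch = fit_batch(X, y)

# Online: labeled samples arrive one by one.
online = OnlineCentroids()
for x, label in zip(X, y):
    online.partial_fit(x, label)

print(batch)
print(online.centroids)
print(predict(batch, [1.5, 1.5]), predict(online.centroids, [8.5, 8.5]))
```

On this toy data both approaches end with the same centroids. The online version never needs the full data set in memory, which is the practical reason to prefer it when data arrives as a stream.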