Machine Learning
Content
- Machine Learning and Economics
- Intro to Python
  - Variables (text, digits)
  - Multiline strings
  - Lists
  - Tuples
  - Dictionaries
  - Sets
  - If statements
  - For loops
  - Functions
- Matplotlib package
- Seaborn package
- NumPy package
- pandas package
- Scikit-learn package
  - Linear regression
    - Accuracy score (R^2)
    - train_test_split()
    - Complexity of the model
    - Lasso regression
    - Ridge regression
  - Logistic regression
  - Support Vector Machine
  - Decision trees
  - Random forest
  - XGBoost
  - K-Fold cross validation
About the course
Machine learning is used when you need to solve a class of problems for which it is difficult to write an explicit algorithm, but for which many examples with correct answers are available. For instance, it is practically impossible to write by hand an algorithm that distinguishes a photo of a cat from a photo of a dog, but with enough pictures of both you can use machine learning to build such an algorithm automatically. In this course, students will learn the principles and algorithms for turning training data into effective automated predictions. We will cover how to predict poverty scores, health outcomes, and so on.

**Instructor**: Ilias Suvanov

Lectures will be held on Friday at 08:00.
Useful links
- Course chat
- Zoom link
- GitHub repository with course materials
- Anonymous feedback
- Gradebook
- Grading Policy
Seminars
Instructor | Schedule |
---|---|
Ilias Suvanov | Lectures will be held on Friday at 08:00 |
Course News
TBA
Lectures/Seminars
№ | Date | Title | Materials | View | Additional Materials |
---|---|---|---|---|---|
L1 | 11th January | What is Data, Why Python, Data-driven policy making. An introduction to machine learning. Basic terms, problem statements and application examples. | Slides | ||
S1 | 11th January | Variables, Strings, Multiline Strings in Python, If Statement, List, Dictionary, Set, For Loop Statement in Python | Seminar 1 | nbviewer | youtube tutorials |
S2 | 18th January | pandas, NumPy and Matplotlib libraries | Seminar 2 | nbviewer Video | youtube tutorials |
S3 | 25th January | Guest lecture by Ilya Schurov. Intro to Machine Learning | Video | ||
S4 | 1st February | Linear Regression | Theory Code Code | ||
S5 | 8th February | Train/Test Split | Code | ||
S6 | 15th February | Presentation: Machine Learning for Everyone | |||
S7 | 22nd February | Logistic Regression | |||
S8 | 1st March | Logistic Regression | Theory Code Code | ||
S9 | 15th March | Support Vector Machine (SVM) | SVM Theory SVM Code | ||
S10 | 29th March | K-Fold Cross Validation. L1 and L2 regularization. | K-fold K-fold | ||
How to download an .ipynb file from GitHub?
Grading Policy
The final grade is determined by the number of completed homework assignments and the FINAL exam. Additional points are awarded for class participation, presentations given in class, and similar contributions.
Control Work
Midterm
There will be NO MIDTERM exam for this course.
Oral Examination
The instructor reserves the right to require any course participant to sit for an individual oral examination (with webcam and microphone turned on) before submitting the final grade to the registrar. Refusing to sit for an individual oral examination may result in failing the course. In mid-April, the instructor will notify participants who may be required to take the oral examination.
Final
For the FINAL exam, you need to carry out an individual analysis (groups are not allowed) on the Kaggle platform, using any dataset available there. Your analysis should include Exploratory Data Analysis, a Linear Regression model, a Logistic Regression model, and accuracy scores. You need to present your analysis in class; the presentation should be no longer than 5 minutes. The presentation schedule is available at https://docs.google.com/spreadsheets/d/1LqXnvP_nJlfSS4z8DBPtloJnbC1tX3rMAP0JZATS6WQ/edit#gid=0
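As a rough orientation for the modeling part of such an analysis, here is a minimal sketch that assumes scikit-learn is installed. It is not an official template for the exam. The synthetic data from `make_regression` and `make_classification` only stands in for a Kaggle dataset, and the helper names `evaluate_linear_regression` and `evaluate_logistic_regression` are illustrative, not taken from the course materials. Exploratory Data Analysis is not covered here.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split


def evaluate_linear_regression(X, y, seed=0):
    """Fit a linear regression and return R^2 on a held-out test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))


def evaluate_logistic_regression(X, y, seed=0):
    """Fit a logistic regression and return accuracy on a held-out test set."""
    # stratify=y keeps the class proportions similar in both splits
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


# Synthetic stand-in data (assumption: replace with your own Kaggle dataset)
X_reg, y_reg = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)
X_clf, y_clf = make_classification(n_samples=200, n_features=5, random_state=0)

r2 = evaluate_linear_regression(X_reg, y_reg)
acc = evaluate_logistic_regression(X_clf, y_clf)
print(f"Linear regression R^2 on test data: {r2:.3f}")
print(f"Logistic regression accuracy on test data: {acc:.3f}")
```

Both scores are computed on the test split rather than on the training data, so they estimate how the models behave on examples they have not seen.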
Learning Python Materials
Foundational books
- Ben Stephenson - The Python Workbook: A Brief Introduction with Exercises and Solutions
- Nicola Lacey - Python by Example: Learning to Program in 150 Challenges
- Python documentation
- Jake VanderPlas - Python Data Science Handbook: Essential Tools for Working with Data
- Wes McKinney - Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
- Joel Grus - Data Science from Scratch: First Principles with Python
Machine Learning Materials
Online courses
- Ilya Schurov - Machine Learning at HSE (in Russian)
- MIT Introduction to Deep Learning
- Andrew Ng - Machine learning on Coursera
- Google - AI Adventures
Books
- Mathematics for Machine Learning - a book giving a mathematical introduction to machine learning. You might be especially interested in the probability theory chapters.
Learning Statistics Materials
Online courses
Books
- Christopher Barr, David M. Diez, and Mine Çetinkaya-Rundel - OpenIntro Statistics