Data Science with R
About the course
This course is planned for AUCA students in the Spring semester of the 2020/2021 academic year. It introduces programming and computational tools useful for future careers in data science. In the course, students will set up their own R programming environment; learn how to write, execute, and modify R code and scripts; load data sets into R; create effective numerical and graphical summaries; use R to perform some common statistical analyses; and use programming techniques such as loops, conditionals, and functions to solve the practical and analytical problems that data scientists encounter when working with data.
Instructor: Ilias Suvanov
Lectures are held on Tuesdays and Thursdays at 08:00.
Content
- Introduction
- Difference between R, RStudio, and Kaggle notebooks (Notepad vs MS Word vs Google Docs)
- Strings
- Digits
- Vectors
- Factors
- DataFrames
  - Selecting columns using $ operator
  - Selecting sub-table using [ , ] operator
  - summary()
  - table()
  - Importing data
  - Working with missing values
- Tibbles
- For loops
- If statements
- Functions
- RMarkdown
- ggplot2 package (Visualization)
  - Scatter Plot
  - Line Graph
  - Bar Graph
  - Histogram
  - Density Plot
  - 2d Density Plot
  - Correlation Plot
  - Box Plot
  - facet_grid()
  - facet_wrap()
- Plotly (Interactive Visualizations)
- R base package for graphics
  - plot()
  - hist()
- dplyr package (Data manipulations)
  - Pipe operator %>%
  - select()
  - mutate()
  - filter()
  - group_by()
  - rename()
- tidyr package (Reshaping data format; wide and long data formats)
  - pivot_longer()
  - pivot_wider()
- Merging two dataframes
  - cbind()
  - rbind()
  - merge() (left join, right join, outer join, self join)
- Causal inference
  - Linear Regression
    - Interpreting regression table
      - R squared
      - Standard errors
      - T-statistics
      - Heteroskedasticity vs homoscedasticity
      - Robust standard errors
      - Classical approach to linear regression
      - Modern approach to linear regression
    - Probit and logit models
    - Interpreting coefficients
- LaTeX
Learning resources
- Resources for Learning R
- Resources for Learning Ggplot2
- Resources for Learning Plotly
- Resources for Learning Dplyr
- Resources for Learning Rmarkdown
- Resources for Learning Econometrics and Data Science
Useful links
- Zoom link
- Course chat
- GitHub repository with course materials
- Anonymous feedback
- Table with grades
- Oral Examination general guidelines
Seminars
Instructor | Schedule |
---|---|
Ilias Suvanov | Lectures will be held on Tuesday at 08:00 and Thursday at 08:00 |
Installing RStudio
To use RStudio, you first need to install R and then RStudio itself.
First: install R using the link
Second: install RStudio using the link
Video: how to install RStudio
Course News
TBA
Lectures/Seminars
№ | Date | Title | Materials at Github | View |
---|---|---|---|---|
L1 | 11th January | Data Science and Economics | | Slides |
S1 | 13th January | Installing RStudio, Rmarkdown. R objects, vectors, dataframes. ggplot2 library. | Markdown | Webpage Video |
S2 | 18th January | Dplyr library. Mutate function. Present birth records in-class exercises. For loop statement. | Markdown | Webpage Video |
S3 | 20th January | Summary statistics. Group_by statistics. Histograms. | Markdown | Webpage Video |
S4 | 25th January | ggplot2 cheatsheet. BRFSS dataset. | Markdown | Webpage Video |
S5 | 27th January | Ggplot2 scatterplots. ggrepel, ggthemes | Webpage | HDI dataset Video |
S6 | 1st February | subset function, %in% operator, rnorm, rbind. Ggplot2 2d density plot. facet_grid | 2d density plot with ggplot2 | Video |
S7 | 3rd February | facet_grid | - | |
S8 | 8th February | Review of the homework Introduction to R: Factors. Gapminder data, group_by recap. Functions in R. | - | Video |
S9 | 10th February | Functions in R. Readline. Else-if statement. Linear Regression. | - | Video |
S10 | 17th February | Tibbles vs Dataframes. Leaflet package. | - | Video |
S11 | 22nd February | SQL-type joins. Merge joins. | - | Video |
S12 | 24th February | Joins recap. | - | - |