Learning objectives
The aim of the course is to lay a foundation for the analysis of real-world data.
First module: by the end of the module, students will have the skills needed to visualize, wrangle, and model complex data in R.
Second module: by the end of the module, students will understand the causal inference methods used in socio-economic research and be able to apply a range of statistical tools in empirical studies.
Requirements
Basic knowledge of descriptive statistics and probability theory.
Contents
The course is organized as follows.
- First module (30 hours):
Introduction to R and RStudio.
Data visualization.
Data wrangling and tidy data.
Geospatial data and text as data.
Statistical foundations.
Predictive analysis: regression and machine learning.
- Second module (30 hours):
Causal inference.
Omitted variables.
Matching techniques.
Instrumental variable models.
Regression discontinuity design models.
Difference-in-differences models.
Methods of evaluation
Learning will be assessed through four assignments (two per module) and a final test.
The assignments consist of exercises. Each assignment counts for 15% of the final grade, so the four assignments together account for 60%.
The final test counts for the remaining 40% and is a written test with multiple-choice questions. Its aim is to verify understanding of all the statistical techniques covered in the course.
Suggested reading list
Modern Data Science with R (2nd ed.) by Benjamin S. Baumer, Daniel T. Kaplan and Nicholas J. Horton. Chapman and Hall/CRC. ISBN-13: 978-0367191498
Causal Analysis: Impact Evaluation and Causal Machine Learning with Applications in R by Martin Huber. MIT Press. ISBN-13: 978-0262545914
More information
The R programming language (R Core Team) will be used for data analysis.
E-learning page: https://elearning.unisi.it/course/view.php?id=12059