Introduction
In this course, we will embark on a journey into machine learning and data science. Machine learning, a subset of artificial intelligence (AI), enables computers to learn from data and make predictions or decisions without being explicitly programmed for each task.
Throughout this course, we will use Python, one of the most popular languages for machine learning. Libraries such as scikit-learn provide a rich set of tools for data preprocessing, model building, evaluation, and much more.
This course will equip you with the foundational knowledge and practical skills needed to tackle real-world problems involving data analysis, classification, and regression.
Course content
The course covers the following topics:
- Loading, cleaning, and exploring data.
- Quarto and Jupyter notebooks for the presentation of exploratory data analyses and applications of machine learning methods.
- Fitting of basic classifiers.
- Running linear regression models and making informed choices among them.
- Interpretation of the coefficients of regression models along with their p-values.
- Splitting data into test and training sets.
- Construction of estimator pipelines, including data loading, preprocessing, fitting, and model evaluation.
- Regularized regression, such as ridge and lasso.
- Use of non-linear features such as polynomials and splines.
- Feature transformations such as one-hot encoding.
- Making informed choices between different models using cross-validation.
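To give a feel for how several of these topics fit together, here is a minimal sketch in scikit-learn. It is not taken from the course materials: the synthetic data, the variable names, and the choice of ridge regression with alpha=1.0 are assumptions made purely for illustration. It combines a train/test split, polynomial features, one-hot encoding, an estimator pipeline, and cross-validation.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Synthetic data (an assumption for illustration): one numeric feature,
# one categorical feature stored as integer codes 0, 1, 2.
rng = np.random.default_rng(0)
n = 200
x_num = rng.uniform(-2, 2, size=n)
x_cat = rng.integers(0, 3, size=n)
category_effect = np.array([0.0, 1.0, -1.0])[x_cat]
y = 1.5 * x_num**2 + category_effect + rng.normal(0, 0.1, size=n)
X = np.column_stack([x_num, x_cat])

# Preprocessing: polynomial expansion and scaling for the numeric column,
# one-hot encoding for the categorical column.
numeric_steps = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("numeric", numeric_steps, [0]),
    ("onehot", OneHotEncoder(handle_unknown="ignore"), [1]),
])

# Full estimator pipeline: preprocessing followed by ridge regression.
model = Pipeline([
    ("preprocess", preprocess),
    ("ridge", Ridge(alpha=1.0)),
])

# Hold out a test set, then cross-validate on the training portion only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)  # R^2 per fold

# Refit on the full training set and evaluate once on the held-out data.
model.fit(X_train, y_train)
test_r2 = model.score(X_test, y_test)

print("Mean cross-validated R^2:", cv_scores.mean())
print("Test R^2:", test_r2)
```

Wrapping the preprocessing inside the pipeline means each cross-validation fold fits the scaler and encoder on its own training portion only, so no information from the validation fold leaks into the fit.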