Core Data Science using Python

Exploratory data analysis and interactive visualisation, unsupervised learning, dimensionality reduction and feature extraction, supervised learning and more.

LEVEL: BEGINNER
DURATION: 3-DAY COURSE
DELIVERED: AT YOUR OFFICE

What you will learn

The course is highly interactive and hands-on. You will learn by working through concrete problems with a real dataset, taught by academic and industry experts who have a wealth of experience and knowledge to share.

Preprocessing (scaling, log transformations, imputation, one-hot encoding)
Exploratory data analysis and interactive visualisation
Unsupervised learning (k-means clustering, hierarchical clustering)
Dimensionality reduction and feature extraction
Supervised learning (KNN, decision trees, random forests, SVMs)
Model Evaluation and Tuning
Logistic Regression

Languages and libraries:

Python 3
NumPy and pandas for data manipulation
scikit-learn and statsmodels for linear and time series models
Matplotlib for visualisation

PREREQUISITES

Elementary Python programming and use of the command line. You can acquire these skills at our Python bootcamp.

Basic probability and linear algebra.

AUDIENCE

Individuals who want to master new technical skills and learn the latest techniques and industry best practices to work effectively with Data Science teams.

Get in touch with us to learn about the course

DAY ONE

DATA SCIENCE ESSENTIALS

 

Session 1

Introduction to Data Science

  • Overview of Data Science and Machine Learning
  • Supervised vs. Unsupervised Learning
  • Working with the Jupyter notebook
  • The NumPy library for array manipulation

 

Session 2

Working with real-world data

  • The Pandas library for data manipulation
  • Data cleaning and pre-processing
  • Data visualisation with Matplotlib and Seaborn
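
As a flavour of the pre-processing steps covered in this session, here is a minimal sketch using pandas and NumPy. The toy table, its column names and its values are hypothetical and chosen only for illustration; the course itself works with a real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing income and one categorical column.
df = pd.DataFrame({
    "income": [30000.0, 45000.0, np.nan, 120000.0],
    "city": ["London", "Paris", "London", "Berlin"],
})

# Imputation: fill the missing income with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# Log transformation: compress the long right tail of income.
df["log_income"] = np.log1p(df["income"])

# Scaling: standardise to zero mean and unit variance.
centred = df["log_income"] - df["log_income"].mean()
df["log_income_scaled"] = centred / df["log_income"].std(ddof=0)

# One-hot encoding: one indicator column per distinct city.
encoded = pd.get_dummies(df, columns=["city"])
print(encoded.columns.tolist())
```

The median is used for imputation here because it is less sensitive to extreme values than the mean; other strategies may suit other datasets better.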

 

Session 3

Principal Component Analysis (PCA)

  • What is PCA and why you need it
  • Applying PCA in Python with scikit-learn
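
To give an idea of what applying PCA looks like in practice, the sketch below reduces a synthetic five-feature dataset to two components with scikit-learn. The data is generated at random purely for illustration (an assumption, not course material), and the choice of two components is likewise illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 5 features driven by 2 latent factors plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# PCA is sensitive to feature scale, so standardise first.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

# Fraction of the total variance retained by the two components.
explained = pca.explained_variance_ratio_.sum()
print(X_reduced.shape, round(explained, 3))
```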

DAY TWO

UNSUPERVISED LEARNING AND SUPERVISED LEARNING

 

Session 1

Unsupervised learning

  • The scikit-learn library for Machine Learning and scikit-learn pipelines
  • k-means clustering
  • Hierarchical cluster analysis
  • Density-based clustering (DBSCAN)
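
The following is a minimal sketch of how a scikit-learn pipeline can chain scaling and k-means clustering. It assumes synthetic blob data generated with `make_blobs`, and the number of clusters is set to three only because the toy data is generated that way.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 300 points around 3 well-separated centres.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

# The pipeline scales the features, then clusters the scaled data.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)

# fit_predict runs every step and returns one cluster label per sample.
labels = pipeline.fit_predict(X)
print(sorted(set(labels)))
```

Wrapping the steps in a pipeline keeps the scaling and the clustering together, so the same transformation is applied whenever the model is reused on new data.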

 

Session 2

Supervised Learning

  • The k-Nearest Neighbour algorithm
  • Overfitting, underfitting, bias-variance tradeoff
  • Cross-Validation and hyperparameter tuning
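
As an illustration of cross-validation and hyperparameter tuning, the sketch below searches over the number of neighbours for a k-nearest-neighbour classifier. It assumes the Iris dataset bundled with scikit-learn and an arbitrary grid of candidate values; neither is necessarily what the course uses.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Hold out a test set that the tuning procedure never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), KNeighborsClassifier())

# Candidate values of k: small k tends to overfit, large k to underfit.
param_grid = {"kneighborsclassifier__n_neighbors": [1, 3, 5, 7, 9, 11]}

# 5-fold cross-validation on the training set for each candidate.
search = GridSearchCV(model, param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["kneighborsclassifier__n_neighbors"]
test_accuracy = search.score(X_test, y_test)
print(best_k, round(test_accuracy, 3))
```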

DAY THREE

MACHINE LEARNING

Session 1

Random Forests

  • Decision Trees
  • Ensemble models and Random Forests
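
A short sketch of comparing a single decision tree with a random forest under 5-fold cross-validation is shown below. It assumes the breast cancer dataset bundled with scikit-learn and default hyperparameters apart from a fixed random seed; results on other data may differ.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A single tree, grown until its leaves are pure.
tree = DecisionTreeClassifier(random_state=0)

# An ensemble of 100 trees, each trained on a bootstrap sample.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# Mean accuracy over 5 cross-validation folds for each model.
tree_score = cross_val_score(tree, X, y, cv=5).mean()
forest_score = cross_val_score(forest, X, y, cv=5).mean()
print(round(tree_score, 3), round(forest_score, 3))
```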

Session 2

Logistic Regression

  • Logistic Regression
  • Regularisation: Ridge and Lasso
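
To illustrate the difference between the two penalties, the sketch below fits Ridge and Lasso to synthetic data. For simplicity it uses linear regression rather than logistic regression, and the dataset and the penalty strength `alpha=1.0` are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, of which only 5 carry signal.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# Ridge (L2 penalty) shrinks coefficients towards zero.
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso (L1 penalty) can set coefficients exactly to zero.
lasso = Lasso(alpha=1.0).fit(X, y)

ridge_zeros = int(np.sum(ridge.coef_ == 0))
lasso_zeros = int(np.sum(lasso.coef_ == 0))
print(ridge_zeros, lasso_zeros)
```

Counting the exactly-zero coefficients is one simple way to see why Lasso is often used for feature selection while Ridge is not.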

Session 3

Support Vector Classifiers

  • Linear Support Vector Classifiers (SVC)
  • The kernel trick and non-linear SVCs
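
The sketch below gives a sense of why kernels matter: it trains a linear SVC and an RBF-kernel SVC on data that no straight line can separate. The concentric-circles dataset from `make_circles` and the default hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two classes arranged as an inner and an outer circle.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear decision boundary cannot separate the two circles.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)

# The RBF kernel implicitly maps the data to a space where they are separable.
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)
print(round(linear_acc, 3), round(rbf_acc, 3))
```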

Get in Touch

CONTACT US

We will email you within the next 24 hours to arrange a quick call to help with any questions about the programme and recommend pre-course materials.

We look forward to speaking with you.

Dr. Raoul-Gabriel Urma

Director