Introduction to Data Science

Exploratory data analysis and interactive visualisation, unsupervised learning, dimensionality reduction and feature extraction, supervised learning and more.

  • Level: beginner
  • Duration: 2-day course
  • Delivered: in-house

What you will learn

The course is highly interactive and hands-on. You will learn by working through concrete problems with a real dataset, taught by academic and industry experts who have a wealth of experience and knowledge to share.

  • Preprocessing (scaling, log transformations, imputation, one-hot encoding)
  • Exploratory data analysis and interactive visualisation
  • Unsupervised learning (k-means clustering, hierarchical clustering)
  • Dimensionality reduction and feature extraction
  • Supervised learning (k-nearest neighbours, KNN)

Languages and libraries

  • Python programming language
  • NumPy and pandas for data manipulation
  • Scikit-learn for machine learning algorithms
  • Matplotlib and seaborn for data visualisation


Day One

Data Science Essentials

Session 1

Introduction to Data Science

  • Overview of Data Science and Machine Learning
  • Supervised vs. Unsupervised Learning
  • Working with the Jupyter notebook
  • The NumPy library for array manipulation
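
As a flavour of the array manipulation covered in this session, here is a minimal sketch. The numbers are made up for illustration and are not the course dataset.

```python
import numpy as np

# Hypothetical measurements: 3 samples (rows) x 2 features (columns).
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Column-wise statistics; axis=0 aggregates over the rows.
col_mean = data.mean(axis=0)
col_std = data.std(axis=0)

# Broadcasting: the per-column statistics are applied to every row.
standardised = (data - col_mean) / col_std

# Boolean masking: keep only rows whose first feature exceeds 1.5.
selected = data[data[:, 0] > 1.5]
```

The same three ideas (axis-wise aggregation, broadcasting and masking) carry over directly to pandas and scikit-learn.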

Session 2

Working with real-world data

  • The pandas library for data manipulation
  • Data cleaning and pre-processing
  • Data visualisation with Matplotlib and Seaborn
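
The cleaning and pre-processing steps named above could look roughly like the sketch below. The toy table and its column names are hypothetical, chosen only to show imputation, a log transformation and one-hot encoding in pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "income": [30000.0, np.nan, 120000.0, 45000.0],
    "city": ["Leeds", "York", "Leeds", "Bath"],
})

# Imputation: replace the missing income with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# Log transformation: log1p compresses the long right tail of income.
df["log_income"] = np.log1p(df["income"])

# One-hot encoding: one indicator column per distinct city.
encoded = pd.get_dummies(df, columns=["city"])
```

Median imputation is just one reasonable default here; the right strategy depends on why the values are missing.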

Session 3

Principal Component Analysis (PCA)

  • What PCA is and why you need it
  • Applying PCA in Python with scikit-learn
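
To illustrate the kind of exercise this session involves, here is a minimal sketch of PCA with scikit-learn. The data is synthetic (an assumption for this example, not the course dataset): three features, one of which is almost a linear combination of the other two, so two components should capture nearly all the variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data: the third feature is nearly the sum of the first two.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
third = base[:, 0] + base[:, 1] + rng.normal(scale=0.05, size=200)
X = np.column_stack([base, third])

# PCA is sensitive to feature scale, so standardise first.
X_scaled = StandardScaler().fit_transform(X)

# Project the three features onto the two leading principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

# Fraction of the total variance retained by the two components.
explained = pca.explained_variance_ratio_.sum()
```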

Day Two

Unsupervised learning and supervised learning

Session 1

Unsupervised learning

  • The scikit-learn library for machine learning, including scikit-learn pipelines
  • k-means clustering
  • Hierarchical cluster analysis
  • Density-based clustering (DBSCAN)
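
The sketch below gives a rough idea of how a scikit-learn pipeline can combine scaling with clustering. It assumes three well-separated synthetic blobs rather than the course dataset, and the cluster centres are arbitrary choices for illustration.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: three well-separated groups with known membership.
X, y_true = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

# Pipeline: standardise the features, then run k-means with k=3.
kmeans_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("cluster", KMeans(n_clusters=3, n_init=10, random_state=0)),
])
kmeans_labels = kmeans_pipeline.fit_predict(X)

# Same pipeline shape with hierarchical (agglomerative) clustering instead.
hier_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("cluster", AgglomerativeClustering(n_clusters=3)),
])
hier_labels = hier_pipeline.fit_predict(X)

# Agreement with the known groups (1.0 means a perfect match).
kmeans_score = adjusted_rand_score(y_true, kmeans_labels)
hier_score = adjusted_rand_score(y_true, hier_labels)
```

Real data is rarely this clean, which is why the session compares several clustering methods rather than relying on one.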

Session 2

Supervised Learning

  • The k-Nearest Neighbour algorithm
  • Overfitting, underfitting and the bias-variance trade-off
  • Cross-Validation and hyperparameter tuning
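
As a minimal sketch of how these topics fit together, the example below tunes the number of neighbours for a k-nearest-neighbour classifier using 5-fold cross-validation. It uses the iris dataset bundled with scikit-learn as a stand-in (an assumption for this example, not the course dataset), and the candidate values of k are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Hold out a test set that plays no part in tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own
# training portion only, which avoids leaking information across folds.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Small k tends to overfit, large k tends to underfit; let CV choose.
param_grid = {"knn__n_neighbors": [1, 3, 5, 7, 9, 11]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = search.score(X_test, y_test)
```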


Prerequisites

  • Elementary Python programming and use of the command line. You can acquire these skills at our Python bootcamp.
  • Basic probability and linear algebra.


This course is for individuals who want to master new technical skills and learn the latest techniques and industry best practices so they can work effectively with data science teams.

Get in touch

Get in touch to discuss team size, pricing and your tech requirements. Send us an email or fill in our contact form. We’ll be sure to get back to you soon.
