Introduction to Data Science

Exploratory data analysis and interactive visualisation, unsupervised learning, dimensionality reduction and feature extraction, supervised learning and more.

Next Date: 11 Feb - 12 Feb 2017




London and Cambridge


Two-day course

What you will learn

The course is extremely interactive and hands-on. You will learn by working through concrete problems with a real dataset. You will be taught by academic and industry experts in the field, who have a wealth of experience and knowledge to share.

  • Preprocessing (scaling, log transformations, imputation, one-hot encoding)
  • Exploratory data analysis and interactive visualisation
  • Unsupervised learning (k-means clustering, hierarchical clustering)
  • Dimensionality reduction and feature extraction (PCA, t-SNE)
  • Supervised learning (k-nearest neighbours, decision trees)

Languages and libraries

  • Python programming language
  • numpy and pandas for data manipulation
  • scikit-learn for machine learning algorithms
  • plotly for interactive visualisations

Progression paths

Cement your skills by working through a follow-up project with our feedback and attaining the Data Science Foundation certificate.

Learn state-of-the-art machine learning techniques at our Advanced Machine Learning Techniques in Python bootcamp.

Acquire specialised Natural Language Processing skills at our Text Mining and Natural Language Processing with Python bootcamp.

Learn how to make quantitative predictions with our Forecasting and Regression course.


Audience: All aspiring data scientists, students, researchers and professionals who are curious about this exciting and rapidly growing field.

Prerequisites: basic statistics and probability theory; basic Python

Day 1

Pre-processing, exploratory data analysis (EDA), visualisation, unsupervised learning.

Session 1

Introduction to Data Science

  • Overview of Data Science and Machine Learning
  • Supervised vs. Unsupervised Learning
  • Classification vs. Regression
  • Real-world Applications

Session 2

Working with real-world data

  • Cleaning and mining real-world data
  • Data cleaning and pre-processing
  • Exploratory data analysis
  • Interactive visualisations
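To give a flavour of the pre-processing steps in this session, here is a minimal sketch using pandas and scikit-learn. The toy table, its column names and the choice of median imputation are assumptions made for illustration only; they are not the course dataset or necessarily the exact approach taught.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy dataset, invented for illustration only.
df = pd.DataFrame({
    "income": [20000.0, 35000.0, np.nan, 120000.0],
    "city": ["London", "Cambridge", "London", "Cambridge"],
})

# Imputation: fill the missing income with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# Log transformation: compress the long right tail of income.
df["log_income"] = np.log1p(df["income"])

# Scaling: standardise the transformed column to zero mean, unit variance.
scaler = StandardScaler()
df["log_income_scaled"] = scaler.fit_transform(df[["log_income"]]).ravel()

# One-hot encoding: replace "city" with one indicator column per category.
encoded = pd.get_dummies(df, columns=["city"])
print(encoded.columns.tolist())
```

The order matters here: imputing before the log transform keeps missing values from propagating into the derived columns.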

Session 3

Unsupervised learning

  • Fundamentals of clustering techniques
  • k-means clustering
  • Hierarchical cluster analysis
  • Density-based clustering
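The three clustering families listed above can be compared side by side with scikit-learn. The sketch below assumes synthetic, well-separated 2-D blobs rather than the course dataset, and the `eps` and `min_samples` values for DBSCAN are illustrative guesses that suit this toy data only.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

# Synthetic data: three well-separated 2-D blobs (illustrative only).
X, y_true = make_blobs(
    n_samples=150,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.5,
    random_state=0,
)

# k-means: the number of clusters is chosen up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical (agglomerative) clustering with Ward linkage.
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Density-based clustering: DBSCAN infers the number of clusters itself
# and labels any point it treats as noise with -1.
dbscan_labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)

n_dbscan_clusters = len(set(dbscan_labels) - {-1})
print(len(set(kmeans_labels)), len(set(hier_labels)), n_dbscan_clusters)
```

On real data the methods often disagree, which is why the choice of algorithm and its parameters is part of the analysis rather than a formality.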



  • Drinks with fellow participants and lecturers

Day 2

Dimensionality reduction, feature extraction, supervised learning.

Session 1

Dimensionality Reduction and Feature Extraction

  • The curse of dimensionality
  • Principal Component Analysis (PCA)
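As a small illustration of PCA, the sketch below builds synthetic 10-dimensional data that is secretly driven by only two latent factors, then projects it down to two components. The data, the noise level and the variable names are assumptions for illustration, not course material.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in 10 dimensions, generated from
# 2 latent factors plus a small amount of noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

# Project onto the two directions of greatest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of the total variance retained by the two components.
explained = pca.explained_variance_ratio_.sum()
print(X_reduced.shape, explained)
```

Because the data really has only two underlying degrees of freedom, two components retain almost all of the variance; real datasets rarely compress this cleanly.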

Session 2

Supervised Learning

  • The k-nearest neighbours algorithm
  • Decision Tree classifier
  • Overfitting and Validation
  • Hyperparameter tuning
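The topics in this session fit together roughly as in the sketch below: hold out a test set, fit a k-nearest neighbours classifier, then tune a decision tree's depth with cross-validation. The iris dataset, the parameter grid and the split size are assumptions chosen to keep the example self-contained; the course itself may use different data and settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Built-in example dataset (an assumption, not the course dataset).
X, y = load_iris(return_X_y=True)

# Validation: keep 30% of the data unseen until the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# k-nearest neighbours with a fixed k.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_accuracy = knn.score(X_test, y_test)

# Hyperparameter tuning: choose the tree depth by 5-fold cross-validation
# on the training set only, which guards against overfitting to the test set.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 4, 5]},
    cv=5,
)
search.fit(X_train, y_train)
best_depth = search.best_params_["max_depth"]
tree_accuracy = search.score(X_test, y_test)

print(knn_accuracy, best_depth, tree_accuracy)
```

The test set is touched only once per model, at the very end, so the reported accuracies estimate performance on data the models have not seen.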

Session 3

Guest Speaker

  • Data Science in industry

Data Science Foundation Certificate

This certification acknowledges that you have successfully acquired the skills taught at the Cambridge Coding Data Science Bootcamp and that you are able to apply them independently.

To attain the certificate, you will be required to complete a project-based assessment after the bootcamp which you will be able to include in your own portfolio of work.

Once completed, you will receive detailed feedback on your code, problem-solving approach, and methodology, providing you with invaluable guidance on how to develop as a data scientist.

Price: £100 extra


The Cambridge Coding Academy Machine Learning certificate recognises that you are able to apply fundamental machine learning skills and solve problems independently, based on the methodology taught at the Data Science Bootcamp.

You will also receive detailed feedback on code, problem-solving approach and methodology, which will help you to develop as a practitioner in this field.

Submission and Assessment

Assessment takes the form of an open-ended project using a real-world dataset, which you need to complete within two months of attending the bootcamp. You may apply any of the techniques introduced in the bootcamp and explore others if you wish.


You will need to demonstrate that you understand the essential principles and techniques that you have been taught at the Data Science Bootcamp in Python, and that you would be able to apply these in real-world settings. Specifically, you need to show you are able to:

  • Understand the Machine Learning problem
  • Pre-process and improve data quality for analytics
  • Transform, visualise and analyse data
  • Build the model
  • Train and validate the model
  • Evaluate and optimise the model


Check out video highlights, photos and interviews from our previous bootcamps.

Book your ticket

Next Date: 11 Feb - 12 Feb 2017

Location: THECUBE, Studio 5, 155 Commercial Street, London E1 6BJ

Ticket includes online course materials, code resources, lunch and networking drinks

In-house Training

Get in touch to discuss your requirements by emailing us or by completing our contact form.

We can deliver this course as private training at your office on weekdays.

We can also design a bespoke curriculum matching your specific training objectives.