Introduction to Data Science
Exploratory data analysis and interactive visualisation, unsupervised learning, dimensionality reduction and feature extraction, supervised learning and more.
Next Date: 11 Feb - 12 Feb 2017
London and Cambridge
Two-day course
What you will learn
The course is extremely interactive and hands-on. You will learn by working through concrete problems with a real dataset. You will be taught by academic and industry experts in the field, who have a wealth of experience and knowledge to share.
- Preprocessing (scaling, log transformations, imputation, one-hot encoding)
- Exploratory data analysis and interactive visualisation
- Unsupervised learning (k-means clustering, hierarchical clustering)
- Dimensionality reduction and feature extraction (PCA, t-SNE)
- Supervised learning (decision trees)
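As a rough illustration of the preprocessing steps listed above, the sketch below applies imputation, a log transformation, scaling and one-hot encoding with pandas and scikit-learn. The table, its column names and its values are invented for this example and are not taken from the course material.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: column names and values are made up for illustration.
df = pd.DataFrame({
    "income": [20000.0, 35000.0, np.nan, 120000.0],
    "age": [25.0, 32.0, 47.0, np.nan],
    "city": ["London", "Cambridge", "London", "Cambridge"],
})
numeric_cols = ["income", "age"]

# Imputation: replace missing numeric values with the column median.
imputer = SimpleImputer(strategy="median")
df[numeric_cols] = imputer.fit_transform(df[numeric_cols])

# Log transformation: compress the long right tail of income.
df["income"] = np.log1p(df["income"])

# Scaling: give each numeric column zero mean and unit variance.
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

# One-hot encoding: turn the categorical column into indicator columns.
preprocessed = pd.get_dummies(df, columns=["city"])
print(preprocessed)
```

The order of the steps here is one reasonable choice, not the only one. On a real dataset the imputer and scaler would normally be fitted on the training portion only.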
Languages and libraries
- Python programming language
- numpy and pandas for data manipulation
- scikit-learn for machine learning algorithms
- plotly for interactive visualisations
Cement your skills by working through a follow-up project with our feedback and attaining the Data Science Foundation certificate.
Learn state-of-the-art machine learning techniques at our Advanced Machine Learning Techniques in Python bootcamp.
Acquire specialised Natural Language Processing skills at our Text Mining and Natural Language Processing with Python bootcamp.
Learn how to make quantitative predictions with our Forecasting and Regression course.
Prerequisites: basic statistics and probability theory; basic Python
Pre-processing, exploratory data analysis (EDA), visualisation, unsupervised learning.
Introduction to Data Science
- Overview of Data Science and Machine Learning
- Supervised vs. Unsupervised Learning
- Classification vs. Regression
- Real-world Applications
Working with real-world data
- Cleaning and mining real-world data
- Data cleaning and pre-processing
- Exploratory data analysis
- Interactive visualisations
- Fundamentals of clustering techniques
- k-means clustering
- Hierarchical cluster analysis
- Density-based clustering
- Drinks with fellow participants and lecturers
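To give a flavour of the three clustering families named in this outline, here is a minimal sketch using scikit-learn on synthetic data. The blob centres, `eps` and `min_samples` values are assumptions chosen so that the example separates cleanly; real data rarely behaves this well.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic data: three well-separated blobs (not the course dataset).
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# k-means: the number of clusters is fixed in advance.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical (agglomerative) clustering with Ward linkage.
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Density-based clustering: no cluster count is given; the label -1 marks noise.
dbscan_labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)

# Compare two labelings regardless of how the cluster ids are numbered.
agreement = adjusted_rand_score(kmeans_labels, hier_labels)
print("k-means vs hierarchical agreement:", agreement)
print("DBSCAN clusters found:", len(set(dbscan_labels) - {-1}))
```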
Dimensionality reduction, feature extraction, supervised learning.
Dimensionality Reduction and Feature Extraction
- The curse of dimensionality
- Principal Component Analysis (PCA)
- The k-nearest neighbours algorithm
- Decision Tree classifier
- Overfitting and Validation
- Hyperparameter tuning
- Data Science in industry
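The sketch below illustrates two of the ideas in this outline, PCA and overfitting in a decision tree classifier, using scikit-learn. It uses the digits dataset bundled with scikit-learn as a stand-in for the course data, and the choice of 10 components and the depths tried are arbitrary assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# The digits dataset ships with scikit-learn; it stands in for the course data.
X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# PCA: project the 64 pixel features onto 10 principal components.
# The projection is fitted on the training split only.
pca = PCA(n_components=10).fit(X_train)
X_train_pca = pca.transform(X_train)
X_val_pca = pca.transform(X_val)
explained = pca.explained_variance_ratio_.sum()
print("fraction of variance kept:", explained)

# Overfitting: compare training and validation accuracy as the tree deepens.
scores = {}
for max_depth in (2, 5, None):  # None lets the tree grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X_train_pca, y_train)
    scores[max_depth] = (
        tree.score(X_train_pca, y_train),
        tree.score(X_val_pca, y_val),
    )
    print(max_depth, scores[max_depth])
```

A widening gap between the training and validation scores as depth grows is the usual sign of overfitting, which is why a held-out validation set matters.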
Data Science Foundation Certificate
This certification acknowledges that you have successfully acquired the skills taught at the Cambridge Coding Data Science Bootcamp and that you are able to apply them independently.
To attain the certificate, you will be required to complete a project-based assessment after the bootcamp which you will be able to include in your own portfolio of work.
Once you have completed the project, you will receive detailed feedback on your code, problem-solving approach, and methodology, providing you with invaluable guidance on how to develop as a data scientist.
Price: £100 extra
The Cambridge Coding Academy Machine Learning certificate recognises that you are able to apply fundamental machine learning skills and solve problems independently, based on the methodology taught at the Data Science Bootcamp.
You will also receive detailed feedback on code, problem-solving approach and methodology, which will help you to develop as a practitioner in this field.
Submission and Assessment
Assessment will be in the form of an open-ended project working with a real-world dataset. You need to complete this within 2 months of attending the bootcamp. You may apply any of the techniques introduced in the bootcamp and explore other techniques if you wish to.
You will need to demonstrate that you understand the essential principles and techniques that you have been taught at the Data Science Bootcamp in Python, and that you would be able to apply these in real-world settings. Specifically, you need to show you are able to:
- Understand the Machine Learning problem
- Pre-process and improve data quality for analytics
- Transform, visualise and analyse data
- Build the model
- Train and validate the model
- Evaluate and optimise the model
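For orientation only, the build, train, validate, evaluate and optimise steps above might look like the following in scikit-learn. This is a sketch under assumptions, not the assessment itself: it uses the wine dataset bundled with scikit-learn, a k-nearest neighbours classifier, and an arbitrary grid of `n_neighbors` values.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset; the assessment uses its own real-world data.
X, y = load_wine(return_X_y=True)

# Hold out a test set that is not touched until the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Build the model: scaling and the classifier are chained so that the
# scaler is re-fitted inside every cross-validation fold.
model = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Train, validate and optimise: 5-fold cross-validation over the grid.
search = GridSearchCV(
    model,
    param_grid={"knn__n_neighbors": [1, 3, 5, 7, 9]},
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate: score the best model once on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("test accuracy:", test_accuracy)
```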