An interview with our Applied Data Science Alum, Jan Cuppens
Signing up for the course, I was looking to improve my skills in general; what I got in return was so much more. Cambridge Spark helped me get a new job, as well as kick off my start-up, all whilst hanging out with great people and having an amazing time.
What did you most enjoy working on during the Applied Data Science Bootcamp?
I enjoyed working on the assignments the most, as they enabled us to put into practice what we had learned. The project I most enjoyed involved predicting which Kickstarter start-ups would likely receive funding and which would not.
What were your key takeaways from the programme?
There really is no better way to learn than by doing, and the programme is all about that. I wouldn’t argue with Gladwell’s 10,000-hour rule for mastering a skill, but I can say that after completing this course you’ll be well on your way.
Feature engineering is key! It is crucial for data scientists to have a good understanding of the domain the questions revolve around. No matter how fancy the algorithm or how sophisticated the visualisation, if you don’t manage to model the signal in a proper way and highlight the key underlying information, you will likely not obtain an optimised result.
You worked on your own start-up idea for the final project. Could you tell me a bit about the techniques you applied?
I am very much interested in the world of collectibles (watches, cars, whiskey, etc.), not necessarily from a collector’s perspective, but rather as alternative investments. Over the past decade we have seen a great surge in investment in these asset classes, in many cases even outperforming traditional market yields. I believe this to be a domain that could greatly benefit from machine learning.
For my final project I built a classification model to determine which collectibles would sell and which would sell above asking price. Much like fraud detection problems, the level of class imbalance was rather high. Therefore, I decided to go for more complex tree-based models (random forest, XGBoost) and ensemble models, as they lend themselves nicely to the kind of data collected. After a lot of feature engineering (NLP was key) and dealing with the risk of overfitting, I managed to get quite good F1 scores for both classes.
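Editor's note: to illustrate why per-class F1 is a useful yardstick when classes are imbalanced, here is a minimal sketch in plain Python. It is not Jan's code or data; the labels, the 90/10 split, and both sets of predictions are hypothetical and chosen only for illustration.

```python
def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1 score}, treating each label in turn as the positive class."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced data: 0 = "sold", 1 = "sold above asking price".
y_true = [0] * 90 + [1] * 10

# A naive model that always predicts the majority class.
y_naive = [0] * 100

# A hypothetical better model: 5 false alarms on class 0,
# and 8 of the 10 minority items caught.
y_better = [0] * 85 + [1] * 5 + [1] * 8 + [0] * 2

naive_acc = accuracy(y_true, y_naive)
naive_f1 = f1_per_class(y_true, y_naive, labels=[0, 1])
better_acc = accuracy(y_true, y_better)
better_f1 = f1_per_class(y_true, y_better, labels=[0, 1])

print("naive :", naive_acc, naive_f1)
print("better:", better_acc, better_f1)
```

In this toy setup the naive model reaches 90% accuracy yet scores an F1 of zero on the minority class, which is why reporting F1 for both classes, as described above, gives a more honest picture than accuracy alone.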
It is so great to hear you have landed a new job! What position have you moved into?
I am very pleased to be joining the Data Science team at Dunnhumby.
What piece of advice do you have for those looking to join the Applied Data Science Bootcamp?
Make sure you feel comfortable coding in Python and know the basics. Get yourself a good introductory book on data science and machine learning and read as much as you can. Go to the drinks; you will meet some very cool people.
Want to learn more about our programme?
Check out our Applied Data Science Curriculum for more information about the topics covered.