Join our community of hundreds of researchers, analysts and data scientists for an opportunity to network, develop new skills and gain insight into the evolving field of data science.
Hear from industry and academic speakers representing a range of sectors, from research and bioinformatics to business and finance.
Learn about the practical application and implementation of the latest tools and techniques through industry case studies.
Share knowledge, pick up new ideas and connect with developers, analysts, researchers and executives.
The Data Science Summits are all about putting research into action. You can see how the latest techniques are implemented, network with other leaders and specialists in the field who make research actionable, and get insight into how you can help transform your company, your teams and the way you work.
Sarah Curshen, Director of Executive Education Custom Programmes, Cambridge Judge Business School
Prof. Kenneth Benoit
Professor of Quantitative Social Research Methods, London School of Economics
Session: Quantitative Text Mining, the Social Scientific Way: Mining Social Media on Brexit
Abstract: Text mining and text analytics form increasingly important subsets of data science. This activity may be geared toward extracting value from commercial data, improving policy delivery, studying human speech, or analyzing literature or the arts. In this talk, I present a distinct approach to analyzing text as data from a social science perspective, in which text serves as evidence of qualities that we cannot observe more directly. I will discuss the implications of this perspective, and provide examples through ongoing work on text analysis of tens of millions of Tweets about Brexit, including machine learning to predict a user's preference for Leave v. Remain, sentiment analysis, and topic models.
Dr. Sebastian Kaltwang and Brook Roberts
Machine Learning Engineer, FiveAI
Session: Overcoming the Data Bottleneck for Self-driving Cars
Abstract: How can we efficiently obtain millions of annotated images for model training? State-of-the-art deep learning models have been able to achieve superhuman performance on various object recognition challenges. This makes them a suitable candidate for the safety-critical perception tasks required in self-driving cars. There is one caveat: these methods require large amounts of data, which are typically obtained via a costly and time-consuming manual annotation process. We at FiveAI work on urban travel that's safe for everyone, without costing the earth. Not willing to make any compromises on safety, we needed to find a way to efficiently label large numbers of images. We noticed that driving the car is itself a form of annotation. Using this as a starting point, we can estimate a road plane in 3D based on where the car has driven and are able to project manual labels from this plane into all images of a video sequence. Using this semi-automated labelling process, we have been able to reduce the labelling time from the order of minutes down to 5 seconds per image.
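The projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not FiveAI's pipeline: it assumes a perfectly flat road, an ideal pinhole camera looking straight ahead, and a known camera height. The function names and all camera parameters (`camera_height_m`, `focal_px`, `cx`, `cy`) are hypothetical and chosen only for illustration.

```python
def project_road_point(lateral_m, forward_m, camera_height_m, focal_px, cx, cy):
    """Project a point on a flat road plane into pixel coordinates.

    Assumed camera frame: x to the right, y downwards, z forwards.
    The road is taken to lie camera_height_m below the camera centre.
    """
    if forward_m <= 0:
        raise ValueError("point must be in front of the camera")
    # Standard pinhole projection: pixel = principal point + focal * (X / Z).
    u = cx + focal_px * lateral_m / forward_m
    v = cy + focal_px * camera_height_m / forward_m
    return u, v


def project_road_polygon(polygon_m, **camera):
    """Project a labelled road polygon given as (lateral, forward) pairs in metres."""
    return [project_road_point(x, z, **camera) for x, z in polygon_m]


# Hypothetical camera set-up, for illustration only.
camera = dict(camera_height_m=1.5, focal_px=1000.0, cx=640.0, cy=360.0)
lane_patch = [(-1.5, 10.0), (1.5, 10.0), (1.5, 30.0), (-1.5, 30.0)]
pixels = project_road_polygon(lane_patch, **camera)
```

In this toy set-up, a label drawn once on the road plane can be re-projected into every frame, provided the camera pose relative to the plane is known for each frame. Real systems would also need to handle lens distortion, vehicle pitch and roll, and non-planar roads.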
Cloud Developer Advocate, Google
Session: Google Cloud AutoML
Abstract: Thanks to machine learning and AI, applications are now being created that can see, hear, and understand the world around them. Learn how you can easily infuse AI into your business today. In addition to a guided walkthrough and some fun demos of Google Cloud's easy-to-use machine learning APIs (Cloud Vision, Cloud Video Intelligence, Cloud Speech, Cloud Natural Language, and Cloud Translation), we'll demonstrate how Google Cloud AutoML enables developers with limited machine learning expertise to train high-quality models by leveraging Google's state-of-the-art transfer learning and Neural Architecture Search technology.
Artificial Intelligence DevRel EMEA, Nvidia
Session: Artificial intelligence and the evolution of the computing platform
Abstract: Artificial Intelligence is impacting all areas of society, from healthcare and transportation to smart cities and energy. AI won’t be an industry, it will be part of every industry. NVIDIA invests both in internal research and platform development to enable its diverse customer base, across gaming, VR, AR, AI, robotics, graphics, rendering, visualisation, HPC, healthcare & more. Alison’s talk will introduce the hardware and software platform at the heart of this Intelligent Industrial Revolution: NVIDIA GPU Computing. She will provide insights into how the computational demands for AI have impacted hardware evolution & how academia, enterprise and startups are applying AI, offering a glimpse into state-of-the-art research.
Dr Haitham Bou-Ammar
Head of Reinforcement Learning and Tuneable AI, Prowler
Session: Data-Efficient Reinforcement Learning
Abstract: What's the use of an AI that doesn't make decisions? Current techniques for decision-making in AI are unscalable, computationally expensive, and memory intensive. At PROWLER.io, we are developing next-generation decision makers that are efficient, scalable, and robust. To do so, we draw upon a variety of methodologies from different fields, including probabilistic modelling, game theory, and optimisation. In this talk, I demonstrate how a decision-maker can be made much more data-efficient on tasks whose complexity is close to that encountered in real-world problems. As an example, I present a result for controlling Montezuma's Revenge in the order of thousands, not millions, of interactions with the environment. Further, I demonstrate how to control robotic systems in the order of tens of episodes.
Dr Maksim Sipos
Session: Automated feature extraction and selection for challenging time-series prediction problems
Abstract: In predictive modelling, feeding time-series data directly into a machine learning algorithm often leads to sub-optimal performance. Most modern algorithms tend to be slow at learning the embedded time dynamics. This is especially the case in challenging problems such as datasets with a small sample size and datasets with a low signal-to-noise ratio. A common solution is to include a pre-processing step, namely feature extraction. Given that many features can be extracted from each time series, this leads to an exponential increase in the dimensionality of the data. Optimal feature set selection can be a time-intensive process, and the optimal solution is a function of the choice of algorithm and parameters. The talk will focus on how including automated feature extraction and selection as part of a full machine learning optimisation pipeline can lead to superior results, especially in the case of challenging time-series problems.
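As a rough illustration of the two steps named in the abstract, the sketch below extracts a handful of summary features from each time series and then keeps the features most correlated with the target. It is not the speaker's method: the choice of features, the correlation-based ranking and all function names are assumptions made for this example, and only the Python standard library is used.

```python
import math
from statistics import mean, pstdev


def extract_features(series):
    """Summarise one time series (length >= 2) as a small dict of features."""
    n = len(series)
    mu = mean(series)
    # Least-squares slope of the values against their time index 0..n-1.
    t_mu = (n - 1) / 2
    slope = sum((t - t_mu) * (x - mu) for t, x in enumerate(series)) / sum(
        (t - t_mu) ** 2 for t in range(n)
    )
    # Lag-1 autocorrelation, defined as 0.0 for a constant series.
    var = sum((x - mu) ** 2 for x in series)
    lag1 = (
        sum((series[i] - mu) * (series[i + 1] - mu) for i in range(n - 1)) / var
        if var > 0
        else 0.0
    )
    return {"mean": mu, "std": pstdev(series), "slope": slope, "lag1_autocorr": lag1}


def pearson(xs, ys):
    """Pearson correlation, returning 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(feature_rows, targets, k=2):
    """Keep the k feature names with the largest absolute correlation to the target."""
    scores = {
        name: abs(pearson([row[name] for row in feature_rows], targets))
        for name in feature_rows[0]
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: the target happens to equal each series' trend.
series_list = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 2, 2, 2], [1, 3, 5, 7]]
targets = [1.0, -1.0, 0.0, 2.0]
rows = [extract_features(s) for s in series_list]
best = select_features(rows, targets, k=1)
```

A univariate ranking like this ignores interactions between features and, as the abstract notes, the best feature set depends on the downstream algorithm, so in practice selection would be tuned together with the model.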
Dr Jeremy Bradley
Lead Data Scientist, Royal Mail
Session: Data Science as a Transformative Process
Abstract: Data Science is often misunderstood or misused in a commercial environment as a means of creating more detailed insights in an existing operation - whether that be in ops, finance or marketing. This is a waste of the science and the talent. The real power of using science in a commercial environment is to link its results through well engineered tools to decisions - maybe in an automated, semi-automated or curated fashion. I will talk about some of my experiences of doing this at Tesco and at Royal Mail in this talk. Far from leading to a business operation with less human understanding and characteristics, I will argue that a new data science approach can lead businesses to take greater care of both employees and customers and benefit both as a result.
Deep-dive Machine Learning Workshops using Python
Session 1: NLP - Semantic analysis and information extraction
- Overview of information extraction
- Vector representations of words (Word2Vec)
- Evaluating semantic similarity
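
To give a flavour of the last two topics, the sketch below scores semantic similarity as the cosine of the angle between word vectors. The three-dimensional vectors are made up for illustration; real Word2Vec embeddings are learned from text and typically have hundreds of dimensions. The function names are likewise hypothetical and not taken from the workshop material.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up toy "embeddings"; not the output of a trained model.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "banana": [0.1, 0.0, 0.9],
}


def most_similar(word, vectors):
    """Return the other word whose vector is closest in cosine similarity."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


nearest = most_similar("king", toy_vectors)
```
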
Session 2: Neural networks and Deep Learning
- Structure of Neural Networks
- Training of Neural Networks
- Convolutional Neural Networks
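
As a taste of the training topic, the sketch below fits the smallest possible network, a single sigmoid neuron, to the logical OR function using full-batch gradient descent on the cross-entropy loss. It is an illustrative assumption rather than workshop material: the function names, learning rate and epoch count are arbitrary choices, and real networks stack many such units and are usually trained with a dedicated library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one sigmoid neuron for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(inputs, labels, learning_rate=0.5, epochs=2000):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, labels):
            # For sigmoid + cross-entropy, dLoss/dz is (prediction - label).
            error = predict(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Logical OR: linearly separable, so a single neuron is enough.
inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 1, 1, 1]
weights, bias = train_neuron(inputs, labels)
predictions = [round(predict(weights, bias, x)) for x in inputs]
```

A single neuron can only separate classes with a straight line, which is why problems such as XOR need at least one hidden layer.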