Text Mining and Natural Language Processing with Python


  • Level: intermediate
  • Duration: 2-day course
  • Delivered: in-house

What you will learn

You will learn the fundamental skills you need to extract syntactic, semantic and even emotional information from text.

  • Text processing (parsing, tokenisation, lemmatisation)
  • Syntactic analysis (POS tagging)
  • Semantic analysis (word vector analysis, IR techniques)
  • Topic analysis
  • Language models and text generation

Languages and libraries

  • Python programming language
  • numpy and pandas for data manipulation
  • scikit-learn for machine learning algorithms
  • plotly for interactive visualisations

OUTLINE

Day One

Essential techniques for text processing and information extraction

Session 1

Text processing

  • Text tokenisation
  • Lemmatisation, parsing
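
To give a flavour of this session, here is a minimal tokenisation sketch using only Python's standard library. The `tokenise` helper and its regular expression are illustrative assumptions rather than course material; production tokenisers handle many more cases (hyphenation, URLs, non-English scripts).

```python
import re

# Hypothetical pattern for illustration: runs of letters or digits,
# optionally joined by internal apostrophes (so "don't" stays one token).
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def tokenise(text):
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return TOKEN_PATTERN.findall(text.lower())


tokens = tokenise("Don't panic: text mining is fun!")
print(tokens)
```

Lowercasing before matching keeps the pattern simple, at the cost of discarding capitalisation that later steps such as named-entity recognition might need.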

Session 2

Semantic analysis and information extraction

  • Overview of information extraction
  • Vector representations of words
  • Evaluating semantic similarity
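
One common way to score similarity is the cosine of the angle between two vectors. The sketch below is a simplification under stated assumptions: it compares raw word-count vectors for whole sentences instead of learned word vectors, and the `cosine_similarity` helper and sample sentences are made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc1 = "the cat sat on the mat".split()
doc2 = "the cat lay on the rug".split()
doc3 = "stock prices fell sharply".split()

print(cosine_similarity(doc1, doc2))  # shares "the", "cat", "on"
print(cosine_similarity(doc1, doc3))  # no words in common
```

The same formula applies unchanged to dense word vectors; only the way the vectors are built differs.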

Day Two

NLP applications and machine learning

Session 1

Text classification and ranking

  • Naive Bayes for spam filtering
  • Sentiment analysis

Session 2

Topic segmentation

  • Clustering
  • Multi-class classification

Session 3

Language models

  • Text prediction
  • Text generation
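
As a rough illustration of both topics, a bigram model predicts the next word from the current one and can generate text by sampling repeatedly. This is a minimal sketch under stated assumptions, not the course's own implementation: the function names, the toy corpus and the fixed random seed are all hypothetical.

```python
import random
from collections import defaultdict


def train_bigram_model(tokens):
    """Map each token to the list of tokens observed immediately after it."""
    followers = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        followers[current].append(following)
    return followers


def generate(followers, start, length, seed=0):
    """Sample up to `length` tokens, starting from `start`."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    output = [start]
    for _ in range(length - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last token was never followed by anything in training
        # Repeated followers appear several times in the list, so choice()
        # samples in proportion to the observed bigram counts.
        output.append(rng.choice(candidates))
    return output


corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram_model(corpus)
print(" ".join(generate(model, "the", 8)))
```

Storing followers as a plain list keeps the sketch short; a larger model would store counts and apply smoothing for unseen word pairs.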

Prerequisites

Good knowledge of Python, some familiarity with matrices, and a basic understanding of machine learning practice (as taught in Introduction to Data Science).

Audience

Those who wish to take their data science skills further and learn state-of-the-art techniques in this constantly evolving field.


Get in touch

Get in touch to discuss team size, pricing and your tech requirements. Send an email to training@cambridgespark.com or fill in our contact form. We’ll be sure to get back to you soon.
