Introduction to Big Data and PySpark

Upskill data scientists in the Big Data technology landscape and in PySpark as a distributed processing engine

LEVEL: BEGINNER
DURATION: 2-DAY COURSE
DELIVERED: AT YOUR OFFICE

What you will learn

This two-day course provides a hands-on introduction to the Big Data ecosystem, Hadoop, and Apache Spark in practice.

Understand the challenges in the Big Data ecosystem
Describe the fundamentals of the Hadoop ecosystem
Use the core Spark RDD API to express data processing queries
Monitor and tune Spark applications

Languages and libraries:

Python 3
Spark

PREREQUISITES

Fundamentals of Python programming and use of the command line. You can acquire these skills at our Python bootcamp.

AUDIENCE

Those who are curious about the Big Data space and who want to feel comfortable getting their hands dirty with high-volume, high-velocity, diverse real-world datasets.

DAY ONE

Introduction to Big Data and Spark

DAY TWO

Spark in Practice

Get in Touch

CONTACT US

We will email you within the next 24 hours to arrange a quick call to help with any questions about the programme and recommend pre-course materials.

We look forward to speaking with you.

Dr. Raoul-Gabriel Urma

Director