Data Analysis: Python vs Excel
Excel has been a firm favourite for working professionals for many years, and for good reason. It has a large number of capabilities, and its ease of use has meant that it’s played a critical role in all manner of business, education, finance and research.
Enter Python. This programming language has gained traction over recent years. One report states that the demand for Python skills, as a requisite in job postings, has increased by 27.6% in the last year and shows no signs of slowing down. It was initially built as a way to write scripts that ‘automate the boring stuff’, but over time Python has become a leader in web development, data analysis and infrastructure management.
Demand for Excel
Microsoft Excel skills are still in high demand. After 34 years in this fast-changing tech world, the spreadsheet software is still going strong. The seasoned data analysis tool is still widely used in the financial sector to organise and present large amounts of data. Excel has also been updated recently, which means it boasts more user-friendly features and more effective functionality for all businesses.
According to Microsoft, there are 1.2 billion people that own Excel, of which 800 million people currently use it. In comparison, it’s been estimated that the number of people that use Python is around 8.2 million people. The odds are that if someone you work with sends you a report, it will be in Excel, so it’s useful to know how to use it.
The time for change
However, consultants and IT experts have voiced their concerns over how fragile the spreadsheet software can be. Excel is working to overcome challenges such as:
- Data Volume: Companies, small and large, have most likely used Excel at some point in their development. However, as the organisations generate data, they find themselves dealing with an increasing number of spreadsheets, resulting in complex analytical issues.
- Syntax Errors: Excel is notorious for problems when copying and pasting data into specific cell ranges, and inputting formulas manually can introduce many errors.
- Security Risks: Companies have to be cautious about the kind of information that is stored in Excel sheets, in case of misuse and cyber attacks. Excel has some security policies that need to be addressed.
Python says, (“Hello, World!”).
First released in 1991, Python has become one of the most ubiquitous programming languages out there. Although Python and Excel technically have different functionalities, Python has developed a strong following as people have realised its capabilities and potential. It’s been deemed a better data analysis tool by many developers and the wider data science community.
While Python requires basic programming skills, it is increasingly seen as a prerequisite for many quantitative roles. Companies are looking to hire candidates with at least beginner-level Python proficiency.
Its avid practitioners, known as Pythonistas, have uploaded 145,000 custom-built software packages to an online repository. These cover everything from game development to astronomy and can be installed and inserted into a Python program in a matter of seconds. This versatility means that the Central Intelligence Agency has used it for hacking, Google for crawling web pages, Pixar for producing movies and Spotify for recommending songs. Some of the most popular packages harness “machine learning”, by crunching large quantities of data to pick out patterns that would otherwise be imperceptible.
So how popular is it?
According to the ‘Popularity of Programming Language’ index, Python is the world’s most popular language. It’s grown 18.7% in the last five years. With a popularity share of 29.21%, Python beats the closest competitor, Java, by 9.31%. Whilst these numbers might not be an accurate metric to measure value, consider that Uber, PayPal, Google, Facebook, Instagram, Netflix, Dropbox, and Reddit all use Python in their development and testing. Moreover, Python is also used extensively in robotics and embedded systems.
In 2012, Stack Overflow, the largest and most trusted online community for developers, saw questions relating to Python account for less than 4%. Today, over 10% of questions on the site are related to Python.
Not only can it increase your productivity, but it can also have a positive impact on your income. The average salary in the UK for jobs that require Python as a skill is £57,075, compared to £37,504 for jobs using Excel. This means that by learning Python, you can expect to earn a higher salary on average. It’s also a great way to future-proof your career by keeping your skillset up-to-date and relevant.
Who can benefit from learning Python?
Python is a versatile tool that can be used in many applications across plenty of jobs. Some of the most interesting things that you can do with Python are:
- Automate the boring stuff: Updating spreadsheets, renaming files, gathering and formatting data, checking for spelling and grammar mistakes, automating Excel reports and compiling reports. These are just a few examples.
- Build a Bitcoin notification service to see when might be a good time to purchase the highly talked about cryptocurrency. If Ethereum is more your thing, the code can be replicated for other currencies.
- Mine data from Twitter to build a sentiment analysis tool. This project would lead nicely into learning more about text processing and speech recognition.
- Build a Blockchain to use for almost any financial transaction.
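As a small illustration of the "automate the boring stuff" idea from the list above, here is a minimal sketch that renames a folder of report files in one go. The function name, the folder and the "2019_" prefix are hypothetical, chosen only for the example; it uses nothing beyond Python's standard library.

```python
from pathlib import Path


def add_prefix_to_reports(folder: Path, prefix: str) -> list[str]:
    """Rename every .csv file in `folder` so that its name starts with `prefix`.

    Files that already carry the prefix are left alone.
    Returns the new file names, in alphabetical order of the originals.
    """
    renamed = []
    # sorted() takes a snapshot of the listing before anything is renamed
    for path in sorted(folder.glob("*.csv")):
        if path.name.startswith(prefix):
            continue
        new_path = path.with_name(prefix + path.name)
        path.rename(new_path)
        renamed.append(new_path.name)
    return renamed
```

Calling add_prefix_to_reports(Path("reports"), "2019_") would turn sales.csv into 2019_sales.csv and leave non-CSV files untouched. It is worth trying on a copy of the folder first, since renaming happens immediately.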
Alex Zhivitov is a Data Analyst at TransferWise, who is currently hiring for Data Analysts that have experience with Python. In a recent interview, he spoke about the importance of learning Python over the use of Excel. His response was “There are many tasks that you can use Excel or SQL to get the job done but Python allows you more freedom to build the tools yourself and easily cover the data pipeline of data analytics from start to finish”.
Jobs that Python can benefit:
Account Managers, Accountants and anyone working with large datasets can benefit from learning and using Python. Programming knowledge will allow you to extract and manipulate data from multiple reports, then filter it and detect any inconsistencies on a very large scale, which would take a long time in Excel.
Data Analysts can benefit from learning Python as the majority of their work involves trawling through data, and Python can help automate that process, saving time and effort. As mentioned earlier, TransferWise are hiring Data Analysts that have experience in Python. Alex explained that “with Python, you can do so much more because of its general-purpose. It gives you the freedom to build tools for yourself and you can easily cover the entire pipeline of Data Analytics work from start to finish.”
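To make the idea of detecting inconsistencies across reports concrete, here is a minimal sketch that compares two small CSV reports and lists every invoice whose amount differs or that is missing from one side. The column names (invoice_id, amount) and the sample data are assumptions made up for the example.

```python
import csv
import io


def load_totals(csv_text: str) -> dict[str, float]:
    """Read 'invoice_id,amount' rows into a dict keyed by invoice id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["invoice_id"]: float(row["amount"]) for row in reader}


def find_inconsistencies(report_a: str, report_b: str) -> dict[str, tuple]:
    """Return invoices whose amounts differ, or that appear in only one report.

    Each value is (amount in report A, amount in report B); None marks a
    missing invoice.
    """
    a, b = load_totals(report_a), load_totals(report_b)
    issues = {}
    for invoice_id in sorted(a.keys() | b.keys()):
        amount_a, amount_b = a.get(invoice_id), b.get(invoice_id)
        if amount_a != amount_b:
            issues[invoice_id] = (amount_a, amount_b)
    return issues


# Hypothetical sample reports: A2 has a typo, A3 and A4 each appear only once
sales = "invoice_id,amount\nA1,100.00\nA2,250.50\nA3,75.00\n"
ledger = "invoice_id,amount\nA1,100.00\nA2,205.50\nA4,40.00\n"
print(find_inconsistencies(sales, ledger))
```

The same two functions work unchanged on reports with thousands of rows, which is where checking by eye in a spreadsheet becomes impractical.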
Good news if you work in Marketing: Python can help you too. It can automate data collection (SEO indexation, email and SMS responses, and trend information), automate SEO processes, monitor campaigns more effectively and run customised error checks. The jobs you would generally go to Excel for, you can automate by writing simple Python code.
Journalism: This is particularly relevant within journalism that uses data to tell stories. Those who know Python are in demand as they can rapidly sort through information, making them much more efficient when it comes to writing for deadlines.
What makes Python a better option?
There are many things that Excel can do, and it’s a great tool for basic data analysis. However, Python allows you to do more in terms of analysis. Here are a few reasons…
Python can handle much larger volumes of data and therefore analysis, and it forms a basic requirement for most data science teams. It can easily overcome mundane tasks and bring in automation. Furthermore, it has better efficiency and scalability. Python is faster than Excel for data pipelines, automation and calculating complex equations and algorithms.
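One reason Python copes with larger volumes is that it does not need to hold a whole dataset on screen, or even in memory. An Excel worksheet tops out at 1,048,576 rows, whereas the minimal sketch below reads a CSV one row at a time and keeps only the running totals. The column names (region, amount) and the file name in the usage note are assumptions for illustration.

```python
import csv
from collections import defaultdict
from typing import Iterable


def total_by_region(lines: Iterable[str]) -> dict[str, float]:
    """Sum the 'amount' column per 'region', reading one row at a time.

    `lines` can be an open file, so the full dataset is never loaded at once.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row["region"]] += float(row["amount"])
    return dict(totals)
```

With a real file this would be called as:

```python
with open("transactions.csv", newline="") as f:  # hypothetical file name
    print(total_by_region(f))
```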
Python is free!
Although no programming language costs money to use, Python is free in another sense: it’s open-source. This means that the code can be inspected and modified by anyone. Python is a progressive language that is constantly being developed, collaboratively, by a group of volunteers. Microsoft Excel costs around £100 to download for one licence. For businesses (dependent on the number of employees) it could be thousands, and Excel is developed solely by Microsoft employees.
Leveraging the latest research
Excel has a large user base that offers a wide variety of tips and tricks in an open forum. But the Python community does the same, and more. With its strong ethos of collaboration, academics and Data Scientists often publish and share their code. This means that the latest techniques developed in Python are available for free to the community.
A Python library is a collection of functions and methods that allows you to perform many actions without writing your code from scratch. This makes a Data Analyst’s work more efficient: instead of spending time writing out code, they can simply import a library. Different libraries have different functionalities. For example, TensorFlow (developed by Google) is used for machine learning projects, and scikit-learn is a general-purpose machine learning library used for tasks such as classification, regression and clustering.
Python is referred to as a ‘glue’ language, which means that it is particularly useful for connecting different scripts together and interacting with different systems, including different forms of databases (e.g. SQL and NoSQL databases), data formats (JSON, Parquet etc.) and web services. The Python community also contributes many packages that allow you to interact with a range of public APIs. This is often useful for Data Scientists, given that they have to read data from different places and process it.
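To show what "just import a library" looks like in practice, here is a minimal sketch using statistics, a module that ships with Python. The monthly sales figures are made up for the example; third-party libraries such as TensorFlow are imported in exactly the same way once installed.

```python
import statistics

# Hypothetical monthly sales figures
monthly_sales = [1200, 1350, 1280, 1500, 1420, 1390]

# Each of these is one call instead of a hand-written formula
mean_sales = statistics.mean(monthly_sales)
median_sales = statistics.median(monthly_sales)
spread = statistics.stdev(monthly_sales)  # sample standard deviation

print(f"mean={mean_sales:.1f} median={median_sales:.1f} stdev={spread:.1f}")
```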
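The "glue" role can be sketched in a few lines: the example below takes data in one format (JSON, as a web service might return it), loads it into a SQL database and queries it back. The table layout, field names and sample payload are assumptions for illustration, and an in-memory SQLite database stands in for a real database server.

```python
import json
import sqlite3


def load_orders(json_text: str) -> sqlite3.Connection:
    """Parse a JSON array of orders and load it into an in-memory SQLite table."""
    orders = json.loads(json_text)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    # Named placeholders are filled from the keys of each JSON object
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (:customer, :amount)",
        orders,
    )
    conn.commit()
    return conn


def spend_per_customer(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Total order amount per customer, sorted by customer name."""
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


# Hypothetical payload, e.g. the body of an API response
payload = (
    '[{"customer": "acme", "amount": 120.0},'
    ' {"customer": "globex", "amount": 80.5},'
    ' {"customer": "acme", "amount": 30.0}]'
)
conn = load_orders(payload)
print(spend_per_customer(conn))
```

Both json and sqlite3 are part of the standard library, so this runs without installing anything; swapping in another database or data format usually means changing only the import and the connection line.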
Deep Learning and Machine Learning
Python is the de facto language of machine learning. Researchers and Academics are all using Python for deep learning to create predictive and simulative models that find new insights into their data. Most notably, Google’s TensorFlow works mostly with Python.
Python is widely supported
Python is backed by a large community of developers (8.2 million) and therefore has a strong support system. There are many tutorials covering Python concepts all over the web. Even Python programming experts can find guidance if necessary when working on complex problems. As mentioned earlier, Excel does have many more people using the software, and it’s well supported online in tutorials and guides.
Python is not only supported online but offline too, at conferences, meetups, hackathons and events around the world. PyCon 2019 in Cleveland, for example, had just shy of 4,000 attendees. PyCon is an international set of conferences, held at multiple locations around the world. They aim to bring together developers and data science enthusiasts to discuss and promote the Python programming language.
Another is PyData, a leading conference series that brings together users and developers of data analysis tools to share and learn, with chapters in cities all over the world. (PyData Cambridge will be running in November 2019).
In the business world, Python proficiency has been on the rise. Cambridge Spark CEO, Dr Raoul-Gabriel Urma, spoke to eFinancialCareers about the future for traders if they don’t learn Python. He said, “If you want to get an edge today, you need to create new strategies with Python. This is why all the traders at trading companies are increasing their Python proficiency.” Dr Urma’s experience and knowledge in Python come from his Masters in Engineering in computer science and a PhD in computer science from Cambridge University. He followed this by working with Google and Goldman Sachs before starting his own tech company.
Excel vs Python: Who wins?
The evidence suggests that both tools have their place in certain jobs. Excel is a great entry-level tool and is a quick and easy way to do some analysis. But for the modern era, with large data sets and more complex analytics and automation, Python provides the tools, techniques and processing power that Excel, in many instances, lacks. After all, Python is more powerful, faster, capable of better data analysis and it benefits from a more inclusive, collaborative support system.
Python is a must-have skill for Data Analysts and now is the time to learn. Alex Zhivotov also commented: “You can be a good Data Analyst without knowing Python but if you want to stand out above the rest, be a star data analyst and progress then you need to learn Python”.
If you want to reap the benefits such as a higher salary, better career opportunities and keeping your skills relevant for the fourth industrial revolution, then learn Python.
Learn Python with Cambridge Spark
At Cambridge Spark, we have our Applied Data Analytics Bootcamp. The Bootcamp is perfect for people who realise the potential of learning Python and the other skills that are becoming highly necessary in the business world. In just three months you could learn Python and advanced data analytics tools to ensure that you’re industry-ready to work as a Data Analyst. Equally, the skills you’ll learn can be used to perform data analysis in your current job role or to transition into a role within the Data Science field.
Looking to kick-start your Data Analyst/Data Scientist career?
Whether you’re looking to upskill to access promotions, reskill to remain relevant in your field, or transition to a Data Science career, our Applied Data Science Bootcamp (London) can help you achieve your career objectives.
If Data Analytics is what you’re looking for then we have our Applied Data Analytics Bootcamp, London.