Disclaimer: This article is part of “Secrets of the MSc in Statistics and Data Science at KU Leuven”. Note that the information below is purely subjective and does not represent the official position of the program.
Author: Gunho Lee
About the other articles:
- Secrets of the Master of Statistics and Data Science at KU Leuven (Introduction)
- Belgian University Education System & The Master of Statistics and Data Science at KU Leuven
- KU Leuven Master of Statistics and Data Science: Tips for international students & FAQs
- DataCamp: what is it and how can it help me?
- Python vs R
Did you know that “KU Leuven: the Master of Statistics and Data Science” was called “KU Leuven: Master of Statistics” until a few years ago? While the name change may suggest a shift in curriculum, I personally believe that the focus remains on statistics. Although data science has become a buzzword in recent years, the line between statistics and data science is still blurry. I would like to share some insights on this.
The Era of Data Science
Data science has gained significant attention in recent years, thanks to the increasing importance of data in various industries. It is no longer surprising to find a plethora of data jargon and advertisements in your search results. The demand for data professionals has also surged, as organizations rely on them for their daily business operations. Thanks to data science, statistics is getting more attention than ever before. However, it is not always clear how statistics and data science differ from each other.
“Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”
- Josh Wills -
The definition of a data scientist overlaps with many fields, including statistics, computer science, and business. In my own experience, many statisticians are uncomfortable writing code, and many data scientists struggle with basic statistical concepts. Moreover, traditional statistical methods were developed when data was scarce and expensive. Statisticians tend to be more careful when drawing inferences from data, because those methods are meticulously designed around underlying assumptions. In this sense, I have found that statistics focuses more on inference (what does the data tell us about the underlying process, and how certain are we?), while data science focuses more on prediction. Imagine you are running a business: you may want to analyze your data to increase FUTURE sales, which is a prediction problem.
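To make the inference/prediction distinction a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic (a made-up "ad spend vs. sales" relationship), the helper `fit_line` is hypothetical, and the confidence interval uses a normal approximation rather than the exact t-distribution. The same fitted line is used twice: once to ask "how strong is the effect, and how sure are we?" (inference) and once to ask "how well do we forecast unseen data?" (prediction).

```python
import math
import random
from statistics import NormalDist, mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Returns the intercept a, the slope b, and the standard error of b.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual variance with n - 2 degrees of freedom
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return a, b, se_b


# Synthetic data (an assumption for illustration): sales = 5 + 2 * spend + noise
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [5.0 + 2.0 * x + random.gauss(0, 3.0) for x in xs]

# Hold out the last 50 observations to play the role of "future" data
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

a, b, se_b = fit_line(train_x, train_y)

# Inference: an approximate 95% confidence interval for the slope
z = NormalDist().inv_cdf(0.975)
ci = (b - z * se_b, b + z * se_b)
print(f"slope = {b:.2f}, approx. 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")

# Prediction: mean squared error on the held-out observations
test_mse = mean((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
print(f"held-out MSE = {test_mse:.2f}")
```

A statistician would typically spend most of the effort on the first printed line (is the interval trustworthy, do the model assumptions hold?), while a prediction-focused workflow would mainly care about the second.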
It is common for statisticians to question the validity of their methods, since these methods rest on strict assumptions that must hold for the results to be interpreted correctly. In many cases in the statistical world, the emphasis is on the validity of the method itself. Data science, by contrast, emerged with the rise of “Big Data”, a term that did not exist when most of the classical statistical methods were developed. Data scientists need to handle end-to-end data processing, including data wrangling, and their role has extended to advanced technologies such as computer vision and natural language processing. Statisticians, on the other hand, play a significant role in the biopharmaceutical industry and the social sciences, where general conclusions need to be drawn from limited data.
The Necessity of Statistics IN Data Science
“It’s easy to lie with statistics. It’s hard to tell the truth without statistics.”
- Andrejs Dunkels -
Although the two fields overlap, statistics is essential to data science, and in my view this is often neglected in the field. A strong understanding of statistics helps data scientists understand popular data science methods better (you should be able to fully grasp the WHY before the WHAT). Conversely, data science skills help statisticians become more comfortable with computing, which is crucial in this era.
Master of Statistics and Data Science at KU Leuven
Back to the point: which of the two does the program teach more? I would say the focus lies more on “Statistics” than on “Data Science”. You will learn the fundamentals of statistics through mandatory courses, and if your interest lies in data science, you can design your own individual study program (ISP) with elective courses that offer such skills. However, the program may not meet your expectations if you are looking for intensive programming training in, for instance, Python. In my opinion (purely subjective), statistics requires more precise and rigorous training by experts. In a nutshell, statistics and data science have both similarities and differences, and it is crucial to understand the unique contributions of each field. The rise of data science makes the role of statistics in the field even more important, not less.
Royal Statistical Society: “Data Science and Statistics: different worlds?”
In May 2015, a group of experts, including academic professors and chief data scientists, gathered to discuss the topic “Data Science and Statistics: different worlds?”. The discussion delved deep into the two worlds and dissected their similarities and differences. If you would like to hear more of their ideas, the video is available on YouTube.
Do you have any more questions? Feel free to reach out to me on LinkedIn!