Evaluating the PhD's Value in Data Science Across Countries
Chapter 1: The Global Value of a PhD for Data Scientists
This analysis examines how much a PhD is worth to Data Scientists in different countries. It builds on the previous article, "How much adds a PhD to a Data Scientist’s salary," which covered the United States using Stack Overflow survey data from 2017 to 2020.
Step 1: Data Preparation
The first phase of the analysis covers several tasks: selecting countries with at least 50 responses from Data Scientists, normalizing salaries to thousands of USD per year, removing the top and bottom 5% of salaries, focusing on high-cardinality categorical data, and handling missing values.
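The preparation steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the article's actual code: the column names (`country`, `salary_usd`), the `prepare` helper, and the `"Unknown"` fill value are hypothetical, the input is assumed to be already restricted to Data Scientists, and the 5% trimming is assumed to use quantiles over all remaining rows rather than per country.

```python
import pandas as pd


def prepare(df: pd.DataFrame, cat_cols: list[str],
            min_responses: int = 50, trim: float = 0.05) -> pd.DataFrame:
    """Filter and normalize survey rows (column names are hypothetical)."""
    # Rows without a country or a salary cannot be used at all.
    out = df.dropna(subset=["country", "salary_usd"])

    # Keep only countries with at least `min_responses` rows.
    counts = out["country"].value_counts()
    out = out[out["country"].isin(counts[counts >= min_responses].index)]

    # Express salaries in thousands of USD per year.
    out = out.assign(salary_kusd=out["salary_usd"] / 1000.0)

    # Drop the bottom and top `trim` share of salaries (assumed: global quantiles).
    lo, hi = out["salary_kusd"].quantile([trim, 1.0 - trim])
    out = out[out["salary_kusd"].between(lo, hi)]

    # Assumed handling of missing answers: make them their own category.
    out = out.assign(**{c: out[c].fillna("Unknown") for c in cat_cols})
    return out.reset_index(drop=True)
```

Calling `prepare(raw, cat_cols=["education"])` on a raw survey frame would return the filtered rows with an added `salary_kusd` column.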
Step 2: Implementing a Machine Learning Model
The preprocessed data is then split into training and test sets, and a CatBoostRegressor model, which handles categorical variables natively, is fit on the training set. The resulting model achieves a root mean squared error (RMSE) of approximately 28,000 USD per year, roughly 30% lower than the baseline model's RMSE of around 40,000 USD per year.
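A minimal sketch of this evaluation is shown below. The article does not say what the baseline model is, so predicting the training-set mean is an assumption here, and the CatBoost hyperparameters are placeholders rather than the values actually used. The helper names (`rmse`, `mean_baseline_rmse`, `relative_improvement`, `fit_catboost`) are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mean_baseline_rmse(y_train, y_test) -> float:
    """RMSE of always predicting the training-set mean (assumed baseline)."""
    return rmse(y_test, np.full(len(y_test), np.mean(y_train)))


def relative_improvement(model_rmse: float, baseline_rmse: float) -> float:
    """Share by which the model's RMSE is lower than the baseline's."""
    return 1.0 - model_rmse / baseline_rmse


def fit_catboost(X_train, y_train, cat_features):
    """Fit a CatBoostRegressor; hyperparameters here are placeholders."""
    # Imported lazily so the metric helpers above work without catboost installed.
    from catboost import CatBoostRegressor

    model = CatBoostRegressor(loss_function="RMSE", verbose=0, random_seed=0)
    model.fit(X_train, y_train, cat_features=cat_features)
    return model
```

With the figures reported above, `relative_improvement(28.0, 40.0)` gives 0.30, which means the model's error is about 30% below the baseline's.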
Step 3: Interpreting the Machine Learning Outcomes
For model interpretability, we employ the SHapley Additive exPlanations (SHAP) technique, a widely used method for attributing a model's prediction to its individual input features. SHAP values are expressed in the same units as the target, so here they are also in thousands of USD per year.
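To make the idea concrete, the sketch below computes exact Shapley values by brute force for a tiny, made-up salary model. It is not the algorithm the SHAP library runs on a CatBoost model, and it simplifies by comparing against a single reference respondent rather than a background dataset. The model `toy_salary` and all of its numbers are invented for illustration and do not come from the survey.

```python
from itertools import combinations
from math import factorial


def toy_salary(x: dict) -> float:
    """Made-up additive salary model, in thousands of USD per year."""
    country = {"US": 110.0, "IN": 25.0}[x["country"]]
    degree = {"BSc": 0.0, "MSc": 5.0, "PhD": 12.0}[x["degree"]]
    return country + degree + 2.0 * x["years"]


def shapley_values(model, instance: dict, reference: dict) -> dict:
    """Exact Shapley values of `instance` relative to one reference row."""
    names = list(instance)
    n = len(names)

    def value(subset: set) -> float:
        # Features in `subset` take the instance's value, the rest the reference's.
        mixed = {k: instance[k] if k in subset else reference[k] for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                base = set(subset)
                total += weight * (value(base | {name}) - value(base))
        phi[name] = total
    return phi
```

The property that makes these values readable as salary contributions is additivity: the model's output for the reference respondent plus the sum of the per-feature values equals the model's output for the respondent being explained.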
We first examine the range of SHAP values for various influential features:
Unsurprisingly, the location of the Data Scientist role proves to be the most significant factor affecting annual salaries. Countries like the United States, Switzerland, Norway, Israel, and Denmark show the highest SHAP values:
Additionally, among educational qualifications, the PhD degree exhibits the highest SHAP value, followed closely by the MSc degree:
The differences in SHAP values between PhD and MSc degrees across countries, along with one standard deviation, are as follows:
The PhD carries the largest salary premium in Switzerland, followed by the United States, Israel, Brazil, and Japan. Interestingly, in countries such as India, Brazil, and Turkey, the SHAP value for a PhD remains relatively high even though Data Scientist salaries there are lower overall.
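A per-country comparison of this kind could be computed along the following lines. The input layout is hypothetical: one row per respondent with `country`, `degree`, and `shap_degree` (the SHAP value of the education feature, in thousands of USD per year). The article does not state how its standard deviation was obtained, so the spread below is an assumption that treats the PhD and MSc groups as independent and adds their variances.

```python
import numpy as np
import pandas as pd


def phd_msc_gap(shap_rows: pd.DataFrame) -> pd.DataFrame:
    """Per-country mean SHAP gap between PhD and MSc respondents."""
    stats = (
        shap_rows[shap_rows["degree"].isin(["PhD", "MSc"])]
        .groupby(["country", "degree"])["shap_degree"]
        .agg(["mean", "std"])
        .unstack("degree")
    )
    gap = stats["mean"]["PhD"] - stats["mean"]["MSc"]
    # Assumption: the two groups are independent, so their variances add.
    spread = np.sqrt(stats["std"]["PhD"] ** 2 + stats["std"]["MSc"] ** 2)
    return (
        pd.DataFrame({"gap": gap, "std": spread})
        .sort_values("gap", ascending=False)
    )
```

The result has one row per country, sorted from the largest PhD-over-MSc gap to the smallest.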
Moreover, there is minimal fluctuation in the SHAP values for PhD versus MSc degrees over time:
I trust that these insights will be beneficial. For any questions or comments, feel free to engage in the comments section or connect with me on LinkedIn or Twitter.
Chapter 2: Key Insights from YouTube
This video titled "Which country has the highest PhD Stipend? [+ boosting yours]" provides a detailed overview of the countries offering the best stipends for PhD candidates, along with tips on how to enhance your own stipend.
In the video "Always Look for QS ranking (Top 100) Universities for PhD or MS Admission," viewers will learn about the importance of selecting highly ranked universities for pursuing advanced degrees, which can significantly impact future career opportunities.