[Want to get experience with agentic coding that doesn't endanger your work, with a mentor who will guide you the whole way? Join my upcoming cohort of HOPPy (Hands-on Projects in Python, https://LernerPython.com/hoppy).
This time, everyone will build an ambitious project of their own choosing, using agentic tools such as Claude Code. The 8-week cohort starts Sunday, so don't delay! You can catch a recording of the info session I did a few hours ago at https://www.youtube.com/watch?v=Wk-KHDHkjkI if you have questions. Or just reach out to me, reuven@lerner.co.il.]
It's June, which means two things for those of us in the Northern Hemisphere: First, everyone is already starting to complain about the hot weather. Second, everyone is discussing their summer vacation plans.
How much vacation time do people get in various countries? And how many paid holidays does each country give its workers? This week, we'll look at a few different data sets having to do with this topic.
Data and five questions
Our main data source is Wikipedia, whose "List of minimum annual leave by country" (https://en.wikipedia.org/wiki/List_of_minimum_annual_leave_by_country) provides a good starting point for comparing countries around the world. We'll retrieve that page, and use its table as our main starting point.
We'll also look at an interesting data set from Expedia, which asks people in a number of rich countries whether they feel deprived about vacation time. I got that from a CNBC article, https://www.cnbc.com/2024/07/01/americans-take-less-time-off-but-europeans-are-more-vacation-deprived.html. We'll use the two (short) CSV files that you can download from that story, one about vacation deprivation, and the other about how many vacation days people in different countries take.
Paid subscribers, both to Bamboo Weekly and to my LernerPython+data membership program (https://LernerPython.com), get all of the questions and answers, as well as downloadable data files, downloadable versions of my notebooks, one-click access to my notebooks, and invitations to monthly office hours.
Learning goals for this week include retrieving data from the Web, joins, regular expressions, cleaning data, and plotting with Plotly.
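To give a flavor of the regular-expression cleaning involved, here is a minimal sketch of how you might strip bracketed footnote markers and pull the lowest number out of a range. It isn't my solution, just one possible approach. The data frame, its column names, and its values are all made up for illustration (they don't come from the Wikipedia table), and the helper functions strip_footnotes and lowest_number are hypothetical names of my own.

```python
import numpy as np
import pandas as pd

# Made-up sample data for illustration only; NOT taken from the Wikipedia table.
raw = pd.DataFrame({
    "country": ["Aland[1]", "Borduria", "Cascadia[note 2]", "Freedonia"],
    "vacation": ["20[3]", "10-15", "", "0"],
})


def strip_footnotes(s: pd.Series) -> pd.Series:
    """Remove bracketed footnote markers such as [1] or [note 2]."""
    return s.str.replace(r"\[[^\]]*\]", "", regex=True).str.strip()


def lowest_number(s: pd.Series) -> pd.Series:
    """Return the smallest number found in each string, or NaN if there is none."""
    found = s.str.findall(r"\d+(?:\.\d+)?")
    return found.apply(
        lambda nums: min(map(float, nums))
        if isinstance(nums, list) and nums
        else np.nan
    )


# Strip footnotes first; otherwise the 3 in "20[3]" would be mistaken for a value.
clean = raw.assign(
    country=strip_footnotes(raw["country"]),
    vacation=lowest_number(strip_footnotes(raw["vacation"])),
)

print(clean)
print(clean.nlargest(2, "vacation"))
```

The order matters here: removing the footnote markers before extracting numbers keeps a footnote's digits from being treated as data. Once the column is a float dtype, methods such as nlargest and nsmallest can be used to find the rows at either extreme.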
Here are my five questions for this week. I'll be back tomorrow with my solutions and explanations:
- Turn the main table from the Wikipedia page into a Pandas data frame. Remove all footnote references, turn missing values into NaN, and take the lowest number from any range of numbers. Turn the vacation, holidays, and total columns into float dtypes. Which five countries have the most mandated vacation days, and which five have the fewest?
- What countries have 0 paid holidays? What countries have more paid holidays than vacation days?