Feeling overwhelmed by Pandas?
Survey after survey confirms that Python is the #1 language for data science.
And the most popular library for working with data in Python? Pandas.
That’s great, for a whole lot of reasons:
Python and Pandas are both open source, meaning not only that they’re free of charge, but that their development is driven by the community of users.
Pandas is extremely powerful and flexible, able to work with a large number of data formats and perform numerous types of useful calculations.
The larger Python ecosystem is immense, meaning that you can leverage tons of open-source packages from elsewhere in the community.
Bottom line? With Python and Pandas, you can do more, in less time, with fewer bugs.
This is great, except for one thing: Pandas is immense, and using it can be overwhelming.
I speak with coders and data scientists at companies around the world. If there’s one thing they say, it’s that they don’t feel confident with Python, even if they have used it for several years. They’re constantly searching on Google, going to Stack Overflow, or — most recently — turning to ChatGPT.
If this is true for Python, then it’s doubly or triply true for Pandas, which is a massive library with its own way of thinking. I often encounter people who have been using Pandas for years, only to discover that they didn’t know about a better, faster, more idiomatic way to do things.
Bamboo Weekly is about fixing this problem. Week by week, feature by feature, little by little. Over time, you’ll become fluent with Pandas, able to solve problems quickly, and with confidence.
Who am I?
I’m Reuven, a full-time trainer in Python and Pandas. I’ve been self-employed since 1995, and after doing programming and consulting projects for years, I decided to switch into training. For more than a decade, that’s all I’ve done. I was a columnist at Linux Journal for 20 years, I’ve written books about programming, and I often speak at technical conferences.
I have to say, I love my work. If I do my job well, then other people can have better, more productive careers.
Why I started Bamboo Weekly
You can only learn through repeated practice. And you’re only going to practice if the topics are interesting and relevant. Far too many Pandas tutorials use made-up data, or data sets that have been cleaned and massaged in advance. They don’t reflect the rough-and-tumble of the real world.
Bamboo Weekly aims to help you improve your Pandas fluency through weekly practice with real-world data sets tied to current events. The idea is to keep you (and me!) motivated and interested, to look at a wide variety of data formats, and to explore different aspects of Pandas:
On Wednesdays, I send a short description of the news item I want to investigate, a question I want to ask, and a data set you can use to answer it.
On Thursdays, I send my solution to the problem, along with an explanation of why I solved it that way.
In the comments, paid subscribers can discuss the question, my solution, and how else we might have solved it. I’m hoping that we’ll be able to have constructive discussions and debates over these techniques, creating a community of people constantly aiming to improve their data-analysis skills.
Over time, you’ll gain experience and fluency with Pandas. You’ll gain the confidence you need to solve problems at work. And you’ll discover, along with me, all sorts of hidden treasures in the Pandas library that can make your work just that much more fun.
Topics I cover
Since starting Bamboo Weekly, I’ve covered a wide range of Pandas topics, including:
Plotting with Seaborn
Joining tables together
Reading multiple files into a single data frame
Cleaning bad data
Scraping data from Web sites
The new PyArrow backend
Each week, I try to cover several different topics. The goal, as always, is to improve your Pandas skills, and your confidence at solving problems.
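To give a rough flavor of two of the topics listed above (reading multiple files into a single data frame, and joining tables together), here is a minimal sketch. The CSV contents, column names, and the regions table are all made up for illustration; they don't come from any actual Bamboo Weekly issue, and a real exercise would read files from disk or a URL rather than from in-memory strings.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV "files", standing in for real files on disk.
jan_csv = io.StringIO("city,sales\nParis,10\nTokyo,20\n")
feb_csv = io.StringIO("city,sales\nParis,15\nTokyo,25\n")

# Read each file, tag its rows with the month it came from,
# then stack everything into a single data frame.
frames = [
    pd.read_csv(f).assign(month=month)
    for month, f in [("jan", jan_csv), ("feb", feb_csv)]
]
sales = pd.concat(frames, ignore_index=True)

# A second, made-up table to join against.
regions = pd.DataFrame(
    {"city": ["Paris", "Tokyo"], "region": ["Europe", "Asia"]}
)

# Join the two tables on their shared "city" column.
merged = sales.merge(regions, on="city", how="left")

# Total sales per region across both months.
totals = merged.groupby("region")["sales"].sum()
print(totals)
```

This is only one way to do it. With many files on disk, the same list comprehension could loop over the results of `glob.glob` instead of a hand-written list.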
I want to hear from you!
I’ve been teaching Python and Pandas for many years, and I’ve been writing about programming and data science for about as long. But Bamboo Weekly is a new kind of publication, different from what I’ve written before. I’m thus hoping — even expecting — to hear from readers with suggestions for topics to discuss, or ways to improve the newsletter.
If you have questions, suggestions for topics to cover, or interesting data sets that I should know about, please send them to me at email@example.com.