BW #64: Coal power

The G7 just announced a plan to dramatically reduce the use of coal for power plants in the next decade. What countries use coal, and are newer plants cleaner than older ones?


[Yes, this issue came out a bit late — between launching Pandas Workout, getting ready for PyCon US, and my usual schedule of corporate training, it has been a bit busy around here. My apologies!]

Global warming (aka climate change) continues to be a major concern, and countries around the world are doing what they can to reduce the degree to which they are making the situation worse. Just yesterday, the Group of Seven ("G7") leading industrialized democracies announced, at a meeting in Turin, Italy, that they aim to largely phase out coal power by 2035. (An Associated Press report is here: https://www.msn.com/en-us/money/companies/g7-nations-commit-to-phasing-out-coal-by-2035-but-give-japan-some-flexibility/ar-AA1nVWIO )

To be honest, I don't normally think about coal very much. I do know that it's often mentioned as a problematic fuel in many countries, and that we would be better off switching to renewable energy sources. But this G7 decision was presented as a big deal, and that led me to wonder just how many coal-fired power plants there are in the world, and how much they pollute.

Fortunately, I discovered Global Energy Monitor (https://globalenergymonitor.org/), whose Global Coal Plant Tracker contains lots of interesting and useful data about precisely this subject. And that's the data we'll be looking at this week, trying to understand which countries are still running coal plants, what kind of coal and process they're using, and how much carbon dioxide they emit.

Data and seven questions

This week's data, as I wrote above, comes from the Global Energy Monitor's Global Coal Plant Tracker. That project's page is here:

https://globalenergymonitor.org/projects/global-coal-plant-tracker/

To download the data, you'll need to go to the "download data" link:

https://globalenergymonitor.org/projects/global-coal-plant-tracker/download-data/

Downloading the data requires giving them your name and e-mail address, and having it sent to you. *OR* you can just click on the following link, which will give it to you directly:

/content/files/wp-content/uploads/2024/02/global-coal-plant-tracker-january-2024.xlsx

I'm normally not thrilled to give such direct links, especially when organizations have all sorts of rules about distributing their data sets. However, this particular data set is distributed under the Creative Commons CC BY 4.0 license, which means that I can redistribute it, so long as I give credit, which I'll do here:

Global Coal Plant Tracker, Global Energy Monitor, January 2024 release

With that out of the way, here are this week's seven questions and tasks. The learning goals include reducing memory usage, grouping, pivot tables, and various types of plotting.

It takes 6-8 hours to research and write each edition of Bamboo Weekly. Thanks to those of you who support my work with a paid subscription! Even if you can’t, it would mean a lot if you would share BW with your Python- and Pandas-using colleagues. Thanks!

  1. Download the Excel spreadsheet, and load the "Units" sheet from that document into a data frame. We'll only want the following columns: "Country", "Capacity (MW)", "Status", "Start year", "Combustion technology", "Coal type", "Region", and "Annual CO2 (million tonnes / annum)".

  2. How much memory does the data frame take up? How much memory do you save by turning columns into categories? Which columns are most (and least) likely to save us memory in this way? Are there any columns that we *could* turn into categories, but shouldn't?
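
As a starting point for the first two tasks, here is a minimal sketch rather than a full solution. The helper names `load_units` and `memory_report` are made up for illustration, and the small `sample` data frame is invented so that the comparison can run without the spreadsheet. Reading `.xlsx` files with Pandas also assumes that the `openpyxl` package is installed.

```python
import pandas as pd

# The columns requested in question 1.
COLUMNS = [
    "Country",
    "Capacity (MW)",
    "Status",
    "Start year",
    "Combustion technology",
    "Coal type",
    "Region",
    "Annual CO2 (million tonnes / annum)",
]


def load_units(path):
    """Load only the requested columns from the "Units" sheet.

    `path` is wherever you saved the downloaded Excel file (an assumption).
    """
    return pd.read_excel(path, sheet_name="Units", usecols=COLUMNS)


def memory_report(df, columns):
    """Return total bytes used before and after turning `columns` into categories."""
    before = df.memory_usage(deep=True).sum()
    converted = df.astype({name: "category" for name in columns})
    after = converted.memory_usage(deep=True).sum()
    return before, after


# Invented stand-in data: a few distinct strings repeated many times.
sample = pd.DataFrame(
    {
        "Country": ["China", "India", "United States", "China"] * 250,
        "Status": ["operating", "retired", "operating", "cancelled"] * 250,
        "Capacity (MW)": [600.0, 210.0, 450.0, 1000.0] * 250,
    }
)

before, after = memory_report(sample, ["Country", "Status"])
print(f"before: {before:,} bytes, after: {after:,} bytes")
```

Passing `deep=True` matters here, because without it Pandas may not count the memory used by the strings themselves. With the real data, you would call `load_units` on the downloaded file and then try `memory_report` with different sets of columns to see which ones are worth converting.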
