Skip to content
12 min read plotly plotting multi-index geopandas api

Bamboo Weekly #177: European Summer (solution)

Get better at: Working with Excel files, APIs, GeoPandas, plotting with Plotly, and multi-indexes.

Bamboo Weekly #177: European Summer (solution)

Summer is always hot, but it has been particularly hot in much of Europe – so much so that many people have died from the heat (https://www.usnews.com/news/world/articles/2026-06-28/france-records-around-1-000-additional-deaths-amid-extreme-heat-wave-leading-to-european-records).

This week, we're looking at European weather data from the last two months. We'll see where it was particularly hot — and along the way, we'll also perform all sorts of calculations on European capital cities, including some with the help of GeoPandas.

Data and five questions

We'll use two types of data: First, an Excel file that I prepared with European country names, capital cities, and locations (i.e., latitude, longitude, and altitude).

Just after sending yesterday's questions, I realized that I hadn't included the file of European capitals. I did fix it on the BW site, but I'm enclosing it here if you happened to miss it yesterday:

Aside from that data, we also used https://open-meteo.com/, a free weather-reporting service with an API, to retrieve weather information about these cities.

Paid subscribers, both to Bamboo Weekly and to my LernerPython+data membership program (https://LernerPython.com) get all of the questions and answers, as well as downloadable data files, downloadable versions of my notebooks, one-click access to my notebooks, and invitations to monthly office hours.

Learning goals for this week include GeoPandas, multi-indexes, plotting, and using APIs.

Here are my five questions, along with my solutions and explanations:

Read data from European capitals (the provided Excel document) into a Pandas data frame. Based on the data there, find the capital that is furthest in each direction (north, south, east, and west), as well as at the highest and lowest altitudes.

I started off, as usual, by loading up Pandas. However, I also loaded a number of other modules that we'll use later, including (most notably) GeoPandas and requests:

import pandas as pd
import geopandas as gpd
import requests

from datetime import date, datetime
from plotly import express as px

I then loaded the data into a Pandas data frame. Because the file is in Excel format, I used read_excel. I didn't specify this in my question, but I decided not only to read the data into a data frame, but also to create a two-level multi-index with the country and capital city.

Using read_csv, I'm always able to do this with the index_col keyword argument, passing a list of column names – so I expected to be able to do that with read_excel, too. But it turns out that I was wrong; in read_excel, index_col can get a string (for a one-column index) or a list of integers (with the numeric positions of the index columns), but not a list of strings. I thus wrote:


capitals_filename = 'data/bw-177-euro-capitals.xlsx'
euro_capitals = pd.read_excel(capitals_filename, 
                             index_col=[0, 1])

This gave me a data frame with 47 rows 3 columns.

I then asked to find the capital city that's farthest in each direction. I decided to use nlargest and nsmallest to find the three most extreme in any given direction.

I started with farthest west, which means the smallest longitude:

euro_capitals.nsmallest(columns='Longitude', n=3)

I got:

Country	Capital	Latitude	Longitude	Altitude (m)
Iceland	Reykjavik	64.1466	-21.9426	40
Portugal	Lisbon	38.7251	-9.1498	10
Ireland	Dublin	53.3331	-6.2489	20

It makes sense that Iceland (which isn't technically in Europe, but this was the data set I used) would have the most western capital city.

What about farthest east, meaning the largest longitude?

euro_capitals.nlargest(columns='Longitude', n=3)

The results:

Country	Capital	Latitude	Longitude	Altitude (m)
Russia	Moscow	55.7522	37.6156	156
Cyprus	Nicosia	35.1728	33.354	220
Turkey	Ankara	39.9199	32.8543	938

Again, it makes sense that Moscow is the farthest east in Europe. Nicosia (not too far from where I live) is also pretty far east, relative to the rest of Europe. How about north and south?

For north, meaning largest latitude:

euro_capitals.nlargest(columns='Latitude', n=3)

We get:

Country	Capital	Latitude	Longitude	Altitude (m)
Iceland	Reykjavik	64.1466	-21.9426	40
Finland	Helsinki	60.1695	24.9354	26
Norway	Oslo	59.9127	10.7461	23

Yeah, when I was in Reykjavik (a cute city, but you should really see the rest of Iceland), they made a big deal about how it was the northernmost capital city in the world. Also: Not a very large one!

Next, the southernmost European

euro_capitals.nsmallest(columns='Latitude', n=3)

We got:

Country	Capital	Latitude	Longitude	Altitude (m)
Cyprus	Nicosia	35.1728	33.354	220
Malta	Valletta	35.8997	14.5148	56
Greece	Athens	37.9838	23.7278	150

Again, it makes sense, given that Cyprus is basically in the southeastern corner of Europe.

Given that we have the altitude information, I wanted to check that, too. Here's how we can get the highest-altitude capital city:

euro_capitals.nlargest(columns='Altitude (m)', n=3)

The result:

Country	Capital	Latitude	Longitude	Altitude (m)
Andorra	Andorra la Vella	42.5078	1.5211	1023
Turkey	Ankara	39.9199	32.8543	938
San Marino	San Marino	43.9367	12.4464	749

I should have known – Andorra is in the mountains between Spain and France, so it makes sense that they have the highest-altitude capital city. But it turns out that Ankara (not Istanbul!) isn't very far behind.

Finally, the lowest-altitude capital:

euro_capitals.nsmallest(columns='Altitude (m)', n=3)

Here are the results:

Country	Capital	Latitude	Longitude	Altitude (m)
Netherlands	Amsterdam	52.374	4.8897	2
Latvia	Riga	56.946	24.1059	7
Estonia	Tallinn	59.437	24.7535	9

I guess it shouldn't be a surprise that the Netherlands is the lowest.

Turn the data frame into a GeoPandas data frame. Which two capitals are closest to one another? Which are farthest apart? (This is more complex than you might think! Among other things, you'll need to change the projection to CRS 3035, to handle European distances sanely.)

Given all of this information about various European capitals, I thought it might be interesting to find out which are closest and which are farthest apart. This seems like a really simple problem to solve... but it actually has a bunch of complications.

First, if we're going to perform these sorts of geographic calculations, Pandas isn't a good enough tool. We'll want to use GeoPandas, which we've used before in a few Bamboo Weekly issues (but not for some time).

The first task was to create a GeoDataFrame from our regular ol' Pandas data frame. The difference between the two – and it's a big one! – is that GeoDataFrame objects have a geometry column containing some form of geometric information. In our case, we want to have a "point" object describing the capital city's location.

It's important to remember that the GeoDataFrame inherits from DataFrame in Pandas, which means that it can do anything that a DataFrame can, plus all of the geographic stuff. So we'll create a GeoDataFrame, handing it the euro_capitals data frame as an input. We'll then specify the geometry keyword argument, which lets us set geometry – in this case, with the result of invoking gpd.points_from_xy, a function that takes a series for x and a series for y. Fortunately, we have those with euro_capitals['Longitude'] and euro_capitals['Latitude'].

But wait, that's not enough: When working with GeoPandas, you need to take the CRS – coordinate reference system – into account. What do the numbers in the geometry column mean? They could be angles, distances from a known location, or many other things. (The complications stem from the fact that the Earth is something like a sphere, and performing measurements that assume it's like a rectangle lead to all sorts of incorrect calculations. Take that, Flat Earth Society!)

In our case, the coordinates are in EPSG:4326, a name which just rolls off the tongue, but which is pretty standard, in that it describes longitude and latitude.

The final call to create our GeoDataFrame is thus:

gdf = gpd.GeoDataFrame(
    euro_capitals,
    geometry=gpd.points_from_xy(euro_capitals['Longitude'], euro_capitals['Latitude']),
    crs="EPSG:4326",
)

gdf has 47 rows (just like before) and 4 columns (thanks to the added geometry column).

Now that gdf is defined, we can try to calculate the distances between the cities. One seemingly easy way is by invoking gdf.distance , to calculate the distance from each city to each other city. We could apply it across the entire data frame:

dmatrix = gdf.geometry.apply(lambda g: gdf.distance(g))

After doing this, dmatrix will contain a data frame with the distances between each of the cities. Sounds great, right?

Well... it does sound great, but it won't actually work. That's because of the CRS we were using, namely longitude and latitude, which aren't useful for distances. We'll want to change to a CRS known as 3035, which is used by European governments and mapmakers.

To turn our existing data frame into a 3035 dataframe, we use the to_crs method. Then we can use apply to calculate the distances in meters. I then divided the results by 1,000, in order to get km:


gdf_3035 = gdf.to_crs(3035)

dist = (
    gdf_3035
    .geometry.apply(lambda s: (gdf_3035.distance(s)) / 1000)
    .replace(0, pd.NA)
)

Notice the call to replace? That was necessary, because this calculated the distance between every city and every other city, including the distance between each city and itself. Which, of course, will be 0. Which will then be the closest city, making our comparison a bit useless. I thus replaced all 0 values with pd.NA, the newish Pandas version of NaN. This guaranteed that those values would not be included in our calculations.

Now what? I used min, max, idxmin, and idxmax to find the largest and smallest distances for each city. Remember that invoking min on a series will give you the smallest value in that series – but invoking it on a data frame will give you the minimum value for each column. By invoking each of these four aggregation methods, we'll get the nearest city, the distance there, the farthest city, and the distance there.

I performed these calculations, stuck the results in a dictionary, and used that to create a new summary data frame:

summary = pd.DataFrame({
    'nearest':     dist.idxmin(),
    'nearest_km':  dist.min().astype(float),
    'farthest':    dist.idxmax(),
    'farthest_km': dist.max().astype(float),
})

summary

To find the three capital cities that are closest to one another, I just invoked nsmallest on the nearest_km column:

(
    summary
    .nsmallest(columns='nearest_km', n=3)
    [['nearest', 'nearest_km']]
)

The result:

Country	Capital	nearest	nearest_km
Italy	Rome	Vatican City,Vatican City	4.9088719791
Vatican City	Vatican City	Italy,Rome	4.9088719791
Austria	Vienna	Slovakia,Bratislava	55.0176566505

Not surprisingly, Rome and Vatican City are quite close to one another. And Vienna is quite close to Bratislava, Slovakia.

But wait – there's a problem with this data, in that it is counting each pair twice! Given that every pair of cities has a different distance, I think that we can use a sneaky trick to do this a bit better – sorting the rows by distance, and then using .iloc[::2] to get every other row:

(
    summary
    .sort_values('nearest_km')
    .iloc[::2]
    [['nearest', 'nearest_km']]    
    .head(3)
)

Sure enough, we get the right answers, de-duplicated:

Country	Capital	nearest	nearest_km
Vatican City	Vatican City	Italy,Rome	4.9088719791
Austria	Vienna	Slovakia,Bratislava	55.0176566505
North Macedonia	Skopje	Kosovo,Pristina	77.867487374

What about farthest away?

(
    summary
    .sort_values('farthest_km', ascending=False)
    .iloc[::2]
    [['farthest', 'farthest_km']]   
    .head(3)
)

The results:

Country	Capital	farthest	farthest_km
Iceland	Reykjavik	Cyprus,Nicosia	4854.1533864323
Turkey	Ankara	Iceland,Reykjavik	4395.691201527
Malta	Valletta	Iceland,Reykjavik	3964.088883764

Not surprisingly, we see that Reykjavik is ... well, it's far from just about everywhere.