This week, we looked at data having to do with the value of the US dollar against other currencies. A number of news reports, including one earlier this week in the New York Times (https://www.nytimes.com/2025/06/30/business/dollar-decline-trump.html), indicated that the dollar had lost more value in the first six months of 2025 than in any first half in more than 50 years.
This has a number of effects, including making imports and trips abroad more expensive for Americans, while making US exports cheaper abroad. The weaker dollar seems to reflect nervousness among investors regarding the Trump administration's policies, especially those having to do with tariffs and the high levels of debt the proposed budget would create.
Data and five questions
This week's data comes from the Federal Reserve's tracker of foreign exchange rates (https://www.federalreserve.gov/releases/h10/20250630/), described as H.10 on the Fed's system. They have been tracking a number of different currencies against the US dollar for some time.
I used FRED (the online data tool sponsored by the Federal Reserve Bank of St. Louis) to download all of the H.10 values that I could at https://fred.stlouisfed.org/release?rid=17 . I then asked FRED to plot all of them, and downloaded a CSV file from that page of plots.
However, I seem to have maxed out FRED's downloading system; I only got data for DTWEXBGS (broad index), DEXUSEU (dollar-euro), DEXJPUS (dollar-yen), DEXCHUS (dollar-yuan), DEXCAUS (dollar-Canadian dollar), DEXUSUK (dollar-pound), DEXKOUS (dollar-Korean won), DEXMXUS (dollar-Mexican peso), DEXVZUS (dollar-Venezuelan bolívar), DTWEXAFEGS (advanced foreign economy index), DEXINUS (dollar-Indian rupee), and DEXBZUS (dollar-Brazilian real). So those are the ones we'll look at.
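If you'd prefer to pull these series programmatically rather than through FRED's plotting page, the pandas-datareader package can fetch FRED series by their IDs. This is just a sketch, assuming pandas-datareader is installed; it isn't how I built this week's data file:
# A sketch: fetch a few of the H.10 series directly from FRED by ID
import pandas_datareader.data as pdr

series_ids = ['DTWEXBGS', 'DEXUSEU', 'DEXJPUS']   # any of the IDs listed above
fred_df = pdr.DataReader(series_ids, 'fred', start='2020-01-01')
print(fred_df.tail())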
Paid subscribers (including members of my LernerPython+data membership) can download my data file from the bottom of this message, as well as a copy of my Jupyter notebook.
This week's learning goals include working with CSV files, dates and times, filtering, correlations, grouping, and plotting.
Read the data into a data frame. Make sure that the observation_date column is a datetime, which should then be set to the index. Which currencies have been tracked since the start of the Fed's data set? Which was the most recent currency to be added?
Let's start by loading Pandas:
import pandas as pd
I downloaded the data as a CSV file, so to read it into Pandas I used read_csv. By default, read_csv doesn't recognize datetime columns, but leaves them as strings. I thus used the parse_dates keyword argument to convert the values. I also used the index_col keyword argument to ask that the observation_date column be turned into the data frame's index:
filename = 'data/bw-125.csv'
df = pd.read_csv(filename,
                 parse_dates=['observation_date'],
                 index_col='observation_date')
With that in place, I asked you to find which currencies were tracked at the start of the Fed's data set. The data frame contains one column per currency or index. I only wanted currencies, which meant keeping only those columns that start with "DEX". That's most easily done with the filter method:
(
    df
    .filter(like='DEX')
)
Notice that I used the like keyword argument for filter, which basically runs Python's in operator on each of the column names, keeping only those for which it's True. I'm still a bit more partial to regular expressions, and thus passed the regex keyword argument with ^DEX as a value (i.e., 'DEX' anchored to the start of the string):
(
    df
    .filter(regex='^DEX')
)
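By the way, if you're curious about what filter is doing behind the scenes, it's roughly equivalent to selecting columns with a list comprehension. This is just a sketch to illustrate the idea, not the actual Pandas implementation:
# Roughly what filter(like='DEX') does: keep columns whose names contain 'DEX'
df.loc[:, [c for c in df.columns if 'DEX' in c]]

# Roughly what filter(regex='^DEX') does: keep columns whose names match the pattern
import re
df.loc[:, [c for c in df.columns if re.search('^DEX', c)]]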
I then grabbed the first row in the data frame using iloc:
(
    df
    .filter(regex='^DEX')
    .iloc[0]
)
This gave me a series in which the index held the currency codes and the values held each exchange rate on that first day of trading – or NaN if there was no value.
That was the key to finding the currencies that we had from the beginning: I invoked dropna to remove the NaN values. The index entries for the remaining values thus represented the currencies that were there on the first day:
(
    df
    .filter(regex='^DEX')
    .iloc[0]
    .dropna()
    .index
)
The result:
Index(['DEXJPUS', 'DEXCAUS', 'DEXUSUK'], dtype='object')
In other words, the Fed has been tracking the exchange rate with Japan, Canada, and the UK from the start.
But I also asked you to find the most recently added currency. Once again, I started by keeping only the currency columns, using filter with regex.
But then I wanted to find the date on which the first non-NaN value appeared in each column. To do this, I used the first_valid_index method, which returns the first index with a non-NaN value.
Running it on the data frame doesn't really help here, since it'll give us the first index at which the data frame as a whole has a non-NaN value. But if we run it on each column, we'll get that particular column's first non-NaN index. And while I try to avoid it, sometimes there's no choice but to use apply to run a function on each column in a data frame.
The function, in this case, will be a lambda expression. Its argument will be a column, and we'll invoke first_valid_index on that column. The result will be the first index in each column for which we have a non-NaN value.
I passed axis='rows', which worked fine. This might seem counterintuitive, given that I'm running it on each column. I like to think of it as saying, "I want to get a new row, sharing the column names of the data frame."
(
    df
    .filter(regex='^DEX')
    .apply(lambda c_: c_.first_valid_index(), axis='rows')
)
Finally, I invoked sort_values to get them from earliest to latest:
(
    df
    .filter(regex='^DEX')
    .apply(lambda c_: c_.first_valid_index(), axis='rows')
    .sort_values()
)
The result:
DEXJPUS 1971-01-04
DEXCAUS 1971-01-04
DEXUSUK 1971-01-04
DEXINUS 1973-01-02
DEXCHUS 1981-01-02
DEXKOUS 1981-04-13
DEXMXUS 1993-11-08
DEXVZUS 1995-01-02
DEXBZUS 1995-01-02
DEXUSEU 1999-01-04
dtype: datetime64[ns]
Not surprisingly, the euro (from the European Union) is the most recent addition to the Fed's currency comparisons. That started on January 4th, 1999.
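As an aside, there's a way to get the same per-column answer without apply: notna turns the currency columns into booleans, and idxmax then returns the index of the first True value in each column. This is just an alternative sketch, and it assumes that every DEX column contains at least one non-NaN value (for an all-NaN column, idxmax would quietly return the very first index):
(
    df
    .filter(regex='^DEX')
    .notna()         # True wherever we have a value
    .idxmax()        # index of the first True in each column
    .sort_values()
)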
The New York Times article said, "The dollar is off to its worst start to a year in more than half a century." Calculate the percentage drop in the "broad" index (DTWEXBGS) for the first six months of each year starting in 2020.
The "broad" index attempts to capture the general strength of the dollar. I asked you to find the percentage drop in this index in the first half of each year starting in 2020.
For starters, I used loc to retrieve only data from 2020 onward, taking advantage of the fact that when datetime values are in the index, and we specify only the year, Pandas treats the other portions of the date and time as wildcards. We can also use a slice – so in this case, loc['2020':] does the trick, giving us all data from 2020 onward. I used the two-argument version of loc, specifying the column we want ("DTWEXBGS"). The result is a series:
(
    df
    .loc['2020':, 'DTWEXBGS']
)
Next, I got rid of the NaN values in this series with dropna. 2020 started with a bunch of them, and while they wouldn't affect our calculations, I just didn't want them around. I also invoked reset_index to turn our series into a two-column data frame:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
)
Next, I used assign to add a new year column, grabbing the year from the datetime value with dt.year, and also to add a new month column from dt.month. I didn't have to do this, but it made the subsequent queries easier to write and read.
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
)
I only wanted data from the first six months of the year. Now that I have a month column defined, I can use isin inside of loc, via a lambda, to find rows from the first six months of the year:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
    .loc[lambda df_: df_['month'].isin(range(7))]
)
Now that I have the data I want, I can start to manipulate it. I used groupby to group the rows by year, selected the DTWEXBGS column, and then ran the first and last aggregation methods via the agg method.
I'll note that only after using them did I discover that first and last are deprecated... but they're pretty convenient here, and they still work, so I'm going to ignore the documentation's warnings and use them anyway:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
    .loc[lambda df_: df_['month'].isin(range(7))]
    .groupby('year')['DTWEXBGS'].agg(['first', 'last'])
)
The above gives us a two-column data frame. The index contains the years (2020 through 2025), and the columns are "first" and "last", showing the first and last values in the first half of each year:
first last
year
2020 120.6757 120.4746
2021 111.2143 112.5613
2022 115.4243 120.9718
2023 122.0844 119.7118
2024 119.6165 124.5172
2025 129.6666 120.0786
How can we use this to determine the degree to which the dollar has declined? We can run pct_change on our data frame, passing axis='columns' so that it calculates from left to right. This gives us the percentage change from the start of the year to the midpoint (i.e., the end of June). We can then grab the last column, which contains that percentage change:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
    .loc[lambda df_: df_['month'].isin(range(7))]
    .groupby('year')['DTWEXBGS'].agg(['first', 'last'])
    .pct_change(axis='columns')
    ['last']
)
Finally, we can invoke sort_values to find out which year had the biggest drop:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
    .loc[lambda df_: df_['month'].isin(range(7))]
    .groupby('year')['DTWEXBGS'].agg(['first', 'last'])
    .pct_change(axis='columns')
    ['last']
    .sort_values()
)
Here is what I got:
year
2025 -0.073943
2023 -0.019434
2020 -0.001666
2021 0.012112
2024 0.040970
2022 0.048062
Name: last, dtype: float64
Sure enough, 2025 had the biggest first-half drop of any of these years, and by quite a bit, with the index falling about 7 percent since the start of the year.
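One final aside: if you'd rather not rely on the first and last string aggregations, named aggregations with lambdas produce the same start-of-half and end-of-half values. This is just a sketch that reuses the pipeline above, and it assumes (as the original does) that the rows within each year are already in date order:
(
    df
    .loc['2020':, 'DTWEXBGS']
    .dropna()
    .reset_index()
    .assign(year = lambda df_: df_['observation_date'].dt.year,
            month = lambda df_: df_['observation_date'].dt.month)
    .loc[lambda df_: df_['month'].isin(range(7))]
    .groupby('year')['DTWEXBGS']
    .agg(start=lambda s: s.iloc[0],   # first trading day of the year
         end=lambda s: s.iloc[-1])    # last trading day of June
)
From there, (end - start) / start gives the same percentage change as pct_change; for 2025, (120.0786 - 129.6666) / 129.6666 is about -0.074, matching the roughly 7 percent drop we saw above.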