BW #120: Pennies (solution)

Is the US going to abolish the penny? It would seem so; the US Treasury recently announced that it was no longer going to mint pennies (https://www.nytimes.com/2025/05/22/us/politics/penny-manufacturing.html?unlocked_article_code=1.Kk8.pHPc.l144zQJcZEc_&smid=url-share). This led me to wonder how many other countries still have coins equivalent to one penny – and in general, about coins around the world.

This week, our data will come from Numista (https://en.numista.com/), a site with numismatic (i.e., coin-related) information. The site is mainly aimed at coin collectors, including buyers and sellers. But it includes information about enough coins currently in use for our purposes, allowing us to analyze coins in a variety of ways.

Data and seven questions

The data comes from Numista's API. You'll need to sign up for a free Numista account, and then create a free API key. Note that it's only free to make up to 2,000 requests per month, so I've purposely tried to limit how many requests were needed to solve this week's problems.

Learning goals for this week include working with APIs, JSON files, string processing, regular expressions, and grouping.

At the end of this message, paid subscribers can download the program I used to retrieve the data from the API. (You'll still need to register for your own API key, though!) They can also download the JSON file I created, my Jupyter notebook, and a one-click entry to Google Colab that lets you run the notebook right away.

Here are my seven tasks and questions:

Assemble a list of dicts about the coins in Numista's database: Using the API, retrieve information about each coin that is in use in 2025. From that data, create a JSON file. The file should be in the form of a list of dicts (or an "array of objects," in JSON parlance). You'll need to sign up for a free Numista account, and then for a free API key at https://en.numista.com/api/api_key.php . You'll then need to use `requests` to ask for `https://api.numista.com/v3/types`, passing a `params` keyword argument with `{'category':'coin', 'q':'circulation', 'count':50, 'year':2025, 'lang':'en'}` as the value, and a `headers` keyword argument with `{'Numista-API-Key': api_key}` as the value, where `api_key` is set to your API key. Iterate, in a `for` loop, from `range(1,10)`, using `requests.get` to the URL you've constructed, adding the current number (from your iteration) to `params['page']`, so as to request the current page. The response will be in JSON, with the data itself available in the `'types'` key, a list of dicts. If the list has length 0, then exit from the loop. Otherwise, add that list of dicts to the list of all coins you've found, and go back for the next page.

I originally thought that "retrieve the data from the API into a data frame" would be a single task. But then I realized how involved it was, and decided that perhaps it would make more sense to divide it into separate tasks.

I thus asked you to start by retrieving the coin information via the Numista API. This API is typical, in that it uses a "REST" URL-based system. The idea is that you make an HTTP query describing the coins you want to learn about. You get back a series of responses, each with brief information about a coin matching the query, including a coin ID. You can then use that coin ID to retrieve information about a particular coin.

To boil it down to the simplest form, we can make a request like this, using requests:

base_url = 'https://api.numista.com/v3'
response = requests.get(base_url + '/types')

(Note that requests is on PyPI; it isn't included in Python's standard library, so you'll need to install it separately.)

However, if you try the above code, it won't work. That's because we need to pass two additional pieces of information:

Parameters, which we set in a dict. These parameters describe the query that we want to make to the Numista server.
Headers, which are sent as metadata with the HTTP request. In our case, there's a single header we need to set and send; the name is 'Numista-API-Key', and the value is the API key that you got from the Numista site.

We can pass these along via the params and headers keyword arguments:

response = requests.get(base_url + '/types',
                        params=params,
                        headers=headers)

In the above code, both params and headers are variables to which we've assigned dictionaries with the appropriate name-value pairs. Here's how I set them more generally:

params = {'category':'coin',
          'q':'circulation',
          'count':50,
          'year':2025,
          'lang':'en'}

headers = {'Numista-API-Key': api_key}

response = requests.get(base_url + '/types',
                        params=params,
                        headers=headers)

The parameters describe the query we want to run. In this case, we're asking for coins, for the string 'circulation' to appear in the coin's description, for the coin to be relevant and active in the year 2025, and for the results to be returned in English. We also indicate that we want to get a maximum of 50 records back with each request, although from what I can tell, that's already the default.
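To see how these parameters end up in the request, here is a minimal sketch using `urllib.parse.urlencode` from the standard library. It only approximates what `requests` does when it builds the URL from `params`, and no request is actually sent.

```python
from urllib.parse import urlencode

base_url = 'https://api.numista.com/v3'

params = {'category': 'coin',
          'q': 'circulation',
          'count': 50,
          'year': 2025,
          'lang': 'en'}

# Encode the dict as a query string; non-string values are converted with str()
query_string = urlencode(params)
full_url = f'{base_url}/types?{query_string}'

print(full_url)
```

Printing the URL this way can be handy for checking a query before spending one of your monthly requests on it.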

If we run this, we will get info back about 50 coins. But which 50? And what if there are more than 50? For that reason, we'll need to add one final parameter, page, which indicates the page of data we want to receive, starting with 1. If the response contains zero records, then we know that we've gone beyond the final page of records.

Now, we don't directly get records in the response object, which I assign to response. Rather, we get everything having to do with the response, including response headers. The content itself is in a Python bytestring, which we can retrieve via the content attribute.

In theory, we can take this content and pass it to json.loads, which takes a string or bytestring containing JSON and returns the corresponding Python objects. But we can do better than this, invoking the json method provided by requests, which does the decoding for us. In our case, it returns a dict whose 'types' key contains the list of Python dicts, each one representing one record that we got from Numista.
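The json.loads route can be tried without touching the network. The sketch below uses a made-up bytestring, shaped like the response described above (the field values are invented for illustration), and decodes it by hand. Calling response.json() on a real response should give the same kind of result.

```python
import json

# Invented stand-in for response.content; real responses contain more fields
fake_content = b'{"types": [{"id": 1, "title": "Sample coin A"}, {"id": 2, "title": "Sample coin B"}]}'

decoded = json.loads(fake_content)   # json.loads accepts str or bytes
records = decoded['types']           # the list of dicts, one per coin

print(type(decoded).__name__, len(records))
```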

Here, then, is what my loop looks like:

import requests

base_url = 'https://api.numista.com/v3'
api_key = 'YOUR_API_KEY_HERE'

params = {'category':'coin',
          'q':'circulation',
          'count':50,
          'year':2025,
          'lang':'en'}

headers = {'Numista-API-Key': api_key}

all_coins = []

for page_number in range(1, 10):
    print(f'Retrieving page {page_number}...')

    params['page'] = page_number
    response = requests.get(base_url + '/types',
                            params=params,
                            headers=headers)

    data = response.json()['types']

    if len(data) == 0:
        print(f'No data at page {page_number}; exiting')
        break

    print(f'\tGot {len(data)} records; adding and continuing...')

    all_coins += data

Notice that I've set the api_key variable to a bogus string; you'll need to put your own API key in there for this to work correctly.

Also notice that I assign to params['page'] from inside of the for loop, because I want to increment the page number each time.

I also did here what I often do when retrieving data, namely have it print out a status report for each iteration.

The result of this code is a list of Python dicts, each representing a coin currently in circulation in 2025.
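The stopping rule in this loop can be tried out without spending any API requests. The following is a minimal sketch, not the code used for this issue: `collect_pages` and `fake_fetch` are hypothetical names, and the fake data stands in for Numista's responses.

```python
def collect_pages(fetch_page, max_pages=9):
    """Call fetch_page(1), fetch_page(2), ... until a page comes back empty."""
    all_records = []
    for page_number in range(1, max_pages + 1):
        records = fetch_page(page_number)
        if len(records) == 0:
            break                      # went past the final page
        all_records += records
    return all_records


# Hypothetical stand-in for the API: 120 records, served 50 per page
fake_records = [{'id': i} for i in range(120)]

def fake_fetch(page_number, count=50):
    start = (page_number - 1) * count
    return fake_records[start:start + count]


coins = collect_pages(fake_fetch)
print(len(coins))
```

With the real API, `fetch_page` would wrap the `requests.get` call and return `response.json()['types']`.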

Create a JSON file based on the data you've retrieved: Iterate over the list of coin dicts you created. For each one, retrieve from `https://api.numista.com/v3/types/COIN_ID`, where `COIN_ID` is the `'id'` value from the current coin. You'll have to pass the same `'Numista-API-Key'` header as before. The only parameter you'll need to send is `'lang'`, indicating `'en'` for English. Take each returned object, decode it from JSON, and append it to a list of dicts. In the end, use `json.dump` to write that list to a JSON file.

We have a list of dicts, and each dict gives the overview of a coin in the Numista database. But we need to get more details, which means making one additional API call for each coin. Once again, we can boil our request down to:

response = requests.get(f'{base_url}/types/{coin_id}',
                       headers=headers,
                       params={'lang': 'en'})

The headers will be unchanged from last time. But the parameters, as we can see, only include the language. The most important factor is coin_id, which we grabbed from the id key in the coin info.

Assuming that we got back a valid response (i.e., one with a status code of 200), we can then invoke response.json() on this response, and add it to a list of coins we've collected (no pun intended) across our for loop:

output = []


for index, one_coin in enumerate(all_coins, 1):
    coin_id = one_coin['id']
    response = requests.get(f'{base_url}/types/{coin_id}',
                           headers=headers,
                           params={'lang': 'en'})

    if response.status_code == 200:
        coin_data = response.json()
        print(f'[{index:4}] Got coin ID {coin_id}, {coin_data["title"]}')
        output.append(coin_data)

Notice how I use enumerate to get not just each coin's dict, but also the current number of the coin, starting with 1. We don't need that, but I used it to print a log of what we were doing at each point, so that I can get a sense of what the program is up to.

Finally, we can take the output list and write it to a file:

import json

print(f'Writing {len(output)} records...')

with open('coins.json', 'w') as outfile:
    json.dump(output, outfile)

print('\tDone.')

I used with around the invocation of open to ensure that our file not only gets the content we want to give it – the result of invoking json.dump on our list of dicts – but also that the file is flushed and closed. The result is a JSON file that we can load back into Python using json.load or (as we'll see in the next solution) using pd.read_json.

As of this stage, we have a JSON file containing all of the data from the Numista database for coins in current circulation.
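As a quick sanity check, the file can be read back with json.load and compared against what was written. This sketch uses invented records and a temporary directory, so it doesn't touch the real coins.json.

```python
import json
import os
import tempfile

# Invented records standing in for the coin details collected above
sample_output = [{'id': 1, 'title': 'Sample coin A'},
                 {'id': 2, 'title': 'Sample coin B'}]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'coins.json')

    # Write the list of dicts as a JSON array of objects
    with open(path, 'w') as outfile:
        json.dump(sample_output, outfile)

    # Read it back into Python objects
    with open(path) as infile:
        loaded = json.load(infile)

print(loaded == sample_output)
```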
