4 min read · Tags: excel grouping multi-index regular-expressions datetime

BW 144: Museum Heists

Get better at: Working with Excel, grouping, cleaning, regular expressions, and multi-indexes.

Before we get into this week's topics, here are three administrative notes:

  1. We'll have office hours on Monday, the 17th; full info will go out this Friday.
  2. If you want to learn about uv, the new Python super-app for packaging, check out my uv crash course at https://uvCrashCourse.com.
  3. A new cohort of PythonDAB, my Python Data Analysis Bootcamp, will be starting on December 4th. Learn more at https://LernerPython.com/bootcamp, or sign up for an info session at https://us02web.zoom.us/webinar/register/WN_t9f0zd-7SkiflqI1Jz7QkA .

And now, onto our regularly scheduled program:

On October 19th, the world was shocked by the theft, in broad daylight, of more than $100 million in jewels from the Louvre in Paris, all in 8 minutes! If you thought that the most famous museum in Paris, if not the world, was well guarded, then you weren't alone. The thieves, dressed as construction workers, entered the gallery via a window using glass cutters, smashed some of the displays, and fled. Some of them have been caught, but others remain at large.

The heist seemed more suited to a Hollywood caper film than the front page of major world newspapers. The New York Times recently described the theft itself in great detail, at https://www.nytimes.com/2025/10/30/world/europe/inside-louvre-jewel-heist.html?unlocked_article_code=1.0k8.3jfi.Fqdb95Os1tWk&smid=url-share .

How common are such heists? When and where do they take place? How many people are involved? And do they often use weapons? These, and other questions, were raised by Sandra Clopés and Marc Balcells, professors at Pompeu Fabra University in Barcelona, Spain, in a recent paper, "The Science of Art Theft: Using Data to Identify Criminal Patterns, 1990–2022" (https://www.cambridge.org/core/journals/international-journal-of-cultural-property/article/science-of-art-theft-using-data-to-identify-criminal-patterns-19902022/838EFE01FB7C7AE9244FB458FF103EB1), published in April in the International Journal of Cultural Property.

The paper covers only heists from 1990 through 2022, but it can still give us an understanding of museum thefts. And indeed, that's our topic for this week: trying to better understand the database of heists used in the paper.

Data and five questions

There are large, complete databases of stolen artworks. However, these are generally behind paywalls or cannot be downloaded. For example, Interpol (the international police network) has a "stolen works of art" database, and even added the Louvre jewels (https://www.interpol.int/News-and-Events/News/2025/Louvre-Museum-theft-Stolen-jewels-added-to-INTERPOL-s-Stolen-Works-of-Art-database). But the database itself can only be searched, not downloaded.

I was thus happy to see that the Clopés and Balcells paper did include data — but only in PDF, and cut in a way that would make analysis difficult. I contacted Professor Clopés, who kindly and quickly sent me an Excel file with their database. You can download it from here:

This week, I have five tasks and questions for you based on this database.

Normally, I only provide a downloadable version of the data to paid BW subscribers and members of my LernerPython subscription program, but this week is an obvious exception to that rule. Tomorrow, I'll provide not only my solutions and explanations, but also my downloadable Marimo notebook and a clickable link to view and explore the data using the Marimo "Molab" facility.

Learning goals for this week include: Working with dates and times, cleaning data, grouping, regular expressions, and handling weird multi-indexes.
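If you want a warm-up before looking at the tasks, here's a minimal sketch of how those techniques can fit together in pandas. Note that the data frame below is entirely made up: The column names and values are my own assumptions for illustration, and aren't taken from the Clopés and Balcells database. The real Excel file may well be laid out differently.

```python
import pandas as pd

# Hypothetical data: column names and values are invented for illustration,
# and are NOT taken from the actual heist database.
columns = pd.MultiIndex.from_tuples(
    [
        ("Incident", "Date"),
        ("Incident", "Country"),
        ("Details", "Description"),
    ]
)
df = pd.DataFrame(
    [
        ["2001-03-15", "France", "2 thieves, armed"],
        ["2001-11-02", "Spain", "1 thief, unarmed"],
        ["2010-07-21", "France", "4 thieves, armed"],
    ],
    columns=columns,
)

# Multi-index: flatten the two-level column names into single lowercase strings
df.columns = ["_".join(levels).lower() for levels in df.columns]

# Dates: convert the date strings into datetime values
df["incident_date"] = pd.to_datetime(df["incident_date"])

# Regular expressions: pull the number of thieves out of the free-text column
df["n_thieves"] = (
    df["details_description"]
    .str.extract(r"(\d+)\s+thie(?:f|ves)", expand=False)
    .astype(int)
)

# Grouping: total thieves per year and country (the result has a two-level index)
summary = df.groupby([df["incident_date"].dt.year, "incident_country"])[
    "n_thieves"
].sum()
print(summary)
```

Grouping by two keys gives back a series with a two-level index, so you can retrieve a single value with a tuple, as in `summary.loc[(2001, "France")]`.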

I'll be back tomorrow with my solutions and explanations. Here are this week's five tasks: