Administrative note: I'll be holding office hours on Thursday, May 14th. More info with a Zoom invite will go out shortly.
I'm in Long Beach, California, getting ready for PyCon US, the annual conference of the Python community in the United States. It's always fun to attend PyCon, between the old friends I can chat with, the new friends I meet, and the many things that I learn. I'm giving a tutorial on Wednesday and a talk at the education summit on Thursday.
I'll once again have a booth, telling people about the LernerPython platform, including Bamboo Weekly, PythonDAB (Python Data Analytics Bootcamp), HOPPy (Hands-on Projects in Python), and agentic coding courses. I've got T-shirts, stickers, and book giveaways – so if you're in Long Beach, come on by and say "hi"!
I've always known that Long Beach was a city just south of Los Angeles, but I didn't really know much about it. That changed during the covid-19 pandemic, when I kept reading about the many cargo ships waiting to enter and unload in Los Angeles and Long Beach. I even saw part of the port on my morning walk, and again when I went up to my hotel's rooftop deck.
This week, we'll look into the port in Long Beach – how much cargo is imported and exported through here, and the main countries with which the US trades via Long Beach. We'll use data from the Port of Long Beach itself, along with additional information from the US Department of Transportation's Bureau of Transportation Statistics.
Paid subscribers, both to Bamboo Weekly and to my LernerPython+data membership program (https://LernerPython.com), get all of the questions and answers, as well as downloadable data files, downloadable versions of my notebooks, one-click access to my notebooks, and invitations to monthly office hours.
Learning goals for this week include working with CSV files, pivot tables, merging, plotting with Plotly, and working with dates and times.
Data and six questions
We'll use two data sets this week:
First, we'll use the official monthly statistics from the Port of Long Beach's data portal. Go to the main page at https://polb.com/business/port-statistics, then click on the link to get monthly data from 1987. Download the data in CSV format.
Then we'll get BTS data from https://explore.dot.gov/#/site/BTS/views/ImportsbyCountryandPort/Home. The user interface is... challenging, so we'll limit ourselves to a single report. After registering (for free), you'll want to retrieve:
- HS port-level data
- Port of Long Beach, CA
- All commodities, including the total
- All countries
- All years through 2025
After marking each of these, I clicked on "create report," and asked for CSV files (comma separated) in sparse format. (I would have preferred to get per-month data, but that exceeded the maximum number of rows the site will export.) After several minutes, I got a CSV file downloaded onto my computer.
Here are my six questions for this week. I'll be back tomorrow with solutions and explanations:
- Read the Long Beach data into a Pandas data frame. Ensure the Date column has a datetime dtype. Remove any rows whose dates aren't in "MONTH YEAR" format. Make sure numeric columns have numeric dtypes. Create a line plot showing the total number of loaded TEUs (i.e., 20-foot cargo containers) inbound and outbound over the course of the data set's history, with separate lines for inbound and outbound.
- The standard way to measure a port's size is to add up all TEUs (i.e., 20-foot cargo containers), whether loaded or empty, inbound or outbound. Create a line plot showing the total for each year in the data set, through 2025. Has the Port of Long Beach grown significantly over time? Also: Containers can be loaded or empty, inbound or outbound. Check that the Total column matches the total for loaded and empties. And check that Total also matches the total from loaded inbound and outbound, and also empty inbound and outbound. If these figures don't match, what might the issue (or issues) be? What effect does NaN have?
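If you want to warm up before tackling the real file, here is a minimal sketch of the cleaning and checking steps these questions ask for. It runs on a tiny made-up CSV, not the port's actual data: the column names (other than Date and Total) and all of the numbers are my assumptions for illustration, so expect the real download to look different. It isn't the solution, just one way the pieces might fit together.

```python
import io

import pandas as pd

# Hypothetical miniature CSV. The real file's columns and values will differ.
raw = io.StringIO(
    "Date,Loaded Inbound,Loaded Outbound,Empties,Total\n"
    'January 2024,"1,000",500,300,"1,800"\n'
    'February 2024,"1,200",,400,"1,600"\n'
    "Year to date,,,,\n"
)

# Read everything as strings so we control each conversion ourselves.
df = pd.read_csv(raw, dtype=str)

# Parse "MONTH YEAR" dates. Anything that doesn't fit becomes NaT,
# and we then drop those rows.
df["Date"] = pd.to_datetime(df["Date"], format="%B %Y", errors="coerce")
df = df.dropna(subset=["Date"])

# Strip thousands separators and convert to numbers.
# Values that can't be converted (or are missing) become NaN.
parts = ["Loaded Inbound", "Loaded Outbound", "Empties"]
for col in parts + ["Total"]:
    df[col] = pd.to_numeric(
        df[col].str.replace(",", "", regex=False), errors="coerce"
    )

# Check 1: sum(axis=1) skips NaN by default, treating it like 0.
via_sum = df[parts].sum(axis=1) == df["Total"]

# Check 2: the + operator propagates NaN, so any missing part makes
# the whole row's sum NaN, and NaN never equals anything.
via_plus = (
    df["Loaded Inbound"] + df["Loaded Outbound"] + df["Empties"]
) == df["Total"]

print(pd.DataFrame({"Date": df["Date"], "sum": via_sum, "plus": via_plus}))
```

In this toy data, the February row has a missing outbound value. The two checks disagree on that row, which is a small-scale version of what the last part of the question is getting at: how you add the columns determines what NaN does to your comparison.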