Bamboo Weekly #159: State of the Union

Get better at: Excel files, CSV files, Plotting with Plotly, grouping, maps, regular expressions, and joins.

Administrative note: We're having Pandas office hours on Thursday. (Sorry for the late notice!) As usual, you're invited to come with any questions you have about Pandas or data-handling in Python in general. All paid subscribers are welcome, as are members of my LernerPython+data membership program. Full Zoom info is at the bottom of this message. I hope to see you there! A recording will be sent out when we're done.

Yesterday, President Donald Trump delivered the State of the Union address, an annual speech in which the US president summarizes his accomplishments from the previous year and lays out his agenda for the next one. If that sounds like it usually ends up being long and boring, then you're probably right, in part because presidents want to mention anything and everything of importance, and then milk those mentions for political capital during the rest of the year. (At least, that's what I've long read and heard, from presidential aides to "The West Wing.")

However, you don't have to appreciate the format or content of the State of the Union as a speech. You can, instead, appreciate it as a data set, one that we can examine and play with. That's thanks to the American Presidency Project at the University of California, Santa Barbara (UCSB), which hosts an online archive of SOTU speeches and data about them.

Paid subscribers, both to Bamboo Weekly and to my LernerPython+data membership program (https://LernerPython.com), get all of the questions and answers, as well as downloadable data files, downloadable versions of my notebooks, one-click access to my notebooks, and invitations to monthly office hours.

Learning goals for this week include Web scraping, cleaning data, grouping, working with text, and plotting.

Data and six questions

Our primary data for this week's challenges will be the table of SOTU addresses from UCSB:

https://www.presidency.ucsb.edu/documents/presidential-documents-archive-guidebook/annual-messages-congress-the-state-the-union
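
If you want a head start on getting that table into Pandas, here is a minimal sketch rather than my actual solution. It assumes that the table we want is the largest one on the page, which is a guess about the page's layout, and that an HTML parser such as lxml is installed for `pd.read_html`. The names `SOTU_TABLE_URL` and `largest_table` are mine, invented for illustration.

```python
from io import StringIO

import pandas as pd

# The UCSB page named above
SOTU_TABLE_URL = (
    "https://www.presidency.ucsb.edu/documents/"
    "presidential-documents-archive-guidebook/"
    "annual-messages-congress-the-state-the-union"
)


def largest_table(html: str) -> pd.DataFrame:
    """Parse every <table> in the HTML text and return the one with the most rows.

    Assumption: the SOTU table is the biggest table on the page.
    """
    # read_html returns a list of data frames, one per <table> it finds;
    # wrapping the text in StringIO tells Pandas that this is HTML, not a path
    tables = pd.read_html(StringIO(html))
    return max(tables, key=len)
```

You would first fetch the page's text from `SOTU_TABLE_URL` with whatever HTTP client you prefer, and then pass that text to `largest_table`. Handing the URL directly to `pd.read_html` might also work, but some sites turn away requests that don't look like they come from a browser, so I've kept the download and the parsing separate here. Expect the resulting data frame to need cleaning before it's useful.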

We will also look at the text of the State of the Union speeches through 2017, looking at patterns in language and length. Those speeches are available from Kaggle, at:

https://www.kaggle.com/datasets/rtatman/state-of-the-union-corpus-1989-2017
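
Once you've downloaded and unzipped the corpus, one possible way to load it is sketched below. It assumes that each speech is a plain-text file named along the lines of `President_year.txt`, so check the actual filenames before relying on it. The function name `load_speeches` and the column names are hypothetical, chosen for this example.

```python
import re
from pathlib import Path

import pandas as pd

# Assumed filename scheme: "<president>_<4-digit year>.txt"
FILENAME_PATTERN = re.compile(r"^(?P<president>.+)_(?P<year>\d{4})$")


def load_speeches(directory) -> pd.DataFrame:
    """Read every matching .txt file in a directory into one data frame."""
    rows = []
    for path in sorted(Path(directory).glob("*.txt")):
        match = FILENAME_PATTERN.match(path.stem)
        if match is None:
            continue  # skip files that don't fit the assumed naming scheme
        text = path.read_text(encoding="utf-8", errors="replace")
        rows.append(
            {
                "president": match["president"],
                "year": int(match["year"]),
                "text": text,
                # rough word count: split on any run of whitespace
                "word_count": len(text.split()),
            }
        )
    columns = ["president", "year", "text", "word_count"]
    return pd.DataFrame(rows, columns=columns).sort_values("year", ignore_index=True)
```

With one row per speech, questions about length and language become ordinary Pandas operations, such as grouping on `president` or applying string methods to `text`.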

Here are my six tasks and questions for the week; I'll be back tomorrow with my solutions and explanations: