BW #34: House of Representatives

There was big news in the US House of Representatives this week, as members of the Republican party ousted the speaker. This week, we look at some recent election data about the US House.

The legislative branch of the US federal government, the Congress, has two parts: The House of Representatives and the Senate. The House has 435 members, each elected every two years from a particular district in the US. The Senate has 100 members, two per state, elected every six years. Every resident of the US thus has a representative (from their district) and two senators (from their state).

The House of Representatives is led by the Speaker of the House, who is elected by a vote of the full House, which in practice means by the majority party. In 2022, the Republican party got a slim majority of seats in the House, which meant that they got to elect a speaker. But that's where things got weird: Each new session of Congress writes new rules for how it will conduct itself, and this includes the procedure that'll be used to remove or replace the speaker. Kevin McCarthy managed to get himself elected speaker after 15 (!) rounds of voting, but only after agreeing that any single member of the House could force a recall vote on his speakership.

From DALL-E: “Americans voting for members of Congress”

Last week, the House managed to pass a last-minute bill that ensured the US federal government's funding for another 45 days. The bill went on to the Senate, and was then signed by President Joe Biden, ensuring that the government wouldn't shut down. But passing that law required help from the minority (Democratic) party, and that was too much for some Republican members. Earlier this week, Rep. Matt Gaetz of Florida proposed that the speaker be recalled -- and in a vote that took place yesterday, he was.

Things in the House are currently in chaos; there is a temporary speaker, and there are all sorts of thoughts and plans about how to elect a new one. But with such a slim Republican majority, a number of hard-line Republican representatives who are willing to shut down the government, and Democratic minority members willing to watch the Republicans squabble... well, it's entertaining and interesting, but probably not in the best interest of the US in particular or the world as a whole.

Here's an article from Electoral-Vote.com, a site with daily political analysis, looking at (among other things) the current state of affairs in the House: https://www.electoral-vote.com/evp2023/Pres/Maps/Oct04.html

This week, given the news, I thought it might be interesting to look at data about the US House of Representatives, and to see what data we can crunch about it, both past and present.

Data and seven questions

This week's data comes from the MIT Election Lab (https://electionlab.mit.edu/), run by Professor Charles Stewart III. The data itself comes in CSV format, downloadable from the site:

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/IG0UN2

(And no, I can't quite understand why the data is housed at Harvard, either.)

You'll want to download the file 1976-2022-house.tab; look for it toward the lower part of the page, and click on the down arrow next to the file name. You'll be asked to enter some information about yourself, and can then download the file. Note that you'll be given the opportunity to download the data in a variety of formats; I chose the comma-separated (CSV) format, but if you prefer, you can try something else.

The data dictionary, or "codebook" as they call it, is also available for download from the same site. Once again, you can click on the down arrow next to the file name. The data dictionary is in Markdown format, which you can theoretically use a special viewer to read — but to be honest, it was more than readable in my plain-text editor.

This week, I have seven questions and tasks about this data. The learning goals are working with CSV files, grouping, sorting, using stack/unstack, and filtering with `lambda`. I'll be back tomorrow with detailed solutions, including the Jupyter notebook I used to solve the problems myself.

The tasks are:

  • Load the data into a Pandas data frame. We only need the columns `year`, `state`, `state_po`, `district`, `stage`, `candidate`, `party`, `candidatevotes`, and `totalvotes`.
  • How many districts are there per state in 2022? What 10 states have the most districts?
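
If you want a starting point for these first two tasks, here is a minimal sketch of one possible approach; it isn't necessarily how the detailed solutions will do it. To keep it self-contained, it reads from a tiny, invented CSV sample (`SAMPLE_CSV`, with placeholder candidates and vote counts) rather than the real download, and the helper names `load_house_data` and `districts_per_state` are my own. It also assumes that counting distinct `district` values per state is a fair way to count districts.

```python
import io

import pandas as pd

# Invented stand-in for the downloaded file: the real data has many more
# rows and columns, and none of these vote counts are real.
SAMPLE_CSV = """year,state,state_po,office,district,stage,candidate,party,candidatevotes,totalvotes
2022,ALABAMA,AL,US HOUSE,1,GEN,CANDIDATE A,REPUBLICAN,100,150
2022,ALABAMA,AL,US HOUSE,1,GEN,CANDIDATE B,DEMOCRAT,50,150
2022,ALABAMA,AL,US HOUSE,2,GEN,CANDIDATE C,REPUBLICAN,90,140
2022,ALABAMA,AL,US HOUSE,2,GEN,CANDIDATE D,DEMOCRAT,50,140
2022,ALASKA,AK,US HOUSE,0,GEN,CANDIDATE E,DEMOCRAT,80,150
2022,ALASKA,AK,US HOUSE,0,GEN,CANDIDATE F,REPUBLICAN,70,150
2020,ALABAMA,AL,US HOUSE,1,GEN,CANDIDATE A,REPUBLICAN,120,160
2020,ALABAMA,AL,US HOUSE,3,GEN,CANDIDATE G,REPUBLICAN,110,160
"""

# The columns named in the first task
COLUMNS = ['year', 'state', 'state_po', 'district', 'stage',
           'candidate', 'party', 'candidatevotes', 'totalvotes']


def load_house_data(source):
    """Read the CSV, keeping only the columns we care about."""
    return pd.read_csv(source, usecols=COLUMNS)


def districts_per_state(df, year):
    """Count distinct districts per state in one year, largest first."""
    return (
        df.loc[df['year'] == year]
        .groupby('state')['district']
        .nunique()
        .sort_values(ascending=False)
    )


df = load_house_data(io.StringIO(SAMPLE_CSV))
top = districts_per_state(df, 2022).head(10)
print(top)
```

With the real data, you would pass the path of the file you downloaded to `load_house_data` instead of the `StringIO` object. Counting distinct values with `nunique` matters here because each district appears once per candidate, so a plain row count would overstate the number of districts.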