BW #15: Eurovision

The annual Eurovision contest, in which nations battle through song and dance, takes place this week. We'll look through historical Eurovision data and examine it with the Seaborn plotting package.

Excited about the festivities and coronation in the United Kingdom? No, not those. I’m talking about Eurovision: By the end of this weekend, a new singer will be crowned the best in the world, having won the annual song contest. Each participating country submits one song to the contest each year, and the song which gets the greatest number of votes is declared the winner. The winning country normally hosts the contest the following year; because Ukraine couldn’t host it safely this year, it’s taking place in Liverpool, England.

Eurovision seems to polarize people. Detractors call it kitschy and garish, with terrible music that appeals to people’s worst instincts. To which Eurovision fans respond: Yes, that’s exactly why we love it! Except for Americans, that is, who barely even know that Eurovision exists, aside from the Will Ferrell comedy made for Netflix.

In honor of this week’s Eurovision contest, and in order to cover some positive and happy topics, we’ll be digging through a public data set listing all Eurovision contest entries through 2020.

This week’s questions all revolve around plotting. The best-known Python plotting software is Matplotlib, and I cannot deny that it’s quite powerful. However, I find its interface frustrating: I almost always end up digging through the documentation to do what I want, and the results are rarely that aesthetically pleasing. For a long time, I’ve avoided direct use of Matplotlib in favor of the Pandas plotting interface, which serves as a wrapper around Matplotlib.

Over the last year, I’ve been using another wrapper around Matplotlib, called Seaborn, for my plotting, and I really like it. Seaborn forces you to think about what you’re trying to say with your plots, rather than how you’re trying to draw them. On top of that, it produces a wide variety of plots that look nice without very much customization.

Data and questions

The data set this week comes from the Eurovision dataset at https://github.com/Spijkervet/eurovision-dataset/blob/master/README.md, created by Janne Spijkervet.

We'll be looking at one of the CSV files available in that data set, listing all contestants and entry songs. You can most easily retrieve it from https://github.com/Spijkervet/eurovision-dataset/releases/download/2020.0/contestants.csv.

Here are the questions I want you to answer:

  • Read the entire contestant CSV file data into a data frame.
  • Create a line plot showing how many countries participated in Eurovision each year. The x axis should show the year, and the y axis should show the number of participants. The plot should have a white background, with grid lines.
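
If you'd like a starting point before trying these yourself, here is one rough sketch of how the two tasks might be approached with Pandas and Seaborn. It is not the only way to do it. The column names `year` and `to_country` are my assumptions about the CSV file's layout, and the function names are made up for illustration, so check `df.columns` after loading and adjust as needed.

```python
import pandas as pd

# URL taken from the text above
CONTESTANTS_URL = (
    "https://github.com/Spijkervet/eurovision-dataset"
    "/releases/download/2020.0/contestants.csv"
)


def load_contestants(url=CONTESTANTS_URL):
    """Read the entire contestants CSV file into a data frame."""
    return pd.read_csv(url)


def participants_per_year(df):
    """Count the distinct participating countries in each year.

    ASSUMPTION: the columns are named 'year' and 'to_country'.
    Inspect df.columns and rename here if the file differs.
    """
    counts = df.groupby("year")["to_country"].nunique()
    return counts.rename("participants").reset_index()


def plot_participants(counts):
    """Draw a line plot of participants per year, white background with grid."""
    # Imported here so the counting helper works without Seaborn installed
    import seaborn as sns

    sns.set_style("whitegrid")  # white background plus grid lines
    ax = sns.lineplot(data=counts, x="year", y="participants")
    ax.set(xlabel="Year", ylabel="Number of participating countries")
    return ax
```

To try it, you could run `plot_participants(participants_per_year(load_contestants()))` in a Jupyter notebook. Counting unique countries, rather than rows, is a deliberate choice here: if a country shows up on more than one row for the same year, it is still counted only once.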