BW #33: Fracking

Fracking is a commonly used technique to extract petroleum products. It also seems to use a great deal of water. How much water does it use, and is the same amount needed everywhere? Let's check!

However much we might want and need to move away from fossil fuels, our world continues to depend on them to a huge degree. Over the last few decades, a growing share of oil and gas has come not from traditional oil rigs, but rather from a technique known as "fracking" (https://en.wikipedia.org/wiki/Fracking). That technique, also known as "hydraulic fracturing," forces oil and gas to emerge from deep underground by injecting high-pressure fluid. The good news is that fracking has made it easier and cheaper to extract petroleum from the ground.

The bad news? There's actually quite a bit of it: First, lower prices for oil and gas encourage us to use more of them, just the opposite of what we need to combat climate change. Second, there are numerous worries about the environmental problems associated with fracking, from polluted groundwater to earthquakes.

Dall-E, in response to: A car (in its entirety) on the beach. A nozzle is filling the car not with gas, but with water from the ocean.

I had learned a bit about fracking several years ago from NPR's Planet Money podcast. In a series they did on oil, they discussed the history of fracking, and its good and bad sides: https://www.npr.org/sections/money/2016/08/17/490375230/oil-3-how-fracking-changed-the-world

This week, the New York Times reported that there's a separate environmental issue with fracking, namely the huge amounts of groundwater that oil companies are using. This is especially a problem in the Western US, where there is a serious water shortage.

The NYT story said that they had used data from FracFocus (https://fracfocus.org/), a registry of fracking sites that gets data from 27 states. I thought that this might be an interesting data set for us to look at this week. And indeed, that's what we'll do.

Data and eleven questions

This week, we'll look at the data from FracFocus. The files are all in CSV format, and can be downloaded via this link:

    https://fracfocusdata.org/digitaldownload/FracFocusCSV.zip

The download page at https://fracfocus.org/data-download has full information about the data files. Note that a data dictionary comes with the files, in a document called `readme.txt`. That file seems to describe the SQL version of the data rather than the CSV version, but it is close enough to make sense of the columns.

I have eleven tasks and questions for you this week. The learning goals are working with multiple files, parsing and handling weird date formats, formatting numbers in Pandas, pivot tables, selecting columns based on another query, grouping, cleaning data, and looking into the differences between mean and median.

I'll be back tomorrow with my full answers to these questions, along with the Jupyter notebook that I used to solve them:

  • Import the "FracFocusRegistry" CSV files into a single data frame. Include only the columns `JobStartDate`, `TotalBaseWaterVolume`, `StateName`, `CountyName`, and `FederalWell`. Don't bother trying to parse the `JobStartDate` column into datetimes just yet. Leave the values as strings.
  • Now convert `JobStartDate` to a datetime dtype. If a date string cannot be parsed, then its value should be `NaT`.
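
If you want a starting point for these first two tasks, here is a minimal sketch of one way to approach them. It is not necessarily the approach I'll show in my solutions. It assumes that you have unzipped the download into a single directory and that the files match the pattern `FracFocusRegistry*.csv`. The function names `load_registry` and `parse_start_dates` are my own inventions for illustration, and I'm relying on `pd.to_datetime` being able to infer the date format on its own.

```python
from pathlib import Path

import pandas as pd

# The five columns named in the first task
COLUMNS = [
    "JobStartDate",
    "TotalBaseWaterVolume",
    "StateName",
    "CountyName",
    "FederalWell",
]


def load_registry(directory, pattern="FracFocusRegistry*.csv"):
    """Read every matching CSV file and stack the results into one data frame.

    The file-name pattern is an assumption; adjust it to match your files.
    """
    filenames = sorted(Path(directory).glob(pattern))

    # usecols keeps only the columns we care about, which saves memory.
    # We don't pass parse_dates, so JobStartDate stays as strings.
    frames = [
        pd.read_csv(one_filename, usecols=COLUMNS)
        for one_filename in filenames
    ]

    # ignore_index gives the combined data frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


def parse_start_dates(df):
    """Return a copy of df in which JobStartDate has a datetime dtype.

    errors="coerce" turns any string that cannot be parsed into NaT
    instead of raising an exception.
    """
    return df.assign(
        JobStartDate=pd.to_datetime(df["JobStartDate"], errors="coerce")
    )
```

You would then call `df = load_registry('FracFocusCSV')` followed by `df = parse_start_dates(df)`, replacing `'FracFocusCSV'` with wherever you unzipped the files. Splitting the work into two steps mirrors the two tasks, and makes it easy to look at the raw strings before deciding how to parse them.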