Aftermath of an Earthquake in Bhaktapur Source: Pexels/Sanej Prasad Suwal

Earthquake Data Science

For beginners, with code.

Keno Leon
8 min read · Apr 24, 2023


It’s hard not to be fascinated by earthquakes, especially if you live in an active zone like I do (Mexico City, Mexico). While I am mostly into finance-related data science and AI, I thought it would be a good idea to step outside my usual field just to keep things fresh.

The exercise has so far proven successful (I am still a beginner, though), and here I’ll go over my experience and, of course, take a look at earthquake data…


Collecting Earthquake Data

I couldn’t find a ready-made dataset, but thankfully the kind folks at USGS provide historical earthquake information via their API, so you can build your own. Here, for instance, is how to request a year of data in Python:

import os

import pandas as pd
import requests

# Define the API endpoint and parameters
url = 'https://earthquake.usgs.gov/fdsnws/event/1/query'
year = 2022

params = {
    'format': 'geojson',
    'starttime': f'{year}-01-01',
    # Explicit end-of-day time so that events on Dec 31 are included
    'endtime': f'{year}-12-31T23:59:59',
    'minmagnitude': '3.6',
    'orderby': 'time'
}

# Send the API request, stop on an HTTP error, and parse the response
response = requests.get(url, params=params, timeout=60)
response.raise_for_status()
data = response.json()

# Extract the desired columns from the GeoJSON features
# ('geometry' sits next to 'properties' in each feature, not inside it)
columns = ['time', 'place', 'mag', 'type', 'geometry']
rows = []
for feature in data['features']:
    row = [feature['properties'][col] for col in columns[:-1]]
    row.append(feature['geometry'])
    rows.append(row)

# Create a Pandas DataFrame from the rows
df = pd.DataFrame(rows, columns=columns)

# Make sure the output folder exists before writing the CSV
out_dir = 'EarthQuakeRsrch/Datasets'
os.makedirs(out_dir, exist_ok=True)
df.to_csv(f'{out_dir}/earthquake_data_{year}.csv', index=False)

print(f'Done collecting data for year: {year}')
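If you want more than one year, the same steps can be wrapped in a few small functions and looped. What follows is only a sketch of one way to do it: the names build_params, features_to_frame and collect_years are mine (hypothetical helpers, not part of the USGS API), and the download itself is passed in as a fetch callable so the parsing logic can be tried without hitting the network.

```python
import pandas as pd


def build_params(year, min_magnitude=3.6):
    """Query parameters for one calendar year (hypothetical helper)."""
    return {
        'format': 'geojson',
        'starttime': f'{year}-01-01',
        'endtime': f'{year}-12-31T23:59:59',
        'minmagnitude': str(min_magnitude),
        'orderby': 'time',
    }


def features_to_frame(data):
    """Flatten a GeoJSON FeatureCollection dict into a DataFrame."""
    columns = ['time', 'place', 'mag', 'type']
    rows = [
        [feature['properties'].get(col) for col in columns] + [feature['geometry']]
        for feature in data.get('features', [])
    ]
    return pd.DataFrame(rows, columns=columns + ['geometry'])


def collect_years(first_year, last_year, fetch):
    """Call fetch(params) -> dict once per year and stack the results."""
    frames = [
        features_to_frame(fetch(build_params(year)))
        for year in range(first_year, last_year + 1)
    ]
    return pd.concat(frames, ignore_index=True)
```

With the requests library, fetch could be something like lambda params: requests.get(url, params=params, timeout=60).json(), using the same url as in the script above.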

Here are a couple of rows:

   time           place                       mag  type        geometry
0  1356910502590  1 km N of Coarraze, France  4.8  earthquake  {'type': 'Point', 'coordinates': [-0.233, 43.1...
1  1356905033820  68 km SE of Akutan, Alaska  3.6  earthquake  {'type': 'Point', 'coordinates': [-165.071, 53...

It’s usually a good idea to check for missing values and preformat some columns to make things easier for later use. Note, for instance, that latitude and longitude…
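As a rough illustration of that kind of cleanup, here is a minimal sketch on two made-up rows shaped like the API output. It assumes three things: that time is milliseconds since the Unix epoch, that the GeoJSON coordinates are ordered longitude, latitude, depth (in km), and that you want them as separate columns. The column names longitude, latitude and depth_km are my own choice.

```python
import ast

import pandas as pd

# Two illustrative rows shaped like the API output (values are made up)
df = pd.DataFrame({
    'time': [1672531200000, 1672617600000],
    'place': ['hypothetical place A', None],
    'mag': [4.8, 3.6],
    'type': ['earthquake', 'earthquake'],
    'geometry': [
        {'type': 'Point', 'coordinates': [-99.13, 19.43, 10.0]},
        # After a round trip through CSV the dict comes back as text
        "{'type': 'Point', 'coordinates': [-165.07, 53.5, 35.0]}",
    ],
})

# Count missing values per column
missing = df.isna().sum()

# Convert epoch milliseconds to timezone-aware UTC timestamps
df['time'] = pd.to_datetime(df['time'], unit='ms', utc=True)


def coordinates_of(geometry):
    """Return the coordinate list, parsing the text form if needed."""
    if isinstance(geometry, str):
        geometry = ast.literal_eval(geometry)
    return geometry['coordinates']


# Split [longitude, latitude, depth] into separate numeric columns
coords = pd.DataFrame(
    df['geometry'].map(coordinates_of).tolist(),
    columns=['longitude', 'latitude', 'depth_km'],
    index=df.index,
)
df = pd.concat([df.drop(columns='geometry'), coords], axis=1)

print(missing)
print(df)
```

The string check matters if you reload the CSV written earlier, since pandas reads the geometry column back as plain text rather than as dictionaries.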
