Getting Started With NCEDC AWS Public Dataset

This section gives a few quick examples of how to start using the NCEDC dataset. An AWS account is not required to access the dataset, but you may need to install some of the utilities mentioned below.

If you want to use other AWS tools with the Open Dataset, you will need an AWS account.


The AWS Command Line Interface (AWS CLI) is Amazon's command-line utility for accessing AWS resources. It lets you navigate and retrieve files much as you would on a UNIX file system.

We recommend installing it on your system to get started (see the AWS CLI User Guide).

After you have installed the utility, you can run the following command in your terminal to see the contents of the NCEDC dataset.

>aws s3 ls --no-sign-request s3://ncedc-pds/
                           PRE FDSNstationXML/
                           PRE continuous_waveforms/
                           PRE earthquake_catalogs/
                           PRE event_phases/
                           PRE event_waveforms/

The AWS CLI has other commands, such as cp and sync. The command below copies the FDSN StationXML file for station BK.MOD from the Open Dataset bucket s3://ncedc-pds to the current directory.

>aws s3 cp --no-sign-request s3://ncedc-pds/FDSNstationXML/BK/BK.MOD.xml .
download: s3://ncedc-pds/FDSNstationXML/BK/BK.MOD.xml to ./BK.MOD.xml
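
When several StationXML files are needed, the same cp command can be scripted. The following is a minimal Python sketch that builds and runs the command shown above for a list of stations. The helper names build_cp_command and fetch_stationxml are hypothetical, and the key pattern FDSNstationXML/NET/NET.STA.xml is an assumption generalized from the single BK.MOD example; check it against the bucket listing before relying on it.

```python
import subprocess

BUCKET = "ncedc-pds"


def build_cp_command(network, station, dest="."):
    """Return the `aws s3 cp` argument list for one station's StationXML file.

    Assumption: every station follows the FDSNstationXML/NET/NET.STA.xml
    pattern seen in the BK.MOD example.
    """
    key = f"FDSNstationXML/{network}/{network}.{station}.xml"
    return ["aws", "s3", "cp", "--no-sign-request", f"s3://{BUCKET}/{key}", dest]


def fetch_stationxml(stations, dest="."):
    """Run the AWS CLI once per (network, station) pair; requires `aws` on PATH."""
    for network, station in stations:
        # check=True raises CalledProcessError if the copy fails.
        subprocess.run(build_cp_command(network, station, dest), check=True)
```

Calling fetch_stationxml([("BK", "MOD")]) should run the same copy as the command above. Passing the arguments as a list avoids shell quoting issues.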

Python Boto3

Boto3 is the AWS SDK for Python (see AWS SDK for Python (Boto3)).
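
As a minimal sketch of anonymous access with Boto3, the bucket listing shown earlier with aws s3 ls can be reproduced from Python. The function names make_anonymous_client and list_top_level_prefixes are hypothetical helpers introduced here for illustration; the unsigned configuration plays the role of the --no-sign-request flag.

```python
def make_anonymous_client():
    """Create an S3 client that sends unsigned requests (no AWS account needed)."""
    # Imported here so the listing helper below can be used with any
    # client-like object, even where boto3 is not installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client("s3", config=Config(signature_version=UNSIGNED))


def list_top_level_prefixes(client, bucket="ncedc-pds"):
    """Return the top-level "directories" of the bucket, like `aws s3 ls`."""
    # Delimiter="/" groups keys by their first path component.
    response = client.list_objects_v2(Bucket=bucket, Delimiter="/")
    return [entry["Prefix"] for entry in response.get("CommonPrefixes", [])]
```

With network access, print(list_top_level_prefixes(make_anonymous_client())) should list the same prefixes as the CLI output above. A single list_objects_v2 call returns at most 1000 entries, which is enough for the top level but would need pagination for deeper listings.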

This notebook is a simple demonstration of how to use boto3 to access a waveform file from the NCEDC AWS Open Dataset s3://ncedc-pds.

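import obspy
from obspy import read
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Name of the NCEDC Open Dataset bucket.
BUCKET_NAME = 'ncedc-pds'

# Read the waveform file into an ObsPy Stream and print a summary.
# This assumes the file has already been downloaded from the bucket
# to the current directory.
ch = read('MERC.BK.HNZ.00.D.2022.231')
print(ch)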

1 Trace(s) in Stream:
BK.MERC.00.HNZ | 2022-08-19T00:00:00.000400Z - 2022-08-20T00:00:00.000400Z | 100.0 Hz, 8640000 samples
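
The read call above expects the waveform file to be present in the current directory. A minimal sketch of fetching it with Boto3 follows. The helper names continuous_waveform_key and download_waveform are hypothetical, and the key layout continuous_waveforms/NET/YYYY/YYYY.DDD/ is an assumption (only the file name MERC.BK.HNZ.00.D.2022.231 appears in this section); confirm the path with aws s3 ls before use.

```python
def continuous_waveform_key(network, station, channel, location, year, day_of_year):
    """Build the object key for one day-long waveform file.

    Assumption: files live under continuous_waveforms/NET/YYYY/YYYY.DDD/ and
    are named STA.NET.CHA.LOC.D.YYYY.DDD, as in MERC.BK.HNZ.00.D.2022.231.
    """
    day = f"{year:04d}.{day_of_year:03d}"
    filename = f"{station}.{network}.{channel}.{location}.D.{day}"
    return f"continuous_waveforms/{network}/{year:04d}/{day}/{filename}"


def download_waveform(key, bucket="ncedc-pds"):
    """Download `key` anonymously into the current directory; return the local name."""
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # Unsigned requests: no AWS credentials are needed for the public bucket.
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    local_name = key.rsplit("/", 1)[-1]
    s3.download_file(bucket, key, local_name)
    return local_name
```

If the assumed layout holds, download_waveform(continuous_waveform_key("BK", "MERC", "HNZ", "00", 2022, 231)) saves the file under the name that read expects in the example above.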