uscrn.get_data

uscrn.get_data(years=None, which='daily', *, n_jobs=-2, cat=False, dropna=False)

Get USCRN archive data.

Sites are stored in separate files for these datasets. If you want to quickly get data for all sites for a short, recent period of time, consider using get_nrt_data().

Note

Variable and dataset metadata are included in the .attrs dict. These can be preserved if you have pandas v2.1+ and save the dataframe to Parquet format with the PyArrow engine.

df.to_parquet('crn.parquet', engine='pyarrow')
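
To check that the metadata survived the round trip, you can read the file back (a quick sketch, assuming the 'crn.parquet' file written above):

import pandas as pd

df2 = pd.read_parquet('crn.parquet')  # PyArrow is used by default when installed
print(df2.attrs)  # variable and dataset metadata restored (pandas v2.1+)
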
Parameters:
  • years (int | Iterable[int] | None) – Year(s) to get data for. If None (default), get all available years. If which is 'monthly', years is ignored and you always get all available years.

  • which (Literal['subhourly', 'hourly', 'daily', 'monthly']) – Which dataset.

  • n_jobs (int | None) – Number of parallel joblib jobs to use for loading the individual files. The default is -2, which means to use one less than joblib’s detected max.

  • cat (bool) – Convert some columns to pandas categorical type.

  • dropna (bool) – Drop rows where all data columns are missing.

Return type:

DataFrame
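
For example, a minimal usage sketch (the year and options here are arbitrary):

import uscrn

# Daily data for 2020, all sites, with categorical columns and
# rows where every data column is missing dropped
df = uscrn.get_data(2020, which='daily', cat=True, dropna=True)

# Variable and dataset metadata are attached in .attrs
print(df.attrs.keys())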

See also

Daily data

Notebook example demonstrating how to use this function to get a year of daily data.