uscrn.get_data
- uscrn.get_data(years=None, which='daily', *, n_jobs=-2, cat=False, dropna=False)
Get USCRN archive data.

Home page: https://www.ncei.noaa.gov/access/crn/

Sites are stored in separate files in these datasets. If you want to quickly get data for all sites for a short, recent period, consider using get_nrt_data() instead.

Note: Variable and dataset metadata are included in the .attrs dict. These can be preserved if you have pandas v2.1+ and save the dataframe to Parquet format with the PyArrow engine: df.to_parquet('crn.parquet', engine='pyarrow')
- Parameters:
  - years (int | Iterable[int] | None) – Year(s) to get data for. If None (default), get all available years. If which is 'monthly', years is ignored and you always get all available years.
  - which (Literal['subhourly', 'hourly', 'daily', 'monthly']) – Which dataset.
  - n_jobs (int | None) – Number of parallel joblib jobs to use for loading the individual files. The default is -2, which means to use one less than joblib's detected max.
  - cat (bool) – Convert some columns to the pandas categorical type.
  - dropna (bool) – Drop rows where all data columns are missing data.
- Return type:
  DataFrame
See also
- Daily data: notebook example demonstrating using this function to get a year of daily data.
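The cat and dropna options correspond to standard pandas operations, which can be sketched without touching the archive. The mini-frame and column names below are hypothetical stand-ins for a loaded file, used only to show the semantics:

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame standing in for loaded USCRN data; the station
# and data column names are illustrative, not the real schema.
df = pd.DataFrame({
    "wban": ["03047", "03047", "53878"],
    "t2m": [10.2, np.nan, 9.8],
    "precip": [0.0, np.nan, 1.3],
})

# dropna=True corresponds to dropping rows where ALL data columns are NaN
# (the middle row here); rows with partial data are kept.
data_cols = ["t2m", "precip"]
df_dropped = df.dropna(subset=data_cols, how="all")

# cat=True corresponds to converting some ID-like columns to the pandas
# categorical dtype, which saves memory when values repeat across rows.
df_dropped = df_dropped.assign(wban=df_dropped["wban"].astype("category"))
print(len(df_dropped), df_dropped["wban"].dtype)
```

Which columns get converted by cat=True is decided by the library; this sketch just shows the kind of transformation involved.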