uscrn.get_data

uscrn.get_data(years=None, which='daily', *, n_jobs=-2, cat=False, dropna=False)

Get USCRN archive data.

Sites are stored in separate files for these datasets. If you want to quickly get data for all sites for a short, recent period of time, consider using get_nrt_data().

Note

Variable and dataset metadata are included in the .attrs dict. These can be preserved if you have pandas v2.1+ and save the dataframe to Parquet format with the PyArrow engine.

df.to_parquet('crn.parquet', engine='pyarrow')
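
To check that the metadata survived the round trip, you can read the file back (a quick sketch, assuming the 'crn.parquet' file written above):

import pandas as pd

df2 = pd.read_parquet('crn.parquet')  # PyArrow is used by default when installed
print(df2.attrs)  # variable and dataset metadata restored (pandas v2.1+)
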
Parameters:
  • years (int | Iterable[int] | None) – Year(s) to get data for. If None (default), get all available years. If which is 'monthly', years is ignored and you always get all available years.

  • which (Literal['subhourly', 'hourly', 'daily', 'monthly']) – Which dataset.

  • n_jobs (int | None) – Number of parallel joblib jobs to use for loading the individual files. The default is -2, which means to use one less than joblib’s detected max.

  • cat (bool) – Convert some columns to pandas categorical type.

  • dropna (bool) – Drop rows where all data columns are missing.

Return type:

DataFrame
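
For example, a minimal usage sketch (the year and options here are arbitrary):

import uscrn

# Daily data for 2020, all sites, with categorical columns and
# rows where every data column is missing dropped
df = uscrn.get_data(2020, which='daily', cat=True, dropna=True)

# Variable and dataset metadata are attached in .attrs
print(df.attrs.keys())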

See also

Daily data

Notebook example demonstrating how to use this function to get a year of daily data.