Datasets Submodule

LTSF Datasets

LTSF (long-term time series forecasting) is a collection of datasets commonly used to benchmark forecasting algorithms. Performance is typically reported as the mean squared error (MSE) and mean absolute error (MAE) at several forecasting horizons: 96, 192, 336, and 720 time steps.
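The reporting convention above can be sketched in plain Python. This is a minimal illustration, not part of torchcast: the helper names (`mse`, `mae`, `evaluate_horizons`, `repeat_last`), the synthetic sine series, and the input length of 336 steps are all assumptions made for the example. It scores a single forecast window per horizon, whereas published benchmarks usually average over many sliding windows.

```python
import math

# Forecasting horizons commonly reported for LTSF benchmarks.
HORIZONS = (96, 192, 336, 720)


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def mae(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def evaluate_horizons(series, input_length, forecaster, horizons=HORIZONS):
    """Score one forecast window per horizon.

    Returns a dict mapping each horizon to an (mse, mae) tuple.
    """
    history = series[:input_length]
    results = {}
    for horizon in horizons:
        target = series[input_length:input_length + horizon]
        if len(target) < horizon:
            raise ValueError(f"series too short for horizon {horizon}")
        pred = forecaster(history, horizon)
        results[horizon] = (mse(pred, target), mae(pred, target))
    return results


def repeat_last(history, horizon):
    """Naive baseline (hypothetical): repeat the last observed value."""
    return [history[-1]] * horizon


# Synthetic hourly series with a daily cycle, long enough for the 720 horizon.
series = [math.sin(2 * math.pi * t / 24) for t in range(336 + 720)]
scores = evaluate_horizons(series, 336, repeat_last)
for horizon, (mse_value, mae_value) in scores.items():
    print(f"horizon={horizon:4d}  MSE={mse_value:.4f}  MAE={mae_value:.4f}")
```

Any callable that takes the history and a horizon and returns a list of that length can be passed in place of `repeat_last`.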

class torchcast.datasets.ElectricityLoadDataset(path: str | None = None, split: str = 'all', download: str | bool = True, scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = 336, return_length: int | None = None)

Electricity Load dataset, obtained from:

https://github.com/laiguokun/multivariate-time-series-data

This is derived from:

https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014

The data has been subsetted and pre-processed relative to that original source. It is sometimes abbreviated as the ECL dataset.

class torchcast.datasets.ElectricityTransformerDataset(path: str | None = None, task: str = '15min', split: str = 'all', download: bool | str = True, scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = 336, return_length: int | None = None)

This is the Zhou et al. electricity transformer dataset, obtained from:

https://github.com/zhouhaoyi/ETDataset

This is sometimes abbreviated as the ETT dataset.

class torchcast.datasets.ExchangeRateDataset(path: str | None = None, split: str = 'all', download: bool | str = True, scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = None, return_length: int | None = None)

This is a record of currency exchange rates, taken from:

https://github.com/laiguokun/multivariate-time-series-data

https://arxiv.org/abs/1703.07015

class torchcast.datasets.GermanWeatherDataset(path: str | None = None, year: int | Iterable[int] = 2020, site: str | Iterable[str] = 'beutenberg', split: str = 'all', download: bool | str = True, scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = 336, return_length: int | None = None)

This is a dataset of weather data from Germany, obtained from:

https://www.bgc-jena.mpg.de/wetter/weather_data.html

This is provided because it was used in the paper:

https://arxiv.org/abs/2205.13504

That paper used only the data from Beutenberg in 2020, which matches the default year and site arguments.
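As a rough sketch of how the configuration described above might be requested explicitly, the hypothetical helper below passes the documented `year` and `site` keywords. The helper name `load_weather_benchmark` is an assumption for illustration, and the download behavior has not been verified here; the import is deferred so that defining the helper does not itself require torchcast or network access.

```python
def load_weather_benchmark(path=None):
    """Build the Beutenberg 2020 configuration (hypothetical helper)."""
    # Imported lazily so that merely defining this helper does not
    # require torchcast to be installed.
    from torchcast.datasets import GermanWeatherDataset

    # Keyword names come from the documented signature; the values shown
    # are also the documented defaults, written out for clarity.
    return GermanWeatherDataset(
        path=path,
        year=2020,
        site='beutenberg',
        download=True,
    )
```

Since the signature accepts an iterable for both `year` and `site`, several years or sites can presumably be requested in one call, although that is not demonstrated here.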

class torchcast.datasets.ILIDataset(path: str, split: str = 'all', scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = 336, return_length: int | None = None)

This dataset describes both the raw number of patients with influenza-like symptoms and the ratio of those patients to the total number of patients in the US, obtained from the CDC. This must be manually downloaded from:

https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html

To download this dataset, click “Download Data”. Unselect “WHO/NREVSS” and select the desired seasons, then click “Download Data”.
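Because this dataset has no download option and `path` is required, it can help to check that the manually downloaded data is present before constructing the dataset. The following is a minimal sketch under that assumption; `load_ili` is a hypothetical helper, not part of torchcast, and whether `path` should point to a file or a directory is not stated above, so the check accepts either.

```python
import os


def load_ili(path, split="all"):
    """Construct ILIDataset after checking the manual download exists.

    Hypothetical helper for illustration only.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"ILI data not found at {path!r}; download it manually from "
            "the CDC FluView portal first."
        )
    # Imported lazily so the path check works even without torchcast.
    from torchcast.datasets import ILIDataset

    return ILIDataset(path, split=split)
```

Failing early with a clear message avoids a less informative error from inside the dataset constructor.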

class torchcast.datasets.SanFranciscoTrafficDataset(path: str | None = None, split: str = 'all', download: str | bool = True, scale: bool = True, columns_as_channels: bool = True, transform: Callable | None = None, input_margin: int | None = None, return_length: int | None = None)

San Francisco traffic dataset, taken from:

https://pems.dot.ca.gov

https://arxiv.org/abs/1703.07015

Monash Archive Datasets

The Monash archive is a collection of time series forecasting datasets in a standard format.

class torchcast.datasets.MonashArchiveDataset(task: str, split: str = 'train', path: str | None = None, download: str | bool = True, transform: Callable | None = None, return_length: int | None = None)

This provides access to all Monash forecasting archive datasets:

https://forecastingdata.org

Godahewa et al., 2021. “Monash Time Series Forecasting Archive.” Neural Information Processing Systems 2021.

UCR/UEA Datasets

The UCR/UEA archive is a collection of time series classification datasets in a standard format. The UCR archive provides univariate time series, while the UEA archive provides multivariate time series.

class torchcast.datasets.UCRDataset(task: str, split: str = 'train', path: str | None = None, download: bool | str = True, transform: Callable | None = None, return_length: int | None = None)

This is the UCR dataset for univariate time series classification, found at:

https://www.timeseriesclassification.com/

class torchcast.datasets.UEADataset(task: str, split: str = 'train', path: str | None = None, download: bool | str = True, transform: Callable | None = None, return_length: int | None = None)

This is the UEA dataset for multivariate time series classification, found at:

https://www.timeseriesclassification.com/

Other Datasets

class torchcast.datasets.AirQualityDataset(path: str | None = None, download: bool | str = True, transform: Callable | None = None, return_length: int | None = None)

This is the De Vito et al. air quality dataset.
