timeserio.model_selection.time_series_split module

class timeserio.model_selection.time_series_split.PandasTimeSeriesSplit(groupby, datetime_col, n_splits=3, max_train_size=None)

Bases: sklearn.model_selection._split._BaseKFold

Apply a sklearn TimeSeriesSplit to multiple time series held in a single DataFrame.

The dataframe should be ordered by date ascending for each time series, and the index should be unique.
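The ordering and uniqueness requirements above can be met with plain pandas before splitting. The following is a minimal sketch, not part of timeserio itself; the column names `id`, `date`, and `value` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical input: two series ("a" and "b") with rows out of order.
df = pd.DataFrame({
    "id": ["b", "a", "b", "a"],
    "date": pd.to_datetime(
        ["2020-01-02", "2020-01-01", "2020-01-01", "2020-01-02"]
    ),
    "value": [4, 1, 3, 2],
})

# Sort by date within each group, then rebuild a unique 0..n-1 index.
df = df.sort_values(["id", "date"]).reset_index(drop=True)

# Check both requirements: unique index, ascending dates per group.
assert df.index.is_unique
assert df.groupby("id")["date"].apply(lambda s: s.is_monotonic_increasing).all()
```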

Parameters
  • groupby (Union[str, List[str]]) – The column name(s) to group the input dataframe by. Each group should hold a monotonically increasing time series.

  • datetime_col (str) – The name of the datetime column, used to validate that each group in the dataframe is a time series.

  • n_splits (int) – Number of splits. Must be at least 2. Defaults to 3.

  • max_train_size (Optional[int]) – Maximum size for a single training set. Defaults to None.

split(df, y=None, groups=None)
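To illustrate what splitting several time series in one dataframe means, the sketch below applies sklearn's TimeSeriesSplit to each group separately and concatenates the per-group indices fold by fold. It is an approximation of the idea under stated assumptions, not timeserio's actual implementation: the helper name `grouped_time_series_split` and the columns `id`, `date`, and `value` are hypothetical, the datetime validation is omitted, and every group is assumed to have enough rows for `n_splits` folds.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit


def grouped_time_series_split(df, groupby, n_splits=3, max_train_size=None):
    """Yield (train, test) positional indices, splitting each group separately.

    Hypothetical helper for illustration; assumes rows are already sorted
    by date within each group.
    """
    # Positional row indices of each group.
    group_positions = list(df.groupby(groupby).indices.values())
    # One independent TimeSeriesSplit generator per group.
    splitters = [
        TimeSeriesSplit(n_splits=n_splits, max_train_size=max_train_size).split(pos)
        for pos in group_positions
    ]
    for _ in range(n_splits):
        train_parts, test_parts = [], []
        for pos, splitter in zip(group_positions, splitters):
            train_idx, test_idx = next(splitter)
            # Map within-group indices back to positions in the full dataframe.
            train_parts.append(pos[train_idx])
            test_parts.append(pos[test_idx])
        yield np.concatenate(train_parts), np.concatenate(test_parts)


# Two series of four daily observations each, stored in one dataframe.
df = pd.DataFrame({
    "id": ["a"] * 4 + ["b"] * 4,
    "date": list(pd.date_range("2020-01-01", periods=4)) * 2,
    "value": range(8),
})

folds = list(grouped_time_series_split(df, groupby="id", n_splits=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In every fold, each group's test rows come strictly after that group's training rows, so no series leaks future observations into its own training set.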