Version: v1.0.0


module ml.timeseries.timeops

The timeops module provides helper functions for working with time series datasets.

function set_timeseries

set_timeseries(df: DataFrame, time_column: str) → DataFrame

Transforms the DataFrame into a time series enabled DataFrame by setting a DatetimeIndex built from the given column.


Args:

  • `df` (pd.DataFrame): The DataFrame that should be time-indexed
  • `time_column` (str): The name of the column used to build the DatetimeIndex


Returns:

  • `pd.DataFrame`: The DataFrame, including the DatetimeIndex
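The behaviour described above can be approximated with plain pandas. The sketch below is not the module's implementation: `set_timeseries_sketch` is a hypothetical stand-in, and it assumes the time column is parsed with `pd.to_datetime` and kept in the result alongside the new index.

```python
import pandas as pd


def set_timeseries_sketch(df: pd.DataFrame, time_column: str) -> pd.DataFrame:
    """Hypothetical stand-in for set_timeseries, built on plain pandas."""
    result = df.copy()
    # Assumption: the column holds values that pd.to_datetime can parse.
    result.index = pd.DatetimeIndex(pd.to_datetime(result[time_column]))
    return result


raw = pd.DataFrame(
    {
        "timestamp": ["2021-01-01 00:00", "2021-01-01 01:00"],
        "value": [1.0, 2.0],
    }
)
indexed = set_timeseries_sketch(raw, "timestamp")
```

After the call, `indexed.index` is a `DatetimeIndex`, so time-based selection and resampling become available on the result.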

function add_time_reference

add_time_reference(sorted_df: DataFrame, n: int, reference_column: str, new_column: str) → DataFrame

Adds a new column to the DataFrame that contains, for each record, the value of the reference column from n records earlier.


Args:

  • `sorted_df` (pd.DataFrame): A DataFrame that is sorted by time
  • `n` (int): The number of records to look back for the referenced value
  • `reference_column` (str): The name of the column from which the earlier value is taken
  • `new_column` (str): The name of the new column


Returns:

  • `pd.DataFrame`: The DataFrame with the new column added
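Looking back n records is the same idea as a lag feature. The sketch below illustrates it with `Series.shift`; `add_time_reference_sketch` is a hypothetical stand-in rather than the module's code, and it assumes the first n records, which have no earlier value, are filled with NaN.

```python
import pandas as pd


def add_time_reference_sketch(
    sorted_df: pd.DataFrame, n: int, reference_column: str, new_column: str
) -> pd.DataFrame:
    """Hypothetical stand-in for add_time_reference, built on Series.shift."""
    result = sorted_df.copy()
    # shift(n) moves every value n rows down, so each row sees the value
    # from n records earlier. Assumption: the first n rows become NaN.
    result[new_column] = result[reference_column].shift(n)
    return result


readings = pd.DataFrame(
    {"temperature": [20.0, 21.0, 23.0, 22.0]},
    index=pd.date_range("2021-01-01", periods=4, freq="D"),
)
with_reference = add_time_reference_sketch(
    readings, n=1, reference_column="temperature", new_column="temperature_prev"
)
```

The input must already be sorted by time, since the shift operates on row position rather than on timestamps.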

function time_slice

time_slice(df: DataFrame, time_column: str, start: datetime = None, end: datetime = None)

Takes a time series DataFrame and returns only the records within the time slice defined by the start and end dates.


Args:

  • `df` (pd.DataFrame): The time-indexed DataFrame, whose index should be a DatetimeIndex
  • `start` (datetime): The start time of the time slice. When omitted, the time slice begins at the beginning of the index
  • `end` (datetime): The end time of the time slice. When omitted, the time slice ends at the end of the index


Returns:

  • `pd.DataFrame`: The DataFrame containing only the records inside the time slice
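The open-ended start and end behaviour maps naturally onto label slicing of a DatetimeIndex. The sketch below is a hypothetical stand-in (`time_slice_sketch`), not the module's implementation. It leaves out the `time_column` parameter because its role is not described here, and it assumes both bounds are inclusive, which is how pandas label slicing behaves.

```python
from datetime import datetime

import pandas as pd


def time_slice_sketch(
    df: pd.DataFrame, start: datetime = None, end: datetime = None
) -> pd.DataFrame:
    """Hypothetical stand-in for time_slice, using label-based slicing."""
    # A bound of None leaves that side of the slice open.
    # Assumption: both bounds are inclusive, as in pandas label slicing.
    return df.loc[start:end]


daily = pd.DataFrame(
    {"value": range(6)},
    index=pd.date_range("2021-01-01", periods=6, freq="D"),
)
middle = time_slice_sketch(daily, datetime(2021, 1, 3), datetime(2021, 1, 5))
until_second = time_slice_sketch(daily, end=datetime(2021, 1, 2))
everything = time_slice_sketch(daily)
```

Here `middle` holds the records for 3 to 5 January, `until_second` holds the first two days, and `everything` is the full series.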

function get_windows

get_windows(sorted_df: DataFrame, window_size: int, window_stride: int = 1, group_column: str = None, zero_padding: bool = False, remove_group_column: bool = False, target_column: str = None) → np.array

Takes a DataFrame and returns a set of time windows of a given length, optionally grouped by another column.


Args:

  • `sorted_df` (pd.DataFrame): A DataFrame sorted by its DatetimeIndex, containing all time values
  • `window_size` (int): The size of a window, i.e. how many records are added to every window. Consider this a slice of the time series.
  • `window_stride` (int): The number of records between consecutive windows (Default: 1)
  • `group_column` (str): The name of the column by which the time windows are grouped. This could be something like device_id. Optional.
  • `zero_padding` (bool): If True, zeros are added to the first time windows to fill the array before the first values.
  • `remove_group_column` (bool): Indicates whether the group column should be removed from the destination DataFrame
  • `target_column` (str): Used to return a related array of values, taken from the column with this name. Commonly used to specify classes in training sets.

Returns: a tuple with the following objects:

  • `np.array`: A multi-dimensional array with all the windows, optionally grouped
  • `np.array`: An array with all the linked target values
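The interplay of `window_size` and `window_stride` is easiest to see on a small example. The sketch below is a simplified, hypothetical stand-in (`get_windows_sketch`), not the module's implementation. It ignores `group_column`, `zero_padding` and `remove_group_column`, and it assumes that the stride is the offset between the starts of consecutive windows, that the target column is excluded from the window features, and that each window is linked to the target value of its last record.

```python
import numpy as np
import pandas as pd


def get_windows_sketch(
    sorted_df: pd.DataFrame,
    window_size: int,
    window_stride: int = 1,
    target_column: str = None,
):
    """Hypothetical, simplified stand-in for get_windows (no grouping or padding)."""
    # Assumption: the target column is not part of the window features.
    features = sorted_df.drop(columns=[target_column]) if target_column else sorted_df
    values = features.to_numpy()
    # Start positions of every complete window.
    starts = range(0, len(sorted_df) - window_size + 1, window_stride)
    windows = np.array([values[s:s + window_size] for s in starts])
    targets = None
    if target_column:
        # Assumption: a window is labelled with the target of its last record.
        labels = sorted_df[target_column].to_numpy()
        targets = np.array([labels[s + window_size - 1] for s in starts])
    return windows, targets


sensor = pd.DataFrame(
    {
        "temperature": [20.0, 21.0, 23.0, 22.0, 24.0],
        "humidity": [0.40, 0.42, 0.41, 0.45, 0.43],
        "label": [0, 0, 0, 1, 1],
    },
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
)
windows, targets = get_windows_sketch(sensor, window_size=3, target_column="label")
strided, _ = get_windows_sketch(sensor, window_size=3, window_stride=2)
```

With five records and a window size of 3, a stride of 1 yields three windows and a stride of 2 yields two. Each window is a two-dimensional slice of records by columns, so the result is a three-dimensional array.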

function combine_time_ranges

combine_time_ranges(*args: DataFrame)

Combines multiple time series (as DataFrames) and removes the overlapping time sections.


Args:

  • `*args` (pd.DataFrame): One or more DataFrames containing time series with the same layout


Returns:

  • `pd.DataFrame`: A DataFrame containing all unique, ordered time series data from the given DataFrames
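One way to picture the combination is as a concatenation followed by de-duplication on the index. The sketch below is a hypothetical stand-in (`combine_time_ranges_sketch`), not the module's implementation. It assumes that overlapping sections are identified by identical timestamps and that the first occurrence of a timestamp wins.

```python
import pandas as pd


def combine_time_ranges_sketch(*frames: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for combine_time_ranges, built on pd.concat."""
    combined = pd.concat(frames)
    # Assumption: records overlap when they share a timestamp,
    # and the first occurrence is the one that is kept.
    combined = combined[~combined.index.duplicated(keep="first")]
    return combined.sort_index()


first = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.date_range("2021-01-01", periods=4, freq="D"),
)
second = pd.DataFrame(
    {"value": [30, 40, 5, 6]},
    index=pd.date_range("2021-01-03", periods=4, freq="D"),
)
merged = combine_time_ranges_sketch(first, second)
```

The two inputs overlap on 3 and 4 January, so `merged` holds six records, with the values for the overlapping days taken from `first`.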

This file was automatically generated via lazydocs.