cosinorage.features Module

Module Contents

This module computes a wide range of wearable features from minute-level ENMO data, including:
  • Circadian rhythm features, e.g., IV, IS, RA, M10, L5

  • Physical activity features, e.g., SB, LIPA, MVPA

  • Sleep features, e.g., TST, WASO, SE, SR

class WearableFeatures(handler, features_args={})[source]

Bases: object

A class for computing and managing features from wearable accelerometer data.

This class processes raw ENMO (Euclidean Norm Minus One) data to compute various circadian rhythm and physical activity metrics, including cosinor analysis, non-parametric measures, activity levels, and sleep metrics.

Attributes:
ml_data

Minute-level ENMO data with datetime index

Type:

pd.DataFrame

features_args

Arguments passed to feature computation functions

Type:

dict

feature_dict

Dictionary containing computed features organized by category

Type:

dict

__init__(handler, features_args={})[source]

Initialize WearableFeatures with data from a DataHandler.

Parameters:
  • handler (DataHandler) – DataHandler instance containing ENMO data

  • features_args (dict)

get_features()[source]

Returns the entire feature DataFrame.

Returns:

DataFrame containing all computed features

Return type:

pd.DataFrame

get_ml_data()[source]

Returns the raw ENMO data.

Returns:

DataFrame containing ENMO data with datetime index

Return type:

pd.DataFrame

class BulkWearableFeatures(handlers, features_args={}, cosinor_age_inputs=None, compute_distributions=True)[source]

Bases: object

A class for computing and managing features from multiple wearable accelerometer datasets.

This class processes multiple DataHandler instances to compute features for each and then calculates statistical distributions (mean, std, quartiles, etc.) across all datasets. It provides comprehensive analysis capabilities for cohort studies and large-scale wearable data analysis.

The class handles feature computation failures gracefully, allowing analysis to continue even when some datasets fail to process. It provides both individual feature access and aggregated statistical summaries.

Parameters:
  • handlers (List[DataHandler]) – List of DataHandler instances containing ENMO data. Each handler should have been properly initialized and loaded with data.

  • features_args (dict, optional) – Arguments for feature computation passed to WearableFeatures. Defaults to an empty dict. Common arguments include:
    • ‘pa_params’: Physical activity parameters

    • ‘sleep_params’: Sleep detection parameters

  • compute_distributions (bool, optional) – Whether to compute statistical distributions across all features. If False, only individual features are computed. Defaults to True.

  • cosinor_age_inputs (List[dict], optional) – List of dictionaries containing age and gender information for CosinorAge computation. Must be the same length as handlers if provided. Defaults to None. Each dictionary should contain:
    • ‘age’: Chronological age (float)

    • ‘gender’: Gender (‘male’, ‘female’, or ‘unknown’; optional, defaults to ‘unknown’)

    • ‘gt_cosinor_age’: Ground truth cosinor age (float, optional)

    If all dictionaries contain ‘gt_cosinor_age’, a ‘cosinor_age_prediction_error’ feature will be computed.
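The constraints on cosinor_age_inputs can be sketched as a small pre-check. This is an illustration only: the helper name normalize_cosinor_age_inputs and the choice to raise ValueError are assumptions, not the library's validation logic.

```python
from typing import List, Tuple


def normalize_cosinor_age_inputs(
    inputs: List[dict], n_handlers: int
) -> Tuple[List[dict], bool]:
    """Apply the documented defaults and report whether errors can be scored."""
    if len(inputs) != n_handlers:
        # Assumption: a length mismatch is treated as an error in this sketch.
        raise ValueError("cosinor_age_inputs must have one entry per handler")
    normalized = []
    for entry in inputs:
        if "age" not in entry:
            raise ValueError("each entry needs an 'age'")
        # 'gender' is optional and defaults to 'unknown'.
        normalized.append({"gender": "unknown", **entry})
    # A prediction error needs a ground truth value in every dictionary.
    can_score = bool(normalized) and all(
        "gt_cosinor_age" in entry for entry in normalized
    )
    return normalized, can_score


normalized, can_score = normalize_cosinor_age_inputs(
    [{"age": 25.5, "gt_cosinor_age": 26.2}, {"age": 30.2, "gender": "male"}],
    n_handlers=2,
)
```

Here can_score is False because the second dictionary has no ‘gt_cosinor_age’.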

handlers

List of DataHandler instances provided during initialization

Type:

List[DataHandler]

features_args

Arguments for feature computation

Type:

dict

cosinor_age_inputs

List of age/gender dictionaries for CosinorAge computation

Type:

List[dict]

individual_features

List of feature dictionaries for each handler. Failed computations are represented as None.

Type:

List[dict]

distribution_stats

Statistical distributions across all features. Only populated if compute_distributions=True.

Type:

dict

failed_handlers

List of (handler_index, error_message) tuples for handlers that failed during feature computation.

Type:

List[tuple]
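The relationship between individual_features and failed_handlers can be sketched with a generic loop that records failures instead of raising. The names compute_all and toy_compute are hypothetical; toy_compute merely stands in for the per-handler feature computation.

```python
from typing import Any, Callable, List, Optional, Tuple


def compute_all(
    items: List[Any],
    compute: Callable[[Any], dict],
) -> Tuple[List[Optional[dict]], List[Tuple[int, str]]]:
    """Run `compute` on every item, recording failures instead of raising."""
    individual: List[Optional[dict]] = []
    failed: List[Tuple[int, str]] = []
    for index, item in enumerate(items):
        try:
            individual.append(compute(item))
        except Exception as exc:
            # Keep positions aligned with the input list and remember the error.
            individual.append(None)
            failed.append((index, str(exc)))
    return individual, failed


def toy_compute(value: float) -> dict:
    # Hypothetical stand-in for computing features from one handler.
    if value < 0:
        raise ValueError("negative input")
    return {"cosinor": {"mesor": value}}


individual, failed = compute_all([0.5, -1.0, 0.7], toy_compute)
```

Keeping a None placeholder for each failure preserves the one-to-one mapping between list positions and handlers.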

Examples

>>> from cosinorage.datahandlers import GalaxyDataHandler
>>> from cosinorage.features import BulkWearableFeatures
>>>
>>> # Create multiple handlers
>>> handlers = []
>>> for i in range(3):
...     handler = GalaxyDataHandler(f"data/participant_{i}.csv")
...     handler.load_data()
...     handlers.append(handler)
>>>
>>> # Define age and gender information for CosinorAge computation
>>> cosinor_age_inputs = [
...     {"age": 25.5, "gender": "female", "gt_cosinor_age": 26.2},
...     {"age": 30.2, "gender": "male", "gt_cosinor_age": 31.1},
...     {"age": 28.0, "gender": "unknown", "gt_cosinor_age": 27.8}
... ]
>>>
>>> # Compute bulk features with CosinorAge
>>> bulk = BulkWearableFeatures(
...     handlers,
...     compute_distributions=True,
...     cosinor_age_inputs=cosinor_age_inputs
... )
>>>
>>> # Get statistical summary (includes CosinorAge features)
>>> stats = bulk.get_distribution_stats()
>>> print(f"Computed features for {len(stats)} feature types")
>>>
>>> # Check for failures
>>> failed = bulk.get_failed_handlers()
>>> if failed:
...     print(f"Failed handlers: {len(failed)}")

__init__(handlers, features_args={}, cosinor_age_inputs=None, compute_distributions=True)[source]

Initialize BulkWearableFeatures with multiple DataHandler instances.

Parameters:
  • handlers (List[DataHandler]) – List of DataHandler instances containing ENMO data. Each handler should have been properly initialized and loaded with data.

  • features_args (dict, optional) – Arguments for feature computation passed to WearableFeatures. Defaults to an empty dict. Common arguments include:
    • ‘pa_params’: Physical activity parameters

    • ‘sleep_params’: Sleep detection parameters

  • compute_distributions (bool, optional) – Whether to compute statistical distributions across all features. If False, only individual features are computed. Defaults to True.

  • cosinor_age_inputs (List[dict] | None)

Notes

An empty handlers list is allowed and results in empty individual_features and distribution_stats.

get_individual_features()[source]

Returns the individual feature dictionaries for each handler.

This method provides access to the raw feature dictionaries computed for each handler. Failed computations are represented as None entries in the list.

Returns:

List of feature dictionaries, one per handler. Each dictionary contains nested feature categories (cosinor, nonparam, physical_activity, sleep). If a handler failed during computation, its entry is None.

Return type:

List[dict]

Examples

>>> features = bulk.get_individual_features()
>>> for i, feat in enumerate(features):
...     if feat is not None:
...         print(f"Handler {i}: MESOR = {feat['cosinor']['mesor']:.3f}")
...     else:
...         print(f"Handler {i}: Failed")

get_distribution_stats()[source]

Returns the statistical distributions across all features.

This method provides comprehensive statistical measures for each feature across all successful computations. The statistics include descriptive measures, distribution characteristics, and quartile information.

Returns:

Statistical distributions for each feature. Keys are feature names (e.g., ‘cosinor_mesor’, ‘nonparam_IS’). Values are dictionaries containing statistical measures:
  • count, mean, std, min, max, median

  • q25, q75, iqr (interquartile range)

  • mode, skewness

Return type:

Dict[str, Dict[str, float]]
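A minimal sketch of how nested per-handler dictionaries could be flattened into names such as ‘cosinor_mesor’ and then summarized. The helper names, the sample standard deviation, and the inclusive quantile method are assumptions; the library's exact conventions are not documented here, and mode and skewness are omitted.

```python
import statistics
from typing import Dict, List, Optional


def flatten(features: dict) -> Dict[str, float]:
    """Turn {'cosinor': {'mesor': 0.5}} into {'cosinor_mesor': 0.5}."""
    flat: Dict[str, float] = {}
    for category, values in features.items():
        if not isinstance(values, dict):
            continue
        for name, value in values.items():
            if isinstance(value, (int, float)):
                flat[f"{category}_{name}"] = float(value)
    return flat


def summarize(values: List[float]) -> Dict[str, float]:
    """Descriptive statistics for one feature (needs at least two values)."""
    q25, median, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "median": median,
        "q25": q25,
        "q75": q75,
        "iqr": q75 - q25,
    }


def distribution_stats(individual: List[Optional[dict]]) -> Dict[str, Dict[str, float]]:
    """Collect each flattened feature across successful entries and summarize it."""
    columns: Dict[str, List[float]] = {}
    for features in individual:
        if features is None:  # failed handler
            continue
        for name, value in flatten(features).items():
            columns.setdefault(name, []).append(value)
    return {name: summarize(vals) for name, vals in columns.items() if len(vals) > 1}


individual = [{"cosinor": {"mesor": float(v)}} for v in (1, 2, 3, 4, 5)] + [None]
stats = distribution_stats(individual)
```

Entries that are None are skipped, so count reflects successful computations only.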

Examples

>>> stats = bulk.get_distribution_stats()
>>> mesor_stats = stats['cosinor_mesor']
>>> print(f"MESOR: mean={mesor_stats['mean']:.3f}, std={mesor_stats['std']:.3f}")

get_failed_handlers()[source]

Returns information about handlers that failed during feature computation.

This method provides details about which handlers failed and why, allowing for debugging and quality control in large-scale analyses.

Returns:

List of (handler_index, error_message) tuples for handlers that failed during feature computation. Empty list if all handlers succeeded.

Return type:

List[tuple]

Examples

>>> failed = bulk.get_failed_handlers()
>>> for idx, error in failed:
...     print(f"Handler {idx} failed: {error}")

get_summary_dataframe()[source]

Returns a summary DataFrame with all statistical measures for each feature.

This method converts the statistical distributions into a pandas DataFrame format, making it easy to export, analyze, or visualize the results.

Returns:

Summary DataFrame with features as rows and statistics as columns. Columns include: feature, count, mean, std, min, max, median, q25, q75, iqr, mode, skewness. Empty DataFrame if no distributions were computed.

Return type:

pd.DataFrame
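The reshaping from a nested statistics dictionary to one row per feature can be sketched without pandas. The helper names summary_rows and rows_to_csv are hypothetical, and the input values are made up for illustration.

```python
import csv
import io
from typing import Dict, List

COLUMNS = [
    "feature", "count", "mean", "std", "min", "max",
    "median", "q25", "q75", "iqr", "mode", "skewness",
]


def summary_rows(stats: Dict[str, Dict[str, float]]) -> List[dict]:
    """One row per feature, with the feature name as its own column."""
    return [{"feature": name, **measures} for name, measures in sorted(stats.items())]


def rows_to_csv(rows: List[dict]) -> str:
    """Serialize the rows; statistics missing from a row are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


example_stats = {
    "nonparam_IS": {"count": 3, "mean": 0.6},
    "cosinor_mesor": {"count": 3, "mean": 0.5},
}
rows = summary_rows(example_stats)
csv_text = rows_to_csv(rows)
```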

Examples

>>> summary_df = bulk.get_summary_dataframe()
>>> print(summary_df.head())
>>> # Export to CSV
>>> summary_df.to_csv('feature_summary.csv', index=False)

get_feature_correlation_matrix()[source]

Returns correlation matrix between features across all handlers.

This method computes pairwise correlations between all numeric features across all successful computations. This is useful for understanding feature relationships and identifying redundant or highly correlated features.

Returns:

Correlation matrix of features. Values range from -1 to 1, where 1 indicates perfect positive correlation, -1 indicates perfect negative correlation, and 0 indicates no linear correlation. Empty DataFrame if there is insufficient data (fewer than 2 features or no successful computations).

Return type:

pd.DataFrame
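A pairwise correlation matrix of this kind can be sketched in plain Python. Pearson correlation is assumed here; the document does not state which coefficient the method uses, and the helper names and input values are illustrative.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long sequences (nan if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Nested dict so that matrix[a][b] is the correlation between a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


matrix = correlation_matrix({
    "cosinor_mesor": [1.0, 2.0, 3.0, 4.0],
    "nonparam_IS": [2.0, 4.0, 6.0, 8.0],
    "nonparam_IV": [4.0, 3.0, 2.0, 1.0],
})
```

Each list holds one value per successfully processed handler, so at least two handlers are needed for a defined coefficient.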

Examples

>>> corr_matrix = bulk.get_feature_correlation_matrix()
>>> print(corr_matrix['cosinor_mesor']['nonparam_IS'])  # Correlation between MESOR and IS
>>> # Visualize with heatmap
>>> import seaborn as sns
>>> sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')

plot_sleep_predictions(feature_obj, simple=True, start_date=None, end_date=None)[source]

Plot sleep predictions over time.

Creates visualization of sleep/wake predictions, optionally including non-wear periods. Simple mode shows a binary plot with dots, while detailed mode shows ENMO data with colored bands for sleep/wake states.

Parameters:
  • feature_obj (WearableFeatures) – Feature object containing ml_data with sleep predictions. Must have:
    • ml_data: DataFrame with ‘sleep’ column (1=sleep, 0=wake)

    • Optional ‘wear’ column for non-wear periods

    • ‘ENMO’ column for detailed plotting

  • simple (bool, default=True) – If True, shows simple binary plot with dots for sleep/wake states. If False, shows detailed plot with ENMO data and colored bands.

  • start_date (datetime, optional) – Start date for plotting. If None, uses the earliest date in the data.

  • end_date (datetime, optional) – End date for plotting. If None, uses the latest date in the data.

Returns:

Displays the plot using matplotlib.

Return type:

None

Notes

  • Simple mode: Shows binary sleep/wake states as colored dots

  • Detailed mode: Shows ENMO activity data with colored bands for sleep/wake

  • Non-wear periods are shown in red if ‘wear’ column is available

  • The function automatically handles date range selection

Examples

>>> from cosinorage.features import WearableFeatures, plot_sleep_predictions
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot sleep predictions
>>> plot_sleep_predictions(features, simple=True)
>>> plot_sleep_predictions(features, simple=False,
...                       start_date='2023-01-01', end_date='2023-01-02')

plot_non_wear(feature_obj, simple=True, start_date=None, end_date=None)[source]

Plot non-wear periods over time.

Creates visualization of wear/non-wear periods. Simple mode shows a binary plot with dots, while detailed mode shows ENMO data with colored bands for wear states.

Parameters:
  • feature_obj (WearableFeatures) – Feature object containing ml_data with wear/non-wear predictions. Must have:
    • ml_data: DataFrame with ‘wear’ column (1=worn, 0=not worn)

    • ‘ENMO’ column for detailed plotting

  • simple (bool, default=True) – If True, shows simple binary plot with dots for wear/non-wear states. If False, shows detailed plot with ENMO data and colored bands.

  • start_date (datetime, optional) – Start date for plotting. If None, uses the earliest date in the data.

  • end_date (datetime, optional) – End date for plotting. If None, uses the latest date in the data.

Returns:

Displays the plot using matplotlib.

Return type:

None

Notes

  • Simple mode: Shows binary wear/non-wear states as colored dots

  • Detailed mode: Shows ENMO activity data with colored bands for wear states

  • Non-wear periods are highlighted in red, wear periods in green

  • The function automatically handles date range selection

Examples

>>> from cosinorage.features import WearableFeatures, plot_non_wear
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot wear/non-wear periods
>>> plot_non_wear(features, simple=True)
>>> plot_non_wear(features, simple=False,
...               start_date='2023-01-01', end_date='2023-01-02')

plot_cosinor(feature_obj)[source]

Plot cosinor analysis results for activity rhythm analysis.

Creates detailed visualizations of circadian rhythm analysis showing raw activity data (ENMO) overlaid with fitted cosinor curves. Includes markers for key circadian parameters: MESOR (rhythm-adjusted mean), amplitude, and acrophase (peak timing).

Parameters:

feature_obj (WearableFeatures) – Feature object containing cosinor analysis results and ENMO data. The ml_data DataFrame must contain ‘ENMO’ and ‘cosinor_fitted’ columns. The feature_dict must contain a ‘cosinor’ key with mesor, amplitude, acrophase, and acrophase_time values.

Returns:

Displays the plot using matplotlib.

Return type:

None

Raises:

ValueError – If cosinor features haven’t been computed (missing ‘cosinor_fitted’ column).

Notes

  • Shows raw ENMO data in red and fitted cosinor curve in blue

  • MESOR is displayed as a horizontal green dashed line

  • The plot provides visual validation of the cosinor fit quality

  • Y-axis limits are automatically adjusted to show the full range of data

Examples

>>> from cosinorage.features import WearableFeatures, plot_cosinor
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot cosinor analysis results
>>> plot_cosinor(features)

dashboard(features)[source]

Generate a comprehensive visualization dashboard for accelerometer data analysis.

This function creates multiple plots to visualize various aspects of the accelerometer data:
  1. Cosinor fit plot showing ENMO data with mesor, amplitude, and acrophase

  2. Daily ENMO plots with M10 and L5 periods highlighted

  3. IS (Inter-daily Stability) and IV (Intra-daily Variability) visualization

  4. RA (Relative Amplitude) plots for each day

  5. Sleep predictions visualization

  6. Sleep metrics (TST, WASO, PTA, NWB, SOL) visualization

  7. Physical activity breakdown by intensity levels

Parameters:

features (WearableFeatures) – A WearableFeatures object containing all extracted features and raw data. Expected to provide:
  • feature_dict: Dictionary containing cosinor, nonparam, sleep, and physical_activity features

  • get_ml_data(): Method returning DataFrame with ENMO and cosinor_fitted data

  • get_features(): Method returning all extracted features

Returns:

Displays multiple matplotlib figures using plt.show()

Return type:

None

Notes

  • Creates 7 different visualization panels covering all major analysis aspects

  • Each panel is optimized for the specific metrics being displayed

  • The dashboard provides a comprehensive overview of circadian rhythm analysis

  • All plots use consistent color schemes and formatting

Examples

>>> from cosinorage.features import WearableFeatures, dashboard
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Generate comprehensive dashboard
>>> dashboard(features)


Utility Functions

Cosinor Analysis (Circadian Rhythm Analysis)

cosinor_multiday(df)[source]

A parametric approach to study circadian rhythmicity assuming cosinor shape, fitting a model for multiple days.

Parameters:

df (pandas.DataFrame) – DataFrame with a Timestamp index and a column ‘ENMO’ containing minute-level activity data.

Returns:

A tuple of:
  • dict: Dictionary containing cosinor parameters:
    • MESOR: Midline Estimating Statistic Of Rhythm (rhythm-adjusted mean)

    • amplitude: Half the difference between maximum and minimum values

    • acrophase: Time of peak relative to midnight in radians

    • acrophase_time: Time of peak in hours (0-24)

  • pandas.Series: Fitted values for each timepoint

Return type:

tuple

Raises:

ValueError – If the DataFrame lacks the required ‘ENMO’ column or a timestamp index, or if the data length is not a multiple of 1440 (minutes in a day).
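A minimal sketch of a single-component cosinor fit, assuming evenly spaced minute samples that cover whole days. It is not the library's fitting routine: the name fit_cosinor_sketch and the closed-form estimator are illustrative assumptions. The sketch writes the model as M + A * cos(2π * t / 1440 - φ), so φ converts directly to the time of peak.

```python
import math
from typing import Dict, List

MINUTES_PER_DAY = 1440


def fit_cosinor_sketch(enmo: List[float]) -> Dict[str, float]:
    """Least-squares cosinor estimates for minute data spanning whole days."""
    n = len(enmo)
    if n == 0 or n % MINUTES_PER_DAY != 0:
        raise ValueError("data length must be a non-zero multiple of 1440")
    omega = 2 * math.pi / MINUTES_PER_DAY
    # Over whole periods the regressors are orthogonal, so averages suffice.
    mesor = sum(enmo) / n
    beta = 2 * sum(y * math.cos(omega * t) for t, y in enumerate(enmo)) / n
    gamma = 2 * sum(y * math.sin(omega * t) for t, y in enumerate(enmo)) / n
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(gamma, beta) % (2 * math.pi)  # radians after midnight
    return {
        "MESOR": mesor,
        "amplitude": amplitude,
        "acrophase": acrophase,
        "acrophase_time": acrophase / omega / 60,  # hours, 0-24
    }


# Two synthetic days with MESOR 0.5, amplitude 0.3 and a peak at 08:00.
omega = 2 * math.pi / MINUTES_PER_DAY
signal = [0.5 + 0.3 * math.cos(omega * (t - 480)) for t in range(2 * MINUTES_PER_DAY)]
params = fit_cosinor_sketch(signal)
```

Because the samples cover whole days, the cosine and sine regressors are orthogonal to each other and to the constant term, which is why plain averages recover the least-squares estimates in this setting.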

cosinor_model(t, M, A, phi, tau)[source]

Cosinor model function with counterclockwise acrophase.

This function implements the standard cosinor model for fitting periodic data. The model assumes a cosine function with adjustable amplitude, phase, and period.

Parameters:
  • t (array-like) – Time points at which to evaluate the model.

  • M (float) – MESOR (Midline Estimating Statistic Of Rhythm) - the rhythm-adjusted mean.

  • A (float) – Amplitude - half the peak-to-trough difference (always positive).

  • phi (float) – Acrophase in radians (counterclockwise orientation).

  • tau (float) – Period of the rhythm in the same units as t.

Returns:

Fitted values at the specified time points.

Return type:

array-like

Notes

  • The model uses counterclockwise acrophase orientation (negative sign before phi)

  • The function implements: M + A * cos(2π * t / τ + φ)

  • Amplitude A should always be positive

  • The acrophase φ represents the time of peak activity

Examples

>>> import numpy as np
>>>
>>> # Create time points (24 hours in minutes)
>>> t = np.arange(0, 1440, 1)  # minutes 0 to 1439 (one full day)
>>>
>>> # Define cosinor parameters
>>> M = 0.5    # MESOR
>>> A = 0.3    # Amplitude
>>> phi = 0    # Acrophase (peak at midnight)
>>> tau = 1440 # Period (24 hours in minutes)
>>>
>>> # Generate fitted values
>>> fitted = cosinor_model(t, M, A, phi, tau)
>>> print(f"Peak value: {fitted.max():.3f}")
>>> print(f"Trough value: {fitted.min():.3f}")

fit_cosinor(time, data, period=24)[source]

Fit cosinor model to time series data.

This function fits a cosinor model to time series data using the CosinorPy library. It estimates the MESOR, amplitude, and acrophase parameters that best describe the periodic pattern in the data.

Parameters:
  • time (array-like) – Time points corresponding to the data values.

  • data (array-like) – Observed values to fit the cosinor model to.

  • period (float, default=24) – Known period of the rhythm in hours. It is converted to minutes internally (period * 60), so time should be given in minutes. For circadian rhythms, use 24.

Returns:

Dictionary containing fitted parameters and statistics:
  • ‘MESOR’: Midline Estimating Statistic Of Rhythm (rhythm-adjusted mean)

  • ‘amplitude’: Half the peak-to-trough difference

  • ‘acrophase’: Time of peak relative to the period in radians

  • ‘fitted_values’: Array of fitted values at each time point

Return type:

dict

Notes

  • Uses CosinorPy.cosinor1.fit_cosinor for the core fitting algorithm

  • The period is converted to minutes (period * 60) for internal processing

  • The function returns both the fitted parameters and the fitted values

  • The acrophase is returned in radians and represents the timing of peak activity

Examples

>>> import numpy as np
>>> import pandas as pd
>>>
>>> # Create sample circadian data
>>> time = np.arange(0, 1440, 1)  # 24 hours in minutes
>>> data = 0.5 + 0.3 * np.cos(2 * np.pi * time / 1440 + np.pi/4) + 0.1 * np.random.randn(1440)
>>>
>>> # Fit cosinor model
>>> results = fit_cosinor(time, data, period=24)
>>> print(f"MESOR: {results['MESOR']:.3f}")
>>> print(f"Amplitude: {results['amplitude']:.3f}")
>>> print(f"Acrophase: {results['acrophase']:.3f} radians")

Non-parametric Analysis (Circadian Rhythm Analysis)

IV(data)[source]

Calculate the intradaily variability (IV) for the entire dataset.

Intradaily variability quantifies the fragmentation of rest-activity patterns within each 24-hour period. It is calculated as the ratio of the mean squared first derivative to the variance.

Parameters:

data (pd.Series) – Time series data containing activity measurements with datetime index and ‘ENMO’ column. Should contain multiple days of minute-level data.

Returns:

Intradaily variability value, where:
  • Lower values indicate less fragmented activity patterns

  • Higher values indicate more fragmented activity patterns

Returns np.nan if there is insufficient data or the calculation fails.

Return type:

float

Notes

  • Resamples data to hourly resolution for calculation

  • IV = (P * sum((z_p - z_{p-1})²)) / ((P-1) * sum((z_p - z_mean)²)), where z_p are the hourly values and P is their count

  • Lower values indicate more consolidated rest-activity periods

  • Higher values indicate more fragmented sleep and activity patterns

  • Used in circadian rhythm analysis to assess rhythm fragmentation
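The IV formula in the notes can be sketched directly on a list of hourly values. This is an illustration of the formula only, not the library's implementation; the function name and the nan handling for degenerate input are assumptions.

```python
import math
from typing import List


def intradaily_variability(hourly: List[float]) -> float:
    """IV = (P * sum((z_p - z_{p-1})^2)) / ((P - 1) * sum((z_p - z_mean)^2))."""
    p = len(hourly)
    if p < 2:
        return math.nan
    mean = sum(hourly) / p
    # Squared hour-to-hour changes (the "first derivative" term).
    successive = sum((b - a) ** 2 for a, b in zip(hourly, hourly[1:]))
    # Squared deviations from the overall mean (the variance term).
    spread = sum((z - mean) ** 2 for z in hourly)
    if spread == 0:
        return math.nan
    return (p * successive) / ((p - 1) * spread)


smooth = intradaily_variability([0.0, 1.0, 2.0, 3.0])
fragmented = intradaily_variability([0.0, 1.0, 0.0, 1.0])
```

The alternating series changes direction every hour and therefore yields a much larger value than the steadily rising one.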

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample multi-day activity data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> # Simulate fragmented activity pattern
>>> hours = dates.hour
>>> enmo = pd.Series(np.random.uniform(0, 1, 4320), index=dates)  # Random activity
>>>
>>> # Calculate intradaily variability
>>> iv_value = IV(enmo)
>>> print(f"Intradaily Variability: {iv_value:.3f}")
>>> # Higher values indicate more fragmented activity patterns

IS(data)[source]

Calculate the interdaily stability (IS) for the entire dataset.

Interdaily stability quantifies the strength of coupling between the rest-activity rhythm and environmental zeitgebers. It compares the 24-hour pattern across days.

Parameters:

data (pd.Series) – Time series data containing activity measurements with datetime index and ‘ENMO’ column. Should contain multiple days of minute-level data.

Returns:

Interdaily stability value ranging from 0 to 1, where:
  • 0 indicates no stability (random activity patterns)

  • 1 indicates perfect stability (identical daily patterns)

Returns np.nan if there is insufficient data or the calculation fails.

Return type:

float

Notes

  • Resamples data to hourly resolution for calculation

  • IS = (D * sum((hourly_means - overall_mean)²)) / sum((all_values - overall_mean)²), where D is the number of days and hourly_means are the 24 per-hour averages across days

  • Higher values indicate more consistent daily activity patterns

  • Requires multiple days of data for meaningful calculation

  • Used in circadian rhythm analysis to assess rhythm stability

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample multi-day activity data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> # Simulate consistent daily pattern
>>> hours = dates.hour
>>> enmo = pd.Series(np.sin(hours * np.pi / 12) + 1 + np.random.normal(0, 0.1, 4320), index=dates)
>>>
>>> # Calculate interdaily stability
>>> is_value = IS(enmo)
>>> print(f"Interdaily Stability: {is_value:.3f}")
>>> # Higher values indicate more consistent daily patterns
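
The IS formula in the Notes can be sketched in a few lines, assuming complete days of minute-level data. The function name is_sketch and its NaN handling are hypothetical and only illustrate the ratio; the library's own implementation may differ in detail.

```python
import numpy as np
import pandas as pd


def is_sketch(enmo: pd.Series) -> float:
    """Illustrative IS computation; not the library's implementation."""
    hourly = enmo.resample("60min").mean().dropna()
    if len(hourly) == 0:
        return float("nan")
    overall = hourly.mean()
    # Total sum of squares of all hourly values around the overall mean.
    total_ss = float(((hourly - overall) ** 2).sum())
    if total_ss == 0:
        return float("nan")  # assumption: constant data has no defined IS
    # Average 24-hour profile: mean of each hour of day across all days.
    profile = hourly.groupby(hourly.index.hour).mean()
    # For complete days this ratio equals D, the number of days.
    n_days = len(hourly) / len(profile)
    between_ss = float(((profile - overall) ** 2).sum())
    return n_days * between_ss / total_ss


idx3 = pd.date_range("2023-01-01", periods=4320, freq="min")  # 3 days
identical_days = pd.Series(np.asarray(idx3.hour, dtype=float), index=idx3)

idx2 = pd.date_range("2023-01-01", periods=2880, freq="min")  # 2 days
# The second day is the mirror image of the first, so the average profile is flat.
mirrored_days = pd.Series(
    np.asarray((idx2.hour + idx2.day) % 2, dtype=float), index=idx2
)

print(is_sketch(identical_days), is_sketch(mirrored_days))
```

Identical days give a value of 1 and mirrored days give 0, matching the two ends of the range described under Returns.
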
RA(m10, l5)[source]

Calculate the relative amplitude (RA) for each day.

Relative amplitude is calculated as the difference between the most active 10-hour period and least active 5-hour period, divided by their sum. This provides a normalized measure of the daily activity rhythm strength.

Parameters:
  • m10 (List[float]) – List of M10 values (mean activity during 10 most active hours) for each day. Should be output from the M10() function.

  • l5 (List[float]) – List of L5 values (mean activity during 5 least active hours) for each day. Should be output from the L5() function.

Returns:

List of relative amplitude values for each day, where:

  • Values range from 0 to 1

  • Higher values indicate stronger daily activity rhythms

  • Lower values indicate weaker daily activity rhythms

Returns empty list if input lists are empty.

Return type:

List[float]

Notes

  • RA = (M10 - L5) / (M10 + L5)

  • Normalized measure that accounts for overall activity level

  • Higher values indicate more pronounced rest-activity cycles

  • Used in circadian rhythm analysis to assess rhythm strength

  • Requires both M10 and L5 values from the same dataset

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample multi-day activity data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> hours = dates.hour
>>> enmo = pd.Series(np.sin(hours * np.pi / 12) + 1 + np.random.normal(0, 0.1, 4320), index=dates)
>>>
>>> # Calculate M10 and L5 first
>>> m10_values, m10_starts = M10(enmo)
>>> l5_values, l5_starts = L5(enmo)
>>>
>>> # Calculate relative amplitude
>>> ra_values = RA(m10_values, l5_values)
>>> print(f"Relative Amplitude values: {ra_values}")
>>> # Higher values indicate stronger daily activity rhythms
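
The RA formula from the Notes, applied day by day, might look like the following. This is only a sketch: ra_sketch is a hypothetical name, and returning NaN when M10 + L5 is zero is an assumption rather than documented behavior.

```python
def ra_sketch(m10, l5):
    """Illustrative per-day relative amplitude: (M10 - L5) / (M10 + L5)."""
    ra = []
    for most_active, least_active in zip(m10, l5):
        total = most_active + least_active
        # Assumption: a day with no activity at all has no defined RA.
        ra.append((most_active - least_active) / total if total != 0 else float("nan"))
    return ra


# Day 1 has a clear rest-activity contrast, day 2 a much weaker one.
print(ra_sketch([3.0, 2.0], [1.0, 2.0]))  # [0.5, 0.0]
```
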
M10(data)[source]

Calculate the M10 (mean activity during the 10 most active hours) and the start time of the 10 most active hours (M10_start) for each day.

M10 provides information about the most active period during each day, which typically corresponds to the main activity phase.

Parameters:

data (pd.Series) – Time series data containing activity measurements with datetime index and ‘ENMO’ column. Should contain minute-level data for multiple days.

Returns:

Tuple containing two lists:

  • m10: List of mean activity values during the 10 most active hours for each day

  • m10_start: List of start times (datetime) of the 10 most active hours for each day

Returns empty lists if insufficient data.

Return type:

tuple

Notes

  • Uses rolling 10-hour windows (600 minutes) to find the most active period

  • Calculates mean activity within each window and finds the maximum

  • Returns both the activity level and start time of the most active period

  • Used in circadian rhythm analysis to identify the main activity phase

  • Typically corresponds to daytime activity periods

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample multi-day activity data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> # Simulate activity with peak during day
>>> hours = dates.hour
>>> enmo = pd.Series(np.sin(hours * np.pi / 12) + 1 + np.random.normal(0, 0.1, 4320), index=dates)
>>>
>>> # Calculate M10 for each day
>>> m10_values, m10_starts = M10(enmo)
>>> print(f"M10 values: {m10_values}")
>>> print(f"M10 start times: {m10_starts}")
L5(data)[source]

Calculate the L5 (mean activity during the 5 least active hours) and the start time of the 5 least active hours (L5_start) for each day.

L5 provides information about the least active period during each day, which typically corresponds to the main rest phase.

Parameters:

data (pd.Series) – Time series data containing activity measurements with datetime index and ‘ENMO’ column. Should contain minute-level data for multiple days.

Returns:

Tuple containing two lists:

  • l5: List of mean activity values during the 5 least active hours for each day

  • l5_start: List of start times (datetime) of the 5 least active hours for each day

Returns empty lists if insufficient data.

Return type:

tuple

Notes

  • Uses rolling 5-hour windows (300 minutes) to find the least active period

  • Calculates mean activity within each window and finds the minimum

  • Returns both the activity level and start time of the least active period

  • Used in circadian rhythm analysis to identify the main rest phase

  • Typically corresponds to nighttime sleep periods

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample multi-day activity data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> # Simulate activity with low during night
>>> hours = dates.hour
>>> enmo = pd.Series(np.sin(hours * np.pi / 12) + 1 + np.random.normal(0, 0.1, 4320), index=dates)
>>>
>>> # Calculate L5 for each day
>>> l5_values, l5_starts = L5(enmo)
>>> print(f"L5 values: {l5_values}")
>>> print(f"L5 start times: {l5_starts}")
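
The rolling-window search described in the Notes for M10 and L5 can be sketched with a single helper. The function rolling_extreme is hypothetical, and the sketch assumes gap-free minute-level data grouped by calendar day; how the library treats incomplete days or ties is not covered here.

```python
import numpy as np
import pandas as pd


def rolling_extreme(enmo: pd.Series, window_minutes: int, mode: str):
    """Per-day mean and start time of the most ('max') or least ('min') active window."""
    values, starts = [], []
    for _, day in enmo.groupby(enmo.index.date):
        if len(day) < window_minutes:
            continue  # assumption: skip days shorter than one window
        # rolling() labels each window by its last minute.
        means = day.rolling(window_minutes).mean().dropna()
        end = means.idxmax() if mode == "max" else means.idxmin()
        values.append(float(means.loc[end]))
        # Convert the window's last minute into its first minute.
        starts.append(end - pd.Timedelta(minutes=window_minutes - 1))
    return values, starts


# One synthetic day: rest 01:00-06:00, high activity 08:00-18:00.
idx = pd.date_range("2023-01-01", periods=1440, freq="min")
signal = np.ones(1440)
signal[60:360] = 0.0
signal[480:1080] = 2.0
enmo = pd.Series(signal, index=idx)

m10_values, m10_starts = rolling_extreme(enmo, 600, "max")  # 10-hour windows
l5_values, l5_starts = rolling_extreme(enmo, 300, "min")    # 5-hour windows
print(m10_values, m10_starts)
print(l5_values, l5_starts)
```
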

Physical Activity Metrics

activity_metrics(data, pa_params={'lm': 0.1, 'mv': 0.4, 'sl': 0.03})[source]

Calculate Sedentary Behavior (SB), Light Physical Activity (LIPA), and Moderate-to-Vigorous Physical Activity (MVPA) durations in minutes for each day. Moderate and vigorous activity are returned as two separate lists.

This function classifies physical activity levels based on ENMO (Euclidean Norm Minus One) values using established cutpoints and returns the duration spent in each activity level for each day in the dataset.

Parameters:
  • data (pd.Series) – A pandas Series with a DatetimeIndex and ENMO (Euclidean Norm Minus One) values. The index should be datetime with minute-level resolution. The values should be float numbers representing acceleration in g units.

  • pa_params (dict, default=cutpoints) – Dictionary containing physical activity cutpoints. Each cutpoint can be given under a short or a long key: ‘sl’ or ‘pa_cutpoint_sl’ for the sedentary behavior threshold (default: 0.030g); ‘lm’ or ‘pa_cutpoint_lm’ for the light activity threshold (default: 0.100g); ‘mv’ or ‘pa_cutpoint_mv’ for the moderate-to-vigorous activity threshold (default: 0.400g)

Returns:

Tuple containing four lists of daily activity durations in minutes:

  • sedentary_minutes: Minutes spent in sedentary behavior (ENMO ≤ sl)

  • light_minutes: Minutes spent in light physical activity (sl < ENMO ≤ lm)

  • moderate_minutes: Minutes spent in moderate physical activity (lm < ENMO ≤ mv)

  • vigorous_minutes: Minutes spent in vigorous physical activity (ENMO > mv)

Return type:

tuple

Raises:

ValueError – If required cutpoints are not found in the pa_params dictionary.

Notes

  • The function assumes minute-level data and returns durations in minutes

  • ENMO cutpoints are based on established thresholds for physical activity classification

  • Cutpoints may vary depending on accelerometer type, position, and user characteristics

  • Returns empty lists if input data is empty

  • Groups data by date and calculates daily totals

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample ENMO data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')  # One day
>>> enmo_data = pd.Series(np.random.uniform(0, 0.5, 1440), index=dates)
>>>
>>> # Calculate activity metrics
>>> sb, lipa, mvpa, vig = activity_metrics(enmo_data)
>>> print(f"Sedentary minutes: {sb[0]}")
>>> print(f"Light activity minutes: {lipa[0]}")
>>> print(f"Moderate activity minutes: {mvpa[0]}")
>>> print(f"Vigorous activity minutes: {vig[0]}")
>>>
>>> # Use custom cutpoints
>>> custom_cutpoints = {
...     'pa_cutpoint_sl': 0.020,
...     'pa_cutpoint_lm': 0.080,
...     'pa_cutpoint_mv': 0.300
... }
>>> sb, lipa, mvpa, vig = activity_metrics(enmo_data, pa_params=custom_cutpoints)
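
To make the cutpoint logic under Returns concrete, the sketch below counts minutes per intensity band for each calendar day. The helper activity_minutes_sketch is hypothetical and takes the three cutpoints as plain keyword arguments; it only mirrors the documented inequalities and assumes one sample per minute.

```python
import numpy as np
import pandas as pd


def activity_minutes_sketch(enmo: pd.Series, sl=0.03, lm=0.1, mv=0.4):
    """Illustrative daily minute counts per intensity band (one sample = one minute)."""
    sedentary, light, moderate, vigorous = [], [], [], []
    for _, day in enmo.groupby(enmo.index.date):
        sedentary.append(int((day <= sl).sum()))               # ENMO <= sl
        light.append(int(((day > sl) & (day <= lm)).sum()))     # sl < ENMO <= lm
        moderate.append(int(((day > lm) & (day <= mv)).sum()))  # lm < ENMO <= mv
        vigorous.append(int((day > mv).sum()))                  # ENMO > mv
    return sedentary, light, moderate, vigorous


# One day: 1000 sedentary, 300 light, 100 moderate and 40 vigorous minutes.
idx = pd.date_range("2023-01-01", periods=1440, freq="min")
levels = np.concatenate([
    np.full(1000, 0.01), np.full(300, 0.05), np.full(100, 0.2), np.full(40, 0.5)
])
bands = activity_minutes_sketch(pd.Series(levels, index=idx))
print(bands)  # ([1000], [300], [100], [40])
```

Because the four bands partition the ENMO range, the four counts for a complete day always add up to 1440 minutes.
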

Sleep Metrics

apply_sleep_wake_predictions(data, sleep_params)[source]

Apply sleep-wake prediction to accelerometer data using ENMO values.

This function uses machine learning algorithms to classify each minute as either sleep or wake based on the activity level (ENMO values). The prediction is based on the assumption that lower activity levels correspond to sleep periods.

Parameters:
  • data (pd.DataFrame) – Input DataFrame containing ENMO values in a column named ‘ENMO’. Should have a datetime index with minute-level resolution.

  • sleep_params (dict) – Dictionary containing sleep prediction parameters: ‘sleep_ck_sf’, the scale factor used by the Cole-Kripke sleep classification (default: 0.0025), and ‘sleep_rescore’, whether to rescore sleep predictions (default: True)

Returns:

Series containing sleep predictions with the same index as input data (1 = sleep, 0 = wake).

Return type:

pd.Series

Raises:

ValueError – If ‘ENMO’ column is not found in DataFrame.

Notes

  • Uses skdh.sleep.sleep_classification.compute_sleep_predictions for the core algorithm

  • The function adds a ‘sleep’ column to the input DataFrame

  • Sleep predictions are based on activity patterns and circadian rhythms

  • The algorithm is trained on large datasets of polysomnography-validated sleep

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample ENMO data
>>> timestamps = pd.date_range('2023-01-01', periods=1440, freq='min')  # One day
>>> data = pd.DataFrame({
...     'ENMO': np.random.uniform(0, 0.1, 1440)  # Random activity levels
... }, index=timestamps)
>>>
>>> # Apply sleep-wake predictions
>>> sleep_params = {'sleep_ck_sf': 0.0025, 'sleep_rescore': True}
>>> sleep_predictions = apply_sleep_wake_predictions(data, sleep_params)
>>> print(f"Sleep time: {sleep_predictions.sum()} minutes")
WASO(data)[source]

Calculate Wake After Sleep Onset (WASO) for each 24-hour cycle.

WASO represents the total time spent awake after the first sleep onset until the final wake time. It’s a key metric for sleep quality assessment.

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

List containing WASO values in minutes for each 24-hour cycle. Returns 0 for days where no sleep is detected.

Return type:

List[int]

Notes

  • Processes data in 24-hour cycles starting at midnight

  • Uses the WakeAfterSleepOnset class from sleep_metrics library

  • Higher WASO values indicate more fragmented sleep

  • Important metric for assessing sleep maintenance

  • Zero is returned for days where no sleep is detected

Examples

>>> import pandas as pd
>>>
>>> # Create sample sleep data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')
>>> # Simulate sleep interrupted by two wake periods (1 = sleep, 0 = wake)
>>> sleep_pattern = [1] * 480 + [0] * 30 + [1] * 60 + [0] * 20 + [1] * 40 + [0] * 810
>>> data = pd.DataFrame({'sleep': sleep_pattern}, index=dates)
>>>
>>> # Calculate WASO
>>> waso_values = WASO(data)
>>> print(f"Wake After Sleep Onset: {waso_values[0]} minutes")
TST(data)[source]

Calculate Total Sleep Time (TST) for each 24-hour cycle.

TST represents the total time spent in sleep state during the analysis period. It’s a fundamental metric for sleep quantity assessment.

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

List containing total sleep time in minutes for each 24-hour cycle.

Return type:

List[int]

Notes

  • Processes data in 24-hour cycles starting at midnight

  • Uses the TotalSleepTime class from sleep_metrics library

  • Sleep time is calculated by counting all epochs marked as sleep (1)

  • Key metric for assessing sleep quantity

  • Recommended TST varies by age group (7-9 hours for adults)

Examples

>>> import pandas as pd
>>>
>>> # Create sample sleep data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')
>>> # Simulate 8 hours of sleep
>>> sleep_pattern = [1] * 480 + [0] * 960  # 8 hours sleep, 16 hours wake
>>> data = pd.DataFrame({'sleep': sleep_pattern}, index=dates)
>>>
>>> # Calculate TST
>>> tst_values = TST(data)
>>> print(f"Total Sleep Time: {tst_values[0]} minutes ({tst_values[0]/60:.1f} hours)")
PTA(data)[source]

Calculate Percent Time Asleep (PTA) for each 24-hour cycle.

PTA represents the percentage of time spent asleep relative to the total recording time. It provides a normalized measure of sleep quantity.

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

List containing the fraction of time asleep for each 24-hour cycle. Values range from 0 (no sleep) to 1 (100% sleep).

Return type:

List[float]

Notes

  • Processes data in 24-hour cycles starting at midnight

  • Uses the PercentTimeAsleep class from sleep_metrics library

  • PTA = (number of sleep epochs) / (total number of epochs)

  • Useful for comparing sleep patterns across different recording durations

  • Typical PTA values range from 0.25 to 0.40 (25-40% of day)

Examples

>>> import pandas as pd
>>>
>>> # Create sample sleep data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')
>>> # Simulate 8 hours of sleep (33.3% of day)
>>> sleep_pattern = [1] * 480 + [0] * 960  # 8 hours sleep, 16 hours wake
>>> data = pd.DataFrame({'sleep': sleep_pattern}, index=dates)
>>>
>>> # Calculate PTA
>>> pta_values = PTA(data)
>>> print(f"Percent Time Asleep: {pta_values[0]:.3f} ({pta_values[0]*100:.1f}%)")
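
TST and PTA both reduce to simple per-day aggregates of the ‘sleep’ column, as the following sketch shows. It assumes one-minute epochs, the 1 = sleep encoding, and grouping by calendar day; tst_pta_sketch is a hypothetical helper, not the sleep_metrics classes named in the Notes.

```python
import pandas as pd


def tst_pta_sketch(data: pd.DataFrame):
    """Illustrative per-day TST (minutes) and PTA (fraction of epochs asleep)."""
    by_day = data["sleep"].groupby(data.index.date)
    tst = [int(total) for total in by_day.sum()]     # number of sleep epochs
    pta = [float(share) for share in by_day.mean()]  # sleep epochs / all epochs
    return tst, pta


dates = pd.date_range("2023-01-01", periods=1440, freq="min")
data = pd.DataFrame({"sleep": [1] * 480 + [0] * 960}, index=dates)
tst, pta = tst_pta_sketch(data)
print(tst, [round(p, 3) for p in pta])  # [480] [0.333]
```
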
NWB(data)[source]

Calculate Number of Wake Bouts (NWB) for each 24-hour cycle.

NWB represents the count of distinct wake episodes occurring between sleep periods during the analysis period. It’s a measure of sleep fragmentation.

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

List containing the number of wake bouts for each 24-hour cycle. Higher values indicate more fragmented sleep.

Return type:

List[int]

Notes

  • Processes data in 24-hour cycles starting at midnight

  • Uses the NumberWakeBouts class from sleep_metrics library

  • A wake bout is defined as a continuous period of wake states between sleep states

  • Higher NWB values indicate more fragmented sleep patterns

  • Important metric for assessing sleep quality and continuity

Examples

>>> import pandas as pd
>>>
>>> # Create sample sleep data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')
>>> # Simulate fragmented sleep with multiple wake bouts
>>> sleep_pattern = [1] * 240 + [0] * 30 + [1] * 120 + [0] * 20 + [1] * 120 + [0] * 910
>>> data = pd.DataFrame({'sleep': sleep_pattern}, index=dates)
>>>
>>> # Calculate NWB
>>> nwb_values = NWB(data)
>>> print(f"Number of Wake Bouts: {nwb_values[0]}")
SOL(data)[source]

Calculate Sleep Onset Latency (SOL) for each 24-hour cycle.

SOL represents the time taken to fall asleep, measured from the start of the recording period until the first sleep onset. It’s a key metric for sleep initiation.

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

List containing sleep onset latency in minutes for each 24-hour cycle. Lower values indicate faster sleep onset.

Return type:

List[int]

Notes

  • Processes data in 24-hour cycles starting at midnight

  • Uses the SleepOnsetLatency class from sleep_metrics library

  • SOL is calculated as the time from the start of the recording until the first detected sleep episode

  • Lower SOL values indicate better sleep initiation

  • Typical SOL values range from 10-30 minutes for healthy sleepers

Examples

>>> import pandas as pd
>>>
>>> # Create sample sleep data for one day
>>> dates = pd.date_range('2023-01-01', periods=1440, freq='min')
>>> # Simulate 30 minutes to fall asleep
>>> sleep_pattern = [0] * 30 + [1] * 1410  # 30 min wake, then sleep
>>> data = pd.DataFrame({'sleep': sleep_pattern}, index=dates)
>>>
>>> # Calculate SOL
>>> sol_values = SOL(data)
>>> print(f"Sleep Onset Latency: {sol_values[0]} minutes")
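
SOL, WASO, and NWB can all be read off a single day's sequence of sleep states. The sketch below works on a plain list with 1 = sleep and 0 = wake and one-minute epochs. The helper night_summary is hypothetical, and returning None for SOL when no sleep is found is an assumption; the documented behavior for that case is only given for WASO.

```python
def night_summary(states):
    """Illustrative SOL, WASO and NWB for one day of 0/1 sleep states."""
    if 1 not in states:
        # Assumption: no sleep means no onset; WASO is 0 as documented.
        return {"SOL": None, "WASO": 0, "NWB": 0}
    first_sleep = states.index(1)
    last_sleep = len(states) - 1 - states[::-1].index(1)
    # Everything between sleep onset and the final sleep epoch.
    core = states[first_sleep:last_sleep + 1]
    # Wake epochs inside the sleep period.
    waso = core.count(0)
    # Each sleep-to-wake transition inside the core starts one wake bout.
    nwb = sum(1 for prev, cur in zip(core, core[1:]) if prev == 1 and cur == 0)
    return {"SOL": first_sleep, "WASO": waso, "NWB": nwb}


fragmented = [1] * 240 + [0] * 30 + [1] * 120 + [0] * 20 + [1] * 120 + [0] * 910
delayed = [0] * 30 + [1] * 1410
print(night_summary(fragmented))  # {'SOL': 0, 'WASO': 50, 'NWB': 2}
print(night_summary(delayed))     # {'SOL': 30, 'WASO': 0, 'NWB': 0}
```

The trailing 910 wake minutes in the first pattern are not counted, because they come after the final sleep epoch rather than between sleep periods.
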
SRI(data)[source]

Calculate Sleep Regularity Index (SRI) for the entire dataset.

SRI quantifies the day-to-day similarity of sleep-wake patterns. It ranges from -100 (completely irregular) to +100 (perfectly regular).

Parameters:

data (pd.DataFrame) – DataFrame with: - datetime index - ‘sleep’ column (1=sleep, 0=wake)

Returns:

Sleep Regularity Index value ranging from -100 to +100:

  • -100: Completely irregular sleep patterns

  • 0: Random sleep patterns

  • +100: Perfectly regular sleep patterns

Return type:

float

Notes

  • Requires multiple days of data for meaningful calculation

  • Compares sleep-wake patterns across consecutive days

  • Higher SRI values indicate more consistent sleep schedules

  • Important metric for assessing sleep hygiene and circadian rhythm stability

  • Uses overlapping day pairs to calculate similarity

Examples

>>> import pandas as pd
>>>
>>> # Create sample multi-day sleep data
>>> dates = pd.date_range('2023-01-01', periods=4320, freq='min')  # 3 days
>>> # Simulate consistent sleep pattern
>>> sleep_pattern = [1] * 480 + [0] * 960  # 8 hours sleep, 16 hours wake
>>> sleep_data = sleep_pattern * 3  # Repeat for 3 days
>>> data = pd.DataFrame({'sleep': sleep_data}, index=dates)
>>>
>>> # Calculate SRI
>>> sri_value = SRI(data)
>>> print(f"Sleep Regularity Index: {sri_value:.1f}")
>>> # Higher values indicate more regular sleep patterns
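
One common formulation of the SRI compares each epoch with the epoch 24 hours later and rescales the share of matching states to the range -100 to +100. The sketch below follows that formulation for gap-free minute-level data. Whether SRI() uses exactly this computation is an assumption, and sri_sketch is a hypothetical name.

```python
import numpy as np


def sri_sketch(sleep_states, epochs_per_day=1440):
    """Illustrative SRI: agreement between epochs 24 hours apart, scaled to [-100, 100]."""
    s = np.asarray(sleep_states)
    if len(s) <= epochs_per_day:
        return float("nan")  # needs more than one day of data
    # True where the state matches the state exactly 24 hours later.
    same = s[:-epochs_per_day] == s[epochs_per_day:]
    return float(-100 + 200 * same.mean())


regular_day = [1] * 480 + [0] * 960
inverted_day = [0] * 480 + [1] * 960
print(sri_sketch(regular_day * 3))               # 100.0
print(sri_sketch(regular_day + inverted_day))    # -100.0
```
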

Rescaling Functions

min_max_scaling_exclude_outliers(data, upper_quantile=0.999)[source]

Scales the data using min-max scaling to a [0,100] range, excluding outliers based on quantiles.

This function applies min-max scaling to normalize data to a [0,100] range while using robust bounds that exclude extreme outliers. Values above the upper quantile threshold are not excluded from the final result but may exceed 100.

Parameters:
  • data (pd.Series or np.ndarray) – Input data to be scaled. Can be either a pandas Series or numpy array of numeric values.

  • upper_quantile (float, default=0.999) – Upper quantile threshold for excluding outliers when calculating min/max bounds. Defaults to 0.999 (99.9th percentile).

Returns:

Scaled data with values generally ranging from 0 to 100.

Return type:

pd.Series

Raises:

ValueError – If input data is empty.

Notes

  • Uses quantile-based outlier detection for robust scaling

  • Applies min-max scaling: (x - min) / (max - min) * 100

  • Handles edge cases like constant data and zero division; if the input contains all identical values, returns zeros

  • Preserves outliers in output but uses robust bounds for scaling, so values above the upper_quantile may exceed 100

  • Output maintains the same length as input

  • Useful for normalizing accelerometer data while handling extreme values

Examples

>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Example with normal data
>>> data = pd.Series([1, 2, 3, 4, 5])
>>> scaled = min_max_scaling_exclude_outliers(data)
>>> print(scaled)
>>> # Output: [0.0, 25.0, 50.0, 75.0, 100.0]
>>>
>>> # Example with outliers
>>> data_with_outliers = pd.Series([1, 2, 3, 100])
>>> scaled = min_max_scaling_exclude_outliers(data_with_outliers, upper_quantile=0.75)
>>> print(scaled)
>>> # Output: [0.0, 50.0, 100.0, 4950.0] (outlier exceeds 100)
>>>
>>> # Example with constant data
>>> constant_data = pd.Series([5, 5, 5, 5])
>>> scaled = min_max_scaling_exclude_outliers(constant_data)
>>> print(scaled)
>>> # Output: [0.0, 0.0, 0.0, 0.0]

Visualization Functions

plot_sleep_predictions(feature_obj, simple=True, start_date=None, end_date=None)[source]

Plot sleep predictions over time.

Creates a visualization of sleep/wake predictions, optionally including non-wear periods. Simple mode shows a binary plot with dots, while detailed mode shows ENMO data with colored bands for sleep/wake states.

Parameters:
  • feature_obj (WearableFeatures) – Feature object containing ml_data with sleep predictions. The ml_data DataFrame must have a ‘sleep’ column (1=sleep, 0=wake) and, for detailed plotting, an ‘ENMO’ column; a ‘wear’ column for non-wear periods is optional.

  • simple (bool, default=True) – If True, shows simple binary plot with dots for sleep/wake states. If False, shows detailed plot with ENMO data and colored bands.

  • start_date (datetime, optional) – Start date for plotting. If None, uses the earliest date in the data.

  • end_date (datetime, optional) – End date for plotting. If None, uses the latest date in the data.

Returns:

Displays the plot using matplotlib.

Return type:

None

Notes

  • Simple mode: Shows binary sleep/wake states as colored dots

  • Detailed mode: Shows ENMO activity data with colored bands for sleep/wake

  • Non-wear periods are shown in red if ‘wear’ column is available

  • The function automatically handles date range selection

Examples

>>> from cosinorage.features import WearableFeatures
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot sleep predictions
>>> plot_sleep_predictions(features, simple=True)
>>> plot_sleep_predictions(features, simple=False,
...                       start_date='2023-01-01', end_date='2023-01-02')
plot_non_wear(feature_obj, simple=True, start_date=None, end_date=None)[source]

Plot non-wear periods over time.

Creates a visualization of wear/non-wear periods. Simple mode shows a binary plot with dots, while detailed mode shows ENMO data with colored bands for wear states.

Parameters:
  • feature_obj (WearableFeatures) – Feature object containing ml_data with wear/non-wear predictions. The ml_data DataFrame must have a ‘wear’ column (1=worn, 0=not worn) and, for detailed plotting, an ‘ENMO’ column.

  • simple (bool, default=True) – If True, shows simple binary plot with dots for wear/non-wear states. If False, shows detailed plot with ENMO data and colored bands.

  • start_date (datetime, optional) – Start date for plotting. If None, uses the earliest date in the data.

  • end_date (datetime, optional) – End date for plotting. If None, uses the latest date in the data.

Returns:

Displays the plot using matplotlib.

Return type:

None

Notes

  • Simple mode: Shows binary wear/non-wear states as colored dots

  • Detailed mode: Shows ENMO activity data with colored bands for wear states

  • Non-wear periods are highlighted in red, wear periods in green

  • The function automatically handles date range selection

Examples

>>> from cosinorage.features import WearableFeatures
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot wear/non-wear periods
>>> plot_non_wear(features, simple=True)
>>> plot_non_wear(features, simple=False,
...               start_date='2023-01-01', end_date='2023-01-02')
plot_cosinor(feature_obj)[source]

Plot cosinor analysis results for activity rhythm analysis.

Creates detailed visualizations of circadian rhythm analysis showing raw activity data (ENMO) overlaid with fitted cosinor curves. Includes markers for key circadian parameters: MESOR (rhythm-adjusted mean), amplitude, and acrophase (peak timing).

Parameters:

feature_obj (WearableFeatures) – Feature object containing cosinor analysis results and ENMO data. The ml_data DataFrame must contain ‘ENMO’ and ‘cosinor_fitted’ columns. The feature_dict must contain a ‘cosinor’ key with mesor, amplitude, acrophase, and acrophase_time values.

Returns:

Displays the plot using matplotlib.

Return type:

None

Raises:

ValueError – If cosinor features haven’t been computed (missing ‘cosinor_fitted’ column).

Notes

  • Shows raw ENMO data in red and fitted cosinor curve in blue

  • MESOR is displayed as a horizontal green dashed line

  • The plot provides visual validation of the cosinor fit quality

  • Y-axis limits are automatically adjusted to show the full range of data

Examples

>>> from cosinorage.features import WearableFeatures
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Plot cosinor analysis results
>>> plot_cosinor(features)

Dashboard Functions

dashboard(features)[source]

Generate a comprehensive visualization dashboard for accelerometer data analysis.

This function creates multiple plots to visualize various aspects of the accelerometer data:

  1. Cosinor fit plot showing ENMO data with mesor, amplitude, and acrophase

  2. Daily ENMO plots with M10 and L5 periods highlighted

  3. IS (Interdaily Stability) and IV (Intradaily Variability) visualization

  4. RA (Relative Amplitude) plots for each day

  5. Sleep predictions visualization

  6. Sleep metrics (TST, WASO, PTA, NWB, SOL) visualization

  7. Physical activity breakdown by intensity levels

Parameters:

features (WearableFeatures) – A WearableFeatures object containing all extracted features and raw data. Expected to have the following attributes:

  • feature_dict: Dictionary containing cosinor, nonparam, sleep, and physical_activity features

  • get_ml_data(): Method returning DataFrame with ENMO and cosinor_fitted data

  • get_features(): Method returning all extracted features

Returns:

Displays multiple matplotlib figures using plt.show()

Return type:

None

Notes

  • Creates 7 different visualization panels covering all major analysis aspects

  • Each panel is optimized for the specific metrics being displayed

  • The dashboard provides a comprehensive overview of circadian rhythm analysis

  • All plots use consistent color schemes and formatting

Examples

>>> from cosinorage.features import WearableFeatures
>>> from cosinorage.datahandlers import GenericDataHandler
>>>
>>> # Load data and compute features
>>> handler = GenericDataHandler('data.csv')
>>> features = WearableFeatures(handler)
>>>
>>> # Generate comprehensive dashboard
>>> dashboard(features)