ganblr package

Top-level package for Ganblr.

class ganblr.KdbHighOrderFeatureEncoder

Bases: object

High-order feature encoder that uses the kdb model to retrieve the dependencies between features.

fit(X, y, k=0)

Fit the KdbHighOrderFeatureEncoder to X, y.

Parameters
  • X (array_like of shape (n_samples, n_features)) – Data used to fit the encoder.

  • y (array_like of shape (n_samples,)) – Labels used to fit the encoder.

  • k (int, default=0) – Order of the high-order features, i.e. the k value of the kdb model. With k = 0, the encoder reduces to a OneHotEncoder.

Returns

self – Fitted encoder.

Return type

object
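To make the k = 0 case concrete, the following is a minimal pure-Python sketch of one-hot encoding on categorical data. It is an illustration only, not the library's implementation: the helper name one_hot_encode and the toy data are assumptions introduced here, and the column order the real encoder produces is not specified in this reference.

```python
def one_hot_encode(rows):
    """One-hot encode rows of categorical values, column by column.

    Hypothetical helper for illustration; not part of ganblr.
    """
    n_features = len(rows[0])
    # The sorted distinct values of each column define its output slots.
    categories = [sorted({row[j] for row in rows}) for j in range(n_features)]
    encoded = []
    for row in rows:
        out = []
        for j, value in enumerate(row):
            # Exactly one slot per column is set to 1.
            out.extend(1 if value == c else 0 for c in categories[j])
        encoded.append(out)
    return categories, encoded


# Toy data: 3 samples, 2 categorical features.
X = [[0, 1], [1, 0], [0, 2]]
categories, X_out = one_hot_encode(X)
for row in X_out:
    print(row)
```

Each sample expands from n_features values to n_encoded_features indicator values, here 2 + 3 = 5, because no feature is combined with any other.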

fit_transform(X, y, k=0, return_constraints=False)

Fit KdbHighOrderFeatureEncoder to X, y, then transform X.

Equivalent to fit(X, y, k).transform(X, return_constraints) but more convenient.

Parameters
  • X (array_like of shape (n_samples, n_features)) – Data used to fit the encoder and then transformed.

  • y (array_like of shape (n_samples,)) – Labels used to fit the encoder.

  • k (int, default=0) – The k value of the kdb model. With k = 0, the encoder reduces to a OneHotEncoder.

  • return_constraints (bool, default=False) – Whether to return the constraint information.

Returns

X_out – Transformed input.

Return type

ndarray of shape (n_samples, n_encoded_features)

transform(X, return_constraints=False, use_ohe=True)

Transform X to the high-order features.

Parameters
  • X (array_like of shape (n_samples, n_features)) – Data to transform.

  • return_constraints (bool, default=False) – Whether to return the constraint information.

  • use_ohe (bool, default=True) – Whether to convert the output to one-hot format.

Returns

X_out – Transformed input.

Return type

ndarray of shape (n_samples, n_encoded_features)
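As a rough illustration of what a high-order feature can look like, the sketch below combines one feature with a parent feature into a single joint categorical value and then one-hot encodes the result. This is a conceptual sketch under stated assumptions, not the library's code: the helper names high_order_column and one_hot are hypothetical, the parent assignment is chosen by hand (the encoder derives dependencies from the kdb model), and the actual layout of X_out may differ.

```python
def high_order_column(rows, col, parent_cols):
    """Pair column `col` with its parent columns into one joint value per row.

    Hypothetical helper; the parent columns are supplied by hand here.
    """
    return [tuple(row[j] for j in (col, *parent_cols)) for row in rows]


def one_hot(values):
    """One-hot encode a single list of hashable, sortable values."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


# Toy data: 4 samples, 2 categorical features.
X = [[0, 1], [1, 0], [0, 1], [1, 1]]

# Assume feature 1 depends on feature 0 (a k = 1 style dependency).
joint = high_order_column(X, col=1, parent_cols=[0])
categories, encoded = one_hot(joint)

print(categories)
for row in encoded:
    print(row)
```

Only value combinations that occur in the data receive a slot in this sketch, so the encoded width depends on the joint values observed rather than on the product of the individual feature cardinalities.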

ganblr.get_demo_data(name='adult')

Download a demo dataset from the internet.

Parameters

name (str) – Name of the dataset. Must be one of ['adult', 'adult-raw'].

Returns

data – The demo dataset.

Return type

pandas.DataFrame