Selectors

AbstractPredictor

Bases: ABC

Abstract base class for all predictors.

Methods

fit(X, Y)
    Fit the model to the data.
predict(X)
    Predict using the model.
save(file_path)
    Save the model to a file.
load(file_path)
    Load the model from a file.

Source code in asf/predictors/abstract_predictor.py
class AbstractPredictor(ABC):
    """
    Abstract base class for all predictors.

    Methods
    -------
    fit(X, Y)
        Fit the model to the data.
    predict(X)
        Predict using the model.
    save(file_path)
        Save the model to a file.
    load(file_path)
        Load the model from a file.
    """

    def __init__(self):
        """
        Initialize the predictor.
        """
        pass

    @abstractmethod
    def fit(self, X: Any, Y: Any, **kwargs):
        """
        Fit the model to the data.

        Parameters
        ----------
        X : array-like
            Training data.
        Y : array-like
            Target values.
        """
        pass

    @abstractmethod
    def predict(self, X: Any, **kwargs) -> Any:
        """
        Predict using the model.

        Parameters
        ----------
        X : array-like
            Data to predict on.

        Returns
        -------
        array-like
            Predicted values.
        """
        pass

    @abstractmethod
    def save(self, file_path: str):
        """
        Save the model to a file.

        Parameters
        ----------
        file_path : str
            Path to the file where the model will be saved.
        """
        pass

    @abstractmethod
    def load(self, file_path: str):
        """
        Load the model from a file.

        Parameters
        ----------
        file_path : str
            Path to the file from which the model will be loaded.
        """
        pass

    def get_configuration_space(self):
        """
        Get the configuration space for the predictor.

        Returns
        -------
        ConfigurationSpace
            The configuration space for the predictor.
        """
        raise NotImplementedError(
            "get_configuration_space() is not implemented for this predictor"
        )

    @staticmethod
    def get_from_configuration(configuration):
        """
        Create a predictor from a configuration.

        Parameters
        ----------
        configuration
            The configuration to build the predictor from.

        Returns
        -------
        AbstractPredictor
            The predictor.
        """
        raise NotImplementedError(
            "get_from_configuration() is not implemented for this predictor"
        )
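
As a rough illustration of how the interface above is meant to be implemented, the sketch below defines a hypothetical `MeanPredictor` that always predicts the training mean. To stay self-contained it declares a local stand-in for `AbstractPredictor` with the same four abstract methods instead of importing it from `asf`. `MeanPredictor` and its JSON-based persistence are assumptions made for the example, not part of the library.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any


class AbstractPredictor(ABC):
    """Local stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def fit(self, X: Any, Y: Any, **kwargs): ...

    @abstractmethod
    def predict(self, X: Any, **kwargs) -> Any: ...

    @abstractmethod
    def save(self, file_path: str): ...

    @abstractmethod
    def load(self, file_path: str): ...


class MeanPredictor(AbstractPredictor):
    """Hypothetical predictor that always returns the mean training target."""

    def __init__(self):
        self.mean_ = None

    def fit(self, X, Y, **kwargs):
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X, **kwargs):
        return [self.mean_ for _ in X]

    def save(self, file_path: str):
        # Only the fitted state is written; JSON keeps the example simple.
        with open(file_path, "w") as fh:
            json.dump({"mean": self.mean_}, fh)

    def load(self, file_path: str):
        with open(file_path) as fh:
            self.mean_ = json.load(fh)["mean"]
        return self


model = MeanPredictor().fit([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    model.save(path)
    restored = MeanPredictor().load(path)
```

Leaving out any of the four abstract methods makes the subclass impossible to instantiate, which is the main guarantee the base class provides.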

__init__()

Initialize the predictor.

Source code in asf/predictors/abstract_predictor.py
def __init__(self):
    """
    Initialize the predictor.
    """
    pass

fit(X, Y, **kwargs) abstractmethod

Fit the model to the data.

Parameters

X : array-like
    Training data.
Y : array-like
    Target values.

Source code in asf/predictors/abstract_predictor.py
@abstractmethod
def fit(self, X: Any, Y: Any, **kwargs):
    """
    Fit the model to the data.

    Parameters
    ----------
    X : array-like
        Training data.
    Y : array-like
        Target values.
    """
    pass

get_configuration_space()

Get the configuration space for the predictor.

Returns

ConfigurationSpace The configuration space for the predictor.

Source code in asf/predictors/abstract_predictor.py
def get_configuration_space(self):
    """
    Get the configuration space for the predictor.

    Returns
    -------
    ConfigurationSpace
        The configuration space for the predictor.
    """
    raise NotImplementedError(
        "get_configuration_space() is not implemented for this predictor"
    )
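
Because the base implementation raises `NotImplementedError`, calling code can treat configuration support as optional. The sketch below shows one way to probe for it. `BasePredictor`, `TunablePredictor`, and `configuration_space_or_none` are hypothetical stand-ins, and a plain dict stands in for a real `ConfigurationSpace` object.

```python
class BasePredictor:
    """Stand-in with the same default behaviour as the method above."""

    def get_configuration_space(self):
        raise NotImplementedError(
            "get_configuration_space() is not implemented for this predictor"
        )


class TunablePredictor(BasePredictor):
    """Hypothetical subclass that overrides the default."""

    def get_configuration_space(self):
        # A plain dict stands in for a real ConfigurationSpace object.
        return {"n_estimators": (10, 500)}


def configuration_space_or_none(predictor):
    """Return the predictor's configuration space, or None if unsupported."""
    try:
        return predictor.get_configuration_space()
    except NotImplementedError:
        return None


supported = configuration_space_or_none(TunablePredictor())
unsupported = configuration_space_or_none(BasePredictor())
```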

get_from_configuration(configuration) staticmethod

Create a predictor from a configuration.

Returns

AbstractPredictor The predictor.

Source code in asf/predictors/abstract_predictor.py
@staticmethod
def get_from_configuration(configuration):
    """
    Create a predictor from a configuration.

    Parameters
    ----------
    configuration
        The configuration to build the predictor from.

    Returns
    -------
    AbstractPredictor
        The predictor.
    """
    raise NotImplementedError(
        "get_from_configuration() is not implemented for this predictor"
    )

load(file_path) abstractmethod

Load the model from a file.

Parameters

file_path : str Path to the file from which the model will be loaded.

Source code in asf/predictors/abstract_predictor.py
@abstractmethod
def load(self, file_path: str):
    """
    Load the model from a file.

    Parameters
    ----------
    file_path : str
        Path to the file from which the model will be loaded.
    """
    pass

predict(X, **kwargs) abstractmethod

Predict using the model.

Parameters

X : array-like Data to predict on.

Returns

array-like Predicted values.

Source code in asf/predictors/abstract_predictor.py
@abstractmethod
def predict(self, X: Any, **kwargs) -> Any:
    """
    Predict using the model.

    Parameters
    ----------
    X : array-like
        Data to predict on.

    Returns
    -------
    array-like
        Predicted values.
    """
    pass

save(file_path) abstractmethod

Save the model to a file.

Parameters

file_path : str Path to the file where the model will be saved.

Source code in asf/predictors/abstract_predictor.py
@abstractmethod
def save(self, file_path: str):
    """
    Save the model to a file.

    Parameters
    ----------
    file_path : str
        Path to the file where the model will be saved.
    """
    pass

EPMRandomForest

Bases: ForestRegressor, AbstractPredictor

Implementation of random forest as done in the paper "Algorithm runtime prediction: Methods & evaluation" by Hutter, Xu, Hoos, and Leyton-Brown (2014).

Methods

fit(X, Y)
    Fit the model to the data.
predict(X)
    Predict using the model.
save(file_path)
    Save the model to a file.
load(file_path)
    Load the model from a file.

Source code in asf/predictors/epm_random_forest.py
class EPMRandomForest(ForestRegressor, AbstractPredictor):
    """
    Implementation of random forest as done in the paper
    "Algorithm runtime prediction: Methods & evaluation" by Hutter, Xu, Hoos, and Leyton-Brown (2014).

    Methods
    -------
    fit(X, Y)
        Fit the model to the data.
    predict(X)
        Predict using the model.
    save(file_path)
        Save the model to a file.
    load(file_path)
        Load the model from a file.
    """

    def __init__(
        self,
        n_estimators: int = 100,
        *,
        log=False,
        cross_trees_variance=False,
        criterion="squared_error",
        splitter="random",
        max_depth=None,
        min_samples_split=2,
        min_samples_leaf=1,
        min_weight_fraction_leaf=0.0,
        max_features=1.0,
        max_leaf_nodes=None,
        min_impurity_decrease=0.0,
        bootstrap: bool = False,
        oob_score: bool = False,
        n_jobs=None,
        random_state=None,
        verbose: int = 0,
        warm_start: bool = False,
        ccp_alpha=0.0,
        max_samples=None,
        monotonic_cst=None,
    ) -> None:
        super().__init__(
            DecisionTreeRegressor(),
            n_estimators,
            estimator_params=(
                "criterion",
                "max_depth",
                "min_samples_split",
                "min_samples_leaf",
                "min_weight_fraction_leaf",
                "max_features",
                "max_leaf_nodes",
                "min_impurity_decrease",
                "random_state",
                "ccp_alpha",
                "monotonic_cst",
            ),
            bootstrap=bootstrap,
            oob_score=oob_score,
            n_jobs=n_jobs,
            random_state=random_state,
            verbose=verbose,
            warm_start=warm_start,
            max_samples=max_samples,
        )
        self.criterion = criterion
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        self.min_samples_leaf = min_samples_leaf
        self.min_weight_fraction_leaf = min_weight_fraction_leaf
        self.max_features = max_features
        self.max_leaf_nodes = max_leaf_nodes
        self.min_impurity_decrease = min_impurity_decrease
        self.ccp_alpha = ccp_alpha
        self.monotonic_cst = monotonic_cst
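        # NOTE: "splitter" is not listed in estimator_params above, so this value
        # is stored but not forwarded to the underlying DecisionTreeRegressor.
        self.splitter = splitter
        self.log = log
        # Stored so that scikit-learn's get_params()/clone() can find it.
        self.cross_trees_variance = cross_trees_variance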

    def fit(self, X, y, sample_weight=None):
        """
        Fit the model to the data.

        Parameters
        ----------
        X : array-like
            Training data.
        y : array-like
            Target values. If ``log`` is True, these are expected to be
            log-transformed.
        sample_weight : None
            Not supported; must be None.
        """
        assert sample_weight is None, "Sample weights are not supported"
        super().fit(X=X, y=y, sample_weight=sample_weight)

        self.trainX = X
        self.trainY = y
        if self.log:
            # Replace each leaf value (the mean of the log targets) with the
            # log of the mean of the exponentiated targets in that leaf.
            for tree, samples_idx in zip(self.estimators_, self.estimators_samples_):
                curX = X[samples_idx]
                curY = y[samples_idx]
                preds = tree.apply(curX)
                for k in np.unique(preds):
                    tree.tree_.value[k, 0, 0] = np.log(np.exp(curY[preds == k]).mean())

    def predict(self, X):
        """
        Predict using the model.

        Parameters
        ----------
        X : array-like
            Data to predict on.

        Returns
        -------
        tuple of array-like
            Mean and variance of the per-tree predictions, each of shape
            (n_samples, 1).
        """
        # Per-tree predictions, transposed to shape (n_samples, n_estimators).
        preds = np.array([tree.predict(X) for tree in self.estimators_]).T

        means = preds.mean(axis=1)
        variances = preds.var(axis=1)

        return means.reshape(-1, 1), variances.reshape(-1, 1)

    def save(self, file_path: str):
        """
        Save the model to a file.

        Parameters
        ----------
        file_path : str
            Path to the file where the model will be saved.
        """
        import joblib

        joblib.dump(self, file_path)

    def load(self, file_path: str):
        """
        Load the model from a file.

        Parameters
        ----------
        file_path : str
            Path to the file from which the model will be loaded.

        Returns
        -------
        EPMRandomForest
            The loaded model.
        """
        import joblib

        return joblib.load(file_path)

fit(X, y, sample_weight=None)

Fit the model to the data.

Parameters

X : array-like
    Training data.
y : array-like
    Target values.
sample_weight : None
    Not supported; must be None.

Source code in asf/predictors/epm_random_forest.py
def fit(self, X, y, sample_weight=None):
    """
    Fit the model to the data.

    Parameters
    ----------
    X : array-like
        Training data.
    y : array-like
        Target values. If ``log`` is True, these are expected to be
        log-transformed.
    sample_weight : None
        Not supported; must be None.
    """
    assert sample_weight is None, "Sample weights are not supported"
    super().fit(X=X, y=y, sample_weight=sample_weight)

    self.trainX = X
    self.trainY = y
    if self.log:
        # Replace each leaf value (the mean of the log targets) with the
        # log of the mean of the exponentiated targets in that leaf.
        for tree, samples_idx in zip(self.estimators_, self.estimators_samples_):
            curX = X[samples_idx]
            curY = y[samples_idx]
            preds = tree.apply(curX)
            for k in np.unique(preds):
                tree.tree_.value[k, 0, 0] = np.log(np.exp(curY[preds == k]).mean())
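
When `log=True`, the loop above overwrites each leaf value with the log of the arithmetic mean of the exponentiated targets, which presumes that the targets passed to `fit` are natural-log runtimes. The sketch below, using made-up numbers and a hypothetical helper `log_mean_exp`, contrasts that value with the plain mean of the log targets that a squared-error regression tree stores by default.

```python
import math


def log_mean_exp(log_values):
    """Log of the arithmetic mean of exp(v) over the given log-scale values."""
    return math.log(sum(math.exp(v) for v in log_values) / len(log_values))


# Made-up log-runtimes of the training samples that fall into one leaf.
leaf_log_runtimes = [math.log(1.0), math.log(10.0), math.log(100.0)]

plain_mean = sum(leaf_log_runtimes) / len(leaf_log_runtimes)  # mean in log space
corrected = log_mean_exp(leaf_log_runtimes)  # log of the mean runtime

print(f"mean of logs:        {plain_mean:.4f}")
print(f"log of mean runtime: {corrected:.4f}")
```

By Jensen's inequality the corrected value is never smaller than the plain mean, so leaves containing a few very long runtimes are pulled upwards.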

load(file_path)

Load the model from a file.

Parameters

file_path : str Path to the file from which the model will be loaded.

Returns

EPMRandomForest The loaded model.

Source code in asf/predictors/epm_random_forest.py
def load(self, file_path: str):
    """
    Load the model from a file.

    Parameters
    ----------
    file_path : str
        Path to the file from which the model will be loaded.

    Returns
    -------
    EPMRandomForest
        The loaded model.
    """
    import joblib

    return joblib.load(file_path)
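
Note that `load` returns the deserialized object instead of updating the instance it is called on, so the return value has to be kept. The sketch below illustrates that calling pattern with a hypothetical `TinyModel`; JSON stands in for `joblib` purely to keep the example dependency-free.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical model whose load() returns a new object, like the method above."""

    def __init__(self, weight=0.0):
        self.weight = weight

    def save(self, file_path: str):
        with open(file_path, "w") as fh:
            json.dump({"weight": self.weight}, fh)

    def load(self, file_path: str):
        with open(file_path) as fh:
            return TinyModel(**json.load(fh))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    TinyModel(weight=3.5).save(path)

    placeholder = TinyModel()
    loaded = placeholder.load(path)  # keep the return value

# The instance that load() was called on is left unchanged.
print(placeholder.weight, loaded.weight)
```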

predict(X)

Predict using the model.

Parameters

X : array-like Data to predict on.

Returns

tuple of array-like
    Mean and variance of the per-tree predictions, each of shape (n_samples, 1).

Source code in asf/predictors/epm_random_forest.py
def predict(self, X):
    """
    Predict using the model.

    Parameters
    ----------
    X : array-like
        Data to predict on.

    Returns
    -------
    tuple of array-like
        Mean and variance of the per-tree predictions, each of shape
        (n_samples, 1).
    """
    # Per-tree predictions, transposed to shape (n_samples, n_estimators).
    preds = np.array([tree.predict(X) for tree in self.estimators_]).T

    means = preds.mean(axis=1)
    variances = preds.var(axis=1)

    return means.reshape(-1, 1), variances.reshape(-1, 1)
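
The aggregation step above can be illustrated in isolation. The sketch below uses a made-up matrix of per-tree predictions in place of a fitted forest, and the helper name `aggregate_tree_predictions` is an assumption for the example. It returns the per-sample mean and the population variance across trees as column vectors.

```python
import numpy as np


def aggregate_tree_predictions(per_tree_preds):
    """Mean and variance across trees; input shape (n_trees, n_samples)."""
    preds = np.asarray(per_tree_preds, dtype=float).T  # (n_samples, n_trees)
    means = preds.mean(axis=1)
    variances = preds.var(axis=1)  # population variance (ddof=0)
    return means.reshape(-1, 1), variances.reshape(-1, 1)


# Made-up predictions from three trees for two samples.
per_tree = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
means, variances = aggregate_tree_predictions(per_tree)
```

A variance of zero, as for the second sample here, simply means that all trees agree; it says nothing about how accurate that shared prediction is.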

save(file_path)

Save the model to a file.

Parameters

file_path : str Path to the file where the model will be saved.

Source code in asf/predictors/epm_random_forest.py
def save(self, file_path: str):
    """
    Save the model to a file.

    Parameters
    ----------
    file_path : str
        Path to the file where the model will be saved.
    """
    import joblib

    joblib.dump(self, file_path)

RankingMLP

Bases: AbstractPredictor

Source code in asf/predictors/ranking_mlp.py
class RankingMLP(AbstractPredictor):
    def __init__(
        self,
        model: torch.nn.Module | None = None,
        input_size: int | None = None,
        loss: Callable | None = bpr_loss,
        optimizer: torch.optim.Optimizer | None = torch.optim.Adam,
        batch_size: int = 128,
        epochs: int = 500,
        seed: int = 42,
        device: str = "cpu",
        compile=True,
        **kwargs,
    ):
        """
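        Initializes the RankingMLP with the given parameters.

        Args:
            model: The model to be used. If None, an MLP is built via
                get_mlp(input_size=input_size, output_size=1).
            input_size: Number of input features; required if model is None.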
        """
        super().__init__(**kwargs)
        assert TORCH_AVAILABLE, "PyTorch is not available. Please install it."

        assert model is not None or input_size is not None, (
            "Either model or input_size must be provided."
        )

        torch.manual_seed(seed)

        if model is None:
            self.model = get_mlp(input_size=input_size, output_size=1)
        else:
            self.model = model

        self.model.to(device)
        self.device = device

        self.loss = loss
        self.batch_size = batch_size
        self.optimizer = optimizer
        self.epochs = epochs

        if compile:
            self.model = torch.compile(self.model)

    def _get_dataloader(
        self,
        features: pd.DataFrame,
        performance: pd.DataFrame,
        algorithm_features: pd.DataFrame,
    ):
        dataset = RankingDataset(features, performance, algorithm_features)
        return torch.utils.data.DataLoader(
            dataset, batch_size=self.batch_size, shuffle=True, num_workers=4
        )

    def fit(
        self,
        features: pd.DataFrame,
        performance: pd.DataFrame,
        algorithm_features: pd.DataFrame,
    ):
        """
        Fits the model to the given feature and performance data.

        Args:
            features: DataFrame containing the feature data.
            performance: DataFrame containing the performance data.
            algorithm_features: DataFrame containing the algorithm feature data.
        """

        print(self.model)
        dataloader = self._get_dataloader(features, performance, algorithm_features)

        optimizer = self.optimizer(self.model.parameters())
        self.model.train()
        for epoch in range(self.epochs):
            total_loss = 0
            # Each batch holds three inputs and their matching performance values.
            for (Xc, Xs, Xl), (yc, ys, yl) in dataloader:
                Xc, Xs, Xl = Xc.to(self.device), Xs.to(self.device), Xl.to(self.device)
                yc, ys, yl = yc.to(self.device), ys.to(self.device), yl.to(self.device)

                yc = yc.float().unsqueeze(1)
                ys = ys.float().unsqueeze(1)
                yl = yl.float().unsqueeze(1)

                optimizer.zero_grad()

                y_pred = self.model(Xc)
                y_pred_s = self.model(Xs)
                y_pred_l = self.model(Xl)

                loss = self.loss(y_pred, y_pred_s, y_pred_l, yc, ys, yl)
                total_loss += loss.item()

                loss.backward()
                optimizer.step()
            print(f"Epoch {epoch}, Loss: {total_loss / len(dataloader)}")

        return self

    def predict(self, features: pd.DataFrame):
        """
        Predicts the performance of algorithms for the given features.

        Args:
            features: DataFrame containing the feature data.

        Returns:
            NumPy array containing the predicted performance data.
        """
        self.model.eval()

        features = torch.from_numpy(features.values).to(self.device).float()
        # Move to the CPU before converting; .numpy() fails on non-CPU tensors.
        predictions = self.model(features).detach().cpu().numpy()

        return predictions

    def save(self, file_path):
        torch.save(self.model, file_path)

    def load(self, file_path):
        # Replace the current model with the one stored at file_path.
        self.model = torch.load(file_path)

__init__(model=None, input_size=None, loss=bpr_loss, optimizer=torch.optim.Adam, batch_size=128, epochs=500, seed=42, device='cpu', compile=True, **kwargs)

Initializes the RankingMLP with the given parameters.

Parameters:

Name    Type            Description             Default
model   Module | None   The model to be used.   None

Source code in asf/predictors/ranking_mlp.py
def __init__(
    self,
    model: torch.nn.Module | None = None,
    input_size: int | None = None,
    loss: Callable | None = bpr_loss,
    optimizer: torch.optim.Optimizer | None = torch.optim.Adam,
    batch_size: int = 128,
    epochs: int = 500,
    seed: int = 42,
    device: str = "cpu",
    compile=True,
    **kwargs,
):
    """
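    Initializes the RankingMLP with the given parameters.

    Args:
        model: The model to be used. If None, an MLP is built via
            get_mlp(input_size=input_size, output_size=1).
        input_size: Number of input features; required if model is None.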
    """
    super().__init__(**kwargs)
    assert TORCH_AVAILABLE, "PyTorch is not available. Please install it."

    assert model is not None or input_size is not None, (
        "Either model or input_size must be provided."
    )

    torch.manual_seed(seed)

    if model is None:
        self.model = get_mlp(input_size=input_size, output_size=1)
    else:
        self.model = model

    self.model.to(device)
    self.device = device

    self.loss = loss
    self.batch_size = batch_size
    self.optimizer = optimizer
    self.epochs = epochs

    if compile:
        self.model = torch.compile(self.model)

fit(features, performance, algorithm_features)

Fits the model to the given feature and performance data.

Parameters:

Name                 Type        Description                                        Default
features             DataFrame   DataFrame containing the feature data.             required
performance          DataFrame   DataFrame containing the performance data.         required
algorithm_features   DataFrame   DataFrame containing the algorithm feature data.   required

Source code in asf/predictors/ranking_mlp.py
def fit(
    self,
    features: pd.DataFrame,
    performance: pd.DataFrame,
    algorithm_features: pd.DataFrame,
):
    """
    Fits the model to the given feature and performance data.

    Args:
        features: DataFrame containing the feature data.
        performance: DataFrame containing the performance data.
        algorithm_features: DataFrame containing the algorithm feature data.
    """

    print(self.model)
    dataloader = self._get_dataloader(features, performance, algorithm_features)

    optimizer = self.optimizer(self.model.parameters())
    self.model.train()
    for epoch in range(self.epochs):
        total_loss = 0
        # Each batch holds three inputs and their matching performance values.
        for (Xc, Xs, Xl), (yc, ys, yl) in dataloader:
            Xc, Xs, Xl = Xc.to(self.device), Xs.to(self.device), Xl.to(self.device)
            yc, ys, yl = yc.to(self.device), ys.to(self.device), yl.to(self.device)

            yc = yc.float().unsqueeze(1)
            ys = ys.float().unsqueeze(1)
            yl = yl.float().unsqueeze(1)

            optimizer.zero_grad()

            y_pred = self.model(Xc)
            y_pred_s = self.model(Xs)
            y_pred_l = self.model(Xl)

            loss = self.loss(y_pred, y_pred_s, y_pred_l, yc, ys, yl)
            total_loss += loss.item()

            loss.backward()
            optimizer.step()
        print(f"Epoch {epoch}, Loss: {total_loss / len(dataloader)}")

    return self
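
The default `loss` is `bpr_loss`, whose implementation is not shown on this page. For orientation only, and assuming the name refers to Bayesian Personalized Ranking, the sketch below gives the textbook pairwise form `-log(sigmoid(s_preferred - s_other))`. The function name `pairwise_bpr` and the scores are made up, and the library's six-argument `bpr_loss` may well differ.

```python
import math


def pairwise_bpr(score_preferred: float, score_other: float) -> float:
    """Textbook BPR term: -log(sigmoid(score_preferred - score_other))."""
    diff = score_preferred - score_other
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch to avoid overflow in exp.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


well_ordered = pairwise_bpr(2.0, 0.0)  # preferred item scored higher
tied = pairwise_bpr(1.0, 1.0)          # no separation between the two
mis_ordered = pairwise_bpr(0.0, 2.0)   # preferred item scored lower
```

The term shrinks towards zero as the preferred item's score pulls ahead, so minimizing it only constrains the ordering of scores, not their absolute values.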

predict(features)

Predicts the performance of algorithms for the given features.

Parameters:

Name       Type        Description                              Default
features   DataFrame   DataFrame containing the feature data.   required

Returns:

Type      Description
ndarray   NumPy array containing the predicted performance data.

Source code in asf/predictors/ranking_mlp.py
def predict(self, features: pd.DataFrame):
    """
    Predicts the performance of algorithms for the given features.

    Args:
        features: DataFrame containing the feature data.

    Returns:
        NumPy array containing the predicted performance data.
    """
    self.model.eval()

    features = torch.from_numpy(features.values).to(self.device).float()
    # Move to the CPU before converting; .numpy() fails on non-CPU tensors.
    predictions = self.model(features).detach().cpu().numpy()

    return predictions

RegressionMLP

Bases: AbstractPredictor

Source code in asf/predictors/regression_mlp.py
class RegressionMLP(AbstractPredictor):
    def __init__(
        self,
        model: torch.nn.Module | None = None,
        input_size: int | None = None,
        loss: torch.nn.modules.loss._Loss | None = torch.nn.MSELoss(),
        optimizer: torch.optim.Optimizer | None = torch.optim.Adam,
        batch_size: int = 128,
        epochs: int = 2000,
        seed: int = 42,
        device: str = "cpu",
        compile=True,
        **kwargs,
    ):
        """
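        Initializes the RegressionMLP with the given parameters.

        Args:
            model: The model to be used. If None, an MLP is built via
                get_mlp(input_size=input_size, output_size=1).
            input_size: Number of input features; required if model is None.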
        """
        super().__init__(**kwargs)

        assert TORCH_AVAILABLE, "PyTorch is not available. Please install it."
        assert model is not None or input_size is not None, (
            "Either model or input_size must be provided."
        )

        torch.manual_seed(seed)

        if model is None:
            self.model = get_mlp(input_size=input_size, output_size=1)
        else:
            self.model = model

        self.model.to(device)
        self.device = device

        self.loss = loss
        self.batch_size = batch_size
        self.optimizer = optimizer
        self.epochs = epochs

        if compile:
            self.model = torch.compile(self.model)

    def _get_dataloader(self, features: pd.DataFrame, performance: pd.DataFrame):
        dataset = RegressionDataset(features, performance)
        return torch.utils.data.DataLoader(
            dataset, batch_size=self.batch_size, shuffle=True
        )

    def fit(self, features: pd.DataFrame, performance: pd.DataFrame):
        """
        Fits the model to the given feature and performance data.

        Args:
            features: DataFrame containing the feature data.
            performance: DataFrame containing the performance data.
        """

        features = pd.DataFrame(
            SimpleImputer().fit_transform(features.values),
            index=features.index,
            columns=features.columns,
        )
        dataloader = self._get_dataloader(features, performance)

        optimizer = self.optimizer(self.model.parameters())
        self.model.train()
        for epoch in range(self.epochs):
            total_loss = 0
            for i, (X, y) in enumerate(dataloader):
                X, y = X.to(self.device), y.to(self.device)
                y = y.unsqueeze(-1)
                optimizer.zero_grad()
                y_pred = self.model(X)
                loss = self.loss(y_pred, y)
                total_loss += loss.item()
                loss.backward()
                optimizer.step()

        return self

    def predict(self, features: pd.DataFrame):
        """
        Predicts the performance of algorithms for the given features.

        Args:
            features: DataFrame containing the feature data.

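        Returns:
            NumPy array containing the predicted performance data.
        """
        self.model.eval()

        # Cast to float32 so the input dtype matches the model parameters.
        features = torch.from_numpy(features.values).to(self.device).float()
        # Move to the CPU before converting; .numpy() fails on non-CPU tensors.
        predictions = self.model(features).detach().cpu().numpy()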

        return predictions

    def save(self, file_path):
        torch.save(self.model, file_path)

    def load(self, file_path):
        # Replace the current model with the one stored at file_path.
        self.model = torch.load(file_path)

__init__(model=None, input_size=None, loss=torch.nn.MSELoss(), optimizer=torch.optim.Adam, batch_size=128, epochs=2000, seed=42, device='cpu', compile=True, **kwargs)

Initializes the RegressionMLP with the given parameters.

Parameters:

Name    Type            Description             Default
model   Module | None   The model to be used.   None

Source code in asf/predictors/regression_mlp.py
def __init__(
    self,
    model: torch.nn.Module | None = None,
    input_size: int | None = None,
    loss: torch.nn.modules.loss._Loss | None = torch.nn.MSELoss(),
    optimizer: torch.optim.Optimizer | None = torch.optim.Adam,
    batch_size: int = 128,
    epochs: int = 2000,
    seed: int = 42,
    device: str = "cpu",
    compile=True,
    **kwargs,
):
    """
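    Initializes the RegressionMLP with the given parameters.

    Args:
        model: The model to be used. If None, an MLP is built via
            get_mlp(input_size=input_size, output_size=1).
        input_size: Number of input features; required if model is None.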
    """
    super().__init__(**kwargs)

    assert TORCH_AVAILABLE, "PyTorch is not available. Please install it."
    assert model is not None or input_size is not None, (
        "Either model or input_size must be provided."
    )

    torch.manual_seed(seed)

    if model is None:
        self.model = get_mlp(input_size=input_size, output_size=1)
    else:
        self.model = model

    self.model.to(device)
    self.device = device

    self.loss = loss
    self.batch_size = batch_size
    self.optimizer = optimizer
    self.epochs = epochs

    if compile:
        self.model = torch.compile(self.model)

fit(features, performance)

Fits the model to the given feature and performance data.

Parameters:

Name          Type        Description                                  Default
features      DataFrame   DataFrame containing the feature data.       required
performance   DataFrame   DataFrame containing the performance data.   required

Source code in asf/predictors/regression_mlp.py
def fit(self, features: pd.DataFrame, performance: pd.DataFrame):
    """
    Fits the model to the given feature and performance data.

    Args:
        features: DataFrame containing the feature data.
        performance: DataFrame containing the performance data.
    """

    features = pd.DataFrame(
        SimpleImputer().fit_transform(features.values),
        index=features.index,
        columns=features.columns,
    )
    dataloader = self._get_dataloader(features, performance)

    optimizer = self.optimizer(self.model.parameters())
    self.model.train()
    for epoch in range(self.epochs):
        total_loss = 0
        for i, (X, y) in enumerate(dataloader):
            X, y = X.to(self.device), y.to(self.device)
            y = y.unsqueeze(-1)
            optimizer.zero_grad()
            y_pred = self.model(X)
            loss = self.loss(y_pred, y)
            total_loss += loss.item()
            loss.backward()
            optimizer.step()

    return self
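
`fit` runs the features through `SimpleImputer()` before training, which with its default settings replaces each missing value by the mean of its column. The NumPy sketch below reproduces that behaviour on made-up data; `impute_column_means` is a hypothetical helper, not part of the library. `predict` as shown on this page does not apply the same step, so inputs passed to it presumably need to be free of missing values already.

```python
import numpy as np


def impute_column_means(values):
    """Replace NaNs with the mean of the observed entries in the same column."""
    arr = np.array(values, dtype=float)  # copy so the input is not modified
    col_means = np.nanmean(arr, axis=0)
    rows, cols = np.where(np.isnan(arr))
    arr[rows, cols] = col_means[cols]
    return arr


# Made-up feature matrix with one missing value in each column.
features = [[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]]
imputed = impute_column_means(features)
```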

predict(features)

Predicts the performance of algorithms for the given features.

Parameters:

Name       Type        Description                              Default
features   DataFrame   DataFrame containing the feature data.   required

Returns:

Type      Description
ndarray   NumPy array containing the predicted performance data.

Source code in asf/predictors/regression_mlp.py
def predict(self, features: pd.DataFrame):
    """
    Predicts the performance of algorithms for the given features.

    Args:
        features: DataFrame containing the feature data.

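    Returns:
        NumPy array containing the predicted performance data.
    """
    self.model.eval()

    # Cast to float32 so the input dtype matches the model parameters.
    features = torch.from_numpy(features.values).to(self.device).float()
    # Move to the CPU before converting; .numpy() fails on non-CPU tensors.
    predictions = self.model(features).detach().cpu().numpy()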

    return predictions

SklearnWrapper

Bases: AbstractPredictor

A generic wrapper for scikit-learn models.

This class allows scikit-learn models to be used with the ASF framework.

Methods

fit(X, Y)
    Fit the model to the data.
predict(X)
    Predict using the model.
save(file_path)
    Save the model to a file.
load(file_path)
    Load the model from a file.

Source code in asf/predictors/sklearn_wrapper.py
class SklearnWrapper(AbstractPredictor):
    """
    A generic wrapper for scikit-learn models.

    This class allows scikit-learn models to be used with the ASF framework.

    Methods
    -------
    fit(X, Y)
        Fit the model to the data.
    predict(X)
        Predict using the model.
    save(file_path)
        Save the model to a file.
    load(file_path)
        Load the model from a file.
    """

    def __init__(self, model_class: ClassifierMixin, init_params: dict = {}):
        """
        Initialize the wrapper with a scikit-learn model.

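        Parameters
        ----------
        model_class : ClassifierMixin
            A scikit-learn model class (not an instance); it is instantiated
            here as ``model_class(**init_params)``.
        init_params : dict, optional
            Keyword arguments passed to the model class constructor.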
        """
        self.model_class = model_class(**init_params)

    def fit(self, X, Y, sample_weight=None, **kwargs):
        """
        Fit the model to the data.

        Parameters
        ----------
        X : array-like
            Training data.
        Y : array-like
            Target values.
        """
        self.model_class.fit(X, Y, sample_weight=sample_weight, **kwargs)

    def predict(self, X, **kwargs):
        """
        Predict using the model.

        Parameters
        ----------
        X : array-like
            Data to predict on.

        Returns
        -------
        array-like
            Predicted values.
        """
        return self.model_class.predict(X, **kwargs)

    def save(self, file_path: str):
        """
        Save the model to a file.

        Parameters
        ----------
        file_path : str
            Path to the file where the model will be saved.
        """
        import joblib

        joblib.dump(self, file_path)

    def load(self, file_path: str):
        """
        Load the model from a file.

        Parameters
        ----------
        file_path : str
            Path to the file from which the model will be loaded.

        Returns
        -------
        SklearnWrapper
            The loaded model.
        """
        import joblib

        return joblib.load(file_path)

__init__(model_class, init_params={})

Initialize the wrapper with a scikit-learn model.

Parameters

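model_class : ClassifierMixin
    A scikit-learn model class (not an instance); it is instantiated as model_class(**init_params).
init_params : dict, optional
    Keyword arguments passed to the model class constructor.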

Source code in asf/predictors/sklearn_wrapper.py
def __init__(self, model_class: ClassifierMixin, init_params: dict = {}):
    """
    Initialize the wrapper with a scikit-learn model.

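    Parameters
    ----------
    model_class : ClassifierMixin
        A scikit-learn model class (not an instance); it is instantiated
        here as ``model_class(**init_params)``.
    init_params : dict, optional
        Keyword arguments passed to the model class constructor.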
    """
    self.model_class = model_class(**init_params)
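
Since the constructor calls `model_class(**init_params)`, it is handed a class together with its keyword arguments, and the resulting instance is what `fit` and `predict` delegate to. The sketch below mimics that pattern with a local `Wrapper` and a toy `ConstantModel`; both are stand-ins invented for the example so that it runs without scikit-learn.

```python
class ConstantModel:
    """Toy estimator with a scikit-learn-like fit/predict interface."""

    def __init__(self, offset=0.0):
        self.offset = offset

    def fit(self, X, Y, sample_weight=None):
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X):
        return [self.mean_ + self.offset for _ in X]


class Wrapper:
    """Stand-in that mirrors the constructor and delegation shown above."""

    def __init__(self, model_class, init_params=None):
        # Instantiate the class here; callers pass the class, not an instance.
        self.model_class = model_class(**(init_params or {}))

    def fit(self, X, Y, sample_weight=None, **kwargs):
        self.model_class.fit(X, Y, sample_weight=sample_weight, **kwargs)

    def predict(self, X, **kwargs):
        return self.model_class.predict(X, **kwargs)


wrapper = Wrapper(ConstantModel, init_params={"offset": 1.0})
wrapper.fit([[0], [1]], [2.0, 4.0])
predictions = wrapper.predict([[5], [6], [7]])
```

With scikit-learn installed, the analogous call would presumably look like `SklearnWrapper(RandomForestClassifier, {"n_estimators": 50})`, passing the class itself rather than `RandomForestClassifier()`.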

fit(X, Y, sample_weight=None, **kwargs)

Fit the model to the data.

Parameters

X : array-like
    Training data.
Y : array-like
    Target values.

Source code in asf/predictors/sklearn_wrapper.py
def fit(self, X, Y, sample_weight=None, **kwargs):
    """
    Fit the model to the data.

    Parameters
    ----------
    X : array-like
        Training data.
    Y : array-like
        Target values.
    """
    self.model_class.fit(X, Y, sample_weight=sample_weight, **kwargs)

load(file_path)

Load the model from a file.

Parameters

file_path : str Path to the file from which the model will be loaded.

Returns

SklearnWrapper The loaded model.

Source code in asf/predictors/sklearn_wrapper.py
def load(self, file_path: str):
    """
    Load the model from a file.

    Parameters
    ----------
    file_path : str
        Path to the file from which the model will be loaded.

    Returns
    -------
    SklearnWrapper
        The loaded model.
    """
    import joblib

    return joblib.load(file_path)

predict(X, **kwargs)

Predict using the model.

Parameters

X : array-like Data to predict on.

Returns

array-like Predicted values.

Source code in asf/predictors/sklearn_wrapper.py
def predict(self, X, **kwargs):
    """
    Predict using the model.

    Parameters
    ----------
    X : array-like
        Data to predict on.

    Returns
    -------
    array-like
        Predicted values.
    """
    return self.model_class.predict(X, **kwargs)

save(file_path)

Save the model to a file.

Parameters

file_path : str Path to the file where the model will be saved.

Source code in asf/predictors/sklearn_wrapper.py
def save(self, file_path: str):
    """
    Save the model to a file.

    Parameters
    ----------
    file_path : str
        Path to the file where the model will be saved.
    """
    import joblib

    joblib.dump(self, file_path)