Scenario

read_aslib_scenario(path, add_running_time_features=True, training_par_factor=10.0)

Read an ASlib scenario from a directory.

Parameters

path : str
    The path to the ASlib scenario directory.
add_running_time_features : bool, default=True
    Whether to include running time features (feature costs).
training_par_factor : float or None, default=10.0
    PAR factor to apply to training performance data. Timeouts (values > budget) are replaced with budget * training_par_factor. Set to None to disable.
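The PAR rule described for training_par_factor can be illustrated in isolation. The following is a minimal sketch on a plain list of runtimes, not the reader's actual implementation: the helper name apply_par and the sample values are hypothetical, and the real function presumably operates on the performance DataFrame.

```python
def apply_par(runtimes, budget, par_factor=10.0):
    """Replace timeouts (values > budget) with budget * par_factor.

    Passing par_factor=None disables the penalty and returns a plain copy.
    """
    if par_factor is None:
        return list(runtimes)
    penalty = budget * par_factor
    return [penalty if t > budget else t for t in runtimes]


# Hypothetical runtimes in seconds with a 100-second cutoff.
runtimes = [12.5, 100.0, 250.0]
penalized = apply_par(runtimes, budget=100.0, par_factor=10.0)
print(penalized)
```

Only the value strictly above the budget is penalized in this sketch; a run that finishes exactly at the cutoff is left unchanged, matching the "values > budget" wording above.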

Returns

tuple
    A tuple containing (features, performance, features_running_time, cv, feature_groups, maximize, budget, algorithm_features) where:
    - features: pd.DataFrame of feature values per instance.
    - performance: pd.DataFrame of algorithm performance per instance.
    - features_running_time: pd.DataFrame of feature costs per instance.
    - cv: pd.DataFrame of cross-validation fold assignments.
    - feature_groups: dict of feature group definitions.
    - maximize: bool, True if higher performance values are better.
    - budget: float, the algorithm cutoff time.
    - algorithm_features: pd.DataFrame of algorithm features, or None.
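Because the result is a positional 8-tuple, callers may find it easier to work with named fields. The sketch below wraps the tuple in a NamedTuple whose field order mirrors the documented return order. The class name AslibScenario is hypothetical and not part of the library, and placeholder strings stand in for the DataFrames so the snippet runs without a scenario on disk.

```python
from typing import Any, NamedTuple


class AslibScenario(NamedTuple):
    """Hypothetical named wrapper; field order mirrors the documented tuple."""

    features: Any
    performance: Any
    features_running_time: Any
    cv: Any
    feature_groups: dict
    maximize: bool
    budget: float
    algorithm_features: Any = None


# Placeholder values stand in for what a real call would return.
raw = ("features", "performance", "feature_costs", "cv", {}, False, 5000.0, None)
scenario = AslibScenario(*raw)
print(scenario.budget, scenario.maximize)
```

With the library installed, the same wrapper could presumably be filled with `AslibScenario(*read_aslib_scenario(path))`, since the tuple is unpacked positionally.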

Raises

ImportError
    If the required libraries (pyyaml, liac-arff) are not available.
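The ImportError comes from an availability flag checked at call time. How the library sets that flag is not shown here, so the following is only one possible way to implement such an optional-dependency guard. The helpers _has_module and require_aslib are hypothetical; the import names yaml and arff are the modules that pyyaml and liac-arff install.

```python
import importlib.util


def _has_module(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


# pyyaml is imported as `yaml`; liac-arff is imported as `arff`.
ASLIB_AVAILABLE = _has_module("yaml") and _has_module("arff")


def require_aslib(available: bool = ASLIB_AVAILABLE) -> None:
    """Raise ImportError when the optional ASlib dependencies are missing."""
    if not available:
        raise ImportError(
            "The aslib reader requires 'pyyaml' and 'liac-arff'. "
            "Install them via 'pip install asf[aslib]'."
        )
```

Deferring the check to call time means the rest of the package can still be imported when the optional extras are absent; only the ASlib reader fails, with a message that names the fix.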

Source code in asf/scenario/aslib_reader.py
def read_aslib_scenario(
    path: str,
    add_running_time_features: bool = True,
    training_par_factor: float | None = 10.0,
) -> tuple[
    pd.DataFrame,
    pd.DataFrame,
    pd.DataFrame,
    pd.DataFrame,
    dict[str, Any],
    bool,
    float,
    pd.DataFrame | None,
]:
    """
    Read an ASlib scenario from a directory.

    Parameters
    ----------
    path : str
        The path to the ASlib scenario directory.
    add_running_time_features : bool, default=True
        Whether to include running time features (feature costs).
    training_par_factor : float or None, default=10.0
        PAR factor to apply to training performance data. Timeouts (values > budget)
        are replaced with budget * training_par_factor. Set to None to disable.

    Returns
    -------
    tuple
        A tuple containing (features, performance, features_running_time, cv,
        feature_groups, maximize, budget, algorithm_features) where:
        - features: pd.DataFrame of feature values per instance.
        - performance: pd.DataFrame of algorithm performance per instance.
        - features_running_time: pd.DataFrame of feature costs per instance.
        - cv: pd.DataFrame of cross-validation fold assignments.
        - feature_groups: dict of feature group definitions.
        - maximize: bool, True if higher performance values are better.
        - budget: float, the algorithm cutoff time.
        - algorithm_features: pd.DataFrame of algorithm features, or None.

    Raises
    ------
    ImportError
        If the required libraries (pyyaml, liac-arff) are not available.
    """
    if not ASLIB_AVAILABLE:
        raise ImportError(
            "The aslib reader requires 'pyyaml' and 'liac-arff'. "
            "Install them via 'pip install asf[aslib]'."
        )

    description, budget, maximize, feature_groups, algorithm_feature_groups = (
        _load_aslib_description(path)
    )

    performance = _load_aslib_performance(
        path,
        description["performance_measures"][0],
        budget,
        training_par_factor,
    )

    features, features_running_time = _load_aslib_features(
        path, feature_groups, add_running_time_features
    )

    cv = _load_aslib_cv(path)

    algorithm_features = _load_aslib_algorithm_features(path, algorithm_feature_groups)

    return (
        features,
        performance,
        features_running_time,
        cv,
        feature_groups,
        maximize,
        budget,
        algorithm_features,
    )