Survival
-
class
rulekit.survival.SurvivalRules(survival_time_attr: Optional[str] = None, minsupp_new: int = 5, max_growing: int = 0.0, enable_pruning: bool = True, ignore_missing: bool = False, max_uncovered_fraction: float = 0.0, select_best_candidate: bool = False, min_rule_covered: Optional[int] = None)
Survival model.
- Parameters:
survival_time_attr (str) – name of the column containing survival time data (use when the data passed to the model is a pandas DataFrame).
minsupp_new (int = 5) – positive integer representing the minimum number of previously uncovered examples to be covered by a new rule (positive examples for classification problems); default: 5.
max_growing (int = 0.0) – non-negative integer representing the maximum number of conditions which can be added to a rule in the growing phase (use this parameter for large datasets if execution time is prohibitive); 0 indicates no limit; default: 0.
enable_pruning (bool = True) – enable or disable pruning; default: True.
ignore_missing (bool = False) – boolean telling whether missing values should be ignored (by default, a missing value of a given attribute is always considered as not fulfilling the condition built upon that attribute); default: False.
max_uncovered_fraction (float = 0.0) – floating-point number from the [0,1] interval representing the maximum fraction of examples that may remain uncovered by the rule set; default: 0.0.
select_best_candidate (bool = False) – flag determining whether the best candidate should be selected from the growing phase; default: False.
min_rule_covered (int = None) – alias for minsupp_new. This parameter is deprecated and will be removed in the next major version; use minsupp_new.
Deprecated since version 1.7.0: Use parameter minsupp_new instead.
-
fit(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], labels: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], survival_time: Optional[Union[numpy.ndarray, pandas.core.frame.DataFrame, list]] = None) → rulekit.survival.SurvivalRules
Train the model on the given dataset.
- Parameters:
values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – survival status
survival_time (rulekit.operator.Data) – survival time data. Can be omitted when the survival_time_attr parameter was specified.
- Returns:
self
- Return type:
rulekit.survival.SurvivalRules
-
get_coverage_matrix(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list]) → numpy.ndarray
Calculates the coverage matrix for the rule set.
- Parameters:
values (rulekit.operator.Data) – dataset
- Returns:
coverage_matrix – Each row of the matrix represents a single example from the dataset and each column represents one rule from the rule set. A value of 1 in a matrix cell means that the rule covers the example; a value of 0 means that it does not.
- Return type:
np.ndarray
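The layout described above can be illustrated with a small hand-written matrix. This is a minimal sketch using plain Python lists; the matrix values and the helper names `rule_coverage_counts` and `uncovered_examples` are hypothetical and are not part of rulekit.

```python
# Hypothetical coverage matrix: 4 examples (rows) x 3 rules (columns).
# 1 means the rule covers the example, 0 means it does not.
coverage_matrix = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]


def rule_coverage_counts(matrix):
    """Number of examples covered by each rule (column sums)."""
    return [sum(column) for column in zip(*matrix)]


def uncovered_examples(matrix):
    """Indices of examples that no rule covers (all-zero rows)."""
    return [index for index, row in enumerate(matrix) if not any(row)]


counts = rule_coverage_counts(coverage_matrix)
uncovered = uncovered_examples(coverage_matrix)
uncovered_fraction = len(uncovered) / len(coverage_matrix)
```

The fraction of all-zero rows is the quantity that the max_uncovered_fraction parameter is described as bounding on the training set.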
-
get_params() → dict
- Returns:
hyperparameters – Dictionary containing model hyperparameters.
- Return type:
dict
-
predict(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list]) → numpy.ndarray
Perform prediction and return the estimated survival function for each example.
- Parameters:
values (rulekit.operator.Data) – attributes
- Returns:
result – Each row represents a single example from the dataset and contains the estimated survival function for that example. The estimated survival function is returned as a dictionary containing times and corresponding probabilities.
- Return type:
np.ndarray
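A survival function stored as times and probabilities can be read as a step function. The sketch below assumes the dictionary keys are named `times` and `probabilities`, that times are sorted in increasing order, and that the probability is 1.0 before the first listed time; these are assumptions made for illustration, and `survival_probability` is a hypothetical helper, not a rulekit function.

```python
from bisect import bisect_right


def survival_probability(survival_function, t):
    """Evaluate a right-continuous step survival function at time t."""
    times = survival_function["times"]
    probabilities = survival_function["probabilities"]
    # Index of the last listed time that is <= t.
    index = bisect_right(times, t) - 1
    if index < 0:
        # Assumption: before the first listed time everyone is still at risk.
        return 1.0
    return probabilities[index]


# Hypothetical entry shaped like one element of the predicted array.
example = {"times": [1.0, 3.0, 7.0], "probabilities": [0.9, 0.6, 0.2]}

before_first = survival_probability(example, 0.5)
at_listed_time = survival_probability(example, 3.0)
between_times = survival_probability(example, 5.0)
after_last = survival_probability(example, 10.0)
```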
-
score(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], labels: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], survival_time: Optional[Union[numpy.ndarray, pandas.core.frame.DataFrame, list]] = None) → float
Return the Integrated Brier Score on the given dataset and labels (event status indicator).
The Brier score (BS) is the squared difference between the true event status at time T and the predicted event status at that time. The Integrated Brier Score (IBS) summarizes this prediction error over all observations and over all times in a test set.
- Parameters:
values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – survival status
survival_time (rulekit.operator.Data) – survival time data. Can be omitted when the survival_time_attr parameter was specified.
- Returns:
score – Integrated Brier Score of self.predict(values) with respect to labels.
- Return type:
float
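The definition above can be sketched in a few lines. This is a simplified illustration that assumes every event is observed (no censoring) and averages BS(t) over a time grid with the trapezoidal rule; practical implementations usually add censoring weights, and the value returned by score() may be computed differently. All names below are hypothetical.

```python
def brier_score(event_times, predicted_survival, t):
    """BS(t): mean squared gap between true status at t and predicted S(t).

    Assumes uncensored data: an example is 'alive' at t if its event time > t.
    """
    total = 0.0
    for event_time, predicted in zip(event_times, predicted_survival):
        alive = 1.0 if event_time > t else 0.0
        total += (alive - predicted) ** 2
    return total / len(event_times)


def integrated_brier_score(event_times, predict_survival, grid):
    """Trapezoidal average of BS(t) over an increasing time grid."""
    n = len(event_times)
    scores = [
        brier_score(event_times, [predict_survival(i, t) for i in range(n)], t)
        for t in grid
    ]
    area = sum(
        (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


event_times = [2.0, 5.0]
grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# A predictor that knows the true event times has zero error.
perfect = integrated_brier_score(
    event_times, lambda i, t: 1.0 if event_times[i] > t else 0.0, grid
)
# A predictor that always answers 0.5 has BS(t) = 0.25 at every t.
uninformed = integrated_brier_score(event_times, lambda i, t: 0.5, grid)
```

Lower values indicate better predictions; the constant 0.5 predictor gives a useful reference point of 0.25.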
-
set_params(**kwargs) → object
Set the model's hyperparameters. Parameters are the same as in the constructor.
-
class
rulekit.survival.ExpertSurvivalRules(survival_time_attr: Optional[str] = None, minsupp_new: int = 5, max_growing: int = 0.0, enable_pruning: bool = True, ignore_missing: bool = False, max_uncovered_fraction: float = 0.0, select_best_candidate: bool = False, extend_using_preferred: Optional[bool] = None, extend_using_automatic: Optional[bool] = None, induce_using_preferred: Optional[bool] = None, induce_using_automatic: Optional[bool] = None, preferred_conditions_per_rule: Optional[int] = None, preferred_attributes_per_rule: Optional[int] = None, min_rule_covered: Optional[int] = None)
Expert Survival model.
- Parameters:
minsupp_new (int = 5) – positive integer representing the minimum number of previously uncovered examples to be covered by a new rule (positive examples for classification problems); default: 5.
survival_time_attr (str) – name of the column containing survival time data (use when the data passed to the model is a pandas DataFrame).
max_growing (int = 0.0) – non-negative integer representing the maximum number of conditions which can be added to a rule in the growing phase (use this parameter for large datasets if execution time is prohibitive); 0 indicates no limit; default: 0.
enable_pruning (bool = True) – enable or disable pruning; default: True.
ignore_missing (bool = False) – boolean telling whether missing values should be ignored (by default, a missing value of a given attribute is always considered as not fulfilling the condition built upon that attribute); default: False.
max_uncovered_fraction (float = 0.0) – floating-point number from the [0,1] interval representing the maximum fraction of examples that may remain uncovered by the rule set; default: 0.0.
select_best_candidate (bool = False) – flag determining whether the best candidate should be selected from the growing phase; default: False.
extend_using_preferred (bool = False) – boolean indicating whether initial rules should be extended using preferred conditions and attributes; default: False.
extend_using_automatic (bool = False) – boolean indicating whether initial rules should be extended using automatic conditions and attributes; default: False.
induce_using_preferred (bool = False) – boolean indicating whether new rules should be induced using preferred conditions and attributes; default: False.
induce_using_automatic (bool = False) – boolean indicating whether new rules should be induced using automatic conditions and attributes; default: False.
preferred_conditions_per_rule (int = None) – maximum number of preferred conditions per rule; default: unlimited.
preferred_attributes_per_rule (int = None) – maximum number of preferred attributes per rule; default: unlimited.
min_rule_covered (int = None) – alias for minsupp_new. This parameter is deprecated and will be removed in the next major version; use minsupp_new.
Deprecated since version 1.7.0: Use parameter minsupp_new instead.
-
fit(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], labels: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], survival_time: Optional[Union[numpy.ndarray, pandas.core.frame.DataFrame, list]] = None, expert_rules: Optional[list] = None, expert_preferred_conditions: Optional[list] = None, expert_forbidden_conditions: Optional[list] = None) → rulekit.survival.ExpertSurvivalRules
Train the model on the given dataset.
- Parameters:
values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – survival status
survival_time (rulekit.operator.Data) – survival time data. Can be omitted when the survival_time_attr parameter was specified.
expert_rules (List[Union[str, Tuple[str, str]]]) – set of initial rules, passed either as a list of strings representing rules or as a list of tuples where the first element is the name of the rule and the second is the rule string.
expert_preferred_conditions (List[Union[str, Tuple[str, str]]]) – multiset of preferred conditions (also used for specifying preferred attributes by using the special value Any). Passed either as a list of strings representing rules or as a list of tuples where the first element is the name of the rule and the second is the rule string.
expert_forbidden_conditions (List[Union[str, Tuple[str, str]]]) – set of forbidden conditions (also used for specifying forbidden attributes by using the special value Any). Passed either as a list of strings representing rules or as a list of tuples where the first element is the name of the rule and the second is the rule string.
- Returns:
self
- Return type:
rulekit.survival.ExpertSurvivalRules
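The expert arguments accept two shapes: bare strings, or (name, string) tuples. The sketch below shows one way to bring a mixed list into the tuple shape before passing it on. The helper `normalize_expert_items`, the generated name pattern, and the placeholder strings are all hypothetical; they do not reproduce rulekit's rule syntax or its internal naming.

```python
def normalize_expert_items(items, prefix):
    """Return (name, expression) tuples from a mixed list of str and tuples."""
    normalized = []
    for index, item in enumerate(items):
        if isinstance(item, str):
            # Unnamed item: generate a name (hypothetical naming scheme).
            normalized.append((f"{prefix}-{index}", item))
        else:
            name, expression = item
            normalized.append((name, expression))
    return normalized


# Placeholders stand in for real rule strings, which are not shown here.
mixed = ["<rule string A>", ("named-rule", "<rule string B>")]
expert_rules = normalize_expert_items(mixed, prefix="rule")
```

Naming every item explicitly makes it easier to trace which expert rule or condition ended up in the induced rule set.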
-
get_coverage_matrix(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list]) → numpy.ndarray
Calculates the coverage matrix for the rule set.
- Parameters:
values (rulekit.operator.Data) – dataset
- Returns:
coverage_matrix – Each row of the matrix represents a single example from the dataset and each column represents one rule from the rule set. A value of 1 in a matrix cell means that the rule covers the example; a value of 0 means that it does not.
- Return type:
np.ndarray
-
get_params() → dict
- Returns:
hyperparameters – Dictionary containing model hyperparameters.
- Return type:
dict
-
predict(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list]) → numpy.ndarray
Perform prediction and return the estimated survival function for each example.
- Parameters:
values (rulekit.operator.Data) – attributes
- Returns:
result – Each row represents a single example from the dataset and contains the estimated survival function for that example. The estimated survival function is returned as a dictionary containing times and corresponding probabilities.
- Return type:
np.ndarray
-
score(values: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], labels: Union[numpy.ndarray, pandas.core.frame.DataFrame, list], survival_time: Optional[Union[numpy.ndarray, pandas.core.frame.DataFrame, list]] = None) → float
Return the Integrated Brier Score on the given dataset and labels (event status indicator).
The Brier score (BS) is the squared difference between the true event status at time T and the predicted event status at that time. The Integrated Brier Score (IBS) summarizes this prediction error over all observations and over all times in a test set.
- Parameters:
values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – survival status
survival_time (rulekit.operator.Data) – survival time data. Can be omitted when the survival_time_attr parameter was specified.
- Returns:
score – Integrated Brier Score of self.predict(values) with respect to labels.
- Return type:
float
-
set_params(**kwargs) → object
Set the model's hyperparameters. Parameters are the same as in the constructor.