Regression

class rulekit.regression.RuleRegressor(minsupp_new: float = 5.0, induction_measure: Measures = Measures.Correlation, pruning_measure: Measures | str = Measures.Correlation, voting_measure: Measures = Measures.Correlation, max_growing: float = 0.0, enable_pruning: bool = True, ignore_missing: bool = False, max_uncovered_fraction: float = 0.0, select_best_candidate: bool = False, complementary_conditions: bool = False, mean_based_regression: bool = True, max_rule_count: int = 0)

Regression model.

Parameters:

minsupp_new (float = 5.0) –

a minimum number (or fraction, if value < 1.0) of previously uncovered examples
to be covered by a new rule (positive examples for classification problems); default: 5,
induction_measure (rulekit.params.Measures = rulekit.params.Measures.Correlation) – measure used during induction; default measure is correlation
pruning_measure (Union[rulekit.params.Measures, str] = rulekit.params.Measures.Correlation) –

measure used during pruning. Could be user defined (string), for example
2 * p / n; default measure is correlation
voting_measure (rulekit.params.Measures = rulekit.params.Measures.Correlation) – measure used during voting; default measure is correlation
max_growing (int = 0.0) – non-negative integer representing maximum number of conditions which can be added to the rule in the growing phase (use this parameter for large datasets if execution time is prohibitive); 0 indicates no limit; default: 0,
enable_pruning (bool = True) – enable or disable pruning, default is True.
ignore_missing (bool = False) – boolean telling whether missing values should be ignored (by default, a missing value of a given attribute is always considered as not fulfilling the condition built upon that attribute); default: False.
max_uncovered_fraction (float = 0.0) –

Floating-point number from [0,1] interval representing maximum fraction of examples
that may remain uncovered by the rule set, default: 0.0.
select_best_candidate (bool = False) –

Flag determining if best candidate should be selected from growing phase;
default: False.
complementary_conditions (bool = False) – If enabled, complementary conditions in the form a = !{value} for nominal attributes are supported.
mean_based_regression (bool = True) – Enable fast induction of mean-based regression rules instead of default median-based.
max_rule_count (int = 0) –

Maximum number of rules to be generated (for classification data sets it applies
to a single class); 0 indicates no limit.
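
The dual reading of minsupp_new (absolute count, or fraction when the value is below 1.0) can be illustrated with a small sketch. The helper resolve_min_support and its n_examples argument are hypothetical and are not part of rulekit; which count the library uses as the base for the fraction is not stated here, so it is passed in explicitly.

```python
def resolve_min_support(minsupp_new: float, n_examples: int) -> float:
    """Return the absolute number of new examples a rule must cover.

    Assumption: a value below 1.0 is read as a fraction of `n_examples`,
    anything else as an absolute count. This mirrors the parameter
    description only; it is not rulekit's implementation.
    """
    if minsupp_new < 0:
        raise ValueError("minsupp_new must be non-negative")
    if minsupp_new < 1.0:
        return minsupp_new * n_examples
    return minsupp_new


absolute = resolve_min_support(5.0, 200)     # treated as a count
fractional = resolve_min_support(0.25, 200)  # treated as a fraction of 200
print(absolute, fractional)
```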

add_event_listener(listener: RuleInductionProgressListener)

Add an event listener object to the operator, which allows monitoring of rule induction progress.

Example

>>> from rulekit.events import RuleInductionProgressListener
>>> from rulekit.regression import RuleRegressor
>>>
>>> class MyEventListener(RuleInductionProgressListener):
...     def on_new_rule(self, rule):
...         # Called each time a new rule is induced.
...         print('Do something with new rule', rule)
...
>>> operator = RuleRegressor()
>>> operator.add_event_listener(MyEventListener())

Parameters: listener (RuleInductionProgressListener) – listener object
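
The listener mechanism follows the usual observer pattern. The sketch below shows that pattern without rulekit or a JVM; ProgressListener and ToyOperator are hypothetical stand-ins, and only the on_new_rule callback name is taken from the example above.

```python
class ProgressListener:
    """Hypothetical stand-in for RuleInductionProgressListener."""

    def on_new_rule(self, rule):
        # Default implementation ignores the event.
        pass


class CollectingListener(ProgressListener):
    """Stores every rule it is notified about."""

    def __init__(self):
        self.rules = []

    def on_new_rule(self, rule):
        self.rules.append(rule)


class ToyOperator:
    """Hypothetical operator that notifies listeners as rules appear."""

    def __init__(self):
        self._listeners = []

    def add_event_listener(self, listener):
        self._listeners.append(listener)

    def induce(self, rules):
        # Pretend each item is a freshly induced rule.
        for rule in rules:
            for listener in self._listeners:
                listener.on_new_rule(rule)


listener = CollectingListener()
operator = ToyOperator()
operator.add_event_listener(listener)
operator.induce(["rule A", "rule B"])
print(listener.rules)
```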

fit(values: ndarray | DataFrame | list, labels: ndarray | DataFrame | list) → RuleRegressor

Train model on given dataset.

Parameters:

values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – target values

Returns:

self

Return type:

RuleRegressor

get_coverage_matrix(values: ndarray | DataFrame | list) → ndarray

Calculates coverage matrix for ruleset.

Parameters:

values (rulekit.operator.Data) – dataset

Returns:

coverage_matrix – Each row of the matrix represents a single example from the dataset and each column represents one rule from the rule set. A value of 1 in a cell means that the rule covers the example; a value of 0 means that it does not.

Return type:

np.ndarray
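
The layout of the coverage matrix can be sketched in plain Python. Here rules are modelled as hypothetical predicates over dictionary rows, and the result is a nested list rather than the np.ndarray returned by the real method; none of these names come from rulekit.

```python
# Hypothetical rules, each expressed as a predicate over one example.
rules = [
    lambda row: row["x"] < 5,
    lambda row: row["y"] == "a",
]

examples = [
    {"x": 1, "y": "a"},
    {"x": 7, "y": "a"},
    {"x": 9, "y": "b"},
]


def coverage_matrix(examples, rules):
    """One row per example, one column per rule; 1 = covered, 0 = not."""
    return [[1 if rule(ex) else 0 for rule in rules] for ex in examples]


matrix = coverage_matrix(examples, rules)
for row in matrix:
    print(row)
```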

get_metadata_routing() → None

Warning

Scikit-learn metadata routing is not supported yet.

Raises: NotImplementedError – always raised, because metadata routing is not supported.

get_params(deep: bool = True) → dict[str, Any]

Parameters: deep (bool) – Parameter for scikit-learn compatibility. Not used.
Returns: hyperparameters – Dictionary containing model hyperparameters.
Return type: dict[str, Any]

predict(values: ndarray | DataFrame | list) → ndarray

Performs prediction and returns the predicted values.

Parameters: values (rulekit.operator.Data) – attributes
Returns: result – predicted values
Return type: np.ndarray

score(values: ndarray | DataFrame | list, labels: ndarray | DataFrame | list) → float

Returns the coefficient of determination R² of the prediction.

Parameters:

values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – true target values

Returns:

score – R² of self.predict(values) with respect to labels.

Return type:

float
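
For reference, the coefficient of determination is conventionally defined as R² = 1 - SS_res / SS_tot. The sketch below computes it in plain Python; the function name r2_score is chosen for illustration and this is not rulekit's implementation.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


labels = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(labels, [1.0, 2.0, 3.0, 4.0])   # exact predictions
baseline = r2_score(labels, [2.5, 2.5, 2.5, 2.5])  # always the mean
print(perfect, baseline)
```

A perfect prediction gives 1.0 and always predicting the mean of the labels gives 0.0. The sketch does not guard against constant labels, for which SS_tot is zero.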

set_params(**kwargs) → object: Set the model's hyperparameters. Parameters are the same as in the constructor.
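
The get_params / set_params pair follows the scikit-learn estimator convention. A minimal sketch of that convention with a hypothetical ToyModel (not a rulekit class; it only borrows two parameter names from the constructor above):

```python
class ToyModel:
    """Hypothetical estimator showing the get_params/set_params round trip."""

    def __init__(self, minsupp_new=5.0, max_rule_count=0):
        self.minsupp_new = minsupp_new
        self.max_rule_count = max_rule_count

    def get_params(self, deep=True):
        # `deep` is accepted for scikit-learn compatibility and ignored.
        return {
            "minsupp_new": self.minsupp_new,
            "max_rule_count": self.max_rule_count,
        }

    def set_params(self, **kwargs):
        valid = self.get_params()
        for name, value in kwargs.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self  # returning self allows call chaining


model = ToyModel().set_params(max_rule_count=10)
params = model.get_params()
print(params)
```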

class rulekit.regression.ExpertRuleRegressor(minsupp_new: float = 5.0, induction_measure: Measures = Measures.Correlation, pruning_measure: Measures | str = Measures.Correlation, voting_measure: Measures = Measures.Correlation, max_growing: float = 0.0, enable_pruning: bool = True, ignore_missing: bool = False, max_uncovered_fraction: float = 0.0, select_best_candidate: bool = False, complementary_conditions: bool = False, mean_based_regression: bool = True, max_rule_count: int = 0, extend_using_preferred: bool = False, extend_using_automatic: bool = False, induce_using_preferred: bool = False, induce_using_automatic: bool = False, preferred_conditions_per_rule: int = 2147483647, preferred_attributes_per_rule: int = 2147483647)

Expert Regression model.

Parameters:

minsupp_new (float = 5.0) –

a minimum number (or fraction, if value < 1.0) of previously uncovered examples
to be covered by a new rule (positive examples for classification problems); default: 5,
induction_measure (rulekit.params.Measures = rulekit.params.Measures.Correlation) – measure used during induction; default measure is correlation
pruning_measure (Union[rulekit.params.Measures, str] = rulekit.params.Measures.Correlation) –

measure used during pruning. Could be user defined (string), for example
2 * p / n; default measure is correlation
voting_measure (rulekit.params.Measures = rulekit.params.Measures.Correlation) – measure used during voting; default measure is correlation
max_growing (int = 0.0) – non-negative integer representing maximum number of conditions which can be added to the rule in the growing phase (use this parameter for large datasets if execution time is prohibitive); 0 indicates no limit; default: 0,
enable_pruning (bool = True) – enable or disable pruning, default is True.
ignore_missing (bool = False) – boolean telling whether missing values should be ignored (by default, a missing value of a given attribute is always considered as not fulfilling the condition built upon that attribute); default: False.
max_uncovered_fraction (float = 0.0) – Floating-point number from [0,1] interval representing maximum fraction of examples that may remain uncovered by the rule set, default: 0.0.
select_best_candidate (bool = False) – Flag determining if best candidate should be selected from growing phase; default: False.
complementary_conditions (bool = False) – If enabled, complementary conditions in the form a = !{value} for nominal attributes are supported.
mean_based_regression (bool = True) – Enable fast induction of mean-based regression rules instead of default median-based.
max_rule_count (int = 0) –

Maximum number of rules to be generated (for classification data sets it applies
to a single class); 0 indicates no limit.
extend_using_preferred (bool = False) – boolean indicating whether initial rules should be extended with a use of preferred conditions and attributes; default is False
extend_using_automatic (bool = False) – boolean indicating whether initial rules should be extended with a use of automatic conditions and attributes; default is False
induce_using_preferred (bool = False) – boolean indicating whether new rules should be induced with a use of preferred conditions and attributes; default is False
induce_using_automatic (bool = False) – boolean indicating whether new rules should be induced with a use of automatic conditions and attributes; default is False
preferred_conditions_per_rule (int = 2147483647) – maximum number of preferred conditions per rule; the default value is effectively unlimited,
preferred_attributes_per_rule (int = 2147483647) – maximum number of preferred attributes per rule; the default value is effectively unlimited.
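
The difference between mean-based and median-based regression rules (mean_based_regression) can be illustrated on the target values covered by a single rule. This sketch assumes that a rule's predicted value is a central value of the covered targets; the variable names are hypothetical and the code does not reproduce rulekit's induction procedure.

```python
from statistics import mean, median

# Hypothetical target values of the examples covered by one rule,
# including a single outlier.
covered_targets = [10.0, 12.0, 11.0, 50.0]

mean_conclusion = mean(covered_targets)      # pulled towards the outlier
median_conclusion = median(covered_targets)  # robust to the outlier

print(mean_conclusion, median_conclusion)
```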

add_event_listener(listener: RuleInductionProgressListener)

Add an event listener object to the operator, which allows monitoring of rule induction progress.

Example

>>> from rulekit.events import RuleInductionProgressListener
>>> from rulekit.regression import ExpertRuleRegressor
>>>
>>> class MyEventListener(RuleInductionProgressListener):
...     def on_new_rule(self, rule):
...         # Called each time a new rule is induced.
...         print('Do something with new rule', rule)
...
>>> operator = ExpertRuleRegressor()
>>> operator.add_event_listener(MyEventListener())

Parameters: listener (RuleInductionProgressListener) – listener object

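fit(values, labels, expert_rules, expert_preferred_conditions, expert_forbidden_conditions) → ExpertRuleRegressor

Train model on given dataset.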

Parameters:

values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – target values
expert_rules (List[Union[str, Tuple[str, str]]]) – set of initial rules, either passed as a list of strings representing rules or as list of tuples where first element is name of the rule and second one is rule string.
expert_preferred_conditions (List[Union[str, Tuple[str, str]]]) –

multiset of preferred conditions (used also for specifying preferred attributes by
using special value Any). Either passed as a list of strings representing rules or as list of tuples where first element is name of the rule and second one is rule string.
expert_forbidden_conditions (List[Union[str, Tuple[str, str]]]) –

set of forbidden conditions (used also for specifying forbidden attributes by using
special value Any). Either passed as a list of strings representing rules or as list of tuples where first element is name of the rule and second one is rule string.

Returns:

self

Return type:

ExpertRuleRegressor
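
The expert arguments accept two shapes: a list of plain strings, or a list of (name, string) tuples. The sketch below shows one way to bring both shapes to a common form. The helper normalize_expert_items, the generated names, and the placeholder strings are all hypothetical; the actual rule syntax is not shown here and rulekit may handle unnamed items differently.

```python
from typing import List, Tuple, Union

ExpertItems = List[Union[str, Tuple[str, str]]]


def normalize_expert_items(items: ExpertItems, prefix: str) -> List[Tuple[str, str]]:
    """Return (name, text) pairs, generating names for bare strings."""
    named = []
    for index, item in enumerate(items):
        if isinstance(item, str):
            # Assumption: unnamed items get a generated name.
            named.append((f"{prefix}-{index}", item))
        else:
            name, text = item
            named.append((name, text))
    return named


# Placeholder strings stand in for real rule definitions.
as_strings = ["<rule string 1>", "<rule string 2>"]
as_tuples = [("my-rule", "<rule string 1>")]

print(normalize_expert_items(as_strings, "rule"))
print(normalize_expert_items(as_tuples, "rule"))
```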

get_coverage_matrix(values: ndarray | DataFrame | list) → ndarray

Calculates coverage matrix for ruleset.

Parameters:

values (rulekit.operator.Data) – dataset

Returns:

coverage_matrix – Each row of the matrix represents a single example from the dataset and each column represents one rule from the rule set. A value of 1 in a cell means that the rule covers the example; a value of 0 means that it does not.

Return type:

np.ndarray

get_metadata_routing() → None

Warning

Scikit-learn metadata routing is not supported yet.

Raises: NotImplementedError – always raised, because metadata routing is not supported.

get_params(deep: bool = True) → dict[str, Any]

Parameters: deep (bool) – Parameter for scikit-learn compatibility. Not used.
Returns: hyperparameters – Dictionary containing model hyperparameters.
Return type: dict[str, Any]

predict(values: ndarray | DataFrame | list) → ndarray

Performs prediction and returns the predicted values.

Parameters: values (rulekit.operator.Data) – attributes
Returns: result – predicted values
Return type: np.ndarray

score(values: ndarray | DataFrame | list, labels: ndarray | DataFrame | list) → float

Returns the coefficient of determination R² of the prediction.

Parameters:

values (rulekit.operator.Data) – attributes
labels (rulekit.operator.Data) – true target values

Returns:

score – R² of self.predict(values) with respect to labels.

Return type:

float

set_params(**kwargs) → object: Set the model's hyperparameters. Parameters are the same as in the constructor.