Code documentation
- imcp.imcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08)
Calculate the imbalanced MCP (IMCP) curve using the Hellinger distance. The unequal distribution of classes is taken into account.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes)) –
Target scores: probability estimates of each sample belonging to each class.
The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.
If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
- Returns:
curve_x (numpy.array) – x-coordinates of the calculated curve.
curve_y (numpy.array) – y-coordinates of the calculated curve.
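The Hellinger distance mentioned above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the package's internal implementation; the helper name `hellinger_distance` and the one-hot encoding of the true label are assumptions made for the example.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    squared_diff = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diff / 2.0)


# Hypothetical sample: the true class is column 0, encoded as a one-hot vector.
true_distribution = [1.0, 0.0, 0.0]
predicted = [0.25, 0.5, 0.25]

distance = hellinger_distance(predicted, true_distribution)
```

The distance is 0 when the prediction matches the true distribution exactly and 1 when all probability mass lies on other classes.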
- imcp.imcp_score(y_true, y_score, labels: list = None, abs_tolerance=1e-08)
Calculate the area under the imbalanced MCP (IMCP) curve with the trapezoid rule.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes)) –
Target scores: probability estimates of each sample belonging to each class.
The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.
If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
- Returns:
area – Approximated area under the curve.
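The trapezoid rule used for the area can be illustrated as follows. This is a sketch under stated assumptions rather than the package's code: `trapezoid_area` is a hypothetical helper, and the coordinates are made-up values in the same layout as the `curve_x` and `curve_y` outputs of the curve functions.

```python
def trapezoid_area(xs, ys):
    """Approximate the area under a piecewise-linear curve with the trapezoid rule."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    area = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        mean_height = (ys[i] + ys[i - 1]) / 2.0
        area += width * mean_height
    return area


# Hypothetical curve coordinates, laid out like (curve_x, curve_y).
curve_x = [0.0, 0.5, 1.0]
curve_y = [0.0, 0.25, 1.0]
area = trapezoid_area(curve_x, curve_y)
```

Each pair of neighbouring points contributes the segment width multiplied by the mean of the two heights.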
- imcp.mcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08)
Calculate the MCP curve using the Hellinger distance.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes)) –
Target scores: probability estimates of each sample belonging to each class.
The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.
If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
- Returns:
curve_x (numpy.array) – x-coordinates of the calculated curve.
curve_y (numpy.array) – y-coordinates of the calculated curve.
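The role of `abs_tolerance` can be illustrated with a small check that every row of `y_score` sums to 1. This is only a sketch: how the package performs the check and how it reacts to a failure are not specified here, and `rows_sum_to_one` is a hypothetical helper.

```python
import math


def rows_sum_to_one(y_score, abs_tolerance=1e-08):
    """Return True if every row of y_score sums to 1 within abs_tolerance."""
    return all(
        math.isclose(sum(row), 1.0, rel_tol=0.0, abs_tol=abs_tolerance)
        for row in y_score
    )


# Rounding in floating-point sums is why an absolute tolerance is useful.
valid_scores = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
invalid_scores = [[0.1, 0.2, 0.6]]

valid_ok = rows_sum_to_one(valid_scores)
invalid_ok = rows_sum_to_one(invalid_scores)
```

Setting `rel_tol=0.0` makes the comparison purely absolute, matching the meaning of the parameter name.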
- imcp.mcp_score(y_true, y_score, labels: list = None, abs_tolerance=1e-08)
Calculate the area under the MCP curve with the trapezoid rule.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes)) –
Target scores: probability estimates of each sample belonging to each class.
The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.
If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
- Returns:
area – Approximated area under the curve.
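The relationship between `y_true`, `labels` and the columns of `y_score` can be sketched as below. The helper `resolve_labels` is hypothetical and only mirrors the documented rules; in particular, raising `ValueError` on a mismatch is an assumption, not confirmed behaviour of the package.

```python
def resolve_labels(y_true, n_columns, labels=None):
    """Return the label assigned to each column of y_score, in column order."""
    if labels is None:
        # Without an explicit list, columns follow the sorted labels in y_true.
        labels = sorted(set(y_true))
    if len(labels) != n_columns:
        raise ValueError(
            f"got {len(labels)} labels for {n_columns} score columns; "
            "pass the full list of labels"
        )
    return list(labels)


y_true = ["cat", "dog", "cat"]  # class "bird" never occurs in y_true

# Two observed classes and two columns: the sorted order is used.
two_columns = resolve_labels(y_true, n_columns=2)

# Three columns: the full label list must be supplied explicitly.
three_columns = resolve_labels(y_true, n_columns=3, labels=["bird", "cat", "dog"])
```

When a class has a score column but no samples in `y_true`, the sorted labels alone cannot identify that column, which is why the full list is required.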
- imcp.plot_curve(x, y, label: str | List[str] = None, output_fig_path: str = None, fig_title='(I)MCP curve(s)', xlabel='samples', ylabel='(I)MCP score')
Plot curves described by the given x and y coordinates. To plot multiple curves, pass x and y as 2D arrays; each row is plotted as a separate curve.
- Parameters:
x (array-like or array of arrays) – x-coordinates of the curve(s).
y (array-like or array of arrays) – y-coordinates of the curve(s). Must be of the same shape as x.
label (str or list of str) – Single label or list of labels, displayed in the plot legend.
output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.
fig_title (str) – Title of the figure.
xlabel (str) – Label of the x-axis.
ylabel (str) – Label of the y-axis.
- imcp.plot_imcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08, output_fig_path: str = None)
Plot the imbalanced MCP (IMCP) curve based on the given probabilities and labels. If scores for more than one algorithm are given, all curves are plotted.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes), or dict mapping an algorithm's label to such an array) – Target scores: probability estimates of each sample belonging to each class. The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true. If a dictionary is passed, a curve is plotted for each key-value pair in it.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.
- imcp.plot_mcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08, output_fig_path: str = None)
Plot the MCP curve based on the given probabilities and labels. If scores for more than one algorithm are given, all curves are plotted.
- Parameters:
y_true (array-like of shape (n_samples,)) – True labels.
y_score (array-like of shape (n_samples, n_classes), or dict mapping an algorithm's label to such an array) – Target scores: probability estimates of each sample belonging to each class. The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true. If a dictionary is passed, a curve is plotted for each key-value pair in it.
labels (list, optional) – All class labels, mapped to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.
abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.
output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.