Code documentation

imcp.imcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08)

Calculate the imbalanced MCP (IMCP) curve using the Hellinger distance. The unequal distribution of classes is taken into account.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes)) –

    Target scores corresponding to probability estimates of a sample belonging to a particular class.

    The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.

    If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

Returns:

  • curve_x (numpy.array) – x-coordinates of the calculated curve

  • curve_y (numpy.array) – y-coordinates of the calculated curve
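
The constraints on y_true, y_score, labels and abs_tolerance described above can be illustrated with a small standard-library sketch. The helper name check_inputs is hypothetical and not part of imcp, and how the library validates its inputs internally is an assumption here.

```python
import math


def check_inputs(y_true, y_score, labels=None, abs_tolerance=1e-08):
    """Hypothetical helper (not part of imcp): mirror the documented constraints."""
    # Without an explicit list, columns are assumed to follow the sorted
    # (numerical or lexicographical) order of the labels found in y_true.
    if labels is None:
        labels = sorted(set(y_true))
    n_columns = len(y_score[0])
    if len(labels) != n_columns:
        raise ValueError(
            f"{len(labels)} labels for {n_columns} columns: pass all labels explicitly"
        )
    # Every row of probabilities must sum to 1 within the absolute tolerance.
    for row in y_score:
        if not math.isclose(sum(row), 1.0, rel_tol=0.0, abs_tol=abs_tolerance):
            raise ValueError(f"probabilities {row} do not sum up to 1")
    # Map each label to its column index in y_score.
    return {label: column for column, label in enumerate(labels)}


y_true = ["cat", "dog", "cat"]
y_score = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]
# "fox" never occurs in y_true, so the full label list has to be supplied.
columns = check_inputs(y_true, y_score, labels=["cat", "dog", "fox"])
```

Calling check_inputs(y_true, y_score) without labels raises ValueError in this sketch, because only two distinct labels occur in y_true while y_score has three columns.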

imcp.imcp_score(y_true, y_score, labels: list = None, abs_tolerance=1e-08)

Calculate the area under the imbalanced MCP curve using the trapezoid rule.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes)) –

    Target scores corresponding to probability estimates of a sample belonging to a particular class.

    The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.

    If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

Returns:

  • area – Approximated area under the curve
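
The trapezoid rule sums the areas of the trapezoids spanned by consecutive curve points. A minimal standard-library sketch is given here; the function name trapezoid_area and the coordinates are made up for illustration, and the library's own implementation may differ.

```python
def trapezoid_area(curve_x, curve_y):
    """Approximate the area under a piecewise-linear curve (trapezoid rule)."""
    points = list(zip(curve_x, curve_y))
    area = 0.0
    # Each pair of neighbouring points contributes width * mean height.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Illustrative coordinates only, not output of imcp.
curve_x = [0.0, 0.5, 1.0]
curve_y = [0.0, 0.5, 1.0]
area = trapezoid_area(curve_x, curve_y)
```

For this straight diagonal the sketch yields an area of 0.5, half of the unit square.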

imcp.mcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08)

Calculate the MCP curve using the Hellinger distance.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes)) –

    Target scores corresponding to probability estimates of a sample belonging to a particular class.

    The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.

    If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

Returns:

  • curve_x (numpy.array) – x-coordinates of the calculated curve

  • curve_y (numpy.array) – y-coordinates of the calculated curve
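
For intuition, the Hellinger distance between a predicted probability vector and the one-hot vector of the true class can be sketched with the standard library. How mcp_curve turns these distances into curve coordinates is an assumption here (score = 1 - distance, sorted in ascending order and spaced evenly on the x-axis); the function names are hypothetical and this is not the library's implementation.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def sketch_curve(true_columns, y_score):
    """Hypothetical sketch: per-sample score 1 - Hellinger distance to the one-hot truth."""
    scores = []
    for column, row in zip(true_columns, y_score):
        one_hot = [1.0 if i == column else 0.0 for i in range(len(row))]
        scores.append(1.0 - hellinger(row, one_hot))
    # Assumption: samples are ordered by score and spaced evenly on the x-axis.
    scores.sort()
    n = len(scores)
    curve_x = [(i + 1) / n for i in range(n)]
    return curve_x, scores


# Two samples with the same prediction [1.0, 0.0]: the first truly belongs to
# column 0 (perfect prediction), the second to column 1 (completely wrong).
curve_x, curve_y = sketch_curve([0, 1], [[1.0, 0.0], [1.0, 0.0]])
```

In this sketch the perfectly predicted sample scores 1 and the completely wrong one scores 0, so curve_y is [0.0, 1.0].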

imcp.mcp_score(y_true, y_score, labels: list = None, abs_tolerance=1e-08)

Calculate the area under the MCP curve using the trapezoid rule.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes)) –

    Target scores corresponding to probability estimates of a sample belonging to a particular class.

    The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true.

    If the number of class labels in y_true differs from the number of columns in y_score, a list with all labels must be given.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

Returns:

  • area – Approximated area under the curve

imcp.plot_curve(x, y, label: str | List[str] = None, output_fig_path: str = None, fig_title='(I)MCP curve(s)', xlabel='samples', ylabel='(I)MCP score')

Plot curves described by the given x and y coordinates. To plot multiple curves, pass x and y as 2D arrays; each row is plotted as a separate curve.

Parameters:
  • x (array-like or array of arrays) – x-coordinates of the curve(s).

  • y (array-like or array of arrays) – y-coordinates of the curve(s). Must be of the same shape as x.

  • label (str or list of str) – Single label or list of labels to be displayed in the plot legend.

  • output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.

  • fig_title – Title of the figure.

  • xlabel – Label of the x-axis.

  • ylabel – Label of the y-axis.
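
The default-extension rule for output_fig_path can be pictured with a short pathlib sketch. The helper with_default_extension is hypothetical and not part of imcp; it only illustrates the documented behaviour of falling back to png when the path has no extension.

```python
from pathlib import Path


def with_default_extension(output_fig_path, default=".png"):
    """Hypothetical helper (not part of imcp): append a default extension if none is given."""
    path = Path(output_fig_path)
    # Keep an explicit extension such as .svg; otherwise fall back to the default.
    return path if path.suffix else path.with_suffix(default)


saved_as = [str(with_default_extension(p)) for p in ("curves", "curves.svg")]
```

Here "curves" becomes "curves.png", while "curves.svg" is left unchanged.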

imcp.plot_imcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08, output_fig_path: str = None)

Plot the imbalanced MCP curve based on the given probabilities and labels. If scores from more than one algorithm are given, all curves are plotted.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes), or dictionary mapping an algorithm's label to such an array-like) – Target scores corresponding to probability estimates of a sample belonging to a particular class. The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true. If a dictionary is passed, a curve is plotted for each key-value pair in it.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

  • output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.
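
The two accepted forms of y_score, a single score array or a dictionary keyed by algorithm label, can be handled uniformly as sketched below. The generator iter_algorithms, the default label "scores" and the algorithm names are hypothetical; they illustrate the documented input shapes rather than the library's internals.

```python
def iter_algorithms(y_score, default_label="scores"):
    """Hypothetical helper (not part of imcp): yield one (label, scores) pair per curve."""
    if isinstance(y_score, dict):
        # One curve per key-value pair, in insertion order.
        yield from y_score.items()
    else:
        # A single array-like describes exactly one curve.
        yield default_label, y_score


single = [[0.9, 0.1], [0.2, 0.8]]
several = {
    "forest": [[0.9, 0.1], [0.2, 0.8]],
    "svm": [[0.6, 0.4], [0.5, 0.5]],
}
names_single = [name for name, _ in iter_algorithms(single)]
names_several = [name for name, _ in iter_algorithms(several)]
```

With the dictionary input, the keys "forest" and "svm" would serve as the legend labels of the two curves.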

imcp.plot_mcp_curve(y_true, y_score, labels: list = None, abs_tolerance=1e-08, output_fig_path: str = None)

Plot the MCP curve based on the given probabilities and labels. If scores from more than one algorithm are given, all curves are plotted.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True labels.

  • y_score (array-like of shape (n_samples, n_classes), or dictionary mapping an algorithm's label to such an array-like) – Target scores corresponding to probability estimates of a sample belonging to a particular class. The order of the class scores must correspond to the numerical or lexicographical order of the labels in y_true. If a dictionary is passed, a curve is plotted for each key-value pair in it.

  • labels (list) – All class labels, mapped in order to the columns of y_score. Must be given if any class is not represented in y_true but y_score contains probabilities for it. The number of labels must equal the number of columns in y_score. All labels must be of the same dtype and share that dtype with the y_true labels.

  • abs_tolerance (float) – Absolute tolerance threshold for checking whether the probabilities sum up to 1.

  • output_fig_path (str) – If given, the figure is saved at this location. If no file extension is given, png is used by default. Most backends support the png, pdf, ps, eps and svg extensions.