BayesMinimumRiskClassifier
class costcla.models.BayesMinimumRiskClassifier(calibration=True)
An example-dependent cost-sensitive binary Bayes minimum risk classifier.
Parameters: calibration : bool, optional (default=True)
Whether or not to calibrate the probabilities.
References
[R1] A. Correa Bahnsen, A. Stojanovic, D. Aouada, B. Ottersten, “Improving Credit Card Fraud Detection with Calibrated Probabilities”, in Proceedings of the Fourteenth SIAM International Conference on Data Mining, 677-685, 2014.
Examples
>>> from sklearn.ensemble import RandomForestClassifier
>>> # In scikit-learn >= 0.18 train_test_split lives in sklearn.model_selection
>>> # (sklearn.cross_validation was removed in 0.20).
>>> from sklearn.cross_validation import train_test_split
>>> from costcla.datasets import load_creditscoring1
>>> from costcla.models import BayesMinimumRiskClassifier
>>> from costcla.metrics import savings_score
>>> data = load_creditscoring1()
>>> sets = train_test_split(data.data, data.target, data.cost_mat, test_size=0.33, random_state=0)
>>> X_train, X_test, y_train, y_test, cost_mat_train, cost_mat_test = sets
>>> f = RandomForestClassifier(random_state=0).fit(X_train, y_train)
>>> y_prob_test = f.predict_proba(X_test)
>>> y_pred_test_rf = f.predict(X_test)
>>> f_bmr = BayesMinimumRiskClassifier()
>>> f_bmr.fit(y_test, y_prob_test)
>>> y_pred_test_bmr = f_bmr.predict(y_prob_test, cost_mat_test)
>>> # Savings using only RandomForest
>>> print(savings_score(y_test, y_pred_test_rf, cost_mat_test))
0.12454256594
>>> # Savings using RandomForest and Bayes Minimum Risk
>>> print(savings_score(y_test, y_pred_test_bmr, cost_mat_test))
0.413425845555
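To make the savings figures easier to interpret, here is a minimal NumPy sketch of one common definition of savings: one minus the ratio of the classifier's total cost to the cost of the cheaper trivial classifier (all zeros or all ones). That definition is an assumption about what savings_score computes, which this page does not confirm. The helper names total_cost and savings_sketch are hypothetical, and cost_mat uses the column order documented for predict (false positives, false negatives, true positives, true negatives).

```python
import numpy as np


def total_cost(y_true, y_pred, cost_mat):
    """Sum the example-dependent cost of a set of binary predictions.

    Assumes cost_mat columns are [FP, FN, TP, TN], one row per example.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    fp, fn, tp, tn = np.asarray(cost_mat, dtype=float).T
    cost = (y_true * (y_pred * tp + (1 - y_pred) * fn)
            + (1 - y_true) * (y_pred * fp + (1 - y_pred) * tn))
    return cost.sum()


def savings_sketch(y_true, y_pred, cost_mat):
    """Savings relative to the cheaper of 'predict all 0' and 'predict all 1'.

    Hypothetical helper; assumed definition, not the library's code.
    """
    n = len(y_true)
    base = min(total_cost(y_true, np.zeros(n), cost_mat),
               total_cost(y_true, np.ones(n), cost_mat))
    return 1.0 - total_cost(y_true, y_pred, cost_mat) / base


# Toy data: FP and TP cost 1, FN costs vary per example, TN costs 0.
y_true = np.array([0, 1, 1, 0])
cost_mat = np.array([[1, 10, 1, 0],
                     [1, 20, 1, 0],
                     [1, 30, 1, 0],
                     [1, 40, 1, 0]], dtype=float)

perfect = savings_sketch(y_true, np.array([0, 1, 1, 0]), cost_mat)
all_ones = savings_sketch(y_true, np.ones(4), cost_mat)
print(perfect, all_ones)
```

Under this definition a value of 0 means no improvement over the best trivial classifier, and values closer to 1 mean most of the avoidable cost was saved.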
Methods
fit
fit_predict
get_params
predict
set_params
fit(y_true_cal=None, y_prob_cal=None)
If calibration is enabled, fit the probability calibration on the given data.
Parameters: y_true_cal : array-like of shape = [n_samples], optional default = None
True class to be used for calibrating the probabilities
y_prob_cal : array-like of shape = [n_samples, 2], optional default = None
Predicted probabilities to be used for calibrating the probabilities
Returns: self : object
Returns self.
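As a rough illustration of what fitting a calibration on (y_true_cal, y_prob_cal) can look like, the sketch below uses simple histogram binning: each predicted probability is replaced by the observed positive rate in its bin. This technique is swapped in for illustration only; the library's own calibration procedure may well differ, and fit_bins and apply_bins are hypothetical names.

```python
import numpy as np


def fit_bins(y_true_cal, y_prob_cal, n_bins=10):
    """Learn the observed positive rate per probability bin (hypothetical helper)."""
    y_true_cal = np.asarray(y_true_cal, dtype=float)
    p1 = np.asarray(y_prob_cal, dtype=float)[:, 1]  # probability of class 1
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use the interior edges so bin indices run from 0 to n_bins - 1.
    idx = np.digitize(p1, edges[1:-1])
    rates = np.empty(n_bins)
    for b in range(n_bins):
        mask = idx == b
        # Empty bins fall back to the bin midpoint.
        rates[b] = y_true_cal[mask].mean() if mask.any() else (edges[b] + edges[b + 1]) / 2
    return edges, rates


def apply_bins(y_prob, edges, rates):
    """Map raw probabilities to the learned per-bin positive rates."""
    p1 = np.asarray(y_prob, dtype=float)[:, 1]
    cal = rates[np.digitize(p1, edges[1:-1])]
    return np.column_stack([1 - cal, cal])


y_true_cal = np.array([0, 0, 1, 1, 1, 0])
p1_cal = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y_prob_cal = np.column_stack([1 - p1_cal, p1_cal])

edges, rates = fit_bins(y_true_cal, y_prob_cal, n_bins=2)
calibrated = apply_bins(np.array([[0.6, 0.4], [0.1, 0.9]]), edges, rates)
print(calibrated)
```

Calibration data should ideally be separate from the data used to evaluate the final predictions, so that the calibrated probabilities are not tuned to the evaluation set.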
fit_predict(y_prob, cost_mat, y_true_cal=None, y_prob_cal=None)
Calculate the prediction using the Bayes minimum risk classifier.
Parameters: y_prob : array-like of shape = [n_samples, 2]
Predicted probabilities.
cost_mat : array-like of shape = [n_samples, 4]
Cost matrix of the classification problem, where the columns represent the costs of false positives, false negatives, true positives and true negatives for each example.
y_true_cal : array-like of shape = [n_samples], optional default = None
True class to be used for calibrating the probabilities
y_prob_cal : array-like of shape = [n_samples, 2], optional default = None
Predicted probabilities to be used for calibrating the probabilities
Returns: y_pred : array-like of shape = [n_samples]
Predicted class
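The cost_mat layout described above can be assembled with plain NumPy. The sketch below assumes a fraud-style setting in which a missed positive costs the transaction amount and every alert costs a fixed administrative fee. The amounts and the fee are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical per-example amounts and a hypothetical fixed alert cost.
amounts = np.array([120.0, 35.5, 980.0])
admin_cost = 5.0

n_samples = amounts.shape[0]
cost_mat = np.zeros((n_samples, 4))
cost_mat[:, 0] = admin_cost  # false positive: alert raised on a legitimate case
cost_mat[:, 1] = amounts     # false negative: the full amount is lost
cost_mat[:, 2] = admin_cost  # true positive: alert raised on a real case
cost_mat[:, 3] = 0.0         # true negative: no cost

print(cost_mat)
```

Each row carries its own costs, which is what makes the problem example-dependent rather than class-dependent.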
get_params(deep=True)
Get parameters for this estimator.
Parameters: deep: boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.
predict(y_prob, cost_mat)
Calculate the prediction using the Bayes minimum risk classifier.
Parameters: y_prob : array-like of shape = [n_samples, 2]
Predicted probabilities.
cost_mat : array-like of shape = [n_samples, 4]
Cost matrix of the classification problem, where the columns represent the costs of false positives, false negatives, true positives and true negatives for each example.
Returns: y_pred : array-like of shape = [n_samples]
Predicted class
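The Bayes minimum risk rule can be sketched in a few lines: for each example, compare the expected cost of predicting the positive class with the expected cost of predicting the negative class, and choose the cheaper one. The function bmr_predict_sketch below is a hypothetical stand-in, not the library's implementation; details such as tie handling and calibration may differ.

```python
import numpy as np


def bmr_predict_sketch(y_prob, cost_mat):
    """Pick, per example, the class with the lower expected cost.

    Assumes y_prob[:, 1] is the probability of the positive class and
    cost_mat columns are [FP, FN, TP, TN].
    """
    p1 = np.asarray(y_prob, dtype=float)[:, 1]
    fp, fn, tp, tn = np.asarray(cost_mat, dtype=float).T
    risk_pos = p1 * tp + (1 - p1) * fp  # expected cost of predicting 1
    risk_neg = p1 * fn + (1 - p1) * tn  # expected cost of predicting 0
    return (risk_pos < risk_neg).astype(int)


y_prob = np.array([[0.9, 0.1],
                   [0.9, 0.1],
                   [0.2, 0.8]])
cost_mat = np.array([[5.0, 1000.0, 5.0, 0.0],
                     [5.0, 10.0, 5.0, 0.0],
                     [5.0, 3.0, 5.0, 0.0]])

y_pred = bmr_predict_sketch(y_prob, cost_mat)
print(y_pred)  # [1 0 0]
```

The first two examples have the same probability but different false-negative costs, so they receive different predictions. The third is likely positive but is still predicted negative because missing it is cheaper than raising an alert.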
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
Returns: self
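The double-underscore naming convention can be illustrated with a toy class. ToyEstimator below is entirely hypothetical and only mimics the routing idea; it is not scikit-learn's or this library's implementation.

```python
class ToyEstimator:
    """Hypothetical stand-in that mimics <component>__<parameter> routing."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self):
        return dict(self._params)

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, rest = key.partition("__")
            if name not in self._params:
                raise ValueError("Invalid parameter %r" % name)
            if sep:
                # "component__param": delegate to the nested object.
                self._params[name].set_params(**{rest: value})
            else:
                self._params[name] = value
        return self


inner = ToyEstimator(calibration=True)
outer = ToyEstimator(clf=inner, verbose=False)

# Update a top-level parameter and a nested one in a single call.
outer.set_params(verbose=True, clf__calibration=False)
print(outer.get_params()["verbose"], inner.get_params())
```

Returning self from set_params allows calls to be chained, which matches the "Returns: self" contract documented here.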