Compute Receiver Operating Characteristic (ROC) curve or other performance curve for classifier output

Description

[X,Y] = perfcurve(labels,scores,posclass) computes a ROC curve for a vector of classifier prediction scores, scores, given true class labels, labels. labels can be a numeric vector, logical vector, character matrix, cell array of strings, or categorical vector. scores is a numeric vector of scores returned by a classifier for some data. posclass is the positive class label (scalar), either numeric (for numeric labels), logical (for logical labels), or a character string. The returned values X and Y are coordinates for the performance curve and can be visualized with plot(X,Y). For more information on labels, scores, and posclass, see Input Arguments. For more information on X and Y, see Output Arguments.

[X,Y] = perfcurve(labels,scores,posclass,'Name',value) specifies one or more optional parameter name/value pairs, with Name in single quotes. See Input Arguments for a list of inputs, parameter name/value pairs, and respective explanations.
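For example, a call of the following form (a minimal sketch; labels, scores, and posclass are assumed to be defined as above) evaluates the curve only at specified values of the X criterion:

% Sketch: evaluate the curve at fixed values of the X criterion (FPR by default)
[X,Y] = perfcurve(labels,scores,posclass,'XVals',0:0.1:1);
plot(X,Y)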

See Grouping Variables for more information on grouping variables.

[X,Y,T,AUC,OPTROCPT,SUBY,SUBYNAMES] = perfcurve(labels,scores,posclass) returns:

  • An array of thresholds on classifier scores for the computed values of X and Y (T).

  • The area under curve (AUC) for the computed values of X and Y.

  • The optimal operating point of the ROC curve (OPTROCPT).

  • An array of Y values for negative subclasses (SUBY).

  • A cell array of negative class names (SUBYNAMES).

For more information on each output, see Output Arguments.

[X,Y,T,AUC] = perfcurve(labels,scores,posclass) also returns pointwise confidence bounds for the computed values X, Y, T, and AUC if you supply cell arrays for labels and scores or set NBoot (see Input Arguments) to a positive integer. To compute the confidence bounds, perfcurve uses either vertical averaging (VA) or threshold averaging (TA). The returned Y is an m-by-3 array in which the first element in every row gives the mean value, the second element gives the lower bound, and the third element gives the upper bound. The returned AUC is a row vector with three elements following the same convention. For VA, the returned T is an m-by-3 array and X is a column vector. For TA, the returned X is an m-by-3 matrix and T is a column vector.
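As a minimal sketch (assuming labels, scores, and posclass are already defined), the bounds can be requested and unpacked like this:

[X,Y,T,AUC] = perfcurve(labels,scores,posclass,'NBoot',1000);
% With XVals unset, thresholds averaging (TA) is used, so Y is m-by-3:
Ymean = Y(:,1);   % pointwise means
Ylow  = Y(:,2);   % lower confidence bounds
Yhigh = Y(:,3);   % upper confidence bounds
% AUC is a 1-by-3 row vector: [mean lower upper]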

perfcurve computes confidence bounds using either cross validation or bootstrap. If you supply cell arrays for labels and scores, perfcurve uses cross validation and treats elements in the cell arrays as cross-validation folds. labels can be a cell array of numeric vectors, logical vectors, character matrices, cell arrays of strings, or categorical vectors. All elements in labels must have the same type. scores is a cell array of numeric vectors. The cell arrays for labels and scores must have the same number of elements, and the number of labels in cell k must be equal to the number of scores in cell k for any k in the range from 1 to the number of elements in scores.
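A minimal sketch of this calling form follows; the fold variables lab1, lab2, lab3 and s1, s2, s3 are hypothetical placeholders:

% Sketch: three cross-validation folds supplied as cell arrays
labels = {lab1; lab2; lab3};   % e.g., three logical vectors
scores = {s1; s2; s3};         % numel(scores{k}) == numel(labels{k})
[X,Y,T,AUC] = perfcurve(labels,scores,true);   % posclass = true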

If you set NBoot to a positive integer, perfcurve generates NBoot bootstrap replicas to compute pointwise confidence bounds. You cannot both supply cell arrays for labels and scores and set NBoot to a positive integer.

perfcurve returns pointwise confidence bounds. It does not return a simultaneous confidence band for the entire curve.

If you use 'XCrit' or 'YCrit' options described below to set the criterion for X or Y to an anonymous function, perfcurve can only compute confidence bounds by bootstrap.
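For example, the following sketch supplies a hand-written true positive rate as the Y criterion and requests bootstrap bounds. It assumes the confusion-matrix layout C = [TP FN; FP TN]; this layout is an assumption, since the table below only states that C is 2-by-2:

% Sketch: custom Y criterion (recall written by hand) with bootstrap bounds
myTPR = @(C,scale,cost) C(1,1)/(C(1,1)+C(1,2));   % TP/(TP+FN), assuming C = [TP FN; FP TN]
[X,Y] = perfcurve(labels,scores,posclass,'YCrit',myTPR,'NBoot',500);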

Input Arguments

labels labels can be a numeric vector, logical vector, character matrix, cell array of strings, or categorical vector.
scores scores is a numeric vector of scores returned by a classifier for some data. This vector must have as many elements as labels does.
posclass posclass is the positive class label. If labels is a:
  • Numeric vector, then posclass is a numeric scalar

  • Logical vector, then posclass is a logical scalar

  • Character matrix, then posclass is a character string

  • Cell array of strings, then posclass is a character string or cell containing a character string

  • Categorical vector, then posclass is a categorical scalar

posclass must be a member of labels.

Name-Value Pair Arguments

Name Value and Description
NegClass List of negative classes. Can be a numeric array, a character array, or a cell array of strings. By default, NegClass is set to 'all', and all classes found in the input array of labels that are not the positive class are considered negative. If NegClass is a subset of the classes found in the input array of labels, instances with labels that belong to neither the positive class nor the negative classes are discarded.

XCrit

Criterion to compute for X. This criterion must be a monotone function of the positive class score. perfcurve supports the following criteria:
  • TP — Number of true positive instances.

  • FN — Number of false negative instances.

  • FP — Number of false positive instances.

  • TN — Number of true negative instances.

  • TP+FP — Sum of TP and FP.

  • RPP — Rate of positive predictions. RPP = (TP+FP)/(TP+FN+FP+TN)

  • RNP — Rate of negative predictions. RNP = (TN+FN)/(TP+FN+FP+TN)

  • accu — Accuracy. accu = (TP+TN)/(TP+FN+FP+TN)

  • TPR, sens, reca — True positive rate, sensitivity, recall. TPR = sens = reca = TP/(TP+FN)

  • FNR, miss — False negative rate, miss. FNR = miss = FN/(TP+FN)

  • FPR, fall — False positive rate, fallout. FPR = fall = FP/(TN+FP)

  • TNR, spec — True negative rate, specificity. TNR = spec = TN/(TN+FP)

  • PPV, prec — Positive predictive value, precision. PPV = prec = TP/(TP+FP)

  • NPV — Negative predictive value. NPV = TN/(TN+FN)

  • ecost — Expected cost. ecost = (TP*COST(P|P)+FN*COST(N|P)+FP*COST(P|N)+TN*COST(N|N))/(TP+FN+FP+TN)

In addition, you can define an arbitrary criterion by supplying an anonymous function of three arguments, (C,scale,cost), where C is a 2-by-2 confusion matrix, scale is a 2-by-1 array of class scales, and cost is a 2-by-2 misclassification cost matrix.

Caution: Some of these criteria return NaN values at one of the two special thresholds, 'reject all' and 'accept all'.

YCrit Criterion to compute for Y. perfcurve supports the same criteria as for X. This criterion does not have to be a monotone function of the positive class score.
XVals Values for the X criterion. The default is 'all', and perfcurve computes X and Y values for all scores. If XVals is not 'all', it must be a numeric array; in this case, perfcurve computes X and Y only for the specified XVals.
TVals Thresholds for the positive class score. By default, TVals is unset and perfcurve computes X, Y, and T values for all scores. You can set TVals to either 'all' or a numeric array. If TVals is set to 'all' or unset and XVals is unset, perfcurve returns X, Y, and T values for all scores and computes pointwise confidence bounds for Y and X using threshold averaging. If TVals is set to a numeric array, perfcurve returns X, Y, and T values for the specified thresholds and computes pointwise confidence bounds for Y and X at these thresholds using threshold averaging. You cannot set XVals and TVals at the same time.
UseNearest 'on' to use the nearest values found in the data instead of the specified numeric XVals or TVals, and 'off' otherwise. If you specify numeric XVals and set UseNearest to 'on', perfcurve returns the nearest unique X values found in the data, as well as the corresponding values of Y and T. If you specify numeric XVals and set UseNearest to 'off', perfcurve returns these XVals sorted. By default this parameter is set to 'on'. If you compute confidence bounds by cross validation or bootstrap, this parameter is always 'off'.
ProcessNaN Specifies how perfcurve processes NaN scores. The default value is 'ignore', and perfcurve removes observations with NaN scores from the data. If you set the parameter to 'addtofalse', perfcurve adds instances with NaN scores to the false classification counts in the respective class. That is, perfcurve always counts NaN-score instances from the positive class as false negatives (FN) and NaN-score instances from the negative class as false positives (FP).
Prior Either a string or an array with two elements representing prior probabilities for the positive and negative class, respectively. The default is 'empirical'; that is, perfcurve derives prior probabilities from class frequencies. If set to 'uniform', perfcurve sets all prior probabilities equal.
Cost A 2-by-2 matrix of misclassification costs [C(P|P) C(N|P); C(P|N) C(N|N)], where C(I|J) is the cost of misclassifying class J as class I. The default is [0 0.5; 0.5 0].
Alpha A numeric value between 0 and 1. perfcurve returns 100*(1-Alpha) percent pointwise confidence bounds for X, Y, T, and AUC. The default is 0.05, for 95% confidence bounds.
Weights A numeric vector of nonnegative observation weights. This vector must have as many elements as scores or labels do. If you supply cell arrays for scores and labels and you need to supply weights, you must supply them as a cell array too. In this case, every element in weights must be a numeric vector with as many elements as the corresponding element in scores: numel(weights{1})==numel(scores{1}), and so on. To compute X, Y, and T or to compute confidence bounds by cross validation, perfcurve uses these observation weights instead of observation counts. To compute confidence bounds by bootstrap, perfcurve samples N out of N observations with replacement, using these weights as multinomial sampling probabilities.
NBoot Number of bootstrap replicas for computation of confidence bounds. Must be a positive integer. By default this parameter is set to zero, and bootstrap confidence bounds are not computed. If you supply cell arrays for labels and scores, this parameter must be zero because perfcurve cannot use both cross validation and bootstrap to compute confidence bounds.
BootType Confidence interval type that bootci uses to compute confidence bounds. You can specify any type supported by bootci. The default is 'bca'.
BootArg Optional input arguments for bootci used to compute confidence bounds. You can specify all arguments that bootci supports. Empty by default.
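A sketch pulling several of these pairs together (labels, scores, and posclass are assumed to be defined; the cost values are arbitrary):

% Sketch: expected cost on Y, a custom cost matrix, and uniform priors
[X,Y] = perfcurve(labels,scores,posclass, ...
    'YCrit','ecost','Cost',[0 0.7; 0.3 0],'Prior','uniform');
plot(X,Y)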

Output Arguments

X x-coordinates for the performance curve. By default, X is the false positive rate, FPR (equivalently, fallout, or 1-specificity). To change this output, use the 'XCrit' name/value input. For accepted criteria, see 'XCrit' in Input Arguments.
Y y-coordinates for the performance curve. By default, Y is the true positive rate, TPR (equivalently, recall, or sensitivity). To change this output, use the 'YCrit' input. Y accepts the same criteria as X; see 'XCrit' in Input Arguments.
T

An array of thresholds on classifier scores for the computed values of X and Y. It has the same number of rows as X and Y. For each threshold, TP is the count of true positive observations with scores greater than or equal to this threshold, and FP is the count of false positive observations with scores greater than or equal to this threshold. perfcurve defines the negative counts, TN and FN, in a similar way, and then sorts the thresholds in descending order, which corresponds to ascending order of the positive counts.

For the M distinct thresholds found in the array of scores, perfcurve returns the X, Y, and T arrays with M+1 rows. perfcurve sets elements T(2:M+1) to the distinct thresholds, and T(1) replicates T(2). By convention, T(1) represents the highest 'reject all' threshold, and perfcurve computes the corresponding values of X and Y for TP=0 and FP=0. T(end) is the lowest 'accept all' threshold, for which TN=0 and FN=0.
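A small sketch of this convention (the scores and labels are made up):

% Sketch: M = 3 distinct scores give M+1 = 4 rows in X, Y, and T
scores = [0.9; 0.7; 0.7; 0.3];
labels = [true; true; false; false];
[X,Y,T] = perfcurve(labels,scores,true);
% T(1) replicates T(2); X(1) and Y(1) correspond to TP=0, FP=0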

AUC The area under curve (AUC) for the computed values of X and Y. If you set XVals to 'all' (the default), perfcurve computes AUC using the returned X and Y values. If XVals is a numeric array, perfcurve computes AUC using X and Y values found from all distinct scores in the interval specified by the smallest and largest elements of XVals. More precisely, perfcurve finds X values for all distinct thresholds as if XVals were set to 'all', then uses a subset of these (with corresponding Y values) between min(XVals) and max(XVals) to compute AUC. The function uses trapezoidal approximation to estimate the area. If the first or last value of X or Y is NaN, perfcurve removes it to allow calculation of AUC. This takes care of criteria that produce NaNs for the special 'reject all' or 'accept all' thresholds, for example, positive predictive value (PPV) or negative predictive value (NPV).
OPTROCPT

The optimal operating point of the ROC curve as an array of size 1-by-2 with FPR and TPR values for the optimal ROC operating point. perfcurve computes OPTROCPT only for the standard ROC curve and sets it to NaNs otherwise. To obtain the optimal operating point for the ROC curve, perfcurve first finds the slope, S, using

S = (cost(P|N) - cost(N|N)) / (cost(N|P) - cost(P|P)) * N/P

where cost(I|J) is the cost of assigning an instance of class J to class I, and P=TP+FN and N=TN+FP are the total instance counts in the positive and negative class, respectively. perfcurve then finds the optimal operating point by moving the straight line with slope S from the upper left corner of the ROC plot (FPR=0, TPR=1) down and to the right until it intersects the ROC curve.
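A small sketch of the slope computation (the class counts are hypothetical):

% Sketch: slope S for the default cost matrix and made-up class counts
cost = [0 0.5; 0.5 0];   % [C(P|P) C(N|P); C(P|N) C(N|N)]
P = 50;  N = 100;        % hypothetical positive/negative class sizes
S = (cost(2,1)-cost(2,2))/(cost(1,2)-cost(1,1)) * N/P;   % here S = 2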

SUBY An array of Y values for negative subclasses. If you specify only one negative class, SUBY is identical to Y. Otherwise SUBY is a matrix of size M-by-K, where M is the number of returned values for X and Y, and K is the number of negative classes. perfcurve computes Y values by summing counts over all negative classes. SUBY gives values of the Y criterion for each negative class separately. For each negative class, perfcurve places a new column in SUBY and fills it with Y values for TN and FP counted just for this class.
SUBYNAMES A cell array of negative class names. If you provide an input array of negative class names, NegClass, perfcurve copies it into SUBYNAMES. If you do not provide NegClass, perfcurve extracts SUBYNAMES from the input labels. The order of SUBYNAMES is the same as the order of columns in SUBY; that is, SUBY(:,1) is for negative class SUBYNAMES{1}, and so on.
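A sketch with two negative classes, using Fisher iris data. The score here is a deliberately crude, hypothetical stand-in for real classifier output, and 'PPV' is chosen as the Y criterion because it depends on the per-class FP counts (with the default TPR, the SUBY columns would coincide):

% Sketch: one-vs-all curve for 'setosa' with two negative classes
load fisheriris
s = -meas(:,1);   % crude stand-in score: setosa has the smallest sepal lengths
[X,Y,T,AUC,OPTROCPT,SUBY,SUBYNAMES] = perfcurve(species,s,'setosa', ...
    'YCrit','PPV');
SUBYNAMES         % negative class names, one per column of SUBY
plot(X,SUBY); legend(SUBYNAMES)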

Plot a ROC Curve for Classification Algorithms

Plot the ROC curve for classification by logistic regression.

load fisheriris
x = meas(51:end,1:2);        % iris data, 2 classes and 2 features
y = (1:100)'>50;             % versicolor = 0, virginica = 1
b = glmfit(x,y,'binomial');  % logistic regression
p = glmval(b,x,'logit');     % fitted probabilities for scores
[X,Y,T,AUC] = perfcurve(species(51:end,:),p,'virginica');
plot(X,Y)
xlabel('False positive rate'); ylabel('True positive rate')
title('ROC for classification by logistic regression')

Obtain errors on TPR by vertical averaging.

[X,Y] = perfcurve(species(51:end,:),p,'virginica',...
   'NBoot',1000,'XVals','all');
errorbar(X,Y(:,1),Y(:,1)-Y(:,2),Y(:,3)-Y(:,1));  % plot errors on TPR
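A threshold-averaging variant is sketched below; the threshold grid 0:0.05:1 is an arbitrary choice over the fitted probabilities. For TA, X comes back as an m-by-3 matrix, so its first column holds the pointwise means.

% Threshold averaging at fixed thresholds (sketch)
[X,Y,T] = perfcurve(species(51:end,:),p,'virginica',...
   'NBoot',1000,'TVals',0:0.05:1);
errorbar(X(:,1),Y(:,1),Y(:,1)-Y(:,2),Y(:,3)-Y(:,1));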
 

References

[1] T. Fawcett, ROC Graphs: Notes and Practical Considerations for Researchers, 2004.

[2] M. Zweig and G. Campbell, Receiver-Operating Characteristic (ROC) Plots: A Fundamental Evaluation Tool in Clinical Medicine, Clin. Chem. 39/4, 561-577, 1993.

[3] J. Davis and M. Goadrich, The relationship between precision-recall and ROC curves, in Proceedings of ICML '06, 233-240, 2006.

[4] C. Moskowitz and M. Pepe, Quantifying and comparing the predictive accuracy of continuous prognostic factors for binary outcomes, Biostatistics 5, 113-127, 2004.

[5] Y. Huang, M. Pepe and Z. Feng, Evaluating the Predictiveness of a Continuous Marker, U. Washington Biostatistics Paper Series, 282, 2006.

[6] W. Briggs and R. Zaretzki, The Skill Plot: A Graphical Technique for Evaluating Continuous Diagnostic Tests, Biometrics 63, 250-261, 2008.

[7] http://www2.cs.uregina.ca/~hamilton/courses/831/notes/lift_chart/lift_chart.html; http://www.dmreview.com/news/5329-1.html

[8] R. Bettinger, Cost-Sensitive Classifier Selection Using the ROC Convex Hull Method, SAS Institute.

[9] http://www.stata.com/statalist/archive/2003-02/msg00060.html

See Also

bootci | classify | classregtree | glmfit | mnrfit | NaiveBayes
