<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>ROC Curves on R Views</title>
    <link>https://rviews.rstudio.com/tags/roc-curves/</link>
    <description>Recent content in ROC Curves on R Views</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Thu, 12 Nov 2020 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://rviews.rstudio.com/tags/roc-curves/" rel="self" type="application/rss+xml" />
    
    
    
    
    <item>
      <title>ROC Day at BARUG</title>
      <link>https://rviews.rstudio.com/2020/11/12/roc-day-at-barug/</link>
      <pubDate>Thu, 12 Nov 2020 00:00:00 +0000</pubDate>
      
      <guid>https://rviews.rstudio.com/2020/11/12/roc-day-at-barug/</guid>
      <description>
        

&lt;p&gt;This week, the Bay Area useR Group &lt;a href=&#34;https://www.meetup.com/R-Users/&#34;&gt;(BARUG)&lt;/a&gt; held a mini-conference focused on ROC curves. Talks discussed the history of the ROC, extending ROC analysis to multiclass problems, various ways to think about and interpret ROC curves, and how to translate concrete business goals into the ROC framework and pick the optimal threshold for a given problem.&lt;/p&gt;

&lt;h3 id=&#34;some-history&#34;&gt;Some History&lt;/h3&gt;

&lt;p&gt;I introduced the session with a very brief eclectic &lt;a href=&#34;Rickert_ROC.pptx&#34;&gt;&amp;ldquo;history&amp;rdquo;&lt;/a&gt; of the ROC anchored on a few key papers that seem to me to represent inflection points in its development and adoption.&lt;/p&gt;

&lt;p&gt;Anecdotal accounts of early ROC such as this brief mention in &lt;a href=&#34;https://derangedphysiology.com/main/cicm-primary-exam/required-reading/research-methods-and-statistics/Chapter%203.0.5/receiver-operating-characteristic-roc-curve&#34;&gt;Deranged Physiology&lt;/a&gt; make it clear that &lt;em&gt;Receiver Operating Characteristic&lt;/em&gt; referred to the ability of a radar technician, sitting at a &lt;em&gt;receiver&lt;/em&gt;, to look at a blip on the screen and distinguish an aircraft from background noise. The &lt;a href=&#34;https://apps.dtic.mil/dtic/tr/fulltext/u2/016786.pdf&#34;&gt;DoD report&lt;/a&gt; written by Peterson and Birdsall in 1953 shows that the underlying mathematical theory, and many of the statistical characteristics of the ROC, had already been worked out by that time. Thereafter (see the references below), the ROC became a popular tool in Psychology, Medicine and many other disciplines seeking to make optimal decisions based on the ability to detect signals.&lt;/p&gt;

&lt;p&gt;Jumping to &amp;ldquo;modern times&amp;rdquo;, in his &lt;a href=&#34;https://www.cse.ust.hk/nevinZhangGroup/readings/yi/Bradley_PR97.pdf&#34;&gt;1997 paper&lt;/a&gt; Bradley argues for the AUC to replace overall accuracy as the single best measure to describe classifier performance. Given the prevalent use of ROC curves, it is interesting to contemplate a time when that was not so. Finally, the landmark &lt;a href=&#34;http://link.springer.com/article/10.1007/s10994-009-5119-5&#34;&gt;2009 paper&lt;/a&gt; by David Hand indicates that soon after the adoption of the ROC, researchers were already noticing problems with using the area under the curve (AUC) to compare the performance of classifiers whose ROC curves cross. Additionally, Hand observes that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;(The AUC) is fundamentally incoherent in terms of misclassification costs: the AUC uses different misclassification cost distributions for different classifiers. &amp;hellip;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hand goes on to propose &lt;a href=&#34;https://cran.r-project.org/web/packages/hmeasure/index.html&#34;&gt;H Measure&lt;/a&gt; as an alternative to AUC.&lt;/p&gt;

&lt;h3 id=&#34;multiclass-classification&#34;&gt;Multiclass Classification&lt;/h3&gt;

&lt;p&gt;In his talk, &lt;em&gt;ROC Curves extended to multiclass classification, and how they do or do not map to the binary case&lt;/em&gt; (&lt;a href=&#34;Inchiosa_ROC.pptx&#34;&gt;slides here&lt;/a&gt;), Mario Inchiosa discusses extensions of the ROC curve to multiclass classification and why these extensions don&amp;rsquo;t all map back to the binary case. He distinguishes between multiclass and multilabel classification and discusses the pros and cons of different averaging techniques in the multiclass &lt;em&gt;One vs. Rest&lt;/em&gt; scenario. He also points to both R and scikit-learn packages useful in this kind of analysis (see the references below).&lt;/p&gt;
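&lt;p&gt;As a rough base-R sketch of the averaging choices (my own illustrative toy example, not taken from Mario&amp;rsquo;s slides): with &lt;em&gt;One vs. Rest&lt;/em&gt;, each class gets its own binary AUC, which can then be macro-averaged (classes weighted equally) or micro-averaged (all case/class pairs pooled).&lt;/p&gt;

```r
set.seed(1)
classes = c("a", "b", "c")
y = sample(classes, 300, replace = TRUE)

# synthetic 3-class score matrix; boost the true class so scores are informative
raw = matrix(rexp(900), ncol = 3, dimnames = list(NULL, classes))
idx = cbind(seq_along(y), match(y, classes))
raw[idx] = raw[idx] + 2
scores = raw / rowSums(raw)

# binary AUC via the probabilistic definition (continuous scores, so ties are negligible)
auc = function(s, pos) mean(outer(s[pos], s[!pos], FUN = ">"))

# One vs. Rest: one AUC per class, then two ways to average
per_class = sapply(classes, function(k) auc(scores[, k], y == k))
ind = sapply(classes, function(k) y == k)
macro = mean(per_class)                           # each class weighted equally
micro = auc(as.vector(scores), as.vector(ind))    # every (case, class) pair pooled
c(macro = macro, micro = micro)
```

&lt;p&gt;The two averages can differ noticeably when classes are imbalanced, which is the kind of trade-off the talk covers.&lt;/p&gt;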

&lt;h3 id=&#34;intrepreting-the-roc&#34;&gt;Interpreting the ROC&lt;/h3&gt;

&lt;p&gt;In his highly original talk, &lt;em&gt;Six Ways to Think About ROC Curves&lt;/em&gt; (&lt;a href=&#34;Horton_ROC.pptx&#34;&gt;slides here&lt;/a&gt;), Robert Horton challenges you to see the ROC curve from multiple perspectives. Even if you have been working with ROC curves for some time, you are likely to learn something new here. The &amp;ldquo;Turtle&amp;rsquo;s Eye&amp;rdquo; view is eye-opening for many.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;turtle.gif&#34; height = &#34;300&#34; width=&#34;500&#34;&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The discrete &amp;ldquo;Turtle&amp;rsquo;s Eye&amp;rdquo; view, where labeled cases are sorted by score, and the path of the curve is determined by the order of positive and negative cases.&lt;/li&gt;
&lt;li&gt;The categorical view, where we have to handle tied scores, or when scores put cases in sortable buckets.&lt;/li&gt;
&lt;li&gt;The continuous view, where the cumulative distribution function (CDF) for the positive cases is plotted against the CDF for the negative cases.&lt;/li&gt;
&lt;li&gt;The limiting view, where the ROC curve arises as the limit of the cumulative gain curve (or &amp;ldquo;Total Operating Characteristic&amp;rdquo; curve) as the prevalence of positive cases goes to zero.&lt;/li&gt;
&lt;li&gt;The probabilistic view, where AUC is the probability that a randomly chosen positive case will have a higher score than a randomly chosen negative case.&lt;/li&gt;
&lt;li&gt;The hypothesis-testing view, where the ROC curve emerges from a graphical interpretation of the Mann-Whitney Wilcoxon U test statistic, illustrating how the AUC relates to this commonly used non-parametric test.&lt;/li&gt;
&lt;/ol&gt;
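&lt;p&gt;Views 5 and 6 are easy to check numerically in a few lines of base R (a sketch of my own, not from the slides): the pairwise-comparison probability and the scaled Mann-Whitney U statistic give the same number.&lt;/p&gt;

```r
set.seed(42)
pos = rnorm(500, mean = 1)   # scores for positive cases
neg = rnorm(500, mean = 0)   # scores for negative cases

# View 5: AUC as P(a random positive case outscores a random negative case);
# with continuous scores, ties essentially never occur
auc_prob = mean(outer(pos, neg, FUN = ">"))

# View 6: AUC from the Mann-Whitney Wilcoxon U statistic
U = as.numeric(wilcox.test(pos, neg)$statistic)
auc_mww = U / (length(pos) * length(neg))

all.equal(auc_prob, auc_mww)
```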

&lt;h3 id=&#34;picking-the-optimal-utility-threshold&#34;&gt;Picking the Optimal Utility Threshold&lt;/h3&gt;

&lt;p&gt;John Mount closed out the evening with his talk, &lt;em&gt;How to Pick an Optimal Utility Threshold Using the ROC Plot&lt;/em&gt; &lt;a href=&#34;Mount_ROC.pptx&#34;&gt;(slides here)&lt;/a&gt;, presenting some original work on how to translate concrete business goals into the ROC framework and then use the ROC plot to pick the optimal classification threshold for a given problem. John emphasizes the advantages of working with parametric representations of ROC curves and the importance of discovering utility requirements through iterated negotiation. All of this flows from John&amp;rsquo;s original and insightful definition of an ROC plot.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;Mount.png&#34; height = &#34;300&#34; width=&#34;500&#34;&gt;&lt;/p&gt;
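&lt;p&gt;To give the flavor of the idea, here is a generic threshold-picking sketch in base R (my own toy construction, not John&amp;rsquo;s specific method): assume positive scores are distributed N(2, 1.1^2) and negative scores N(0, 1), assign a hypothetical business utility to each of the four classification outcomes, then sweep thresholds and keep the one that maximizes expected utility per case.&lt;/p&gt;

```r
p = 0.2                                       # assumed prevalence of positives
u = c(tp = 100, fp = -10, fn = -50, tn = 0)   # assumed utilities per outcome

thresh = seq(-3, 5, by = 0.01)
tpr = 1 - pnorm(thresh, mean = 2, sd = 1.1)   # hit rate at each threshold
fpr = 1 - pnorm(thresh, mean = 0, sd = 1)     # false alarm rate at each threshold

# expected utility per case as a function of the threshold
eu = p * (tpr * u["tp"] + (1 - tpr) * u["fn"]) +
  (1 - p) * (fpr * u["fp"] + (1 - fpr) * u["tn"])
best = thresh[which.max(eu)]
best
```

&lt;p&gt;Changing the assumed prevalence or utilities moves the optimal threshold, which is one reason John stresses discovering utility requirements through negotiation.&lt;/p&gt;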

&lt;p&gt;Finally, the &lt;a href=&#34;https://zoom.us/rec/share/gvtdc5lJ_FjBGlLRM0SnKPnuLnQxQUcX7cN_o2bpim8FDas_44LHT3RyxgHueCoo.qhEnNMZ6ccyiSmI2?startTime=1605056465000&#34;&gt;zoom video&lt;/a&gt; covering the talks by Inchiosa, Horton and Mount is well-worth watching.&lt;/p&gt;

&lt;h4 id=&#34;horton-talk-references&#34;&gt;Horton Talk References&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://ccrma.stanford.edu/workshops/mir2009/references/ROCintro.pdf&#34;&gt;Fawcett (2006)&lt;/a&gt; An Introduction to ROC Analysis&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kennis-research.shinyapps.io/ROC-Curves/&#34;&gt;Berrizbeitia&lt;/a&gt; Receiver Operating Characteristic (ROC) Curves - Shiny App&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://ocw.jhsph.edu/courses/fundepi/PDFs/Lecture11.pdf&#34;&gt;Kanchanaraksa (2008)&lt;/a&gt; Evaluation of Diagnostic and Screening Tests: Validity and Reliability&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://blog.mldb.ai/blog/posts/2016/01/ml-meets-economics/&#34;&gt;Kruchten (2016)&lt;/a&gt; ML Meets Economics&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://win-vector.com/blog-2/&#34;&gt;Mount and Zumel&lt;/a&gt; The Win-Vector blog&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&#34;inchiosa-talk-references&#34;&gt;Inchiosa Talk References&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Receiver_operating_characteristic&#34;&gt;ROC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Multiclass_classification&#34;&gt;Multiclass Classification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html#sklearn.metrics.roc_auc_score&#34;&gt;roc auc score&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://scikit-learn.org/stable/modules/model_evaluation.html#roc-metrics&#34;&gt;roc metrics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html&#34;&gt;plot roc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://link.springer.com/article/10.1023/A:1010920819831&#34;&gt;Hand and Till (2001)&lt;/a&gt; reference for one-vs-one&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://CRAN.R-project.org/package=HandTill2001&#34;&gt;HandTill2001&lt;/a&gt; package for Hand &amp;amp; Till’s “M” measure that extends AUC to multiclass using One vs. One&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://CRAN.R-project.org/package=multiROC&#34;&gt;multiROC&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&#34;rickert-talk-references&#34;&gt;Rickert Talk References&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.cse.ust.hk/nevinZhangGroup/readings/yi/Bradley_PR97.pdf&#34;&gt;Bradley (1997)&lt;/a&gt; The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms - recommends that the AUC replace overall accuracy as a single measure of classifier performance&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://derangedphysiology.com/main/cicm-primary-exam/required-reading/research-methods-and-statistics/Chapter%203.0.5/receiver-operating-characteristic-roc-curve&#34;&gt;Deranged Physiology&lt;/a&gt; ROC characteristic of radar operator&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3755824/#B26&#34;&gt;Hajian-Tilaki (2013)&lt;/a&gt; Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://link.springer.com/article/10.1007/s10994-009-5119-5&#34;&gt;Hand (2009)&lt;/a&gt; Measuring classifier performance: a coherent alternative to the area under the ROC curve&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.nap.edu/read/13062/chapter/7&#34;&gt;McClelland (2011)&lt;/a&gt; Use of Signal Detection Theory as a Tool for Enhancing Performance and Evaluating Tradecraft in Intelligence Analysis&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://journals.sagepub.com/doi/pdf/10.1177/0272989X8400400201&#34;&gt;Lusted (1984)&lt;/a&gt; Editorial on medical uses of ROC&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.705.4736&amp;amp;rep=rep1&amp;amp;type=pdf&#34;&gt;Pelli and Farell (1995)&lt;/a&gt; Psychophysical Methods&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://apps.dtic.mil/dtic/tr/fulltext/u2/016786.pdf&#34;&gt;Peterson and Birdsall (1953)&lt;/a&gt; DoD Report on The Theory of Signal Detectability - Early paper referencing ROC&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.amazon.com/Probability-Information-Theory-Applications-Radar/dp/1483169642/ref=sr_1_3?crid=23YA3FG89PBOX&amp;amp;dchild=1&amp;amp;keywords=probability+and+information+theory+with+applications+to+radar&amp;amp;qid=1605157181&amp;amp;sprefix=probability+and+information+theory+with+applications+in+%2Caps%2C216&amp;amp;sr=8-3&#34;&gt;Woodward (1953)&lt;/a&gt; Probability and Information Theory, with Applications to Radar - early book mentioning ROC&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cran.r-project.org/package=hmeasure&#34;&gt;hmeasure&lt;/a&gt; The H-Measure and Other Scalar Classification Performance Metrics&lt;/li&gt;
&lt;/ul&gt;

        &lt;script&gt;window.location.href=&#39;https://rviews.rstudio.com/2020/11/12/roc-day-at-barug/&#39;;&lt;/script&gt;
      </description>
    </item>
    
    <item>
      <title>Some R Packages for ROC Curves</title>
      <link>https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/</link>
      <pubDate>Fri, 01 Mar 2019 00:00:00 +0000</pubDate>
      
      <guid>https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/</guid>
      <description>
        


&lt;p&gt;In a recent &lt;a href=&#34;https://rviews.rstudio.com/2019/01/17/roc-curves/&#34;&gt;post&lt;/a&gt;, I presented some of the theory underlying ROC curves, and outlined the history leading up to their present popularity for characterizing the performance of machine learning models. In this post, I describe how to search CRAN for packages to plot ROC curves, and highlight six useful packages.&lt;/p&gt;
&lt;p&gt;Although I began with a few ideas about packages that I wanted to talk about, like &lt;a href=&#34;https://cran.r-project.org/package=ROCR&#34;&gt;ROCR&lt;/a&gt; and &lt;a href=&#34;https://cran.r-project.org/package=pROC&#34;&gt;pROC&lt;/a&gt;, which I have found useful in the past, I decided to use Gábor Csárdi’s relatively new package &lt;a href=&#34;https://cran.r-project.org/package=pkgsearch&#34;&gt;pkgsearch&lt;/a&gt; to search through CRAN and see what’s out there. The &lt;code&gt;package_search()&lt;/code&gt; function takes a text string as input and uses basic text mining techniques to search all of CRAN. The algorithm searches through package text fields, and produces a score for each package it finds that is weighted by the number of reverse dependencies and downloads.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse)  # for data manipulation
library(dlstats)    # for package download stats
library(pkgsearch)  # for searching packages&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After some trial and error, I settled on the following query, which includes a number of interesting ROC-related packages.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rocPkg &amp;lt;-  pkg_search(query=&amp;quot;ROC&amp;quot;,size=200)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, I narrowed down the field to 46 packages by filtering out orphaned packages and packages with a score less than 190.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rocPkgShort &amp;lt;- rocPkg %&amp;gt;% 
               filter(maintainer_name != &amp;quot;ORPHANED&amp;quot;, score &amp;gt; 190) %&amp;gt;%
               select(score, package, downloads_last_month) %&amp;gt;%
               arrange(desc(downloads_last_month))
head(rocPkgShort)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   score package  downloads_last_month
##   &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt;                   &amp;lt;int&amp;gt;
## 1  690. ROCR                    56356
## 2 7938. pROC                    39584
## 3 1328. PRROC                    9058
## 4  833. sROC                     4236
## 5  266. hmeasure                 1946
## 6 1021. plotROC                  1672&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To complete the selection process, I did the hard work of browsing the documentation for the packages to pick out what I thought would be generally useful to most data scientists. The following plot uses Guangchuang Yu’s &lt;code&gt;dlstats&lt;/code&gt; package to look at the download history for the six packages I selected to profile.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dlstats)
shortList &amp;lt;- c(&amp;quot;pROC&amp;quot;,&amp;quot;precrec&amp;quot;,&amp;quot;ROCit&amp;quot;, &amp;quot;PRROC&amp;quot;,&amp;quot;ROCR&amp;quot;,&amp;quot;plotROC&amp;quot;)
downloads &amp;lt;- cran_stats(shortList)
ggplot(downloads, aes(end, downloads, group=package, color=package)) +
  geom_line() + geom_point(aes(shape=package)) +
  scale_y_continuous(trans = &amp;#39;log2&amp;#39;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;div id=&#34;rocr---2005&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://cran.r-project.org/package=ROCR&#34;&gt;ROCR&lt;/a&gt; - 2005&lt;/h3&gt;
&lt;p&gt;ROCR has been around for almost 14 years, and has been a rock-solid workhorse for drawing ROC curves. I particularly like the way the &lt;code&gt;performance()&lt;/code&gt; function has you set up calculation of the curve by entering the true positive rate, &lt;code&gt;tpr&lt;/code&gt;, and false positive rate, &lt;code&gt;fpr&lt;/code&gt;, parameters. Not only is this reassuringly transparent, it also provides the flexibility to calculate nearly every performance measure for a &lt;a href=&#34;https://en.wikipedia.org/wiki/Binary_classification&#34;&gt;binary classifier&lt;/a&gt; by entering the appropriate parameter. For example, to produce a precision-recall curve, you would enter &lt;code&gt;prec&lt;/code&gt; and &lt;code&gt;rec&lt;/code&gt;. Although there is no vignette, the documentation of the package is very good.&lt;/p&gt;
&lt;p&gt;The following code sets up and plots the default &lt;code&gt;ROCR&lt;/code&gt; ROC curve using a synthetic data set that comes with the package. I will use this same data set throughout this post.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ROCR)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: gplots&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;gplots&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following object is masked from &amp;#39;package:stats&amp;#39;:
## 
##     lowess&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# plot a ROC curve for a single prediction run
# and color the curve according to cutoff.
data(ROCR.simple)
df &amp;lt;- data.frame(ROCR.simple)
pred &amp;lt;- prediction(df$predictions, df$labels)
perf &amp;lt;- performance(pred,&amp;quot;tpr&amp;quot;,&amp;quot;fpr&amp;quot;)
plot(perf,colorize=TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;proc---2010&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://CRAN.R-project.org/package=pROC&#34;&gt;pROC&lt;/a&gt; - 2010&lt;/h3&gt;
&lt;p&gt;It is clear from the downloads curve that &lt;code&gt;pROC&lt;/code&gt; is also popular with data scientists. I like that it is pretty easy to get confidence intervals for the Area Under the Curve, &lt;code&gt;AUC&lt;/code&gt;, on the plot.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(pROC)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Type &amp;#39;citation(&amp;quot;pROC&amp;quot;)&amp;#39; for a citation.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;pROC&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following objects are masked from &amp;#39;package:stats&amp;#39;:
## 
##     cov, smooth, var&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pROC_obj &amp;lt;- roc(df$labels,df$predictions,
            smoothed = TRUE,
            # arguments for ci
            ci=TRUE, ci.alpha=0.9, stratified=FALSE,
            # arguments for plot
            plot=TRUE, auc.polygon=TRUE, max.auc.polygon=TRUE, grid=TRUE,
            print.auc=TRUE, show.thres=TRUE)


sens.ci &amp;lt;- ci.se(pROC_obj)
plot(sens.ci, type=&amp;quot;shape&amp;quot;, col=&amp;quot;lightblue&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in plot.ci.se(sens.ci, type = &amp;quot;shape&amp;quot;, col = &amp;quot;lightblue&amp;quot;): Low
## definition shape.&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;plot(sens.ci, type=&amp;quot;bars&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;prroc---2014&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://cran.r-project.org/package=PRROC&#34;&gt;PRROC&lt;/a&gt; - 2014&lt;/h3&gt;
&lt;p&gt;Although not nearly as popular as &lt;code&gt;ROCR&lt;/code&gt; and &lt;code&gt;pROC&lt;/code&gt;, &lt;code&gt;PRROC&lt;/code&gt; seems to be making a bit of a comeback lately. The terminology for the inputs is a bit eclectic, but once you figure that out, the &lt;code&gt;roc.curve()&lt;/code&gt; function plots a clean ROC curve with minimal fuss. &lt;code&gt;PRROC&lt;/code&gt; is really set up to do precision-recall curves, as the &lt;a href=&#34;https://cran.r-project.org/web/packages/PRROC/vignettes/PRROC.pdf&#34;&gt;vignette&lt;/a&gt; indicates.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(PRROC)

PRROC_obj &amp;lt;- roc.curve(scores.class0 = df$predictions, weights.class0=df$labels,
                       curve=TRUE)
plot(PRROC_obj)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;plotroc---2014&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://CRAN.R-project.org/package=plotROC&#34;&gt;plotROC&lt;/a&gt; - 2014&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;plotROC&lt;/code&gt; is an excellent choice for drawing ROC curves with &lt;code&gt;ggplot()&lt;/code&gt;. My guess is that it enjoys only limited popularity because the documentation uses medical terminology like “disease status” and “markers”. Nevertheless, the documentation, which includes both a &lt;a href=&#34;https://cran.r-project.org/web/packages/plotROC/vignettes/examples.html&#34;&gt;vignette&lt;/a&gt; and a &lt;a href=&#34;https://sachsmc.shinyapps.io/plotROC/&#34;&gt;Shiny application&lt;/a&gt;, is very good.&lt;/p&gt;
&lt;p&gt;The package offers a number of feature-rich &lt;code&gt;ggplot()&lt;/code&gt; geoms that enable the production of elaborate plots. The following plot contains some styling, and includes &lt;a href=&#34;https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Clopper%E2%80%93Pearson_interval&#34;&gt;Clopper and Pearson (1934) exact method&lt;/a&gt; confidence intervals.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(plotROC)
rocplot &amp;lt;- ggplot(df, aes(m = predictions, d = labels))+ geom_roc(n.cuts=20,labels=FALSE)
rocplot + style_roc(theme = theme_grey) + geom_rocci(fill=&amp;quot;pink&amp;quot;) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
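&lt;p&gt;For a quick sense of what those intervals are, the Clopper-Pearson exact interval for a single proportion is available in base R via &lt;code&gt;binom.test()&lt;/code&gt; (illustrative counts below, not taken from the plot):&lt;/p&gt;

```r
# exact 95% Clopper-Pearson interval for, say, a TPR estimated as
# 45 hits out of 60 positive cases (hypothetical counts)
ci = as.numeric(binom.test(45, 60)$conf.int)
round(ci, 3)
```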
&lt;/div&gt;
&lt;div id=&#34;precrec---2015&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://cran.r-project.org/package=precrec&#34;&gt;precrec&lt;/a&gt; - 2015&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;precrec&lt;/code&gt; is another library for plotting ROC and precision-recall curves.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(precrec)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;precrec&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following object is masked from &amp;#39;package:pROC&amp;#39;:
## 
##     auc&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;precrec_obj &amp;lt;- evalmod(scores = df$predictions, labels = df$labels)
autoplot(precrec_obj)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-10-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Parameter options for the &lt;code&gt;evalmod()&lt;/code&gt; function make it easy to produce basic plots of various model features.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;precrec_obj2 &amp;lt;- evalmod(scores = df$predictions, labels = df$labels, mode=&amp;quot;basic&amp;quot;)
autoplot(precrec_obj2)   &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;rocit---2019&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;a href=&#34;https://cran.r-project.org/package=ROCit&#34;&gt;ROCit&lt;/a&gt; - 2019&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ROCit&lt;/code&gt; is a new package for plotting ROC curves and other binary classification visualizations that rocketed onto the scene in January, and is climbing quickly in popularity. I would never have discovered it if I had automatically filtered my original search by downloads. The default plot includes the location of &lt;a href=&#34;https://en.wikipedia.org/wiki/Youden%27s_J_statistic&#34;&gt;Youden’s J Statistic&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ROCit)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: package &amp;#39;ROCit&amp;#39; was built under R version 3.5.2&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ROCit_obj &amp;lt;- rocit(score=df$predictions,class=df$labels)
plot(ROCit_obj)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Several other visualizations are possible. The following plot shows the cumulative densities of the positive and negative responses. The KS statistic shows the maximum distance between the two curves.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ksplot(ROCit_obj)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-02-08-some-r-packages-for-roc-curves_files/figure-html/unnamed-chunk-13-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
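&lt;p&gt;The KS statistic itself is simple to compute by hand as the largest gap between the two empirical CDFs; a base-R sketch on simulated scores (not the &lt;code&gt;ROCit&lt;/code&gt; data) agrees with &lt;code&gt;ks.test()&lt;/code&gt;:&lt;/p&gt;

```r
set.seed(123)
pos = rnorm(1000, mean = 1)   # scores of positive responses
neg = rnorm(1000, mean = 0)   # scores of negative responses

# maximum vertical distance between the two empirical CDFs,
# evaluated over all observed score values
grid = sort(c(pos, neg))
ks_manual = max(abs(ecdf(pos)(grid) - ecdf(neg)(grid)))
ks_base = unname(ks.test(pos, neg)$statistic)
c(manual = ks_manual, ks.test = ks_base)
```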
&lt;p&gt;In this attempt to dig into CRAN and uncover some of the resources R contains for plotting ROC curves and other binary classifier visualizations, I have only scratched the surface. Moreover, I have deliberately ignored the many packages available for specialized applications, such as &lt;a href=&#34;https://cran.r-project.org/package=survivalROC&#34;&gt;survivalROC&lt;/a&gt; for computing time-dependent ROC curves from censored survival data, and &lt;a href=&#34;https://cran.r-project.org/web/packages/cvAUC/index.html&#34;&gt;cvAUC&lt;/a&gt;, which contains functions for evaluating cross-validated AUC measures. Nevertheless, I hope that this little exercise will help you find what you are looking for.&lt;/p&gt;
&lt;/div&gt;

        &lt;script&gt;window.location.href=&#39;https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/&#39;;&lt;/script&gt;
      </description>
    </item>
    
    <item>
      <title>ROC Curves</title>
      <link>https://rviews.rstudio.com/2019/01/17/roc-curves/</link>
      <pubDate>Thu, 17 Jan 2019 00:00:00 +0000</pubDate>
      
      <guid>https://rviews.rstudio.com/2019/01/17/roc-curves/</guid>
      <description>
        


&lt;p&gt;I have been thinking about writing a short post on R resources for working with receiver operating characteristic (&lt;a href=&#34;https://en.wikipedia.org/wiki/Receiver_operating_characteristic&#34;&gt;ROC&lt;/a&gt;) curves, but first I thought it would be nice to review the basics. In contrast to the usual (usual for data scientists, anyway) machine learning point of view, I’ll frame the topic closer to its historical origins as a portrait of practical decision theory.&lt;/p&gt;
&lt;p&gt;ROC curves were invented during WWII to help radar operators decide whether the signal they were getting indicated the presence of an enemy aircraft or was just noise. (&lt;a href=&#34;https://web.stanford.edu/~yesavage/ROC%20Slides%20OHara.ppt&#34;&gt;O’Hara et al.&lt;/a&gt; specifically refer to the Battle of Britain, but I haven’t been able to track that down.)&lt;/p&gt;
&lt;p&gt;The basic setup of the problem I am relying on comes from James Egan’s classic text &lt;a href=&#34;https://amzn.to/2FgC3BH&#34;&gt;&lt;em&gt;Signal Detection Theory and ROC Analysis&lt;/em&gt;&lt;/a&gt;. It goes something like this: suppose there is an observed quantity (maybe the amplitude of the radar blip), X, that could indicate either the presence of a meaningful signal (e.g. from a &lt;a href=&#34;https://en.wikipedia.org/wiki/Messerschmitt_Bf_109&#34;&gt;Messerschmitt&lt;/a&gt;) embedded in noise, or just noise alone (geese). When viewing X in some small interval of time, we would like to establish a threshold or cutoff value, c, such that if X &amp;gt; c we can be reasonably sure we are observing a signal and not just noise. The situation is illustrated in the little animation below.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse)
library(gganimate)  #for animation
library(magick)     # to put animations side by side&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We model the noise alone as random draws from a N(0,1) distribution, and signal plus noise as draws from N(s_mean, s_sd), and we compute two conditional probabilities: the probability of a “Hit”, P(X &amp;gt; c | a signal is present), and the probability of a “False Alarm”, P(X &amp;gt; c | noise only).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;s_mean &amp;lt;- 2  # signal mean
s_sd &amp;lt;- 1.1   # signal standard deviation

x &amp;lt;- seq(-5,5,by=0.01) # range of signal
signal &amp;lt;- rnorm(100000,s_mean,s_sd)
noise &amp;lt;- rnorm(100000,0,1)

PX_n &amp;lt;- 1 - pnorm(x, mean = 0, sd = 1) # P(X &amp;gt; c | noise only) = False alarm rate
PX_sn &amp;lt;- 1 - pnorm(x, mean = s_mean, sd = s_sd) # P(X &amp;gt; c | signal plus noise) = Hit rate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We plot these two distributions in the left panel of the animation for different values of the cutoff threshold.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;threshold &amp;lt;- data.frame(val = seq(from = .5, to = s_mean, by = .2))

dist &amp;lt;- 
  data.frame(signal = signal, noise = noise) %&amp;gt;% 
  gather(data, value) %&amp;gt;% 
  ggplot(aes(x = value, fill = data)) +
  geom_density(trim = TRUE, alpha = .5) +
  ggtitle(&amp;quot;Conditional Distributions&amp;quot;) +
  xlab(&amp;quot;observed signal&amp;quot;)  + 
  scale_fill_manual(values = c(&amp;quot;pink&amp;quot;, &amp;quot;blue&amp;quot;))

p1 &amp;lt;- dist + geom_vline(data = threshold, xintercept = threshold$val, color = &amp;quot;red&amp;quot;) +
            transition_manual(threshold$val)
p1 &amp;lt;- animate(p1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And, we plot the ROC curve for our detection system in the right panel. Each point in this plot corresponds to one of the cutoff thresholds in the left panel.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;df2 &amp;lt;- data.frame(x, PX_n, PX_sn)
roc &amp;lt;- ggplot(df2) +
  xlab(&amp;quot;P(X | n)&amp;quot;) + ylab(&amp;quot;P(X | sn)&amp;quot;) +
  geom_line(aes(PX_n, PX_sn)) +
  geom_abline(slope = 1) +
  ggtitle(&amp;quot;ROC Curve&amp;quot;) + 
  coord_equal()

q1 &amp;lt;- roc +
        geom_point(data = threshold, aes(1-pnorm(val),
                          1- pnorm(val, mean = s_mean, sd = s_sd)), 
                          color = &amp;quot;red&amp;quot;) +
                          transition_manual(val)

q1 &amp;lt;- animate(q1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The slick trick of getting these two animation panels to line up in the same frame is due to a helper function from Thomas Pedersen and Patrick Touche that can be found &lt;a href=&#34;https://github.com/thomasp85/gganimate/issues/226&#34;&gt;here&lt;/a&gt;)&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;combine_gifs(p1,q1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2019-01-06-roc-curves_files/figure-html/unnamed-chunk-6-1.gif&#34; /&gt;&lt;!-- --&gt;&lt;/p&gt;
&lt;p&gt;Notice that as the cutoff line moves further to the right, lowering the false alarm rate, the corresponding point moves down the ROC curve towards a lower hit rate. This illustrates the fundamental tradeoff between hit rate and false alarm rate in the underlying decision problem. For any given problem, a decision algorithm or classifier will live on some ROC curve in false alarm / hit rate space. Improving the hit rate usually comes at the cost of increasing the probability of a false alarm.&lt;/p&gt;
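&lt;p&gt;The tradeoff is easy to check numerically. The following sketch recomputes both rates at a few cutoffs, assuming for illustration that the noise is standard normal with s_mean = 2 and s_sd = 1 (substitute the values used in the simulation above):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hit rate and false alarm rate at a few cutoffs (illustrative values)
s_mean &amp;lt;- 2
s_sd &amp;lt;- 1
cutoffs &amp;lt;- c(0.5, 1.0, 1.5, 2.0)
fa &amp;lt;- 1 - pnorm(cutoffs)                            # P(X &amp;gt; c | noise)
hit &amp;lt;- 1 - pnorm(cutoffs, mean = s_mean, sd = s_sd) # P(X &amp;gt; c | signal plus noise)
round(data.frame(cutoff = cutoffs, false_alarm = fa, hit = hit), 3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Raising the cutoff lowers the false alarm rate and the hit rate together; no choice of cutoff improves one without hurting the other.&lt;/p&gt;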
&lt;p&gt;The simulation code also lets you vary s_mean, the mean of the signal. Setting this to a large value (say, 5) will sufficiently separate the signal from the noise, and you will get the kind of perfect-looking ROC curve you may be accustomed to seeing produced by your best classification models.&lt;/p&gt;
&lt;p&gt;The usual practice in machine learning applications is to compute the area under the ROC curve, the &lt;a href=&#34;https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve&#34;&gt;AUC&lt;/a&gt;. This has become the “gold standard” for evaluating classifiers. Given a choice between different classification algorithms, data scientists routinely select the classifier with the highest AUC. The intuition behind this is compelling: given that the ROC is a monotone increasing curve, the best possible curve will bend sharply toward the upper left-hand corner and have an AUC approaching one (all of the area in ROC space).&lt;/p&gt;
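&lt;p&gt;For the simulated detection problem above, the AUC can be approximated directly from the hit and false alarm rates with the trapezoidal rule. This sketch rebuilds the two vectors over a fine grid, again assuming s_mean = 2 and s_sd = 1 for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Trapezoidal approximation of the area under the simulated ROC curve
s_mean &amp;lt;- 2
s_sd &amp;lt;- 1
x &amp;lt;- seq(-4, 6, by = 0.01)
PX_n &amp;lt;- 1 - pnorm(x)                             # false alarm rate at each cutoff
PX_sn &amp;lt;- 1 - pnorm(x, mean = s_mean, sd = s_sd)  # hit rate at each cutoff
o &amp;lt;- order(PX_n)  # integrate in order of increasing false alarm rate
auc &amp;lt;- sum(diff(PX_n[o]) * (head(PX_sn[o], -1) + tail(PX_sn[o], -1)) / 2)
auc # close to pnorm(s_mean / sqrt(1 + s_sd^2)), the binormal closed form&lt;/code&gt;&lt;/pre&gt;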
&lt;p&gt;Unfortunately, automatic calculation of the AUC, and automated model selection based on it, discourage analysis of how the properties and weaknesses of ROC curves may pertain to the problem at hand. Keeping sight of the decision theory point of view may help to protect against the spell of mechanistic thinking encouraged by powerful algorithms. Although automatically selecting a classifier based on the value of the AUC may make good sense most of the time, things can go wrong. For example, it is not uncommon for analysts to interpret the AUC as a measure of the accuracy of the classifier. But the AUC is not a measure of accuracy, as a little thought about the decision problem makes clear. The irony here is that there was a time, not too long ago, when people thought it was necessary to argue that the AUC is a better measure than accuracy for evaluating machine learning algorithms. For example, have a look at the &lt;a href=&#34;https://www.cse.ust.hk/nevinZhangGroup/readings/yi/Bradley_PR97.pdf&#34;&gt;1997 paper&lt;/a&gt; by Andrew Bradley, where he concludes that &lt;em&gt;“…AUC be used in preference to overall accuracy for ‘single number’ evaluation of machine learning algorithms”.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;What does the AUC measure? For the binary classification problem of our simple signal processing example, a little calculus will show that the AUC is the probability that a randomly drawn interval with a signal present will produce a higher X value than a randomly drawn interval containing noise alone. See &lt;a href=&#34;https://link.springer.com/content/pdf/10.1007%2Fs10994-009-5119-5.pdf&#34;&gt;&lt;em&gt;Hand (2009)&lt;/em&gt;&lt;/a&gt;, and the very informative &lt;a href=&#34;https://stats.stackexchange.com/questions/180638/how-to-derive-the-probabilistic-interpretation-of-the-auc&#34;&gt;&lt;em&gt;StackExchange&lt;/em&gt;&lt;/a&gt; discussion, for the math.&lt;/p&gt;
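&lt;p&gt;This probabilistic interpretation is easy to verify by simulation: draw many noise-only and signal-plus-noise observations and count how often the signal draw wins. A sketch, again with s_mean = 2 and s_sd = 1 assumed for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Monte Carlo check: AUC = P(signal draw &amp;gt; noise draw)
set.seed(42)
s_mean &amp;lt;- 2
s_sd &amp;lt;- 1
n &amp;lt;- 100000
noise_draws &amp;lt;- rnorm(n)                             # X | noise alone
signal_draws &amp;lt;- rnorm(n, mean = s_mean, sd = s_sd) # X | signal plus noise
mean(signal_draws &amp;gt; noise_draws) # agrees with the AUC up to Monte Carlo error&lt;/code&gt;&lt;/pre&gt;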
&lt;p&gt;Also note that, in the paper just cited, Hand examines some of the deficiencies of the AUC. His discussion provides an additional incentive for keeping the decision theory tradeoff in mind when working with ROC curves. Hand concludes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;…it [AUC] is fundamentally incoherent in terms of misclassification costs: the AUC uses different misclassification cost distributions for different classifiers. This means that using the AUC is equivalent to using different metrics to evaluate different classification rules.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;and goes on to propose the &lt;strong&gt;H measure&lt;/strong&gt; for ranking classifiers (see the R package &lt;a href=&#34;https://cran.r-project.org/package=hmeasure&#34;&gt;hmeasure&lt;/a&gt;). Following up on this will have to be an investigation for another day.&lt;/p&gt;
&lt;p&gt;Our discussion in this post has taken us part way along just one path through the enormous literature on ROC curves, which could not be totally explored in a hundred posts. I will just mention that not long after its inception, ROC analysis was used to establish a conceptual framework for problems relating to sensation and perception in the field of psychophysics (&lt;a href=&#34;https://psych.nyu.edu/pelli/pubs/pelli1995methods.pdf&#34;&gt;&lt;em&gt;Pelli and Farell (1995)&lt;/em&gt;&lt;/a&gt;) and was thereafter applied to decision problems in medical diagnostics (&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3755824/#B26&#34;&gt;&lt;em&gt;Hajian-Tilaki (2013)&lt;/em&gt;&lt;/a&gt;), national intelligence (&lt;a href=&#34;https://www.nap.edu/read/13062/chapter/7&#34;&gt;&lt;em&gt;McClelland (2011)&lt;/em&gt;&lt;/a&gt;), and just about any field that collects data to support decision making.&lt;/p&gt;
&lt;p&gt;If you are interested in delving deeper into ROC curves, the references in papers mentioned above may help to guide further exploration.&lt;/p&gt;

        &lt;script&gt;window.location.href=&#39;https://rviews.rstudio.com/2019/01/17/roc-curves/&#39;;&lt;/script&gt;
      </description>
    </item>
    
  </channel>
</rss>
