How to interpret a ROC plot?

csoares, Member, Posts: 13, Contributor II
edited November 2019 in Help
Hi,

I generated a ROC plot with the process given below. I assume that the red line (ROC) is the proportion of TP against the proportion of FP, but I can't understand what the blue line (ROC (Thresholds)) represents. Can anyone explain?

Regards,
Carlos







<parameter key="label_column" value="2"/>
<operator name="NaiveBayes" class="NaiveBayes">

Answers

  • michaelgloven, RapidMiner Certified Analyst, Member, Posts: 46, Guru

    I have the same question. I'm sure it's a simple answer, but I can't find an explanation in the documentation.

    Thanks!

  • sgenzer, Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator, Posts: 2,959, Community Manager

    Hello @michaelgloven, for these kinds of fundamental data science background topics I usually go "old school" with books (yes, paper). My go-to texts are "Data Mining for the Masses" by Dr. Matthew North, and "Predictive Analytics and Data Mining" by Kotu & Deshpande. Both are excellent and full of explicit examples using RapidMiner. For your question about ROC curves, Chapter 8 of Kotu & Deshpande is all about Model Evaluation, and it starts with a long explanation of ROC.

    Scott

  • michaelgloven, RapidMiner Certified Analyst, Member, Posts: 46, Guru

    Many thanks, Scott. Figure 8.5 on page 269 of the Kotu book also shows the ROC (thresholds) curve without explanation. It looks like the inverse of the ROC curve. There is probably a simple explanation, but it is still a mystery to me. I'll check out your second resource suggestion.

    Mike

  • tftemme, Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member, Posts: 164, RM Research

    Hi @michaelgloven,

    Each point of the ROC curve is the rate of true positives (or proportion of TP, as it is called in the first post) vs. the rate of false positives (proportion of FP) for a specific threshold applied to the confidence of the corresponding classifier.

    The ROC (thresholds) curve just shows this confidence threshold (sometimes also called confidence cut).
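
To make the threshold sweep concrete, here is a minimal Python sketch (not RapidMiner's implementation). The function name `roc_points` and the toy labels and confidences are made up for illustration, and it assumes an example is predicted positive when its confidence is greater than or equal to the threshold.

```python
def roc_points(labels, confidences):
    """Return (threshold, FPR, TPR) triples, one per distinct confidence value.

    labels: 1 for the positive class, 0 for the negative class.
    confidences: the classifier's confidence for the positive class.
    Assumption: an example is predicted positive when confidence >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Sweep the threshold from the highest confidence down to the lowest.
    for threshold in sorted(set(confidences), reverse=True):
        tp = sum(1 for y, c in zip(labels, confidences) if c >= threshold and y == 1)
        fp = sum(1 for y, c in zip(labels, confidences) if c >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Made-up toy data: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
confidences = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, confidences)
for threshold, fpr, tpr in points:
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each printed row is one point of the ROC curve (FPR, TPR) together with the confidence threshold that produced it. Lowering the threshold can only add predicted positives, so both rates grow or stay the same from row to row.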

    Hope this helps,

    Best regards,
    Fabian

  • michaelglovenmichaelgloven RapidMiner Certified Analyst, MemberPosts:46Guru

    Fabian, appreciate your explanation on the thresholds...I figured it was simple, but needed an expert to point it out!

  • PatricioWolff, Member, Posts: 3, Contributor I
    In the ROC (Thresholds) curve, the vertical axis indicates the threshold value and the horizontal axis shows the false positive rate.
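
As a rough illustration of those two axes, the sketch below builds both series over the same horizontal axis. It is plain Python, not RapidMiner output. The helper `rates_at` and the data are hypothetical, and it assumes an example counts as positive when its confidence is at least the threshold.

```python
def rates_at(threshold, labels, confidences):
    """Return (FPR, TPR) when predicting positive at confidence >= threshold."""
    predicted = [c >= threshold for c in confidences]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    return fp / labels.count(0), tp / labels.count(1)


# Made-up toy data: four positives and four negatives.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
confidences = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25]

roc_series = []        # (FPR, TPR): the ROC line
threshold_series = []  # (FPR, threshold): the ROC (Thresholds) line
for t in sorted(set(confidences), reverse=True):
    fpr, tpr = rates_at(t, labels, confidences)
    roc_series.append((fpr, tpr))
    threshold_series.append((fpr, t))

print(roc_series)
print(threshold_series)
```

Moving right along the false positive rate axis, the true positive rate never decreases while the threshold never increases. This would explain why the thresholds line looks roughly like a mirror image of the ROC line.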
  • SGolbert, RapidMiner Certified Analyst, Member, Posts: 344, Unicorn
    edited September 2019
    Very roughly, you have to look out for two things:
    1. The area under the curve (AUC): this is the integral of the ROC curve. Higher values generally indicate a model that separates the two classes better (0.5 corresponds to random guessing, 1.0 to a perfect ranking).
    2. The form of the curve: ideally the curve should be as smooth as possible. Large "jumps" indicate that the model is sensitive to small changes in the dataset. The initial jump is an exception.
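
The "integral over the curve" in point 1 can be approximated with the trapezoidal rule. The following Python sketch assumes the ROC curve is given as a list of (FPR, TPR) points. The function name `auc_trapezoid` and the sample points are made up for illustration and do not come from the process in the first post.

```python
def auc_trapezoid(points):
    """Area under a curve given as (x, y) points, using the trapezoidal rule."""
    pts = sorted(points)  # order by x (FPR), then by y (TPR)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Each pair of neighbouring points contributes one trapezoid.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points (FPR, TPR), including the (0, 0) and (1, 1) endpoints.
roc = [(0.0, 0.0), (0.0, 0.5), (0.25, 0.5), (0.25, 1.0), (1.0, 1.0)]
# The diagonal corresponds to random guessing.
diagonal = [(0.0, 0.0), (1.0, 1.0)]

print(auc_trapezoid(roc))
print(auc_trapezoid(diagonal))
```

Vertical segments (same FPR, higher TPR) add no area on their own. The area comes from how high the curve already is while the false positive rate grows.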