Interpretation of ROC Analysis

Muhammed_Fatih_ Member Posts: 93 Maven
edited February 2020 in Help
Hello Community,

I have derived the following ROC curves by considering four classification models:



As you can see, SVM and k-NN each produce a curve surrounded by a shaded band.

Would it be correct to conclude from the graph that only k-NN and SVM were able to learn from the given dataset and the remaining two (DT and NB) were not?

What does the shaded band mean in detail? I would interpret it as the deviation across the learning runs, with the curve itself being the mean that runs through the band.

I thank you in advance for your help!

Best regards,

Fatih

Answers

  • [Deleted User] Posts: 0 Learner III
    Hello,

    You can watch this video; I hope it helps you:
    https://academy.www.kenlockard.com/learn/video/finding-the-right-model

    All the best
    mbs
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited February 2020
    Hello @Muhammed_Fatih_

    Are you sure the decision tree and NB are not learning? Based on the ROC curves, their AUC values are 1 or close to 1. If I am reading the plot correctly, DT and NB are separating the classes far better than SVM and k-NN.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing
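To make the point about AUC concrete, here is a minimal sketch in plain Python. It assumes the common reading of AUC as the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy scores are made up for illustration and are not taken from the models in this thread.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example has the
    higher score. Tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three positives followed by three negatives.
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every positive outranks every negative
mixed = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]     # one positive falls below one negative
constant = [0.5] * 6                        # the scores carry no information

auc_perfect = roc_auc(labels, perfect)
auc_mixed = roc_auc(labels, mixed)
auc_constant = roc_auc(labels, constant)
```

A curve that hugs the top-left corner corresponds to the `perfect` case. A model that learned nothing would sit near the diagonal, which is the `constant` case. An AUC of 1 on its own does not say whether the result will hold up on new data.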

  • Muhammed_Fatih_ Member Posts: 93 Maven
    Hello @mbs,

    thank you for the link!

    Hello @varunm1,

    I am not sure whether they learn or not. But reaching such high values compared to SVM and k-NN looks to me like an indicator of overfitting. How do you see that? Would you also interpret DT and NB as appropriate solutions here? If yes, why?
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Muhammed_Fatih_

    I can only comment on that with knowledge of the data and the type of analysis you are doing. If it's a split validation, there is a chance that you get high performance like this purely by chance. There are also other factors, such as temporal characteristics in the data, and many other checks that you need to do when you get results this good.

    Regards,
    Varun
  • Muhammed_Fatih_ Member Posts: 93 Maven
    Hello @varunm1,

    Thank you for your answer! I used cross-validation because studies have shown that it generates more accurate predictions in comparison to split validation.
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Muhammed_Fatih_

    Cross-validation is a good validation method, but if your data has temporal (time-dependent) characteristics or confounding relationships, it might sometimes overestimate performance. If you think there are none, then the models might simply be doing well. Different models work well for different types of data.

    You can also split your original data 70:30 or 80:20, depending on the size of your data, then cross-validate on the larger portion and test on the smaller portion to see how the model is doing.
    Regards,
    Varun
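The hold-out plus cross-validation procedure suggested above could be sketched as follows in plain Python. This only shows the index bookkeeping. The helper names `holdout_split` and `k_fold`, the 80:20 ratio, the 100 examples, and the 5 folds are illustrative assumptions and not RapidMiner operators. The random shuffle also assumes the rows have no temporal order, which is exactly the caveat raised above.

```python
import random


def holdout_split(n_examples, test_fraction=0.2, seed=0):
    """Shuffle the row indices and set aside a test portion that is
    never used during cross-validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_examples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (training, validation) index lists for k-fold
    cross-validation over the given indices."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Hypothetical dataset of 100 rows, split 80:20.
train_idx, test_idx = holdout_split(100, test_fraction=0.2, seed=42)

# Cross-validate on the larger portion only.
splits = list(k_fold(train_idx, k=5))

# After model selection on `splits`, the chosen model would be retrained
# on train_idx and scored once on test_idx.
```

If the cross-validated performance and the performance on the held-out 20% are both close to perfect, the result is less likely to be an artifact of one lucky split. A large gap between the two would be a reason to look for leakage or time dependence in the data.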
  • Muhammed_Fatih_ Member Posts: 93 Maven
    edited February 2020
    Hello @varunm1,
    hello @mbs,

    thank you for your answers!

    To come back to the initial question and refine it: do you think the marked ROC curve shape is common, that is, a ROC curve that coincides with the optimum? Is this possible in general?

