Analysis of results of anomaly detection algorithm

crimson_crow Member Posts: 3 Contributor I
edited May 2021 in Help
Hello there. I have built a process in RapidMiner using the k-NN Global Anomaly Score operator to detect outliers in my dataset.
The problem is that I selected the attributes in the appropriate operators, but I am at a dead end trying to describe what is going on in the visualization tab. I need help understanding what is happening in the process. I attach two example .csv files. I just can't understand why the outlier score is calculated for each year in the dataset. My guess is that I run the whole dataset through the algorithm to get an anomaly score, and then I should decide which values are anomalous.
I'm looking for help in understanding the visualization tab.
P.S. One dataset is a mix of car sales statistics for Europe, and the other one is specific to Germany.

Answers

  • ceaperez Member Posts: 447 Unicorn
    Hi @crimson_crow,

    This operator assigns an anomaly score to each observation, based on its k nearest neighbors (k-NN) under the selected distance measure. That is why every row (every year in your data) gets its own score.
    For interpretation, a higher score means the observation is more likely to be an outlier.
    In the visualization you can create one plot with the attribute values and another with the corresponding scores, for example coloring each point by its anomaly score.

    Best
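
To make the scoring described above concrete, here is a minimal Python sketch of a k-NN global anomaly score. It assumes the score is the average Euclidean distance from each row to its k nearest neighbors, which I believe is the operator's default but have not checked against its source. The function name `knn_global_anomaly_scores` and the yearly sales figures are hypothetical and not taken from the attached files.

```python
import math

def knn_global_anomaly_scores(points, k=2):
    """Score each point by its mean Euclidean distance to its k nearest neighbors.

    Assumption: this mirrors the "average distance to k neighbors" idea only;
    it is not the RapidMiner operator's actual implementation.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Made-up yearly sales figures, one row per year (illustration only).
years = [2015, 2016, 2017, 2018, 2019, 2020]
sales = [100.0, 102.0, 101.0, 103.0, 104.0, 60.0]

# Every row receives its own score; nothing is labeled "anomalous" yet.
scores = knn_global_anomaly_scores([(s,) for s in sales], k=2)

# Rank rows by score, highest (most outlier-like) first.
ranked = sorted(zip(years, sales, scores), key=lambda row: row[2], reverse=True)
for year, value, score in ranked:
    print(year, value, round(score, 2))
```

The scores are only a ranking. Deciding which rows count as anomalies is a separate step, for example taking the top few rows or picking a threshold after looking at the score distribution.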

