Anomaly extension: Generate ROC seems to mirror FP/FN rate
I am using the anomaly extension against an artificial dataset. I use three algorithms to assign an anomaly score: K-NN Global, uCBLOF and LOF. My dataset contains a label for the anomalies that are supposed to show up. I use Generate ROC to measure performance. The first thing Generate ROC does is choose a threshold for the outlier score and add a boolean "prediction". I noticed that in the resulting confusion matrices the FP and FN counts are always identical. It seems as if it chooses the threshold based on the label to generate the outliers. That seems odd.
The dataset contains 1676 items labeled 'true'.
Please see below a histogram of the scores that uses the label as color. As can be seen, it fails to assign a high score to the outliers. This is as expected because our dataset contains global anomalies. Note that the Y-axis is logarithmic for readability.
Below that is the resulting confusion matrix from Generate ROC. It contains 1676 FNs, which is explainable if you look at the scores.
However, it also contains 1676 FPs, which is suspicious. I looked in the dataset and there are indeed 1676 predictions with the value "true", so it is not a drawing issue.
Am I overlooking something?
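One way to check whether this symmetry is just arithmetic: if the threshold is chosen so that the number of items flagged as outliers equals the number of items labeled 'true', then FP = flagged - TP and FN = labeled - TP must be equal, no matter how good or bad the scores are. The Python sketch below illustrates this under that assumption. It is not the operator's actual implementation; the function name `confusion_at_top_k`, the total of 20000 items, and the random scores are all made up for illustration. Only the count of 1676 labeled outliers comes from my data.

```python
import random

def confusion_at_top_k(scores, labels, k):
    """Flag the k highest-scoring items as outliers and count TP, FP, FN, TN."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:k])
    tp = sum(1 for i in flagged if labels[i])
    fp = len(flagged) - tp      # flagged but not labeled as outlier
    fn = sum(labels) - tp       # labeled as outlier but not flagged
    tn = len(scores) - tp - fp - fn
    return tp, fp, fn, tn

random.seed(0)
n_items, n_outliers = 20000, 1676   # n_items is a hypothetical dataset size
labels = [i < n_outliers for i in range(n_items)]
scores = [random.random() for _ in range(n_items)]  # deliberately uninformative

# Assumption: the threshold is set so that as many items are flagged
# as there are labeled outliers.
k = sum(labels)
tp, fp, fn, tn = confusion_at_top_k(scores, labels, k)
print(tp, fp, fn, tn)
```

With any scores, `fp` and `fn` come out identical here, so equal FP and FN counts would be consistent with a threshold derived from the number of labeled outliers rather than with a drawing or counting bug.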
Best Answer
MaartenK Member Posts: 17 Contributor II
Answers