Question regarding precision@n and roc(@n?) #120
Comments
You are correct. ROC @ n samples is also a popular choice; I will put this on my todo list. I don't know of a particular reason why people usually report the full ROC, but reporting ROC @ n is not uncommon :) One thought: ROC @ n is a point evaluation, while the full ROC considers the whole ranking.
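To make the point-evaluation vs. full-picture distinction concrete, a toy sketch (all data below is invented; `precision @ n` is computed by hand, the full AUC via scikit-learn):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy ground truth (1 = outlier) and anomaly scores from some detector.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.3, 0.2, 0.4, 0.5, 0.6, 0.9, 0.8, 0.7, 0.95])

# Full ROC AUC: uses the complete ranking of all samples.
auc = roc_auc_score(y_true, scores)      # 19/21 ~ 0.905

# Point evaluation at rank n (here n = number of true outliers):
# inspect only the n highest-scoring samples.
n = int(y_true.sum())
top_n = np.argsort(scores)[::-1][:n]
prec_at_n = y_true[top_n].mean()         # 2/3 ~ 0.667
```

The AUC rewards the ranking of every outlier against every inlier, while precision @ n only asks how many of the top n slots are actually outliers.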
Thank you very much for your answer. If the full ROC considers the whole picture, does a high ROC in this case just mean that, on average, the ground-truth outliers are ranked ahead of most inlier points, but not necessarily in the top n?
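That is exactly the situation; a synthetic sketch of it (all numbers are made up for illustration), where every true outlier outranks the bulk of the inliers, giving a high AUC, yet a few inliers occupy all the top-n slots, so precision@n is zero:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = np.zeros(100, dtype=int)
y_true[:5] = 1                             # 5 true outliers
scores = rng.uniform(0.0, 0.5, size=100)   # bulk of inliers: low scores
scores[:5] = 0.8                           # outliers beat 90 of the 95 inliers
scores[5:10] = 0.9                         # ...but 5 inliers score even higher

auc = roc_auc_score(y_true, scores)        # 450/475 ~ 0.947: "high AUC"
n = 5
prec_at_n = y_true[np.argsort(scores)[::-1][:n]].mean()  # 0.0
```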
@yzhao062 - do you need some help updating the documentation? I'm happy to open a pull request for this.
@evanmiller29 sorry for the delay. PRs are always welcome :)
@Henlam I think the most relevant paper for this topic is:
Hope this helps
Thanks for passing along the papers. I'm reading through them at the moment. I'm not 100% an outlier detection person (more general ML), but I'm keen to be involved in the project. Are you OK with that?
This metric always gives the same ROC regardless of the KNN contamination level.
This metric constantly changes depending on the KNN contamination level. Is this normal?
See this one: #144. If y_train_pred is the predicted scores, then it is normal. If y_train_pred is the predicted labels, then it is weird.
Indeed it is quite weird; it is the predicted labels: y_train_pred = clf.labels_ # binary labels (0: inliers, 1: outliers). My reproducible code:
I ran the code. The reason is that there are only 122 outliers among 100,000 samples, so you need to set the contamination small enough that fewer than 122 points are flagged to see a difference; otherwise the extra flagged points are misclassified anyway. However, you should not use ROC to evaluate labels, but scores.
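A hypothetical sketch of why the continuous scores, not the thresholded labels, should go into `roc_auc_score` (the dataset size, outlier count, and score distributions below are invented to mimic the rare-outlier setup in this thread):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_samples, n_outliers = 10_000, 12          # rare outliers, as in the thread
y_true = np.zeros(n_samples, dtype=int)
y_true[:n_outliers] = 1
scores = rng.normal(0.0, 1.0, n_samples)
scores[:n_outliers] += 4.0                  # outliers get clearly higher scores

# ROC on the continuous scores uses the full ranking of every sample.
auc_on_scores = roc_auc_score(y_true, scores)

# ROC on binarized labels: once contamination * n_samples is far above the
# true outlier count, the extra flagged points are false positives no matter
# what, and the number mostly reflects the threshold, not the model.
for contamination in (0.001, 0.01, 0.1):
    thresh = np.quantile(scores, 1.0 - contamination)
    y_labels = (scores > thresh).astype(int)
    print(contamination, roc_auc_score(y_true, y_labels))
```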
Why is precision used to decide the model? An outlier should not be detected as an inlier, as that would be the costliest error, so the false negative rate should be low; i.e., the type II error should be considered when deciding the model.
I think it is indeed precision @ rank n (or precision @ rank k), which is still slightly different from plain precision.
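To make that difference concrete, a toy comparison (the data and the 0.5 cutoff are invented): plain precision scores whatever set of points the detector's own threshold flags, while precision @ rank n always inspects exactly the n top-ranked samples:

```python
import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([0, 0, 0, 1, 1, 0, 0, 1, 0, 0])
scores = np.array([0.2, 0.1, 0.6, 0.9, 0.7, 0.3, 0.4, 0.8, 0.5, 0.35])

# Plain precision: the detector flags everything above its own cutoff.
y_pred = (scores >= 0.5).astype(int)       # flags 5 points (indices 2,3,4,7,8)
plain = precision_score(y_true, y_pred)    # 3 true outliers / 5 flags = 0.6

# precision @ rank n with n = number of true outliers (3 here).
n = int(y_true.sum())
top_n = np.argsort(scores)[::-1][:n]       # indices 3, 7, 4 by score
p_at_n = y_true[top_n].mean()              # all three are outliers -> 1.0
```

The two agree only when the detector's threshold happens to flag exactly the n top-ranked points.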


Hello,
first and foremost, thank you for building this wrapper; it is of great use to me and many others.
I have a question regarding the evaluation:
Most outlier detection evaluation settings set the ranking number n equal to the number of outliers (i.e., the contamination), and so did I in my experiments.
My thought concerning the ROC and AUC score was:
In my case the precision@n of my chosen algorithms lies in the range 0.2-0.4 because it is a difficult dataset. However, the AUC score is quite high at the same time.
I would appreciate any thoughts on this, since I am fairly new to the topic and might not grasp the intuition of the ROC curve for this task.
Best regards
Hlam