Show the number of not classified instances for multi-label models (#3964) #4150
base: master
Conversation
y_true_count = (test_classes[:, num].tolist()).count(i)
table[i].append(y_true_count - sum(table[i][1:]))
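For context outside the diff, here is a minimal, self-contained sketch of what these two lines compute, using made-up data and the assumption that each `table[i]` row starts with the class value followed by the per-prediction counts:

```python
import numpy as np

# Assumed shapes (illustration only): test_classes is an (n_samples, n_labels)
# 0/1 matrix, `num` selects one label column, and table[i] is
# [class_value, count_predicted_0, count_predicted_1] for that column.
test_classes = np.array([[1, 0], [1, 1], [0, 1], [1, 0]])
num = 0
table = [[0, 1, 0], [1, 1, 1]]

for i in (0, 1):
    # how many test samples truly have value i for this label
    y_true_count = (test_classes[:, num].tolist()).count(i)
    # whatever is not covered by a confident prediction is "Not classified"
    table[i].append(y_true_count - sum(table[i][1:]))

print(table)  # [[0, 1, 0, 0], [1, 1, 1, 1]]: one true-1 sample was not classified
```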
The number of "Not classified" was originally part of the table but dropped earlier:
# Don't show the Not classified row in the table output
if "__NOT_CLASSIFIED__" in labels and not is_multilabel:
    confusion_matrix_table.pop(labels.index("__NOT_CLASSIFIED__"))
It may be cleaner to avoid dropping it in the first place.
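One way to read that suggestion (a sketch only; `build_table_labels` is a hypothetical helper, not code from this repository) is to exclude the label before the table is built instead of popping the finished row:

```python
# Hypothetical helper illustrating the suggestion: keep "__NOT_CLASSIFIED__"
# only for multi-label models, so no row has to be removed afterwards.
def build_table_labels(labels, is_multilabel):
    if is_multilabel:
        return list(labels)
    return [label for label in labels if label != "__NOT_CLASSIFIED__"]

print(build_table_labels(["defect", "enhancement", "__NOT_CLASSIFIED__"], False))
# ['defect', 'enhancement']
```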
@@ -499,8 +504,9 @@ def train(self, importance_cutoff=0.15, limit=None):

        tracking_metrics["report"] = report

        # no confidence threshold - no need to handle 'Not classified' instances
I think this comment is helpful in the context of reviewing this PR. However, once the PR is merged and that context is gone, the comment will feel out of place.
@@ -567,8 +573,9 @@ def train(self, importance_cutoff=0.15, limit=None):
                labels=confidence_class_names,
            )
        )
        # with confidence threshold - handle 'Not classified' instances by passing y_test
The same case as the comment above.
Hi @suhaibmujahid, thank you for the feedback!
Done :)
I just ran the command in issue-3964-run-log.txt again and checked the confusion matrix of the "Not classified" class: it gives [[6, 0], [1, 1]] for confidence threshold > 0.6. I am not sure how to interpret this, since that confusion matrix doesn't correspond to a specific class, whereas we would like to see the number of 'Not classified' instances in each specific class's visualizer.

In my current implementation I count "Not classified" class-wise, i.e. any non-classified samples from the test set are counted per class. Here the test set has 200 samples, so any non-classified samples among those 200 are counted, and the total sum of all elements in each per-class matrix is exactly the test size, 200 (a sketch of this accounting appears below).

Say, instead of this (using the current implementation):

We would have this (using the confusion matrix):
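A small self-contained sketch of that accounting (random data and an assumed confidence rule; none of the names below come from the PR), showing that each class's classified counts plus its 'Not classified' counts always add up to the test size:

```python
import numpy as np

rng = np.random.default_rng(0)
n_test, n_classes, threshold = 200, 3, 0.6

y_true = rng.integers(0, 2, size=(n_test, n_classes))  # made-up ground truth
proba = rng.random(size=(n_test, n_classes))            # made-up probabilities

for c in range(n_classes):
    # Assumed rule: a sample is "classified" only if its probability for this
    # class is confidently above or below 0.5 (i.e. > 0.6 or < 0.4 here).
    confident = np.maximum(proba[:, c], 1 - proba[:, c]) > threshold
    y_pred = (proba[:, c] > 0.5).astype(int)

    # 2x2 counts over confidently classified samples only
    counts = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true[confident, c], y_pred[confident]):
        counts[t, p] += 1

    # class-wise "Not classified": non-confident samples, split by true value
    not_classified = np.bincount(y_true[~confident, c], minlength=2)

    # every one of the 200 test samples is accounted for, per class
    assert counts.sum() + not_classified.sum() == n_test
```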
Should I remove these comments, or how would you prefer to handle them?
Fix #3964 - Show the number of not classified instances for multi-label models @suhaibmujahid
Done by
Run log attached.
issue-3964-run-log.txt