
I receive this error while trying to obtain the recall score:

X_test = test_pos_vec + test_neg_vec
Y_test = ["pos"] * len(test_pos_vec) + ["neg"] * len(test_neg_vec)
recall_average = recall_score(Y_test, y_predict, average="binary")
print(recall_average)

This will give me:

    C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py:1030: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if pos_label not in present_labels:
Traceback (most recent call last):
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 812, in <module>
    main()
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 91, in main
    evaluate_model(model, train_pos_vec, train_neg_vec, test_pos_vec, test_neg_vec, False)
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 648, in evaluate_model
    recall_average = recall_score(Y_test, y_predict, average="binary")
  File "C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py", line 1359, in recall_score
    sample_weight=sample_weight)
  File "C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py", line 1036, in precision_recall_fscore_support
    (pos_label, present_labels))
ValueError: pos_label=1 is not a valid label: array(['neg', 'pos'],
      dtype='<U3')

I tried to transform 'pos' in 1 and 'neg' in 0 this way:

for i in range(len(Y_test)):
    if 'neg' in Y_test[i]:
        Y_test[i] = 0
    else:
        Y_test[i] = 1

But this is giving me another error:

    C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py:181: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  score = y_true == y_pred
Traceback (most recent call last):
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 812, in <module>
    main()
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 91, in main
    evaluate_model(model, train_pos_vec, train_neg_vec, test_pos_vec, test_neg_vec, False)
  File "G:/PyCharmProjects/NB/accuracy/script.py", line 648, in evaluate_model
    recall_average = recall_score(Y_test, y_predict, average="binary")
  File "C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py", line 1359, in recall_score
    sample_weight=sample_weight)
  File "C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\classification.py", line 1026, in precision_recall_fscore_support
    present_labels = unique_labels(y_true, y_pred)
  File "C:\Users\anca_elena.moisa\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\multiclass.py", line 103, in unique_labels
    raise ValueError("Mix of label input types (string and number)")
ValueError: Mix of label input types (string and number)

What I am trying to do is obtain the metrics accuracy, precision, recall, and F-measure. With average='weighted', I get the same result for accuracy and recall. I suspect that is not correct, so I changed to average='binary', but then I get the errors above. Any ideas?

Sorry, I renamed the variable recall_average to recall when I posted this. I have now edited the post. – Mr. Wizard May 6, 2018 at 18:28

They come from `from sklearn.metrics import roc_curve, auc, f1_score, recall_score, precision_score`. – Mr. Wizard May 6, 2018 at 18:29

@MihaiAlexandru-Ionut recall_score() is a function from sklearn. As the OP is doing a binary classification, just set pos_label='neg' in the call to recall_score(). – AChampion May 6, 2018 at 18:30

Recall and precision each have two scores: one in terms of the negative class and one in terms of the positive class. You need to pass one of these values as pos_label so that the function can return the score for that label. – Prince Kumar Sharma Apr 30, 2019 at 6:37

I am getting this error now: ValueError: pos_label='neg' is not a valid label: array(['0', '1'], dtype='<U1'). I solved it by using '0' instead of 'neg'; that was what appeared in my array. – 1UC1F3R616 May 3, 2020 at 19:23

When you face this error, it means the values of your target variable are not the ones recall_score() expects, which by default are 1 for the positive case and 0 for the negative case. (This also applies to precision_score().)

From the error you mentioned:

pos_label=1 is not a valid label: array(['neg', 'pos']

it is clear that the value for your positive cases is 'pos' instead of 1, and for the negative cases 'neg' instead of 0.

You then have two options to fix this mismatch:

  • Change the default in recall_score() so that 'pos' is treated as the positive label:

    recall_average = recall_score(Y_test, y_predict, average="binary", pos_label='pos')

  • Change the values of the target variable in your dataset to 1 and 0. If Y_test is a pandas Series, for example:

    Y_test = Y_test.map({'pos': 1, 'neg': 0}).astype(int)
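Both options can be sketched end to end with small made-up arrays standing in for Y_test and y_predict (the values here are illustrative, not from the question):

```python
from sklearn.metrics import recall_score

# Hypothetical string-labelled data, shaped like the OP's.
Y_test = ["pos", "pos", "neg", "neg"]
y_predict = ["pos", "neg", "neg", "neg"]

# Option 1: keep the string labels and tell recall_score which
# one counts as the positive class.
recall_strings = recall_score(Y_test, y_predict,
                              average="binary", pos_label="pos")

# Option 2: map BOTH arrays to 0/1. Converting only one of them is
# what triggers the "Mix of label input types (string and number)"
# error from the question.
mapping = {"pos": 1, "neg": 0}
Y_test_num = [mapping[y] for y in Y_test]
y_predict_num = [mapping[y] for y in y_predict]
recall_numeric = recall_score(Y_test_num, y_predict_num, average="binary")

# Both routes give the same score.
assert recall_strings == recall_numeric
```

Either way, the key point is that the labels in y_true, the labels in y_pred, and the pos_label argument must all agree on type and value.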
