相关文章推荐
严肃的皮蛋  ·  sql like 多个值 - CSDN文库·  2 月前    · 
会开车的烈马  ·  使用 Apache ...·  3 月前    · 
焦虑的柑橘  ·  Python ...·  8 月前    · 
Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

Ask Question

Strange error from numpy via matplotlib when trying to get a histogram of a tiny toy dataset. I'm just not sure how to interpret the error, which makes it hard to see what to do next.

Didn't find much related, though this nltk question and this gdsCAD question are superficially similar.

I intend the debugging info at bottom to be more helpful than the driver code, but if I've missed something, please ask. This is reproducible as part of an existing test suite.

        if n > 1:
            return diff(a[slice1]-a[slice2], n-1, axis=axis)
        else:
>           return a[slice1]-a[slice2]
E           TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')
../py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py:1567: TypeError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) bt
[...]
py2.7.11-venv/lib/python2.7/site-packages/matplotlib/axes/_axes.py(5678)hist()
-> m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
  py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(606)histogram()
-> if (np.diff(bins) < 0).any():
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) p numpy.__version__
'1.11.0'
(Pdb) p matplotlib.__version__
'1.4.3'
(Pdb) a
a = [u'A' u'B' u'C' u'D' u'E']
n = 1
axis = -1
(Pdb) p slice1
(slice(1, None, None),)
(Pdb) p slice2
(slice(None, -1, None),)
(Pdb)
                Hi, I ran into the same problem in my script, because numpy module was not imported. Could this be the case?
– Mike
                Oct 15, 2020 at 7:11

I got the same error, but in my case I am subtracting dict.key from dict.value. I have fixed this by subtracting dict.value for corresponding key from other dict.value.

cosine_sim = cosine_similarity(e_b-e_a, w-e_c)

here I got error because e_b, e_a and e_c are embedding vector for word a,b,c respectively. I didn't know that 'w' is string, when I sought out w is string then I fix this by following line:

cosine_sim = cosine_similarity(e_b-e_a, word_to_vec_map[w]-e_c)

Instead of subtracting dict.key, now I have subtracted corresponding value for key

Below your answer is an edit button. This information belongs in the answers, not in a comment. – Stephen Rauch Apr 13, 2018 at 14:43

Why is it applying diff to an array of strings.

I get an error at the same point, though with a different message

In [23]: a=np.array([u'A' u'B' u'C' u'D' u'E'])
In [24]: np.diff(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-9d5a62fc3ff0> in <module>()
----> 1 np.diff(a)
C:\Users\paul\AppData\Local\Enthought\Canopy\User\lib\site-packages\numpy\lib\function_base.pyc in diff(a, n, axis)
   1112         return diff(a[slice1]-a[slice2], n-1, axis=axis)
   1113     else:
-> 1114         return a[slice1]-a[slice2]
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray' 

Is this a array the bins parameter? What does the docs say bins should be?

Ah, interesting! I was trying to create a histogram with categorical bins, which means that what I want is not a histogram here at all, but a barplot. Thanks! – Gregory Marton Apr 15, 2016 at 3:49

I had a similar issue where an integer in a row of a DataFrame I was iterating over was of type numpy.int64. I got the

TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

error when trying to subtract a float from it.

The easiest fix for me was to convert the row using pd.to_numeric(row).

I was getting a similar error because I also subtracting a 0.0 from 0.00 . pd.to_numeric(DataFrame) It solved my issue – John Sick Jan 9, 2020 at 11:33

I am fairly new to this myself, but I had a similar error and found that it is due to a type casting issue. I was trying to concatenate rather than take the difference but I think the principle is the same here. I provided a similar answer on another question so I hope that is OK.

In essence you need to use a different data type cast, in my case I needed str not float, I suspect yours is the same so my suggested solution is. I am sorry I cannot test it before suggesting but I am unclear from your example what you were doing.

return diff(str(a[slice1])-str(a[slice2]), n-1, axis=axis)

Please see my example code below for the fix to my code, the change occurs on the third to last line. The code is to produce a basic random forest model.

import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Ues the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay =  npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised 
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain)       # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name,'a') as fpred :
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue :
            fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
            for i in range(0,lenpredictions) :
                    fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
    else :
            print "ERROR - names, prediction and true value array size mismatch."

This leads to an error of;

Traceback (most recent call last):
  File "min_example.py", line 40, in <module>
    fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')

The solution is to make each variable a str() type on the third to last line then write to file. No other changes to then code have been made from the above.

import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Ues the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay =  npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised 
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain)       # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name,'a') as fpred :
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue :
            fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
            for i in range(0,lenpredictions) :
                    fpred.write(str(RFpreds[i])+",,"+str(yTest[i])+",\n")
    else :
            print "ERROR - names, prediction and true value array size mismatch."

These examples are from a larger code so I hope the examples are clear enough.

I think @James is right. I got stuck by same error while working on Polyval(). And yeah solution is to use the same type of variabes. You can use typecast to cast all variables in the same type.

BELOW IS A EXAMPLE CODE

import numpy
P = numpy.array(input().split(), float)
x = float(input())
print(numpy.polyval(P,x))

here I used float as an output type. so even the user inputs the INT value (whole number). the final answer will be typecasted to float.

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.