Getting error: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')

I'm getting the error in the title in my machine learning project, where I'm following a guide from the internet. Here are the parts that raise the error:

import math
import operator


def euclideanDistance(instance1, instance2, length):
    # Sum the squared differences over the first `length` features
    distance = 0
    for x in range(length):
        distance += pow((instance1[x] - instance2[x]), 2)
    return math.sqrt(distance)


def getNeighbors(trainingSet, testInstance, k):
    distances = []
    length = len(testInstance)-1
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append((trainingSet[x], dist))
    # Sort by distance (second tuple element) and keep the k closest rows
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors


neighbors = getNeighbors(training_feature_list, test_feature_list, 3)
print(neighbors)

I've looked around and noticed that many people have asked about this before. As I understand it, the problem arises from applying a ufunc to variables of different types. But my training_feature_list and test_feature_list are of the same kind.

The training set looks like [['5.1' '0.2']['4.9' '0.2']...(30 rows)

The test set looks like [['4.8' '0.2']['5.4' '0.4']...(20 rows).

I'd be glad if anyone could briefly explain why this problem occurs (I probably haven't understood it well) and how to fix it.

Thanks in advance.

If your lists really look like [['5.1' '0.2']['4.9' '0.2']... , then the error is probably caused by trying to subtract one string from another: '5.1' is a string, while 5.1 (which you probably want) is a floating point number. The dtype('<U32') in the error message points the same way, since it is numpy's name for a Unicode string type.
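A minimal sketch of this first case, assuming numpy arrays of strings. The two sample rows and the names `train`, `train_f`, and `failed` are made up for illustration; only the values mirror the question's data.

```python
import numpy as np

# Hypothetical rows modeled on the question: the values are strings, not numbers
train = np.array([['5.1', '0.2'], ['4.9', '0.2']])
print(train.dtype)  # a Unicode string dtype (kind 'U')

failed = False
try:
    train[0] - train[1]  # no 'subtract' loop exists for string dtypes
except TypeError as exc:  # numpy's ufunc type errors derive from TypeError
    failed = True
    print("subtraction failed:", exc)

# Converting to float makes the arithmetic well defined
train_f = train.astype(float)
diff = train_f[0] - train_f[1]
print(diff)
```

Note that `astype` returns a new array, so the result has to be assigned back (or to a new name) before it is passed to the distance function.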

If that is not the case, then another possible cause (although I would expect a different error message) is that you are passing lists instead of numpy arrays. Arrays are preferable for calculations, since you cannot subtract one whole list from another.
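For comparison, here is a small sketch of the list case, with made-up values and hypothetical names (`a`, `b`, `list_failed`). It shows that plain lists fail with a different message than the one in the title, which is why this cause is less likely here.

```python
import numpy as np

# Hypothetical feature rows stored as plain Python lists of floats
a = [5.1, 0.2]
b = [4.9, 0.2]

list_failed = False
try:
    a - b  # lists do not support arithmetic subtraction
except TypeError as exc:
    list_failed = True
    print("list subtraction failed:", exc)

# Converting to arrays allows whole-row subtraction
diff = np.asarray(a) - np.asarray(b)

# Subtracting element by element, as the question's loop does, works on floats
elementwise = [x - y for x, y in zip(a, b)]
print(diff, elementwise)
```

Because the question's `euclideanDistance` indexes single elements, lists of floats would work there; lists of strings would still fail on the subtraction.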

I'm pretty sure that they're both numpy arrays so it shouldn't be second case. how do I make sure that I handle my numpy arrays as floats instead of strings if it's the case 1? (because as I know default data type is float on numpy) – Emre Unsal Feb 20, 2019 at 15:10

I found it, yes you were right from the beginning. I was handling my arrays as strings because when I first created them, they had their labels in each row so it probably got created as string array. using x.astype saved me. – Emre Unsal Feb 20, 2019 at 15:21

Case one is just because of how you wrote the example of your lists, if they come out of some numpy calculation, then they should already be floats. – trikPu Feb 20, 2019 at 15:22
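The situation described in the comments can be sketched as follows, assuming each row originally held numeric features followed by a text label. The rows, the label value, and the names `rows`, `data`, `features`, and `labels` are hypothetical.

```python
import numpy as np

# Hypothetical rows: two numeric features plus a class label in the last column
rows = [[5.1, 0.2, 'setosa'], [4.9, 0.2, 'setosa']]

# Mixing floats and strings makes numpy store every element as a string
data = np.array(rows)
print(data.dtype)  # a Unicode string dtype (kind 'U')

# Split off the label column, then convert the features back to floats
features = data[:, :-1].astype(float)
labels = data[:, -1]

# With float features, the distance calculation works
dist = float(np.linalg.norm(features[0] - features[1]))
print(features.dtype, dist)
```

Slicing the array after the labels have been mixed in keeps the string dtype, so the explicit `astype(float)` is still needed on the feature columns.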
