I have a list that contains values, and one of the values is 'nan':
countries= [nan, 'USA', 'UK', 'France']
I tried to remove it, but I get an error every time:
cleanedList = [x for x in countries if (math.isnan(x) == True)]
TypeError: a float is required
When I tried this instead:
cleanedList = cities[np.logical_not(np.isnan(countries))]
cleanedList = cities[~np.isnan(countries)]
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
The question has changed, so too has the answer:
Strings can't be tested using math.isnan, as it expects a float argument. In your countries list, you have floats and strings.
In your case the following should suffice:
cleanedList = [x for x in countries if str(x) != 'nan']
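One caveat worth noting about the string comparison: it also removes a genuine string 'nan', not only the float NaN. A minimal sketch, assuming the list mixes a float NaN with strings (the sample data and the name with_text_nan are illustrative only):

```python
countries = [float('nan'), 'USA', 'UK', 'France']

# str() turns the float NaN into the text 'nan', so the comparison works
# for floats and strings alike without raising TypeError.
cleaned_list = [x for x in countries if str(x) != 'nan']
print(cleaned_list)  # ['USA', 'UK', 'France']

# Caveat: an actual string 'nan' is dropped as well.
with_text_nan = [float('nan'), 'nan', 'USA']
cleaned_text = [x for x in with_text_nan if str(x) != 'nan']
print(cleaned_text)  # ['USA']
```

If the string 'nan' can be a legitimate value in your data, a type-aware check may be the safer choice.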
Old answer
In your countries list, the literal 'nan' is a string, not the Python float nan, which is equivalent to:
float('NaN')
In your case the following should suffice:
cleanedList = [x for x in countries if x != 'nan']
countries= [nan, 'USA', 'UK', 'France']
Since nan is not equal to nan (nan != nan) and countries[0] = nan, you should observe the following:
countries[0] == countries[0]
False
However,
countries[1] == countries[1]
True
countries[2] == countries[2]
True
countries[3] == countries[3]
True
Therefore, the following should work:
cleanedList = [x for x in countries if x == x]
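The self-comparison trick can be checked end to end with a short sketch. It assumes the nan in the list is a float NaN (here created with float('nan')); the variable names are illustrative:

```python
countries = [float('nan'), 'USA', 'UK', 'France']

# A float NaN compares unequal to itself; ordinary strings compare equal.
nan_equals_itself = countries[0] == countries[0]
string_equals_itself = countries[1] == countries[1]
print(nan_equals_itself)     # False
print(string_equals_itself)  # True

# Keeping only the values that equal themselves therefore drops the NaN.
cleaned_list = [x for x in countries if x == x]
print(cleaned_list)  # ['USA', 'UK', 'France']
```

This relies on every non-NaN element having a conventional == that returns True when compared with itself, which holds for strings and numbers.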
The problem comes from the fact that np.isnan() does not handle string values. For example, if you do:
np.isnan("A")
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
However, the pandas version pd.isnull() works for numeric and string values:
import pandas as pd
pd.isnull("A")
> False
pd.isnull(3)
> False
pd.isnull(np.nan)
> True
pd.isnull(None)
> True
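Applied to the list from the question, pd.isnull() can be used directly as the filter. This is a sketch that assumes every element is a scalar (for array-like elements pd.isnull() returns an array rather than a single bool):

```python
import numpy as np
import pandas as pd

countries = [np.nan, 'USA', 'UK', 'France']

# pd.isnull() accepts both strings and floats, so no TypeError is raised.
cleaned_list = [x for x in countries if not pd.isnull(x)]
print(cleaned_list)  # ['USA', 'UK', 'France']
```

Note that this also removes None, since pd.isnull(None) is True.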
In your example, 'nan' is a string, so instead of using isnan(), just check for the string like this:
cleanedList = [x for x in countries if x != 'nan']
In my opinion, most of the suggested solutions do not take performance into account. For loops and list comprehensions may become slow if your list has many values.
The solution below aims to be more efficient in terms of computation time, and it doesn't assume your list contains only numbers or only strings.
import numpy as np
import pandas as pd
list_var = [np.nan, 4, np.nan, 20,3, 'test']
df = pd.DataFrame({'list_values':list_var})
list_var2 = list(df['list_values'].dropna())
print("\n* list_var2 = {}".format(list_var2))
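The same idea can be sketched with a pandas Series instead of a one-column DataFrame, which is an alternative spelling rather than the approach shown above. Whether either version actually beats a list comprehension depends on the list size and element types, so it is worth timing on your own data:

```python
import numpy as np
import pandas as pd

list_var = [np.nan, 4, np.nan, 20, 3, 'test']

# A mixed list becomes an object-dtype Series; dropna() removes the NaN
# entries and tolist() converts the result back to a plain Python list.
list_var2 = pd.Series(list_var).dropna().tolist()
print(list_var2)  # [4, 20, 3, 'test']
```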
If you have a list of items of different types and you want to filter out NaN, you can do the following:
import math
lst = [1.1, 2, 'string', float('nan'), {'di':'ct'}, {'set'}, (3, 4), ['li', 5]]
filtered_lst = [x for x in lst if not (isinstance(x, float) and math.isnan(x))]
Output:
[1.1, 2, 'string', {'di': 'ct'}, {'set'}, (3, 4), ['li', 5]]
I noticed that pandas, for example, returns nan for blank values. Since it's not a string, you need to convert it to one in order to match it. For example:
ulist = df.column1.unique() #create a list from a column with Pandas which
for loc in ulist:
    loc = str(loc) #here 'nan' is converted to a string to compare with if
    if loc != 'nan':
        print(loc)