python - TypeError: Mismatch between array dtype ('<U32') and format specifier ('%.18e')

I am using np.savetxt for the first time, and I am trying to save two variables (a string and a float) in a file named "trial.csv" as follows:
import numpy as np
RT = 2.76197329736740
key_name = 'space'
print(RT,key_name)
# Save data in a CSV file named trial.csv
np.savetxt("trial.csv", (RT,key_name), delimiter=',', header="RTs,Key_Name")
I got the following error:
TypeError: Mismatch between array dtype ('<U32') and format specifier ('%.18e')
I do not understand the meaning of both ('<U32') and ('%.18e'). As a matter of fact, I do not understand how to use fmt when I have floats, integers and strings ...
This is a simplified example. Concretely, I want the RT values (floats) in one column, "RTs", and the key_name values (strings) in another column, "Key_Name". I will create more columns later on, and although I provided only one value for RT and one for key_name in this example, there will be more RT values in the column "RTs" as well as more key names in the column "Key_Name".
                savetxt writes a numpy array to the file, having first converted it to strings via the fmt.  The default fmt is '%.18e', which converts a number to scientific notation, something like '1.234e+10' but with 18 decimal places.  To see what it's trying to save, print(np.array((RT, key_name))).  Saving a mix of numbers and strings with savetxt isn't a trivial task.
– hpaulj
                May 26, 2020 at 16:23
                @hpaulj it prints {2.713, 'space'}. Shall I then choose a different method to save a mix of data types in a CSV file?
– Kathia
                May 26, 2020 at 16:28
                I get array(['2.7619732973674', 'space'], dtype='<U32').  It makes an array with string values, not a mix of number and string.  Do you really have to save the label with the number?  For a beginner, saving and loading just numbers is a lot easier than a mix of numbers and strings.
– hpaulj
                May 26, 2020 at 17:01
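To make the two pieces of the error message concrete, here is a minimal sketch using the values from the question. The names arr, formatted and error_message are only illustrative. It shows that NumPy turns the (float, string) pair into a string array (the 'U' in '<U32' stands for Unicode string), and that '%.18e' is an ordinary printf-style numeric format, which cannot be applied to a string.

```python
import numpy as np

RT = 2.76197329736740
key_name = 'space'

# Mixing a float and a str forces NumPy to choose one common dtype,
# and the only one that can hold both values is a Unicode string.
arr = np.array((RT, key_name))
print(arr.dtype)        # a Unicode string dtype such as '<U32'
print(arr.dtype.kind)   # 'U'

# '%.18e' means: scientific notation with 18 digits after the decimal point.
formatted = '%.18e' % RT
print(formatted)

# The same format applied to a string fails. savetxt catches this failure
# and reports it as the "Mismatch between array dtype ..." TypeError.
error_message = None
try:
    '%.18e' % key_name
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

So the error is not about the file at all: the array already contains only strings before anything is written, and the default numeric format cannot handle them.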
This happens because the default fmt argument in np.savetxt() is '%.18e' which is suitable for numbers (integers/floats). If you want to save strings as well, you need to change the fmt argument to be '%s'.
Also, you need to change the shape of X (the data passed to np.savetxt) to reflect the fact that it is one row with two columns. So, change the np.savetxt call to:
np.savetxt("trial.csv", [[RT, key_name]], fmt="%s", delimiter=',', header="RTs,Key_Name")
This means that everything will be saved as a string, so the value 2.761... won't be a float when you read it back. You can load this file like so:
np.loadtxt("trial.csv", delimiter=',', dtype=str)  # note: dtype is set to str
                This solution gets rid of the type error, thanks. However, when I tried to load the data with data = np.loadtxt("trial.csv",delimiter=',') then pprint.pprint(data.tolist()), I get the error: ValueError: could not convert string to float: 'space'. Moreover, when I open the file "trial.csv", I see that it created two columns "RTs" and "Key_Name" but both values (2.76197329736740 and 'space') are in the rows of the first column (RTs) instead of having 'space' in the column "Key_Name"
– Kathia
                May 26, 2020 at 16:37
                The default datatype in np.loadtxt is float, change it to str like so: np.loadtxt("trial.csv",delimiter=',', dtype=str). Anyway, I will edit my answer to include this part.
– Anwarvic
                May 26, 2020 at 16:42
                It definitely fixes the ValueError since it converts everything to string - which can be a solution for me since I can convert back the values to float when I will process the data. However, I still have both values (2.76197329736740 and 'space') stored in a single column (RTs). Is it because of the formatting fmt="%s"?
– Kathia
                May 26, 2020 at 16:48
                No, because of the way you represent your data. I've edited my answer to use [[RT, key_name]] instead of (RT, key_name)
– Anwarvic
                May 26, 2020 at 16:58
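The round trip discussed in these comments can be sketched as follows. This is only an illustration under a few assumptions: the second row (1.5, 'left') is made up to show more than one trial, and the file is written to a temporary directory rather than to "trial.csv" in the working directory. Everything is saved with fmt="%s", loaded back as strings, and the RT column is then converted to float.

```python
import os
import tempfile

import numpy as np

# One inner list per row: [RT, key_name]. The second row is hypothetical.
rows = [[2.76197329736740, 'space'],
        [1.5, 'left']]

# Write to a temporary location so the sketch does not overwrite real data.
path = os.path.join(tempfile.mkdtemp(), "trial.csv")
np.savetxt(path, rows, fmt="%s", delimiter=',', header="RTs,Key_Name")

# The header line starts with '# ', so loadtxt skips it as a comment.
data = np.loadtxt(path, delimiter=',', dtype=str)

# Everything comes back as strings; convert the RT column when processing.
rts = data[:, 0].astype(float)
keys = data[:, 1]
print(rts, keys)
```

Because each row is its own inner list, each value lands in its own column, which is the difference between [[RT, key_name]] and (RT, key_name).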
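import numpy as np

names  = np.array(['NAME_1', 'NAME_2', 'NAME_3'])
floats = np.array([ 0.1234 ,  0.5678 ,  0.9123 ])
# Structured array: one string field and one float field per row
ab = np.zeros(names.size, dtype=[('key_name', 'U6'), ('RT', float)])
ab['key_name'] = names
ab['RT'] = floats
# One format specifier per field; the comma in fmt separates the columns
np.savetxt('trial.csv', ab, fmt="%10s , %10.3f", header="Keys_Names,RTs")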
                Thanks for your solution! I tried to create headers as follows: np.savetxt('trial.csv', ab, fmt="%10s %10.3f", header="Keys_Names,RTs") but all the values (both the floats and the names) end up under the same column, "Keys_Names". How can I add a header and assign each header to its respective column?
– Kathia
                May 27, 2020 at 10:44
                @Kathia I edited my answer, since you wanted the headers aligned with the column values. If this solves your problem, please accept the answer.
– Mahsa Hassankashi
                May 27, 2020 at 10:51
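A possible variation on the structured-array approach, for the header question raised in the comments: the sketch below puts the comma directly into fmt with no padding, and passes comments='' so that the header line is written without the default '# ' prefix. The temporary path and the read-back with the standard csv module are only there to check the result and are not part of the original answer.

```python
import csv
import os
import tempfile

import numpy as np

names = np.array(['NAME_1', 'NAME_2', 'NAME_3'])
floats = np.array([0.1234, 0.5678, 0.9123])

# Structured array: one string field and one float field per row.
ab = np.zeros(names.size, dtype=[('key_name', 'U6'), ('RT', float)])
ab['key_name'] = names
ab['RT'] = floats

path = os.path.join(tempfile.mkdtemp(), "trial.csv")

# fmt holds one specifier per field, separated by the comma that a CSV
# reader uses to split columns. comments='' drops the '# ' header prefix.
np.savetxt(path, ab, fmt="%s,%.3f", header="Keys_Names,RTs", comments='')

# Read the file back with the csv module to see how the columns line up.
with open(path, newline='') as fh:
    table = list(csv.reader(fh))
print(table)
```

Without a comma in fmt, each line contains a single field as far as a CSV reader is concerned, which is why both values showed up under "Keys_Names".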