Writing to a table with an explicit schema fails on PySpark with Python 3.
The following steps work:
a = sc.textFile("ad_actions.csv")
b = a.map(lambda x: x.split('||')).toDF()
b.write.saveAsTable('AD_ACTIONS', mode='append')
But if I pass the table's schema to toDF, it fails:
a = sc.textFile("ad_actions.csv")
b = a.map(lambda x: x.split('||')).toDF(schema=sqlContext.table("AD_ACTIONS").schema)
b.write.saveAsTable('AD_ACTIONS', mode='append')
AttributeError: 'str' object has no attribute 'toordinal'
Does anyone know how I can fix this?
Let me know if you need to see anything else.
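The error means that Spark is trying to call toordinal on a string. toordinal is a method of Python date objects, so the schema expects a date where your data contains a plain string.
So the problem is that the strings produced by split are not converted to the types declared in the schema you are passing.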
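The error can be reproduced without Spark. A minimal sketch, assuming that the AD_ACTIONS schema contains a date column (the question does not show the schema) and using a made-up sample value:

```python
import datetime

# A real date object has toordinal(); this is what a date column's
# values are expected to provide.
d = datetime.date(2016, 1, 31)
ordinal = d.toordinal()

# A string produced by split('||') has no such method,
# hence the AttributeError from the question.
raw = "2016-01-31"  # hypothetical sample value
try:
    raw.toordinal()
    message = None
except AttributeError as exc:
    message = str(exc)

print(ordinal)
print(message)
```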
I suggest passing only the column names, like this:
a = sc.textFile("ad_actions.csv")
b = a.map(lambda x: x.split('||')).toDF(sqlContext.table("AD_ACTIONS").schema.names)
b.write.saveAsTable('AD_ACTIONS', mode='append')
This will work fine, because the schema conversion will be handled by your metastore.
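If you would rather keep the full schema, another option is to convert each field to the Python type the schema expects before calling toDF. A rough sketch, assuming a hypothetical three-column layout (id, action, date) and an ISO date format; neither is shown in the question, and parse_line is a made-up helper name:

```python
import datetime

def parse_line(line):
    """Split one '||'-delimited line and convert the fields to typed values.

    The column layout (int, str, date) is hypothetical; adjust it to
    match the real AD_ACTIONS schema.
    """
    ad_id, action, day = line.split('||')
    return (
        int(ad_id),
        action,
        # Turn the date string into a datetime.date, which has toordinal()
        datetime.datetime.strptime(day, '%Y-%m-%d').date(),
    )

row = parse_line('42||click||2016-01-31')
print(row)
```

With a function like this, the map step would become a.map(parse_line), and the toDF(schema=...) call from the question could stay as it was, provided the converted types match the table's schema.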