I am trying to copy data of a partitioned Hive table from one cluster to another.
I am using distcp to copy the data, but the underlying data belongs to a partitioned Hive table.
I used the following command.
hadoop distcp -i {src} {tgt}
Because the table is partitioned, the source directory structure follows the partitions. The job reports that the copy would create duplicate files and aborts:
org.apache.hadoop.tools.CopyListing$DuplicateFileException: File would cause duplicates. Aborting
I also tried -skipcrccheck, -update and -overwrite, but none of them worked.
How can I copy the table's data from the partitioned source path to the destination?
Check whether the settings below are false. If they are, set them to true.
hive> set hive.mapred.supports.subdirectories;
hive.mapred.supports.subdirectories=false
hive> set mapreduce.input.fileinputformat.input.dir.recursive;
mapreduce.input.fileinputformat.input.dir.recursive=false
hadoop distcp -Dmapreduce.map.memory.mb=20480 -Dmapreduce.map.java.opts=-Xmx15360m -Dipc.client.fallback-to-simple-auth-allowed=true -Ddfs.checksum.type=CRC32C -m 500 \
-pb -update -delete {src} {target}
Ideally, no two files in the copy would share a name. In your case, you are copying a partitioned table from one cluster to another, and two differently named partitions contain files with the same name.
The solution is to correct the source path {src} in your command so that it points to a directory, at most down to the partition subdirectory level, rather than to the individual files.
For example, given these files:
/a/partcol=1/file1.txt
/a/partcol=2/file1.txt
If you use "/a/*/*" as {src}, you will get the error "File would cause duplicates.", because the glob matches the files themselves and both are named file1.txt.
If you use "/a" as {src}, the copy runs without that error.
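The difference between the two source paths can be reproduced on a local filesystem without Hadoop. The sketch below is only an illustration under stated assumptions: the scratch directory and the variable names are hypothetical, and taking the basename of each glob match is a stand-in for how a file-level source loses its partition directory in the target path, not distcp's actual code.

```shell
#!/usr/bin/env bash
# Local-filesystem sketch (no Hadoop involved) of why a file-level glob collides.
set -euo pipefail

work=$(mktemp -d)   # hypothetical scratch directory standing in for the cluster
mkdir -p "$work/a/partcol=1" "$work/a/partcol=2"
touch "$work/a/partcol=1/file1.txt" "$work/a/partcol=2/file1.txt"

# Case 1: {src} = /a/*/*
# Every match is a file, so only its last path component is left to name it
# under the target (assumption: modelled here with basename).
glob_names=$(cd "$work" && for f in a/*/*; do basename "$f"; done | sort)
glob_dupes=$(printf '%s\n' "$glob_names" | uniq -d)

# Case 2: {src} = /a
# Files are listed relative to the single source root, so the partition
# directory remains part of every target path.
root_names=$(cd "$work/a" && find . -type f | sed 's|^\./||' | sort)
root_dupes=$(printf '%s\n' "$root_names" | uniq -d)

echo "glob duplicates: ${glob_dupes:-none}"
echo "root duplicates: ${root_dupes:-none}"

rm -rf "$work"
```

With the glob, both files collapse to the name file1.txt; with the directory as the source, partcol=1/file1.txt and partcol=2/file1.txt stay distinct.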