I have created an ADF pipeline which copies data from a CSV stored on a local server to an Azure SQL Database (staging table), and then uses a dataflow to perform some transformations.
When I inspect the dataflow on the monitoring screen, it shows as completed successfully, and all the data arrives in the database correctly. However, the pipeline run is reported as failed, apparently because the dataflow activity ends with the following error:
The pipeline run ID of the most recent failure is: 09753ddd-94cb-4ff0-bda6-fbca51c9e966
Operation on target Transform_Daily_SOH_CSV failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: None.get","Details":"java.util.NoSuchElementException: None.get\n\tat scala.None$.get(Option.scala:347)\n\tat scala.None$.get(Option.scala:345)\n\tat com.microsoft.dataflow.FlowCode$$anonfun$com$microsoft$dataflow$FlowCode$$recurseLineageNode$1.apply(FlowRunner.scala:517)\n\tat com.microsoft.dataflow.FlowCode$$anonfun$com$microsoft$dataflow$FlowCode$$recurseLineageNode$1.apply(FlowRunner.scala:515)\n\tat scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)\n\tat scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)\n\tat com.microsoft.dataflow.FlowCode.com$microsoft$dataflow$FlowCode$$recurseLineageNode(FlowRunner.scala:515)\n\tat com.microsoft.dataflow.FlowCode$$anonfun$com$microsoft$dataflow$FlowCode$$recurseLineageNode$1.apply(FlowRunner.scala:518)\n\tat com.microsoft.dataflow.FlowCode$$anonfun$com$microsoft$dataflow$FlowCode$$recurseLineageNode$1.apply(FlowRunner.scala:515)\n\tat scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)\n\tat scala.collection.mutable.ArrayBuffer.foreach(A"}
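For anyone triaging a similar message, here is a minimal Python sketch of how the error above can be split into its parts: the activity name, the status code, the exception, and the first stack frame that belongs to the data flow engine. The function name `summarize_adf_error` and the `app_prefix` parameter are hypothetical, and the sample string is shortened to the first few frames of the trace above. It only reads the message; it does not explain why the engine raised `None.get`.

```python
import json


def summarize_adf_error(raw, app_prefix="com.microsoft.dataflow"):
    """Split an 'Operation on target X failed: {json}' message into its parts.

    `app_prefix` is an assumption: the package whose frames we treat as
    the data flow engine's own code.
    """
    head, _, payload = raw.partition(" failed: ")
    prefix = "Operation on target "
    target = head[len(prefix):] if head.startswith(prefix) else head

    body = json.loads(payload)  # the part after "failed: " is plain JSON
    lines = body.get("Details", "").split("\n")

    # Stack frames look like "\tat package.Class.method(File.scala:123)".
    frames = [ln.strip()[3:] for ln in lines[1:] if ln.strip().startswith("at ")]
    first_app_frame = next((f for f in frames if f.startswith(app_prefix)), None)

    return {
        "target": target,
        "status_code": body.get("StatusCode"),
        "message": body.get("Message"),
        "exception": lines[0],
        "first_app_frame": first_app_frame,
    }


# Shortened copy of the error above (first three frames only).
# Raw strings keep the \n and \t escapes intact for the JSON parser.
sample = (
    "Operation on target Transform_Daily_SOH_CSV failed: "
    r'{"StatusCode":"DFExecutorUserError",'
    r'"Message":"Job failed due to reason: None.get",'
    r'"Details":"java.util.NoSuchElementException: None.get\n'
    r"\tat scala.None$.get(Option.scala:347)\n"
    r"\tat scala.None$.get(Option.scala:345)\n"
    r"\tat com.microsoft.dataflow.FlowCode$$anonfun$com$microsoft$dataflow"
    r'$FlowCode$$recurseLineageNode$1.apply(FlowRunner.scala:517)"}'
)

summary = summarize_adf_error(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In this trace the first frame under `com.microsoft.dataflow` is in `recurseLineageNode` (`FlowRunner.scala:517`), which may be a useful detail to include in a support ticket.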
Any help would be appreciated!
Hello @William Wright,
Welcome to the MS Q&A platform.
Could you please answer the questions below?
Do you have any retry logic in your data flow? If yes, did the pipeline run start again after the pipeline failure?
Is this the first time you are seeing this, or is it recurring behavior?
There was a hotfix deployment yesterday for the error:
'StatusCode':'DFExecutorUserError','Message':'Job failed due to reason:Not started.
I am guessing this issue could be related to yesterday's hotfix deployment.
If you are still seeing the same behavior, please let me know.
Hey @BhargavaGunnam-MSFT,
Thanks for the reply. My answers are below:
There is no retry logic in the data flow. It's a straight pick from source, derive, cast, and rename columns, then sink.
This is not the first time. It happened all week last week, so I don't think it is related to that hotfix. Every time the error was triggered, I checked the dataflow and it had run perfectly; only the pipeline returned the error. I also checked the destination table every time, and the data was always complete and accurate.
I am still seeing this error as of 1 hour ago.
Thanks, Will
Hello @William Wright,
Thank you for the reply, and sorry for the inconvenience. I suspect the issue could also be related to the underlying clusters.
The best approach is to file a support ticket, and a support engineer can take a deeper look into your data flow and cluster logs to troubleshoot the issue further. If you don't have a support plan, I can enable one-time free support for you to work closely on this matter.
I am looking forward to hearing from you.