I have deployed a Synapse workspace through Azure DevOps and am running some SQL queries inside it. I can run simple SQL queries, but I cannot run anything related to Delta tables. Commands like:

  • SHOW TABLES
  • %%sql
    CREATE DATABASE AdventureWorksLT2019
  • DROP TABLE IF EXISTS table_name

all fail with 'Error: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException'. I originally wanted to save a Delta table to my ADLS, but my saveAsTable command fails. Running:

    new_target_df.write.format("delta") \
        .mode('append').option("overwriteSchema", "true") \
        .option("path", delta_table_path) \
        .partitionBy('subscriptionId', 'year', 'month', 'day') \
        .saveAsTable(delta_table_name)  # External table

gives:

    AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    Traceback (most recent call last):
      File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1158, in saveAsTable
        self._jwrite.saveAsTable(name)
      File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
        return_value = get_return_value(
      File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
        raise converted from None
    pyspark.sql.utils.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    

I am, however, able to write the dataframe as a Delta lake at the destination using:

    new_target_df.write.format('delta').mode('append').option("overwriteSchema", "true").save(delta_table_path)

so I don't think it is a permission issue. I am a Synapse Administrator, and the Synapse workspace has Storage Blob Data Contributor on the ADLS account.
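The difference, as far as I can tell, is that .save() only writes the Delta files, while .saveAsTable() also registers the table in the Hive metastore. A minimal sketch of just that registration step, reusing the delta_table_name and delta_table_path variables from above, which would exercise the same metastore code path:

    # Sketch: register the already-written Delta files as an external table.
    # Unlike .save(), this goes through the Hive metastore, which appears to
    # be the failing component here.
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {delta_table_name} "
        f"USING DELTA LOCATION '{delta_table_path}'"
    )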

It seems to be an error with the Hive metastore, maybe, but I don't understand it clearly. I recreated the workspace to no avail. Please help.

Hi @Yash Tamakuwala,
Thank you for using the Microsoft Q&A platform and posting your queries.
In the Azure Synapse workspace, you need to go to the Develop tab and create a new notebook in order to run these queries. The notebook should be attached to a Spark pool. You can create an Apache Spark pool in the Manage tab of the Synapse workspace and attach your notebook to it.

Note: As Spark pools are a provisioned service, you pay for the resources provisioned. You can go with the Small node size and keep the maximum number of nodes at 3 to keep the charge as low as possible.

    1. SHOW TABLES

    2. CREATE DATABASE AdventureWorksLT2019

    3. DROP TABLE IF EXISTS table_name
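For example, a minimal sketch of running these from a notebook cell (the spark session object is provided automatically in Synapse notebooks; the database and table names are just the ones from your post):

    # Run in a Synapse notebook cell attached to a running Spark pool.
    # `spark` is the SparkSession that the notebook provides automatically.
    spark.sql("CREATE DATABASE IF NOT EXISTS AdventureWorksLT2019")
    spark.sql("SHOW TABLES IN AdventureWorksLT2019").show()
    spark.sql("DROP TABLE IF EXISTS AdventureWorksLT2019.table_name")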

Hope this helps. Please let us know if you have any further queries.

Hi, thanks for replying. I tried that, but it doesn't work; I even tried with a new Spark pool (screenshot of the error attached). I am able to read and write to the ADLS, so I do not think this is a permission issue.

I found the problem, @AnnuKumari-MSFT. During resource creation, we have to give the name of the default storage account and container. I wasn't deploying the container in my ARM template; when I made that change, it all worked fine. The error message could have been more user-friendly: nothing in the stack trace says anything about the container not being present.
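For anyone hitting the same thing, a sketch of the missing piece, assuming the template already declares storageAccountName and containerName parameters (those names are illustrative, not from the original template). The Synapse workspace resource references this container through its defaultDataLakeStorage property, so the container itself has to be deployed as well:

    {
      // Child resource that creates the workspace's default container.
      "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
      "apiVersion": "2021-08-01",
      // Containers are named "<account>/default/<container>" as children
      // of the storage account's default blob service.
      "name": "[format('{0}/default/{1}', parameters('storageAccountName'), parameters('containerName'))]",
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName'))]"
      ]
    }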

Hi @Yash Tamakuwala,
Thanks for providing the update. Glad that you found the way out. I tried to repro your scenario and did not get the NullPointerException. In case you run into any similar issues in future, you can directly create a support ticket with Microsoft. For more details on how to raise a support ticket, kindly refer to this article: Create an Azure support request.

Please consider hitting the Accept Answer button and upvoting. Accepted answers help the community as well.

Hi @Yash Tamakuwala,
What do you mean by: "During resource creation, we have to give the name of the default storage account and container"?
Could you provide the code, please? Are you talking about the creation of the Spark pool or the creation of the Synapse environment?
Honestly, it's not really clear. Let me know; I'm facing the same issue, and I don't see any default storage account property when initializing a Spark pool in the Manage > Apache Spark pools tab of Synapse.

Thanks in advance.