SparkSession Fails to Access Hive Table Data: org.apache.spark.sql.AnalysisException: Table or view not found


When accessing a Hive table through SparkSession, the following error was thrown:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: emp; line 1 pos 47
	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:460)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:479)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:464)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:60)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
	at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
	at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:305)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
	at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
	at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:305)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:58)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:464)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:454)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:85)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:82)
	at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
	at scala.collection.immutable.List.foldLeft(List.scala:84)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:82)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:74)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:69)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:67)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:50)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
	at sparkSql.SparkSQLHiveDemo$.main(SparkSQLHiveDemo.scala:15)
	at sparkSql.SparkSQLHiveDemo.main(SparkSQLHiveDemo.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The code is as follows:

import org.apache.spark.sql.SparkSession

def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    // Read the Hive tables emp and dept, join them, and write the result to HDFS as Parquet
    spark.sql("select e.empno,e.ename,e.job,e.mgr,e.comm from emp e join dept d on e.deptno = d.deptno")
      .filter("comm is not null")
      .write.parquet("hdfs://hadoop102:9000/demp")
    spark.close()
}

Accessing the same table through HiveContext works, and so does running the query in spark-shell. The cause is that a SparkSession built this way does not enable Hive support by default: it resolves tables against Spark's built-in in-memory catalog rather than the Hive metastore, so the table emp cannot be found.
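Before changing any code, it can help to confirm which catalog the session is actually using. A minimal check, assuming a spark session is already in scope (spark.sql.catalogImplementation is the setting that enableHiveSupport() switches to hive):

// Prints "in-memory" when Hive support is off, "hive" when the metastore is used
println(spark.conf.get("spark.sql.catalogImplementation"))
// Listing the visible tables also makes the problem obvious: without Hive support,
// emp and dept will not appear here
spark.sql("show tables").show()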

Enable Hive support:

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

After repackaging and running again, the job completes successfully.
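For completeness, here is the corrected program in full, as a sketch: the object name SparkSQLHiveDemo is taken from the stack trace above, the appName is an assumption (the original code does not set one), and hive-site.xml is assumed to be on the classpath so the session can reach the metastore:

import org.apache.spark.sql.SparkSession

object SparkSQLHiveDemo {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() switches the session from the default in-memory
    // catalog to the Hive metastore, so emp and dept can be resolved
    val spark = SparkSession.builder()
      .appName("SparkSQLHiveDemo") // assumed name, not in the original code
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("select e.empno,e.ename,e.job,e.mgr,e.comm from emp e join dept d on e.deptno = d.deptno")
      .filter("comm is not null")
      .write.parquet("hdfs://hadoop102:9000/demp")

    spark.close()
  }
}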
