-- hive-site.xml
<property>
<name>system:java.io.tmpdir</name>
<value>/tmp/hive/java</value>
</property>
<property>
<name>system:user.name</name>
<value>${user.name}</value>
</property>
-- 以上两个属性,非常重要!底下会讲它
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.31.10/metastore?createDatabaseIfNotExist=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>MySQLAdmin</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>LuckyNumber234.</value>
<description>password to use against metastore database</description>
</property>
--如果修改了默认hdfs路径
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
初始化 metastore
各种配置文件完工后,需要初始化 Hive 的元数据库,即 metastore.
schematool -initSchema -dbType mysql
[hadoopadmin@namenode lib]$ schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Hive/Hive.3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Hadoop/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)
Go to $HIVE_HOME (%HIVE_HOME%)/lib folder and find out the version of guava. For Hive 3.0.0, it is guava-19.0.jar.
Go to $HADOOP_HOME (%HADOOP_HOME%)/share/hadoop/common/lib folder and find out the version of guava. For Hadoop 3.2.1, the version is guava-27.0-jre.jar.
If they are not same (which is true for this case), delete the older version and copy the newer version in both. In this case, delete** guava-19.0.jar** in Hive lib folder, and then copy guava-27.0-jre.jar from Hadoop folder to Hive.
第二个问题:com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character
这是 hive-site.xml 中有不可识别字符造成的,定位到这里:
在 "for" 和 "transactional" 之间,有个不可识别字符,删掉即可。
第三个问题,MySQL 8 的时区问题
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : The server time zone value 'EDT' is unrecognized or represents more than one time zone. You must configure either the server or JDBC driver (via the 'serverTimezone' configuration property) to use a more specific time zone value if you want to utilize time zone support.
SQL Error code: 0
这里暂停 mysql 服务,配置下 /etc/my.cnf ,加入时区就行
default_time_zone='+08:00'
Hive 运行
启动 Hive: hiveserver2
问题:启动时间太长
^C[hadoopadmin@namenode lib]$ hiveserver2
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
which: no hbase in (/home/hadoopadmin/.local/bin:/home/hadoopadmin/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/opt/java/jdk8/bin:/opt/java/VSCode-linux-x64/bin:/opt/Hadoop/hadoop-3.2.2/bin:/opt/Hadoop/hadoop-3.2.2/sbin:/opt/java/jdk8/bin:/opt/java/VSCode-linux-x64/bin:/opt/Hadoop/hadoop-3.2.2/bin:/opt/Hadoop/hadoop-3.2.2/sbin:/opt/java/jdk8/bin:/opt/java/VSCode-linux-x64/bin:/opt/Hadoop/hadoop-3.2.2/bin:/opt/Hadoop/hadoop-3.2.2/sbin:/opt/Hive/Hive.3.1.2/bin)
2021-04-07 11:15:42: Starting HiveServer2
Hive Session ID = 8239b67b-ef9f-4596-b802-fc3240e8bf8b
Hive Session ID = 37aa29dd-e2df-4f4d-9905-deaca088b149
Hive Session ID = e821c4f7-7872-4b3b-932b-ac3bd41161ad
……!!!此处省去20行
Hive Session ID = 6650ff30-bd57-4890-a278-b1cdd75aad45
Hive Session ID = fec29941-494c-4a5e-ac88-de45e5e37894
Hive Session ID = 71dd04d5-bb0b-4829-b0d3-3d20b5a385ab
[hadoopadmin@namenode hadoopadmin]$ beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/Hive/Hive.3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/Hadoop/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: hadoopadmin
Enter password for jdbc:hive2://localhost:10000: **********
21/04/08 11:28:01 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoopadmin is not allowed to impersonate hadoopadmin (state=08S01,code=0)
在执行 connect 时,使用用户名 hadoopadmin 登录失败。
同样,使用 dbeaver , 一款连接数据库,NoSQL 和大数据的跨界神器,
连接过程也不是一帆风顺:
Required field 'serverProtocolVersion' is unset! Struct:TOpenSessionResp(status:TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoopadmin is not allowed to impersonate anonymous:14:13,
*java.lang.RuntimeException:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoopadmin is not allowed to impersonate anonymous:29:7,
*org.apache.hadoop.ipc.RemoteException:User: hadoopadmin is not allowed to impersonate anonymous:54:25, org.apache.hadoop.ipc.Client:getRpcResponse:Client.java:1562,
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoopadmin is not allowed to impersonate anonymous), serverProtocolVersion:null)
关键在这个方法错误提示上:(User: hadoopadmin is not allowed to impersonate anonymous:)
hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoopadmin is not allowed to impersonate anonymous:14:13,
Hive Server 2 会对客户端进行认证鉴定。授权认证会有多种方法,所以链接的方式会有多样化。
本次实验,选择了 NONE + FALSE 的组合,采用 No Authentication 的策略。
--hive-site.xml
<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
<description>
Expects one of [nosasl, none, ldap, kerberos, pam, custom].
Client authentication types.
NONE: no authentication check
LDAP: LDAP/AD based authentication
KERBEROS: Kerberos/GSSAPI authentication
CUSTOM: Custom authentication provider
(Use with property hive.server2.custom.authentication.class)
PAM: Pluggable authentication module
NOSASL: Raw transport
</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>
Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
</description>
</property>
再执行 beeline:
beeline> !connect jdbc:hive2://192.168.31.10:10000/default;auth=noSasl
Connecting to jdbc:hive2://192.168.31.10:10000/default;auth=noSasl
Enter username for jdbc:hive2://192.168.31.10:10000/default: root
Enter password for jdbc:hive2://192.168.31.10:10000/default:
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.31.10:10000/default> show databases ;
DEBUG : Acquired the compile lock.
INFO : Compiling command(queryId=hadoopadmin_20210409105440_2c63fa0a-6c15-41cb-b11c-64284ff6a1ea): show databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hadoopadmin_20210409105440_2c63fa0a-6c15-41cb-b11c-64284ff6a1ea); Time taken: 0.977 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=hadoopadmin_20210409105440_2c63fa0a-6c15-41cb-b11c-64284ff6a1ea): show databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hadoopadmin_20210409105440_2c63fa0a-6c15-41cb-b11c-64284ff6a1ea); Time taken: 0.057 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
DEBUG : Shutting down query show databases
+----------------+
| database_name |
+----------------+
| default |
+----------------+
最重要的是这段链接:
beeline> !connect jdbc:hive2://192.168.31.10:10000/default;auth=noSasl
Connecting to jdbc:hive2://192.168.31.10:10000/default;auth=noSasl
Enter username for jdbc:hive2://192.168.31.10:10000/default: root
Enter password for jdbc:hive2://192.168.31.10:10000/default:
因采用了 No Authentication 的策略,在构建 Jdbc 连接字符串时,需加 auth=noSasl 设定。
<property>
<name>hive.server2.thrift.client.user</name>
<value>anonymous</value>
<description>Username to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>anonymous</value>
<description>Password to use against thrift client</description>
</property>