I have recently been testing HCatalog. HCatalog itself ships as a standalone JAR; it can also be run as a service, but that service is really just the metastore Thrift server, so when writing a MapReduce job against HCatalog it is enough to add the HCatalog JAR and the matching hive-site.xml to -libjars and to HADOOP_CLASSPATH.
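For reference, a job submission might look like the sketch below; the JAR and configuration paths are purely illustrative and depend on your HCatalog and Hive installation.

  # Illustrative paths; adjust to your installation.
  export HCAT_JAR=/usr/lib/hcatalog/share/hcatalog/hcatalog-core.jar
  export HIVE_CONF_DIR=/etc/hive/conf
  export HADOOP_CLASSPATH=$HCAT_JAR:$HIVE_CONF_DIR

  hadoop jar my-hcat-job.jar com.example.MyDriver \
    -libjars $HCAT_JAR,$HIVE_CONF_DIR/hive-site.xml \
    /input/path /output/path

During testing, however, I still ran into a problem: after the Hive metastore server had been running for a while, it started throwing the following error: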
2013-06-19 10:35:51,718 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
javax.jdo.JDOFatalUserException: Persistence Manager has been closed
at org.datanucleus.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2124)
at org.datanucleus.jdo.JDOPersistenceManager.currentTransaction(JDOPersistenceManager.java:315)
at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:294)
at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:732)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at com.sun.proxy.$Proxy5.getTable(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:982)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:5017)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:5005)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
Some background. The PersistenceManager is responsible for a set of persistent objects, including creating them and running queries against them; it is an instance variable of ObjectStore, and each ObjectStore owns one pm. RawStore is the interface through which the metastore's logical layer talks to the underlying metadata database (Derby, for example), and ObjectStore is the default implementation of RawStore. When the Hive Metastore Server starts, it registers a TProcessor wrapping an HMSHandler, which holds a ThreadLocal<RawStore> threadLocalMS instance variable, so that each thread maintains its own RawStore:
  private final ThreadLocal<RawStore> threadLocalMS =
      new ThreadLocal<RawStore>() {
        @Override
        protected synchronized RawStore initialValue() {
          return null;
        }
      };
Every request coming in from a Hive metastore client is assigned a WorkerProcess from the thread pool, and each method in HMSHandler obtains the RawStore instance for the actual work through getMS():
  public RawStore getMS() throws MetaException {
    RawStore ms = threadLocalMS.get();
    if (ms == null) {
      ms = newRawStore();
      threadLocalMS.set(ms);
      ms = threadLocalMS.get();
    }
    return ms;
  }
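getMS() is the classic per-thread lazy-initialization pattern. As a minimal, self-contained sketch of the same idea (not Hive code; the StringBuilder is just a stand-in for the store):

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  public class PerThreadStoreDemo {
    // Stands in for the RawStore; created at most once per thread.
    static final ThreadLocal<StringBuilder> STORE = ThreadLocal.withInitial(() -> {
      System.out.println(Thread.currentThread().getName() + ": creating store");
      return new StringBuilder("store");
    });

    public static void main(String[] args) {
      ExecutorService pool = Executors.newFixedThreadPool(2);
      for (int i = 0; i < 4; i++) {
        // Four "requests" on two worker threads: only two stores get created.
        pool.submit(() -> System.out.println(
            Thread.currentThread().getName() + " reuses " + STORE.get()));
      }
      pool.shutdown();
    }
  }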
In getMS(), the RawStore is created lazily, and once initialized it is bound to the thread-local variable so later calls on the same thread can reuse it. newRawStore() does the actual construction:
  private RawStore newRawStore() throws MetaException {
    LOG.info(addPrefix("Opening raw store with implemenation class:"
        + rawStoreClassName));
    Configuration conf = getConf();
    return RetryingRawStore.getProxy(hiveConf, conf, rawStoreClassName,
        threadLocalId.get());
  }
RawStore is wrapped using the dynamic proxy pattern (RetryingRawStore implements the InvocationHandler interface): it implements an invoke() method that performs the real work through method.invoke(), which makes it possible to add extra logic around that call. RetryingRawStore uses this to catch exceptions thrown from invoke() and retry the operation. Because the call goes through reflection, the real exception arrives wrapped in an InvocationTargetException. Surprisingly, in Hive 0.9 the code simply re-threw this exception after catching it instead of retrying, which is clearly wrong. I modified it to extract the wrapped target exception, check whether it is an instance of JDOException, and handle it accordingly:
  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    Object ret = null;
    boolean gotNewConnectUrl = false;
    boolean reloadConf = HiveConf.getBoolVar(hiveConf,
        HiveConf.ConfVars.METASTOREFORCERELOADCONF);
    boolean reloadConfOnJdoException = false;

    if (reloadConf) {
      updateConnectionURL(getConf(), null);
    }

    int retryCount = 0;
    Exception caughtException = null;
    while (true) {
      try {
        if (reloadConf || gotNewConnectUrl || reloadConfOnJdoException) {
          initMS();
        }
        ret = method.invoke(base, args);
        break;
      } catch (javax.jdo.JDOException e) {
        caughtException = (javax.jdo.JDOException) e.getCause();
      } catch (UndeclaredThrowableException e) {
        throw e.getCause();
      } catch (InvocationTargetException e) {
        Throwable t = e.getTargetException();
        if (t instanceof JDOException) {
          caughtException = (JDOException) e.getTargetException();
          reloadConfOnJdoException = true;
          LOG.error("rawstore jdoexception:" + caughtException.toString());
        } else {
          throw e.getCause();
        }
      }

      if (retryCount >= retryLimit) {
        throw caughtException;
      }

      assert (retryInterval >= 0);
      retryCount++;
      LOG.error(
          String.format(
              "JDO datastore error. Retrying metastore command " +
                  "after %d ms (attempt %d of %d)", retryInterval, retryCount, retryLimit));
      Thread.sleep(retryInterval);
      // If we have a connection error, the JDO connection URL hook might
      // provide us with a new URL to access the datastore.
      String lastUrl = getConnectionURL(getConf());
      gotNewConnectUrl = updateConnectionURL(getConf(), lastUrl);
    }
    return ret;
  }
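To make the technique concrete, here is a minimal, self-contained sketch of retry-via-dynamic-proxy, independent of Hive; the names (Store, FlakyStore, TransientStoreException, retryLimit) are invented for illustration:

  import java.lang.reflect.InvocationHandler;
  import java.lang.reflect.InvocationTargetException;
  import java.lang.reflect.Proxy;

  public class RetryProxyDemo {
    interface Store { String get(String key); }

    static class TransientStoreException extends RuntimeException {
      TransientStoreException(String msg) { super(msg); }
    }

    // Fails twice, then succeeds: simulates a flaky backing datastore.
    static class FlakyStore implements Store {
      private int calls = 0;
      public String get(String key) {
        if (++calls < 3) throw new TransientStoreException("attempt " + calls + " failed");
        return "value-of-" + key;
      }
    }

    static Store retrying(Store base, int retryLimit) {
      InvocationHandler handler = (proxy, method, args) -> {
        for (int attempt = 0; ; attempt++) {
          try {
            return method.invoke(base, args); // real call; failures arrive wrapped
          } catch (InvocationTargetException e) {
            Throwable target = e.getTargetException(); // unwrap the real cause
            if (!(target instanceof TransientStoreException) || attempt >= retryLimit) {
              throw target; // non-retriable, or out of attempts
            }
          }
        }
      };
      return (Store) Proxy.newProxyInstance(
          Store.class.getClassLoader(), new Class<?>[]{Store.class}, handler);
    }

    public static void main(String[] args) {
      Store store = retrying(new FlakyStore(), 5);
      System.out.println(store.get("k")); // succeeds on the third underlying attempt
    }
  }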
There are two paths that initialize a RawStore. The first is in RetryingRawStore's constructor, which calls "this.base = (RawStore) ReflectionUtils.newInstance(rawStoreClass, conf);" — because ObjectStore implements Configurable, newInstance() itself invokes setConf(conf) on the new object, initializing the RawStore. The second is the retry path: after an exception is caught, initMS() also calls base.setConf(getConf()):
  private void initMS() {
    base.setConf(getConf());
  }
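The constructor path works because of Hadoop's ReflectionUtils.newInstance, which calls setConf() on any Configurable instance it creates. A simplified sketch of that behavior (not the actual Hadoop source, which also caches constructors):

  import org.apache.hadoop.conf.Configurable;
  import org.apache.hadoop.conf.Configuration;

  public class NewInstanceSketch {
    public static <T> T newInstance(Class<T> clazz, Configuration conf) throws Exception {
      T instance = clazz.getDeclaredConstructor().newInstance();
      if (instance instanceof Configurable) {
        // This is why constructing an ObjectStore already runs setConf(conf).
        ((Configurable) instance).setConf(conf);
      }
      return instance;
    }
  }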
In ObjectStore's setConf() method, the PersistenceManagerFactory lock is taken first, the pm is closed and set to null, and then the pm is initialized again:
  public void setConf(Configuration conf) {
    // Although an instance of ObjectStore is accessed by one thread, there may
    // be many threads with ObjectStore instances. So the static variables
    // pmf and prop need to be protected with locks.
    pmfPropLock.lock();
    try {
      isInitialized = false;
      hiveConf = conf;
      Properties propsFromConf = getDataSourceProps(conf);
      boolean propsChanged = !propsFromConf.equals(prop);

      if (propsChanged) {
        pmf = null;
        prop = null;
      }

      assert(!isActiveTransaction());
      shutdown();
      // Always want to re-create pm as we don't know if it were created by the
      // most recent instance of the pmf
      pm = null;
      openTrasactionCalls = 0;
      currentTransaction = null;
      transactionStatus = TXN_STATUS.NO_STATE;

      initialize(propsFromConf);

      if (!isInitialized) {
        throw new RuntimeException(
            "Unable to create persistence manager. Check dss.log for details");
      } else {
        LOG.info("Initialized ObjectStore");
      }
    } finally {
      pmfPropLock.unlock();
    }
  }

  private void initialize(Properties dsProps) {
    LOG.info("ObjectStore, initialize called");
    prop = dsProps;
    pm = getPersistenceManager();
    isInitialized = pm != null;
    return;
  }
Back to the error at the beginning: how could the Persistence Manager have been closed? After careful digging I found the cause: HCatalog proactively calls close() on its HiveMetastoreClient once it is done with it, whereas Hive itself generally never calls this method. In HiveMetaStoreClient.java:
  public void close() {
    isConnected = false;
    try {
      if (null != client) {
        client.shutdown();
      }
    } catch (TException e) {
      LOG.error("Unable to shutdown local metastore client", e);
    }
    // Transport would have got closed via client.shutdown(), so we dont need this, but
    // just in case, we make this call.
    if ((transport != null) && transport.isOpen()) {
      transport.close();
    }
  }
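For context, the calling pattern on the HCatalog side looks roughly like the following sketch (the table name is a placeholder, and error handling is omitted):

  import org.apache.hadoop.hive.conf.HiveConf;
  import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

  public class MetastoreClientSketch {
    public static void main(String[] args) throws Exception {
      HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
      try {
        // "default.my_table" is a placeholder; any metadata call works here.
        System.out.println(client.getTable("default", "my_table").getTableName());
      } finally {
        client.close(); // sends shutdown() to the server-side worker thread
      }
    }
  }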
On the server side, close() maps to the shutdown() method in HMSHandler:
  @Override
  public void shutdown() {
    logInfo("Shutting down the object store...");
    RawStore ms = threadLocalMS.get();
    if (ms != null) {
      ms.shutdown();
      ms = null;
    }
    logInfo("Metastore shutdown complete.");
  }
And here is ObjectStore's shutdown() method:
  public void shutdown() {
    if (pm != null) {
      pm.close();
    }
  }
Notice what happens: shutdown() fetches the current thread's ObjectStore and calls its shutdown() method, which closes the pm, but it never destroys the ObjectStore itself. The instance remains inside threadLocalMS and will be fetched again: the next time this thread serves another request, getMS() returns the same ObjectStore, and because its pm has already been closed, the call is guaranteed to throw. The correct fix is to add threadLocalMS.remove() or threadLocalMS.set(null) so the instance is explicitly removed from the ThreadLocalMap.
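The failure mode is easy to reproduce outside Hive. Below is a minimal sketch (all names invented) of a pooled worker thread caching a resource in a ThreadLocal and handing the dead instance to the next request:

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  public class StaleThreadLocalDemo {
    static class Store {
      private boolean closed = false;
      void use() {
        if (closed) {
          throw new IllegalStateException("Persistence Manager has been closed");
        }
      }
      void shutdown() { closed = true; }
    }

    // Each pool thread caches its own Store, just like threadLocalMS caches a RawStore.
    static final ThreadLocal<Store> LOCAL = ThreadLocal.withInitial(Store::new);

    public static void main(String[] args) throws Exception {
      ExecutorService pool = Executors.newFixedThreadPool(1); // one worker thread, reused
      // Request 1: the client's close() path shuts the store down but leaves it cached.
      pool.submit(() -> LOCAL.get().shutdown()).get();
      // Request 2 on the same thread gets the dead instance back:
      // fails with ExecutionException caused by IllegalStateException.
      pool.submit(() -> LOCAL.get().use()).get();
      // The fix mirrors the patch below: call LOCAL.remove() in the shutdown path
      // so the next request re-creates the store.
      pool.shutdown();
    }
  }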
The fixed shutdown() method:
  @Override
  public void shutdown() {
    logInfo("Shutting down the object store...");
    RawStore ms = threadLocalMS.get();
    if (ms != null) {
      ms.shutdown();
      ms = null;
      threadLocalMS.remove(); // drop the dead ObjectStore from the ThreadLocalMap
    }
    logInfo("Metastore shutdown complete.");
  }
After applying the fix and restarting the metastore server, the "Persistence Manager has been closed" error never appeared again.
Original article: http://blog.csdn.net/lalaguozhe/article/details/9161799. Please credit the source when reposting.