IMPALA-4020

Catalog update can fail due to database creation/deletion in Hive.

    Details

      Description

      Create a Hive script that creates and drops databases:

      for a in {1..100}; do echo "create database blah$a;" >> hive_script.sql; echo "drop database blah$a;" >> hive_script.sql; done

      Run this continuously:

      beeline -u jdbc:hive2://nightly57-unsecure-1.gce.cloudera.com:10000/default -n hive -p hive -f hive_script.sql

      In another session, invalidate metadata in a loop:

      for a in {1..100}; do impala-shell -q "invalidate metadata"; done

      Eventually you will see the following in the catalog server logs:
      E0817 01:10:53.680232 19081 CatalogServiceCatalog.java:583] NoSuchObjectException(message:blah55)
      E0817 01:10:53.703266 19081 catalog-server.cc:76] CatalogException: Error initializing Catalog. Catalog may be empty.

      This is a corner case. Here is why it happens.
      A global "invalidate metadata" runs the catalog's reset() function. The failure occurs when Hive creates and drops a database at the points marked in the comments:
      MetaStoreClient msClient = metaStoreClientPool_.getClient(); // "create database test;" is run in Hive
      try {
        for (String dbName: msClient.getHiveClient().getAllDatabases()) {
          // "drop database test;" is run in Hive
          List<org.apache.hadoop.hive.metastore.api.Function> javaFns =
              Lists.newArrayList();
          for (String javaFn: msClient.getHiveClient().getFunctions(dbName, "*")) {
            // This call fails and throws an exception because this database
            // does not exist in HMS now.
            javaFns.add(msClient.getHiveClient().getFunction(dbName, javaFn));
          }
          org.apache.hadoop.hive.metastore.api.Database msDb =
              msClient.getHiveClient().getDatabase(dbName);
          Db db = new Db(dbName, this, msDb);
          // Restore UDFs that aren't persisted.
          Db oldDb = oldDbCache.get(db.getName().toLowerCase());

      I also reproduced it by deliberately setting breakpoints and deleting the databases through Hive in parallel.
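
The race described above can be illustrated without a cluster. The following is only a minimal sketch: FakeMetastore is a hypothetical in-memory stand-in, not the real HMS client API, and IllegalStateException stands in for the NoSuchObjectException seen in the logs. It shows a listing taken before an external drop going stale by the time each database is fetched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleListingDemo {
  /** Hypothetical in-memory stand-in for the metastore (assumption, not the HMS API). */
  static class FakeMetastore {
    private final Map<String, String> dbs = new ConcurrentHashMap<>();

    void createDatabase(String name) { dbs.put(name, "db:" + name); }

    void dropDatabase(String name) { dbs.remove(name); }

    /** Returns a snapshot of the database names at the time of the call. */
    List<String> getAllDatabases() { return new ArrayList<>(dbs.keySet()); }

    /** Throws if the database no longer exists, like a missing-object error. */
    String getDatabase(String name) {
      String db = dbs.get(name);
      if (db == null) throw new IllegalStateException("NoSuchObject: " + name);
      return db;
    }
  }

  /** Returns the error message hit during the load, or null if the load succeeded. */
  static String loadWithConcurrentDrop() {
    FakeMetastore hms = new FakeMetastore();
    hms.createDatabase("blah55");
    List<String> names = hms.getAllDatabases(); // listing taken first
    hms.dropDatabase("blah55");                 // external client drops the database
    try {
      for (String name : names) {
        hms.getDatabase(name);                  // stale name: lookup fails
      }
      return null;
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(loadWithConcurrentDrop()); // prints "NoSuchObject: blah55"
  }
}
```

In this sketch the whole loop aborts on the first missing database, which mirrors the "Catalog may be empty" outcome reported in the logs.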

        Activity

        dtsirogiannis Dimitris Tsirogiannis added a comment -

        Anuj Phadke, are you currently working on this?

        bharathv bharath v added a comment -

        Commit: ec7b3a5ae7c387416c957908d42969b794955820
        http://github.mtv.cloudera.com/CDH/Impala/commit/ec7b3a5ae7c387416c957908d42969b794955820
        Author: Bharath Vissapragada <bharathv@cloudera.com>
        Date: 2016-09-06 (Tue, 06 Sep 2016)

        Changed paths:
        M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java

        Log Message:
        -----------
        IMPALA-4020: Handle external conflicting changes to HMS gracefully

        Currently Catalog can't handle conflicting changes to HMS' databases
        from external clients while running invalidate metadata operation.
        For example, if a database is dropped by a client external to Impala,
        while the invalidate metadata is in process, Catalog aborts the
        metadata load. This commit fixes this issue by handling appropriate
        exceptions when HMS operations fail and only ignores the load for that
        particular database.

        Change-Id: Ic228efbcceb9ef6c165d0d9aeef7202581e3e46a
        Reviewed-on: http://gerrit.cloudera.org:8080/4161
        Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
        Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
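
The approach in the log message (skip only the database whose load fails) might look roughly like the sketch below. This is not the committed change to CatalogServiceCatalog.java: the class TolerantCatalogLoad, the method loadAll, and the use of NoSuchElementException in place of the metastore's own exception are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

public class TolerantCatalogLoad {
  /**
   * Loads each listed database through fetchDb. A name whose fetch throws
   * NoSuchElementException (assumed stand-in for a missing-object error) is
   * recorded in skipped and the loop continues with the next name.
   */
  static Map<String, String> loadAll(
      List<String> dbNames, Function<String, String> fetchDb, List<String> skipped) {
    Map<String, String> loaded = new HashMap<>();
    for (String name : dbNames) {
      try {
        loaded.put(name, fetchDb.apply(name));
      } catch (NoSuchElementException e) {
        // The database disappeared after it was listed: ignore only this one.
        skipped.add(name);
      }
    }
    return loaded;
  }

  public static void main(String[] args) {
    Map<String, String> hms = new HashMap<>();
    hms.put("default", "db:default");
    hms.put("blah55", "db:blah55");
    List<String> listing = new ArrayList<>(hms.keySet()); // listing taken first
    hms.remove("blah55");                                  // dropped externally

    List<String> skipped = new ArrayList<>();
    Map<String, String> loaded = loadAll(listing, name -> {
      String db = hms.get(name);
      if (db == null) throw new NoSuchElementException(name);
      return db;
    }, skipped);
    // prints "loaded=[default] skipped=[blah55]"
    System.out.println("loaded=" + loaded.keySet() + " skipped=" + skipped);
  }
}
```

Catching the exception inside the loop, rather than around it, is what keeps one vanished database from aborting the whole metadata load.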


          People

          • Assignee: bharathv bharath v
          • Reporter: anujphadke Anuj Phadke