  1. Apache Lens (Retired)
  2. LENS-910

Add session config to skip filtering cube related tables from all the tables in a database

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5
    • Component/s: server
    • Labels: None

    Description

      CubeMetastoreServiceImpl.java
        private List<String> getNativeTablesFromDB(LensSessionHandle sessionid, String dbName, boolean prependDbName)
          throws LensException {
          IMetaStoreClient msc = null;
          try {
            msc = getSession(sessionid).getMetaStoreClient();
            List<String> tables = msc.getAllTables(dbName);
            List<String> result = new ArrayList<String>();
            if (tables != null && !tables.isEmpty()) {
              List<org.apache.hadoop.hive.metastore.api.Table> tblObjects =
                msc.getTableObjectsByName(dbName, tables);
              Iterator<org.apache.hadoop.hive.metastore.api.Table> it = tblObjects.iterator();
              while (it.hasNext()) {
                org.apache.hadoop.hive.metastore.api.Table tbl = it.next();
                if (tbl.getParameters().get(MetastoreConstants.TABLE_TYPE_KEY) == null) {
                  if (prependDbName) {
                    result.add(dbName + "." + tbl.getTableName());
                  } else {
                    result.add(tbl.getTableName());
                  }
                }
              }
            }
            return result;
          } catch (Exception e) {
            throw new LensException("Error getting native tables from DB", e);
          }
        }
      
      

      We have approximately 18K tables in one of our Hive databases. When fetching native tables, getNativeTablesFromDB() filters the cube-related tables out of the full table list using the table property cube.table.type. This filtering takes a long time on our large databases; the metastore API call msc.getTableObjectsByName() accounts for most of that time. Since cube-related tables may not be present in some or most databases, the filtering is unnecessary for them. I think we can add a session property so that a user can skip the filtering if they choose. Thoughts?
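
A minimal sketch of the proposal, not the committed patch: when a session property is set, the table names are returned directly and the expensive per-table lookup is never made. The property name `example.session.skip.cube.table.filter`, the class `NativeTableLister`, and the `TableSource` interface (a stand-in for the two metastore calls above) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NativeTableLister {
  // Hypothetical session property name, chosen for illustration only.
  static final String SKIP_CUBE_FILTER_KEY = "example.session.skip.cube.table.filter";
  // Table property that marks cube-related tables (named in the description).
  static final String TABLE_TYPE_KEY = "cube.table.type";

  /** Hypothetical stand-in for the metastore calls used by getNativeTablesFromDB(). */
  interface TableSource {
    List<String> getAllTables(String dbName);

    // The expensive call: returns table name -> table parameters.
    Map<String, Map<String, String>> getTableParameters(String dbName, List<String> tables);
  }

  static List<String> getNativeTables(TableSource source, Map<String, String> sessionConf,
      String dbName, boolean prependDbName) {
    List<String> result = new ArrayList<>();
    List<String> tables = source.getAllTables(dbName);
    if (tables == null || tables.isEmpty()) {
      return result;
    }
    // Filtering stays on by default, so existing sessions behave as before.
    boolean skipFilter =
        Boolean.parseBoolean(sessionConf.getOrDefault(SKIP_CUBE_FILTER_KEY, "false"));
    // Fetch table parameters only when the filter is actually needed.
    Map<String, Map<String, String>> params = null;
    if (!skipFilter) {
      params = source.getTableParameters(dbName, tables);
    }
    for (String name : tables) {
      if (params != null) {
        Map<String, String> tableParams = params.get(name);
        if (tableParams != null && tableParams.get(TABLE_TYPE_KEY) != null) {
          continue; // cube-related table, not a native table
        }
      }
      result.add(prependDbName ? dbName + "." + name : name);
    }
    return result;
  }
}
```

With the property set to true, the result is simply every table name in the database, so this only makes sense for databases known to hold no cube-related tables.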

      Attachments

        1. LENS-910-5.patch
          9 kB
          Deepak Barr

          People

            Assignee: Deepak Barr (deepak.barr)
            Reporter: Deepak Barr (deepak.barr)