Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: v2.1.0
    • Fix Version/s: v2.2.0
    • Component/s: Storage - HBase
    • Labels:
    • Environment:
      kylin:2.1.0
      hadoop:2.7.3
      hive:1.2.1
      hbase: 1.2.5
    • Flags:
      Patch

      Description

      When HBase runs on a separate cluster from Hive and MapReduce, Kylin throws the
      exception. See this thread for details: http://apache-kylin.74782.x6.nabble.com/wrong-fs-when-use-two-cluster-td8985.html
      Besides that, I found that kylin-2.0 is not compatible with kylin-2.1, which
      causes queries to fail. In function writeLargeCellToHdfs(...), kylin-2.1
      writes the content to the MapReduce cluster; in kylin-2.0, the destination
      is the HBase cluster.
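
      The "wrong FS" exception comes from Hadoop's path check: a fully qualified
      path is only accepted by a FileSystem whose scheme and authority match it.
      Below is a minimal, pure-JDK sketch of that check, not the actual Hadoop or
      Kylin code; the cluster names are borrowed from the distcp example later in
      this thread and are illustrative only.

      ```java
      import java.net.URI;

      public class WrongFsDemo {
          // Mimics Hadoop's FileSystem.checkPath: a relative path is accepted
          // anywhere; a qualified path must match the filesystem's scheme and
          // authority, otherwise Hadoop raises "Wrong FS".
          static boolean matchesFs(URI fsUri, URI pathUri) {
              return pathUri.getScheme() == null
                  || (fsUri.getScheme().equals(pathUri.getScheme())
                      && fsUri.getAuthority().equals(pathUri.getAuthority()));
          }

          public static void main(String[] args) {
              URI mrFs    = URI.create("hdfs://hive-hdfs:8020");   // computing cluster
              URI hbaseFs = URI.create("hdfs://hbase-hdfs:8020");  // HBase cluster
              // A large-cell file qualified against the MR cluster's default FS:
              URI largeCell = URI.create("hdfs://hive-hdfs:8020/kylin/resources/big_cell");

              System.out.println(matchesFs(mrFs, largeCell));     // true  - accepted
              System.out.println(matchesFs(hbaseFs, largeCell));  // false - "Wrong FS"
          }
      }
      ```

      So a path qualified against one cluster's default filesystem fails as soon
      as it is handed to the other cluster's FileSystem object, which is exactly
      the two-cluster symptom reported above.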

        Issue Links

          Activity

          mailpig yzq added a comment - Reporter

          Re-uploaded the patch.

          Shaofengshi Shaofeng SHI added a comment -

          Hi yzq,

          " In function writeLargeCellToHdfs(...) , the kylin-2.1 will write content to cluster of mapreduce. In kylin-2.0, the
          destination is hbase's cluster."

          Are you sure for the above statements? From my investigation, it should be kylin 2.0 writes to the mapreduce cluster, but kylin-2.1 writes to hbase cluster.

          mailpig yzq added a comment - Reporter

          You are right, I made a mistake; the description should be corrected.

          Shaofengshi Shaofeng SHI added a comment - edited

          Writing to the HBase cluster should be better, because the meta store is in HBase. From another perspective, Kylin should be deployed closer to the HBase cluster, so putting the large files on the HBase cluster's HDFS is also better than putting them on the computing cluster. So Kylin 2.1 behaves better than 2.0 here.

          If the deployment is read/write separated, then when upgrading to 2.1 you need to migrate the metadata files on HDFS, e.g.:
          hadoop distcp hdfs://hive-hdfs:8020/kylin/kylin_default_instance/resources/* hdfs://hbase-hdfs:8020/kylin/kylin_default_instance/resources/

          For the error you mentioned in the email, that is a bug; I have made the change (only one line added) in a local branch. When it passes the integration tests, I will push it.
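
          The distcp above copies each resource to the same path on the HBase
          cluster's HDFS; only the scheme/authority part of the URI changes. A
          minimal sketch of that mapping for a single path (pure JDK, purely
          illustrative; the real migration is the distcp command itself):

          ```java
          import java.net.URI;

          public class MigratePath {
              // Keep the path component, swap scheme and authority to the
              // destination filesystem, mirroring what distcp does per file.
              static URI toNewCluster(URI src, URI destFs) {
                  return URI.create(destFs.getScheme() + "://" + destFs.getAuthority() + src.getPath());
              }

              public static void main(String[] args) {
                  URI src = URI.create("hdfs://hive-hdfs:8020/kylin/kylin_default_instance/resources/big_cell");
                  URI destFs = URI.create("hdfs://hbase-hdfs:8020");
                  System.out.println(toNewCluster(src, destFs));
                  // hdfs://hbase-hdfs:8020/kylin/kylin_default_instance/resources/big_cell
              }
          }
          ```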

          Shaofengshi Shaofeng SHI added a comment -

          Another problem in 2.1: when configured with a separate HBase cluster, the HFile is generated on the Hive/MR cluster instead of HBase's HDFS. It does not throw an error, but you will see that 1) the bulk load takes longer, and 2) the HFile in the working dir is not removed after being loaded into HBase.

          mailpig yzq added a comment - Reporter

          Great! But for my project the HBase cluster is much smaller than the Hive/MR one, so I'm more inclined to use the latter. The other reason is compatibility.
          Thanks for your detailed explanation. This issue is done for me.

          Shaofengshi Shaofeng SHI added a comment -

          Hi Alex, let's keep this JIRA open until the fix is released in a formal Kylin version. For now I will set its fix version to 2.2.

          mailpig yzq added a comment - Reporter

          OK


            People

            • Assignee:
              Shaofengshi Shaofeng SHI
              Reporter:
              mailpig yzq
              Request participants:
              None
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                72h
                Remaining:
                72h
                Logged:
                Not Specified