Kylin / KYLIN-5203

The same SQL query returns inconsistent results in Kylin and Hive


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Won't Do
    • Affects Version/s: v3.1.2
    • Fix Version/s: None
    • Component/s: Query Engine
    • Labels: None

    Description

      SQL(SUM, COUNT):

      SELECT 
          SUM(t1.a1),
          COUNT(1)
      FROM
          T1 JOIN T2 ON...
          JOIN T3 ON...
          JOIN T4 ON...
          ...
          JOIN T9 ON...
      WHERE
          T1.c1 = '10000'
          AND T1.date BETWEEN '2022-06-11' AND '2022-06-21'
          AND T9.b_type IN ('7', '11', '12');

      Result:

                sum            count
        Hive    2134980.9451   36330
        Kylin   1135892.3346   19765

      If the T9 filter is removed:

      SELECT 
          SUM(t1.a1),
          COUNT(1)
      FROM
          T1 JOIN T2 ON...
          JOIN T3 ON...
          JOIN T4 ON...
          ...
          JOIN T9 ON...
      WHERE
          T1.c1 = '10000'
          AND T1.date BETWEEN '2022-06-11' AND '2022-06-21';

      Result:

                sum            count
        Hive    3184089.5551   65333
        Kylin   3184089.5551   65333
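
To make the size of the discrepancy explicit, the short Python sketch below subtracts the Kylin figures from the Hive figures for both queries. The numbers are copied from the two result tables in this report; the names `FILTERED`, `UNFILTERED` and `gap` are only illustrative. With the T9 filter, Kylin returns 16565 fewer rows than Hive; without it, the gap is zero.

```python
from decimal import Decimal

# Figures copied from the result tables in this report: (sum, count).
FILTERED = {
    "Hive": (Decimal("2134980.9451"), 36330),
    "Kylin": (Decimal("1135892.3346"), 19765),
}
UNFILTERED = {
    "Hive": (Decimal("3184089.5551"), 65333),
    "Kylin": (Decimal("3184089.5551"), 65333),
}


def gap(results):
    """Return (sum difference, count difference) as Hive minus Kylin."""
    hive_sum, hive_count = results["Hive"]
    kylin_sum, kylin_count = results["Kylin"]
    return hive_sum - kylin_sum, hive_count - kylin_count


if __name__ == "__main__":
    print("with T9 filter:   ", gap(FILTERED))
    print("without T9 filter:", gap(UNFILTERED))
```

Decimal is used instead of float so that the four-digit fractions subtract exactly.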

      In theory, Hive and Kylin should return the same results. Without the filter on table T9 the results match, but once the filter is added, part of the result is missing in Kylin.
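
One possible way to narrow this down (a suggestion, not something taken from the report) is to run the unfiltered query grouped by `T9.b_type` on both engines and compare the per-value rows. The sketch below assumes each engine's output has already been fetched as `(b_type, sum, count)` tuples. `BREAKDOWN_SQL` is only a template, with the join conditions left elided as in the report, and `diff_breakdown` is a hypothetical helper name.

```python
# Template only: the join conditions are elided in the report and stay elided here.
BREAKDOWN_SQL = """
SELECT T9.b_type, SUM(t1.a1), COUNT(1)
FROM T1 JOIN T2 ON... ... JOIN T9 ON...
WHERE T1.c1 = '10000'
  AND T1.date BETWEEN '2022-06-11' AND '2022-06-21'
GROUP BY T9.b_type
"""


def diff_breakdown(hive_rows, kylin_rows):
    """Compare per-key (sum, count) rows returned by the two engines.

    Each argument is an iterable of (key, sum, count) tuples.
    Returns {key: (hive_value, kylin_value)} for every key whose values
    differ; a key missing on one side shows up as None on that side.
    """
    hive = {key: (total, count) for key, total, count in hive_rows}
    kylin = {key: (total, count) for key, total, count in kylin_rows}
    mismatches = {}
    for key in sorted(hive.keys() | kylin.keys(), key=str):
        if hive.get(key) != kylin.get(key):
            mismatches[key] = (hive.get(key), kylin.get(key))
    return mismatches
```

If only some `b_type` values show a mismatch, the problem is likely tied to how those values are stored or matched; if every value matches, the difference is more likely introduced by the `IN` filter itself.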

      env:
          Hive:
          There are nine tables in total. The main table (the fact table) is a partitioned table. Of the other eight tables, two are large tables with tens of millions of rows each; the rest are dimension tables, stored as bucketed tables.

          Kylin:
          Create Intermediate Flat Hive Table
          Redistribute Flat Hive Table
          Extract Fact Table Distinct Columns(Map Input)
          Segment: 
              Source Count: ???

          According to the logs, the data count is the same.

          People

            Assignee: Unassigned
            Reporter: xinxin wang