  1. Hive
  2. HIVE-25269

When the skew and parallel parameters are both true, the query result is missing rows


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0, 3.1.2
    • Fix Version/s: None
    • Component/s: Physical Optimizer, SQL
    • Labels: None

    Description

      When hive.optimize.skewjoin, hive.groupby.skewindata, and hive.exec.parallel are all set to true, executing SQL of the form 'INSERT ... FROM (SUBQUERY UNION ALL ... GROUP BY ...) A JOIN/LEFT JOIN A.expression' writes fewer rows than expected. The full SQL and test data are in the attachments.
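
      The query shape described above can be sketched roughly as follows. This is only an illustration of one reading of that pattern, not the statement from test.sql: the table names (src_a, src_b, dim, target) and columns (k, v, label) are hypothetical, and only the three SET parameters come from the report.

```sql
-- The three settings named in the report, all enabled together.
SET hive.optimize.skewjoin=true;
SET hive.groupby.skewindata=true;
SET hive.exec.parallel=true;

-- Hypothetical tables and columns, for illustration only.
-- Shape: INSERT ... FROM (subquery UNION ALL ... GROUP BY ...) a LEFT JOIN ...
INSERT OVERWRITE TABLE target
SELECT a.k, a.total, d.label
FROM (
  SELECT u.k, SUM(u.v) AS total
  FROM (
    SELECT k, v FROM src_a
    UNION ALL
    SELECT k, v FROM src_b
  ) u
  GROUP BY u.k
) a
LEFT JOIN dim d ON a.k = d.k;
```

      One way to check for the symptom, under the same assumptions, would be to run the statement once with all three settings true and once with them false, then compare SELECT COUNT(*) FROM target between the two runs.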

      Attachments

        1. test.sql
          4 kB
          GuangMing Lu
        2. P10IDS_RISKLIST.zip
          48 kB
          GuangMing Lu
        3. p10ids_riskcon.zip
          69 kB
          GuangMing Lu
        4. p10ids_realpayrc_ygz.zip
          98 kB
          GuangMing Lu
        5. p10ids_prerec_split_ygz.zip
          98 kB
          GuangMing Lu
        6. comb_classcode.zip
          22 kB
          GuangMing Lu


          People

            Assignee: Unassigned
            Reporter: GuangMing Lu (luguangming)
            Votes: 0
            Watchers: 2
