
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: hive-14535
    • Component/s: None
    • Labels: None

    Description

      set hive.optimize.skewjoin = true;
      set hive.skewjoin.key = 2;
      set hive.optimize.metadataonly=false;
      
      CREATE TABLE dest_j1(key INT, value STRING) STORED AS TEXTFILE tblproperties ("transactional"="true", "transactional_properties"="insert_only");
      
      FROM src src1 JOIN src src2 ON (src1.key = src2.key)
      INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
      
      select count(distinct key) from dest_j1;
      

      The final count differs between an MM (insert-only transactional) table and a non-MM table.

      This probably has something to do with how skewjoin handles files. However, the MM and debugging logs show no suspicious deletes, and all the logging for the skewjoin row containers is identical between the two runs apart from the numbers and GUIDs; the number of files, the paths, and so on are all the same. So it is not clear what is going on. A dfs dump could probably answer this question, but it doesn't currently work for me on q files.
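
      For comparison, the non-MM control case can be sketched as below. This is a minimal sketch, not part of the original report: the table name dest_j1_nonmm is hypothetical, and the final diff query is only one possible way to see which keys differ. Skewjoin is switched off before the diff so that the comparison itself does not go through the code path under suspicion.

```sql
-- Same settings as the MM repro above.
set hive.optimize.skewjoin = true;
set hive.skewjoin.key = 2;
set hive.optimize.metadataonly=false;

-- Hypothetical control table: same schema, but no transactional
-- properties, so it is a plain (non-MM) table.
CREATE TABLE dest_j1_nonmm(key INT, value STRING) STORED AS TEXTFILE;

FROM src src1 JOIN src src2 ON (src1.key = src2.key)
INSERT OVERWRITE TABLE dest_j1_nonmm SELECT src1.key, src2.value;

-- Compare this number with the result of the same query on dest_j1.
select count(distinct key) from dest_j1_nonmm;

-- Disable skewjoin for the diff so the comparison is not affected by it.
set hive.optimize.skewjoin = false;

-- List keys whose row counts differ between the two tables,
-- including keys that are present in only one of them.
SELECT COALESCE(a.key, b.key) AS key, a.cnt AS mm_rows, b.cnt AS non_mm_rows
FROM (SELECT key, count(*) AS cnt FROM dest_j1 GROUP BY key) a
FULL OUTER JOIN (SELECT key, count(*) AS cnt FROM dest_j1_nonmm GROUP BY key) b
ON (a.key = b.key)
WHERE a.cnt IS NULL OR b.cnt IS NULL OR a.cnt <> b.cnt;
```

      If the diff query returns rows, they show whether the MM table is missing keys entirely or only has fewer rows per key, which may help narrow down where the files are being lost.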

          People

            Assignee: Sergey Shelukhin (sershe)
            Reporter: Sergey Shelukhin (sershe)