Hive / HIVE-10720

Pig script using HCatLoader to read an RCFile table and perform a join returns an incorrect result.


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: HCatalog
    • Labels: None

    Description

      Create table tbl1 (c1 string, c2 string, key string, value string) stored as rcfile;
      Create table tbl2 (key string, value string);
      insert into tbl1 values('c1', 'c2', '1', 'value1');
      insert into tbl2 values('1', 'value2');
      

      Pig script:

      tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
      tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
      
      src_tbl1 = FILTER tbl1 BY (key == '1');
      prj_tbl1 = FOREACH src_tbl1 GENERATE
                 c1 as c1,
                 c2 as c2,
                 key as tbl1_key;
                 
      src_tbl2 = FILTER tbl2 BY (key == '1');
      prj_tbl2 = FOREACH src_tbl2 GENERATE
                 key as tbl2_key;
                 
      result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
      dump result;
      

      The dumped result is "(,,1,1)": the c1 and c2 values are missing. Given the rows inserted above, the expected result is "(c1,c2,1,1)".
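
As a cross-check on what the script should return, the sketch below replays the same filter, projection, and join over the two inserted rows in plain Python. It is only a model of the intended semantics: it uses no Pig or HCatalog APIs, and the variable names simply mirror the aliases in the Pig script.

```python
# Plain-Python model of the Pig script above (assumption: ordinary
# inner-join semantics; no Pig or HCatalog code is involved).
# Rows mirror the INSERT statements in the description.
tbl1 = [{"c1": "c1", "c2": "c2", "key": "1", "value": "value1"}]
tbl2 = [{"key": "1", "value": "value2"}]

# FILTER ... BY (key == '1') followed by FOREACH ... GENERATE (projection).
prj_tbl1 = [(r["c1"], r["c2"], r["key"]) for r in tbl1 if r["key"] == "1"]
prj_tbl2 = [(r["key"],) for r in tbl2 if r["key"] == "1"]

# Inner JOIN on tbl1_key == tbl2_key; each output tuple is the left
# tuple followed by the right tuple, as in Pig's JOIN output.
result = [
    left + right
    for left in prj_tbl1
    for right in prj_tbl2
    if left[2] == right[0]
]

print(result)
```

The single tuple produced here carries the c1 and c2 values, whereas the reported Pig output has empty fields in those two positions.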

      Attachments

        1. HIVE-10720.patch (0.9 kB, Viraj Bhat)


          People

            Assignee: Aihua Xu (aihuaxu)
            Reporter: Aihua Xu (aihuaxu)
