Pig / PIG-4705

Error Schema for data cannot be determined using HCatalog


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.16.0
    • Component/s: tez
    • Labels: None
    • Environment: HDP 2.3.2

    Description

      When we use HCatalog as both the source and the destination of data for Pig on Tez, we get ERROR 1115: Schema for data cannot be determined.
      Pig works fine when we use MapReduce, or when HCatalog is used at only one of the endpoints, e.g. when we load data directly from a file and store it using HCatalog.

      The error appeared after upgrading from Pig 0.14 on Tez 0.5.2 to Pig 0.15 on Tez 0.7.0 (HDP 2.2.6 to HDP 2.3.2).

      To reproduce:

      data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
      items_unique = DISTINCT data;
      
      counted = FOREACH (GROUP items_unique BY col2)
                GENERATE
                  group AS name,
                  COUNT(items_unique) AS value;
        
      STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
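
      One way to narrow down where the schema is lost is to print the schema Pig infers for each alias before the STORE. The following is a minimal sketch that reuses the table, column, and alias names from the reproduction script above. DESCRIBE is a standard Pig diagnostic operator, but it is evaluated on the front end, so it is an assumption (not confirmed by this report) that it exposes the Tez-specific failure.

      ```pig
      -- Same statements as the reproduction script, with DESCRIBE added after each step.
      -- DESCRIBE prints the schema Pig has inferred for the given alias.
      data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
      DESCRIBE data;

      items_unique = DISTINCT data;
      DESCRIBE items_unique;

      counted = FOREACH (GROUP items_unique BY col2)
                GENERATE
                  group AS name,
                  COUNT(items_unique) AS value;
      DESCRIBE counted;
      ```

      If all three aliases report a complete schema under both execution engines, the schema is more likely being lost at execution time on Tez than during script compilation.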
      

      Attachments

        1. stack_trace.log
          9 kB
          Krzysztof Indyk
        2. hive_tables.hql
          0.2 kB
          Krzysztof Indyk
        3. sample.csv
          0.1 kB
          Krzysztof Indyk


          People

            Assignee: Unassigned
            Reporter: Krzysztof Indyk
            Votes: 0
            Watchers: 2
