HIVE-1328

make mapred.input.dir.recursive work for select *

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.6.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      For the script below, we would like the behavior from MAPREDUCE-1501 to apply so that the select * returns two rows instead of none.

      create table fact_daily(x int)
      partitioned by (ds string);

      create table fact_tz(x int)
      partitioned by (ds string, hr string, gmtoffset string);

      alter table fact_tz
      add partition (ds='2010-01-03', hr='1', gmtoffset='-8');
      insert overwrite table fact_tz
      partition (ds='2010-01-03', hr='1', gmtoffset='-8')
      select key+11 from src where key=484;

      alter table fact_tz
      add partition (ds='2010-01-03', hr='2', gmtoffset='-7');
      insert overwrite table fact_tz
      partition (ds='2010-01-03', hr='2', gmtoffset='-7')
      select key+12 from src where key=484;

      alter table fact_daily
      set tblproperties('EXTERNAL'='TRUE');

      alter table fact_daily
      add partition (ds='2010-01-03')
      location '/user/hive/warehouse/fact_tz/ds=2010-01-03';

      set mapred.input.dir.recursive=true;
      select * from fact_daily where ds='2010-01-03';
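
      To make the expected result concrete, here is a minimal sketch in Python that uses a local temporary directory as a stand-in for HDFS. It is not Hive's or Hadoop's implementation. The helper names (list_input_files, build_fact_tz_layout, count_rows), the data file name 000000_0, and the row values are assumptions made for illustration. The point it shows: the fact_daily partition points at a directory that holds only hr=.../gmtoffset=... subdirectories, so a flat listing finds no data files, while a recursive listing finds the two files written by the inserts above.

```python
import os
import tempfile


def list_input_files(path, recursive=False):
    """Return data files under `path`; descend into subdirectories only if `recursive`."""
    files = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            if recursive:
                files.extend(list_input_files(full, recursive=True))
        else:
            files.append(full)
    return files


def build_fact_tz_layout(root):
    """Create a local stand-in for the ds=2010-01-03 directory of fact_tz."""
    ds_dir = os.path.join(root, "fact_tz", "ds=2010-01-03")
    # One row per partition, as the expected result of two rows implies.
    # The values themselves are illustrative.
    for hr, gmtoffset, value in (("1", "-8", 495), ("2", "-7", 496)):
        leaf = os.path.join(ds_dir, "hr=" + hr, "gmtoffset=" + gmtoffset)
        os.makedirs(leaf)
        # "000000_0" is a hypothetical data file name.
        with open(os.path.join(leaf, "000000_0"), "w") as f:
            f.write("%d\n" % value)
    return ds_dir


def count_rows(files):
    """Count newline-terminated rows across all given files."""
    total = 0
    for path in files:
        with open(path) as f:
            total += sum(1 for _ in f)
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ds_dir = build_fact_tz_layout(root)
        # Flat listing: what a reader sees if it ignores subdirectories.
        print("flat:", count_rows(list_input_files(ds_dir)))
        # Recursive listing: the behavior requested for select *.
        print("recursive:", count_rows(list_input_files(ds_dir, recursive=True)))
```

      In this model, the flat listing corresponds to the "none" case in the description and the recursive listing to the desired two rows.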

          Activity

          Gavin made changes -
          Link This issue is depended upon by HIVE-1336 [ HIVE-1336 ]
          Gavin made changes -
          Link This issue blocks HIVE-1336 [ HIVE-1336 ]
          Carl Steinbach made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Namit Jain made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Namit Jain added a comment -

          Committed. Thanks John

          Namit Jain added a comment -

          +1

          looks good

          John Sichi made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          John Sichi added a comment -

          Review notes:

          • Refactored the recursive walk function from BucketizedHiveInputFormat into FileUtils.
          • Opened HIVE-1336 for test case.
          John Sichi made changes -
          Link This issue blocks HIVE-1336 [ HIVE-1336 ]
          John Sichi made changes -
          Attachment HIVE-1328.1.patch [ 12443534 ]
          John Sichi added a comment -

          Still testing this one. Won't be possible to submit an automated test until we're running against a version of Hadoop which includes MAPREDUCE-1501, so I'll open a separate deferred issue for that.

          Namit Jain added a comment -

          I haven't heard of anyone running into https://issues.apache.org/jira/browse/HIVE-1303 at Facebook.

          Edward Capriolo added a comment -

          I find external partitions to be pretty badly broken now. I am circling around one or two other bugs in them that I am about to report. Users (including myself) are frustrated because, rather than working with data, they have to work around bugs like HIVE-1318. I understand everyone has their own priorities. Call it what you will (inconsistency/feature), we are adding to the capability of external tables while current features do not even work well.

          In particular HIVE-1318 is brutal. When working with my data I can make no assumptions when querying. I have to do all types of shell scripting to ensure that partitions exist before I query them, adding extra where clauses to carefully select ranges of partitions.

          If you are using external partitions at Facebook, I wonder how you work around HIVE-1318, and I am also curious whether you experience HIVE-1303 or whether this is just something in my environment. The handful of users I have constantly have issues. Does everyone there just 'suck it up'?

          John Sichi added a comment -

          Hi Ed,

          This is not a new feature--this is an inconsistency in an existing feature when a particular Hadoop parameter is enabled (it should not matter whether you use select * vs a more complex select, you should get the same results).

          In general, prioritization is driven by a number of factors such as the overall project roadmap, quality, and the use cases which the developer wants or needs to make work (this one happens to be important for Facebook, which is why I'm working on it at the moment); if the ones you mention are high priority for you, please submit patches for them so we can get them resolved.

          Regardless of that, thanks for all the bug reports that you have submitted--they're very valuable in themselves, and we want to get them all fixed too.

          Edward Capriolo added a comment -

          Can we look at HIVE-1318 and maybe HIVE-1303 first? The external partitions already seem to have bugs. Can we get them working properly before more features are added?

          John Sichi created issue -

            People

            • Assignee:
              John Sichi
              Reporter:
              John Sichi
            • Votes:
              0
              Watchers:
              1
