Pig / PIG-1946

HBaseStorage constructor syntax is error prone

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      More friendly HBaseStorage constructor syntax.

      Description

      Using HBaseStorage like so seems like a reasonable thing to do, but it will yield unexpected results:

      STORE result INTO 'hbase://foo' USING
       org.apache.pig.backend.hadoop.hbase.HBaseStorage(
       'info:first_name, info:last_name');
      

      The problem is that a column named 'info:first_name,' will be created, with the trailing comma included. I've had numerous developers get tripped up on this issue, since everywhere else in Pig variables are separated by commas, so I propose we fix it.
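
      To make the failure mode concrete, here is a minimal sketch of what happens if the column list is split on spaces only. That splitting rule is an assumption inferred from the behavior described above, and splitOnSpaces is a hypothetical helper, not the actual HBaseStorage code.

```java
import java.util.Arrays;
import java.util.List;

public class WhitespaceSplitDemo {
    // Hypothetical helper (assumption): treats a single space as the only delimiter.
    static List<String> splitOnSpaces(String columnList) {
        return Arrays.asList(columnList.split(" "));
    }

    public static void main(String[] args) {
        // The comma is not a delimiter here, so it stays attached to the first name.
        for (String column : splitOnSpaces("info:first_name, info:last_name")) {
            System.out.println("[" + column + "]");
        }
    }
}
```

      Under that assumption the first token is 'info:first_name,' with the comma attached, which matches the column name reported in this issue.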

      I propose we trim leading/trailing commas from column names, but I'm open to other ideas.

      Also, should we accept column names that are comma-delimited without spaces?
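
      One way the proposal could look, as a rough sketch only: treat any run of whitespace and/or commas as a delimiter, so that space-delimited, comma-plus-space, and comma-only lists all parse the same way. The class and method names are hypothetical, and this is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

public class LenientColumnList {
    // Hypothetical helper (assumption): accepts "a b", "a, b" and "a,b" alike.
    static List<String> parseColumns(String columnList) {
        List<String> columns = new ArrayList<>();
        // Any run of whitespace and/or commas separates two column names.
        for (String token : columnList.trim().split("[\\s,]+")) {
            // A leading comma or an empty input yields an empty token; skip it.
            if (!token.isEmpty()) {
                columns.add(token);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(parseColumns("info:first_name, info:last_name"));
        System.out.println(parseColumns("info:first_name,info:last_name"));
        System.out.println(parseColumns("info:first_name info:last_name"));
    }
}
```

      The trade-off of this approach is that a column qualifier can no longer contain a comma, so it only makes sense if such names do not need to be supported.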

      1. PIG-1946_3.patch
        14 kB
        Bill Graham
      2. PIG-1946_2.patch
        12 kB
        Bill Graham
      3. PIG-1946_1.patch
        6 kB
        Bill Graham

        Activity

        Bill Graham created issue -
        Bill Graham made changes -
        Field Original Value New Value
        Description (code sample re-wrapped from two lines to three; wording otherwise unchanged)
        Bill Graham made changes -
        Attachment: (none) -> PIG-1946_1.patch [ 12477731 ]
        Bill Graham made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Fix Version/s: (none) -> 0.10 [ 12316246 ]
        Thejas M Nair made changes -
        Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Bill Graham made changes -
        Attachment: (none) -> PIG-1946_2.patch [ 12486502 ]
        Bill Graham made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Release Note: (none) -> More friendly HBaseStorage constructor syntax.
        Bill Graham made changes -
        Attachment: (none) -> PIG-1946_3.patch [ 12486643 ]
        Dmitriy V. Ryaboy made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Resolution: (none) -> Fixed [ 1 ]
        Dmitriy V. Ryaboy made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]

          People

          • Assignee:
            Bill Graham
            Reporter:
            Bill Graham
          • Votes:
            0
            Watchers:
            2
