Sqoop
  1. Sqoop
  2. SQOOP-411

Precompile Pattern for replacement of Hive delimiters

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.1-incubating
    • Component/s: None
    • Labels:
      None

      Description

      The method FieldFormatter.hiveStringReplaceDelims is potentially called millions of times so precompiling the replacement pattern makes sense.

      1. SQOOP-411.1.patch
        3 kB
        Lars Francke
      2. SQOOP-411.2.patch
        1 kB
        Lars Francke

        Activity

        Hide
        Lars Francke added a comment -

        Patch to use a precompiled Pattern.

        I've also removed a deprecated class from the cloudera Namespace and used the new one instead. I hope that's okay, couldn't find any decisions on this.

        Show
        Lars Francke added a comment - Patch to use a precompiled Pattern. I've also removed a deprecated class from the cloudera Namespace and used the new one instead. I hope that's okay, couldn't find any decisions on this.
        Hide
        Jarek Jarcec Cecho added a comment -

        Hi Lars,
        can you please upload your patch to review board (https://reviews.apache.org/) so that we can review it and get it committed into trunk? Please use "/" as a "Base directory" on new review request page.

        Thank you,
        Jarcec

        Show
        Jarek Jarcec Cecho added a comment - Hi Lars, can you please upload your patch to review board ( https://reviews.apache.org/ ) so that we can review it and get it committed into trunk? Please use "/" as a "Base directory" on new review request page. Thank you, Jarcec
        Hide
        Lars Francke added a comment -

        Okay, thanks.

        I tried to submit a request for the "Sqoop" repository unfortunately I get a 500 error:

        Something broke! (Error 500)

        It appears something broke when you tried to go to here. This is either a bug in Review Board or a server configuration error. Please report this to your administrator.

        Any ideas?

        Show
        Lars Francke added a comment - Okay, thanks. I tried to submit a request for the "Sqoop" repository unfortunately I get a 500 error: Something broke! (Error 500) It appears something broke when you tried to go to here. This is either a bug in Review Board or a server configuration error. Please report this to your administrator. Any ideas?
        Hide
        Jarek Jarcec Cecho added a comment -

        Hi Lars,
        thank you very much for your time. I've looked on your patch and I believe that problem might be caused by extra lines above the patch itself (first 10 lines seems as an email fragment to me).

        I've noticed that you've removed the explicit module paths on various places in the code (for example you've changed com.cloudera.sqoop.lib.DelimiterSet to DelimiterSet). I know that it's very confusing at the moment, but the reason behind having the explicit full path is to be binary compatible with versions before 1.4.0, so I need to ask you to put full paths back.

        Other than that your patch seems very nice and I would like to see it committed. Might I ask you to fix this two small nits and upload it to review board for review process?

        Jarcec

        Show
        Jarek Jarcec Cecho added a comment - Hi Lars, thank you very much for your time. I've looked on your patch and I believe that problem might be caused by extra lines above the patch itself (first 10 lines seems as an email fragment to me). I've noticed that you've removed the explicit module paths on various places in the code (for example you've changed com.cloudera.sqoop.lib.DelimiterSet to DelimiterSet). I know that it's very confusing at the moment, but the reason behind having the explicit full path is to be binary compatible with versions before 1.4.0, so I need to ask you to put full paths back. Other than that your patch seems very nice and I would like to see it committed. Might I ask you to fix this two small nits and upload it to review board for review process? Jarcec
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3357/
        -----------------------------------------------------------

        Review request for Sqoop.

        Summary
        -------

        Introduces a precompiled Pattern for the Hive delimiter chars

        This addresses bug SQOOP-411.
        https://issues.apache.org/jira/browse/SQOOP-411

        Diffs


        /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835

        Diff: https://reviews.apache.org/r/3357/diff

        Testing
        -------

        All tests succeed

        Thanks,

        Lars

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3357/ ----------------------------------------------------------- Review request for Sqoop. Summary ------- Introduces a precompiled Pattern for the Hive delimiter chars This addresses bug SQOOP-411 . https://issues.apache.org/jira/browse/SQOOP-411 Diffs /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835 Diff: https://reviews.apache.org/r/3357/diff Testing ------- All tests succeed Thanks, Lars
        Hide
        Lars Francke added a comment -

        Hey,

        I was developing against the Apache Git mirrors but Review Board doesn't like those Git patches which is annoying but I've created a patch agains SVN and created this review request: https://reviews.apache.org/r/3357/

        Thanks for looking at it.

        Cheers

        Show
        Lars Francke added a comment - Hey, I was developing against the Apache Git mirrors but Review Board doesn't like those Git patches which is annoying but I've created a patch agains SVN and created this review request: https://reviews.apache.org/r/3357/ Thanks for looking at it. Cheers
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3357/#review4178
        -----------------------------------------------------------

        Ship it!

        • Alex

        On 2012-01-03 16:00:41, Lars Francke wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3357/

        -----------------------------------------------------------

        (Updated 2012-01-03 16:00:41)

        Review request for Sqoop.

        Summary

        -------

        Introduces a precompiled Pattern for the Hive delimiter chars

        This addresses bug SQOOP-411.

        https://issues.apache.org/jira/browse/SQOOP-411

        Diffs

        -----

        /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835

        Diff: https://reviews.apache.org/r/3357/diff

        Testing

        -------

        All tests succeed

        Thanks,

        Lars

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3357/#review4178 ----------------------------------------------------------- Ship it! Alex On 2012-01-03 16:00:41, Lars Francke wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3357/ ----------------------------------------------------------- (Updated 2012-01-03 16:00:41) Review request for Sqoop. Summary ------- Introduces a precompiled Pattern for the Hive delimiter chars This addresses bug SQOOP-411 . https://issues.apache.org/jira/browse/SQOOP-411 Diffs ----- /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835 Diff: https://reviews.apache.org/r/3357/diff Testing ------- All tests succeed Thanks, Lars
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3357/#review4179
        -----------------------------------------------------------

        Ship it!

        Hi Lars,
        thank you very much for your time on this issue. Patch looks good. I've noticed that you've already uploaded it to JIRA, so I'll commit it shortly.

        Jarcec

        • Jarek

        On 2012-01-03 16:00:41, Lars Francke wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3357/

        -----------------------------------------------------------

        (Updated 2012-01-03 16:00:41)

        Review request for Sqoop.

        Summary

        -------

        Introduces a precompiled Pattern for the Hive delimiter chars

        This addresses bug SQOOP-411.

        https://issues.apache.org/jira/browse/SQOOP-411

        Diffs

        -----

        /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835

        Diff: https://reviews.apache.org/r/3357/diff

        Testing

        -------

        All tests succeed

        Thanks,

        Lars

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3357/#review4179 ----------------------------------------------------------- Ship it! Hi Lars, thank you very much for your time on this issue. Patch looks good. I've noticed that you've already uploaded it to JIRA, so I'll commit it shortly. Jarcec Jarek On 2012-01-03 16:00:41, Lars Francke wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3357/ ----------------------------------------------------------- (Updated 2012-01-03 16:00:41) Review request for Sqoop. Summary ------- Introduces a precompiled Pattern for the Hive delimiter chars This addresses bug SQOOP-411 . https://issues.apache.org/jira/browse/SQOOP-411 Diffs ----- /src/java/org/apache/sqoop/lib/FieldFormatter.java 1226835 Diff: https://reviews.apache.org/r/3357/diff Testing ------- All tests succeed Thanks, Lars
        Hide
        Jarek Jarcec Cecho added a comment -

        Patch committed. Thank you very much Lars!

        Jarcec

        Show
        Jarek Jarcec Cecho added a comment - Patch committed. Thank you very much Lars! Jarcec
        Hide
        Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6 #73 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6/73/)
        SQOOP-411. Precompile Pattern for replacement of Hive delimiters

        (Lars Francke via Jarek Jarcec Cecho)

        jarcec : http://svn.apache.org/viewvc/?view=rev&rev=1226881
        Files :

        • /incubator/sqoop/trunk/src/java/org/apache/sqoop/lib/FieldFormatter.java
        Show
        Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6 #73 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6/73/ ) SQOOP-411 . Precompile Pattern for replacement of Hive delimiters (Lars Francke via Jarek Jarcec Cecho) jarcec : http://svn.apache.org/viewvc/?view=rev&rev=1226881 Files : /incubator/sqoop/trunk/src/java/org/apache/sqoop/lib/FieldFormatter.java

          People

          • Assignee:
            Lars Francke
            Reporter:
            Lars Francke
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development