Avro
  1. Avro
  2. AVRO-670

Allow DataFileWriteTool to accept schema files as input

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      The "fromjson" tool now requires either a --schema or --schema-file command-line argument to specify the schema. Previously, the schema was to be specified as the first argument.

      Description

      For non-trivial schemas, it's difficult to pass them inline as a command line argument. I made a patch to use two different arguments: instead of having the first argument be the schema you would now use -schema-file file or -schema schema and then have one other argument (the input JSON file)

      1. AVRO-670.patch
        6 kB
        Ron Bodkin
      2. datafilewritefile.patch
        3 kB
        Ron Bodkin

        Activity

        Hide
        Ron Bodkin added a comment -

        The patch

        Show
        Ron Bodkin added a comment - The patch
        Hide
        Philip Zeyliger added a comment -

        Hi Ron,

        The idea and implementation look good. We should note that it's an incompatible change in terms of the tools command line API, which I think is fine, but should be noted.

        I couldn't get the patch to apply cleanly. Typically (see https://cwiki.apache.org/AVRO/how-to-contribute.html) patches should be generated at top-level (so, lang/java/... should be the path that's being patched). Many folks name their patches AVRO-670.patch, too, though I'm not a stickler there. I also recommend turning off Eclipse's autoimport. I think our style guide discourages star imports (e.g., +import joptsimple.*; is not something your patch should have introduced). Besides that, "patch" gave me a rejects file---how did you generate your patch?

        I was slightly surprised that 'lang/java/src/test/bin/test_tools.sh' (which is run by the ant target 'test-tools') doesn't exercise this code. Would be good to make sure that the tests for this code don't need any modification. (There's a java test somewhere, but it might not exercise the command-line parsing code; I haven't looked.)

        Could you upload a new patch without the import changes and re-generated against the root of the repo?

        Thanks!

        Show
        Philip Zeyliger added a comment - Hi Ron, The idea and implementation look good. We should note that it's an incompatible change in terms of the tools command line API, which I think is fine, but should be noted. I couldn't get the patch to apply cleanly. Typically (see https://cwiki.apache.org/AVRO/how-to-contribute.html ) patches should be generated at top-level (so, lang/java/... should be the path that's being patched). Many folks name their patches AVRO-670 .patch, too, though I'm not a stickler there. I also recommend turning off Eclipse's autoimport. I think our style guide discourages star imports (e.g., +import joptsimple.*; is not something your patch should have introduced). Besides that, "patch" gave me a rejects file---how did you generate your patch? I was slightly surprised that 'lang/java/src/test/bin/test_tools.sh' (which is run by the ant target 'test-tools') doesn't exercise this code. Would be good to make sure that the tests for this code don't need any modification. (There's a java test somewhere, but it might not exercise the command-line parsing code; I haven't looked.) Could you upload a new patch without the import changes and re-generated against the root of the repo? Thanks!
        Hide
        Ron Bodkin added a comment -

        I am working on fixing the one test that actually does test this code (it is indeed incompatible). I'll submit an updated patch with that and the imports less modified. I just used svn diff to generate the patch.

        Show
        Ron Bodkin added a comment - I am working on fixing the one test that actually does test this code (it is indeed incompatible). I'll submit an updated patch with that and the imports less modified. I just used svn diff to generate the patch.
        Hide
        Ron Bodkin added a comment -

        Here's a revised patch from top-level, including updated tests that test the new argument as well.

        Show
        Ron Bodkin added a comment - Here's a revised patch from top-level, including updated tests that test the new argument as well.
        Hide
        Philip Zeyliger added a comment -

        Hi Ron,

        Thanks for your contribution! (And congratulations on your first contribution to AVRO.) I've committed it.

        I made two very minor changes:

        I fixed one "checkstyle" bug ("ant test" runs checkstyle)

        [checkstyle] /data/6/philip/avro-svn/lang/java/src/java/org/apache/avro/tool/DataFileWriteTool.java:121:69: Redundant throws: 'FileNotFoundException' is subclass of 'IOException'.

        I also added a "--schema" to src/test/bin/test_tools.sh in one place, to fix a test failure.

        Show
        Philip Zeyliger added a comment - Hi Ron, Thanks for your contribution! (And congratulations on your first contribution to AVRO.) I've committed it. I made two very minor changes: I fixed one "checkstyle" bug ("ant test" runs checkstyle) [checkstyle] /data/6/philip/avro-svn/lang/java/src/java/org/apache/avro/tool/DataFileWriteTool.java:121:69: Redundant throws: 'FileNotFoundException' is subclass of 'IOException'. I also added a "--schema" to src/test/bin/test_tools.sh in one place, to fix a test failure.

          People

          • Assignee:
            Ron Bodkin
            Reporter:
            Ron Bodkin
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development