Crunch (Retired) / CRUNCH-543

AvroPathPerKeyTarget copy nested subdirectories

Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • None
    • 0.14.0
    • IO
    • None

    Description

      When AvroPathPerKeyTarget writes each record to a subpath of the output directory named by its String key, the key may contain path separators and therefore name several nested subfolders:

      // The key contains a "/" and therefore names a nested subfolder (foo/bar).
      Pair<String, String> kv = new Pair<String, String>("foo/bar", "value");
      PTable<String, String> kvs = pipeline.create(Arrays.asList(kv), Avros.tableOf(Avros.strings(), Avros.strings()));
      PTables.asPTable(kvs).write(new AvroPathPerKeyTarget("output"));

      This throws the error:
      java.io.IOException: java.lang.IllegalArgumentException: Reducer output name 'bar' cannot be parsed
      at org.apache.crunch.impl.mr.exec.CrunchJobHooks$CompletionHook.handleMultiPaths(CrunchJobHooks.java:92)
      ...

      To support keys that define multiple nested subfolders, the handleOutputs method in AvroPathPerKeyTarget would need to copy subfolders recursively. Currently it only checks the first level of the output directory.
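
      The recursive handling described above can be sketched roughly as follows. This is only an illustration on a local file system using java.nio.file, not Hadoop's FileSystem API and not the code in any of the attached patches; the class NestedOutputMover and the method moveNested are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not part of Crunch. */
public class NestedOutputMover {

    /**
     * Moves every regular file found at any depth under srcRoot to the same
     * relative location under dstRoot, creating intermediate directories as
     * needed. Returns the relative paths of the moved files, sorted.
     */
    public static List<String> moveNested(Path srcRoot, Path dstRoot) throws IOException {
        // Collect the files first so the tree is not modified while it is being walked.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(srcRoot)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        List<String> moved = new ArrayList<>();
        for (Path file : files) {
            // For a key such as "foo/bar" the relative path is foo/bar/<part file>.
            Path relative = srcRoot.relativize(file);
            Path target = dstRoot.resolve(relative);
            Files.createDirectories(target.getParent());
            Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
            moved.add(relative.toString().replace('\\', '/'));
        }
        Collections.sort(moved);
        return moved;
    }
}
```

      The point of the sketch is that the walk descends to any depth instead of listing only the first level, so a part file written under foo/bar keeps its full relative path when it is moved into the final output directory.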

      Attachments

        1. CRUNCH-543.patch
          5 kB
          Josh Wills
        2. CRUNCH-543b.patch
          2 kB
          Adric Eckstein
        3. CRUNCH-543c.patch
          3 kB
          Adric Eckstein

            People

              jwills Josh Wills
              aeckstein Adric Eckstein
              Votes: 0
              Watchers: 4
