[BEAM-2267] Final files for WordCount not appearing with Apex on YARN - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: P2
Resolution: Done
Affects Version/s: None
Fix Version/s: Not applicable
Component/s: runner-apex
Labels: None

Description

When I run WordCount with the Apex runner on a YARN cluster (specifically Dataproc, reading from and writing to GCS), the word counts are all written to temporary files, but those files are never moved to their final destination.

Hadoop version 2.7.3
Beam RC 2.0.0

Steps to reproduce:

1. Instantiate the archetype (see the appendix below)
2. Build the uber jar: mvn --settings ../beamrc-settings.xml clean package -P apex-runner
3. SCP the jar to the master (or wherever you'd like to launch from)
4. Run: java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner --embeddedExecution=false --inputFile=gs://apache-beam-samples/shakespeare/winterstale-personae --output=SOMEWHERE
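
To confirm the symptom after step 4, one option is to list the output location and check whether any final shards exist. The following is a minimal sketch, not part of the original report. It assumes the final files use Beam's default shard suffix (-SSSSS-of-NNNNN); the bucket, the output prefix, and the temporary file names are hypothetical placeholders, and the listing would come from whatever tool you use to list the output directory.

```python
import re

# Assumption: final shards end with Beam's default "-SSSSS-of-NNNNN" suffix.
FINAL_SHARD = re.compile(r"-\d{5}-of-\d{5}$")


def split_outputs(paths, output_prefix):
    """Split a file listing into (final_shards, leftovers) for one output prefix."""
    final, leftovers = [], []
    for path in paths:
        if path.startswith(output_prefix) and FINAL_SHARD.search(path):
            final.append(path)
        else:
            # Anything else under the output location, e.g. temporary files.
            leftovers.append(path)
    return final, leftovers


# Hypothetical listing of the output location after the job finishes.
listing = [
    "gs://my-bucket/out/.temp-beam-example/shard-a",
    "gs://my-bucket/out/.temp-beam-example/shard-b",
]
final, leftovers = split_outputs(listing, "gs://my-bucket/out/counts")
print(len(final), len(leftovers))  # prints "0 2"
```

A listing with leftovers but no final shards matches the behavior described above: the counts were written, but never finalized under the requested output prefix.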

Appendix: steps to instantiate the RC archetype

Create an RC-specific beamrc-settings.xml, replacing RC_REPO with the URL of the release candidate repository:

<settings>
  <profiles>
    <profile>
      <id>beam-2.0.0</id>
      <repositories>
        <repository>
          <!-- This id _must_ be "archetype" -->
          <id>archetype</id>
          <url>RC_REPO</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
 
  <activeProfiles>
    <activeProfile>beam-2.0.0</activeProfile>
  </activeProfiles>
</settings>

Then instantiate the archetype:

mvn archetype:generate \
      --settings beamrc-settings.xml \
      -D archetypeCatalog=internal \
      -D archetypeGroupId=org.apache.beam \
      -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
      -D archetypeVersion=2.0.0 \
      -D groupId=beamrc \
      -D artifactId=word-count-beam \
      -D version="0.1" \
      -D package=beamrc \
      -D interactiveMode=false

People

Assignee: Unassigned

Reporter: Kenneth Knowles

Votes: 0

Watchers: 2

Dates

Created: 11/May/17 22:39

Updated: 16/May/20 13:57

Resolved: 10/Feb/18 05:33