Beam / BEAM-2267

Final files for WordCount not appearing with Apex on YARN

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: runner-apex
    • Labels: None

    Description

      When I run WordCount with the Apex runner on a YARN cluster (specifically Dataproc, reading from and writing to GCS), the word counts are all written to temporary files, but those files are never moved to their final destination.
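
      One way to confirm the symptom is to summarize a listing of the output location. The sketch below is not part of the original report. It assumes that temporary files live under a path containing ".temp-beam-" and that final shards follow the default "-SSSSS-of-NNNNN" naming. Both patterns are assumptions and should be adjusted to what the listing actually shows. The function name summarize_listing is hypothetical.

```shell
#!/bin/sh
# Reads object paths on stdin, one per line, and counts how many look like
# temporary files versus final output shards.
# ASSUMPTION: temporary paths contain ".temp-beam-"; final shards end in
# "-SSSSS-of-NNNNN" (five digits each). Adjust both patterns as needed.
summarize_listing() {
  temp=0
  final=0
  while IFS= read -r path; do
    case "$path" in
      *.temp-beam-*)
        temp=$((temp + 1)) ;;
      *-[0-9][0-9][0-9][0-9][0-9]-of-[0-9][0-9][0-9][0-9][0-9]*)
        final=$((final + 1)) ;;
    esac
  done
  printf 'temp=%d final=%d\n' "$temp" "$final"
}
```

      For example, a recursive listing of the output location (such as one produced by gsutil ls -r) could be piped into summarize_listing. A nonzero temp count with final=0 would match the behavior described here.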

      Hadoop version 2.7.3
      Beam RC 2.0.0

      Steps to repro:

      1. Instantiate archetype (see below)
      2. Build uber jar: mvn --settings ../beamrc-settings.xml clean package -P apex-runner
      3. SCP to master (or wherever you'd like to launch from)
      4. java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner --embeddedExecution=false --inputFile=gs://apache-beam-samples/shakespeare/winterstale-personae --output=SOMEWHERE
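
      Steps 2 and 4 can be wrapped in a small script so the repro is easy to rerun. This is a sketch only, not something from the original report. The function names and the DRY_RUN switch are hypothetical, and the output location is passed as an argument because the report leaves it as SOMEWHERE. The input option is spelled --inputFile, which is how the WordCount example declares it.

```shell
#!/bin/sh
# Sketch of repro steps 2 and 4. With DRY_RUN=1 (the default) each command is
# printed instead of executed, so the script can be inspected safely first.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 2: build the uber jar with the Apex runner profile.
build_uber_jar() {
  run mvn --settings ../beamrc-settings.xml clean package -P apex-runner
}

# Step 4: launch WordCount on the cluster.
# $1 is the output location (left as SOMEWHERE in the report).
launch_wordcount() {
  run java -cp word-count-beam-0.1.jar beamrc.WordCount \
    --runner=ApexRunner \
    --embeddedExecution=false \
    --inputFile=gs://apache-beam-samples/shakespeare/winterstale-personae \
    --output="$1"
}
```

      Calling build_uber_jar in the generated project directory and then launch_wordcount with an output location on the master node prints the two commands. Setting DRY_RUN=0 would execute them instead.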

      Appendix: steps to instantiate RC archetype:

      Build an RC-specific beamrc-settings.xml:

      <settings>
        <profiles>
          <profile>
            <id>beam-2.0.0</id>
            <repositories>
              <repository>
                <!-- This id _must_ be "archetype" -->
                <id>archetype</id>
                <url>RC_REPO</url>
              </repository>
            </repositories>
          </profile>
        </profiles>
       
        <activeProfiles>
          <activeProfile>beam-2.0.0</activeProfile>
        </activeProfiles>
      </settings>
      

      And then instantiate like so:

      mvn archetype:generate \
            --settings beamrc-settings.xml \
            -D archetypeCatalog=internal \
            -D archetypeGroupId=org.apache.beam \
            -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
            -D archetypeVersion=2.0.0 \
            -D groupId=beamrc \
            -D artifactId=word-count-beam \
            -D version="0.1" \
            -D package=beamrc \
            -D interactiveMode=false
      

          People

            Assignee: Unassigned
            Reporter: Kenneth Knowles (kenn)