Beam / BEAM-10430

Can't run WordCount on EMR With Flink Runner via YARN

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P3
    • Resolution: Not A Problem
    • Affects Version/s: 2.22.0
    • Fix Version/s: Missing
    • Component/s: examples-java, runner-flink
    • Environment: AWS EMR 5.30.0 running Spark 2.4.5, Flink 1.10.0

    Description

      1) I set up the WordCount project as detailed on the Beam website:

      mvn archetype:generate \
        -DarchetypeGroupId=org.apache.beam \
        -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
        -DarchetypeVersion=2.22.0 \
        -DgroupId=org.example \
        -DartifactId=word-count-beam \
        -Dversion="0.1" \
        -Dpackage=org.apache.beam.examples \
        -DinteractiveMode=false

      2) Built the bundled jar with: mvn clean package -Pflink-runner

      3) Ran the application on AWS EMR 5.30.0 with Flink 1.10.0

      flink run -m yarn-cluster -yid <yarn_application_id> -p 4 -c org.apache.beam.examples.WordCount word-count-beam-bundled-0.1.jar --runner=FlinkRunner --inputFile <path_in_s3_of_input_file> --output <path_in_s3_of_output_dir>

      4) The launch failed with the following exception stack trace:

      java.util.ServiceConfigurationError: com.fasterxml.jackson.databind.Module: Provider com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule not a subtype
          at java.util.ServiceLoader.fail(ServiceLoader.java:239)
          at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
          at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
          at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
          at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
          at com.fasterxml.jackson.databind.ObjectMapper.findModules(ObjectMapper.java:1054)
          at org.apache.beam.sdk.options.PipelineOptionsFactory.<clinit>(PipelineOptionsFactory.java:471)
          at org.apache.beam.examples.WordCount.main(WordCount.java:190)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
          at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
          at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
          at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
          at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
          at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
          at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
          at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
          at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
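
      A note on the error in step 4: java.util.ServiceLoader reports "not a subtype" when the provider class it loads is not assignable to the service interface it was asked for. One possible cause (an assumption here, not something confirmed in this report) is that more than one copy of Jackson is visible at launch, for example one bundled in word-count-beam-bundled-0.1.jar and another on the cluster's classpath. The sketch below uses only the JDK to list every location that provides the Jackson service descriptor and the Module class file. The class name ClasspathDuplicateFinder is hypothetical, and the program is only informative when run with the same classpath as the failing launch.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Lists every classpath location that provides a given resource (illustrative sketch). */
public class ClasspathDuplicateFinder {

    // Service descriptor that ServiceLoader reads to discover Jackson modules.
    static final String SERVICE_FILE =
        "META-INF/services/com.fasterxml.jackson.databind.Module";

    // Class file of the service interface named in the error message.
    static final String MODULE_CLASS =
        "com/fasterxml/jackson/databind/Module.class";

    /** Returns all URLs under which the given loader can see the named resource. */
    static List<URL> locate(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        // Prefer the context class loader; fall back to the system class loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        for (String name : new String[] {SERVICE_FILE, MODULE_CLASS}) {
            List<URL> hits = locate(loader, name);
            System.out.println(name + " found in " + hits.size() + " location(s):");
            for (URL url : hits) {
                System.out.println("  " + url);
            }
        }
    }
}
```

      More than one location for either resource would be consistent with a Jackson version conflict, though it does not prove one.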

            People

              Assignee: Etienne Chauchot (echauchot)
              Reporter: Shashi (sassin8080)
              Votes: 5
              Watchers: 11

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 7h 10m