  1. Flink
  2. FLINK-9143

Restart strategy defined in flink-conf.yaml is ignored

Details

    Description

      The restart strategy defined in flink-conf.yaml is disregarded when the user enables checkpointing.

      Steps to reproduce:

      1. Download the Flink distribution (1.4.2) and update flink-conf.yaml:
       
      restart-strategy: none
      state.backend: rocksdb
      state.backend.fs.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-metadata
      state.backend.rocksdb.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-rocksdb
       
      2. Create a new Java project as described at https://ci.apache.org/projects/flink/flink-docs-release-1.4/quickstart/java_api_quickstart.html
      Here is the code:
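      import java.util.Arrays;

      import org.apache.flink.api.common.functions.MapFunction;
      import org.apache.flink.streaming.api.CheckpointingMode;
      import org.apache.flink.streaming.api.datastream.DataStream;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;

      public class FailedJob
      {
          static final Logger LOGGER = LoggerFactory.getLogger(FailedJob.class);

          public static void main( String[] args ) throws Exception
          {
              final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              // Checkpointing is enabled; no restart strategy is set in code.
              env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
              DataStream<String> stream = env.fromCollection(Arrays.asList("test"));
              stream.map(new MapFunction<String, String>(){
                  @Override
                  public String map(String obj)
                  {
                      // Fail on the first record so the restart strategy is exercised.
                      throw new NullPointerException("NPE");
                  }
              });
              env.execute("Failed job");
          }
      }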
       
      3. Compile with mvn clean package and submit the resulting jar to the cluster
       
      4. Go to the Job Manager configuration in the WebUI and ensure the settings from flink-conf.yaml are there (screenshot attached)
       
      5. Go to the job's configuration and look at the Execution Configuration section
       
      Expected result: restart strategy as defined in flink-conf.yaml
       
      Actual result: Restart with fixed delay (10000 ms). #2147483647 restart attempts.
       
       
      See the attached screenshots and the jobmanager log (lines 1 and 31).
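
To make the gap between the expected and the actual result concrete, here is a minimal plain-Java sketch of the two resolution orders. It is a hypothetical model of the behaviour reported above, not Flink source code: the class name `RestartStrategyModel`, both method names and the string encoding of a strategy are made up for illustration. It also assumes that a strategy set explicitly in the job would take precedence, and that the cluster setting is honoured when checkpointing is off; the report does not verify either. The fixed-delay values are the ones shown in the "Actual result" line, where 2147483647 is `Integer.MAX_VALUE`.

```java
import java.util.Optional;

/** Hypothetical model of the behaviour reported in this issue; not Flink source code. */
public class RestartStrategyModel {

    /** Resolution the reporter expects: fall back to the value from flink-conf.yaml. */
    static String expected(Optional<String> jobLevel, String clusterLevel, boolean checkpointing) {
        // Checkpointing should not influence which restart strategy is chosen.
        return jobLevel.orElse(clusterLevel);
    }

    /** Resolution the reporter observes: enabling checkpointing swaps in a fixed-delay default. */
    static String observed(Optional<String> jobLevel, String clusterLevel, boolean checkpointing) {
        if (jobLevel.isPresent()) {
            // Assumption: a strategy set in the job itself wins.
            return jobLevel.get();
        }
        if (checkpointing) {
            // Values copied from the "Actual result" line of the report.
            return "fixed-delay(attempts=" + Integer.MAX_VALUE + ", delayMs=10000)";
        }
        // Assumption: without checkpointing the cluster setting is used.
        return clusterLevel;
    }

    public static void main(String[] args) {
        Optional<String> notSetInJob = Optional.empty();
        // "none" mirrors restart-strategy: none from step 1.
        System.out.println("expected: " + expected(notSetInJob, "none", true));
        System.out.println("observed: " + observed(notSetInJob, "none", true));
    }
}
```

With `restart-strategy: none` and checkpointing enabled, the first line prints `expected: none` and the second prints `observed: fixed-delay(attempts=2147483647, delayMs=10000)`, which matches what the WebUI shows in step 5.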
       

      Attachments

        1. execution_config.png
          100 kB
          Alex Smirnov
        2. jobmanager.log
          6 kB
          Alex Smirnov
        3. jobmanager.png
          126 kB
          Alex Smirnov

          People

            dwysakowicz Dawid Wysakowicz
            asmirnov Alex Smirnov
              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 10m