ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: timelineserver
    • Labels:
      None

      Activity

      ASikaria Atul Sikaria added a comment - - edited

      This was seen previously as well, in YARN-4814.

      The issue is in the writeEntities method of FileSystemTimelineWriter (https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java#L317). It calls getObjectMapper().writeValue(…), which, with the default configuration, calls flush() after every write:

       
      @Override
      public void writeValue(JsonGenerator jgen, Object value)
          throws IOException, JsonGenerationException, JsonMappingException
      {
          SerializationConfig config = copySerializationConfig();
          if (config.isEnabled(SerializationConfig.Feature.CLOSE_CLOSEABLE) && (value instanceof Closeable)) {
              _writeCloseableValue(jgen, value, config);
          } else {
              _serializerProvider.serializeValue(config, jgen, value, _serializerFactory);
              if (config.isEnabled(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE)) {
                  jgen.flush();
              }
          }
      }
      

      On filesystems that map flush() to a no-op or a trivial operation, this is not a big deal. But on filesystems where flush() is expensive, it becomes a bottleneck for the flow of timeline events.

      The fix is to set the property above (FLUSH_AFTER_WRITE_VALUE) to false, so that the JsonGenerator does not flush after every JSON write.
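
To illustrate why the flag matters, here is a stdlib-only sketch; it is not Jackson or Hadoop code. CountingStream, EventWriter and the flushAfterWrite flag are hypothetical names, and the flag only plays the role that FLUSH_AFTER_WRITE_VALUE plays in the mapper. It counts how many flush() calls reach the underlying stream under each policy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FlushPolicyDemo {
    /** Sink that counts flush() calls; stands in for a filesystem stream. */
    static class CountingStream extends ByteArrayOutputStream {
        int flushCalls = 0;

        @Override
        public void flush() throws IOException {
            flushCalls++;
            super.flush();
        }
    }

    /** Hypothetical writer; flushAfterWrite mimics FLUSH_AFTER_WRITE_VALUE. */
    static class EventWriter {
        private final OutputStream out;
        private final boolean flushAfterWrite;

        EventWriter(OutputStream out, boolean flushAfterWrite) {
            this.out = out;
            this.flushAfterWrite = flushAfterWrite;
        }

        void writeEvent(String json) throws IOException {
            out.write(json.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
            if (flushAfterWrite) {
                out.flush(); // one stream flush per event
            }
        }

        /** What a periodic timer task would call. */
        void periodicFlush() throws IOException {
            out.flush();
        }
    }

    /** Writes the given number of events, runs one timer flush, returns flush count. */
    static int countFlushes(boolean flushAfterWrite, int events) {
        CountingStream sink = new CountingStream();
        EventWriter writer = new EventWriter(sink, flushAfterWrite);
        try {
            for (int i = 0; i < events; i++) {
                writer.writeEvent("{\"event\":" + i + "}");
            }
            writer.periodicFlush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.flushCalls;
    }

    public static void main(String[] args) {
        System.out.println("flush after every write: " + countFlushes(true, 1000));
        System.out.println("timer-only flush:        " + countFlushes(false, 1000));
    }
}
```

With 1000 events, the per-write policy reaches the sink 1001 times (1000 writes plus the timer flush), while the timer-only policy reaches it once. How much that matters depends on what flush() costs on the target filesystem.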

      The stream is flushed by a timer thread at a configurable interval (10 seconds by default). As Jason Lowe pointed out in YARN-4814, the timer thread also needs to call flush() on the JsonGenerator to make sure the JSON serializer holds no buffered data, so that the hflush() in the timer thread actually flushes all the data seen so far.
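
The ordering in the timer thread can be sketched with a small stdlib-only model. This is an assumption-laden illustration, not the actual patch: DurableStream and BufferingGenerator are hypothetical stand-ins for the filesystem stream and the JsonGenerator, and the timer body is invoked directly instead of being scheduled.

```java
class TimerFlushOrderDemo {
    /** Stand-in for a filesystem stream: bytes count as durable only after hflush(). */
    static class DurableStream {
        private final StringBuilder pending = new StringBuilder();
        private final StringBuilder durable = new StringBuilder();

        void write(String s) {
            pending.append(s);
        }

        void hflush() {
            durable.append(pending);
            pending.setLength(0);
        }

        String durableContents() {
            return durable.toString();
        }
    }

    /** Stand-in for a JSON generator that buffers until flush() is called. */
    static class BufferingGenerator {
        private final DurableStream out;
        private final StringBuilder buffer = new StringBuilder();

        BufferingGenerator(DurableStream out) {
            this.out = out;
        }

        void writeValue(String json) {
            buffer.append(json).append('\n');
        }

        void flush() {
            out.write(buffer.toString());
            buffer.setLength(0);
        }
    }

    /** Writes two events, runs one timer tick, and returns what became durable. */
    static String durableAfterTimer(boolean flushGeneratorFirst) {
        DurableStream stream = new DurableStream();
        BufferingGenerator gen = new BufferingGenerator(stream);
        gen.writeValue("{\"event\":1}");
        gen.writeValue("{\"event\":2}");

        // Body of the periodic timer task.
        if (flushGeneratorFirst) {
            gen.flush(); // push generator-buffered data into the stream
        }
        stream.hflush(); // then make the stream contents durable
        return stream.durableContents();
    }

    public static void main(String[] args) {
        System.out.println("hflush only:          [" + durableAfterTimer(false) + "]");
        System.out.println("flush, then hflush:   [" + durableAfterTimer(true) + "]");
    }
}
```

In this model, calling hflush() alone leaves both events sitting in the generator's buffer, so nothing becomes durable; flushing the generator first makes both events durable on the same timer tick.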

      ASikaria Atul Sikaria added a comment -

      Attached patch that should address this issue.

      hadoopqa Hadoop QA added a comment -
      -1 overall



      Vote | Subsystem | Runtime | Comment
      0 | reexec | 0m 19s | Docker mode activated.
      +1 | @author | 0m 0s | The patch does not contain any @author tags.
      -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
      +1 | mvninstall | 7m 36s | trunk passed
      +1 | compile | 0m 27s | trunk passed
      +1 | checkstyle | 0m 22s | trunk passed
      +1 | mvnsite | 0m 30s | trunk passed
      +1 | mvneclipse | 0m 13s | trunk passed
      +1 | findbugs | 0m 54s | trunk passed
      +1 | javadoc | 0m 27s | trunk passed
      +1 | mvninstall | 0m 26s | the patch passed
      +1 | compile | 0m 25s | the patch passed
      +1 | javac | 0m 25s | the patch passed
      +1 | checkstyle | 0m 15s | the patch passed
      +1 | mvnsite | 0m 28s | the patch passed
      +1 | mvneclipse | 0m 12s | the patch passed
      +1 | whitespace | 0m 0s | The patch has no whitespace issues.
      +1 | findbugs | 1m 1s | the patch passed
      +1 | javadoc | 0m 25s | the patch passed
      +1 | unit | 2m 16s | hadoop-yarn-common in the patch passed.
      +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings.
      Total runtime: 17m 50s



      Subsystem | Report/Notes
      Docker | Image:yetus/hadoop:a9ad5d6
      JIRA Issue | YARN-5915
      JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12839675/YARN-5915.01.patch
      Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname | Linux 59dc9cc3a3da 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool | maven
      Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision | trunk / 7584fbf
      Default Java | 1.8.0_111
      findbugs | v3.0.0
      Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13977/testReport/
      modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
      Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13977/console
      Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      djp Junping Du added a comment -

      Thanks Atul Sikaria for reporting the issue and delivering a fix. I have added you to the YARN contributor list and assigned this JIRA to you.
      The fix sounds reasonable to me. CC Xuan Gong and Jason Lowe, who worked on YARN-4814 before, for more comments.

      djp Junping Du added a comment -

      Also CC Li Lu.

      jlowe Jason Lowe added a comment -

      Sorry for missing this in YARN-4814. I thought the fix was equivalent, because calling flush() on the JSON generator won't actually flush the underlying stream if JsonGenerator.Feature.FLUSH_PASSED_TO_STREAM is set to false:

          @Override
          public final void flush()
              throws IOException
          {
              _flushBuffer();
              if (_outputStream != null) {
                  if (isEnabled(Feature.FLUSH_PASSED_TO_STREAM)) {
                      _outputStream.flush();
                  }
              }
          }
      

      Could you elaborate on how the filesystem flush is getting called? The mapper code calls jgen.flush() on every write, but how does that reach the filesystem if FLUSH_PASSED_TO_STREAM is false on the JSON generator? If that property is false, it appears that flushing the JSON generator simply writes to the underlying output stream without flushing it, which seems like exactly what we want. Is the output stream unbuffered, such that the write acts like a flush?
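
The question can be made concrete with a stdlib-only model, offered as a sketch under assumptions and not as Jackson code. UnbufferedSink and BufferingGenerator are hypothetical; the generator's flush() is loosely modelled on the method quoted above, with the FLUSH_PASSED_TO_STREAM behaviour reduced to a boolean. It shows that a generator flush after every value still hands the sink one write() per value, even though the sink's own flush() is never called.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class GeneratorBufferDemo {
    /** Unbuffered sink: every write() call goes straight to the "filesystem". */
    static class UnbufferedSink extends OutputStream {
        int writeCalls = 0;
        int flushCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
        }

        @Override
        public void flush() {
            flushCalls++;
        }
    }

    /** Hypothetical generator that buffers values until flush() is called. */
    static class BufferingGenerator {
        private final OutputStream out;
        private final boolean flushPassedToStream;
        private final StringBuilder buffer = new StringBuilder();

        BufferingGenerator(OutputStream out, boolean flushPassedToStream) {
            this.out = out;
            this.flushPassedToStream = flushPassedToStream;
        }

        void writeValue(String json) {
            buffer.append(json).append('\n');
        }

        void flush() throws IOException {
            if (buffer.length() > 0) {
                byte[] bytes = buffer.toString().getBytes(StandardCharsets.UTF_8);
                buffer.setLength(0);
                out.write(bytes, 0, bytes.length); // drain the generator's buffer
            }
            if (flushPassedToStream) {
                out.flush();
            }
        }
    }

    /** Returns {writeCalls, flushCalls} seen by the sink. */
    static int[] run(boolean flushAfterEachValue, int events) {
        UnbufferedSink sink = new UnbufferedSink();
        BufferingGenerator gen = new BufferingGenerator(sink, false);
        try {
            for (int i = 0; i < events; i++) {
                gen.writeValue("{\"event\":" + i + "}");
                if (flushAfterEachValue) {
                    gen.flush();
                }
            }
            gen.flush(); // what the timer thread would do
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {sink.writeCalls, sink.flushCalls};
    }

    public static void main(String[] args) {
        int[] perValue = run(true, 100);
        int[] deferred = run(false, 100);
        System.out.println("per-value flush: writes=" + perValue[0] + " flushes=" + perValue[1]);
        System.out.println("deferred flush:  writes=" + deferred[0] + " flushes=" + deferred[1]);
    }
}
```

With 100 events, the per-value policy produces 100 write() calls on the sink and the deferred policy produces one; neither ever calls the sink's flush(). If the sink treats each write as expensive, the per-value policy is the costly one.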

      djp Junping Du added a comment -

      "Is the output stream unbuffered such that the write is acting like a flush?"

      That's what I suspect too. Can we assume that the output stream behaves consistently, buffered or unbuffered, in our case? _flushBuffer() makes that hard to rely on. It may be better to buffer at the JsonGenerator:

          protected final void _flushBuffer() throws IOException
          {
              int len = _outputTail - _outputHead;
              if (len > 0) {
                  int offset = _outputHead;
                  _outputTail = _outputHead = 0;
                  _writer.write(_outputBuffer, offset, len);
              }
          }
      
      jlowe Jason Lowe added a comment -

      Sure, if the output stream could be unbuffered, then buffering at the JSON generator is fine, and that's what the patch does.

      +1 for the patch, I'll commit tomorrow if no objections.

      djp Junping Du added a comment -

      Cool. +1 as well.

      jlowe Jason Lowe added a comment -

      Thanks to Atul Sikaria and to Junping Du for additional review! I committed this to trunk, branch-2, and branch-2.8.

      hudson Hudson added a comment -

      SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10928 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10928/)
      YARN-5915. ATS 1.5 FileSystemTimelineWriter causes flush() to be called (jlowe: rev f304ccae3c2e0849b0b0b24c4bfe7a3a1ec2bb94)

      • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java

        People

        • Assignee: Atul Sikaria (ASikaria)
        • Reporter: Atul Sikaria (ASikaria)
        • Votes: 0
        • Watchers: 5