  1. Flink
  2. FLINK-26586

FileSystem uses unbuffered read I/O

Details

    Description

      • I found that, at least when using LocalFileSystem on a Windows system, read I/O for loading a savepoint is unbuffered.
      • See the example stack [1].
      • i.e. to deserialize a single int (the readInt call in [1]), the stream goes into kernel mode once per byte and loads the bytes one by one
      • I coded a BufferedFSDataInputStreamWrapper that allows opting in to buffered reads on any FileSystem implementation.
      • In our setting, savepoint load is now 30 times faster.
      • I once saw a Jira ticket about improving savepoint load time in general (unfortunately I lost the link); maybe this approach can help with it.
      • I am not sure whether HDFS has the same problem.
      • I can contribute my implementation of a BufferedFSDataInputStreamWrapper which can be integrated in any 
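
      The attached BufferedFSDataInputStreamWrapper.java is not reproduced here. As a rough illustration of the idea only, the following is a minimal sketch of opt-in buffering around a seekable stream. SeekableInputStream, BufferedSeekableInputStream and ByteArraySeekableStream are hypothetical names made up for this sketch; SeekableInputStream merely stands in for a stream with seek/getPos such as Flink's FSDataInputStream, and the code is not the attached implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferedSeekSketch {

    /** Hypothetical stand-in for a seekable stream (seek + getPos). */
    abstract static class SeekableInputStream extends InputStream {
        abstract void seek(long pos) throws IOException;
        abstract long getPos() throws IOException;
    }

    /** Hypothetical wrapper that adds a read buffer in front of any seekable stream. */
    static final class BufferedSeekableInputStream extends SeekableInputStream {
        private final SeekableInputStream in;
        private final byte[] buf;
        private int pos;   // index of the next byte to hand out from buf
        private int limit; // number of valid bytes currently in buf

        BufferedSeekableInputStream(SeekableInputStream in, int bufferSize) {
            this.in = in;
            this.buf = new byte[bufferSize];
        }

        /** Refills the buffer with one bulk read; returns false at end of stream. */
        private boolean fill() throws IOException {
            int n = in.read(buf, 0, buf.length);
            pos = 0;
            limit = Math.max(n, 0);
            return n > 0;
        }

        @Override public int read() throws IOException {
            if (pos >= limit && !fill()) return -1;
            return buf[pos++] & 0xFF;
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (pos >= limit && !fill()) return -1;
            int n = Math.min(len, limit - pos);
            System.arraycopy(buf, pos, b, off, n);
            pos += n;
            return n;
        }

        @Override void seek(long target) throws IOException {
            pos = 0;   // buffered bytes no longer match the new position,
            limit = 0; // so drop them before repositioning
            in.seek(target);
        }

        @Override long getPos() throws IOException {
            // The wrapped stream is ahead by the bytes still unread in the buffer.
            return in.getPos() - (limit - pos);
        }

        @Override public void close() throws IOException { in.close(); }
    }

    /** In-memory "file" that counts how often it is actually touched. */
    static final class ByteArraySeekableStream extends SeekableInputStream {
        private final byte[] data;
        private int pos;
        int calls;

        ByteArraySeekableStream(byte[] data) { this.data = data; }

        @Override public int read() {
            calls++;
            return pos < data.length ? data[pos++] & 0xFF : -1;
        }

        @Override public int read(byte[] b, int off, int len) {
            calls++;
            if (len == 0) return 0;
            if (pos >= data.length) return -1;
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }

        @Override void seek(long p) { pos = (int) p; }
        @Override long getPos() { return pos; }
    }

    /** Returns {first byte, pos after it, byte after seek(50), pos after it, underlying calls}. */
    static long[] demo() {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        ByteArraySeekableStream raw = new ByteArraySeekableStream(data);
        BufferedSeekableInputStream in = new BufferedSeekableInputStream(raw, 16);
        try {
            long first = in.read();
            long posAfterFirst = in.getPos();
            in.seek(50);
            long afterSeek = in.read();
            long posAfterSeek = in.getPos();
            return new long[] {first, posAfterFirst, afterSeek, posAfterSeek, raw.calls};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

      The two points that a plain java.io.BufferedInputStream does not cover are handled in seek (the buffer is discarded) and getPos (the reported position is corrected by the bytes still sitting in the buffer).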

      [1] unbuffered reads stack:
      read:207, FileInputStream (java.io)
      read:68, LocalDataInputStream (org.apache.flink.core.fs.local)
      read:50, FSDataInputStreamWrapper (org.apache.flink.core.fs)
      read:42, ForwardingInputStream (org.apache.flink.runtime.util)
      readInt:390, DataInputStream (java.io)
      deserialize:80, BytePrimitiveArraySerializer (org.apache.flink.api.common.typeutils.base.array)
      next:298, FullSnapshotRestoreOperation$KeyGroupEntriesIterator (org.apache.flink.runtime.state.restore)
      next:273, FullSnapshotRestoreOperation$KeyGroupEntriesIterator (org.apache.flink.runtime.state.restore)
      restoreKVStateData:147, RocksDBFullRestoreOperation (org.apache.flink.contrib.streaming.state.restore)
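
      The effect visible in this stack can be reproduced without Flink. The sketch below uses only java.io classes; ReadCallCount, CountingStream and underlyingCalls are names made up for this illustration. It counts how many read requests reach the underlying stream when a DataInputStream reads ints directly versus through a BufferedInputStream. The exact unbuffered count depends on the JDK version, so no specific numbers are claimed here.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ReadCallCount {

    /** Counts every read request that reaches the wrapped stream. */
    static final class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads the given number of ints and returns how many reads hit the underlying stream. */
    static int underlyingCalls(boolean buffered, int ints) {
        // The ByteArrayInputStream plays the role of the file; each counted call
        // would be a system call on a real, unbuffered file stream.
        CountingStream raw = new CountingStream(new ByteArrayInputStream(new byte[ints * 4]));
        InputStream source = buffered ? new BufferedInputStream(raw, 8192) : raw;
        try (DataInputStream in = new DataInputStream(source)) {
            for (int i = 0; i < ints; i++) {
                in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + underlyingCalls(false, 1000));
        System.out.println("buffered:   " + underlyingCalls(true, 1000));
    }
}
```

      Without buffering, every readInt causes at least one request to the underlying stream; with buffering, a bulk read fills the buffer and subsequent readInt calls are served from memory.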

      Attachments

        1. BufferedFSDataInputStreamWrapper.java
          5 kB
          Matthias Schwalbe
        2. BufferedLocalFileSystem.java
          1 kB
          Matthias Schwalbe

          People

            Assignee: Unassigned
            Reporter: Matthias Schwalbe