  1. Hadoop HDFS
  2. HDFS-15693

Improve native code's performance when writing to HDFS


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: fuse-dfs, native
    • Labels: None

    Description

      For reads, we introduced direct buffers to communicate more efficiently between the JVM and the native code; readDirect and pReadDirect are already implemented in hdfs.c.

      Writes, on the other hand, still go through the JNI SetByteArrayRegion call, which copies the buffer in memory.

      This Jira is to explore what has to be done to start using direct buffers for writes as well.
      A short initial list, as I see it at the moment:

      • add a new StreamCapability for streams that want to support writes via direct buffers
        - implement this capability in DFSOutputStream and DFSStripedOutputStream
      • implement a writeDirect method on the native side

      fuse_dfs can benefit from this.
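
The two items listed above could look roughly like the following on the Java side. This is only a minimal sketch under stated assumptions: the capability string "out:writebytebuffer", the ByteBufferWritable interface, and SketchOutputStream are hypothetical names invented for illustration and do not exist in Hadoop. The direct buffer allocated in main stands in for what a native writeDirect would presumably hand over after wrapping the caller's memory with JNI NewDirectByteBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectWriteSketch {

  /** Hypothetical capability name; NOT defined in Hadoop's StreamCapabilities. */
  static final String WRITEBYTEBUFFER = "out:writebytebuffer";

  /** Hypothetical write-side counterpart of a ByteBuffer-based read interface. */
  interface ByteBufferWritable {
    /** Writes all remaining bytes of src and returns how many were written. */
    int write(ByteBuffer src) throws IOException;
  }

  /** Stand-in for a stream such as DFSOutputStream; it only collects bytes in memory. */
  static class SketchOutputStream extends OutputStream implements ByteBufferWritable {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    /** Mirrors the shape of a capability probe: true only for the hypothetical name. */
    public boolean hasCapability(String capability) {
      return WRITEBYTEBUFFER.equals(capability);
    }

    @Override
    public void write(int b) {
      sink.write(b);
    }

    @Override
    public int write(ByteBuffer src) throws IOException {
      int total = src.remaining();
      // This stand-in drains the buffer in chunks into the in-memory sink.
      // A real stream would ideally build its packets straight from src.
      byte[] chunk = new byte[Math.min(total, 4096)];
      while (src.hasRemaining()) {
        int len = Math.min(chunk.length, src.remaining());
        src.get(chunk, 0, len);
        sink.write(chunk, 0, len);
      }
      return total;
    }

    byte[] written() {
      return sink.toByteArray();
    }
  }

  public static void main(String[] args) throws IOException {
    SketchOutputStream out = new SketchOutputStream();

    // Simulates the buffer a native caller would pass across JNI.
    ByteBuffer buf = ByteBuffer.allocateDirect(16);
    buf.put("hello hdfs".getBytes(StandardCharsets.UTF_8));
    buf.flip();

    // Probe first, so callers can fall back to the byte[] path when unsupported.
    if (out.hasCapability(WRITEBYTEBUFFER)) {
      int n = out.write(buf);
      System.out.println("wrote " + n + " bytes");
    }
  }
}
```

Probing the capability before calling write(ByteBuffer) would let the native side keep the existing byte-array path as a fallback for streams that do not support direct-buffer writes.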

            People

              Assignee: Unassigned
              Reporter: pifta István Fajth