Giraph / GIRAPH-1073

Decouple out-of-core persistence infrastructure from out-of-core computation

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels:
      None

      Description

      In the current out-of-core infrastructure, the persistence layer is tightly coupled with the scheduler and the out-of-core engine, which makes it hard to try new features in the persistence layer. The following changes are needed:

      • The persistence layer should be decoupled from the rest of the out-of-core infrastructure. This way one can implement and plug in different data accessors for various persistence resources, e.g. a local file system data accessor, an HDFS data accessor, a serialized in-memory data accessor, etc.
      • We should be able to address out-of-core data in a more efficient and flexible way. Currently, data is addressed through string literals scattered across the code. Instead, data should be accessed through a unified, more flexible data indexing mechanism.
      • With different data accessor implementations available, there may be more emphasis on running more IO threads, so it is important that these IO threads are load-balanced. Currently, partitions are assigned to IO threads using a hash function. Hash functions tend not to balance load well when the number of data points (partitions, in this case) is small.
      • Currently, out-of-core uses `BufferedInputStream` and `BufferedOutputStream` along with the default (de)serialization mechanism, and the IO bandwidth achieved is low. Two changes could help: 1) an Unsafe-based (de)serialization mechanism, to optimize memory bandwidth during (de)serialization, and 2) `RandomAccessFile`'s read and write interface, for lower-level access to the local file system with less overhead when reading and writing local files.
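
      To make the first three points concrete, here is a minimal Java sketch of what a pluggable accessor, a structured data index, and a non-hash partition assignment might look like. All names here (`OutOfCoreSketch`, `DataIndex`, `DataAccessor`, `InMemoryDataAccessor`, `assignRoundRobin`) and the index fields are hypothetical and chosen for illustration; they are not taken from the Giraph code base. Round-robin is only one possible balancing policy, since the issue does not say which one should replace hashing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names are taken from the Giraph code base.
public class OutOfCoreSketch {

  /** Structured address for out-of-core data, used instead of ad-hoc string literals. */
  public static final class DataIndex {
    private final int partitionId;
    private final String dataType; // assumed categories, e.g. "vertices" or "messages"
    private final long superstep;

    public DataIndex(int partitionId, String dataType, long superstep) {
      this.partitionId = partitionId;
      this.dataType = Objects.requireNonNull(dataType);
      this.superstep = superstep;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof DataIndex)) {
        return false;
      }
      DataIndex other = (DataIndex) o;
      return partitionId == other.partitionId
          && superstep == other.superstep
          && dataType.equals(other.dataType);
    }

    @Override
    public int hashCode() {
      return Objects.hash(partitionId, dataType, superstep);
    }

    @Override
    public String toString() {
      return "DataIndex(" + partitionId + ", " + dataType + ", " + superstep + ")";
    }
  }

  /** Persistence back end; the engine would depend only on this interface. */
  public interface DataAccessor {
    // ioThreadId is an assumed parameter: a file-backed accessor could use it
    // to pick a per-thread directory or disk. The in-memory accessor ignores it.
    void write(int ioThreadId, DataIndex index, byte[] data);

    byte[] read(int ioThreadId, DataIndex index);
  }

  /** Serialized in-memory accessor: stores byte arrays keyed by DataIndex. */
  public static final class InMemoryDataAccessor implements DataAccessor {
    private final Map<DataIndex, byte[]> store = new ConcurrentHashMap<>();

    @Override
    public void write(int ioThreadId, DataIndex index, byte[] data) {
      store.put(index, data.clone());
    }

    @Override
    public byte[] read(int ioThreadId, DataIndex index) {
      byte[] data = store.get(index);
      return data == null ? null : data.clone();
    }
  }

  /** Assigns the i-th partition to thread i % numIoThreads, so loads differ by at most one. */
  public static Map<Integer, Integer> assignRoundRobin(
      List<Integer> partitionIds, int numIoThreads) {
    Map<Integer, Integer> owner = new LinkedHashMap<>();
    for (int i = 0; i < partitionIds.size(); i++) {
      owner.put(partitionIds.get(i), i % numIoThreads);
    }
    return owner;
  }

  public static void main(String[] args) {
    DataAccessor accessor = new InMemoryDataAccessor();
    DataIndex index = new DataIndex(4, "messages", 2L);
    accessor.write(0, index, new byte[] {1, 2, 3});
    System.out.println(index + " -> " + Arrays.toString(accessor.read(0, index)));

    // Partition ids 0, 4, 8, 12 would all land on thread 0 under id % 4,
    // while round-robin places one partition on each of the 4 threads.
    System.out.println(assignRoundRobin(Arrays.asList(0, 4, 8, 12), 4));
  }
}
```

      Because the engine in this sketch sees only `DataAccessor`, a local-file or HDFS implementation could be swapped in without touching scheduling code, and `DataIndex` equality gives every back end the same addressing scheme.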

            People

            • Assignee:
              heslami Hassan Eslami
              Reporter:
              heslami Hassan Eslami