HBase / HBASE-2387

FUSE module for mounting exported tablespaces

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      FUSE: http://fuse.sourceforge.net/

      Create a FUSE translator that mounts exported tablespaces into the Linux filesystem namespace. It should run in either of two modes:

      1) Map the exported tablespace under the mount point.

      • If backing with Stargate this is a 1:1 map of resource paths to filesystem paths.
      • If backing with the Thrift or Avro connector, the mapping should be the same, but the implementation will be more involved.
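
A minimal sketch of the 1:1 mapping in mode 1, assuming the gateway exposes resources as /<table>/<row>/<column>. The function name, the mount point, and the base URL are hypothetical and chosen only for illustration; nothing here calls a real FUSE or HBase API.

```python
import posixpath
from urllib.parse import quote


def fs_path_to_resource(mount_point: str, fs_path: str, base_url: str) -> str:
    """Map a path under the mount point to a gateway resource URL.

    Assumption: the resource path is exactly the path relative to the
    mount point, so /mnt/hbase/mytable/row1 maps to <base_url>/mytable/row1.
    """
    rel = posixpath.relpath(fs_path, mount_point)
    if rel == ".":
        # The mount point itself maps to the gateway root.
        return base_url.rstrip("/") + "/"
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{fs_path!r} is not under {mount_point!r}")
    # Keep '/' and ':' literal so column:qualifier names stay readable.
    return base_url.rstrip("/") + "/" + quote(rel, safe="/:")


if __name__ == "__main__":
    print(fs_path_to_resource(
        "/mnt/hbase", "/mnt/hbase/mytable/row1/cf:col", "http://localhost:8080"))
```

With Thrift or Avro the same relative path would have to be split into table, row, and column arguments for the client call, which is one reason that implementation is more involved.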

      2) Emulate a filesystem, like s3fs (http://code.google.com/p/s3fs/wiki/FuseOverAmazon)

      • Translate paths under the mount point to row keys for good load spreading: /a/b/c/file.ext becomes file.ext/c/b/a.
      • Consider borrowing from Tom White's Hadoop S3 FS (HADOOP-574), and store file data as blocks.
        • After fetching the inode, the client can stream all blocks, e.g. via a multiget if available. This would support arbitrary file sizes. Otherwise there is a practical limit somewhere around 20-50 MB with default regionserver heaps.
        • So, file.ext/c/b/a gets the inode. Blocks would be keyed using the SHA-1 hash of their contents.
        • Use multiversioning on the inode to get snapshots for free: A path in the filesystem like /a/b/c/file.ext;timestamp gets file contents on or before timestamp.
        • Because new writes produce new blocks with unique hashes, this is like a dedup filesystem. Use ICV (incrementColumnValue) to maintain use counters on blocks.
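
The mode 2 scheme above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: plain Python dictionaries stand in for the HBase tables, a Counter stands in for the ICV use counters, and the names path_to_row_key, split_snapshot, and BlockStore, as well as the tiny block size, are hypothetical.

```python
import hashlib
from collections import Counter


def path_to_row_key(path):
    """Reverse the path components: /a/b/c/file.ext -> file.ext/c/b/a."""
    parts = [p for p in path.split("/") if p]
    return "/".join(reversed(parts))


def split_snapshot(path):
    """Split '/a/b/c/file.ext;123' into ('/a/b/c/file.ext', 123).

    Returns (path, None) when there is no numeric ';timestamp' suffix.
    """
    base, sep, suffix = path.rpartition(";")
    if sep and suffix.isdigit():
        return base, int(suffix)
    return path, None


class BlockStore:
    """In-memory stand-in for the inode and block tables (assumption)."""

    def __init__(self, block_size=4):
        self.block_size = block_size   # hypothetical; real blocks would be far larger
        self.blocks = {}               # SHA-1 hex digest -> block bytes
        self.use_counts = Counter()    # stand-in for ICV-maintained counters
        self.inodes = {}               # row key -> ordered list of block digests

    def write(self, path, data):
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha1(block).hexdigest()
            # Identical content hashes to the same key, so it is stored once.
            self.blocks.setdefault(digest, block)
            self.use_counts[digest] += 1
            digests.append(digest)
        # Simplification: older inode versions are overwritten here and
        # counters are never decremented.
        self.inodes[path_to_row_key(path)] = digests

    def read(self, path):
        # Fetch the inode first, then all of its blocks in order.
        digests = self.inodes[path_to_row_key(path)]
        return b"".join(self.blocks[d] for d in digests)


if __name__ == "__main__":
    store = BlockStore(block_size=4)
    store.write("/a/b/c/file.ext", b"abcdabcdxy")
    print(path_to_row_key("/a/b/c/file.ext"))
    print(len(store.blocks), "unique blocks stored")
    print(store.read("/a/b/c/file.ext"))
```

Snapshot reads are not modelled here. In the real design the timestamp returned by split_snapshot would select which version of the inode row to fetch.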

      If backing with Stargate, support its multiuser mode.

          Activity

          Andrew Purtell added a comment -

          Can't bring myself to resolve this just yet. Still an interesting idea. But let's not use the REST gateway, it's always going to be slow.

          Andrew Purtell added a comment -

          Never mind about fuse4j, missed that it is LGPL.

          Andrew Purtell added a comment -

          fuse4j: git://github.com/dtrott/fuse4j.git

          Todd Lipcon added a comment -

          Really cool idea, looking forward to this.

          Andrew Purtell added a comment -

          Updated description. Nothing weds this to Stargate specifically. The Thrift connector could be supported also.


            People

            • Assignee: Unassigned
            • Reporter: Andrew Purtell
            • Votes: 0
            • Watchers: 3
