HBase / HBASE-14439

New/Improved Filesystem Abstractions

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      This ticket tracks work in progress on new FileSystem abstractions. Previously, we (Yahoo) submitted a ticket (HBASE-13991) that would add support for humongous tables (1 million+ regions) via a hierarchical layout. However, open source is moving in a similar but not identical direction, so that patch will not be merged into open source.

      We will now be working on a different patch with folks from open source. It will create or extend two layers: a path abstraction layer and a use-oriented abstraction layer. The path abstraction layer is epitomized by classes like FSUtils (and, in the patch, new classes like AFsLayout). The use-oriented abstraction layer is epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and possibly new classes later) that build on the path abstraction layer and focus on 'doing things' (e.g. creating regions) rather than on gritty details like the paths.
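
      To make the two-layer split concrete, here is a minimal sketch of the idea. It is not the code in the attached patch: every name in it (LayoutSketch, RegionLayout, FlatLayout, BucketedLayout, RegionStore) is hypothetical, the prefix-bucketing rule is only an assumed stand-in for a hierarchical layout, and java.nio.file is used in place of the Hadoop FileSystem API so the example runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LayoutSketch {

    /** Path abstraction layer: the only code that knows where a region lives. */
    interface RegionLayout {
        Path regionDir(Path tableDir, String encodedRegionName);
    }

    /** Flat layout: every region directory sits directly under the table directory. */
    static final class FlatLayout implements RegionLayout {
        @Override
        public Path regionDir(Path tableDir, String encodedRegionName) {
            return tableDir.resolve(encodedRegionName);
        }
    }

    /**
     * Hierarchical layout. The bucketing rule (a prefix of the encoded name)
     * is an assumption made for illustration, not the rule used by any patch.
     */
    static final class BucketedLayout implements RegionLayout {
        private final int prefixLength;

        BucketedLayout(int prefixLength) {
            if (prefixLength < 1) {
                throw new IllegalArgumentException("prefixLength must be >= 1");
            }
            this.prefixLength = prefixLength;
        }

        @Override
        public Path regionDir(Path tableDir, String encodedRegionName) {
            int end = Math.min(prefixLength, encodedRegionName.length());
            String bucket = encodedRegionName.substring(0, end);
            return tableDir.resolve(bucket).resolve(encodedRegionName);
        }
    }

    /** Use-oriented layer: "does things" with regions and never builds paths itself. */
    static final class RegionStore {
        private final Path tableDir;
        private final RegionLayout layout;

        RegionStore(Path tableDir, RegionLayout layout) {
            this.tableDir = tableDir;
            this.layout = layout;
        }

        Path createRegion(String encodedRegionName) throws IOException {
            return Files.createDirectories(layout.regionDir(tableDir, encodedRegionName));
        }

        boolean regionExists(String encodedRegionName) {
            return Files.isDirectory(layout.regionDir(tableDir, encodedRegionName));
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("layout-sketch");
        RegionStore flat = new RegionStore(root.resolve("flat"), new FlatLayout());
        RegionStore bucketed = new RegionStore(root.resolve("bucketed"), new BucketedLayout(2));
        // Same calls on the use-oriented layer; only the injected layout differs.
        System.out.println(flat.createRegion("a1b2c3"));
        System.out.println(bucketed.createRegion("a1b2c3"));
    }
}
```

      In this sketch, swapping FlatLayout for BucketedLayout changes where directories land on disk without touching RegionStore, which is the kind of isolation between paths and use cases described above.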

      Abstracting and isolating the paths from the use cases will help Yahoo avoid diverging too far from open source with its internal 'Humongous' table hierarchical layout. It will also help open source move towards its eventual goal of redoing the FS layout as a similar (but different) hierarchical layout. Unlike the humongous patch, that layout focuses on data directory uniformity and stores the hierarchy in the meta table, which enables new optimizations (see HBASE-14090).

      Attached to this ticket is some work we've done at Yahoo so far, which will be put into an open source HBase branch for further collaboration. The patch is a work in progress and is not meant to be complete. (Please hold off on patch comments/reviews for now.) It also includes some Yahoo-specific 'humongous' layout code that will be removed before submission to open source.

        Attachments

        1. abstraction.patch (262 kB, Ben Lau)

          Issue Links

          1. update MasterStorage / RegionStorage to have a exists-in-storage check and archive methods (Sub-task, Resolved, Umesh Agashe)
          2. Remove directory layout/ filesystem references from CompactionTool (Sub-task, Resolved, Umesh Agashe)
          3. comment out broken test-compile references (Sub-task, Resolved, Umesh Agashe)
          4. Remove directory layout/ filesystem references from the code in master/procedure directory (Sub-task, Resolved, Umesh Agashe)
          5. Remove directory layout/ filesystem references from Master (Sub-task, Open, Unassigned)
          6. Add ThreadPool in Legacy implementations of MasterStorage/ RegionStorage (Sub-task, Open, Umesh Agashe)
          7. Remove directory layout / fs references from HBase IO package (Sub-task, Open, Umesh Agashe)
          8. remove directory layout / fs references from TableSnapshotScanner (Sub-task, Open, Umesh Agashe)
          9. remove direct layout/fs references from mapreduce utilities (Sub-task, Open, Unassigned)
          10. remove directory layout / fs references from MOB (Sub-task, Open, Sean Busbey)
          11. remove directory layout / fs references from compaction (Sub-task, Open, Unassigned)
          12. decouple Replication from backing files of WAL (Sub-task, Open, Unassigned)
          13. remove directory layout / fs references from bulkload code (Sub-task, Open, Sean Busbey)
          14. Remove directory layout/ filesystem references from hbck tool (Sub-task, Open, Xiang Li)
          15. ensure WAL code no longer presumes colocation with region storage (Sub-task, Open, Unassigned)
          16. remove directory layout / fs references from snapshots (Sub-task, Resolved, Umesh Agashe)
          17. fold FSUtil classes into fs integration package (Sub-task, Open, Unassigned)
          18. ensure split operation doesn't directly reference fs / legacy integrations (Sub-task, Open, Unassigned)
          19. ensure merge operations don't reference filesystem or legacy implementation directly (Sub-task, Open, Unassigned)
          20. TEST: update HBaseTestingUtility to avoid direct use of filesystem / legacy implementation (Sub-task, Resolved, Apekshit Sharma)
          21. TEST: update integration tests to use MasterStorage/RegionStorage (Sub-task, Open, Unassigned)
          22. TEST: Remove directory layout/ filesystem references form hbck unit tests (Sub-task, Open, Unassigned)
          23. TEST: Remove directory layout/ filesystem references form unit tests for master/procedure (Sub-task, Open, Unassigned)
          24. TEST: update ScanPerformanceEvaluation to use MasterStorage / RegionStorage (Sub-task, Open, Unassigned)
          25. TEST: Remove directory layout/ filesystem references form unit tests for master (Sub-task, Open, Unassigned)
          26. TEST: update snapshot related tests to rely on MasterStorage / RegionStorage (Sub-task, In Progress, Umesh Agashe)
          27. TEST: update mapreduce tests to use masterstorage / regionstorage (Sub-task, Open, Unassigned)
          28. TEST: update MOB unit tests to user MasterStorage/ RegionStorage APIs (Sub-task, Open, Unassigned)
          29. TEST: update archiving tests to use masterstorage/regionstorage (Sub-task, Open, Unassigned)
          30. TEST: update split tests (Sub-task, Open, Unassigned)
          31. TEST: update merge tests (Sub-task, Open, Unassigned)
          32. TEST: update compaction tests (Sub-task, Open, Unassigned)
          33. TEST: update unit tests for io package (Sub-task, Open, Unassigned)
          34. TEST: update tests for misc regionserver tests (Sub-task, Open, Unassigned)
          35. TEST: cleanup misc tests to ensure no direct filesystem use (Sub-task, Open, Unassigned)
          36. Remove directory layout/ filesystem references from Cleaners and a few other modules in master (Sub-task, Resolved, Umesh Agashe)
          37. Refactor ExportSnapshot, SnapshotInfo and remove FS references from it (Sub-task, In Progress, Umesh Agashe)
          38. Move FileLink and HFileLink classes to fs.legacy package (Sub-task, Resolved, Umesh Agashe)

              People

              • Assignee: Unassigned
              • Reporter: Ben Lau (benlau)
              • Votes: 0
              • Watchers: 25
