Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-2204

Revamp storage of tablet server metadata

Agile BoardAttach filesAttach ScreenshotAdd voteVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.6.0
    • None
    • consensus, tserver

    Description

      Since the very early days of Kudu we've used the file system as a sort of key-value store for various bits of metadata – in particular, each tablet has a "tablet metadata" file and a "consensus metadata file". We also have various per-disk metadata, as well as LBM block metadata, but I'll leave those out of the present discussion.

      Although the file system "works" and got us pretty far, there are various benefits we could gain by switching to something closer to a KV-store:

      • write performance: writing a single file per "record", fsyncing it, and renaming it to replace the old version is pretty slow. File systems empirically do a poor job of group-commit of multiple of these types of operations happening at the same time, especially as the disks get busy
      • read performance: on startup, we need to read all these metadata files, and using separate files means it's likely they're on entirely separate areas of disk. If we have, for example, 100k tablets in the future, we'll pay 100k random reads, which could take 10+ minutes on a cold hard disk.
      • scalability: if we want to get to 100k+ tablets per server, we start entering the territory where file systems are unhappy with such a high quantity of files

      Solving the performance problem also opens us up a lot of headroom to do things such as replicate cmeta information for each tablet onto two or three disks, which has a lot of benefits for handling disk-failure. (you can lose a disk without getting "amnesia" of a raft vote, for example).

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            tlipcon Todd Lipcon
            tlipcon Todd Lipcon

            Dates

              Created:
              Updated:

              Slack

                Issue deployment