[HBASE-14680] Two configs for snapshot timeout and better defaults


Details

    • Reviewed

    Description

      One of the clusters timed out while taking a snapshot of a disabled table. The table is large enough that the master operation takes more than 1 minute to complete. However, while trying to increase the timeout, we noticed that there are two parameters with very similar names that configure different things:

      hbase.snapshot.master.timeout.millis is defined in SnapshotDescriptionUtils, is sent to the client side, and is used for disabled-table snapshots.

      hbase.snapshot.master.timeoutMillis is defined in SnapshotManager and is used as the timeout for the procedure execution.

      So, there are several improvements that we can make:

      • 1 min is too low for big tables. We need to set this to 5 min or 10 min by default. Even a 6 TB table, which is medium-sized, fails.
      • Unify the two timeouts into one. Decide on one of them and deprecate the other. Use the larger of the two values for backward compatibility.
      • Add the timeout to hbase-default.xml.
      • Why do we even have a timeout for disabled-table snapshots? The master is doing the work, so we should not time out in any case.
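
      The "unify and use the larger value" idea can be sketched as follows. This is only an illustration, not the code from the attached patches: the class name SnapshotTimeouts, the method resolveTimeoutMillis, the plain Map standing in for the HBase Configuration object, and the 5-minute default are all assumptions made for the example. Only the two property names come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotTimeouts {
    // The two similarly named keys from the issue description.
    static final String KEY_DOTTED = "hbase.snapshot.master.timeout.millis";
    static final String KEY_CAMEL = "hbase.snapshot.master.timeoutMillis";

    // Assumed default for this sketch: 5 minutes (the issue suggests 5 or 10).
    static final long DEFAULT_TIMEOUT_MILLIS = 5 * 60 * 1000L;

    /**
     * Resolves one effective timeout from both keys.
     * If neither key is set, the default applies; if only one is set, it wins;
     * if both are set, the larger value is used for backward compatibility.
     */
    static long resolveTimeoutMillis(Map<String, String> conf) {
        Long dotted = parseMillis(conf.get(KEY_DOTTED));
        Long camel = parseMillis(conf.get(KEY_CAMEL));
        if (dotted == null && camel == null) {
            return DEFAULT_TIMEOUT_MILLIS;
        }
        if (dotted == null) {
            return camel;
        }
        if (camel == null) {
            return dotted;
        }
        return Math.max(dotted, camel);
    }

    // Returns null for an unset key so "unset" and "set" stay distinguishable.
    private static Long parseMillis(String raw) {
        return raw == null ? null : Long.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY_DOTTED, "60000");   // 1 minute under one key
        conf.put(KEY_CAMEL, "300000");   // 5 minutes under the other key
        System.out.println(resolveTimeoutMillis(conf)); // prints 300000
    }
}
```

      Treating an unset key as null rather than as the default keeps an explicit user setting below the default from being silently overridden when only one of the two keys is configured.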

      Attachments

        1. HBASE-14680_v1.patch
          8 kB
          Heng Chen
        2. HBASE-14680_v2.patch
          9 kB
          Heng Chen
        3. hbase-14680_v3.patch
          10 kB
          Enis Soztutar
        4. HBASE-14680.patch
          8 kB
          Heng Chen


          People

            chenheng Heng Chen
            enis Enis Soztutar
            Votes: 0
            Watchers: 7
