  1. Hadoop HDFS
  2. HDFS-8707 Implement an async pure c++ HDFS client
  3. HDFS-11014

libhdfs++: Make connection to HA clusters faster

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • None
    • None
    • Component/s: hdfs-client
    • None

    Description

      Currently, when the NameNode (NN) returns a StandbyException, we inject a 20-second delay before trying the alternate NN, even on the first failover. The first failover shouldn't have a delay; the Java client already skips the delay on its first failover.
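
      The intended delay rule can be sketched as follows. This is only an illustration of the behavior described above, not the code in the attached patch; the function name `FailoverDelay` and its parameters are hypothetical, and the 20-second default simply mirrors the delay mentioned in this description.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not the libhdfs++ API): how long to wait before
// trying the alternate NameNode after a StandbyException.
// `failovers_so_far` counts failovers already performed for this request.
std::chrono::milliseconds FailoverDelay(
    int failovers_so_far,
    std::chrono::milliseconds base_delay = std::chrono::seconds(20)) {
  assert(failovers_so_far >= 0);
  // First failover: switch to the alternate NameNode immediately.
  if (failovers_so_far == 0) {
    return std::chrono::milliseconds(0);
  }
  // Later failovers: keep the existing fixed delay.
  return base_delay;
}
```

      With this rule, a client whose configured first NN happens to be the standby pays no delay before reaching the other NN; the delay applies only from the second failover onward.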

      Another minor change I'd like to make is to reduce the default number of failover attempts from 15 (the value used in the Apache config) to 4. My impression is that a high failover count is handy for long-running batch jobs, but in the libhdfs++ case the client is often an interactive application. There it's generally preferable to fail sooner, so that a user doesn't have to wait the ~8 minutes it takes to time out with the default settings.

      The choice of 4 failovers assumes that a failure to connect immediately has one of two causes: a GC pause, which will most likely be over before the second connection attempt, or a network or config issue that will take some sorting out by an admin. It would still be possible to override these values in the config for deployments that tend to have more or fewer network issues.
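
      A minimal sketch of a bounded failover loop with an overridable limit is shown below. All names here (`FailoverOptions`, `max_failover_attempts`, `ConnectWithFailover`) are hypothetical and do not come from the patch. The sketch also assumes a two-NN HA pair and that the limit counts failovers after the initial attempt; delays between attempts are left out to keep the focus on the attempt count.

```cpp
#include <functional>

// Hypothetical options struct; the default of 4 mirrors the proposal above.
struct FailoverOptions {
  int max_failover_attempts = 4;
};

// Tries NameNode 0 first, then alternates between the two NameNodes until
// one accepts the connection or the failover budget is used up.
// `try_connect(i)` is a caller-supplied stand-in for a real connection attempt.
// Returns the index of the NameNode that accepted, or -1 on giving up.
int ConnectWithFailover(const std::function<bool(int)>& try_connect,
                        const FailoverOptions& opts = FailoverOptions()) {
  int current = 0;
  // One initial attempt plus at most `max_failover_attempts` failovers.
  for (int failovers = 0; failovers <= opts.max_failover_attempts; ++failovers) {
    if (try_connect(current)) {
      return current;
    }
    current = 1 - current;  // fail over to the other NameNode
  }
  return -1;
}
```

      A deployment that wants the old behavior would set `max_failover_attempts` to 15 before connecting; an interactive tool would keep the lower default and report the failure sooner.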

      Attachments

        1. HDFS-11014.HDFS-8707.000.patch
          3 kB
          James Clampffer

        Activity

          People

            Assignee: James Clampffer
            Reporter: James Clampffer
            Votes: 0
            Watchers: 3