Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-alpha
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The work on branch-20-append was to support sync, for durable HBase WALs, not append. The branch-20-append implementation is known to be buggy. There's been confusion about this, and we often answer queries on the list like this one (http://search-hadoop.com/m/wfed01VOIJ5). Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

      For v1.x let's:

      1. Always enable the sync path (currently only enabled if dfs.support.append is set).
      2. Remove the dfs.support.append configuration option. Let's keep the code paths, though, in case we ever fix append on branch-1, in which case we can add the config option back.

      For 2.x let's:

      1. Always enable the hsync/hflush path.
      2. Have dfs.support.append enable only the append-specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled, so there is no net effect by default.
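
      The 2.x plan above can be illustrated with a small, self-contained toy model. This is a sketch only and not the actual HDFS code or patch: the class AppendGateSketch and its methods are hypothetical names made up for illustration, and the only identifier taken from the issue is the dfs.support.append key. It shows the intended split: the sync path never consults the flag, while append is rejected with an IOException when the flag is off.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the 2.x proposal. This is NOT HDFS code: the class and
 * method names are hypothetical; only the key "dfs.support.append"
 * comes from the issue text.
 */
class AppendGateSketch {
    static final String SUPPORT_APPEND_KEY = "dfs.support.append";

    private final boolean supportAppend;

    AppendGateSketch(Map<String, String> conf) {
        // Per the proposal, append stays enabled unless explicitly turned off.
        this.supportAppend =
            Boolean.parseBoolean(conf.getOrDefault(SUPPORT_APPEND_KEY, "true"));
    }

    /** Sync path: always available, never consults the append flag. */
    String hflush() {
        return "hflush ok";
    }

    /** Append path: the only operation the flag still gates. */
    String append(String path) throws IOException {
        if (!supportAppend) {
            throw new IOException("Append is not supported; set "
                + SUPPORT_APPEND_KEY + " to true to enable it");
        }
        return "append ok: " + path;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SUPPORT_APPEND_KEY, "false");
        AppendGateSketch fs = new AppendGateSketch(conf);

        // Sync still works even though append is switched off.
        System.out.println(fs.hflush());
        try {
            fs.append("/hbase/wal");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      With an empty configuration the model allows both operations, which matches the "no net effect by default" statement for 2.x.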

      Attachments

      1. hdfs-3120.txt (18 kB, Eli Collins)
      2. hdfs-3120.txt (21 kB, Eli Collins)

        Issue Links

          Activity

          Eli Collins created issue -
          Jeff Hammerbacher made changes -
          Field Original Value New Value
          Link This issue relates to HDFS-3107 [ HDFS-3107 ]
          Eli Collins made changes -
          Target Version/s 0.23.3, 1.1.0 [ 12320052, 12317959 ] 1.1.0, 2.0.0 [ 12317959, 12320353 ]
          Eli Collins made changes -
          Target Version/s 2.0.0, 1.1.0 [ 12320353, 12317959 ] 1.1.0, 2.0.0 [ 12317959, 12320353 ]
          Description
          Original Value:
          The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

          Let's add a new *dfs.support.hsync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works.
          New Value:
          The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

          Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works.
          Eli Collins made changes -
          Link This issue relates to HADOOP-8230 [ HADOOP-8230 ]
          Eli Collins made changes -
          Link This issue relates to HBASE-5676 [ HBASE-5676 ]
          Eli Collins made changes -
          Attachment hdfs-3120.txt [ 12520627 ]
          Eli Collins made changes -
          Summary
          Original Value: Provide ability to enable sync without append
          New Value: Enable hsync and hflush by default
          Affects Version/s
          Original Value: 1.0.1 [ 12319502 ]
          New Value: (none)
          Target Version/s
          Original Value: 2.0.0, 1.1.0 [ 12320353, 12317959 ]
          New Value: 2.0.0 [ 12320353 ]
          Eli Collins made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Eli Collins made changes -
          Attachment hdfs-3120.txt [ 12521073 ]
          Eli Collins made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Target Version/s 2.0.0 [ 12320353 ]
          Fix Version/s 2.0.0 [ 12320353 ]
          Resolution Fixed [ 1 ]
          Eli Collins made changes -
          Description
          Original Value:
          The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

          Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works.
          New Value:
          The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

          For v1.x let's:
          # Always enable the sync path (currently only enabled if dfs.support.append is set)
          # Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

          For 2.x let's
          # Always enable the hsync/hflush path
          # The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default.
          Harsh J made changes -
          Link This issue supersedes HDFS-1107 [ HDFS-1107 ]
          Eli Collins made changes -
          Assignee Eli Collins [ eli2 ] Eli Collins [ eli ]

            People

            • Assignee:
              Eli Collins
              Reporter:
              Eli Collins
            • Votes:
              0
              Watchers:
              10
