HDFS-9530

ReservedSpace is not cleared for abandoned Blocks

    Details

    • Hadoop Flags:
      Reviewed

      Description

      I think there is a bug in HDFS.
      ===============================================================================
      Here is the config:
      <property>
      <name>dfs.datanode.data.dir</name>
      <value>
      file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
      </value>
      </property>

      Here is the dfsadmin report:

      [hadoop@worker-1 ~]$ hadoop dfsadmin -report
      DEPRECATED: Use of this script to execute hdfs command is deprecated.
      Instead use the hdfs command for it.

      Configured Capacity: 240769253376 (224.23 GB)
      Present Capacity: 238604832768 (222.22 GB)
      DFS Remaining: 215772954624 (200.95 GB)
      DFS Used: 22831878144 (21.26 GB)
      DFS Used%: 9.57%
      Under replicated blocks: 4
      Blocks with corrupt replicas: 0
      Missing blocks: 0

      -------------------------------------------------
      Live datanodes (3):

      Name: 10.117.60.59:50010 (worker-2)
      Hostname: worker-2
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 7190958080 (6.70 GB)
      Non DFS Used: 721473536 (688.05 MB)
      DFS Remaining: 72343986176 (67.38 GB)
      DFS Used%: 8.96%
      DFS Remaining%: 90.14%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 1
      Last contact: Wed Dec 09 15:55:02 CST 2015

      Name: 10.168.156.0:50010 (worker-3)
      Hostname: worker-3
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 7219073024 (6.72 GB)
      Non DFS Used: 721473536 (688.05 MB)
      DFS Remaining: 72315871232 (67.35 GB)
      DFS Used%: 9.00%
      DFS Remaining%: 90.11%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 1
      Last contact: Wed Dec 09 15:55:03 CST 2015

      Name: 10.117.15.38:50010 (worker-1)
      Hostname: worker-1
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 8421847040 (7.84 GB)
      Non DFS Used: 721473536 (688.05 MB)
      DFS Remaining: 71113097216 (66.23 GB)
      DFS Used%: 10.49%
      DFS Remaining%: 88.61%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 1
      Last contact: Wed Dec 09 15:55:03 CST 2015

      ================================================================================

      While running the Hive job, the dfsadmin report is as follows:

      [hadoop@worker-1 ~]$ hadoop dfsadmin -report
      DEPRECATED: Use of this script to execute hdfs command is deprecated.
      Instead use the hdfs command for it.

      Configured Capacity: 240769253376 (224.23 GB)
      Present Capacity: 108266011136 (100.83 GB)
      DFS Remaining: 80078416384 (74.58 GB)
      DFS Used: 28187594752 (26.25 GB)
      DFS Used%: 26.04%
      Under replicated blocks: 7
      Blocks with corrupt replicas: 0
      Missing blocks: 0

      -------------------------------------------------
      Live datanodes (3):

      Name: 10.117.60.59:50010 (worker-2)
      Hostname: worker-2
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 9015627776 (8.40 GB)
      Non DFS Used: 44303742464 (41.26 GB)
      DFS Remaining: 26937047552 (25.09 GB)
      DFS Used%: 11.23%
      DFS Remaining%: 33.56%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 693
      Last contact: Wed Dec 09 15:37:35 CST 2015

      Name: 10.168.156.0:50010 (worker-3)
      Hostname: worker-3
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 9163116544 (8.53 GB)
      Non DFS Used: 47895897600 (44.61 GB)
      DFS Remaining: 23197403648 (21.60 GB)
      DFS Used%: 11.42%
      DFS Remaining%: 28.90%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 750
      Last contact: Wed Dec 09 15:37:36 CST 2015

      Name: 10.117.15.38:50010 (worker-1)
      Hostname: worker-1
      Decommission Status : Normal
      Configured Capacity: 80256417792 (74.74 GB)
      DFS Used: 10008850432 (9.32 GB)
      Non DFS Used: 40303602176 (37.54 GB)
      DFS Remaining: 29943965184 (27.89 GB)
      DFS Used%: 12.47%
      DFS Remaining%: 37.31%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 632
      Last contact: Wed Dec 09 15:37:36 CST 2015

      =========================================================================
      But the df output on worker-1 is as follows:
      [hadoop@worker-1 ~]$ df
      Filesystem 1K-blocks Used Available Use% Mounted on
      /dev/xvda1 20641404 4229676 15363204 22% /
      tmpfs 8165456 0 8165456 0% /dev/shm
      /dev/xvdc 20642428 2596920 16996932 14% /mnt/disk3
      /dev/xvdb 20642428 2692228 16901624 14% /mnt/disk4
      /dev/xvdd 20642428 2445852 17148000 13% /mnt/disk2
      /dev/xvde 20642428 2909764 16684088 15% /mnt/disk1

      The df output conflicts with the dfsadmin report.

      Any suggestions?

      1. HDFS-9530-branch-2.7-002.patch
        7 kB
        Brahma Reddy Battula
      2. HDFS-9530-branch-2.7-001.patch
        6 kB
        Brahma Reddy Battula
      3. HDFS-9530-branch-2.6.patch
        7 kB
        Brahma Reddy Battula
      4. HDFS-9530-03.patch
        7 kB
        Brahma Reddy Battula
      5. HDFS-9530-02.patch
        6 kB
        Brahma Reddy Battula
      6. HDFS-9530-01.patch
        7 kB
        Brahma Reddy Battula

        Issue Links

          Activity

          ferhui Fei Hui added a comment -

          I want to add logs to monitor DataNode DFS Used, Non DFS Used, and so on.
          How can I do that? Which files do I need to modify?

          brahmareddy Brahma Reddy Battula added a comment -

          Fei Hui, thanks for reporting this. Can you please check the reservedSpace and reservedSpaceForReplicas values in JMX (you can check through http://<datanodeip>:<httpport>/jmx)?
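
          For anyone following along, a minimal sketch of that check (the host name worker-1 and port 50075, the usual 2.x DataNode HTTP port, are assumptions; substitute your own dfs.datanode.http.address, or simply open the URL in a browser):

          import java.io.BufferedReader;
          import java.io.InputStreamReader;
          import java.net.URL;

          // Dumps the DataNode JMX output and prints any line mentioning the
          // reserved-space counters discussed in this thread.
          public class CheckReservedSpace {
            public static void main(String[] args) throws Exception {
              URL jmx = new URL("http://worker-1:50075/jmx");
              try (BufferedReader in = new BufferedReader(
                  new InputStreamReader(jmx.openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                  // Attribute names differ slightly across versions; on 2.6/2.7 the
                  // per-volume figures sit inside the "VolumeInfo" JSON string.
                  if (line.contains("reservedSpace") || line.contains("VolumeInfo")) {
                    System.out.println(line.trim());
                  }
                }
              }
            }
          }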

          xyao Xiaoyu Yao added a comment -

          This looks like the symptom of HDFS-8072, where RBW reserved space is not released when the Datanode BlockReceiver encounters an IOException. The space won't be released until a DN restart.

          The fix should be included in hadoop 2.6.2 and 2.7.1. Can you post the "hadoop version" command output?

          cnauroth Chris Nauroth added a comment -

          Additionally, Apache Hadoop 2.7.1 has a bug where the configured dfs.datanode.du.reserved space gets counted towards non-DFS used. The work of fixing this is tracked in HDFS-9038.

          ferhui Fei Hui added a comment -

          [hdfs@worker-1 ~]$ hadoop version
          Hadoop 2.6.2

          And I am sure it also appears in 2.7.1.

          ferhui Fei Hui added a comment -

          <property>
          <name>dfs.namenode.resource.du.reserved</name>
          <value>1073741824</value>
          </property>

          ferhui Fei Hui added a comment -

          Useful information: the JMX result conflicts with df.

          "VolumeInfo" : "{
            \"/mnt/disk4/current\": {\"freeSpace\":7721571505,\"usedSpace\":2648002383,\"reservedSpace\":1073741824},
            \"/mnt/disk1/current\": {\"freeSpace\":8503652886,\"usedSpace\":2248676842,\"reservedSpace\":1073741824},
            \"/mnt/disk2/current\": {\"freeSpace\":8194545617,\"usedSpace\":2173402671,\"reservedSpace\":1073741824},
            \"/mnt/disk3/current\": {\"freeSpace\":8316634525,\"usedSpace\":2177793635,\"reservedSpace\":1073741824}}"

          [hadoop@worker-1 ~]$ df
          Filesystem 1K-blocks Used Available Use% Mounted on
          /dev/xvda1 20641404 3647828 15945052 19% /
          tmpfs 8165456 0 8165456 0% /dev/shm
          /dev/xvdc 20642428 2254196 17339656 12% /mnt/disk3
          /dev/xvdb 20642428 2721092 16872760 14% /mnt/disk4
          /dev/xvdd 20642428 2274172 17319680 12% /mnt/disk2
          /dev/xvde 20642428 2316260 17277592 12% /mnt/disk1

          ferhui Fei Hui added a comment -

          freeSpace conflicts with the df output.
          Where is the freeSpace code? Maybe there is a bug!

          qiuzhuang.lian Qiuzhuang Lian added a comment -

          We see this problem too in Hadoop 2.6.2. The non-DFS used is inconsistent with the du command.

          brahmareddy Brahma Reddy Battula added a comment -

          Qiuzhuang Lian this issue is tracked in HDFS-9038..

          qiuzhuang.lian Qiuzhuang Lian added a comment -

          @Brahma Reddy Battula, thanks. For now we restart all datanodes to release the non-DFS used space as a temporary fix.

          szetszwo Tsz Wo Nicholas Sze added a comment -

          (Removing the hive stack trace from the Description.)

          Should we resolve this as a duplicate of HDFS-9038?

          ajisakaa Akira Ajisaka added a comment -

          Should we resolve this as a duplicate of HDFS-9038?

          Agree. Closing this.

          ferhui Fei Hui added a comment -

          Maybe it's different from HDFS-9038.

          In the example in the description, dfs.datanode.du.reserved is 1 GB. dfsadmin reports Non DFS Used: 40303602176 (37.54 GB) and DFS Remaining: 29943965184 (27.89 GB) on worker-1, but the df output on worker-1 is:
          /dev/xvdc 20642428 2254196 17339656 12% /mnt/disk3
          /dev/xvdb 20642428 2721092 16872760 14% /mnt/disk4
          /dev/xvdd 20642428 2274172 17319680 12% /mnt/disk2
          /dev/xvde 20642428 2316260 17277592 12% /mnt/disk1

          In total: 80 GB, about 9.4 GB used, 68 GB free.

          The 37.54 GB of non-DFS used in the dfsadmin report is inconsistent with the df command.
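
          For what it's worth, the numbers line up exactly if "Non DFS Used" is a figure derived from the other three values in the report rather than something measured on disk:

          Non DFS Used = Configured Capacity - DFS Used - DFS Remaining
                       = 80256417792 - 10008850432 - 29943965184
                       = 40303602176 bytes (37.54 GB)

          So whenever DFS Remaining is under-reported for any reason, the shortfall surfaces as phantom non-DFS usage, even though df sees only about 9.4 GB actually used across the four disks.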

          ajisakaa Akira Ajisaka added a comment -

          Reopen this. I'll dig into this issue.

          ajisakaa Akira Ajisaka added a comment -

          Hi Fei Hui, did you execute the df command while the job was running? I suppose your Hive job writes a lot of temporary data in the local directories of the slave servers, and the temporary files are cleaned up after the job.

          ferhui Fei Hui added a comment -

          I executed df and the dfsadmin report one after the other.
          Even after the job finished, the non-DFS usage was still large.

          brahmareddy Brahma Reddy Battula added a comment -

          Will this still be there after the Hive job is completed (while running, it might be writing many files)? I mean the non-DFS used.

          ferhui Fei Hui added a comment -

          While the job was running, the df output was as above. There is no non-DFS usage on the disks; that is why I am confused.

          raviprak Ravi Prakash added a comment -

          Thanks for reporting Fei Hui. I can confirm that we are seeing this in our 2.7.1 clusters too. I'll dig. It is not explained by HDFS-9038. e.g.

          $ grep dfs.datanode.du.reserved -r .  -A 1
          ./hdfs-site.xml:            <name>dfs.datanode.du.reserved</name>
          ./hdfs-site.xml-            <value>107374182400</value>
          

          We are seeing a deficit of way more than this 100 GB * number of disks.

          raviprak Ravi Prakash added a comment -

          I took a heap dump of the DataNode process. I see the value of FsVolumeImpl.reservedForRbw is really large (> 1 TB). This matches the kind of discrepancy we are seeing on our cluster.

          raviprak Ravi Prakash added a comment -

          HDFS-8072 has not fixed this for us. Arpit Agarwal, do you know anything about this?

          arpitagarwal Arpit Agarwal added a comment -

          I think there was a related fix to HDFS-8072. I can't recall the Jira right now, will comment here later if I find it.

          brahmareddy Brahma Reddy Battula added a comment -

          Arpit Agarwal, are you referring to HDFS-8626? This is also present in 2.7.1. Let me investigate.

          brahmareddy Brahma Reddy Battula added a comment -

          Upon investigation I came to know that abandoned blocks are not releasing their reserved space even after the blocks get deleted.

          The fix can be done in two ways. Release the reserved bytes when:
          1) the block is deleted, or
          2) the mirror connection fails; in the PIPELINE_SETUP_CREATE case the reserved bytes can be released immediately.

          Attaching the patch with both approaches; the 2nd one is commented out.
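
          Purely to illustrate the accounting behind approach 1, a toy model (none of the names below are real HDFS classes, and this is not the patch; it only shows why the counter must be decremented on deletion):

          import java.util.HashMap;
          import java.util.Map;

          // Toy model: each in-flight replica holds a reservation against its volume,
          // and deleting or abandoning the replica must hand that reservation back.
          class ToyVolume {
            long reservedForReplicas;
            void reserve(long bytes) { reservedForReplicas += bytes; }
            void release(long bytes) { reservedForReplicas -= Math.min(bytes, reservedForReplicas); }
          }

          public class ReservationLeakDemo {
            public static void main(String[] args) {
              ToyVolume vol = new ToyVolume();
              Map<Long, Long> inFlight = new HashMap<>(); // blockId -> bytes reserved

              // Pipeline setup reserves a full block's worth of space on the volume.
              long blockId = 1L;
              long blockSize = 128L << 20; // 128 MB
              vol.reserve(blockSize);
              inFlight.put(blockId, blockSize);

              // The mirror connection fails, the client abandons the block, and the
              // replica file is eventually deleted, but nothing hands the bytes back.
              long outstanding = inFlight.remove(blockId);
              System.out.println("leaked after abandon = " + vol.reservedForReplicas);    // 134217728

              // Approach 1: release the outstanding reservation as part of the deletion.
              vol.release(outstanding);
              System.out.println("after release on delete = " + vol.reservedForReplicas); // 0
            }
          }

          Every abandoned block leaks one such reservation, which is how the counter can quietly grow by gigabytes during a busy Hive job.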

          arpitagarwal Arpit Agarwal added a comment -

          Hi Brahma Reddy Battula, yes, I meant HDFS-8626, thanks for finding it.

          I'll take a look at your patch.

          raviprak Ravi Prakash added a comment -

          Linking these two JIRAs together. Maybe they are related?

          arpitagarwal Arpit Agarwal added a comment -

          Hi Brahma Reddy Battula, I didn't quite understand the patch. Could you please describe the problem?

          There is some dead code in the patch.

          brahmareddy Brahma Reddy Battula added a comment -

          If any datanode other than the 1st DN is down during the initial pipeline creation, the block will be abandoned and the write will continue with a new block.
          In this case, the DN previous to the bad link will get an IOE while connecting to the mirror, and in that case the reservedBytes will not be released for that DN.

          There is some dead code in the patch.

          I mentioned two approaches; one is commented out, just for reference now.

          raviprak Ravi Prakash added a comment -

          FWIW, there is only a trivial amount of data in the RBW directory on the DN. So if we only released the reservation whenever we delete / move an RBW file, we would be fine.

          Here's my analysis based on Hadoop-2.7.1 code. I'll see what makes sense in trunk shortly.
          It seems to me that in addition to the places that we already do, we should be modifying the reserved space in these places too:

          1. BlockPoolSlice.resolveDuplicateReplicas (or perhaps deleteReplica) predicated on the replica being RBW
          2. FsDatasetImpl.convertTemporaryToRbw (after moveBlockFiles)
          3. FsDatasetImpl.recoverRbw after truncateBlock
          4. FsDatasetImpl.updateReplicaUnderRecovery after newReplicaInfo.setNumBytes(newlength);

          I'd be interested in seeing if you all can find other places too.

          I wonder if the current implementation is too brittle and if there isn't a different place we can keep track of the required reservation? There may well not be. I'm surprised the feature itself was added without even a configuration to disable it.

          raviprak Ravi Prakash added a comment -

          Thanks for your attention and patch Brahma!
          On a brief glance, your change to FsDatasetImpl seems to make sense. I'll dig in deeper.
          I suspect releaseAllBytesReserved in DataXceiver (the approach you commented out in your patch) may remove more reservation than ideal, because inside blockReceiver = new BlockReceiver(block, storageType, in, ...) the code is a lot more nuanced. There are different contingencies based on whether it's a new block, a block to be recovered, a duplicate block, etc. Please correct me if I am wrong.

          brahmareddy Brahma Reddy Battula added a comment -

          To make the analysis simple:

          1. Reservation happens only when a block is being received through BlockReceiver. Reservation happens in no other place, so there is no need to release anywhere else either.
          2. The BlockReceiver constructor has a try-catch block which releases all the reserved bytes if there is any exception after reserving.
          3. BlockReceiver#receiveBlock() has a try-catch block which releases all the reserved bytes if there is any exception during the receiving process.
          4. During successful receiving of packets, ReplicaInPipeline#setBytesAcked(..) will be called by the PacketResponder.
          5. Once the block is completely received, FsDatasetImpl#finalizeReplica(..) will release all the remaining reserved bytes.

          The only place left is DataXceiver#writeBlock(): an exception can happen after creation of the BlockReceiver and before calling BlockReceiver#receiveBlock(), if connecting to the mirror nodes fails.
          Only in this case will the bytes not be released. But a ReplicaInfo instance will already have been created in the ReplicaMap.

          Here, if the client re-creates the pipeline with the same blockId, the same ReplicaInfo instance will be used, so no extra reservation happens. This can be verified using the same test case as in the patch, but failing the pipeline for append, where abandonBlock will not be called and the pipeline will be recovered for the same block.

          But in the case of fresh block creation, the block will be abandoned and a fresh block with a new pipeline will be requested.
          The old block created at the Datanode will eventually be deleted, BUT the reserved space was never released. That's why you are not seeing many RBW blocks in the RBW directory, yet the reserved space has accumulated to more than 1 TB.

          Though I have given two approaches, #1 (releasing reserved bytes on deletion of ReplicaInPipeline instances) will also cover any hidden cases.

          Hope this helps.
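
          One way to summarize the bookkeeping above as a single invariant (hedged, since the exact field names vary between releases):

          reservedForReplicas = sum over in-flight RBW/TMP replicas of (expected block length - bytes acked so far)

          The leak discussed here is exactly an abandoned replica whose block file is later deleted without its term ever being subtracted from the left-hand side.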

          brahmareddy Brahma Reddy Battula added a comment -

          Can anybody review the attached patch?

          raviprak Ravi Prakash added a comment -

          1. Reservation happens only when a block is being received through BlockReceiver. Reservation happens in no other place, so there is no need to release anywhere else either.

          Thanks for reminding me, Brahma! Do you think we should change reservedForReplicas when a datanode is started up and an older RBW replica is recovered? Specifically, BlockPoolSlice.getVolumeMap's addToReplicasMap(volumeMap, rbwDir, lazyWriteReplicaMap, false). Also it seems to me that, since we aren't calling reserveSpaceForReplica in BlockReceiver but instead at a lower level, we will have to worry about calling releaseReservedSpace at that lower level.

          2. The BlockReceiver constructor has a try-catch block which releases all the reserved bytes if there is any exception after reserving.
          3. BlockReceiver#receiveBlock() has a try-catch block which releases all the reserved bytes if there is any exception during the receiving process.

          Could you please point me to the code where you see this happening? I mean specific instances of FsVolumeImpl.releaseReservedSpace being called with the stack trace.

          The only place left is DataXceiver#writeBlock(): an exception can happen after creation of the BlockReceiver and before calling BlockReceiver#receiveBlock(), if connecting to the mirror nodes fails.

          Do you mean to imply that the places I found in this comment need not call reserveSpaceForReplica / releaseReservedSpace ?

          raviprak Ravi Prakash added a comment -

          To answer one of my own questions: "Could you please point me to the code where you see this happening?"
          In 2, Brahma is likely referring to BlockReceiver:283 -> ReplicaInPipeline:163 -> FsVolumeImpl:480
          In 3, Brahma is likely referring to BlockReceiver:956 -> ReplicaInPipeline:163 -> FsVolumeImpl:480

          brahmareddy Brahma Reddy Battula added a comment -

          Do you think we should change reservedForReplicas when a datanode is started up and an older RBW replica is recovered? Specifically, BlockPoolSlice.getVolumeMap's addToReplicasMap(volumeMap, rbwDir, lazyWriteReplicaMap, false).

          Please note that "reservation" happens only for replicas which are instances of ReplicaInPipelineInterface, i.e. only for ReplicaBeingWritten (RBW) and ReplicaInPipeline (TMP) blocks, and both of these go through BlockReceiver. No reservation is required during recovery (RWR/RUR), as no other data will be written to them; in the recovery-with-DN-restart case for upgrade, the expected block size is not tracked, so no reservation happens there either.

          Also it seems to me that, since we aren't calling reserveSpaceForReplica in BlockReceiver but instead at a lower level, we will have to worry about calling releaseReservedSpace at that lower level.

          Ideally that's correct. But reservation happens once, while release happens gradually for every packet. So the ReplicaInPipeline instance keeps track of how many bytes have been released and how many are yet to be released. Finally, during close/exception, all remaining bytes are released.

          Do you mean to imply that the places I found in this comment need not call reserveSpaceForReplica / releaseReservedSpace ?

          Yes, IMO those places do not need to release any reserved space. As already mentioned above, only ReplicaInPipeline instances need to release, if they reserved.

          arpitagarwal Arpit Agarwal added a comment (edited) -

          Thanks for continuing to work on this, Brahma Reddy Battula, and for your detailed analyses. Your reasoning sounds correct but I'd need more time to check this thoroughly.

          I agree that this is complex, so I am also open to the option of removing the reservation if there is a simpler alternative.

          raviprak Ravi Prakash added a comment -

          Perhaps we can postpone the question of whether RBW blocks which are recovered during a DN start / refresh of storages should have space reserved to another JIRA (since that is not causing the symptoms mentioned in this JIRA)

          Thanks for the explanations Brahma! They are very helpful for me to understand the code.

          Should we also reduce the reservation in FsDatasetImpl.removeVolumes after it.remove();? How about checkAndUpdate?

          I'm trying to figure out why we missed releasing the space during invalidate as you found out. As you correctly point out, we reserve space only when a BlockReceiver is created.

          vinayrpet Vinayakumar B added a comment -

          Good analysis Brahma Reddy Battula.
          Your analysis makes sense.

          IMO Releasing the remaining reserved bytes during invalidation is the better approach to deal with this issue.
          As per my understanding and analysis, currently there is no other place where reserve/release is not consistent. But even if it comes in future, this change will avoid growing reserved space.

          IMO, even though there is a workaround, restarting datanodes every time will be hard on clusters.
          So this should be pushed in before 2.8 comes out.

          arpitagarwal Arpit Agarwal added a comment -

          I took a deeper look at the reservation code. It was painful to see my own lack of thoroughness.

          I mostly agree with Brahma's analysis. Since reservation is done only for replicas created via BlockReceiver, there are a couple of potential culprits where the reservation could be leaked:

          • Failure in DataXceiver#writeBlock after creating the BlockReceiver.
          • BlockReceiver receives an unchecked exception after reserving.

          Also agree with Vinayakumar B that releasing via invalidate is the safer option although it can lead to the reserved space hanging around longer.

          Do we agree on the following summary of the contract for when space should be reserved and released?

          1. Space is reserved only when the on-disk block file is successfully created for an rbw/temporary replica. This is verifiably true in FsDatasetImpl#createTemporary and FsDatasetImpl#createRbw barring OOM when the ReplicaInPipeline/ReplicaBeingWritten is allocated.
          2. Space continues to be reserved as long as there is an rbw/temporary in the volumeMap.
          3. Space must be released either when the replica is finalized or it is invalidated. FsDatasetImpl#finalizeReplica handles the finalize case. Fixing invalidate would close the remaining gap.
            1. Space may be released earlier if a failure is detected earlier e.g. exception in BlockReceiver which we handle today.
            2. Space may also be released incrementally when some bytes are written to disk which is handled via ReplicaInPipeline#setBytesAcked.

          Thanks again for the detailed analysis on this one Brahma, Ravi Prakash and Vinay. Nice work.

          vinayrpet Vinayakumar B added a comment -

          Do we agree on the following summary of the contract for when space should be reserved and released?

          Yes, that was a perfect summary.

          vinayrpet Vinayakumar B added a comment -

          Also agree with Vinayakumar B that releasing via invalidate is the safer option although it can lead to the reserved space hanging around longer.

          Yes, I agree that it will hang around longer. But I think, that's fine as long as the block file is present on disk.

          raviprak Ravi Prakash added a comment -

          This has been a long-standing and complicated problem, and your effort was laudable, Arpit! It'd be great if we can tie this all down. Even if we can't write a comprehensive unit test to ensure all this byte accounting stays correct despite changes in the future, we should go ahead and fix the release of bytes on invalidate.

          brahmareddy Brahma Reddy Battula added a comment -

          Thanks to all. Uploaded the patch.

          The attached test case targets the current problem; the tests available in TestSpaceReservation cover many other cases.
          IMO, if any more tests are required, they can be added in a follow-up JIRA. Right?

          I am thinking this should go into the 2.7.3 release.

          arpitagarwal Arpit Agarwal added a comment -

          Thanks Brahma Reddy Battula, the core change looks fine to me. The test case needs some more work. The failMirrorConnection hook looks unused. Also the DN should be updated to call the hook.

          Using close will not exercise the changed code path since the block is finalized and not invalidated. We probably need to trigger the abandon block path in DataStreamer to trigger this invalidation as you correctly diagnosed earlier.

          If any datanode other than the 1st DN is down during the initial pipeline creation, the block will be abandoned and the write will continue with a new block.
          In this case, the DN previous to the bad link will get an IOE while connecting to the mirror, and in that case the reservedBytes will not be released for that DN.

          Thanks again for sticking with this difficult issue!

          brahmareddy Brahma Reddy Battula added a comment -

          Arpit Agarwal, thanks for the catch.

          The failMirrorConnection hook looks unused. Also the DN should be updated to call the hook.

          Yes, I missed this when going from HDFS-9530-01 to HDFS-9530-02. Now uploaded.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 10m 20s branch-2.7 passed
          +1 compile 1m 37s branch-2.7 passed with JDK v1.8.0_91
          +1 compile 1m 27s branch-2.7 passed with JDK v1.7.0_101
          +1 checkstyle 0m 36s branch-2.7 passed
          +1 mvnsite 1m 26s branch-2.7 passed
          +1 mvneclipse 0m 23s branch-2.7 passed
          +1 findbugs 4m 14s branch-2.7 passed
          +1 javadoc 1m 46s branch-2.7 passed with JDK v1.8.0_91
          +1 javadoc 2m 55s branch-2.7 passed with JDK v1.7.0_101
          +1 mvninstall 1m 10s the patch passed
          +1 compile 1m 35s the patch passed with JDK v1.8.0_91
          +1 javac 1m 35s the patch passed
          +1 compile 1m 20s the patch passed with JDK v1.7.0_101
          +1 javac 1m 20s the patch passed
          -1 checkstyle 0m 30s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 188 unchanged - 2 fixed = 190 total (was 190)
          +1 mvnsite 1m 15s the patch passed
          +1 mvneclipse 0m 15s the patch passed
          -1 whitespace 0m 0s The patch has 1967 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 whitespace 0m 59s The patch 99 line(s) with tabs.
          +1 findbugs 4m 3s the patch passed
          +1 javadoc 1m 45s the patch passed with JDK v1.8.0_91
          +1 javadoc 2m 43s the patch passed with JDK v1.7.0_101
          -1 unit 50m 49s hadoop-hdfs in the patch failed with JDK v1.8.0_91.
          -1 unit 45m 10s hadoop-hdfs in the patch failed with JDK v1.7.0_101.
          -1 asflicense 0m 22s The patch generated 3 ASF License warnings.
          140m 28s



          Reason Tests
          JDK v1.8.0_91 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
            hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
            hadoop.hdfs.TestPread
            hadoop.hdfs.web.TestWebHdfsTokens
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
          JDK v1.7.0_101 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
            hadoop.tools.TestJMXGet
            hadoop.hdfs.server.namenode.TestNNThroughputBenchmark



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:c420dfe
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12811370/HDFS-9530-branch-2.7-002.patch
          JIRA Issue HDFS-9530
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux cfc564b2dc2e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.7 / 138d0f0
          Default Java 1.7.0_101
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/whitespace-tabs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          JDK v1.7.0_101 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15808/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/15808/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15808/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          brahmareddy Brahma Reddy Battula added a comment -

Test failures, checkstyle, and ASF License warnings are unrelated to this patch.

Re-uploaded the trunk patch so that Jenkins runs against it.

          vinayrpet Vinayakumar B added a comment -

Latest patch looks good. +1.
Checkstyle and whitespace comments for the branch-2.7 patch can be ignored; they are not related.

Waiting for the QA report for trunk.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 26s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 31s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 31s trunk passed
          +1 mvnsite 0m 58s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 43s trunk passed
          +1 javadoc 0m 57s trunk passed
          +1 mvninstall 0m 58s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          -1 checkstyle 0m 26s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 189 unchanged - 1 fixed = 191 total (was 190)
          +1 mvnsite 0m 51s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          -1 whitespace 0m 0s The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 findbugs 1m 48s the patch passed
          +1 javadoc 0m 58s the patch passed
          -1 unit 73m 26s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 17s The patch does not generate ASF License warnings.
          93m 18s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDecommissionWithStriped



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:e2f6409
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12811784/HDFS-9530-03.patch
          JIRA Issue HDFS-9530
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 2b5ed2beb667 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fc6b50c
          Default Java 1.8.0_91
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15831/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15831/artifact/patchprocess/whitespace-eol.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15831/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15831/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15831/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15831/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          brahmareddy Brahma Reddy Battula added a comment -

Thanks, Vinayakumar B, for your review. I will hold off on committing until Arpit Agarwal reviews.

          arpitagarwal Arpit Agarwal added a comment -

Hi Brahma Reddy Battula, the test case is still not exercising the changed code path. If you revert the change to FsDatasetImpl.java, the test case still passes, because closing the file finalizes the block.
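(A minimal sketch of how this kind of check can be repeated locally, assuming the patch is applied as an uncommitted change in a working tree that has already been built once with mvn install -DskipTests, and that the new test lives in TestSpaceReservation, per the commit file list later in this thread:)

  cd hadoop-hdfs-project/hadoop-hdfs
  # Drop only the production-side change; the new test stays in place.
  git checkout -- src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
  # Re-run the space-reservation tests; if they still pass, the test never hits the changed path.
  mvn test -Dtest=TestSpaceReservation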

          arpitagarwal Arpit Agarwal added a comment -

I take that back. I tested this again and verified that the invalidate path is exercised with the failMirrorConnection hook in place. I also confirmed that the new test times out without the fix.

+1, thanks Brahma Reddy Battula.

          vinayrpet Vinayakumar B added a comment -

Thanks for reconfirming, Arpit. I had also tested it earlier, before giving a +1 for the 2.7 patch. After seeing your earlier comment I was confused.

          vinayrpet Vinayakumar B added a comment -

One of the checkstyle comments can be fixed before the final commit. It is in a test, though.

          arpitagarwal Arpit Agarwal added a comment -

          Sorry about that. I will hold off committing this in case Brahma Reddy Battula wants to try out his new commit bit.

          vinayrpet Vinayakumar B added a comment -

That is a good one to start his commits with, since it is one of the long-awaited issues. It has been skipped from almost 3 releases, I think.
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #9993 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9993/)
          HDFS-9530. ReservedSpace is not cleared for abandoned Blocks (brahma: rev f2ac132d6a21c215093b7f87acf2843ac8123716)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java
          brahmareddy Brahma Reddy Battula added a comment -

Committed to trunk, branch-2, branch-2.8, branch-2.7, and branch-2.7.3.

Thanks a lot, Arpit Agarwal and Vinayakumar B, for your reviews. And thanks to Fei Hui and the others for reporting this issue.

          srikanth.sampath Srikanth Sampath added a comment -

We are facing issues that may be resolved by this fix. Will this be ported to 2.6.x? Thanks in advance.

          arpitagarwal Arpit Agarwal added a comment -

Hi Srikanth Sampath, thanks for the report.

I just did a dry-run cherry-pick to branch-2.6, and there was a single conflict that looks straightforward to resolve. Brahma Reddy Battula, do you want to take a crack at backporting this to branch-2.6? If not, I can do so.
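(One way to approximate such a dry-run cherry-pick is sketched below; the trunk commit hash from the Hudson message above is used here only as a placeholder for whichever commit is being backported:)

  git checkout branch-2.6
  # Apply the commit to the working tree without committing, to see whether it is clean.
  git cherry-pick --no-commit f2ac132d6a21c215093b7f87acf2843ac8123716
  git status                  # conflicted files, if any, are listed here
  # Then throw the trial away:
  git cherry-pick --abort     # if it stopped on a conflict
  git reset --hard HEAD       # if it applied cleanly and only staged the changes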

          brahmareddy Brahma Reddy Battula added a comment -

Srikanth Sampath, thanks for the report.

Arpit Agarwal, I have uploaded the branch-2.6 patch. Kindly review.

          brahmareddy Brahma Reddy Battula added a comment -

Reopening the issue to attach the branch-2.6 patch and run Jenkins against it.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 15m 43s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 14m 28s branch-2.6 passed
          -1 compile 0m 50s hadoop-hdfs in branch-2.6 failed with JDK v1.8.0_101.
          -1 compile 0m 50s hadoop-hdfs in branch-2.6 failed with JDK v1.7.0_101.
          +1 checkstyle 0m 27s branch-2.6 passed
          +1 mvnsite 1m 0s branch-2.6 passed
          +1 mvneclipse 0m 18s branch-2.6 passed
          -1 findbugs 3m 1s hadoop-hdfs-project/hadoop-hdfs in branch-2.6 has 273 extant Findbugs warnings.
          +1 javadoc 1m 8s branch-2.6 passed with JDK v1.8.0_101
          +1 javadoc 1m 52s branch-2.6 passed with JDK v1.7.0_101
          +1 mvninstall 0m 53s the patch passed
          -1 compile 0m 42s hadoop-hdfs in the patch failed with JDK v1.8.0_101.
          -1 javac 0m 42s hadoop-hdfs in the patch failed with JDK v1.8.0_101.
          -1 compile 0m 44s hadoop-hdfs in the patch failed with JDK v1.7.0_101.
          -1 javac 0m 44s hadoop-hdfs in the patch failed with JDK v1.7.0_101.
          +1 checkstyle 0m 19s the patch passed
          +1 mvnsite 0m 54s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          -1 whitespace 0m 0s The patch has 2722 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          -1 whitespace 1m 2s The patch 107 line(s) with tabs.
          +1 findbugs 3m 2s the patch passed
          +1 javadoc 1m 7s the patch passed with JDK v1.8.0_101
          +1 javadoc 2m 4s the patch passed with JDK v1.7.0_101
          -1 unit 0m 49s hadoop-hdfs in the patch failed with JDK v1.7.0_101.
          -1 asflicense 0m 39s The patch generated 75 ASF License warnings.
          55m 49s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:44eef0e
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12822981/HDFS-9530-branch-2.6.patch
          JIRA Issue HDFS-9530
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 9ed7f3a773d9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.6 / 2dc43a2
          Default Java 1.7.0_101
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_101.txt
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          findbugs v1.3.9
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_101.txt
          javac https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_101.txt
          compile https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          javac https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/whitespace-tabs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          JDK v1.7.0_101 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16376/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/16376/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16376/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          vinayrpet Vinayakumar B added a comment -

Looks like there is some problem compiling libwebhdfs in 2.6.
Do some Docker dependencies need to be updated?

          brahmareddy Brahma Reddy Battula added a comment -

Yes, it seems some Docker dependencies are missing. Pinging Allen Wittenauer.

          aw Allen Wittenauer added a comment -

          It's branch-2.6. You'll need to ping someone who cares about that branch.

          arpitagarwal Arpit Agarwal added a comment -

Thanks for posting the patch, Brahma Reddy Battula. I am out for the rest of this week but will review it next week.

Also, I'd just ignore Jenkins and run the HDFS unit tests locally to check that the patch didn't regress any tests.
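(For anyone repeating that locally, a sketch of the usual way to run the HDFS unit tests, again assuming the tree has been built once with mvn install -DskipTests at the top level:)

  cd hadoop-hdfs-project/hadoop-hdfs
  mvn test                               # full HDFS unit test run; this takes a while
  mvn test -Dtest=TestSpaceReservation   # or narrow the run to the test touched by this patch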

          brahmareddy Brahma Reddy Battula added a comment -

OK, thanks Arpit. I also ran the tests before uploading the patch; the patch did not induce any test failures.

          brahmareddy Brahma Reddy Battula added a comment -

OK, thanks for the feedback, Allen.

          vinodkv Vinod Kumar Vavilapalli added a comment -

Reopening the issue to attach the branch-2.6 patch and run Jenkins against it.

Closing this again for the 2.7.3 release process. If you just want to use Jenkins for the 2.6 patch, you can create a clone and use that.

          arpitagarwal Arpit Agarwal added a comment -

          I've pushed this to branch-2.6 after verifying the affected unit test.

          vinodkv Vinod Kumar Vavilapalli added a comment -

Closing the JIRA as part of the 2.7.3 release.

          brahmareddy Brahma Reddy Battula added a comment -

Arpit Agarwal, thanks for committing to branch-2.6. I think we need to update CHANGES.txt in branch-2.7.

          arpitagarwal Arpit Agarwal added a comment -

They are independent release lines, so IIUC the branch-2.7 CHANGES.txt needs no update.


  People

  • Assignee:
    brahmareddy Brahma Reddy Battula
  • Reporter:
    ferhui Fei Hui
  • Votes:
    0
  • Watchers:
    16
