Hadoop HDFS / HDFS-1508

Ability to do savenamespace without being in safemode

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      In the current code, the administrator can run savenamespace only after putting the namenode in safemode. This means that applications that are writing to HDFS encounter errors because the NN is in safemode. We would like to allow saveNamespace even when the namenode is not in safemode.

      The savenamespace command already acquires the FSNamesystem writelock. There is no need to require that the namenode is in safemode too.

      1. savenamespaceWithoutSafemode.txt
        2 kB
        dhruba borthakur
      2. savenamespaceWithoutSafemode2.txt
        15 kB
        dhruba borthakur
      3. savenamespaceWithoutSafemode3.txt
        18 kB
        dhruba borthakur
      4. savenamespaceWithoutSafemode4.txt
        18 kB
        dhruba borthakur
      5. savenamespaceWithoutSafemode5.txt
        18 kB
        dhruba borthakur

        Issue Links

          Activity

          dhruba borthakur added a comment -

          The namenode need not be in safemode while running the saveNamespace command. The saveNamespace command acquires the FSNamesystem writelock, thus preventing anybody else from modifying the namespace.

          The lease expiry thread in the LeaseManager acquires the FSNamesystem-writelock too, so it is well protected.
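          A minimal sketch of what this amounts to in FSNamesystem (the method and helper names here are assumptions for illustration, not the actual patch):

              // Sketch only: saveNamespace protected by the FSNamesystem write lock,
              // with the "must be in safemode" requirement removed.
              void saveNamespace() throws IOException {
                writeLock();                    // blocks all namespace mutations,
                try {                           // including the lease expiry thread
                  getFSImage().saveFSImage();   // write a fresh image, reset edits
                } finally {
                  writeUnlock();
                }
              }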

          dhruba borthakur added a comment -

          code review here: https://reviews.apache.org/r/125/

          Sanjay Radia added a comment -

          The main difference between doing it in safemode and outside safemode is that the clients will not get the safemode exception but instead will find that they are blocked waiting on the NN; new connections will also be accepted up to the OS's limit.

          Given that the NN will be unresponsive, I suggest adding a -f parameter to indicate that it's okay that it is not in safemode.

          Konstantin Shvachko added a comment -

          Dhruba, what is the use case for that? Why is safemode a problem?
          It could be more complicated than just removing the verification. Think of how it will work with checkpointing, if saveNameSpace() comes in the middle of a checkpoint. I think this was the main reason why it required safe mode.

          Konstantin Shvachko added a comment -

          We can use Sanjay's idea of using saveNamespace with -f to indicate that NN should automatically enter safe mode before saving the namespace and then leave it upon completion.

          Sanjay Radia added a comment -

          > saveNamespace with -f to indicate that NN should automatically enter safe mode
          I was suggesting that -f implies that even though it is not in safemode, go ahead (without putting it in safemode). The motivation being that doing saveNamespace is a big deal as it could make the NN unresponsive for a while.

          I had not thought about the checkpoint issue that Kons mentioned. If saveNamespace comes in the middle of a checkpoint then can the NN ignore the checkpoint sent by the secondary?
          (BTW with HDFS-1073 this problem disappears - one can take as many checkpoints as one likes.)
          If it complicates the BNN/SNN's checkpoint then this Jira should wait till HDFS-1073 is done.

          As far as motivation: Ops wants to do a checkpoint without the BNN/SNN (say it is down) and is willing to make the NN unresponsive for a short while.

          Konstantin Shvachko added a comment -

          > Ops wants to do a checkpoint without the BNN/SNN (say it is down) and is willing to make the NN unresponsive for a short while.

          So they set it into safe mode, saveNamespace, then leave safe mode. A few seconds won't make a difference.

          dhruba borthakur added a comment -

          Thanks Sanjay and Konstantin for looking at this one.

          First the use case: putting the namenode in safemode causes existing applications to fail. This is a severe problem for us. If you are using hdfs for running map-reduce jobs, then putting the namenode in safemode means that tasks fail immediately. A reduce task that has been running for a long long time will fail and has to start all over again. If you are running hbase on hdfs, then hundreds of hbase region servers will die when the namenode goes into safemode.

          @Sanjay: A cluster that runs hbase typically has very few files, less than 100K files. It takes a few seconds to run the savenamespace command. I can generalize: if a user is running hbase on hdfs, then it makes more sense to make savenamespace wait for a few seconds (via the read/write lock) rather than writing special-case code in the hbase region servers to handle SafeModeException. As far as backward compatibility is concerned, I can add a "-f" option to indicate "do the savenamespace even if the namenode is not in safemode", but I still think that this is an option that everybody will use.

          Can one of you explain why we always required the namenode to be in safemode for savenamespace? Isn't it always better to stall the workload rather than to fail the workload?

          @Konstantin: can you please explain the precise problem you have in mind? The rollFSImage() call acquires the FSNamesystem writelock, so it cannot race with saveNamespace. Moreover, saveNamespace truncates the edits log and removes edits.new, while rollFSImage will exit out if it does not find edits.new.

          Sanjay Radia added a comment -

          Dhruba, I understand your use case and do support it, as you can see from my comments. I just wanted the -f flag since on a system with a large image it could have an impact on clients.

          Wrt checkpoints - while the writelock will address the race, we need to make sure that the BNN/SNN does not fail or get confused if it happens to do a checkpoint concurrently.

          dhruba borthakur added a comment -

          > I just wanted the -f flag since on a system

          Sounds good, I will enhance the patch.

          > we need to make sure that the BNN/SNN does not fail or get confused

          It does not get confused. It will (correctly) detect the race and the SecondaryNode/CheckpointNode will try to create a fresh checkpoint from scratch.

          Konstantin Shvachko added a comment -

          > It does not get confused.

          Could you please provide a test case for that.

          dhruba borthakur added a comment -

          This patch introduces a "force" option to the saveNamespace command. If this option is set, then the saveNamespace command is executed even if the namenode is not in safemode. The ClientProtocol number is bumped up by one.

          I am manually testing this patch with concurrent checkpointing and saveNamespace. I am unable to write a unit test that would trigger all or any of the races between saveNamespace and checkpoint.
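          A rough sketch of the force check described above (the names and the exact message are assumptions, not necessarily what the attached patch does):

              // Sketch only: refuse saveNamespace outside safemode unless forced.
              void saveNamespace(boolean force) throws IOException {
                if (!force && !isInSafeMode()) {
                  throw new IOException("Safe mode should be turned ON to save the " +
                      "namespace image, or the force option must be used.");
                }
                writeLock();
                try {
                  getFSImage().saveFSImage();
                } finally {
                  writeUnlock();
                }
              }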

          dhruba borthakur added a comment -

          This addresses Konstantin's request to add a unit test to test the invocation of the saveNamespace command in the middle of a checkpoint.

          dhruba borthakur added a comment -

          Review also available at https://reviews.apache.org/r/125/

          Konstantin Shvachko added a comment -

          > I am unable to write a unit test that would trigger all or any of the races

          This is exactly my point. There is a whole chess game going on underneath with moving files/directories and threads writing in parallel. Changing the position of one pawn can change the outcome of the game.
          If saveNamespace() succeeds we are lucky and the checkpoint fails. If not, then somebody has to clean up the mess, and there are lots of failure scenarios. Todd and I once spent quite some time sorting out all of them. Maybe I am paranoid and your change doesn't change the game, but it needs some convincing argumentation, which is hard.
          That is why I was asking alternatively about the use case. I understand setting the NN in safe mode causes job failures. But why do you need to call saveNamespace()? What is wrong with checkpointing?

          dhruba borthakur added a comment -

          > This addresses Konstanitin's request to add a unit test to test the invocation of saveNamespace command in the middle of a checkpoint.

          Unit test was already part of the last patch I uploaded.

          > needs some convincing argumentation, which is hard.
          I am trying to explain this again: checkpointing occurs every half hour or so. If one of the name.dir directories goes bad, then the namenode is like a car running on a spare tire. It is better not to wait for the next half hour to replace the spare tire, but to fix it right away. Especially important when you are running a realtime load via HBase on HDFS!

          Todd Lipcon added a comment -

          I'm not arguing one way or the other, but I'm curious whether an alternative solution would be to have the 2NN factor in the number of failed name.dirs when deciding to run a checkpoint. Right now we have a config to force a checkpoint after a certain size of edits - why not also have the 2NN check for failed but restorable dirs and force a checkpoint in that case too?
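          A hypothetical sketch of that trigger logic on the 2NN side (the class and accessor names below are assumptions, not the real SecondaryNameNode API):

              // Sketch only: checkpoint when the usual time/size triggers fire, or
              // when the NN reports failed-but-restorable name.dirs.
              class CheckpointTrigger {
                long checkpointPeriodMs;     // existing time-based trigger
                long checkpointSizeBytes;    // existing edits-size trigger
                long lastCheckpointTime;

                boolean shouldCheckpoint(long editsSizeBytes, int failedNameDirs) {
                  boolean timeDue = System.currentTimeMillis() - lastCheckpointTime
                                        >= checkpointPeriodMs;
                  boolean sizeDue = editsSizeBytes >= checkpointSizeBytes;
                  boolean dirDown = failedNameDirs > 0;   // new condition
                  return timeDue || sizeDue || dirDown;
                }
              }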

          dhruba borthakur added a comment -

          Thanks Todd for your comments. I like your idea but it is not mutually exclusive with the idea behind this patch. I think the 2NN can continuously try to factor in fixing failed fs.name.dir while scheduling the next checkpoint. But on most production systems, it is not catastrophic if the 2NN is not continuously running, whereas if one of the name.dirs goes bad, it is essential to fix it and bring it back online as soon as possible.

          Hairong Kuang added a comment -

          I really think that it is a good idea to be able to save the namespace without entering safemode. It's critical for applications like HBase, which cannot afford to lose the NameNode.

          I understand Konstantin's concern about the saveNamespace command racing with checkpointing. But saveNamespace holds the fsnamesystem write lock and changes the image signature. So it should abort any on-going checkpointing, right?

          Hairong Kuang added a comment -

          The patch looks good. Only a couple of minor comments:
          1. DFSAdmin#saveNamespace's javadoc should update the sections "Usage" and "@see".
          2. TestSaveNamespace#testSaveWhileEditsRolledNotInSafemode could save a few lines if it uses Assert#fail after testSaveWhileEditsRolled instead of using gotException.
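          For reference, the pattern suggested in item 2 looks roughly like this (illustrative only; doSomethingThatShouldThrow() is a placeholder, not code from the patch):

              // Instead of tracking a gotException flag and asserting on it ...
              boolean gotException = false;
              try {
                doSomethingThatShouldThrow();
              } catch (IOException e) {
                gotException = true;
              }
              assertTrue(gotException);

              // ... call Assert#fail right after the statement that must throw:
              try {
                doSomethingThatShouldThrow();
                fail("expected an IOException");
              } catch (IOException expected) {
                // expected
              }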

          Sanjay Radia added a comment -

          > There is a whole chess game going on underneath with moving files/directories and threads writing in parallel.
          We really need to get HDFS-1073 completed soon since it simplifies the game from chess to checkers.
          It allows checkpoints to be added without renaming the edits.
          This Jira, btw, is independent of that.

          Konstantin Shvachko added a comment -

          D> It is better not to wait for the next half hour to replace the spare-tire.

          You don't need to wait for the next half hour. Just restart the SNN and the checkpointing begins. saveNamespace assumes manual intervention; so does restarting the SNN.

          H> But saveNamespace holds the fsnamesystem write lock and changes the image signature.

          This only argues that if saveNamespace() succeeds the simultaneous checkpoint will be aborted. What is needed is a comprehensive analysis of failure scenarios.
          But it seems easier just to restart SNN.

          dhruba borthakur added a comment -

          Incorporated Hairong's review comments.

          dhruba borthakur added a comment -

          Hi Konstantin, your suggestion of restarting a daemon that is supposed to be continuously running does not work very well from an operations point of view. In most of our clusters, we have alerts and monitoring to detect failed daemons. If you kill and restart a daemon, all those alerts fire immediately.

          You could argue that there could be a new command-line utility that talks to the Secondary and requests an immediate checkpoint (instead of waiting for the next half hour), but that is a round-about way of doing things, e.g. it does not handle the case when one is not running the Secondary at all.

          Konstantin Shvachko added a comment -

          Dhruba, I don't think our discussion is productive here. My logic is simple. There are 2 ways to move on with this:

          • either you need to address the validity of the proposed changes by analyzing multiple failure scenarios,
          • or you should accept the operational approach and solve the problem with existing features.

          If you are arguing that the operational approach doesn't work for you, then you have to address the validity concerns, which you don't.
          Would you rather risk losing the last image while testing it on a live cluster running on a spare tire?

          I agree with Sanjay that 1073 will simplify the analysis.

          Sanjay Radia added a comment -

          Konstantin, I think there is a use case where one wants to do the manual saveNamespace when one is not running the SNN.
          I do agree with Dhruba that getting the SNN to do an immediate checkpoint is a backwards way of doing this.

          H> But saveNamespace holds the fsnamesystem write lock and changes the image signature. So it should abort any on-going checkpointing, right?
          One of the NN or SNN will start the checkpoint first. Can this be used to get the other one to abort?
          Can we list the various scenarios here?

          We could also document that an operator should not issue the saveNamespace command while the SNN is running - would this be acceptable?

          Hairong Kuang added a comment -

          How about this? If saveNamespace fails, the NameNode automatically enters safe mode. This should address Konstantin's concern.
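          A minimal sketch of that behavior, assuming the force-flag variant discussed above (names are illustrative, not the actual patch):

              // Sketch only: if a forced saveNamespace fails, enter safemode so a
              // concurrent checkpoint cannot do further damage to the image.
              void saveNamespace(boolean force) throws IOException {
                writeLock();
                try {
                  getFSImage().saveFSImage();
                } catch (IOException e) {
                  if (force) {
                    enterSafeMode();   // stop mutations until an admin intervenes
                  }
                  throw e;
                } finally {
                  writeUnlock();
                }
              }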

          Sanjay Radia added a comment -

          Does the following table represent the design?

                                         | Previously Aborted Checkpoint | SaveNamespace                                | Checkpoint
          Previously Aborted Checkpoint  | x                             | Continue SaveNamespace                       | Continue Checkpoint
          SaveNamespace                  | Continue SaveNamespace        | x                                            | Continue SaveNamespace but Abort Checkpoint
          Checkpoint                     | Continue Checkpoint           | Continue SaveNamespace but Abort Checkpoint | x

          In which of the above cases will you enter safemode?

          dhruba borthakur added a comment -

          > if saveNamespace fails, NameNode automatically enters the safe mode.

          +1. I like this idea, especially because it maintains existing failure semantics. Also, this behaviour occurs only if one runs
          bin/hadoop dfs -savenamespace force

          sanjay/konstantin?

          Hairong Kuang added a comment -

          Sanjay, your table summarized all the scenarios. But this jira does not invent a new saveNamespace operation. Most cases have already been addressed by the existing code.

          The only new scenario that this jira introduces is that if savenamespace fails, checkpointing can still continue to run and may corrupt the image. So I suggested putting the namenode in safemode if savenamespace fails. This makes sense operationally: when something critical to the NameNode happens, the NameNode should stop operation and wait for the admin's intervention.

          Sanjay Radia added a comment -

          I don't see a problem with entering safemode except that it is a weird thing (a cmd that has a side effect) - but I agree that a failure of saveNamespace implies that the storage has a serious problem.
          Q. If saveNamespace fails, why will a checkpoint run into a problem? There will be a tmp dir that has probably stuck around.
          Q. In the past a saveNamespace in progress aborted a checkpoint because the NN was in safemode. With this jira, is there a variable called saveNamespaceInProgress that will be used to abort the checkpoint?

          dhruba borthakur added a comment -

          > Q. If saveNamespace fails, why will checkpoint run into a problem?

          savenamespace fails if all directories listed in fs.name.dir are bad. In this case, the succeeding checkpoint will fail too.

          > Q. In the past a saveNamespace in progress aborted a checkpoint because the NN was in safemode.

          In the new code, the saveNamespace call aborts an existing checkpoint. The savenamespace command clears out the edits log. This causes the checkpoint to fail because the rollFSImage call expects to find an edits.new file. That's why a new variable called "saveNamespaceInProgress" is not needed.

          Also, like you said, the existing behaviour is that if an operator wants to run the savenamespace command, he/she first puts the namenode in safemode. This causes a concurrently running checkpoint to fail. This behaviour is retained by this patch. Do you think this patch is good to go in its current incarnation?
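          The abort path being described is roughly the following (an illustrative sketch; the file layout and message are assumptions, not the FSImage code):

              // Sketch only: a checkpoint's rollFSImage aborts when edits.new is
              // gone, which is exactly what saveNamespace leaves behind.
              void rollFSImage() throws IOException {
                File editsNew = new File(editsDir, "edits.new");   // assumed layout
                if (!editsNew.exists()) {
                  // A saveNamespace ran in between: abort this checkpoint; the
                  // 2NN/CheckpointNode will retry from scratch.
                  throw new IOException("Attempt to roll image but edits.new does not exist");
                }
                // ... otherwise rename edits.new over edits and install the image ...
              }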

          dhruba borthakur added a comment -

          If anybody has any other comments on this JIRA, please comment.

          dhruba borthakur added a comment -

          patch does not compile with latest trunk

          dhruba borthakur added a comment -

          Merged patch with latest trunk.

          dhruba borthakur added a comment -

          resubmit patch for Hudson tests.

          I think the patch is good to go as it stands currently.

          Konstantin Shvachko added a comment -

          Dhruba, there is a related problem discussed in HDFS-1597. Could you please verify that it is addressed in your patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12469107/savenamespaceWithoutSafemode5.txt
          against trunk revision 1072023.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 21 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/182//console

          This message is automatically generated.

          Harsh J added a comment -

          I think this makes sense to go in, especially with the feature offered via HDFS-1509.

          Dhruba - Would you have some spare cycles to rebase the patch onto current trunk? If not, I'll get it done within the week.


            People

            • Assignee:
              dhruba borthakur
              Reporter:
              dhruba borthakur
            • Votes:
              0
              Watchers:
              6

              Dates

              • Created:
                Updated:

                Development