Hadoop Common / HADOOP-792

Invalid dfs -mv can trash your entire dfs


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels: None

    Description

      If the target path of the dfs -mv command exists within the source path, the dfs becomes corrupt. For example:

      % hadoop dfs -mkdir target
      % hadoop dfs -mv / target

      I'm not certain whether this is reproducible in the current trunk, but I'd bet that it is.
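The failure mode is a rename whose destination lies inside the source subtree. A minimal sketch of the kind of guard that prevents it is below; this is illustrative only, not the code from the attached patch, and the class and method names (`RenameCheck`, `isUnder`, `safeRename`) are invented for this example:

```java
// Sketch (not Hadoop's actual implementation): reject a rename whose
// destination path is the source itself or a descendant of the source,
// which is the condition that triggers this bug.
public class RenameCheck {

    // Returns true when dst equals src or lies underneath src.
    // Trailing slashes are normalized so that "/a" does not match "/ab".
    static boolean isUnder(String src, String dst) {
        String s = src.endsWith("/") ? src : src + "/";
        String d = dst.endsWith("/") ? dst : dst + "/";
        return d.startsWith(s);
    }

    // Returns false (refusing the operation) instead of attempting an
    // impossible move such as "hadoop dfs -mv / target".
    static boolean safeRename(String src, String dst) {
        if (isUnder(src, dst)) {
            return false;
        }
        // ... perform the actual rename here ...
        return true;
    }
}
```

With such a check in place, `safeRename("/", "/target")` fails cleanly rather than leaving the namenode's metadata in an inconsistent state.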

      This problem successfully circumvented my own patch to make dfs -rm a little safer (see my email c.2006-08-30 to nutch-dev for details). I had been deleting old crawl directories from the DFS by copying their names and pasting them into my command buffer. At one point, I paused to do something else, copied some other text (which unfortunately began with a Java comment and included carriage returns), then went back to removing the crawl directories. I must not have pressed hard enough on the "c" key when I did my next copy, since when I pasted into the command buffer, hadoop immediately began executing a dfs -rm / command. No problem - I'm protected, because my patched dfs command is just going to try to move / to /trash (and fail), right?

      Wrong! Even though hadoop isn't really capable of such a move, it apparently tries hard enough to corrupt the namenode's DB.

      Thankfully, I ran into this problem at a relatively opportune time, when the contents of my dfs had little value.

      Attachments

        1. renameerrorcode.patch
          0.5 kB
          Dhruba Borthakur


          People

            Assignee: Unassigned
            Reporter: Chris Schneider (schmed)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved: