Hadoop Common / HADOOP-2760

Distcp must be able to copy from .16 to .15

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Typically the source ("from") files are referenced via HTTP, a strategy for avoiding version incompatibilities. In 0.16, with permission checking, there is no guarantee that any particular file can be read via the web server.

      There must be some alternative way to supply user credentials so that copies to an older version remain possible.

          Activity

          Doug Cutting added a comment -

          Shouldn't this be a 0.16 bug? We shouldn't force folks to upgrade their 0.15 clusters before they can access data from a 0.16 cluster, but rather just fix 0.16 to interoperate with 0.15.

          Mukund Madhugiri added a comment -

          Do we even need this? It works from 0.15 if the data in 0.16 is world readable. If somebody is pulling data from a "secure" cluster into an "unsecure" cluster, the data is going to be world readable anyway. I tried the scenarios, and here are the results:

          1. distcp initiated from 0.15.3 and 0.16.0 permissions turned OFF:

          • Directory with permission 777, data copied from 0.16.0 to 0.15.3 - WORKS
          • Directory with permission 770, data copied from 0.16.0 to 0.15.3 - WORKS
          • Directory with permission 700, data copied from 0.16.0 to 0.15.3 - WORKS
          • Directory with permission 000, data copied from 0.16.0 to 0.15.3 - WORKS

          2. distcp initiated from 0.15.3 and 0.16.0 permissions turned ON:

          • Directory with permission 777, data copied from 0.16.0 to 0.15.3 - WORKS
          • Directory with permission 770, data copied from 0.16.0 to 0.15.3 - DOES NOT WORK
          • Directory with permission 700, data copied from 0.16.0 to 0.15.3 - DOES NOT WORK
          • Directory with permission 000, data copied from 0.16.0 to 0.15.3 - DOES NOT WORK

          3. distcp initiated from 0.16.0 and 0.16.0 permissions turned ON:

          • data copied from 0.15.3 to 0.16.0 - WORKS

          4. distcp initiated from 0.16.0 and 0.16.0 permissions turned OFF:

          • data copied from 0.15.3 to 0.16.0 - WORKS
          Tsz Wo Nicholas Sze added a comment -

          > Do we even need this? It works from 0.15 if data in 0.16 is world readable.

          +1: I agree that we might not need to fix this issue.

          More details:
          For distcp from a 0.16 cluster to a 0.15 cluster, 0.15 distcp does not provide a ugi, so the default web account (the value specified by dfs.web.ugi in the conf; the default is "webuser,webgroup") is used for permission checking. Therefore, with permissions OFF or with mode 777, it should work for any web account. For cases like 700, however, it depends on the owner of the directory: it will work if the owner is the web account or if the web account is a superuser.
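
          The check described above can be sketched as a small model. This is a simplified illustration in Python, not Hadoop's actual code: the names Ugi and can_list_and_read are hypothetical, the superuser is reduced to a single user name, and the owner "alice" and group "staff" are made up. It assumes a POSIX-style owner/group/other check and that copying a directory requires both read and execute permission on it.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class Ugi:
    """A user plus its groups, e.g. parsed from "webuser,webgroup" (hypothetical helper)."""
    user: str
    groups: frozenset

    @classmethod
    def parse(cls, value):
        # First field is the user name; any remaining fields are group names.
        user, *groups = [part.strip() for part in value.split(",")]
        return cls(user, frozenset(groups))


def can_list_and_read(mode, owner, group, ugi, permissions_on=True, superuser=None):
    """Return True if `ugi` may list and traverse a directory with the given mode.

    Simplified model: permission checking can be switched off entirely, a
    single superuser name bypasses all checks, and otherwise exactly one of
    the owner/group/other permission classes applies.
    """
    if not permissions_on or ugi.user == superuser:
        return True
    if ugi.user == owner:
        bits = (mode >> 6) & 7   # owner class
    elif group in ugi.groups:
        bits = (mode >> 3) & 7   # group class
    else:
        bits = mode & 7          # other class
    needed = READ | EXECUTE
    return (bits & needed) == needed


# Default web account, checked against a directory owned by a different user.
web = Ugi.parse("webuser,webgroup")
for mode in (0o777, 0o770, 0o700, 0o000):
    ok = can_list_and_read(mode, owner="alice", group="staff", ugi=web)
    print(oct(mode), "WORKS" if ok else "DOES NOT WORK")
```

          With the default web account and a directory owned by someone else, only mode 777 passes in this model, which matches the pattern reported for scenario 2 in the earlier comment. Passing permissions_on=False, making the web account the owner, or naming it as the superuser makes the 700 case pass as well.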

          Milind Bhandarkar added a comment -

          Since 0.15 does not have permissions anyway, it makes sense that only data that is world-readable in 0.16 can be copied from 0.16 to 0.15 via hftp.

          So, +1 to Mukund's analysis. This does not need to be fixed in 0.15.


            People

            • Assignee: Tsz Wo Nicholas Sze
            • Reporter: Robert Chansler