Hadoop Map/Reduce
  MAPREDUCE-3236

Distcp over hdfs:// passed, but with an error in the JT log, while copying from .20.204 to .20.205 (with useIp=false)

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: None
    • Component/s: jobtracker, security
    • Labels:
      None

      Description

      I tried to copy a file from .20.204 to .20.205 with distcp over hdfs:// while using hadoop.security.token.service.use_ip=false in core-site.xml. The copy was successful, but an "org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal" exception was found in the .20.205 JT log.
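
      For reference, a minimal core-site.xml fragment matching the reporter's setup (the property name is taken from the description above; the snippet is illustrative, not a recommended configuration):

```xml
<!-- core-site.xml on the client: identify token services by hostname
     rather than by IP, as in the reporter's setup -->
<property>
  <name>hadoop.security.token.service.use_ip</name>
  <value>false</value>
</property>
```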

      Attachments

      1. HDFS-2447.patch
        2 kB
        Daryn Sharp
      2. HDFS-2447-1.patch
        7 kB
        Daryn Sharp

        Activity

        Jitendra Nath Pandey added a comment -

        It is expected that the JT can't renew tokens from a different cluster over hdfs:// (i.e., using RPC) if the RPC ports are closed to outside-cluster access. The HTTP ports are usually open, so it should work over hftp. It seems to me this is not a bug.

        Daryn,
        Let's address the issue you pointed out in a separate JIRA.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12499756/HDFS-2447-1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1086//console

        This message is automatically generated.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12499756/HDFS-2447-1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1395//console

        This message is automatically generated.

        Daryn Sharp added a comment -

        Add unit tests.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12499726/HDFS-2447.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1391//console

        This message is automatically generated.

        Daryn Sharp added a comment -

        Moved the privileged renew operation into a new method. The timer task calls it wrapped in an exception handler (as before). The job thread invokes the method directly before scheduling it for renewal.

        Existing commit tests are passing. Will see if I can add a specific test case, and run full test suite overnight.

        Daryn Sharp added a comment -

        The problem was found to be that the JT couldn't contact the remote NN to renew a token due to a firewall. The tasks on the DNs were, however, able to contact the remote NN, so the job succeeded. But the job would have failed if it had run past the token expiration, since the JT was unable to renew the token.

        If the JT has to acquire tokens for a job, and acquisition fails, the job will fail. This is the ideal behavior, but there's a loophole: if the JT finds the token in the job's token cache, then it "assumes" the token must be valid. The reality may be that the token is invalid, canceled, long expired, or the NN can't even be reached. In all of these cases, the tasks get fired off anyway, just to clog up a cluster while they die a long slow death. In fact, on 0.23, it's been observed that tasks using an invalid token will pound on the NN every second; on one cluster this happened for a month!

        The JT immediately issues a token renewal and then uses a timer for future renewals. However, all renewals are done in a timer thread, which means that if the initial renewal fails because the token is bad, the job starts anyway. The simple solution is for the first renewal to occur in the job's context, so an exception will kill the job, while future renewals remain thread-based.
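
        The pattern described in that last paragraph can be sketched as follows. The names here are hypothetical stand-ins, not the actual DelegationTokenRenewal code: the point is only that the first renewal runs synchronously in the job's context so a bad token fails job submission, while later renewals stay on a timer thread whose failures are merely logged.

```java
import java.util.Timer;
import java.util.TimerTask;

// Minimal sketch (hypothetical names) of the proposed fix: synchronous first
// renewal in the job thread, timer-based renewals afterwards.
public class RenewalSketch {
    /** Stand-in for a delegation token; renew() returns the next expiration time. */
    interface Token {
        long renew() throws Exception;
    }

    static final Timer TIMER = new Timer(true); // daemon timer for background renewals

    /** Called from the job's context: an exception here propagates and kills the job. */
    static void registerToken(Token token) throws Exception {
        long nextExpiration = token.renew(); // synchronous first renewal
        scheduleRenewal(token, nextExpiration);
    }

    /** Later renewals remain thread-based, wrapped in an exception handler as before. */
    static void scheduleRenewal(Token token, long when) {
        TIMER.schedule(new TimerTask() {
            @Override
            public void run() {
                try {
                    scheduleRenewal(token, token.renew());
                } catch (Exception e) {
                    System.err.println("Exception renewing token. Not rescheduled: " + e);
                }
            }
        }, Math.max(0, when - System.currentTimeMillis()));
    }

    /** Submit a job holding one token and report the outcome. */
    static String submit(Token token) {
        try {
            registerToken(token);
            return "job started";
        } catch (Exception e) {
            return "job failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Token good = () -> System.currentTimeMillis() + 60_000;          // renews fine
        Token bad = () -> { throw new Exception("Connection refused"); }; // NN unreachable
        System.out.println(submit(good)); // prints "job started"
        System.out.println(submit(bad));  // prints "job failed: Connection refused"
    }
}
```

        With the old behavior, the `bad` token's failure would surface only in the timer thread's log and the job would start anyway; here it aborts submission.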

        Rajit Saha added a comment -

        Ran distcp on a .20.205 client to fetch data from the 204 NN to 205:
        $ hadoop distcp hdfs://<204 NN hostname>:8020/user/hadoopqa/23File out23hdfsfile
        11/10/13 00:13:52 INFO tools.DistCp: srcPaths=[hdfs://< 204 NN hostname>:8020/user/<USER>/23File]
        11/10/13 00:13:52 INFO tools.DistCp: destPath=out23hdfsfile
        11/10/13 00:13:52 INFO util.NativeCodeLoader: Loaded the native-hadoop library
        11/10/13 00:13:52 INFO security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
        11/10/13 00:13:53 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 28 for <USER> on
        < 204 NN hostname>:8020
        11/10/13 00:13:53 INFO security.TokenCache: Got dt for
        hdfs://< 204 NN hostname>:8020/user/<USER>/23File;uri=< 204 NN hostname>:8020;t.service=< 204 NN hostname>:8020
        11/10/13 00:13:54 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 45 for <USER> on
        < 205 NN Hostname >:8020
        11/10/13 00:13:54 INFO security.TokenCache: Got dt for
        out23hdfsfile;uri=< 205 NN Hostname >:8020;t.service=< 205 NN Hostname >:8020
        11/10/13 00:13:54 INFO tools.DistCp: out23hdfsfile does not exist.
        11/10/13 00:13:54 INFO tools.DistCp: sourcePathsCount=1
        11/10/13 00:13:54 INFO tools.DistCp: filesToCopyCount=1
        11/10/13 00:13:54 INFO tools.DistCp: bytesToCopyCount=7.7k
        11/10/13 00:13:54 INFO mapred.JobClient: Running job: job_201110121725_0036
        11/10/13 00:13:55 INFO mapred.JobClient: map 0% reduce 0%
        11/10/13 00:15:06 INFO mapred.JobClient: Task Id : attempt_201110121725_0036_m_000000_0, Status : FAILED
        java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

        11/10/13 00:15:13 INFO mapred.JobClient: map 100% reduce 0%
        11/10/13 00:15:16 INFO mapred.JobClient: Job complete: job_201110121725_0036
        11/10/13 00:15:16 INFO mapred.JobClient: Counters: 22
        11/10/13 00:15:16 INFO mapred.JobClient: Job Counters
        11/10/13 00:15:16 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=76838
        11/10/13 00:15:16 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
        11/10/13 00:15:16 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
        11/10/13 00:15:16 INFO mapred.JobClient: Launched map tasks=2
        11/10/13 00:15:16 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
        11/10/13 00:15:16 INFO mapred.JobClient: File Input Format Counters
        11/10/13 00:15:16 INFO mapred.JobClient: Bytes Read=236
        11/10/13 00:15:16 INFO mapred.JobClient: File Output Format Counters
        11/10/13 00:15:16 INFO mapred.JobClient: Bytes Written=0
        11/10/13 00:15:16 INFO mapred.JobClient: FileSystemCounters
        11/10/13 00:15:16 INFO mapred.JobClient: HDFS_BYTES_READ=8220
        11/10/13 00:15:16 INFO mapred.JobClient: FILE_BYTES_WRITTEN=32199
        11/10/13 00:15:16 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=7840
        11/10/13 00:15:16 INFO mapred.JobClient: distcp
        11/10/13 00:15:16 INFO mapred.JobClient: Files copied=1
        11/10/13 00:15:16 INFO mapred.JobClient: Bytes copied=7840
        11/10/13 00:15:16 INFO mapred.JobClient: Bytes expected=7840
        11/10/13 00:15:16 INFO mapred.JobClient: Map-Reduce Framework
        11/10/13 00:15:16 INFO mapred.JobClient: Map input records=1
        11/10/13 00:15:16 INFO mapred.JobClient: Physical memory (bytes) snapshot=59052032
        11/10/13 00:15:16 INFO mapred.JobClient: Spilled Records=0
        11/10/13 00:15:16 INFO mapred.JobClient: CPU time spent (ms)=420
        11/10/13 00:15:16 INFO mapred.JobClient: Total committed heap usage (bytes)=71761920
        11/10/13 00:15:16 INFO mapred.JobClient: Virtual memory (bytes) snapshot=849117184
        11/10/13 00:15:16 INFO mapred.JobClient: Map input bytes=136
        11/10/13 00:15:16 INFO mapred.JobClient: Map output records=0
        11/10/13 00:15:16 INFO mapred.JobClient: SPLIT_RAW_BYTES=144

        The copy was successful
        ========================
        $hadoop dfs -lsr .
        drwx------ - <USER> hdfs 0 2011-10-13 00:15 /user/<USER>/.staging
        drwx------ - <USER> hdfs 0 2011-10-13 00:15 /user/<USER>/_distcp_logs_4bu2mv
        rw------ 3 <USER> hdfs 0 2011-10-13 00:15 /user/<USER>/_distcp_logs_4bu2mv/part-00000
        rw------ 3 <USER> hdfs 7840 2011-10-13 00:15 /user/<USER>/out23hdfsfile

        205 JT log snippet
        =====================
        2011-10-13 00:14:44,700 ERROR org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal: Exception renewing
        tokenIdent: 00 1c 68 61 64 6f 6f 70 71 61 40 44 45 56 2e 59 47 52 49 44 2e 59 41 48 4f 4f 2e 43 4f 4d 06 6d 61 70 72 65
        64 00 8a 01 32 fa a0 c8 c9 8a 01 33 1e ad 4c c9 1c 02, Pass: 54 79 88 dc 4d 48 09 90 d8 1b 15 6b bd ad 2d f4 d6 33 6c
        cb, Kind: HDFS_DELEGATION_TOKEN, Service: < 204 NN Hostname>:8020. Not rescheduled
        java.net.ConnectException: Call to < 204 NN Hostname>/< 204 NN IP>:8020 failed on connection exception:
        java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
        at org.apache.hadoop.ipc.Client.call(Client.java:1071)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy7.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:118)
        at org.apache.hadoop.hdfs.DFSClient.access$000(DFSClient.java:74)
        at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:360)
        at org.apache.hadoop.security.token.Token.renew(Token.java:311)
        at
        org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask$1.run(DelegationTokenRenewal.java:216)
        at
        org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask$1.run(DelegationTokenRenewal.java:212)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at
        org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.run(DelegationTokenRenewal.java:211)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
        Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:604)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
        at org.apache.hadoop.ipc.Client.call(Client.java:1046)
        ... 16 more
        2011-10-13 00:14:44,701 INFO org.apache.hadoop.hdfs.DFSClient: Renewing HDFS_DELEGATION_TOKEN token 45 for hadoopqa on
        < 205 NN hostname>:8020
        2011-10-13 00:15:05,543 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201110121725_0036_m_000000_0:
        java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)


          People

          • Assignee:
            Daryn Sharp
          • Reporter:
            Rajit Saha
          • Votes:
            0
          • Watchers:
            4

            Dates

            • Created:
            • Updated:
