Mesos / MESOS-4585

mesos-fetcher LIBPROCESS_PORT set to 5051 URI fetch failure

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.27.0
    • Fix Version/s: 0.27.1, 0.28.0
    • Component/s: None
    • Labels:

      Description

      When starting a task with an s3a:// URI, the fetcher fails: it aborts when trying to bind to the slave's port 5051. The URI itself gets downloaded successfully, but the error is fatal. If the URI is changed to http://. The root cause appears to be that the mesos-fetcher process has LIBPROCESS_PORT=5051 in its environment, which I found by running cat "/proc/`pgrep mesos-fetcher`/environ".
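The environment check described above can be sketched in Python. On Linux, /proc/<pid>/environ holds NUL-separated KEY=VALUE records, so a small parser makes any LIBPROCESS_* variables easy to spot. The helper names (parse_environ, libprocess_vars, read_environ) and the sample bytes are illustrative assumptions, not part of Mesos.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for record in raw.split(b"\0"):
        # Skip the empty trailing record and anything that is not KEY=VALUE.
        if not record or b"=" not in record:
            continue
        key, _, value = record.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def libprocess_vars(env: Dict[str, str]) -> Dict[str, str]:
    """Keep only the variables whose names start with LIBPROCESS_."""
    return {k: v for k, v in env.items() if k.startswith("LIBPROCESS_")}


def read_environ(pid: int) -> bytes:
    """Read the raw environment of a running process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return f.read()


# Hypothetical sample standing in for the fetcher's real environment.
sample = b"PATH=/usr/bin\0LIBPROCESS_PORT=5051\0HOME=/root\0"
found = libprocess_vars(parse_environ(sample))
print(found)  # {'LIBPROCESS_PORT': '5051'}
```

Against a live fetcher, read_environ(pid) would supply the bytes in place of sample, with pid taken from pgrep mesos-fetcher. Reading another process's environ file normally requires the same user or root.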

      stderr from a failing task:

      I0203 00:11:55.815500 4964 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"s3a:\/\/strava.mesos\/foo"}}],"sandbox_directory":"\/mnt\/mesos\/slaves\/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0\/frameworks\/fe927665-1516-46cf-94dd-6d2ca84007f1-0000\/executors\/uris-test.bc047306-ca0a-11e5-b742-e2162bf6108e\/runs\/24ebd807-b065-4776-a0bf-84bda4a82f01"}
      I0203 00:11:55.816830 4964 fetcher.cpp:379] Fetching URI 's3a://strava.mesos/foo'
      I0203 00:11:55.816846 4964 fetcher.cpp:250] Fetching directly into the sandbox directory
      I0203 00:11:55.816864 4964 fetcher.cpp:187] Fetching URI 's3a://strava.mesos/foo'
      I0203 00:11:56.191640 4964 fetcher.cpp:109] Downloading resource with Hadoop client from 's3a://strava.mesos/foo' to '/mnt/mesos/slaves/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0/frameworks/fe927665-1516-46cf-94dd-6d2ca84007f1-0000/executors/uris-test.bc047306-ca0a-11e5-b742-e2162bf6108e/runs/24ebd807-b065-4776-a0bf-84bda4a82f01/foo'
      F0203 00:11:56.192503 4964 process.cpp:892] Failed to initialize: Failed to bind on 0.0.0.0:5051: Address already in use: Address already in use [98]

      *** Check failure stack trace: ***
          @ 0x7f229ce50e7d google::LogMessage::Fail()
          @ 0x7f229ce52c10 google::LogMessage::SendToLog()
          @ 0x7f229ce50a42 google::LogMessage::Flush()
          @ 0x7f229ce50c89 google::LogMessage::~LogMessage()
          @ 0x7f229ce51c32 google::ErrnoLogMessage::~ErrnoLogMessage()
          @ 0x7f229cdf16b9 process::initialize()
          @ 0x7f229cdf2f36 process::ProcessBase::ProcessBase()
          @ 0x7f229ce22875 process::reap()
          @ 0x7f229ce2ced7 process::subprocess()
          @ 0x7f229c50ab7b HDFS::copyToLocal()
          @ 0x40f03e download()
          @ 0x40b69f main
          @ 0x7f229adc8a40 (unknown)
          @ 0x40cf59 _start
      Aborted (core dumped)
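The "Address already in use [98]" line in the log is EADDRINUSE, which has the value 98 on Linux. As a minimal sketch of the same failure mode outside Mesos, the Python below binds a listening socket and then tries to bind a second socket to the same port. It uses plain sockets rather than libprocess, and try_bind is a hypothetical helper. The last step assumes that a process asking for port 0 gets a free port from the kernel, which is how I understand libprocess to behave when LIBPROCESS_PORT is unset.

```python
import errno
import socket


def try_bind(port: int) -> socket.socket:
    """Bind and listen on 127.0.0.1:port, returning the open socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


# The first socket stands in for the slave, which already owns its port.
# Port 0 asks the kernel for any free port, so the sketch needs no fixed port.
first = try_bind(0)
taken_port = first.getsockname()[1]

# The second bind stands in for a child that inherited the same port setting.
conflict = None
try:
    try_bind(taken_port).close()
except OSError as e:
    conflict = e.errno

print(conflict == errno.EADDRINUSE)

# Without a forced port the child asks for port 0 and gets a different one.
third = try_bind(0)
print(third.getsockname()[1] != taken_port)

first.close()
third.close()
```

The second bind fails because no address-reuse socket options are set, which matches the situation in the trace where process::initialize() cannot bind to 0.0.0.0:5051 while the slave holds it.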

        Attachments

        1. hdfs-stderr.log
          4 kB
          Michael Gummelt

              People

              • Assignee: Shuai Lin (lins05)
              • Reporter: Drew Robb (drewrobb)
              • Votes: 0
              • Watchers: 7
