Hadoop Common / HADOOP-10051

winutils.exe is not included in hadoop bin tarball

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0, 2.4.0, 2.5.0
    • Fix Version/s: None
    • Component/s: bin
    • Labels: None

      Description

      I don't have a Windows environment, but a user who tried the 2.2.0 release
      on Windows reported that the released tarball doesn't contain
      "winutils.exe", so no Hadoop commands can be run. I confirmed that winutils.exe is indeed not included in the 2.2.0 bin tarball.


          Activity

          Tsuyoshi Ozawa added a comment -

          This is the current workaround:
          http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows

          IIUC, all users need to build winutils.exe and hadoop.dll themselves. Is this assumed?
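
          A minimal sketch of that workaround, assuming winutils.exe has already been
          built (or downloaded) into an example directory C:\hadoop\bin; the class name,
          the directory, and the sample path check are illustrative only:

              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.fs.FileSystem;
              import org.apache.hadoop.fs.Path;

              // Hypothetical helper class, not part of Hadoop itself.
              public class WinutilsWorkaround {
                  public static void main(String[] args) throws Exception {
                      // Hadoop's Shell utility resolves winutils.exe from the
                      // "hadoop.home.dir" system property (falling back to the
                      // HADOOP_HOME environment variable) and expects it at
                      // <home>\bin\winutils.exe. "C:\\hadoop" is just an example.
                      System.setProperty("hadoop.home.dir", "C:\\hadoop");

                      // Client code touching the local filesystem after this point
                      // should no longer fail at startup for lack of winutils.exe.
                      FileSystem fs = FileSystem.getLocal(new Configuration());
                      System.out.println(fs.exists(new Path("C:\\Windows")));
                  }
              }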

          Ali S added a comment -

          Does not seem to be fixed in 2.4 either.

          Glaucio Scheibel added a comment -

          I just tested version 2.5, and this issue is still there. Where can I find this .exe?

          Steve Loughran added a comment -

          This problem is going to continue unless/until the Hadoop releases include the native Windows libs.

          Perhaps:

          1. we can build the Windows binaries for every release (off the -src release) and publish them alongside the Hadoop tarball
          2. release them shortly after the Hadoop release

          In the meantime, we can create the Hadoop libs for each release and host them somewhere (inside/outside Apache) for people that want them.

          Ruslan Dautkhanov added a comment -

          Not fixed in 2.6

          Romain Manni-Bucau added a comment -

          Still there in 2.7.

          Steve Loughran added a comment -

          You can pick up a copy here: https://github.com/steveloughran/winutils
          See http://wiki.apache.org/hadoop/WindowsProblems

          Romain Manni-Bucau added a comment -

          Steve Loughran, that's what I'm doing (patching Beam to build on Windows ATM), but it would be saner and better to rely on an ASF binary, or worst case one on Maven Central, rather than a GitHub one.

          Steve Loughran added a comment -

          I did sign the JARs, with the same gpg key that's listed in my Hadoop committer credentials; it was built off the ASF commit ID, and on a dedicated VM that I use for building and testing Hadoop stuff. You can trust it as much as you can any other binary you get from me, and I'm sure your build already passes through code I've written. The main issue with GitHub is durability: how long can you trust it to be there?

          What we are discussing is getting rid of winutils entirely and moving to a JAR containing the native libs, which are then unpacked depending on the platform... the way snappy does. That way there is a JAR in the package or up on Maven. Volunteers to help implement/test that are welcome.
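
          A rough sketch of that snappy-style approach, assuming the native library were
          shipped inside the JAR; the class name, resource path, and platform layout are
          invented for illustration:

              import java.io.InputStream;
              import java.nio.file.Files;
              import java.nio.file.Path;
              import java.nio.file.StandardCopyOption;

              // Hypothetical loader, not existing Hadoop code; the resource path
              // "/native/windows/x86_64/hadoop.dll" is an invented layout.
              public final class BundledNativeLoader {
                  public static void loadHadoopDll() throws Exception {
                      String resource = "/native/windows/x86_64/hadoop.dll";
                      try (InputStream in = BundledNativeLoader.class.getResourceAsStream(resource)) {
                          if (in == null) {
                              throw new UnsatisfiedLinkError("No bundled native library at " + resource);
                          }
                          // System.load() needs an absolute path to a real file, so the
                          // bundled library is unpacked to a temp file before loading.
                          Path tmp = Files.createTempFile("hadoop-native-", ".dll");
                          tmp.toFile().deleteOnExit();
                          Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                          System.load(tmp.toAbsolutePath().toString());
                      }
                  }
              }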

          Romain Manni-Bucau added a comment -

          +1, the GitHub issue being more about GitHub itself than about the signing. What is blocking putting the binaries in the repo to allow a Maven packaging? We do it for TomEE, and while it is limited to a few files, it is acceptable and easy enough, I think.

          Steve Loughran added a comment -

          No real problem, except the current build/release process is done on Linux systems; we'd have to coordinate it more, and there isn't currently a policy in place that the releases must come with Windows runtimes. Feel free to join in on the hadoop common dev list and push for it... even though not many people run Hadoop clusters on Windows, it is more relevant for standalone things downstream where people want to use their tools on Windows.

          Romain Manni-Bucau added a comment -

          An alternative is to support Cygwin. I think a lot of Windows devs have it, so detecting CYGWIN or CYGWIN_VERSION as an environment variable could allow bash-style commands to be used instead of Windows ones in all the cases where winutils is expected but missing.

          WDYT?
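
          A minimal sketch of that detection idea, not something Hadoop currently does;
          whether CYGWIN or CYGWIN_VERSION is actually exported depends on the user's
          Cygwin setup, so both variable names are taken as assumptions from the comment above:

              // Hypothetical detection sketch for the suggestion above; Hadoop does not do this.
              public final class CygwinDetect {
                  static boolean runningUnderCygwin() {
                      // Checks the variables named in the comment above; whether they are
                      // set depends entirely on how the user's shell was configured.
                      return System.getenv("CYGWIN") != null
                          || System.getenv("CYGWIN_VERSION") != null;
                  }

                  public static void main(String[] args) {
                      if (runningUnderCygwin()) {
                          // A port of this idea would branch to POSIX-style shell commands
                          // (chmod, ls, ...) here instead of requiring winutils.exe.
                          System.out.println("Cygwin detected: could fall back to bash-style commands");
                      } else {
                          System.out.println("No Cygwin detected: winutils.exe would still be needed");
                      }
                  }
              }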

          Steve Loughran added a comment -

          That complicates the test setup. I'd rather get rid of winutils and switch directly to an embedded DLL. Apparently the Maven NAR plugin is good for those.

          Chris Nauroth added a comment -

          Historically, going back several years, Hadoop did use Cygwin to achieve partial compatibility on Windows. Ultimately, that approach fell short though. Cygwin makes a set of its own implementation choices on how to map Windows semantics to Unix semantics, and those implementation choices didn't always match with Hadoop's requirements. A few notable areas I remember were mapping NTFS ACLs to Unix-style file permissions, file locking, and child process management. That led to the decision to implement our own native code layer where we could control the semantics.

          Consolidating the native code layer down to just hadoop.dll, without winutils.exe, would simplify this.


            People

             • Assignee: Unassigned
             • Reporter: Tsuyoshi Ozawa
             • Votes: 10
             • Watchers: 23
