Lucene - Core
  1. Lucene - Core
  2. LUCENE-5072

Fix frame injection bug in javadocs generated with Java 6 (and Java 7 prior u25)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.1
    • Fix Version/s: 4.4, 5.0
    • Component/s: general/build
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The Apache Infra / Security team posted to all committers:

      Hi All,

      Oracle has announced [1], [2] a frame injection vulnerability in Javadoc generated by Java 5, Java 6 and Java 7 before update 22.

      [...]

      Please take the necessary steps to fix any currently published Javadoc and to ensure that any future Javadoc published by your project does not contain the vulnerability. The announcement by Oracle includes a link to a tool that can be used to fix Javadoc without regeneration.

      The infrastructure team is investigating options for preventing the publication of vulnerable Javadoc.

      The issue is public and may be discussed freely on your project's dev list.

      Thanks,

      Mark (ASF Infra)

      I fixed all published Javadocs on http://lucene.apache.org (for all historic releases where we have public available Javadocs on the web page).

      The mail also notes that we should not publish javadocs with this javadocs problem in the future. Unfortunately the release manager has to use the latest Java 7u25 version (released 2 days) ago. This would be fine for Lucene trunk (which is Java 7 only).

      But when we generate Javadocs JARs for Lucene 3 and 4, we cannot use Java 7 (to build the official release) because the javadocs would contain e.g. AutoCloaseable interface unless we use a JDK 6 or 5 bootclasspath (like we do for web pages).

      We also want the lucene/solr-*-javadoc.jar files to be correct, but those are built with Java 5 (3.x) or Java 6 (4.x).

      Unfortunately Oracle does not relaese a newer JDK 5 or JDK 6, so its impossible to do a release.

      But Oracle publishes the binary and source code of a "fix tool", that can be run on top of a tree of HTML files, patching all broken files (and only those). You can run it theoretically on the root folder of your harddisk - I did this on the whole lucene.apache.org web site.

      Robert Muir and I were looking for a IVY-compatible solution (the original Oracle tool cannot be automatically downloaded by IVY, as Oracle's website sets cookies and requests license confirmations). We found the following GITHUB project by olamy/karianna:

      https://github.com/AdoptOpenJDK/JavadocUpdaterTool

      As soon as they release the JAR file officially on Maven, we can download it with IVY and use it. This is a Maven Plugin, but it still contains the original source code of Oracle's tool, so we can execute it as ANT task after loading the JAR with IVY's coordinates: <java fork="false" class="..."/>

      In the GITHUB project description they note that you need JDK7 to use the tool, but this is no longer true, the -source/-target is Java 5 now, so we can run it easily.

      I will add the required tasks in common-build.xml's javadoc macro so it post-processes all javadocs and patches vulnerable files. If you build javadocs with a recent JDK, it would do nothing.

      1. LUCENE-5072.patch
        3 kB
        Uwe Schindler
      2. LUCENE-5072.patch
        3 kB
        Uwe Schindler
      3. LUCENE-5072.patch
        2 kB
        Uwe Schindler
      4. LUCENE-5072.patch
        2 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Steve Rowe added a comment -

          As soon as they release the JAR file officially on Maven, we can download it with IVY and use it. This is a Maven Plugin, but it still contains the original source code of Oracle's tool, so we can execute it as ANT task after loading the JAR with IVY's coordinates: <java fork="false" class="..."/>

          See LEGAL-171, where they're discussing whether Apache can distribute this code.

          Show
          Steve Rowe added a comment - As soon as they release the JAR file officially on Maven, we can download it with IVY and use it. This is a Maven Plugin, but it still contains the original source code of Oracle's tool, so we can execute it as ANT task after loading the JAR with IVY's coordinates: <java fork="false" class="..."/> See LEGAL-171 , where they're discussing whether Apache can distribute this code.
          Hide
          Uwe Schindler added a comment - - edited

          Cool, so we could maybe put the tool into our tools folder and use it from there. The original Oracle tool has a forbidden-api problem: It used default charset to process javadocs files

          If LEGAL-171 does not solve the problem, we can use the Maven plugin as it is an internal build tool (like we use svnkit or groovy for our builds) and not linked into the product.

          Show
          Uwe Schindler added a comment - - edited Cool, so we could maybe put the tool into our tools folder and use it from there. The original Oracle tool has a forbidden-api problem: It used default charset to process javadocs files If LEGAL-171 does not solve the problem, we can use the Maven plugin as it is an internal build tool (like we use svnkit or groovy for our builds) and not linked into the product.
          Hide
          Robert Muir added a comment -

          There is no need for any legal decision. We don't need to distribute anything, just invoke the tool in our build. Just like calling javac itself, which is gpl/proprietary.

          Show
          Robert Muir added a comment - There is no need for any legal decision. We don't need to distribute anything, just invoke the tool in our build. Just like calling javac itself, which is gpl/proprietary.
          Hide
          Uwe Schindler added a comment -

          I checked the javadoc pather tool:

          • As said before it uses default encoding to patch files -> die, die, die
          • It only detects a "string signature in a file" -> replace by a <fileSet/> in ANT!
          • It does a simple search/replace of a string -> use ANT copy task to patch with <filterreader/>

          Finally, we dont need to use Oracle's code to recursively patch javadocs, We can do that with a simple fileset after the javadocs run and patch with ANT's own <copy/> task.

          Show
          Uwe Schindler added a comment - I checked the javadoc pather tool: As said before it uses default encoding to patch files -> die, die, die It only detects a "string signature in a file" -> replace by a <fileSet/> in ANT! It does a simple search/replace of a string -> use ANT copy task to patch with <filterreader/> Finally, we dont need to use Oracle's code to recursively patch javadocs, We can do that with a simple fileset after the javadocs run and patch with ANT's own <copy/> task.
          Hide
          Uwe Schindler added a comment - - edited

          Attached is a patch that works without Oracle's proprietary license. The patching is done completely with ANT and its own search/replace logic:

          • Use a fileset on the vulnerable filename pattern **/index.htm*,**/toc.htm*, with restriction to all files that do not contain the "validURL(url)" pattern (which appears after u25)
          • replace the vulnerable call by the javascript code from Oracle's patch

          About License: The Javscript code in this patch is also autogenerated by Oracle's own tool, so its license should not matter (because Oracle's tool prints it in every file we distribute - we just emulate what Oracle's tool does).

          What do you think?

          Show
          Uwe Schindler added a comment - - edited Attached is a patch that works without Oracle's proprietary license. The patching is done completely with ANT and its own search/replace logic: Use a fileset on the vulnerable filename pattern **/index.htm*,**/toc.htm* , with restriction to all files that do not contain the "validURL(url)" pattern (which appears after u25) replace the vulnerable call by the javascript code from Oracle's patch About License: The Javscript code in this patch is also autogenerated by Oracle's own tool, so its license should not matter (because Oracle's tool prints it in every file we distribute - we just emulate what Oracle's tool does). What do you think?
          Hide
          Uwe Schindler added a comment -

          Previous patch had a bug in fileset and includes...

          Show
          Uwe Schindler added a comment - Previous patch had a bug in fileset and includes...
          Hide
          Steve Rowe added a comment - - edited

          About License: The Javscript code in this patch is also autogenerated by Oracle's own tool, so its license should not matter (because Oracle's tool prints it in every file we distribute - we just emulate what Oracle's tool does).

          +1, I agree with this.

          I patched trunk ran ant generate-maven-artifacts using Oracle 1.7.0_21 JDK on OS X - I saw lines like the following in Ant's output, so the macro is doing its job:

          [patch-javadoc] Replaced 1 occurrences in 1 files.
          

          I then unpacked all the *-javadoc.jar files under dist/, then ran the following, which printed nothing, which I judge as success:

          grep -L 'function validURL(url) {' $(grep -l 'function loadFrames() {' $(find . -name index.htm -o -name index.html -o -name toc.htm -o -name toc.html))
          

          Then I ran ant clean documentation and the following script also printed nothing, again success:

          grep -L 'function validURL(url) {' $(grep -l 'function loadFrames() {' $(find {lucene,solr}/build/docs -name index.html -o -name index.htm -o -name toc.htm -o -name toc.html))
          

          +1 for the patch

          Show
          Steve Rowe added a comment - - edited About License: The Javscript code in this patch is also autogenerated by Oracle's own tool, so its license should not matter (because Oracle's tool prints it in every file we distribute - we just emulate what Oracle's tool does). +1, I agree with this. I patched trunk ran ant generate-maven-artifacts using Oracle 1.7.0_21 JDK on OS X - I saw lines like the following in Ant's output, so the macro is doing its job: [patch-javadoc] Replaced 1 occurrences in 1 files. I then unpacked all the *-javadoc.jar files under dist/ , then ran the following, which printed nothing, which I judge as success: grep -L 'function validURL(url) {' $(grep -l 'function loadFrames() {' $(find . -name index.htm -o -name index.html -o -name toc.htm -o -name toc.html)) Then I ran ant clean documentation and the following script also printed nothing, again success: grep -L 'function validURL(url) {' $(grep -l 'function loadFrames() {' $(find {lucene,solr}/build/docs -name index.html -o -name index.htm -o -name toc.htm -o -name toc.html)) +1 for the patch
          Hide
          Uwe Schindler added a comment -

          Minimal update (added expandProperties="false" to replacement text).

          Show
          Uwe Schindler added a comment - Minimal update (added expandProperties="false" to replacement text).
          Hide
          Uwe Schindler added a comment -

          I also checked:

          • I used 1.6.0_32 to build documentation
          • After building I rand the official Oracle tool in check mode: java -jar JavadocUpdaterTool.jar -R -C . and it reported no vulnerability.

          The Javadocs still work (swithcing on/off frames).

          Show
          Uwe Schindler added a comment - I also checked: I used 1.6.0_32 to build documentation After building I rand the official Oracle tool in check mode: java -jar JavadocUpdaterTool.jar -R -C . and it reported no vulnerability. The Javadocs still work (swithcing on/off frames).
          Hide
          Uwe Schindler added a comment -

          Attached is a patch with better "API" to make the macro be reuseable in other projects, to encourage users to use it in their own ant builds.

          The attributes on <patch-javadoc/> are similar to javadoc's encoding with same defaults for charset. It also chaks all possible case-insensitive index.htm* and toc.htm* files as specified in the original Oracle tool.

          Show
          Uwe Schindler added a comment - Attached is a patch with better "API" to make the macro be reuseable in other projects, to encourage users to use it in their own ant builds. The attributes on <patch-javadoc/> are similar to javadoc's encoding with same defaults for charset. It also chaks all possible case-insensitive index.htm* and toc.htm* files as specified in the original Oracle tool.
          Hide
          Uwe Schindler added a comment - - edited

          For all other projects that use ANT and want to fix the javadocs directly after execution of Ant's Javadoc task: Just copy the following Ant macro into your project and invoke it directly after <javadoc/>:

            <!--
              Patch frame injection bugs in javadoc generated files - see CVE-2013-1571, http://www.kb.cert.org/vuls/id/225657
              
              Feel free to use this macro in your own Ant build file. This macro works together with the javadoc task on Ant
              and should be invoked directly after its execution to patch broken javadocs, e.g.:
                <patch-javadoc dir="..." docencoding="UTF-8"/>
              Please make sure that the docencoding parameter uses the same charset like javadoc's docencoding. Default
              is the platform default encoding (like the javadoc task).
              The specified dir is the destination directory of the javadoc task.
            -->
            <macrodef name="patch-javadoc">
              <attribute name="dir"/>
              <attribute name="docencoding" default="${file.encoding}"/>
              <sequential>
                <replace encoding="@{docencoding}" summary="true" taskname="patch-javadoc">
                  <fileset dir="@{dir}" casesensitive="false" includes="**/index.html,**/index.htm,**/toc.html,**/toc.htm">
                    <!-- TODO: add encoding="@{docencoding}" to contains check, when we are on ANT 1.9.0: -->
                    <not><contains text="function validURL(url) {" casesensitive="true" /></not>
                  </fileset>
                  <replacetoken><![CDATA[function loadFrames() {]]></replacetoken>
                  <replacevalue expandProperties="false"><![CDATA[if (targetPage != "" && !validURL(targetPage))
                  targetPage = "undefined";
              function validURL(url) {
                  var pos = url.indexOf(".html");
                  if (pos == -1 || pos != url.length - 5)
                      return false;
                  var allowNumber = false;
                  var allowSep = false;
                  var seenDot = false;
                  for (var i = 0; i < url.length - 5; i++) {
                      var ch = url.charAt(i);
                      if ('a' <= ch && ch <= 'z' ||
                              'A' <= ch && ch <= 'Z' ||
                              ch == '$' ||
                              ch == '_') {
                          allowNumber = true;
                          allowSep = true;
                      } else if ('0' <= ch && ch <= '9'
                              || ch == '-') {
                          if (!allowNumber)
                               return false;
                      } else if (ch == '/' || ch == '.') {
                          if (!allowSep)
                              return false;
                          allowNumber = false;
                          allowSep = false;
                          if (ch == '.')
                               seenDot = true;
                          if (ch == '/' && seenDot)
                               return false;
                      } else {
                          return false;
                      }
                  }
                  return true;
              }
              function loadFrames() {]]></replacevalue>
                </replace>
              </sequential>
            </macrodef>
          
          Show
          Uwe Schindler added a comment - - edited For all other projects that use ANT and want to fix the javadocs directly after execution of Ant's Javadoc task: Just copy the following Ant macro into your project and invoke it directly after <javadoc/>: <!-- Patch frame injection bugs in javadoc generated files - see CVE-2013-1571, http://www.kb.cert.org/vuls/id/225657 Feel free to use this macro in your own Ant build file. This macro works together with the javadoc task on Ant and should be invoked directly after its execution to patch broken javadocs, e.g.: <patch-javadoc dir= "..." docencoding= "UTF-8" /> Please make sure that the docencoding parameter uses the same charset like javadoc's docencoding. Default is the platform default encoding (like the javadoc task). The specified dir is the destination directory of the javadoc task. --> <macrodef name= "patch-javadoc" > <attribute name= "dir" /> <attribute name= "docencoding" default= "${file.encoding}" /> <sequential> <replace encoding= "@{docencoding}" summary= "true" taskname= "patch-javadoc" > <fileset dir= "@{dir}" casesensitive= "false" includes= "**/index.html,**/index.htm,**/toc.html,**/toc.htm" > <!-- TODO: add encoding= "@{docencoding}" to contains check, when we are on ANT 1.9.0: --> <not> <contains text= "function validURL(url) {" casesensitive= "true" /> </not> </fileset> <replacetoken> <![CDATA[function loadFrames() {]]> </replacetoken> <replacevalue expandProperties= "false" > <![CDATA[if (targetPage != "" && !validURL(targetPage)) targetPage = "undefined" ; function validURL(url) { var pos = url.indexOf( ".html" ); if (pos == -1 || pos != url.length - 5) return false; var allowNumber = false; var allowSep = false; var seenDot = false; for (var i = 0; i < url.length - 5; i++) { var ch = url.charAt(i); if ('a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z' || ch == '$' || ch == '_') { allowNumber = true; allowSep = true; } else if ('0' <= ch && ch <= '9' || ch == '-') { if (!allowNumber) return false; } else if (ch == '/' || ch == '.') { if (!allowSep) return false; allowNumber = false; allowSep = false; if (ch == '.') seenDot = true; if (ch == '/' && seenDot) return false; } else { return false; } } return true; } function loadFrames() {]]> </replacevalue> </replace> </sequential> </macrodef>
          Hide
          Sebb added a comment -

          I rand the official Oracle tool in check mode: java -jar JavadocUpdaterTool.jar -R -C . and it reported no vulnerability.

          That's a necessary condition, but not sufficient as the tool only checks certain markers; it does not do a formal analysis of the code. It would not detect an error in the validURL function, nor in the file matching.

          It would be better to compare the output of the Ant tool with the output of the Oracle tool when run against an entire Javadoc tree.

          Ideally this should be repeated with multiple Javadoc trees from different versions of the vulnerable Javadoc tool in different encodings.

          Show
          Sebb added a comment - I rand the official Oracle tool in check mode: java -jar JavadocUpdaterTool.jar -R -C . and it reported no vulnerability. That's a necessary condition, but not sufficient as the tool only checks certain markers; it does not do a formal analysis of the code. It would not detect an error in the validURL function, nor in the file matching. It would be better to compare the output of the Ant tool with the output of the Oracle tool when run against an entire Javadoc tree. Ideally this should be repeated with multiple Javadoc trees from different versions of the vulnerable Javadoc tool in different encodings.
          Hide
          Uwe Schindler added a comment - - edited

          Hi Sebb,
          of course that's what i I did. The confirmation I sent here is just the "final" check. When using correct charset encoding (which is buggy in Oracle's tool, because it uses Oracle's default), the output is identical.

          Show
          Uwe Schindler added a comment - - edited Hi Sebb, of course that's what i I did. The confirmation I sent here is just the "final" check. When using correct charset encoding (which is buggy in Oracle's tool, because it uses Oracle's default), the output is identical.
          Hide
          Uwe Schindler added a comment -

          I am currently also investigating how the tool handles non-official Oracle JDK Javadocs: IBM J9 and Oracle JRockit 6.

          Show
          Uwe Schindler added a comment - I am currently also investigating how the tool handles non-official Oracle JDK Javadocs: IBM J9 and Oracle JRockit 6.
          Hide
          Uwe Schindler added a comment - - edited

          Sebb: the main problem is not only correctness of our tool (implemented in ANT). The tool as provided by Oracle is buggy-as-hell™, so I would never recommend it to use it in any offical build script. The intention of Oracle was to fix already published Javadocs - and it fails to do this correct if your platform's default encoding is not the one of your javadocs!

          As mentioned on MJAVADOC-370 (Codehaus), I would include the "correct" and platform encoding independent solution into Maven (and also Ant!) core libs. Don't use Oracle's buggy, buggy, buggy tool, please - it has more bugs than it fixes!

          Show
          Uwe Schindler added a comment - - edited Sebb : the main problem is not only correctness of our tool (implemented in ANT). The tool as provided by Oracle is buggy-as-hell™, so I would never recommend it to use it in any offical build script. The intention of Oracle was to fix already published Javadocs - and it fails to do this correct if your platform's default encoding is not the one of your javadocs! As mentioned on MJAVADOC-370 (Codehaus) , I would include the "correct" and platform encoding independent solution into Maven (and also Ant!) core libs. Don't use Oracle's buggy, buggy, buggy tool, please - it has more bugs than it fixes!
          Hide
          Uwe Schindler added a comment -

          New patch (with the exact file name pattern, case insensitive as in Oracle's tool), ready to commit!

          I will commit this tomorrow if nobody objects. This will also fix Jenkins javadoc artifacts as mentioned on builds@ao.

          Show
          Uwe Schindler added a comment - New patch (with the exact file name pattern, case insensitive as in Oracle's tool), ready to commit! I will commit this tomorrow if nobody objects. This will also fix Jenkins javadoc artifacts as mentioned on builds@ao.
          Hide
          Uwe Schindler added a comment -

          I filed an ANT issue about this, although it would not really help us (unless we force users to upgrade once a fixed ANT version was released): https://issues.apache.org/bugzilla/show_bug.cgi?id=55132

          Show
          Uwe Schindler added a comment - I filed an ANT issue about this, although it would not really help us (unless we force users to upgrade once a fixed ANT version was released): https://issues.apache.org/bugzilla/show_bug.cgi?id=55132
          Hide
          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues

          Show
          Steve Rowe added a comment - Bulk close resolved 4.4 issues

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Uwe Schindler
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development