SOLR-6388: Update Apache TIKA 1.5's Apache POI dependency to 3.10.1

    Details

      Description

      TIKA 1.5 currently uses Apache POI 3.10-beta2 to extract Microsoft Office documents. Apache POI released 3.10.1 today (waiting for Maven Central...).

      We should upgrade the Solr POI dependency to 3.10.1, because the older version has various problems.

      1. SOLR-6388.patch
        7 kB
        Uwe Schindler

        Activity

        Uwe Schindler added a comment -

        Patch. Maven Central is now up-to-date.

        ASF subversion and git services added a comment -

        Commit 1618603 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1618603 ]

        SOLR-6388: Update Apache TIKA 1.5's Apache POI dependency to 3.10.1

        ASF subversion and git services added a comment -

        Commit 1618604 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1618604 ]

        Merged revision(s) 1618603 from lucene/dev/trunk:
        SOLR-6388: Update Apache TIKA 1.5's Apache POI dependency to 3.10.1

        Uwe Schindler added a comment -

        FYI: This update was needed for:

        Hello Apache Solr Users,

        the Apache Lucene PMC wants to make the users of Solr aware of the following issue:

        Apache Solr versions 4.8.0, 4.8.1, and 4.9.0 bundle Apache POI 3.10-beta2 with their binary release tarballs. This version of Apache POI (and all previous ones) is vulnerable to the following issues:

        CVE-2014-3529: XML External Entity (XXE) problem in Apache POI's OpenXML parser
        Type: Information disclosure
        Description: Apache POI uses Java's XML components to parse OpenXML files produced by Microsoft Office products (DOCX, XLSX, PPTX,...). Applications that accept such files from end users are vulnerable to XML External Entity (XXE) attacks, which allow remote attackers to bypass security restrictions and read arbitrary files via a crafted OpenXML document that provides an XML external entity declaration in conjunction with an entity reference.

        CVE-2014-3574: XML Entity Expansion (XEE) problem in Apache POI's OpenXML parser
        Type: Denial of service
        Description: Apache POI uses Java's XML components and Apache XMLBeans to parse OpenXML files produced by Microsoft Office products (DOCX, XLSX, PPTX,...). Applications that accept such files from end users are vulnerable to XML Entity Expansion (XEE) attacks ("XML bombs"), which allow remote attackers to consume large amounts of CPU resources.

        The Apache POI PMC released a bugfix version (3.10.1) today.

        Solr users are affected by these issues if they enable the "Apache Solr Content Extraction Library (Solr Cell)" contrib module from the folder "contrib/extraction" of the release tarball.

        Users of Apache Solr are strongly advised to keep the module disabled if they don't use it. Alternatively, users of Apache Solr 4.8.0, 4.8.1, or 4.9.0 can update the affected libraries by replacing the vulnerable JAR files in the distribution folder. Users of previous versions have to update their Solr release first; patching older versions is impossible.

        To replace the vulnerable JAR files, follow these steps:

        • Download the Apache POI 3.10.1 binary release: http://poi.apache.org/download.html#POI-3.10.1
        • Unzip the archive
        • Delete the following files in your "solr-4.X.X/contrib/extraction/lib" folder: poi-3.10-beta2.jar, poi-ooxml-3.10-beta2.jar, poi-ooxml-schemas-3.10-beta2.jar, poi-scratchpad-3.10-beta2.jar, xmlbeans-2.3.0.jar
        • Copy the following files from the base folder of the Apache POI distribution to the "solr-4.X.X/contrib/extraction/lib" folder: poi-3.10.1-20140818.jar, poi-ooxml-3.10.1-20140818.jar, poi-ooxml-schemas-3.10.1-20140818.jar, poi-scratchpad-3.10.1-20140818.jar
        • Copy "xmlbeans-2.6.0.jar" from POI's "ooxml-lib/" folder to the "solr-4.X.X/contrib/extraction/lib" folder.
        • Verify that the "solr-4.X.X/contrib/extraction/lib" folder no longer contains any files with version number "3.10-beta2".
        • Verify that the folder contains one xmlbeans JAR file with version 2.6.0.
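
The steps above can also be scripted. The following is a minimal sketch in Python, not part of the original advisory: the function name `replace_poi_jars` and its two parameters are hypothetical, and it assumes the POI 3.10.1 archive has already been downloaded and unzipped. The JAR file names and the "ooxml-lib/" subfolder are taken from the steps above. The sketch checks that every replacement file is present before deleting anything, then performs the two verification steps.

```python
from pathlib import Path
import shutil

# File names as listed in the advisory steps above.
OLD_JARS = [
    "poi-3.10-beta2.jar",
    "poi-ooxml-3.10-beta2.jar",
    "poi-ooxml-schemas-3.10-beta2.jar",
    "poi-scratchpad-3.10-beta2.jar",
    "xmlbeans-2.3.0.jar",
]
NEW_POI_JARS = [
    "poi-3.10.1-20140818.jar",
    "poi-ooxml-3.10.1-20140818.jar",
    "poi-ooxml-schemas-3.10.1-20140818.jar",
    "poi-scratchpad-3.10.1-20140818.jar",
]
NEW_XMLBEANS_JAR = "xmlbeans-2.6.0.jar"


def replace_poi_jars(solr_lib: Path, poi_dist: Path) -> None:
    """Hypothetical helper: swap the vulnerable JARs in solr_lib for the
    ones found in an already unzipped POI 3.10.1 distribution (poi_dist)."""
    # Locate every replacement file first, so nothing is deleted
    # if the POI distribution is incomplete.
    sources = [poi_dist / name for name in NEW_POI_JARS]
    sources.append(poi_dist / "ooxml-lib" / NEW_XMLBEANS_JAR)
    missing = [str(p) for p in sources if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing replacement JARs: " + ", ".join(missing))

    # Delete the vulnerable files; ignore any that are already gone.
    for name in OLD_JARS:
        old = solr_lib / name
        if old.exists():
            old.unlink()

    # Copy the fixed JARs into the Solr Cell lib folder.
    for src in sources:
        shutil.copy2(src, solr_lib / src.name)

    # Verification step 1: no file with "3.10-beta2" in its name remains.
    leftovers = sorted(p.name for p in solr_lib.iterdir() if "3.10-beta2" in p.name)
    if leftovers:
        raise RuntimeError("vulnerable JARs still present: " + ", ".join(leftovers))

    # Verification step 2: exactly one xmlbeans JAR, version 2.6.0.
    xmlbeans = sorted(p.name for p in solr_lib.glob("xmlbeans-*.jar"))
    if xmlbeans != [NEW_XMLBEANS_JAR]:
        raise RuntimeError("unexpected xmlbeans JARs: " + ", ".join(xmlbeans))
```

A call might look like `replace_poi_jars(Path("solr-4.X.X/contrib/extraction/lib"), Path("poi-3.10.1"))`, where "poi-3.10.1" is an assumed name for the unzipped folder and "4.X.X" stands for the installed Solr version. Stopping Solr before running such a script is probably advisable.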

        If you just want to disable extraction of Microsoft Office documents, delete the files above and don't replace them. "Solr Cell" will automatically detect this and disable Microsoft Office document extraction.

        Upcoming versions of Apache Solr will have the updated libraries bundled.

        Happy Searching and Extracting,
        The Apache Lucene Developers

        PS: Thanks to Stefan Kopf, Mike Boufford, and Christian Schneider for reporting these issues!

        ASF subversion and git services added a comment -

        Commit 1618959 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1618959 ]

        SOLR-6388: Add changes entry

        ASF subversion and git services added a comment -

        Commit 1618960 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1618960 ]

        Merged revision(s) 1618959 from lucene/dev/trunk:
        SOLR-6388: Add changes entry

        ASF subversion and git services added a comment -

        Commit 1625908 from Uwe Schindler in branch 'dev/branches/lucene_solr_4_9'
        [ https://svn.apache.org/r1625908 ]

        Merged revision(s) 1618604, 1618960 from lucene/dev/branches/branch_4x:
        Merged revision(s) 1618603 from lucene/dev/trunk:
        SOLR-6388: Update Apache TIKA 1.5's Apache POI dependency to 3.10.1
        ........
        Merged revision(s) 1618959 from lucene/dev/trunk:
        SOLR-6388: Add changes entry

        Michael McCandless added a comment -

        Bulk close for Lucene/Solr 4.9.1 release


          People

          • Assignee: Uwe Schindler
          • Reporter: Uwe Schindler