LUCENE-3349 (Lucene - Core; sub-task of LUCENE-3335: jrebug causes porter stemmer to sigsegv)

Place warning about today's released Java7 version on Lucene/Solr/Root webpage's news and send mail to java-user

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: general/test
    • Labels:
    • Environment:

      Java7

    • Lucene Fields:
      New

      Description

      Today, JDK/JRE 1.7.0 GA was released by Oracle. Unfortunately, they did not fix the HotSpot bugs that cause some loops to be miscompiled (LUCENE-3335, LUCENE-3346). This can crash Solr on startup with the default configuration or, depending on configuration, cause sudden index corruption.

      We should send an email to the java-user and solr-user lists describing the problem, and also place a note in the news section of the Solr, Lucene Core, and top-level websites.

      I propose the following text:

      Jul 28th, 2011: WARNING: Index corruption and crashes in Apache Lucene Core / Apache Solr with Java 7

      Oracle released Java 7 today. Unfortunately, it contains HotSpot compiler optimizations that miscompile some loops. This can affect the code of several Apache projects. Sometimes the JVM only crashes, but in several cases the calculated results can be incorrect, leading to bugs in applications (see HotSpot bugs http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7044738, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7068051).

      Apache Lucene Core and Apache Solr are two Apache projects affected by these bugs; this applies to all versions released until today. Solr users with the default configuration will see Java crash with SIGSEGV as soon as they start to index documents, as one affected part is the well-known Porter stemmer (see LUCENE-3335). Other loops in Lucene may be miscompiled, too, leading to index corruption (especially on Lucene trunk with the pulsing codec; see LUCENE-3346).

      These problems were detected only 5 days before the official Java 7 release, so Oracle had no time to fix the bugs, which also affect many other applications. In response to our questions, they proposed to include the fixes in service release u2 (possibly already in service release u1, see http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2011-July/005971.html). This means you cannot use Apache Lucene/Solr with Java 7 releases before Update 2! If you do, please don't open bug reports; it is not the committers' fault! At the very least, disable loop optimizations using the -XX:-UseLoopPredicate JVM option to avoid risking index corruption.

      Please note: Java 6 users are also affected if they use one of these JVM options, which are not enabled by default: -XX:+OptimizeStringConcat or -XX:+AggressiveOpts

      It is strongly recommended not to use any HotSpot optimization switches in any Java version without extensive testing!
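The flag advice above could be checked at application startup. The following is only a minimal sketch, not part of Lucene or Solr: the class name `Java7FlagCheck` and the method `warnings` are hypothetical, and treating exactly the version strings "1.7.0" and "1.7.0_01" as affected is an assumption based on the expectation stated above that the fix arrives in Update 2. It uses only the standard `java.version` system property and `RuntimeMXBean.getInputArguments()`.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene/Solr.
final class Java7FlagCheck {

    // Switches the announcement names as risky (off by default on Java 6).
    static final List<String> RISKY_FLAGS =
        Arrays.asList("-XX:+OptimizeStringConcat", "-XX:+AggressiveOpts");

    // Workaround switch the announcement recommends for Java 7.
    static final String WORKAROUND_FLAG = "-XX:-UseLoopPredicate";

    // Returns one message per problem found; an empty list means nothing was detected.
    static List<String> warnings(String javaVersion, List<String> jvmArgs) {
        List<String> out = new ArrayList<String>();
        // Assumption: only 7 GA and 7u1 are affected, as the announcement expects.
        boolean affectedJava7 =
            javaVersion.equals("1.7.0") || javaVersion.equals("1.7.0_01");
        if (affectedJava7 && !jvmArgs.contains(WORKAROUND_FLAG)) {
            out.add("Java " + javaVersion + " without " + WORKAROUND_FLAG);
        }
        // Risky switches are reported on any Java version.
        for (String flag : RISKY_FLAGS) {
            if (jvmArgs.contains(flag)) {
                out.add("risky HotSpot switch enabled: " + flag);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        List<String> jvmArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String warning : warnings(version, jvmArgs)) {
            System.err.println("WARNING: " + warning);
        }
    }
}
```

Such a check only inspects switches passed on the command line; it cannot tell whether a given JVM build actually contains the fixes.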

      In case you upgrade to Java 7, remember that you may have to reindex, as the Unicode version shipped with Java 7 changed and tokenization behaves differently (e.g. lowercasing). For more information, read JRE_VERSION_MIGRATION.txt in your distribution package!
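One way to see whether lowercasing differs between two JVMs is to fingerprint the JRE's lowercase mapping on each and compare the results. This is a rough sketch under assumptions: the class `LowercaseFingerprint` is hypothetical, and it only inspects `Character.toLowerCase(int)` from the JRE, not Lucene's own filters or tokenizers.

```java
// Hypothetical helper, not part of Lucene/Solr.
final class LowercaseFingerprint {

    // Folds every (code point, lowercase form) pair that is not an
    // identity mapping into a single hash value.
    static long fingerprint() {
        long hash = 17L;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            int lower = Character.toLowerCase(cp);
            if (lower != cp) {
                hash = 31L * hash + cp;
                hash = 31L * hash + lower;
            }
        }
        return hash;
    }

    public static void main(String[] args) {
        // Run once per JVM and compare the printed lines.
        System.out.println(System.getProperty("java.version")
            + " " + Long.toHexString(fingerprint()));
    }
}
```

Different fingerprints on two JVMs mean the lowercase mappings differ somewhere; they do not show whether a particular index contains any of the affected characters, so this helps decide whether to investigate further, not whether a reindex is required.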

          Activity

          Robert Muir added a comment -

          +1

          Robert Muir added a comment -

          Uwe asked me to edit the typos, too bad because i really liked corrumption!!!!

          Michael McCandless added a comment -

          +1 scary!

          Uwe Schindler added a comment -

          Should I also send this mail to announce@apache.org, so press/journalists also get the information? This would maybe make Oracle work faster

          Hoss Man added a comment -

          "Should I also send this mail to announce@apache.org"

          Yes, this is exactly the type of thing that list is for...

          http://www.apache.org/foundation/mailinglists.html

          "The Apache Announcements list contains news and announcements about the foundation and its projects. Announcements of major software releases, new projects, and other important news are included. Messages are posted only by the Foundation; there is no discussion."

          (emphasis mine)

          Hoss Man added a comment -

          I might even suggest tweaking the wording so that the initial paragraph gives generic information about how this is a potentially serious problem for all Java applications (at least: ones that use loops), and the subsequent paragraphs then mention specifically how it is known to affect Lucene/Solr.

          That way it will be more visible to people who may not particularly care about Lucene/Solr (but do care about Java).

          Steve Rowe added a comment -

          "In the case you upgrade to Java 7, remember that you have to reindex everything, as the unicode version shipped with Java 7 changed and tokenization behaves differently!"

          StandardTokenizer is not dependent on JVM Unicode version, so this statement is neither true in the strict sense nor in the "average" sense, assuming the "average" user employs StandardTokenizer/Analyzer.

          Uwe Schindler added a comment -

          LowerCaseFilter depends on the Unicode version...

          Maybe we soften it a little to "may need to reindex".

          Uwe Schindler added a comment -

          I changed the text a little bit taking Hoss' and Steven's comments into account.

          Robert Muir added a comment -

          I would say maybe we should remove the advice about 'In case you upgrade to Java 7' and instead recommend that you do not use Java 7.

          Uwe Schindler added a comment -

          Sites updated in SVN and on people.apache.org.

          Uwe Schindler added a comment -

          Mail sent to announce@apache.org; general@lucene.apache.org; java-user@lucene.apache.org; solr-user@lucene.apache.org

          Uwe Schindler added a comment (edited) -

          I posted an article in my (new) blog about the whole Java 7 issue: http://blog.thetaphi.de/2011/07/real-story-behind-java-7-ga-bugs.html


            People

            • Assignee:
              Uwe Schindler
            • Reporter:
              Uwe Schindler
            • Votes:
              0
            • Watchers:
              1
