Lucene - Core / LUCENE-4176

Cannot produce proper collation key for ICUCollatedTermAttributeImpl


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 4.0-BETA, 6.0
    • Component/s: modules/queryparser
    • Labels: None
    • Lucene Fields: New

    Description

      org.apache.lucene.collation.tokenattributes.ICUCollatedTermAttributeImpl returns a hash of the collation key's bytes
      instead of the bytes themselves. Comparisons based on this hash value give incorrect results.
      The code below prints 1 on Lucene 3.6.
      With the current code it prints 0.
      Code to reproduce:

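      // ramDir (Directory), conf (IndexWriterConfig), and analyzer are set up
      // elsewhere and are not shown in this report.
      IndexWriter writer = new IndexWriter(ramDir, conf);
      Document doc = new Document();
      FieldType fieldType = new FieldType();
      fieldType.setIndexed(true);
      fieldType.setStored(true);
      // Index a single document whose "content" field holds a Thai string.
      Field field = new Field("content","เข", fieldType);
      doc.add(field);
      writer.addDocument(doc);
      writer.close();
      IndexSearcher is = new IndexSearcher(DirectoryReader.open(ramDir));
      QueryParser qp = new AnalyzingQueryParser(Version.LUCENE_50,"content", analyzer);

      // Range query over collated terms; the indexed document is expected to match.
      ScoreDoc[] result = is.search(qp.parse("[\u0e01 TO \u0e03]"), null,1000).scoreDocs;
      // Number of hits: 1 on Lucene 3.6, 0 with the affected code.
      System.out.println(result.length);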
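
      The range query above only works if indexed terms compare in collation order. The following is a minimal sketch of that property, not the Lucene or ICU code: it uses the JDK's java.text.Collator as a stand-in for the ICU collator, and the class and method names (CollationKeyOrderSketch, compareUnsigned, inRangeByKeyBytes, keyHash) are hypothetical. Comparing the raw key bytes as unsigned values preserves collation order, whereas a hash of those bytes carries no ordering information and so cannot be used for a range check.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene or ICU.
final class CollationKeyOrderSketch {

    // Compare two byte arrays as unsigned bytes, most significant byte first.
    // A shorter array that is a prefix of a longer one sorts first.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Raw collation key bytes for a string; these are what preserve order.
    static byte[] keyBytes(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Inclusive range check done purely on the raw key bytes.
    static boolean inRangeByKeyBytes(Collator collator, String lower, String text, String upper) {
        byte[] key = keyBytes(collator, text);
        return compareUnsigned(keyBytes(collator, lower), key) <= 0
            && compareUnsigned(key, keyBytes(collator, upper)) <= 0;
    }

    // For contrast only: a hash of the key bytes. Its numeric value has no
    // relation to collation order, so it must not be used as the term.
    static int keyHash(Collator collator, String text) {
        return Arrays.hashCode(keyBytes(collator, text));
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ROOT);
        System.out.println(inRangeByKeyBytes(collator, "a", "b", "c"));
        System.out.println(keyHash(collator, "a") + " " + keyHash(collator, "b"));
    }
}
```

      The sketch relies on the documented contract of java.text.CollationKey.toByteArray(), which states that comparing the byte arrays gives the same result as comparing the keys. Whether the Thai strings from this report fall inside the range depends on the collator and locale data in use, so the sketch deliberately uses plain Latin strings.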


          People

            Assignee: Unassigned
            Reporter: Nattapong Sirilappanich (natta@th.ibm.com)
            Votes: 0
            Watchers: 2
