
Key: LUCENE-1150
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Michael McCandless
Reporter: Nicolas Lalevée
Votes: 0
Watchers: 0
Lucene - Java

The token types of the standard tokenizer are not accessible

Created: 25/Jan/08 10:16 AM   Updated: 08/May/08 07:47 PM
Component/s: Analysis
Affects Version/s: 2.3
Fix Version/s: 2.3.2, 2.4

Time Tracking:
Not Specified

File Attachments:
LUCENE-1150.patch (7 kB), 2008-01-25 12:40 PM, Michael McCandless, licensed for inclusion in ASF works
LUCENE-1150.take2.patch (14 kB), 2008-01-25 07:16 PM, Michael McCandless, licensed for inclusion in ASF works

Lucene Fields: New
Resolution Date: 15/Apr/08 09:09 AM


Description
Because StandardTokenizerImpl is not public, these token types are not accessible:
public static final int ALPHANUM          = 0;
public static final int APOSTROPHE        = 1;
public static final int ACRONYM           = 2;
public static final int COMPANY           = 3;
public static final int EMAIL             = 4;
public static final int HOST              = 5;
public static final int NUM               = 6;
public static final int CJ                = 7;
/**
 * @deprecated this solves a bug where HOSTs that end with '.' are identified
 *             as ACRONYMs. It is deprecated and will be removed in the next
 *             release.
 */
public static final int ACRONYM_DEP       = 8;

public static final String [] TOKEN_TYPES = new String [] {
    "<ALPHANUM>",
    "<APOSTROPHE>",
    "<ACRONYM>",
    "<COMPANY>",
    "<EMAIL>",
    "<HOST>",
    "<NUM>",
    "<CJ>",
    "<ACRONYM_DEP>"
};

So no custom TokenFilter can be based on the token type. In fact, even the StandardFilter cannot be written outside the org.apache.lucene.analysis.standard package.
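
To illustrate what a type-based filter needs from these constants, here is a minimal sketch. It deliberately does not use the Lucene API: SimpleToken and dropTokensOfType are hypothetical stand-ins for a token and a filter, and only the constant values and type labels are taken from the listing above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: SimpleToken and dropTokensOfType are hypothetical stand-ins,
// not Lucene's Token or TokenFilter classes.
final class TokenTypeFilterSketch {

    // Values copied from the constants listed in the description.
    static final int ALPHANUM = 0;
    static final int EMAIL = 4;
    static final int HOST = 5;

    static final String[] TOKEN_TYPES = new String[] {
        "<ALPHANUM>", "<APOSTROPHE>", "<ACRONYM>", "<COMPANY>",
        "<EMAIL>", "<HOST>", "<NUM>", "<CJ>", "<ACRONYM_DEP>"
    };

    /** Hypothetical minimal token: its text plus the type label set by a tokenizer. */
    static final class SimpleToken {
        final String text;
        final String type;

        SimpleToken(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    /** Returns the tokens whose type label differs from TOKEN_TYPES[typeIndex]. */
    static List<SimpleToken> dropTokensOfType(List<SimpleToken> tokens, int typeIndex) {
        String unwanted = TOKEN_TYPES[typeIndex];
        List<SimpleToken> kept = new ArrayList<SimpleToken>();
        for (SimpleToken t : tokens) {
            if (!unwanted.equals(t.type)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SimpleToken> tokens = new ArrayList<SimpleToken>();
        tokens.add(new SimpleToken("contact", TOKEN_TYPES[ALPHANUM]));
        tokens.add(new SimpleToken("dev@example.org", TOKEN_TYPES[EMAIL]));
        tokens.add(new SimpleToken("example.org", TOKEN_TYPES[HOST]));

        // Drop every token typed as an e-mail address and print the rest.
        for (SimpleToken t : dropTokensOfType(tokens, EMAIL)) {
            System.out.println(t.text + " " + t.type);
        }
    }
}
```

The filter can only look up TOKEN_TYPES[EMAIL] if the index constant and the label array are both visible to it, which is the access this issue asks for.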



Subversion Commits
Repository Revision Date User Message
ASF #616248 Tue Jan 29 10:51:44 UTC 2008 mikemccand LUCENE-1150: make StandardAnalyzer tokenizer constants public again (public access was accidentally removed with LUCENE-966)
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizer.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/analysis/TestAnalyzers.java
MODIFY /lucene/java/trunk/CHANGES.txt
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.jflex

Repository Revision Date User Message
ASF #646243 Wed Apr 09 09:31:37 UTC 2008 mikemccand LUCENE-1150: re-instate constants in StandardTokenizer
Files Changed
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/store/FSDirectory.java
MODIFY /lucene/java/branches/lucene_2_3/src/test/org/apache/lucene/analysis/TestAnalyzers.java
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizer.java
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.jflex
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex
MODIFY /lucene/java/branches/lucene_2_3/CHANGES.txt

Repository Revision Date User Message
ASF #648183 Tue Apr 15 08:48:41 UTC 2008 mikemccand LUCENE-1150: put back public tokenImage/TOKEN_TYPES in StandardTokenizer and WikipediaTokenizer
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizer.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/analysis/TestAnalyzers.java
MODIFY /lucene/java/trunk/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.jflex

Repository Revision Date User Message
ASF #648188 Tue Apr 15 09:12:00 UTC 2008 mikemccand LUCENE-1150: put back public tokenImage/TOKEN_TYPES in StandardTokenizer and WikipediaTokenizer
Files Changed
MODIFY /lucene/java/branches/lucene_2_3/src/test/org/apache/lucene/analysis/TestAnalyzers.java
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizer.java
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.java
MODIFY /lucene/java/branches/lucene_2_3/contrib/wikipedia/src/java/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerImpl.jflex
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex