  1. Lucene - Core
  2. LUCENE-1150

The token types of the standard tokenizer are not accessible


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3.2, 2.4
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Because StandardTokenizerImpl is not public, these token types are not accessible:

      public static final int ALPHANUM          = 0;
      public static final int APOSTROPHE        = 1;
      public static final int ACRONYM           = 2;
      public static final int COMPANY           = 3;
      public static final int EMAIL             = 4;
      public static final int HOST              = 5;
      public static final int NUM               = 6;
      public static final int CJ                = 7;
      /**
       * @deprecated this solves a bug where HOSTs that end with '.' are identified
       *             as ACRONYMs. It is deprecated and will be removed in the next
       *             release.
       */
      public static final int ACRONYM_DEP       = 8;
      
      public static final String [] TOKEN_TYPES = new String [] {
          "<ALPHANUM>",
          "<APOSTROPHE>",
          "<ACRONYM>",
          "<COMPANY>",
          "<EMAIL>",
          "<HOST>",
          "<NUM>",
          "<CJ>",
          "<ACRONYM_DEP>"
      };
      

      As a result, no custom TokenFilter can be based on the token type. In fact, even the StandardFilter cannot be written outside the org.apache.lucene.analysis.standard package.
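
      To illustrate the problem, here is a minimal, self-contained sketch of type-based filtering. It deliberately does not use the Lucene API: Tok, dropTypes and TokenTypeFilterSketch are hypothetical stand-ins invented for this example. The point is that code outside the package has to copy the type labels by hand from the TOKEN_TYPES array above, since the constants themselves cannot be referenced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenTypeFilterSketch {

    // Type labels copied by hand from the TOKEN_TYPES array quoted above,
    // because the real constants are not reachable from outside the package.
    static final String ALPHANUM = "<ALPHANUM>";
    static final String EMAIL = "<EMAIL>";
    static final String HOST = "<HOST>";

    // Hypothetical stand-in for a token: only its text and its type label.
    static final class Tok {
        final String text;
        final String type;

        Tok(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    // Returns the tokens whose type label is not in the excluded set,
    // preserving their original order.
    static List<Tok> dropTypes(List<Tok> tokens, Set<String> excluded) {
        List<Tok> kept = new ArrayList<Tok>();
        for (Tok t : tokens) {
            if (!excluded.contains(t.type)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
            new Tok("contact", ALPHANUM),
            new Tok("dev@example.org", EMAIL),
            new Tok("example.org", HOST));
        Set<String> excluded = new HashSet<String>(Arrays.asList(EMAIL, HOST));
        for (Tok t : dropTypes(tokens, excluded)) {
            System.out.println(t.text + " " + t.type);
        }
    }
}
```

      Hard-coded copies like these silently go stale if the labels change, which is why exposing the constants publicly is the more robust option.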

        Attachments

        1. LUCENE-1150.patch
          7 kB
          Michael McCandless
        2. LUCENE-1150.take2.patch
          14 kB
          Michael McCandless


            People

            • Assignee:
              mikemccand Michael McCandless
              Reporter:
              hibou Nicolas Lalevée
