JENA-1062: add ConfigurableAnalyzer to jena-text

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Jena 3.0.1
    • Component/s: Text
    • Labels:
      None

      Description

      This is an alternative to JENA-1058 (which implemented a very specific Lucene Analyzer for jena-text). The idea here, based on a comment by Claude Warren on JENA-1058, is to provide a ConfigurableAnalyzer that can be configured with a Tokenizer and (optionally) one or more TokenFilters, like this:

      text:analyzer [
          a text:ConfigurableAnalyzer ;
          text:tokenizer text:KeywordTokenizer ;
          text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)
      ]

      I have some code ready to implement this and will open a PR shortly.
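
To make the tokenizer-plus-filters idea concrete, here is a minimal sketch in plain Java of how such a configuration could behave. It is not the jena-text or Lucene implementation: the class `ConfigurableAnalyzerSketch` and its helper methods are hypothetical names introduced only for illustration, and the ASCII folding shown (Unicode decomposition followed by stripping combining marks) is a rough stand-in that covers fewer characters than Lucene's ASCIIFoldingFilter.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical illustration only: one tokenizer followed by an ordered chain of token filters.
class ConfigurableAnalyzerSketch {
    private final Function<String, List<String>> tokenizer;
    private final List<UnaryOperator<String>> filters;

    ConfigurableAnalyzerSketch(Function<String, List<String>> tokenizer,
                               List<UnaryOperator<String>> filters) {
        this.tokenizer = tokenizer;
        this.filters = filters;
    }

    // Tokenize the input, then pass every token through each filter in order.
    List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String token : tokenizer.apply(text)) {
            String current = token;
            for (UnaryOperator<String> filter : filters) {
                current = filter.apply(current);
            }
            result.add(current);
        }
        return result;
    }

    // Keyword-style tokenizer: the whole input becomes a single token.
    static List<String> keywordTokenizer(String text) {
        return List.of(text);
    }

    // Rough ASCII folding (assumption): decompose, then drop combining marks.
    static String asciiFold(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> filters = List.of(
                ConfigurableAnalyzerSketch::asciiFold,
                ConfigurableAnalyzerSketch::lowerCase);
        ConfigurableAnalyzerSketch analyzer = new ConfigurableAnalyzerSketch(
                ConfigurableAnalyzerSketch::keywordTokenizer, filters);
        // "Caf\u00e9 Soci\u00e9t\u00e9" is "Café Société" written with Unicode escapes.
        System.out.println(analyzer.analyze("Caf\u00e9 Soci\u00e9t\u00e9")); // prints [cafe societe]
    }
}
```

Because the tokenizer is keyword-style, the whole literal stays one token, so this combination would suit case-insensitive and accent-insensitive exact matching rather than word-level search.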

            People

            • Assignee:
              osma Osma Suominen
              Reporter:
              osma Osma Suominen
            • Votes:
              0
              Watchers:
              3
