Lucene - Core: LUCENE-4723

Add AnalyzerFactoryTask to benchmark, and enable analyzer creation via the resulting factories using NewAnalyzerTask

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2
    • Component/s: modules/benchmark
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Benchmark algorithms can't currently use analysis factories. Instead, one must rely on pre-existing analyzers, or write specialized tasks to construct them.

      Now that all analysis components have factories, benchmark algorithms should be able to use them.

      1. LUCENE-4723.patch
        55 kB
        Steve Rowe
      2. LUCENE-4723.patch
        131 kB
        Steve Rowe

        Activity

        Steve Rowe added a comment -

        Patch.

        I extended the algorithm file syntax to allow for double- and single-quoted and nested parenthetical expressions.

        I converted the shingle benchmark to use the new task, and removed NewShingleAnalyzerTask.
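
The quoting and nesting rules described in this comment can be illustrated with a small sketch. It is not the parser from the patch: it is a hypothetical, self-contained helper (`AlgArgSplitter` and `splitTopLevel` are names invented here) that splits a parameter string on top-level commas while leaving commas inside single quotes, double quotes, or nested parentheses alone. The sample parameter string in `main` is also illustrative only, and the sketch assumes no escape sequences inside quoted strings.

```java
import java.util.ArrayList;
import java.util.List;

public class AlgArgSplitter {

    /**
     * Splits a parameter string on commas that are outside quotes
     * and outside any nested parentheses. Hypothetical helper, not Lucene code.
     */
    public static List<String> splitTopLevel(String params) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;   // current parenthesis nesting level
        char quote = 0;  // active quote character, or 0 when not inside a quoted string

        for (int i = 0; i < params.length(); i++) {
            char c = params.charAt(i);
            if (quote != 0) {
                // Inside a quoted string: only the matching quote character ends it.
                if (c == quote) {
                    quote = 0;
                }
                current.append(c);
            } else if (c == '\'' || c == '"') {
                quote = c;
                current.append(c);
            } else if (c == '(') {
                depth++;
                current.append(c);
            } else if (c == ')') {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unbalanced ')' at index " + i);
                }
                depth--;
                current.append(c);
            } else if (c == ',' && depth == 0) {
                // A top-level comma ends the current parameter.
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("Unterminated quoted string");
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unbalanced '('");
        }
        parts.add(current.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        // Illustrative input only; the exact syntax accepted by the benchmark may differ.
        String params = "name:'shingles, bigrams', WhitespaceTokenizer, "
                + "ShingleFilter(maxShingleSize:2, outputUnigrams:false)";
        for (String part : splitTopLevel(params)) {
            System.out.println(part);
        }
    }
}
```

Tracking a quote character and a nesting depth separately keeps the scan to a single pass, and rejects unbalanced input instead of silently producing a malformed parameter.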

        Uwe Schindler added a comment -

        Hey, this is great! Now the 2nd use-case for those factories!

        Robert Muir added a comment -

        I agree!

        Maybe we can put foldToASCII.txt just in the tests folder and not with the alg files in conf? Do we really need it, or could we use something smaller?

        Steve Rowe added a comment -

        Quoting Robert Muir: "Maybe we can put foldToASCII.txt just in the tests folder and not with the alg files in conf? Do we really need it, or could we use something smaller?"

        I agree - I've switched to a trimmed down version of mapping-ISOLatin1Accent.txt and put it in the tests folder.

        I had to make TestPerfTasksLogic copy it from there to the work dir, then tell AnalyzerFactoryTask to use work.dir as the base dir for the FilesystemResourceLoader passed to each analysis pipeline component factory on inform(). I think this should be a fairly general way to allow people to pass in paths to resources: "put them in the work.dir".

        I think it's ready to go.
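
The "put them in the work.dir" convention from this comment can be sketched in plain Java. This is a stand-in for illustration, not Lucene's FilesystemResourceLoader: the class `WorkDirResources`, its methods, and the file name `mapping.txt` are hypothetical. It assumes only that resource names are resolved relative to a single base directory that the test has populated beforehand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class WorkDirResources {

    private final Path baseDir;

    /** Hypothetical loader: every resource name is resolved against baseDir. */
    public WorkDirResources(Path baseDir) {
        this.baseDir = baseDir.toAbsolutePath().normalize();
    }

    /** Returns the path of the named resource under the base directory. */
    public Path resolve(String resource) {
        return baseDir.resolve(resource).normalize();
    }

    /** Reads the named resource as UTF-8 lines. */
    public List<String> readLines(String resource) throws IOException {
        return Files.readAllLines(resolve(resource), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Simulate a test fixture that lives outside the work dir.
        Path fixture = Files.createTempFile("fixture-mapping", ".txt");
        Files.write(fixture, Arrays.asList("\"\\u00C0\" => \"A\""), StandardCharsets.UTF_8);

        // Copy the fixture into the work dir, as the test is described as doing.
        Path workDir = Files.createTempDirectory("work");
        Files.copy(fixture, workDir.resolve("mapping.txt"));

        // Components can now refer to the resource by its bare file name.
        WorkDirResources loader = new WorkDirResources(workDir);
        for (String line : loader.readLines("mapping.txt")) {
            System.out.println(line);
        }
    }
}
```

With a single base directory, an algorithm file can name a resource without knowing where the benchmark was checked out; the caller only has to stage the file in the work dir first.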

        Commit Tag Bot added a comment -

        [trunk commit] Steven Rowe
        http://svn.apache.org/viewvc?view=revision&revision=1439510

        LUCENE-4723: Add AnalyzerFactoryTask to benchmark, and enable analyzer creation via the resulting factories using NewAnalyzerTask.

        Commit Tag Bot added a comment -

        [branch_4x commit] Steven Rowe
        http://svn.apache.org/viewvc?view=revision&revision=1439513

        LUCENE-4723: Add AnalyzerFactoryTask to benchmark, and enable analyzer creation via the resulting factories using NewAnalyzerTask. (merged trunk r* LUCENE-4723: Add AnalyzerFactoryTask to benchmark, and enable analyzer)

        Steve Rowe added a comment -

        Committed to trunk and branch_4x.

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee: Steve Rowe
          • Reporter: Steve Rowe
          • Votes: 0
          • Watchers: 0
