Lucene - Core / LUCENE-3071

PathHierarchyTokenizer adaptation for urls: splits reversed

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels:
      None

      Description

      PathHierarchyTokenizer should be usable to split URLs in a "reversed" way (useful for faceted search against URLs):
      www.site.com -> www.site.com, site.com, com
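The reversed split above can be illustrated with a minimal plain-Java sketch. It does not use Lucene and is not the code from the attached patches; the class name `ReversedSplitSketch` and the method `reversedTokens` are hypothetical names chosen for this illustration. It only reproduces the token list requested in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain Java, not the Lucene tokenizer itself.
final class ReversedSplitSketch {

    // Emits the whole input first, then every suffix that starts
    // right after an occurrence of the delimiter.
    static List<String> reversedTokens(String input, char delimiter) {
        List<String> tokens = new ArrayList<>();
        tokens.add(input);
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == delimiter) {
                tokens.add(input.substring(i + 1));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Host name taken from the issue description.
        System.out.println(reversedTokens("www.site.com", '.'));
    }
}
```

Each emitted token is a suffix of the host name, which is what makes this shape convenient for faceting: every document under `site.com` shares the `site.com` and `com` tokens.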

      Moreover, it should be able to skip a given number of leading (or trailing, if reversed) tokens. For example:
      /usr/share/doc/somesoftware/INTERESTING/PART
      With 4 tokens skipped, this should give:
      INTERESTING
      INTERESTING/PART
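The skip behaviour can be sketched the same way in plain Java. This is an illustration under stated assumptions, not the patch's implementation: the names `SkipTokensSketch` and `pathTokens` are hypothetical, and the sketch assumes that empty components (such as the one before a leading delimiter) do not count as tokens, which is what the example in the description implies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only: plain Java, not the Lucene tokenizer itself.
final class SkipTokensSketch {

    // Splits the path on the delimiter, drops the first `skip` components,
    // then emits the growing prefixes of what remains.
    static List<String> pathTokens(String path, String delimiter, int skip) {
        List<String> parts = new ArrayList<>();
        for (String part : path.split(Pattern.quote(delimiter))) {
            if (!part.isEmpty()) { // assumption: empty components are ignored
                parts.add(part);
            }
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = Math.max(0, skip); i < parts.size(); i++) {
            if (current.length() > 0) {
                current.append(delimiter);
            }
            current.append(parts.get(i));
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Path and skip count taken from the issue description.
        System.out.println(
            pathTokens("/usr/share/doc/somesoftware/INTERESTING/PART", "/", 4));
    }
}
```

With a skip count at least as large as the number of components, the sketch returns an empty list rather than failing.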

      1. LUCENE-3071.patch
        22 kB
        Ryan McKinley
      2. LUCENE-3071.patch
        19 kB
        Ryan McKinley
      3. LUCENE-3071.patch
        18 kB
        Olivier Favre
      4. ant.log.tar.bz2
        15 kB
        Olivier Favre
      5. LUCENE-3071.patch
        18 kB
        Olivier Favre

          People

          • Assignee:
            Ryan McKinley
            Reporter:
            Olivier Favre
          • Votes:
            1
            Watchers:
            1

              Time Tracking

              Estimated: 2h
              Remaining: 2h
              Logged: Not Specified