SOLR-8590

example/files improvements

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 6.0
    • Component/s: examples
    • Labels: None

    Description

      Several example/files improvements and fixes are warranted:

      • Fix the e-mail and URL field names, which are currently <email>_ss and <url>_ss (with literal angle brackets in the field names), and display these fields in the /browse results rendering
      • Improve quality of extracted phrases
      • Extract, facet, and display acronyms
      • Add sorting controls, possibly for all or some of: last modified date, created date, relevancy, and title
      • Possibly add grouping by doc_type
      • Fix debug mode: it currently does not update the parsed query debug output (this is probably a bug in the data-driven /browse as well)
      • Harden the update script: it currently errors if a document has no "content" field (e.g. when indexing basic CSV), but it should instead skip extraction of e-mail addresses and URLs. Documents without "content" are not quite the use case for example/files, but there is no reason for the update script to error on them.
      • Filter out bogus e-mail addresses: some documents end up with email_ss = "?@[^],\,/^@[$_a-z]" (using the Solr docs/ directory as the dataset)
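
      The last two items could be handled together. Below is a rough Python sketch of the intended logic only; it is not the example's actual update script, and the function name extract_contacts and both regular expressions are assumptions made for illustration. It skips extraction when a document has no usable "content" and uses a deliberately conservative e-mail pattern so that regex-like fragments are not picked up as addresses.

```python
import re

# Hypothetical, deliberately conservative patterns (assumptions for this sketch,
# not the patterns used by example/files).
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_contacts(doc):
    """Add email_ss and url_ss to doc when it has a usable "content" field."""
    content = doc.get("content")
    if not isinstance(content, str) or not content.strip():
        # No "content" (e.g. a basic CSV row): skip extraction instead of erroring.
        return doc

    # De-duplicate and sort so the stored values are stable across runs.
    emails = sorted(set(EMAIL_RE.findall(content)))
    urls = sorted(set(URL_RE.findall(content)))

    # Only set the multi-valued string fields when something was found.
    if emails:
        doc["email_ss"] = emails
    if urls:
        doc["url_ss"] = urls
    return doc
```

      With this sketch, a document without "content" passes through unchanged, and the bogus value quoted above yields no email_ss at all because neither "@" in it is preceded by a valid local part.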

      Attachments

        1. SOLR-8590.patch
          2 kB
          Erik Hatcher

          People

            Assignee: Erik Hatcher
            Reporter: Erik Hatcher
            Votes: 0
            Watchers: 3