Solr / SOLR-2635

FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings

    Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      The current (old) analysis page exposes the filter and tokenizer settings; the FieldAnalysisRequestHandler does not.

      This information is already available in the Schema Browser (through the LukeRequestHandler), so we could load it in parallel and pick out the required information. It would be easier, though, if the FieldAnalysisRequestHandler returned it directly, so that all relevant information is in one place.

          Activity

          Uwe Schindler added a comment -

          How would you expose the args map? The problem with the current NamedList is that it's not easy to insert that in a backwards-compatible way.

          I am currently looking into it; hopefully I will find a solution.

          Stefan Matheis (steffkes) added a comment -

          Maybe we can append this list to the existing output, the way it is currently done for highlighting in the select handler?
          Just a suggestion:

          <?xml version="1.0" encoding="UTF-8"?>
          <response>
              <lst name="responseHeader">
                  <int name="status">0</int>
                  <int name="QTime">37</int>
              </lst>
              <lst name="analysis">
                  <!-- .. -->
              </lst>
              <lst name="settings">
                  <lst name="field_types">
                      <lst name="text_general_rev">
                          <lst name="index">
                              <arr name="org.apache.lucene.analysis.standard.StandardTokenizer">
                                  <lst>
                                      <!-- settings -->
                                  </lst>
                              </arr>
                          </lst>
                      </lst>
                  </lst>
              </lst>
          </response>

          That will work without problems as long as each filter and tokenizer is used only once. If one of them is used more than once, the relation is defined only by the order of the list. We could add a counter to the existing output, and then that would not be a problem either :>

          Uwe Schindler added a comment -

          This solution might work; I just don't like it, because it decouples the settings from the output and makes correlation harder. But that's of course the same for highlighting.

          The list of tokenizers and filters is not necessarily unique, but the order would be, so access via index (like for highlighting) is fine. It's possible to add the same TokenFilter at several places in the analysis chain, so a lookup by class name is impossible.
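
          To make the point about repeated filters concrete, here is a minimal sketch in Python. The chain and the setting values are made up for illustration and are not taken from any real schema; the class names assume the Lucene 4.x package layout. It only shows that keying settings by class name collapses duplicates, while keying by position does not.

```python
# Hypothetical analysis chain: the same filter class appears twice with
# different settings. The chain and the setting values are illustrative only.
chain = [
    ("org.apache.lucene.analysis.standard.StandardTokenizer", {}),
    ("org.apache.lucene.analysis.core.StopFilter", {"words": "stopwords_a.txt"}),
    ("org.apache.lucene.analysis.core.LowerCaseFilter", {}),
    ("org.apache.lucene.analysis.core.StopFilter", {"words": "stopwords_b.txt"}),
]

# Lookup by class name: the second StopFilter silently overwrites the first.
by_name = {name: settings for name, settings in chain}

# Lookup by position: every stage keeps its own settings.
by_index = [settings for _name, settings in chain]

print(len(chain), len(by_name))   # four stages, but fewer name-keyed entries
print(by_index[1], by_index[3])   # both StopFilter configurations survive
```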

          Stefan Matheis (steffkes) added a comment -

          Hm yes, correct :/ Then what about an additional settings=true parameter for this handler, which adds a second <lst> element with the settings in use?

          <arr name="org.apache.lucene.analysis.standard.StandardTokenizer">
              <lst>
                  <!-- .. existing output ..  -->
              </lst>
              <lst name="settings">
                  <!-- settings -->
              </lst>
          </arr>

          The JSON output for this handler is already not the best, but it should still be usable.
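
          For illustration, a request carrying the proposed flag might be built as in the sketch below. This is only a sketch under assumptions: analysis.fieldtype and analysis.fieldvalue are existing parameters of the handler, /analysis/field is assumed to be where the handler is registered, and settings is the parameter suggested in this comment, which does not exist in Solr. The helper name field_analysis_url and the base URL are hypothetical.

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, field_type, value, with_settings=False):
    """Build a FieldAnalysisRequestHandler URL (hypothetical helper).

    "settings" is the parameter proposed in this issue; it is not an
    existing Solr parameter.
    """
    params = [
        ("analysis.fieldtype", field_type),
        ("analysis.fieldvalue", value),
        ("wt", "xml"),
    ]
    if with_settings:
        params.append(("settings", "true"))  # proposed, hypothetical
    # Assumes the handler is registered at /analysis/field.
    return base_url.rstrip("/") + "/analysis/field?" + urlencode(params)


url = field_analysis_url(
    "http://localhost:8983/solr", "text_general_rev", "Hello World",
    with_settings=True,
)
print(url)
```

          Making the flag opt-in keeps the default response unchanged, so existing clients would not see the extra <lst>.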

          Uwe Schindler added a comment -

          I was already thinking about an extra param to enable the settings. But, as with highlighting, we should add them as a separate list related via lst index. Is this fine?

          To fix the output perfectly, each list inside the analysis component array should have a key like "tokens" or "settings", but that would make it incompatible. The CharFilter output would also need some improvements (I would prefer to return the CharFilter output like a single token in the other components; currently it is one level higher and has no <lst>). But that's out of scope for this issue.
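
          A client could correlate the two lists by position roughly as in the following Python sketch. The response layout here is an assumption made for illustration: the thread does not fix the exact shape of the separate settings list, the analysis part is heavily simplified, and the setting names and values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified response: a "settings" list holds one <lst> per
# analysis stage, in the same order as the stages under "analysis".
RESPONSE = """<response>
  <lst name="analysis">
    <lst name="index">
      <arr name="org.apache.lucene.analysis.standard.StandardTokenizer"/>
      <arr name="org.apache.lucene.analysis.core.StopFilter"/>
      <arr name="org.apache.lucene.analysis.core.StopFilter"/>
    </lst>
  </lst>
  <lst name="settings">
    <arr name="index">
      <lst><str name="maxTokenLength">255</str></lst>
      <lst><str name="words">stopwords_a.txt</str></lst>
      <lst><str name="words">stopwords_b.txt</str></lst>
    </arr>
  </lst>
</response>"""

root = ET.fromstring(RESPONSE)
stages = root.findall("./lst[@name='analysis']/lst[@name='index']/arr")
settings = root.findall("./lst[@name='settings']/arr[@name='index']/lst")

# Pair stage i with settings entry i; class names are never used as keys,
# so a filter that occurs twice keeps both configurations.
paired = [
    (stage.get("name"), {entry.get("name"): entry.text for entry in setting})
    for stage, setting in zip(stages, settings)
]

for class_name, args in paired:
    print(class_name, args)
```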

          Stefan Matheis (steffkes) added a comment -

          "Is this fine?"

          Yes, that should be good to work with.

          Ryan McKinley added a comment -

          I think this was fixed a while back

          Uwe Schindler added a comment -

          I did not commit anything? How is it fixed?

          Ryan McKinley added a comment -

          My mistake – I thought this was the work you did with Stefan to make the analysis UI better

          Hoss Man added a comment -

          Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in a comment.

          Robert Muir added a comment -

          rmuir20120906-bulk-40-change

          Hoss Man added a comment -

          There is no indication that anyone is actively working on this issue, so removing 4.0 from the fixVersion.

          Stefan Matheis (steffkes) added a comment -

          Uwe Schindler, a quick reminder on this issue: if we could solve this, the Legend (proposed in SOLR-4378) might be able to include that somehow.


            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Stefan Matheis (steffkes)
            • Votes:
              0
              Watchers:
              0
