Solr
  1. Solr
  2. SOLR-7724

Clustering Component results parsing

    Details

    • Flags:
      Patch

      Description

      Currently SolrJ org.apache.solr.client.solrj.response.QueryResponse is not managing clusters coming from the Clustering Component .

      It would be nice to have the clusters properly managed and returned with simply getter methods.
      Current Json :

      "clusters":[

      { "labels":["label1"], "score":14.067292538482793, "docs":["id1", "id2", "id3"] }

      ,

      { "labels":["label2"], "score":16.932201244715046, "docs":["id1", "id2", "id3"]}

      ]}

      This will be parsed accordingly .
      Producing an easy to use Java List.
      Clusters

      1. SOLR-7724.patch
        15 kB
        Alessandro Benedetti
      2. SOLR-7724.patch
        15 kB
        Alessandro Benedetti

        Activity

        Hide
        Dawid Weiss added a comment -

        Will review the patch if you provide it, grazie Alessandro.

        Show
        Dawid Weiss added a comment - Will review the patch if you provide it, grazie Alessandro.
        Hide
        Alessandro Benedetti added a comment -

        JUnit tested( parsing a sample xml response ) and locally tested on a real Solr Server .

        Show
        Alessandro Benedetti added a comment - JUnit tested( parsing a sample xml response ) and locally tested on a real Solr Server .
        Hide
        Alessandro Benedetti added a comment -

        You are welcome Dawid, feel free to give me a feedback as these are my first Solr contributions

        Cheers

        Show
        Alessandro Benedetti added a comment - You are welcome Dawid, feel free to give me a feedback as these are my first Solr contributions Cheers
        Hide
        Alessandro Benedetti added a comment -

        minor style changes

        Show
        Alessandro Benedetti added a comment - minor style changes
        Hide
        Dawid Weiss added a comment -

        Hi Alessandro. The code looks o.k. Minor remarks:

        • could you format it according to the project's style conventions? There are numerous offenders, mostly wrt. white spaces around keywords and expressions, sample blow.
          for( int i=0; i<res.size(); i++ ) {
          private List<Cluster> clusters=new LinkedList<Cluster>();
          for(NamedList<Object> clusterNode:clusterInfo){
          
        • It'd be nice to submit patches as SVN diffs rather than IntelliJ's diffs (which contain additional stuff that may be problematic when merging?).
        • I'd change Cluster#docs to Cluster#docIds which is more explicit.
        • Don't know if it's possible but ideally I'd like the test to parse an actual output from the clustering handler... this way you could avoid having a hardcoded response which may go out of sync with the actual response. Shamefully I haven't kept track with solrj development recently so you'd need to dig the code and see if this is at all doable. If not, so be it, we'll live with the hardcoded response.
        Show
        Dawid Weiss added a comment - Hi Alessandro. The code looks o.k. Minor remarks: could you format it according to the project's style conventions? There are numerous offenders, mostly wrt. white spaces around keywords and expressions, sample blow. for ( int i=0; i<res.size(); i++ ) { private List<Cluster> clusters= new LinkedList<Cluster>(); for (NamedList< Object > clusterNode:clusterInfo){ It'd be nice to submit patches as SVN diffs rather than IntelliJ's diffs (which contain additional stuff that may be problematic when merging?). I'd change Cluster#docs to Cluster#docIds which is more explicit. Don't know if it's possible but ideally I'd like the test to parse an actual output from the clustering handler... this way you could avoid having a hardcoded response which may go out of sync with the actual response. Shamefully I haven't kept track with solrj development recently so you'd need to dig the code and see if this is at all doable. If not, so be it, we'll live with the hardcoded response.
        Hide
        Alessandro Benedetti added a comment -

        Hi David,
        Thanks for the feedback,
        i will follow your points and update the Patch.

        Related the last point i initially went for that approach but I found the Embedded Solr Server that is loaded from the sample product to have the clustering handler disabled by default.

        Honestly I didn't invest too much time looking for a way to pass the parameter to enable the clustering request handler in the Embedded Solr instance.
        And i introduced the sample XML response ( which however is used by other components tests).
        I will update the patch as soon as I can.

        Show
        Alessandro Benedetti added a comment - Hi David, Thanks for the feedback, i will follow your points and update the Patch. Related the last point i initially went for that approach but I found the Embedded Solr Server that is loaded from the sample product to have the clustering handler disabled by default. Honestly I didn't invest too much time looking for a way to pass the parameter to enable the clustering request handler in the Embedded Solr instance. And i introduced the sample XML response ( which however is used by other components tests). I will update the patch as soon as I can.
        Hide
        Dawid Weiss added a comment -

        > [...] Embedded Solr Server that is loaded from the sample product to have the clustering handler disabled by default.

        Yeah. I think this may be a showstopper. If so, don't worry about it, the hardcoded response is fine too.

        Show
        Dawid Weiss added a comment - > [...] Embedded Solr Server that is loaded from the sample product to have the clustering handler disabled by default. Yeah. I think this may be a showstopper. If so, don't worry about it, the hardcoded response is fine too.
        Hide
        Alessandro Benedetti added a comment -

        Hi David,
        the new patch is ready but I am fighting against the code styling in IntelliJ.
        I am using the official xml file but it seems not covering the spacing rules ( http://people.apache.org/~erick/Intellij-Lucene-Codestyle.xml)
        Where can I find an exhaustive code style ?
        I find even incongruences in the QueryResponse official code ( possible ? )

        After I solve the code style I will upload the new fixed patch.

        Cheers

        Show
        Alessandro Benedetti added a comment - Hi David, the new patch is ready but I am fighting against the code styling in IntelliJ. I am using the official xml file but it seems not covering the spacing rules ( http://people.apache.org/~erick/Intellij-Lucene-Codestyle.xml ) Where can I find an exhaustive code style ? I find even incongruences in the QueryResponse official code ( possible ? ) After I solve the code style I will upload the new fixed patch. Cheers
        Hide
        Alessandro Benedetti added a comment -

        found this one as well, and it is quite recent :

        https://issues.apache.org/jira/browse/LUCENE-6514

        But I can not see in the style xml any reference to spaces

        Show
        Alessandro Benedetti added a comment - found this one as well, and it is quite recent : https://issues.apache.org/jira/browse/LUCENE-6514 But I can not see in the style xml any reference to spaces
        Hide
        Alessandro Benedetti added a comment -
        • renamed docs -> docIds ( the initial choice was because of the 1:1 mapping with solr clustering response
        • cleaned the patch of the useless style errors ( but there is still the spacing problem pending, waiting for an official style related)
        • the test is still the same
        • it is still build with IntelliJ, actually it does not seem a problem to me, but if necessary i will change it in the future
        Show
        Alessandro Benedetti added a comment - renamed docs -> docIds ( the initial choice was because of the 1:1 mapping with solr clustering response cleaned the patch of the useless style errors ( but there is still the spacing problem pending, waiting for an official style related) the test is still the same it is still build with IntelliJ, actually it does not seem a problem to me, but if necessary i will change it in the future
        Hide
        Dawid Weiss added a comment -

        I reviewed the patch and I looked at the affected classes. I understand your confusion with regard to formatting – those existing classes are terribly formatted. We should fix this incrementally though, so at least new code should adhere to the standard (as Mike very eloquently explained).

        I'll commit it in, thanks Alessandro.

        Show
        Dawid Weiss added a comment - I reviewed the patch and I looked at the affected classes. I understand your confusion with regard to formatting – those existing classes are terribly formatted. We should fix this incrementally though, so at least new code should adhere to the standard (as Mike very eloquently explained). I'll commit it in, thanks Alessandro.
        Hide
        Alessandro Benedetti added a comment -

        Thanks Dawid,
        Now I have much more clear the current official CodeStyle.
        Discussing in the appropriate Lucene issue I think my confusion derived from the classes I have been working with.

        Now for my contribution I am using the official Idea code style, and formatting only my changed code.
        The patch produced are much cleaner
        Thanks for the commit !
        Feel free to close the issue as soon as you do it !
        I have a parallel issue, I think Tommaso is taking care of it :

        https://issues.apache.org/jira/browse/SOLR-7719

        It is pretty much the same but with the Suggester component!

        Show
        Alessandro Benedetti added a comment - Thanks Dawid, Now I have much more clear the current official CodeStyle. Discussing in the appropriate Lucene issue I think my confusion derived from the classes I have been working with. Now for my contribution I am using the official Idea code style, and formatting only my changed code. The patch produced are much cleaner Thanks for the commit ! Feel free to close the issue as soon as you do it ! I have a parallel issue, I think Tommaso is taking care of it : https://issues.apache.org/jira/browse/SOLR-7719 It is pretty much the same but with the Suggester component!
        Hide
        ASF subversion and git services added a comment -

        Commit 1688455 from Dawid Weiss in branch 'dev/trunk'
        [ https://svn.apache.org/r1688455 ]

        SOLR-7724: SolrJ now supports parsing the output of the clustering component. (Alessandro Benedetti via Dawid Weiss)

        Show
        ASF subversion and git services added a comment - Commit 1688455 from Dawid Weiss in branch 'dev/trunk' [ https://svn.apache.org/r1688455 ] SOLR-7724 : SolrJ now supports parsing the output of the clustering component. (Alessandro Benedetti via Dawid Weiss)
        Hide
        ASF subversion and git services added a comment -

        Commit 1688458 from Dawid Weiss in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1688458 ]

        SOLR-7724: SolrJ now supports parsing the output of the clustering component.
        (Alessandro Benedetti via Dawid Weiss)

        Show
        ASF subversion and git services added a comment - Commit 1688458 from Dawid Weiss in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1688458 ] SOLR-7724 : SolrJ now supports parsing the output of the clustering component. (Alessandro Benedetti via Dawid Weiss)
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

          People

          • Assignee:
            Dawid Weiss
            Reporter:
            Alessandro Benedetti
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 24h
              24h
              Remaining:
              Remaining Estimate - 24h
              24h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development