SOLR-8192

SubFacets allBuckets not working with measures on tokenized fields

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.4
    • Component/s: None
    • Labels:
      None

      Description

      Subfacets do not work when you ask for allBuckets on a tokenized field with measures.
      Here is the request:
      {
        hs: {
          field: hs,
          type: terms,
          allBuckets: true,
          sort: "mostrar_bill_price desc",
          facet: { mostrar_bill_price: "sum(mostrar_bill_price)" }
        }
      }

      Here is the response:
      {
        "responseHeader": {
          "status": 500,
          "QTime": 92,
          "params": {
            "indent": "true",
            "q": "*:*",
            "json.facet": "{ hs: { field: hs, type: terms, allBuckets:true, sort: \"mostrar_bill_price desc\", facet: { mostrar_bill_price: \"sum(mostrar_bill_price)\" } } }",
            "wt": "json",
            "rows": "0"
          }
        },
        "response": { "numFound": 35422188, "start": 0, "docs": [] },
        "error": { "trace": "java.lang.ArrayIndexOutOfBoundsException\n", "code": 500 }
      }

      The hs field is defined as:
      <field name="hs" type="text_ws" indexed="true" stored="false" required="false" multiValued="false" />

      mostrar_bill_price is defined as:
      <field name="mostrar_bill_price" type="tdouble" indexed="true" stored="false" required="false" multiValued="false" />

      Apart from text_ws, it also happens with text_classic (these are the only two field types I've tested).
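
      For anyone who wants to reproduce this from a script, here is a minimal sketch that builds the same JSON Facet request as URL parameters. The host, port and core name ("mycore") are placeholders and not taken from this report, and the match-all query is an assumption based on the numFound value in the response. The script only builds and prints the URL; it does not contact a server.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and the core name "mycore" are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_facet_params(field, measure_field, all_buckets=True):
    """Build query parameters for a terms facet sorted by a sum() measure."""
    facet = {
        field: {
            "type": "terms",
            "field": field,
            "allBuckets": all_buckets,
            "sort": f"{measure_field} desc",
            # Sub-facet: one aggregated measure per bucket.
            "facet": {measure_field: f"sum({measure_field})"},
        }
    }
    return {
        "q": "*:*",  # assumed match-all query
        "rows": "0",
        "wt": "json",
        "json.facet": json.dumps(facet),
    }


params = build_facet_params("hs", "mostrar_bill_price")
request_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(request_url)
```

      Passing all_buckets=False builds the same request without allBuckets, which is handy for checking whether the error depends on that option.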

      1. SOLR-8192.patch
        6 kB
        Yonik Seeley

        Activity

        Yonik Seeley added a comment -

        Thanks Pablo, I've reproduced this issue and am looking into a fix.

        Yonik Seeley added a comment -

        Here's a patch that fixes the issue, as well as SOLR-8206, which I discovered while investigating this.

        ASF subversion and git services added a comment -

        Commit 1710476 from Yonik Seeley in branch 'dev/trunk'
        [ https://svn.apache.org/r1710476 ]

        SOLR-8192: SOLR-8206: fix allBuckets for multi-valued fields, fix limit:0 and normalize offset to 0 when limit==0

        Pablo Anzorena added a comment -

        Excellent!

        One more thing: is there any performance improvement (in response time) when you ask for two subfacets in the same request (i.e. ask for subfacets on dimension1 and dimension2, where q=banana)? Or is it better to send one subfacet request on dimension1, where q=banana, and a second, simultaneous subfacet request on dimension2, where q=banana?
        I've run some tests, and the second option gives better response times than the first. But I would like to hear your opinion.

        Thank you very much.

        ASF subversion and git services added a comment -

        Commit 1710480 from Yonik Seeley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1710480 ]

        SOLR-8192: SOLR-8206: fix allBuckets for multi-valued fields, fix limit:0 and normalize offset to 0 when limit==0

        Yonik Seeley added a comment -

        "One more thing, is there any performance improvement (response time) when you ask for two subfacets on the same request (i.e. ask for subfacets on dimension1 and dimension2, where q=banana)? Or is it better to ask for subfacets on dimension1, where q=banana and another simultaneous subfacets request on dimension2, where q=banana?"

        When you say "simultaneous", are you issuing two completely separate Solr requests at the same time? If so, that's less efficient and causes more work for the system overall, but it can currently lower latency because of the parallelism it introduces (the faceting code is currently single-threaded: only one thread is used per request).

        We should really develop multi-threaded faceting code!
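
        A toy sketch of the latency effect described above, assuming each facet costs a fixed amount of single-threaded work. fake_facet_request is a hypothetical stand-in that only sleeps; nothing here talks to Solr, and the model ignores the extra total work that separate requests cause on a real server (each one runs the query again).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_facet_request(dimension, query="banana", cost_seconds=0.05):
    """Hypothetical stand-in for one facet computation; sleeps to mimic the work."""
    time.sleep(cost_seconds)
    return {"dimension": dimension, "q": query}


def combined(dimensions):
    """Model of one request with several facets: computed one after another."""
    return [fake_facet_request(d) for d in dimensions]


def parallel(dimensions):
    """Model of one request per dimension, issued concurrently by the client."""
    with ThreadPoolExecutor(max_workers=len(dimensions)) as pool:
        return list(pool.map(fake_facet_request, dimensions))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


dims = ["dimension1", "dimension2"]
serial_result, serial_time = timed(combined, dims)
parallel_result, parallel_time = timed(parallel, dims)
print(f"combined: {serial_time:.3f}s, parallel: {parallel_time:.3f}s")
```

        In this model the two concurrent calls finish in roughly the time of one, while the combined call takes roughly the sum; the results are the same either way.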

        Pablo Anzorena added a comment -

        Yes, I was talking about two independent processes.

        It would be fantastic to have multi-threaded subfacets!

        Would the core idea be to get all the documents that matched the query, and then use one thread per request?

        Speaking from my ignorance: in the current single-threaded version, can't you gain something by collecting the values for dimension1 and dimension2 in the same pass over the documents that matched the query?


          People

          • Assignee:
            Yonik Seeley
            Reporter:
            Pablo Anzorena
          • Votes:
            0
            Watchers:
            3
