Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: 6.0
    • Component/s: SolrJ
    • Labels:

      Description

      Adds a new stream called SelectStream, which serves two purposes:
      1. Limit the set of fields included in an outgoing tuple, removing unwanted fields.
      2. Provide aliases for fields. In this role it acts as an alternative to CloudSolrStream's 'aliases' option.

      For example, in a simple case:

      select(
        id, 
        fieldA_i as fieldA, 
        fieldB_s as fieldB,
        search(collection1, q="*:*", fl="id,fieldA_i,fieldB_s", sort="fieldA_i asc, fieldB_s asc, id asc")
      )
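
The two roles described above (dropping fields and renaming them) can be illustrated outside Solr. The following is a minimal Python sketch, not the SelectStream implementation: tuples are modeled as plain dicts, `select` and `docs` are hypothetical names, and skipping a field that is absent from a tuple is an assumption.

```python
def select(tuples, fields):
    """Yield tuples keeping only the requested fields, renamed per alias.

    `fields` maps each source field name to its outgoing name; a field
    kept without an alias maps to itself. Fields missing from a tuple
    are skipped (an assumption made for this sketch).
    """
    for tup in tuples:
        yield {out: tup[src] for src, out in fields.items() if src in tup}


# Hypothetical stand-in for the tuples returned by search(collection1, ...).
docs = [
    {"id": "1", "fieldA_i": 10, "fieldB_s": "x", "other_s": "unwanted"},
    {"id": "2", "fieldA_i": 20, "fieldB_s": "y", "other_s": "unwanted"},
]

selected = list(
    select(docs, {"id": "id", "fieldA_i": "fieldA", "fieldB_s": "fieldB"})
)
```

Each outgoing dict carries only `id`, `fieldA`, and `fieldB`; the extra `other_s` field is dropped.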
      

      This can also be used as part of complex expressions to help keep track of what is being worked on. This is particularly useful when merging/joining multiple collections which share field names. For example, the following results in a set of tuples including only the fields id, left.ident, and right.ident even though the total set of fields required to perform the search and join is much larger than just those three fields.

      select(
        id, left.ident, right.ident,
        innerJoin(
          select(
            id, join1_i as left.join1, join2_s as left.join2, ident_s as left.ident,
            search(collection1, q="side_s:left", fl="id,join1_i,join2_s,ident_s", sort="join1_i asc, join2_s asc, id asc")
          ),
          select(
            join3_i as right.join1, join2_s as right.join2, ident_s as right.ident,
            search(collection1, q="side_s:right", fl="join3_i,join2_s,ident_s", sort="join3_i asc, join2_s asc")
          ),
          on="left.join1=right.join1, left.join2=right.join2"
        )
      )
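
To show why the aliases help when both sides share field names, here is a rough Python sketch of the same pipeline over in-memory dicts. It is an illustration only: `rename`, `inner_join`, and the sample documents are hypothetical, and the join is a simple nested-loop match rather than the merge over sorted streams that the expression above relies on.

```python
def rename(tuples, aliases):
    """Keep only the fields named in `aliases`, under their new names."""
    for tup in tuples:
        yield {out: tup[src] for src, out in aliases.items() if src in tup}


def inner_join(left, right, on):
    """Nested-loop inner join; `on` is a list of (left_field, right_field)."""
    right = list(right)  # the right side is scanned once per left tuple
    for l in left:
        for r in right:
            if all(l[lf] == r[rf] for lf, rf in on):
                yield {**l, **r}


# Hypothetical documents; both sides share the names join2_s and ident_s.
left_docs = [
    {"id": "1", "join1_i": 1, "join2_s": "a", "ident_s": "left_1"},
    {"id": "2", "join1_i": 2, "join2_s": "b", "ident_s": "left_2"},
]
right_docs = [
    {"join3_i": 1, "join2_s": "a", "ident_s": "right_1"},
    {"join3_i": 3, "join2_s": "c", "ident_s": "right_3"},
]

left = rename(left_docs, {"id": "id", "join1_i": "left.join1",
                          "join2_s": "left.join2", "ident_s": "left.ident"})
right = rename(right_docs, {"join3_i": "right.join1",
                            "join2_s": "right.join2", "ident_s": "right.ident"})
joined = inner_join(left, right, [("left.join1", "right.join1"),
                                  ("left.join2", "right.join2")])

# Outer select: keep only the three fields of interest.
result = list(rename(joined, {"id": "id", "left.ident": "left.ident",
                              "right.ident": "right.ident"}))
```

Because each side is prefixed before the join, the two `ident_s` values survive side by side instead of one overwriting the other.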
      

      This depends on SOLR-7584.

      1. SOLR-7669.patch (53 kB, Dennis Gove)
      2. SOLR-7669.patch (46 kB, Dennis Gove)
      3. SOLR-7669.patch (50 kB, Dennis Gove)
      4. SOLR-7669.patch (47 kB, Dennis Gove)
      5. SOLR-7669.patch (43 kB, Dennis Gove)
      6. SOLR-7669.patch (16 kB, Dennis Gove)

          Activity

          dpgove Dennis Gove added a comment (edited)

          Updated to add support for performing operations on the selected values. The only operation included in this patch is Replace, which can be used to replace a field's value (or null) with a different value or with the value of another field.

          In the following example, if fieldA is null then it will be replaced with value 123 and if fieldB is "foo" then it will be set to "bar".

          select(
            id, 
            fieldA_i as fieldA, 
            fieldB_s as fieldB,
            replace(fieldA, null, withValue=123),
            replace(fieldB, foo, withValue=bar),
            search(collection1, q="*:*", fl="id,fieldA_i,fieldB_s", sort="fieldA_i asc, fieldB_s asc, id asc")
          )
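
The effect of the two `withValue` operations above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Replace operation's code: `replace_with_value` is a hypothetical helper, null is modeled as `None`, and a missing field is treated the same as null.

```python
def replace_with_value(tup, field, match, value):
    """Return a copy of `tup` with `field` set to `value` when it equals `match`.

    A missing field is treated like null (None); that is an assumption
    of this sketch.
    """
    out = dict(tup)
    if out.get(field) == match:
        out[field] = value
    return out


# Hypothetical tuples as they would look after the aliases are applied.
tuples = [
    {"id": "1", "fieldA": None, "fieldB": "foo"},
    {"id": "2", "fieldA": 7, "fieldB": "baz"},
]

replaced = []
for tup in tuples:
    tup = replace_with_value(tup, "fieldA", None, 123)     # replace(fieldA, null, withValue=123)
    tup = replace_with_value(tup, "fieldB", "foo", "bar")  # replace(fieldB, foo, withValue=bar)
    replaced.append(tup)
```

Only the first tuple changes: its null `fieldA` becomes 123 and its `fieldB` becomes "bar".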
          

          In the following example, if fieldA is null or "???" then it will be replaced with the value of fieldB.

          select(
            id, 
            fieldA_s as fieldA, 
            fieldB_s as fieldB,
            replace(fieldA, null, withField=fieldB),
            replace(fieldA, "???", withField=fieldB),
            search(collection1, q="*:*", fl="id,fieldA_s,fieldB_s", sort="fieldA_s asc, fieldB_s asc, id asc")
          )
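
The `withField` variant can be sketched the same way in Python. Again this is only an illustration: `replace_with_field` is a hypothetical helper, null is modeled as `None`, and the operations are assumed to run in the order they are listed.

```python
def replace_with_field(tup, field, match, source_field):
    """Return a copy of `tup` where `field` takes the value of `source_field`
    whenever `field` currently equals `match`."""
    out = dict(tup)
    if out.get(field) == match:
        out[field] = out.get(source_field)
    return out


# Hypothetical tuples as they would look after the aliases are applied.
tuples = [
    {"id": "1", "fieldA": None, "fieldB": "b1"},
    {"id": "2", "fieldA": "???", "fieldB": "b2"},
    {"id": "3", "fieldA": "a3", "fieldB": "b3"},
]

replaced = []
for tup in tuples:
    tup = replace_with_field(tup, "fieldA", None, "fieldB")   # replace(fieldA, null, withField=fieldB)
    tup = replace_with_field(tup, "fieldA", "???", "fieldB")  # replace(fieldA, "???", withField=fieldB)
    replaced.append(tup)
```

The first two tuples pick up `fieldB`'s value in `fieldA`; the third is left untouched.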
          
          dpgove Dennis Gove added a comment -

          Rebased against trunk (git hash f63fc48, SOLR-8114: in Grouping.java rename groupSort to withinGroupSort)

          This required a couple of changes in the SQL and FacetStream areas related to FieldComparator. FieldComparator has been changed to support different field names on the left and right sides. The SQL and FacetStream areas use FieldComparator for sorting (a totally valid use case) but expect the left and right field names to be equal. The changes I made validate that assumption wherever those areas use the comparator.

          In the future I think I may circle back around and create a new FieldComparator with a single field name so that on construction that assumption can be enforced.
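
As a rough illustration of the idea discussed in this comment, the Python sketch below models a comparator that reads a different field from each side, plus the kind of check a sorting caller could apply. It is not the SolrJ FieldComparator: the class name, method names, and error message are all hypothetical.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class FieldComparatorSketch:
    """Compares left[left_field] with right[right_field]."""
    left_field: str
    right_field: str
    ascending: bool = True

    def compare(self, left, right):
        a, b = left[self.left_field], right[self.right_field]
        result = (a > b) - (a < b)  # -1, 0, or 1
        return result if self.ascending else -result

    def require_single_field(self):
        """Sorting one stream only makes sense when both names match."""
        if self.left_field != self.right_field:
            raise ValueError(
                "sorting requires equal left and right field names, got "
                f"{self.left_field!r} and {self.right_field!r}"
            )
        return self.left_field


def sort_tuples(tuples, comparator):
    """Sort a single stream, validating the equal-names assumption first."""
    comparator.require_single_field()
    return sorted(tuples, key=cmp_to_key(comparator.compare))


ordered = sort_tuples([{"a": 3}, {"a": 1}, {"a": 2}],
                      FieldComparatorSketch("a", "a"))
```

A join can use differing names such as `left.join1` and `right.join1`, while a sort rejects them up front, which is the assumption a dedicated single-field comparator would enforce at construction.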

          All tests pass.

          dpgove Dennis Gove added a comment -

          Deleted the EditStream as its functionality (the removal of fields from a tuple) is superseded by the SelectStream. Updated the SQLHandler to use the SelectStream instead of the EditStream.

          All relevant tests pass.

          dpgove Dennis Gove added a comment -

          Rebased against trunk.

          dpgove Dennis Gove added a comment -

          Fixes for pre-commit failures. Add documentation on the operations.

          jira-bot ASF subversion and git services added a comment -

          Commit 1713967 from dpgove@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1713967 ]

          SOLR-7669: Add SelectStream and Tuple Operations to the Streaming API and Streaming Expressions


            People

            • Assignee:
              dpgove Dennis Gove
              Reporter:
              dpgove Dennis Gove
            • Votes:
              0
              Watchers:
              3
