  1. S2Graph
  2. S2GRAPH-60

Add divide operation to scorePropagateOp


Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Trivial
    • Resolution: Unresolved

    Description

      Ratios are a common requirement in service analysis. A ratio is usually calculated by dividing one counted or aggregated value by another. S2Graph queries already support counting and aggregating values within S2Graph storage, so a client can calculate a ratio by fetching two values and dividing them. This works, but there is a simpler way: perform the calculation inside the S2Graph web application, so that the ratio is returned by a single RPC, that is, one graph query call.
      This issue proposes a query for ratio calculation.
      Suppose we have two labels, an impression feedback label and a click feedback label. We can then get the number of impressions and the number of clicks for a user, and from these two values calculate the CTR (Click Through Rate) using the two count queries below.

      Impression query

      {
        "srcVertices": [{
          "serviceName": "some_service",
          "columnName": "user_id",
          "id": "user_a"
        }],
        "steps": [{
          "step": [{
            "label": "impression_feedback_label",
            "direction": "out",
            "offset": 0,
            "limit": 100
          }]
        }]
      }
      

      Click query

      {
        "srcVertices": [{
          "serviceName": "some_service",
          "columnName": "user_id",
          "id": "user_a"
        }],
        "steps": [{
          "step": [{
            "label": "click_feedback_label",
            "direction": "out",
            "offset": 0,
            "limit": 100
          }]
        }]
      }
      

      After fetching the results of the two queries above, we can calculate the CTR on the client side.
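The client-side step can be sketched as follows. This is a minimal illustration, not S2Graph client code: it assumes each query response is a JSON object that lists the matched edges under a `results` key, and the function name `click_through_rate` is hypothetical.

```python
def click_through_rate(impression_response: dict, click_response: dict) -> float:
    """Divide the number of click edges by the number of impression edges."""
    # Assumption: each response carries its matched edges in a "results" list.
    impressions = len(impression_response.get("results", []))
    clicks = len(click_response.get("results", []))
    # Avoid dividing by zero when the user has no impressions.
    if impressions == 0:
        return 0.0
    return clicks / impressions
```

Note that this approach costs two round trips to S2Graph before the division can happen.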

      However, we can do the same in a single query by adding a `divide` operation to `scorePropagateOp`.

      {
        "limit" : 10,
        "groupBy" : [ "from" ],
        "duplicate" : "sum",
        "srcVertices" : [ {
          "serviceName" : "some_service",
          "columnName" : "user_id",
          "id" : "user_a"
        } ],
        "steps" : [ {
          "step" : [ {
            "label" : "impression_feedback_label",
            "direction" : "out",
            "offset" : 0,
            "limit" : 10,
            "groupBy" : [ "from" ],
            "duplicate" : "countSum",
            "transform" : [ [ "_from" ] ]
          } ]
        }, {
          "step" : [ {
            "label": "click_feedback_label",
            "direction" : "out",
            "offset" : 0,
            "limit" : 10,
            "scorePropagateOp" : "divide",
            "scorePropagateShrinkage" : 500
          } ]
        } ]
      }
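For reference, a query of this shape could be assembled programmatically. The sketch below only builds the JSON body shown above; the helper name `build_ratio_query` and its parameters are hypothetical, and sending the body to S2Graph is left out.

```python
import json


def build_ratio_query(service_name, column_name, vertex_id,
                      denominator_label, numerator_label,
                      shrinkage=500, limit=10):
    """Build a two-step query whose second step divides by the first step's score."""
    src_vertex = {
        "serviceName": service_name,
        "columnName": column_name,
        "id": vertex_id,
    }
    # First step: count the denominator edges (e.g. impressions) per source vertex.
    denominator_step = {
        "label": denominator_label,
        "direction": "out",
        "offset": 0,
        "limit": limit,
        "groupBy": ["from"],
        "duplicate": "countSum",
        "transform": [["_from"]],
    }
    # Second step: fetch the numerator edges (e.g. clicks) and apply `divide`.
    numerator_step = {
        "label": numerator_label,
        "direction": "out",
        "offset": 0,
        "limit": limit,
        "scorePropagateOp": "divide",
        "scorePropagateShrinkage": shrinkage,
    }
    return {
        "limit": limit,
        "groupBy": ["from"],
        "duplicate": "sum",
        "srcVertices": [src_vertex],
        "steps": [{"step": [denominator_step]}, {"step": [numerator_step]}],
    }


query = build_ratio_query("some_service", "user_id", "user_a",
                          "impression_feedback_label", "click_feedback_label")
body = json.dumps(query, indent=2)  # JSON text ready to be sent as the request body
```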
      

      There is another query parameter key, `scorePropagateShrinkage`, which is used to normalize the results. We sort the results by ratio value alone, but a raw ratio can be misleading when the counts are small: a ratio of 1.0 from 1/1 ranks higher than 0.9 from 9/10, even though the latter is backed by more data. For this reason, we can add a sufficiently large `scorePropagateShrinkage` value to the denominator. With a shrinkage of 500, the scores become 1 / (1 + 500) = 0.00199600798403 and 9 / (10 + 500) = 0.01764705882353, so the latter now has the larger value.
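The effect of shrinkage on ranking can be checked with a few lines of arithmetic. This is a sketch under the assumption that the shrinkage value is added to each pair's own denominator, that is, score = numerator / (denominator + shrinkage); the function name `shrunk_ratio` is hypothetical and is not the S2Graph implementation.

```python
def shrunk_ratio(numerator: float, denominator: float, shrinkage: float = 0.0) -> float:
    """Return numerator / (denominator + shrinkage), or 0.0 if that sum is zero."""
    total = denominator + shrinkage
    if total == 0:
        return 0.0
    return numerator / total


# (clicks, impressions) pairs taken from the example in the text.
candidates = {"one_of_one": (1, 1), "nine_of_ten": (9, 10)}

# Rank by raw ratio, then by ratio with a shrinkage of 500.
raw_ranking = sorted(candidates, key=lambda k: shrunk_ratio(*candidates[k]), reverse=True)
shrunk_ranking = sorted(candidates, key=lambda k: shrunk_ratio(*candidates[k], 500), reverse=True)
```

Without shrinkage the 1/1 pair ranks first; with a shrinkage of 500 the 9/10 pair does, because the added constant dominates small denominators.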


          People

            steamshon Do Yung Yoon
            steamshon Do Yung Yoon


              Time Tracking

                Original Estimate: 168h
                Remaining Estimate: 168h
                Time Spent: Not Specified
