Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
Description
Developers often deploy their own functions to the system, for example to apply a decorator pattern for caching that adds information to specific key/value pairs. In doing so, they can introduce bottlenecks: a server-side function can cause errors or make the system slower than intended. We want a way for users to view the functions they create and see their average execution time.
- Meter Type: Timer
- Name: geode.function.executions
- Description: TBD
- Tags: <common_tags>, function (the result of getId on the function; if that does not exist, the getClass().getName() of the deployed function), succeeded (true/false)
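The tag rules above can be sketched in plain Java. This is only an illustration: IdentifiableFunction is a hypothetical stand-in for a deployed function, and treating a null or empty ID as "does not exist" is an assumption, not something the ticket specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FunctionTimerTags {
    /** Hypothetical stand-in for a deployed function; only getId() matters here. */
    public interface IdentifiableFunction {
        String getId();
    }

    /** Returns the function's ID, or its class name when no ID is available. */
    public static String functionTagValue(IdentifiableFunction function) {
        String id = function.getId();
        // Assumption: a null or empty ID counts as "does not exist".
        return (id == null || id.isEmpty()) ? function.getClass().getName() : id;
    }

    /** Builds the two per-execution tags: function and succeeded. */
    public static Map<String, String> tagsFor(IdentifiableFunction function, boolean succeeded) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("function", functionTagValue(function));
        tags.put("succeeded", Boolean.toString(succeeded));
        return tags;
    }
}
```

The common tags are left out of the sketch because the ticket does not enumerate them.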
Acceptance Criteria
Meter creation/deletion: Create meter on function execution
Measurement: On an individual server, start the timer when a USER function is invoked/executed, and stop the timer when the user function completes OR errors. If the function throws a function execution exception or any other error, record the execution with the tag succeeded=false.
Details on Functions and their execution: https://geode.apache.org/docs/guide/110/developing/function_exec/function_execution.html
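The measurement rule can be sketched as a wrapper around the user function. The class below is a plain-Java stand-in rather than the actual meter implementation: FunctionExecutionTimer, Stats, and the "|" key format are hypothetical names chosen for illustration. It creates both the succeeded=true and succeeded=false timers on first execution and records the elapsed time in a finally block, so the timer stops whether the function returns or throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class FunctionExecutionTimer {
    /** Running totals for one (function ID, succeeded) tag combination. */
    public static final class Stats {
        public long count;
        public long totalTimeNanos;
    }

    private final Map<String, Stats> timers = new HashMap<>();

    /** Returns the timer for the given tags, creating it on first use. */
    public synchronized Stats timer(String functionId, boolean succeeded) {
        return timers.computeIfAbsent(functionId + "|" + succeeded, key -> new Stats());
    }

    /** Runs the user function and records its duration under succeeded=true or false. */
    public <T> T time(String functionId, Supplier<T> userFunction) {
        // Create both timers up front so the unused one reports count = 0.
        timer(functionId, true);
        timer(functionId, false);
        long start = System.nanoTime();
        boolean succeeded = false;
        try {
            T result = userFunction.get();
            succeeded = true;
            return result;
        } finally {
            // Runs on normal completion and on any thrown exception or error.
            record(functionId, succeeded, System.nanoTime() - start);
        }
    }

    private synchronized void record(String functionId, boolean succeeded, long elapsedNanos) {
        Stats stats = timer(functionId, succeeded);
        stats.count++;
        stats.totalTimeNanos += elapsedNanos;
    }
}
```

Setting the succeeded flag only after the call returns means any exception leaves it false, which matches the rule that every kind of error is recorded as a failed execution.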
Scenarios
Scenario: The timers are created when the function is first executed
Given a user executed a function with ID functionToTime on a cluster with 1 locator/1 server
And functionToTime has not been executed previously
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count >= 1
- totalTime >= 5,000,000,000ns
And the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 0
- totalTime = 0
Scenario: Successful singular function execution (registered execution)
Given a user registers a function with ID functionToTime (that waits for 5 seconds) on a cluster with 1 locator/1 server
When functionToTime is triggered using gfsh command: "execute function --id=functionToTime"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns
Scenario: Successful singular function execution (unregistered execution)
Given an unregistered function with ID functionToTime (that waits for 5 seconds) exists
When triggered on a client using "FunctionService.onServers(cache).execute(new FunctionToTime())"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns
Scenario: Singular function execution with Any Exception
Given an unregistered function with ID functionToTime (that waits for 5 seconds) exists
When triggered on a client using "FunctionService.onServers(cache).execute(new FunctionToTime())"
And the function exits with any exception after running for 5 seconds
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 1
- totalTime >= 5,000,000,000ns
Scenario: Function execution onRegion multi-server
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a region called RR1 that is a replicate region
When a function execution is triggered against that replicate region using "FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
Then one server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns
And the other server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0
- totalTime = 0
Scenario: Function execution onRegion with partition region multiple times
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a partition region called PR1 that only exists on S1
When a function execution is triggered 10 times against that partition region using "FunctionService.onRegion(regionPR1).execute(new FunctionToTime())"
Then S1 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 10
And S2 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0
Scenario: Function execution onRegion with replicate region multiple times
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a replicate region called RR1 exists
When a function execution is triggered 10 times against that replicate region using "FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
Then aggregating the S1 and S2 metrics for geode.function.executions with id = functionToTime and succeeded = true yields a total count of 10
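The aggregate check in the last scenario can be sketched as a sum over per-server counts. This is a minimal illustration, assuming each server's timer counts have already been collected into a map keyed by a timer identifier; AggregateFunctionCounts and the key format are hypothetical and not part of any Geode API.

```java
import java.util.List;
import java.util.Map;

public class AggregateFunctionCounts {
    /**
     * Sums the count of one timer across all servers.
     * A server that never ran the function contributes 0.
     */
    public static long totalCount(List<Map<String, Long>> perServerCounts, String timerKey) {
        long total = 0;
        for (Map<String, Long> counts : perServerCounts) {
            total += counts.getOrDefault(timerKey, 0L);
        }
        return total;
    }
}
```

The scenario asserts only the total across servers, not how the 10 executions are split between S1 and S2.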