Details
Type: New Feature
Status: Closed
Priority: Minor
Resolution: Fixed
Description
Story
As a data scientist, I want to perform session reconstruction on my data set, so that I can prepare it as input to other algorithms, such as path functions or predictive analytics algorithms.
This is a follow-on to https://issues.apache.org/jira/browse/MADLIB-909 that adds optional output controls.
Details
Proposed interface changes:
sessionize (
    source_table,
    output_table,
    partition_expr,
    time_stamp,
    max_time,
    output_cols,   -- new
    create_view    -- new
)
where
output_cols (optional)
TEXT.
asterisk (i.e., '*') – ALL columns in input table + session column (default)
'x, y, z, ...' – list of columns you want + session column. This list could include the partition expression or other expressions as desired. This should also support '*, expr1, expr2, etc.', which means output all columns + the extra expressions listed. Needs to be a valid SELECT expression.
For example, the path function (http://madlib.incubator.apache.org/docs/latest/group__grp__path.html#examples) does a similar thing for its aggregate function parameter.
create_view (optional)
BOOLEAN, default: TRUE. Determines whether the output is created as a view (TRUE) or materialized as a table (FALSE). If you only need the session info once, creating a view could be significantly faster than materializing a table.
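To make the two new parameters concrete, here is a minimal Python sketch of how output_cols and create_view might shape the statement that sessionize generates. It is an illustration under stated assumptions, not MADlib's implementation: the helper names (expand_output_cols, build_sessionize_sql), the source_columns argument, the new_session helper column, the generated SQL, and the example table and column names are all hypothetical. It also assumes that a gap larger than max_time between consecutive events in a partition starts a new session, and that session numbering starts at 1.

```python
def expand_output_cols(output_cols, source_columns):
    """Expand a leading '*' into explicit column names (hypothetical helper).

    Expanding '*' keeps the internal helper column out of the output.
    Only a leading '*' item is handled; anything else is passed through.
    """
    all_cols = ", ".join(source_columns)
    spec = output_cols.strip()
    if spec.startswith("*"):
        rest = spec[1:].lstrip()
        if rest == "":
            return all_cols                      # '*' alone: all input columns
        if rest.startswith(","):
            return all_cols + ", " + rest[1:].strip()  # '*, expr1, expr2'
    return spec                                  # explicit list or expressions


def build_sessionize_sql(source_table, output_table, partition_expr, time_stamp,
                         max_time, source_columns, output_cols="*",
                         create_view=True):
    """Build one possible CREATE VIEW / CREATE TABLE statement (a sketch).

    Arguments are interpolated without escaping, so this is for
    illustration only and must not be fed untrusted input.
    """
    select_list = expand_output_cols(output_cols, source_columns)
    relation = "VIEW" if create_view else "TABLE"   # create_view picks the kind
    window = f"PARTITION BY {partition_expr} ORDER BY {time_stamp}"
    return (
        f"CREATE {relation} {output_table} AS\n"
        f"SELECT {select_list},\n"
        f"       1 + SUM(new_session) OVER ({window}) AS session_id\n"
        f"FROM (\n"
        f"    SELECT *,\n"
        f"           CASE WHEN {time_stamp} - LAG({time_stamp}) OVER ({window})\n"
        f"                     > INTERVAL '{max_time}'\n"
        f"                THEN 1 ELSE 0 END AS new_session\n"
        f"    FROM {source_table}\n"
        f") AS flagged"
    )


# Hypothetical table and column names, chosen only for the example.
example_sql = build_sessionize_sql(
    "eventlog", "sessionize_output_view", "user_id", "event_timestamp",
    "0:30:00",
    source_columns=["event_timestamp", "user_id", "page"],
    output_cols="*, page = 'CHECKOUT' AS is_checkout",
    create_view=True,
)
print(example_sql)
```

With create_view set to TRUE the statement only defines the query, so nothing is computed until the view is read; with FALSE the same SELECT is materialized once into a table. Which one is cheaper depends on how many times the session column is read afterwards.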