Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
The minibatch preprocessor currently does not support arbitrary expressions for the independent, dependent, and grouping columns.
- The independent varname does not support any logical expressions.
- The dependent varname supports logical expressions only on numerical columns. For example, 'length > 1' is a valid expression, but creating an alias for it is not supported.
- Expressions that evaluate to an array might already be supported, but this has not been tested.
- The grouping column does not support any expressions.
This is the only form of expression currently supported for the dependent variable:
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 'minibatch_preprocessing_out', 'y > 10', ' x1,x2', 4);
Not supported:
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 'minibatch_preprocessing_out', 'y > 10 as foo', 'x1,x2', 4);
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 'minibatch_preprocessing_out', 'y=''F''', 'x1,x2', 4);
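Until such expressions are handled directly, one possible workaround is to materialize the expression as an ordinary column and pass that column's name instead. The sketch below is untested and rests on assumptions: the staging table name minibatch_preprocessing_input_derived is hypothetical, and it has not been verified here that the preprocessor accepts a plain boolean column as the dependent variable.

```sql
-- Hypothetical staging table; its name is an assumption, not part of MADlib.
DROP TABLE IF EXISTS minibatch_preprocessing_input_derived;

-- Evaluate the logical expression once and store it under the desired alias.
CREATE TABLE minibatch_preprocessing_input_derived AS
SELECT (y > 10) AS foo,   -- the aliased expression from the unsupported call
       x1,
       x2
FROM minibatch_preprocessing_input;

-- Same call as above, but the dependent variable is now a plain column name.
SELECT madlib.minibatch_preprocessor(
    'minibatch_preprocessing_input_derived',
    'minibatch_preprocessing_out',
    'foo',
    'x1,x2',
    4);
```

The same pattern would apply to the text comparison (y = 'F'), at the cost of an extra copy of the input table.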
Open Questions:
1. What about expressions that evaluate to an array? We might already support them but have not tested this yet.
2. Do we need to support logical expressions for all three (independent, dependent, and grouping columns)?
3. If yes, to what extent?
4. Should the user be allowed to create an alias for a logical expression?
5. Other modules may already partially support logical expressions. Should we find out which ones?