Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Description
This is a ticket for organizing the new statistical programming features of Streaming Expressions. It's also a place for the community to discuss what functions are needed to support statistical programming.
Basic Syntax:
let(a = timeseries(...), b = timeseries(...), c = col(a, count(*)), d = col(b, count(*)), r = regress(c, d), tuple(p = predict(r, 50)))
The expression above is doing the following:
1) The let expression is setting variables (a, b, c, d, r).
2) Variables a and b are the output of timeseries() Streaming Expressions. These will be stored in memory as lists of Tuples containing the time series results.
3) Variables c and d are set using the col evaluator. The col evaluator extracts a column of numbers from a list of tuples. In the example col is extracting the count(*) field from the two time series result sets.
4) Variable r is the output from the regress evaluator. The regress evaluator performs a simple regression analysis on two columns of numbers.
5) Once the variables are set, a single Streaming Expression is run by the let expression. In the example the tuple expression is run. The tuple expression outputs a single Tuple with name/value pairs. Any Streaming Expression can be run by the let expression so this can be a complex program. The streaming expression run by let has access to all the variables defined earlier.
6) The tuple expression in the example has one name/value pair. The name p is set to the output of the predict evaluator. The predict evaluator predicts the value of the dependent variable for an independent variable value of 50. The regression result stored in variable r is used to make the prediction.
7) The output of this expression will be a single tuple with the value of the predict function in the p field.
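The steps above can be mirrored outside Solr to make the data flow concrete. The following is a minimal Python sketch, not Solr's implementation: the functions col, regress and predict here are hypothetical stand-ins for the evaluators of the same name, the sample tuples are made up, and it assumes that the first argument to regress is the independent variable and that the regression is an ordinary least squares fit of a straight line.

```python
from typing import Dict, List


def col(tuples: List[Dict[str, float]], field: str) -> List[float]:
    """Extract one numeric column from a list of tuples (modelled as dicts)."""
    return [float(t[field]) for t in tuples]


def regress(x: List[float], y: List[float]) -> Dict[str, float]:
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two columns of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("independent variable has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model: Dict[str, float], value: float) -> float:
    """Predict the dependent variable for one value of the independent variable."""
    return model["intercept"] + model["slope"] * value


# Made-up stand-ins for the two timeseries() result sets (variables a and b).
a = [{"count(*)": 10}, {"count(*)": 20}, {"count(*)": 30}, {"count(*)": 40}]
b = [{"count(*)": 25}, {"count(*)": 45}, {"count(*)": 65}, {"count(*)": 85}]

c = col(a, "count(*)")             # step 3: extract the count(*) columns
d = col(b, "count(*)")
r = regress(c, d)                  # step 4: simple regression of d on c
result = {"p": predict(r, 50)}     # steps 5-7: single tuple with field p
print(result)
```

Because the sample data satisfies d = 2 * c + 5 exactly, the fitted line has slope 2 and intercept 5, so the p field comes out as 105.0.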
The growing list of issues linked to this ticket covers the array manipulation and statistical functions that will form the basis of the stats library. The vast majority of these functions are backed by algorithms in Apache Commons Math. Other machine learning and math libraries will follow.
Attachments
Issue Links
relates to:
- SOLR-10590 Add Cross Correlation Stream Evaluator (Open)
- SOLR-10681 Add Dynamic Time Warping (DTW) Stream Evaluator (Open)
- SOLR-10685 Add Fast Cosine Transform and Inverse Fast Cosine Transform Stream Evaluators (Open)
- SOLR-11337 Add binomial confidence interval Stream Evaluators (Open)
- SOLR-11340 Add sample size calculator Stream Evaluator (Open)
- SOLR-11699 Add hyperGeometricDisitribution Stream Evaluator (Open)
- SOLR-11796 Support prior distributions and Bayesian Networks (Open)
- SOLR-11802 Add wilcoxonSignedRank Stream Evaluator to support the Wilcoxon Signed Rank Test (Open)
- SOLR-12411 Add arima Stream Evaluator (Open)
- SOLR-12747 Add binomialTest Stream Evaluator (Open)
- SOLR-13554 Add robust flag to movingMAD Stream Evaluation (Open)
- SOLR-10623 Add sql Streaming Expression (Resolved)
- SOLR-10633 Add cosAngle Stream Evaluator (Resolved)
- SOLR-10660 Add reverse Stream Evaluator (Resolved)
- SOLR-10661 Add copyOf Stream Evaluator (Resolved)
- SOLR-10662 Add length Stream Evaluator (Resolved)
- SOLR-10663 Add distance Stream Evaluator (Resolved)
- SOLR-10664 Add scale Stream Evaluator (Resolved)
- SOLR-10666 Add rank transformation Stream Evaluator (Resolved)
- SOLR-10682 Add variance Stream Evaluator (Resolved)
- SOLR-10683 Add mean Stream Evaluator (Resolved)
- SOLR-10684 Add finddelay Stream Evaluator (Resolved)
- SOLR-10693 Add copyOfRange Stream Evaluator (Resolved)
- SOLR-10696 Add empirical distribution and percentile Stream Evaluators (Resolved)
- SOLR-10724 Add describe Stream Evaluator (Resolved)
- SOLR-10731 Add knn Streaming Expression (Resolved)
- SOLR-10743 Add sequence StreamEvaluator (Resolved)
- SOLR-10747 Allow /stream handler to execute Stream Evaluators directly (Resolved)
- SOLR-10753 Add array Stream Evaluator (Resolved)
- SOLR-10754 Add hist Stream Evaluator (Resolved)
- SOLR-10765 Add anova Stream Evaluator (Resolved)
- SOLR-10767 Add movingAvg Stream Evaluator (Resolved)
- SOLR-10813 Add arraySort Stream Evaluator (Resolved)
- SOLR-11609 Add randomWalk Stream Evaluator to support time series simulations (Resolved)
- SOLR-12702 Add zscores Stream Evaluator (Resolved)
- SOLR-13105 A visual guide to Solr Math Expressions and Streaming Expressions (Resolved)
- SOLR-13135 Add qqplot function to perform quantile plots in Apache Zeppelin (Resolved)
- SOLR-13287 Allow zplot to visualize probability distributions in Apache Zeppelin (Resolved)
- SOLR-13298 Allow zplot to plot matrices (Resolved)
- SOLR-13555 Add centeredMovingAvg Stream Evaluator (Resolved)
- SOLR-10559 Add let, get and tuple Streaming Expressions (Closed)
- SOLR-10582 Add Correlation Stream Evaluator (Closed)
- SOLR-10622 Add regress and predict Stream Evaluators (Closed)
- SOLR-10625 Add convolution Stream Evaluator (Closed)
- SOLR-10626 Add covariance Stream Evaluator (Closed)
- SOLR-10638 Add normalize Stream Evaluator (Closed)
- SOLR-10680 Add minMaxScale Stream Evaluator (Closed)
- SOLR-11019 Add addAll Stream Evaluator (Closed)
- SOLR-11046 Add residuals Stream Evaluator (Closed)
- SOLR-11047 Add ebeMultiply Stream Evaluator (Closed)
- SOLR-11160 Add normalDistribution, uniformDistribution, sample and kolmogorovSmirnov Stream Evaluators (Closed)
- SOLR-11172 Add Mann-Whitney U test Stream Evaluator (Closed)
- SOLR-11225 Add cumulativeProbability Stream Evaluator (Closed)
- SOLR-11241 Add discrete counting and probability Stream Evaluators (Closed)
- SOLR-11321 Add ebeAdd, ebeSubtract, ebeDivide, ebeMultiply, dotProduct and cosineSimilarity Stream Evaluators (Closed)
- SOLR-11338 Add Kendall's Tau-b rank and Spearmans rank correlation Stream Evaluators (Closed)
- SOLR-11339 Add Canberra, Chebyshev, Earth Movers and Manhattan Distance Stream Evaluators (Closed)
- SOLR-11342 Add sumDifference and meanDifference Stream Evaluators (Closed)
- SOLR-11350 Add primes Stream Evaluator (Closed)
- SOLR-11354 Add factorial and movingMedian Stream Evaluators (Closed)
- SOLR-11374 Add movingMedian Stream Evaluator (Closed)
- SOLR-11377 Add expMovingAverage (exponential moving average) and binomialCoefficient Stream Evaluators (Closed)
- SOLR-11388 Add monteCarlo Stream Evaluator to support Monte Carlo simulations (Closed)
- SOLR-11398 Add weibullDistribution Stream Evaluator (Closed)
- SOLR-11400 Add logNormalDistribution Stream Evaluator (Closed)
- SOLR-11401 Add zipFDistribution Stream Evaluator (Closed)
- SOLR-11414 Add gammaDistribution Stream Evaluator (Closed)
- SOLR-11415 Add betaDistribution Stream Evaluator (Closed)
- SOLR-11428 Add spline Stream Evaluator to support spline interpolation (Closed)
- SOLR-11429 Add loess Stream Evaluator to support Local Regression interpolation (Closed)
- SOLR-11430 Add lerp and akima Stream Evaluators to support linear and akima spline interpolation (Closed)
- SOLR-11436 Add polyfit and polyfitDerivative Stream Evaluators (Closed)
- SOLR-11439 Add harmonicFit Stream Evaluator (Closed)
- SOLR-11565 Add unit Stream Evaluator to support unitizing of vectors and matrices (Closed)
- SOLR-11567 Add triangularDistribution Stream Evaluator (Closed)
- SOLR-11569 Add support for distance matrices to the distance Stream Evaluator (Closed)
- SOLR-11570 Add support for correlation matrices to the corr Stream Evaluator (Closed)
- SOLR-11571 Add diff Stream Evaluator to support time series differencing (Closed)
- SOLR-11572 Add recip Stream Evaluator to support reciprocal transformations (Closed)
- SOLR-11680 Add normalizeSum Stream Evaluator (Closed)
- SOLR-11681 Add ttest and pairedTtest Stream Evaluators (Closed)
- SOLR-11682 Add gtestDataSet Stream Evaluator (Closed)
- SOLR-11683 Add chiSquareDataSet Stream Evaluator (Closed)
- SOLR-11697 Add geometricDistribution Stream Evaluator (Closed)
- SOLR-11734 Add ones and zeros Stream Evaluators (Closed)
- SOLR-11785 Add multiVariateNormalDistribution Stream Evaluator (Closed)
- SOLR-11791 Add density Stream Evaluator (Closed)
- SOLR-11808 Add sumSq Stream Evaluator (Closed)
- SOLR-11867 Add indexOf, rowCount and columnCount StreamEvaluators (Closed)
- SOLR-11908 Add multiVariateNormalMixtureDistribution Stream Evaluator (Closed)
- SOLR-12401 Add getValue() and setValue() Stream Evaluators (Closed)
- SOLR-12629 The predict evaluator should work with the polyfit function (Closed)
- SOLR-12634 Add gaussfit Stream Evaluator (Closed)
- SOLR-12660 Add outliers Stream Evaluator to support outlier detection (Closed)
- SOLR-12840 Add pairSort Stream Evaluator (Closed)
- SOLR-12862 Add log10 Stream Evaluator and allow the pow Stream Evaluator to accept a vector of exponents (Closed)
- SOLR-12936 Allow percentiles Stream Evaluator to accept an array of percentiles to calculate (Closed)
- SOLR-12975 Add ltrim and rtrim Stream Evaluators (Closed)
- SOLR-13047 Add facet2D Streaming Expression (Closed)
- SOLR-13088 Add zplot Stream Evaluator to plot math expressions in Apache Zeppelin (Closed)
- SOLR-13104 Add natural and repeat Stream Evaluators (Closed)
- SOLR-13147 Add movingMAD Stream Evaluator (Closed)
- SOLR-13550 Allow zplot to automatically create the x axis (Closed)
- SOLR-11593 Add support for covariance matrices to the cov Stream Evaluator (Closed)
- SOLR-11594 Add precision Stream Evaluator (Closed)
- SOLR-11602 Add Markov Chain Stream Evaluator (Closed)
- SOLR-13632 Support integral plots, cosine distance and string truncation with math expressions (Resolved)
- SOLR-13391 Add variance and standard deviation stream evaluators (Resolved)
- SOLR-10766 Allow col Stream Evaluator to pull columns from a stream (Closed)
- SOLR-11674 Support ranges in the probability Stream Evaluator (Closed)
- SOLR-12158 Allow the monteCarlo Stream Evaluator to support variables (Closed)
- SOLR-12159 Add memset Stream Evaluator (Closed)
- SOLR-13494 Add DeepRandomStream implementation (Closed)