
Type: Improvement

Status: Closed

Priority: Major

Resolution: Won't Fix

Affects Version/s: None

Fix Version/s: Not Applicable

Component/s: Algorithms, APIs

Labels: None
Currently many scripts contain usage comments such as the following:
# THIS SCRIPT COMPUTES AN APPROXIMATE FACTORIZATION OF A LOW-RANK MATRIX X INTO TWO MATRICES U AND V
# USING ALTERNATING-LEAST-SQUARES (ALS) ALGORITHM WITH CONJUGATE GRADIENT
# MATRICES U AND V ARE COMPUTED BY MINIMIZING A LOSS FUNCTION (WITH REGULARIZATION)
#
# INPUT PARAMETERS:
#
# NAME    TYPE     DEFAULT   MEANING
#
# X       String             Location to read the input matrix X to be factorized
# U       String             Location to write the factor matrix U
# V       String             Location to write the factor matrix V
# rank    Int      10        Rank of the factorization
# reg     String   "L2"      Regularization:
#                            "L2" = L2 regularization;
#                            "wL2" = weighted L2 regularization
# lambda  Double   0.000001  Regularization parameter, no regularization if 0.0
# maxi    Int      50        Maximum number of iterations
# check   Boolean  FALSE     Check for convergence after every iteration, i.e., updating U and V once
# thr     Double   0.0001    Assuming check is set to TRUE, the algorithm stops and convergence is declared
#                            if the decrease in loss in any two consecutive iterations falls below this threshold;
#                            if check is FALSE thr is ignored
# fmt     String   "text"    The output format of the factor matrices L and R, such as "text" or "csv"
#
# OUTPUT:
# 1 An m x r matrix U, where r is the factorization rank
# 2 An r x n matrix V
#
# HOW TO INVOKE THIS SCRIPT - EXAMPLE:
# hadoop jar SystemML.jar -f ALS-CG.dml -nvargs X=INPUT_DIR/X U=OUTPUT_DIR/U V=OUTPUT_DIR/V rank=10 reg="L2" lambda=0.0001 fmt=csv
Comments like these are difficult to access from a programmatic, interactive environment such as the Spark Shell. If the same information were provided in a parseable format such as JSON or XML, it could be parsed and exposed programmatically, for example through the MLContext API in the Spark Shell.
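As a rough illustration of the idea, the sketch below encodes part of the usage comment above as JSON and reads it back programmatically. The JSON layout and its field names (`description`, `inputs`, `name`, `type`, `default`, `meaning`) and the helper functions are hypothetical, invented here for illustration; they are not an existing SystemML format or API. Python is used only to keep the example short.

```python
import json

# Hypothetical metadata layout: the field names below are assumptions made
# for this sketch, not a format defined by SystemML.
ALS_CG_METADATA = """
{
  "description": "Approximate factorization of a low-rank matrix X into U and V using ALS with conjugate gradient",
  "inputs": [
    {"name": "X", "type": "String", "default": null,
     "meaning": "Location to read the input matrix X to be factorized"},
    {"name": "U", "type": "String", "default": null,
     "meaning": "Location to write the factor matrix U"},
    {"name": "V", "type": "String", "default": null,
     "meaning": "Location to write the factor matrix V"},
    {"name": "rank", "type": "Int", "default": 10,
     "meaning": "Rank of the factorization"},
    {"name": "reg", "type": "String", "default": "L2",
     "meaning": "Regularization: L2 or wL2 (weighted L2)"},
    {"name": "lambda", "type": "Double", "default": 0.000001,
     "meaning": "Regularization parameter, no regularization if 0.0"},
    {"name": "maxi", "type": "Int", "default": 50,
     "meaning": "Maximum number of iterations"}
  ]
}
"""


def load_metadata(text):
    """Parse the JSON metadata string into a dictionary."""
    return json.loads(text)


def required_args(meta):
    """Names of parameters that have no default and must be supplied."""
    return [p["name"] for p in meta["inputs"] if p["default"] is None]


def default_args(meta):
    """Mapping of parameter name to default value, for parameters that have one."""
    return {p["name"]: p["default"] for p in meta["inputs"] if p["default"] is not None}


def describe(meta, name):
    """One-line help text for a single parameter; raises KeyError if unknown."""
    for p in meta["inputs"]:
        if p["name"] == name:
            return f"{p['name']} ({p['type']}, default={p['default']}): {p['meaning']}"
    raise KeyError(name)


if __name__ == "__main__":
    meta = load_metadata(ALS_CG_METADATA)
    print("Required:", required_args(meta))
    print("Defaults:", default_args(meta))
    print(describe(meta, "rank"))
```

An interactive environment could use the same kind of lookup to list required arguments or show help for a parameter without the user opening the script file.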