Commons Math / MATH-765

Refactoring the vector and matrix classes

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 3.3
    • 4.X
    • None

    Description

      Warning

      This is not a bug report, but rather a summary of all discussions which have taken place on the mailing list regarding the refactoring of the vector and matrix classes. Indeed, it has been argued many times that the RealVector and RealMatrix interfaces are really cluttered, and could benefit from other approaches (like functional programming).

      The description of this ticket will be updated as the discussion progresses on the mailing list, and new JIRA tickets will be created to carry out the "real" work. In order to keep this ticket tidy, contributors should refrain from commenting on this website. Instead, messages should be posted on the dev mailing list.

      The current API (version 3.0)

      In this section, the current interfaces for vectors and matrices are compared. Vectors and matrices are two mathematical objects which are very close in nature. Their implementations should therefore be as similar as possible. The methods are sorted as follows:

      • methods reflecting the mathematical structure of vector space: addition, multiplication by a scalar, matrix-vector product, ...
      • methods reflecting the mathematical structure of Euclidean space
      • ...

      Methods reflecting the mathematical structure of vector space

      List of the methods

      | RealVector | RealMatrix | Comments |
      | RealVector add(RealVector v) | RealMatrix add(RealMatrix m) | |
      | RealVector subtract(RealVector v) | RealMatrix subtract(RealMatrix m) | |
      | int getDimension() | int getRowDimension(), int getColumnDimension() | |
      | RealVector mapMultiply(double d) | RealMatrix scalarMultiply(double d) | (1) |
      | RealVector mapMultiplyToSelf(double d) | | |
      | RealMatrix outerProduct(RealVector v) | | |
      | | double getTrace() | |
      | | RealMatrix multiply(RealMatrix m) | |
      | | double[] operate(double[]) | (2) |
      | | RealVector operate(RealVector) | |
      | | RealMatrix power(int p) | |
      | | double[] preMultiply(double[]) | (2) |
      | | RealMatrix preMultiply(RealMatrix) | |
      | | RealVector preMultiply(RealVector) | |
      | | RealMatrix transpose() | |

      Comments on the above methods

      Comment (1)

      RealVector RealVector.mapMultiply(double) and RealMatrix RealMatrix.scalarMultiply(double) perform essentially the same task. Readability of the classes would be improved if they had the same name. This is very important, since these methods reflect the fact that the space of vectors and the space of matrices are both vector spaces.
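      One way to make the shared vector-space structure explicit would be a small generic interface implemented by both types. The sketch below is only an illustration: VectorSpaceElement and Vec2 are hypothetical names, not part of Commons Math, and the choice of scalarMultiply as the common name is an assumption.

```java
/** Hypothetical interface: operations shared by every element of a real vector space. */
interface VectorSpaceElement<T extends VectorSpaceElement<T>> {
    T add(T other);

    T scalarMultiply(double factor);

    /** Subtraction follows from the two primitives above. */
    default T subtract(T other) {
        return add(other.scalarMultiply(-1.0));
    }
}

/** Toy 2D vector, used only to exercise the interface. */
final class Vec2 implements VectorSpaceElement<Vec2> {
    final double x;
    final double y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public Vec2 add(Vec2 other) {
        return new Vec2(x + other.x, y + other.y);
    }

    @Override
    public Vec2 scalarMultiply(double factor) {
        return new Vec2(factor * x, factor * y);
    }
}
```

      A matrix type implementing the same interface would then expose the same method name for multiplication by a scalar, which is the point made in this comment.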

      Comment (2)

      Prior to the release of version 3.0, all RealVector methods taking as argument, or returning, a double[] as a representation of a vector were removed. The rationale is that calling new ArrayRealVector(double[], false) is very easy, and comes at virtually no cost (see MATH-653 and MATH-660). It might be worth considering the same simplification for the RealMatrix interface.
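      The "virtually no cost" argument rests on the second constructor argument: when the copy flag is false, the vector keeps a reference to the caller's array instead of copying it. The stand-in below mimics that behaviour in isolation; ArrayBackedVector is a hypothetical class written for this note, not the library's ArrayRealVector.

```java
/** Hypothetical stand-in mimicking a (double[], boolean copyArray) constructor. */
final class ArrayBackedVector {
    private final double[] data;

    ArrayBackedVector(double[] d, boolean copyArray) {
        // copyArray == false: keep a reference to the caller's array (constant time).
        // copyArray == true: take a defensive copy (linear time and memory).
        this.data = copyArray ? d.clone() : d;
    }

    double getEntry(int index) {
        return data[index];
    }
}
```

      The price of wrapping without copying is aliasing: later writes to the original array are visible through the wrapper.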

      Methods reflecting the mathematical structure of Euclidean space

      List of the methods

      | RealVector | RealMatrix | Comments |
      | double cosine(RealVector v) | | |
      | double dotProduct(RealVector v) | | (3) |
      | double getDistance(RealVector v) | | |
      | double getNorm() | double getFrobeniusNorm() | |
      | RealVector projection(RealVector v) | | |
      | void unitize() | | (4) |
      | RealVector unitVector() | | |

      Comments on the above methods

      Comment (3)

      In a way, RealMatrix RealMatrix.transpose() could be seen as a method inherent to the Euclidean structure, and as a generalization of the dot product. For this reason, transpose() should probably not be externalized.
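      The link between the transpose and the dot product can be written down explicitly. Viewing vectors as column matrices (an assumption of this note, the ticket does not spell it out), the two standard identities are:

```latex
\begin{align*}
\langle x, y \rangle &= x^{\mathsf{T}} y
  && \text{for all } x, y \in \mathbb{R}^{n}, \\
\langle A x, y \rangle &= \langle x, A^{\mathsf{T}} y \rangle
  && \text{for all } A \in \mathbb{R}^{m \times n},\ x \in \mathbb{R}^{n},\ y \in \mathbb{R}^{m}.
\end{align*}
```

      The second identity characterizes the transpose through the dot product alone, which is why it can be regarded as part of the Euclidean structure.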

      Comment (4)

      This could be externalized with the visitor pattern (see below).

      Comment (5)

      Could be externalized in a factory class.

      Constructors, factory methods and related methods

      List of the methods

      | RealVector | RealMatrix | Comments |
      | RealVector append(double d) | | |
      | RealVector append(RealVector v) | | |
      | RealVector copy() | RealMatrix copy() | |
      | | void copySubMatrix(int[] selectedRows, int[] selectedColumns, double[][] destination) | (6), (7) |
      | | void copySubMatrix(int startRow, int endRow, int startColumn, int endColumn, double[][] destination) | (7) |
      | | RealMatrix createMatrix(int rowDimension, int columnDimension) | (9) |
      | | double[] getColumn(int column) | (6) |
      | | RealMatrix getColumnMatrix(int column) | |
      | | RealVector getColumnVector(int column) | |
      | | double[] getRow(int row) | (6) |
      | | RealMatrix getRowMatrix(int row) | |
      | | RealVector getRowVector(int row) | |
      | | RealMatrix getSubMatrix(int[] selectedRows, int[] selectedColumns) | |
      | RealVector getSubVector(int index, int n) | RealMatrix getSubMatrix(int startRow, int endRow, int startColumn, int endColumn) | (8) |
      | | void setColumn(int column, double[] array) | (6) |
      | | void setColumnMatrix(int column, RealMatrix matrix) | |
      | | void setColumnVector(int column, RealVector vector) | |
      | | void setRow(int row, double[] array) | (6) |
      | | void setRowMatrix(int row, RealMatrix matrix) | |
      | | void setRowVector(int row, RealVector vector) | |
      | void setSubVector(int index, RealVector v) | void setSubMatrix(double[][] subMatrix, int row, int column) | (6) |
      | static RealVector unmodifiableRealVector(RealVector v) | | |

      Comments on the above methods

      Comment (6)

      Prior to the release of version 3.0, all RealVector methods taking as argument, or returning, a double[] as a representation of a vector were removed. The rationale is that calling new ArrayRealVector(double[], false) is very easy, and comes at virtually no cost (see MATH-653 and MATH-660). It might be worth considering the same simplification for the RealMatrix interface.

      Comment (7)

      The signature of this method is rather unusual in Commons Math, as one of the parameters is modified, and nothing is returned.

      Comment (8)

      Inconsistency: in getSubVector(int, int), the second parameter is the number of entries to be copied, while in getSubMatrix(int, int, int, int), the second (resp. fourth) parameter is the index of the last row (resp. column).
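      The practical effect of the two conventions is easiest to see on a plain array. The helpers below are hypothetical stand-ins written for this note (they are not library methods); one follows the "start plus count" convention of getSubVector, the other the "start plus inclusive end index" convention of getSubMatrix.

```java
import java.util.Arrays;

final class SubRangeConventions {
    /** Count-based convention, as in getSubVector(index, n): copies n entries. */
    static double[] byCount(double[] v, int index, int n) {
        return Arrays.copyOfRange(v, index, index + n);
    }

    /** Inclusive-end convention, as in the row/column arguments of getSubMatrix. */
    static double[] byInclusiveEnd(double[] v, int start, int endInclusive) {
        return Arrays.copyOfRange(v, start, endInclusive + 1);
    }
}
```

      With v = {10, 20, 30, 40, 50}, the same pair of arguments (2, 2) selects {30, 40} under the first convention and {30} under the second, which is an easy source of off-by-one errors.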

      Comment (9)

      This is a very useful method: one often needs to create a new vector/matrix with the same data layout as an existing vector/matrix. This method should probably be generalized to RealVector as well.

      Manipulation of entries

      List of the methods

      | RealVector | RealMatrix | Comments |
      | double getEntry(int index) | double getEntry(int row, int column) | |
      | void set(double value) | | (10) |
      | void setEntry(int index, double value) | void setEntry(int row, int column, double value) | |
      | void addToEntry(int index, double increment) | void addToEntry(int row, int column, double increment) | (11) |
      | | void multiplyEntry(int row, int column, double factor) | (11) |

      Comments on these methods

      Comment (10)

      This useful method could be extended to RealMatrix as well. However, it could be argued that this method is superfluous, as the visitor pattern (or indeed the mapToSelf() method) would do in this case.

      Comment (11)

      These are typical examples of useful methods which can lead to uncontrolled growth of the APIs.

      Various norms and related methods

      List of the methods

      | RealVector | RealMatrix | Comments |
      | double getL1Distance(RealVector v) | | (13) |
      | double getL1Norm() | | (12) |
      | double getLInfDistance(RealVector v) | | (13) |
      | double getLInfNorm() | double getNorm() | (12), (14) |
      | int getMaxIndex() | | (12), (15) |
      | double getMaxValue() | | (12), (15) |
      | int getMinIndex() | | (12), (15) |
      | double getMinValue() | | (12), (15) |

      Comments on these methods

      Comment (12)

      This method could be removed from the RealVector API. Visitors should be implemented instead.
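      As an illustration of what such a visitor might look like, here is a minimal sketch computing the L1 norm. The interface VectorPreservingVisitor and the walk() helper are hypothetical, loosely modelled on the start/visit/end shape of RealMatrixPreservingVisitor; a plain double[] stands in for the vector.

```java
/** Hypothetical read-only visitor for vector entries. */
interface VectorPreservingVisitor {
    void start(int dimension);

    void visit(int index, double value);

    double end();
}

/** Accumulates the sum of absolute values, i.e. the L1 norm. */
final class L1NormVisitor implements VectorPreservingVisitor {
    private double sum;

    @Override
    public void start(int dimension) {
        sum = 0.0;
    }

    @Override
    public void visit(int index, double value) {
        sum += Math.abs(value);
    }

    @Override
    public double end() {
        return sum;
    }
}

final class VectorWalker {
    /** Stand-in for a walk method on a vector; returns the visitor's result. */
    static double walk(double[] data, VectorPreservingVisitor visitor) {
        visitor.start(data.length);
        for (int i = 0; i < data.length; i++) {
            visitor.visit(i, data[i]);
        }
        return visitor.end();
    }
}
```

      The L-infinity norm and the min/max searches would be further visitor implementations, leaving a single walk method in the vector API.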

      Comment (13)

      This method could be removed from the RealVector API. An alternative functional approach should be proposed (e.g. zip).
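      A zip-style helper could cover both distances with one generic method: entries are combined pairwise, then folded into a single number. The sketch below works on plain arrays and uses Java 8 functional interfaces; zipReduce and the class name are hypothetical, chosen for this note only.

```java
import java.util.function.DoubleBinaryOperator;

final class ZipReduce {
    /** Combines entries pairwise with pair, then folds the results with reduce. */
    static double zipReduce(double[] a, double[] b, DoubleBinaryOperator pair,
                            DoubleBinaryOperator reduce, double identity) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double acc = identity;
        for (int i = 0; i < a.length; i++) {
            acc = reduce.applyAsDouble(acc, pair.applyAsDouble(a[i], b[i]));
        }
        return acc;
    }

    /** L1 distance: sum of absolute differences. */
    static double l1Distance(double[] a, double[] b) {
        return zipReduce(a, b, (x, y) -> Math.abs(x - y), Double::sum, 0.0);
    }

    /** L-infinity distance: largest absolute difference. */
    static double lInfDistance(double[] a, double[] b) {
        return zipReduce(a, b, (x, y) -> Math.abs(x - y), Math::max, 0.0);
    }
}
```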

      Comment (14)

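      Inconsistency: the two getNorm() methods do not refer to the same type of norm. As the tables above show, RealVector.getNorm() is the Euclidean norm, whose matrix counterpart is getFrobeniusNorm(), whereas RealMatrix.getNorm() is the counterpart of RealVector.getLInfNorm().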

      Comment (15)

      The methods int getXXXIndex() and double getXXXValue() could be merged; the return type might be Pair<Integer, Double>. This would avoid an inefficient duplicate sweep of the vector when both the value and the index are needed.
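      A merged method could look like the following sketch, which finds the minimum in a single sweep. IndexedValue is a hypothetical stand-in for the Pair<Integer, Double> mentioned above, a plain array stands in for the vector, and NaN handling is deliberately left out.

```java
/** Hypothetical result type: position and value of an entry. */
final class IndexedValue {
    final int index;
    final double value;

    IndexedValue(int index, double value) {
        this.index = index;
        this.value = value;
    }
}

final class Extrema {
    /** Single sweep returning both the index and the value of the minimum. */
    static IndexedValue min(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("empty vector");
        }
        int best = 0;
        for (int i = 1; i < data.length; i++) {
            if (data[i] < data[best]) {
                best = i; // strict comparison keeps the first minimum on ties
            }
        }
        return new IndexedValue(best, data[best]);
    }
}
```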

      Functional-programming-like methods

      List of the methods

      The methods listed below include both "truly" FP methods and methods which could benefit from an FP approach.

      | RealVector | RealMatrix | Comments |
      | Iterator<RealVector.Entry> iterator() | | |
      | Iterator<RealVector.Entry> sparseIterator() | | |
      | RealVector map(UnivariateFunction function) | | (18) |
      | RealVector mapToSelf(UnivariateFunction function) | | (18) |
      | RealVector mapAdd(double d) | RealMatrix scalarAdd(double d) | (16) |
      | RealVector mapAddToSelf(double d) | | (16) |
      | RealVector mapDivide(double d) | | (16) |
      | RealVector mapDivideToSelf(double d) | | (16) |
      | RealVector mapSubtract(double d) | | (16) |
      | RealVector mapSubtractToSelf(double d) | | (16) |
      | RealVector ebeDivide(RealVector v) | | (17) |
      | RealVector ebeMultiply(RealVector v) | | (17) |
      | RealVector combine(double a, double b, RealVector y) | | (17) |
      | RealVector combineToSelf(double a, double b, RealVector y) | | (17) |
      | | double walkInColumnOrder(RealMatrixChangingVisitor visitor) | (18) |
      | | double walkInColumnOrder(RealMatrixChangingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |
      | | double walkInColumnOrder(RealMatrixPreservingVisitor visitor) | (18) |
      | | double walkInColumnOrder(RealMatrixPreservingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |
      | | double walkInOptimizedOrder(RealMatrixChangingVisitor visitor) | (18) |
      | | double walkInOptimizedOrder(RealMatrixChangingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |
      | | double walkInOptimizedOrder(RealMatrixPreservingVisitor visitor) | (18) |
      | | double walkInOptimizedOrder(RealMatrixPreservingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |
      | | double walkInRowOrder(RealMatrixChangingVisitor visitor) | (18) |
      | | double walkInRowOrder(RealMatrixChangingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |
      | | double walkInRowOrder(RealMatrixPreservingVisitor visitor) | (18) |
      | | double walkInRowOrder(RealMatrixPreservingVisitor visitor, int startRow, int endRow, int startColumn, int endColumn) | (18) |

      Comments on these methods

      Comment (16)

      These methods could be removed and replaced by a call to map() or mapToSelf().
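      To illustrate how the scalar variants reduce to a generic map, here is a sketch on plain arrays. The helpers are hypothetical stand-ins for RealVector.map() and mapToSelf(), and DoubleUnaryOperator (Java 8) is used in place of the library's UnivariateFunction so that the example is self-contained.

```java
import java.util.function.DoubleUnaryOperator;

final class MapSketch {
    /** Stand-in for map(): returns a new array and leaves the input untouched. */
    static double[] map(double[] v, DoubleUnaryOperator f) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = f.applyAsDouble(v[i]);
        }
        return out;
    }

    /** Stand-in for mapToSelf(): overwrites the entries in place. */
    static double[] mapToSelf(double[] v, DoubleUnaryOperator f) {
        for (int i = 0; i < v.length; i++) {
            v[i] = f.applyAsDouble(v[i]);
        }
        return v;
    }
}
```

      With these two helpers, mapAdd(d) becomes map(v, x -> x + d) and mapDivideToSelf(d) becomes mapToSelf(v, x -> x / d); the other scalar variants follow the same pattern.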

      Comment (17)

      These methods could benefit from an FP approach (e.g. zip).
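      As a rough sketch of the idea, a single element-wise zipWith helper is enough to express all four methods. The code below uses plain arrays and a Java 8 DoubleBinaryOperator; zipWith and the class name are hypothetical and only meant to show the shape of such an API.

```java
import java.util.function.DoubleBinaryOperator;

final class ZipWith {
    /** Applies f to corresponding entries of x and y and returns a new array. */
    static double[] zipWith(double[] x, double[] y, DoubleBinaryOperator f) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = f.applyAsDouble(x[i], y[i]);
        }
        return out;
    }

    /** ebeMultiply expressed through zipWith. */
    static double[] ebeMultiply(double[] x, double[] y) {
        return zipWith(x, y, (xi, yi) -> xi * yi);
    }

    /** combine(a, b, y), i.e. a * x + b * y, expressed through zipWith. */
    static double[] combine(double[] x, double a, double b, double[] y) {
        return zipWith(x, y, (xi, yi) -> a * xi + b * yi);
    }
}
```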

      Comment (18)

      map() and visit() are similar concepts, which are both useful. Should map() be implemented for RealMatrix? Should visit() be implemented for RealVector?
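      One way to see the relation between the two concepts: a map is a visit that ignores the indices. The sketch below implements mapToSelf for a matrix on top of a changing visitor. EntryChangingVisitor is a hypothetical, reduced form of a changing visitor (a single visit method), and a double[][] stands in for RealMatrix.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical reduced changing visitor: returns the new value of the entry. */
interface EntryChangingVisitor {
    double visit(int row, int column, double value);
}

final class MatrixWalker {
    /** Visits every entry row by row and stores the value returned by the visitor. */
    static void walkInRowOrder(double[][] m, EntryChangingVisitor visitor) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = visitor.visit(i, j, m[i][j]);
            }
        }
    }

    /** mapToSelf expressed as a visitor that does not look at the indices. */
    static void mapToSelf(double[][] m, DoubleUnaryOperator f) {
        walkInRowOrder(m, (row, column, value) -> f.applyAsDouble(value));
    }
}
```

      The converse does not hold: an operation such as "zero every off-diagonal entry" needs the indices, so it can be written as a visitor but not as a map.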

      Data conversion methods

      List of the methods

      | RealVector | RealMatrix | Comments |
      | double[] toArray() | double[][] getData() | |

            People

              Assignee: Unassigned
              Reporter: Sebastien Brisard (celestin)