Mahout / MAHOUT-139

Make use of Vector Iterator capabilities where appropriate

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.2
    • Component/s: None
    • Labels:
      None

      Description

      There are a bunch of places where we loop over the size of the vector when we should be taking advantage of the sparseness, or at least be agnostic about it and use an iterator.

      This patch addresses these issues in the Vector implementations and in the DistanceMeasure implementations.

      It also adds iterateNonZero() and iterateAll(), and drops the Iterable portion of Vector, since it wasn't clear what it was iterating over.
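
The difference between looping over the size of a vector and iterating over its non-zero entries can be sketched as follows. This is a minimal illustration and not the code from the patch: the `VectorIterationSketch`, `SparseVec`, and `Element` names, the map-backed storage, and the helper methods are all assumptions made for the example. Only the method names `iterateAll()` and `iterateNonZero()` come from the issue.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

public class VectorIterationSketch {

  /** One (index, value) pair; a hypothetical stand-in for a vector element type. */
  public static final class Element {
    public final int index;
    public final double value;

    Element(int index, double value) {
      this.index = index;
      this.value = value;
    }
  }

  /** Toy sparse vector (assumption): only non-zero entries are stored in the map. */
  public static final class SparseVec {
    private final int cardinality;
    private final TreeMap<Integer, Double> values = new TreeMap<>();

    public SparseVec(int cardinality) {
      this.cardinality = cardinality;
    }

    public int size() {
      return cardinality;
    }

    public double get(int index) {
      return values.getOrDefault(index, 0.0);
    }

    public void set(int index, double value) {
      if (value == 0.0) {
        values.remove(index);
      } else {
        values.put(index, value);
      }
    }

    /** Visits every index from 0 to size() - 1, zeros included. */
    public Iterator<Element> iterateAll() {
      return new Iterator<Element>() {
        private int cursor = 0;

        @Override
        public boolean hasNext() {
          return cursor < cardinality;
        }

        @Override
        public Element next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          Element e = new Element(cursor, get(cursor));
          cursor++;
          return e;
        }
      };
    }

    /** Visits only the stored (non-zero) entries, in index order. */
    public Iterator<Element> iterateNonZero() {
      final Iterator<Map.Entry<Integer, Double>> it = values.entrySet().iterator();
      return new Iterator<Element>() {
        @Override
        public boolean hasNext() {
          return it.hasNext();
        }

        @Override
        public Element next() {
          Map.Entry<Integer, Double> e = it.next();
          return new Element(e.getKey(), e.getValue());
        }
      };
    }
  }

  /** Loops over the full size: work is proportional to the cardinality. */
  public static double sumOfSquaresBySize(SparseVec v) {
    double sum = 0.0;
    for (int i = 0; i < v.size(); i++) {
      sum += v.get(i) * v.get(i);
    }
    return sum;
  }

  /** Uses the non-zero iterator: work is proportional to the stored entries. */
  public static double sumOfSquaresNonZero(SparseVec v) {
    double sum = 0.0;
    for (Iterator<Element> it = v.iterateNonZero(); it.hasNext();) {
      Element e = it.next();
      sum += e.value * e.value;
    }
    return sum;
  }

  /** Counts how many elements an iterator visits. */
  public static int count(Iterator<Element> it) {
    int n = 0;
    while (it.hasNext()) {
      it.next();
      n++;
    }
    return n;
  }
}
```

For a vector of cardinality 1000 with two stored entries, both sums agree, but the size-based loop performs 1000 lookups while the non-zero iterator visits two elements.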

      1. MAHOUT-139.patch
        43 kB
        Grant Ingersoll

        Issue Links

          Activity

          Grant Ingersoll made changes -
          Status: Resolved [ 5 ] → Closed [ 6 ]
          Grant Ingersoll made changes -
          Status: Open [ 1 ] → Resolved [ 5 ]
          Resolution: Fixed [ 1 ]
          Grant Ingersoll added a comment -

          Committed revision 788186.

          Grant Ingersoll added a comment -

          I'd like to commit this soon. My preliminary tests are pretty positive in terms of the performance gains to be had by being smarter about iteration, but it would be helpful to have some feedback.

          Grant Ingersoll made changes -
          Attachment: MAHOUT-139.patch [ 12411604 ]
          Grant Ingersoll added a comment -

          Draft of a patch that makes a whole lot of conversions to use an appropriate Iterator.

          Drops "Vector extends Iterable" and instead provides two methods:
          iterateAll()
          iterateNonZero()

          Iterators are now implemented by DenseVector and SparseVector instead of AbstractVector, to try to take advantage of class-specific data structures.

          Also updates the DistanceMeasures where appropriate.

          All tests passed in core.

          The profiling view looks a lot healthier too, as the primary bottlenecks are now in code that actually does the work, versus the data structures and accessors.
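
As an illustration of how a distance computation can depend only on the non-zero entries, here is a minimal sketch of a squared Euclidean distance between two sparse vectors. It is not the patched DistanceMeasure code: the `SparseDistanceSketch` class, the `squaredEuclidean` method, and the use of a `Map<Integer, Double>` as the sparse representation are assumptions made for the example.

```java
import java.util.Map;

public class SparseDistanceSketch {

  /**
   * Squared Euclidean distance between two sparse vectors, each given as a
   * map from index to non-zero value (hypothetical representation).
   * Indices missing from a map are treated as zero.
   */
  public static double squaredEuclidean(Map<Integer, Double> a, Map<Integer, Double> b) {
    double sum = 0.0;

    // Every index where a is non-zero: b may or may not have a value there.
    for (Map.Entry<Integer, Double> e : a.entrySet()) {
      double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
      sum += diff * diff;
    }

    // Indices where only b is non-zero: a contributes zero there.
    for (Map.Entry<Integer, Double> e : b.entrySet()) {
      if (!a.containsKey(e.getKey())) {
        sum += e.getValue() * e.getValue();
      }
    }

    // Indices where both are zero contribute nothing and are never visited.
    return sum;
  }
}
```

The cost depends on the number of stored entries in the two vectors rather than on their cardinality, which is the kind of saving the comment above attributes to smarter iteration.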

          Grant Ingersoll made changes -
          Link: This issue incorporates MAHOUT-77
          Grant Ingersoll created issue -

            People

            • Assignee:
              Grant Ingersoll
              Reporter:
              Grant Ingersoll