Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      A set of classes that builds matrices from text.

      The API currently consists of TokenMatrixBuilder and TokenInstanceBuilder, and should be thread-safe.

      PostReader imports the 20news-bydate data set, which takes several GB of heap. It would be nice to bounce the data to disk instead, via JDBM or perhaps the PersistentHashMap from MAHOUT-19.
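
To illustrate what "building matrices from text" with a thread-safe builder could look like, here is a minimal sketch. It is not the code in the attached patch: the class name SketchTokenMatrixBuilder, its methods, and the tokenization rule (lower-casing and splitting on non-alphanumeric characters) are all assumptions made for this example. Each document becomes one row of token counts, and each distinct token gets its own column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; NOT the TokenMatrixBuilder from the MAHOUT-61 patch. */
class SketchTokenMatrixBuilder {
    // token -> column index, assigned the first time a token is seen
    private final ConcurrentHashMap<String, Integer> columns = new ConcurrentHashMap<>();
    private final AtomicInteger nextColumn = new AtomicInteger();
    // one sparse row (column -> count) per document; guarded by synchronized (rows)
    private final List<Map<Integer, Integer>> rows = new ArrayList<>();

    /** Tokenizes the text, appends a row of token counts, and returns the row index. */
    public int addDocument(String text) {
        Map<Integer, Integer> row = new HashMap<>();
        // Assumed tokenization: lower-case, split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // computeIfAbsent runs the mapping function at most once per key,
            // so concurrent callers agree on the column of a given token.
            int column = columns.computeIfAbsent(token, t -> nextColumn.getAndIncrement());
            row.merge(column, 1, Integer::sum);
        }
        synchronized (rows) {
            rows.add(row);
            return rows.size() - 1;
        }
    }

    /** Returns the column assigned to a token, or null if the token has not been seen. */
    public Integer columnOf(String token) {
        return columns.get(token.toLowerCase(Locale.ROOT));
    }

    /** Copies the rows added so far into a dense documents-by-tokens count matrix. */
    public int[][] toDenseMatrix() {
        synchronized (rows) {
            int[][] matrix = new int[rows.size()][nextColumn.get()];
            for (int r = 0; r < rows.size(); r++) {
                for (Map.Entry<Integer, Integer> cell : rows.get(r).entrySet()) {
                    matrix[r][cell.getKey()] = cell.getValue();
                }
            }
            return matrix;
        }
    }
}
```

The sketch keeps every row on the heap, so it has the same memory problem noted for PostReader; swapping the in-memory list for a disk-backed store would be the place to address that.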

      1. MAHOUT-61.txt
        27 kB
        Karl Wettin
      2. MAHOUT-61.txt
        44 kB
        Karl Wettin
      3. MAHOUT-61.txt
        64 kB
        Karl Wettin

            People

            • Assignee:
              Karl Wettin
              Reporter:
              Karl Wettin
            • Votes:
              0
              Watchers:
              0