Mahout / MAHOUT-162

Added support for mapping String to long IDs in CF code

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.2
    • Labels:
      None

      Description

      Since the framework now only allows long (64-bit integer) IDs, and no longer Strings, we need to provide some support for translating between the two. The basic proposal is this:

      • Define a one-way mapping from Strings to longs that is repeatable and easy to implement in many contexts. In particular, I propose using the bottom 64 bits of the MD5 hash of the string.
      • Define support for storing the reverse mapping (longs to Strings) efficiently in various backing stores, in a way that gracefully handles the very rare possibility of a collision.
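
      The two points above can be sketched roughly as follows. This is only an illustration of the proposal, not the code in the attached patch: the class name StringIdMapper and its methods are hypothetical, the string is assumed to be encoded as UTF-8, "bottom 64 bits" is assumed to mean the last 8 bytes of the 16-byte digest read big-endian, and the reverse mapping is kept in a plain in-memory map with a collision simply reported as an error.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: illustrates the proposed mapping only.
class StringIdMapper {

    // Reverse mapping (long -> String); an in-memory map is assumed for brevity.
    private final Map<Long, String> reverse = new HashMap<>();

    // One-way, repeatable mapping: MD5 of the UTF-8 bytes, then the last
    // 8 bytes of the 16-byte digest interpreted as a big-endian long.
    static long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest, 8, 8).getLong();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Records the reverse mapping and returns the long ID.
    // A different string already stored under the same ID is a collision.
    long store(String stringID) {
        long id = toLongID(stringID);
        String previous = reverse.putIfAbsent(id, stringID);
        if (previous != null && !previous.equals(stringID)) {
            throw new IllegalStateException(
                "Collision: '" + stringID + "' and '" + previous + "' both map to " + id);
        }
        return id;
    }

    // Returns the original string, or null if the ID was never stored.
    String toStringID(long id) {
        return reverse.get(id);
    }
}
```

      Because the forward mapping depends only on the string, any component can recompute an ID without shared state; only the reverse lookup needs storage. How a collision should be handled in practice (reject, log, or remap) is left open here.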

        Attachments

        1. MAHOUT-162.patch
          15 kB
          Sean R. Owen

              People

              • Assignee:
                srowen Sean R. Owen
                Reporter:
                srowen Sean R. Owen
              • Votes:
                1
                Watchers:
                1