  1. Mahout
  2. MAHOUT-162

Added support for mapping String to long IDs in CF code

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.2
    • Component/s: None
    • Labels: None

    Description

      Since the framework now allows only long (64-bit integer) IDs, and no longer Strings, we need to provide some support for translating between the two. The basic proposal is this:

      • Define a one-way mapping from Strings to longs that is repeatable and easy to implement in many contexts. In particular I propose using the bottom 64 bits of the MD5 hash of a string.
      • Define support for storing the reverse mapping (longs to Strings) efficiently, in various ways, while gracefully handling the very rare possibility of a collision.
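
      The two points of the proposal can be sketched roughly as follows. This is only an illustration, not the code in MAHOUT-162.patch: the class name StringIdMapper and its methods are hypothetical, the reverse mapping is held in a plain in-memory map, and "bottom 64 bits" is assumed to mean the last 8 bytes of the 16-byte MD5 digest, read big-endian, over the UTF-8 bytes of the string. Any implementation that must interoperate with this one would have to make the same choices.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the class from the attached patch.
final class StringIdMapper {

  // Reverse mapping (long -> String), kept in memory for this sketch.
  private final Map<Long, String> longToString = new HashMap<>();

  // One-way, repeatable mapping from a String to a long.
  // Assumption: "bottom 64 bits" = last 8 of the 16 digest bytes, big-endian.
  static long toLongID(String stringID) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
      return ByteBuffer.wrap(digest, 8, 8).getLong();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  // Records the reverse mapping and returns the long ID.
  // A collision (two different Strings with the same long) is reported
  // rather than silently overwriting the earlier entry.
  long store(String stringID) {
    long longID = toLongID(stringID);
    String previous = longToString.putIfAbsent(longID, stringID);
    if (previous != null && !previous.equals(stringID)) {
      throw new IllegalStateException(
          "ID collision between '" + previous + "' and '" + stringID + "'");
    }
    return longID;
  }

  // Looks up the original String, or returns null if it was never stored.
  String toStringID(long longID) {
    return longToString.get(longID);
  }
}
```

      Throwing on a collision is just one possible policy; a persistent store (a file or database table, say) could instead log the conflict or keep both candidates, depending on what "graceful" should mean for the application.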

      Attachments

        1. MAHOUT-162.patch
          15 kB
          Sean R. Owen

            People

              Assignee: Sean R. Owen (srowen)
              Reporter: Sean R. Owen (srowen)
              Votes: 1
              Watchers: 1
