We are using the cooccurrence part of Mahout as a library. We get our data from other sources, such as Cassandra. We don't want to write that data to disk and then read it back, since we already have the data on each slave.
I have created some conversion functions based on one of the IndexedDatasetSpark readers; I can't remember which one at the moment.
Is there interest in the community for this kind of feature? I can probably clean it up and submit it as a GitHub pull request.