Map-reduce does not handle this well. There are two ways to implement this in Hadoop:
- Null Mapper -> Reducer<IntWritable,VectorWritable>
  - The Reducer loads iterators over all of the VectorWritables for a row, then walks forward monotonically through all of the iterators (see the sketch after this list).
- Mapper -> Partitioner<1 Reducer per row> -> Reducer<IntWritable index, DoubleWritable value>
  - The Reducer's setup()/cleanup() create and emit an output VectorWritable; each reduce() call receives one vector index and one or more values.
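For concreteness, here is a minimal sketch of the first design's Reducer. The class name is made up for illustration, and the append-in-arrival-order shortcut glosses over the tagged value type or secondary sort a real job would need to keep the matrices in order; the point is that every matrix's copy of the row must be materialized at once:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class RowConcatReducer
    extends Reducer<IntWritable, VectorWritable, IntWritable, VectorWritable> {

  @Override
  protected void reduce(IntWritable row, Iterable<VectorWritable> vectors, Context ctx)
      throws IOException, InterruptedException {
    // Every vector for this row -- one per input matrix -- is buffered at once,
    // because the Iterable gives no guarantee of matrix order.
    List<Vector> parts = new ArrayList<>();
    int totalCardinality = 0;
    for (VectorWritable vw : vectors) {
      Vector copy = vw.get().clone(); // Hadoop reuses the value object; copy it
      parts.add(copy);
      totalCardinality += copy.size();
    }
    // NOTE: a real job would need to know which matrix each vector came from;
    // here we simply append in arrival order.
    Vector out = new DenseVector(totalCardinality);
    int offset = 0;
    for (Vector part : parts) {
      for (int i = 0; i < part.size(); i++) {
        out.set(offset + i, part.get(i));
      }
      offset += part.size();
    }
    ctx.write(row, new VectorWritable(out));
  }
}
```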
The first requires loading into memory the contents of row X from each matrix simultaneously; the single-threaded ConcatenateMatrices already has this problem, and it does not copy the vectors over the network. The second is a "map-increase" algorithm: it creates a separate key/value pair for each value in the output matrix. Neither of these scales.
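To make that blow-up concrete, here is a hedged sketch of the second design's Mapper only. The class name and configuration key are hypothetical, and the comments note what the full job would still need; what matters is that every cell of the input becomes its own shuffled record:

```java
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class CellEmittingMapper
    extends Mapper<IntWritable, VectorWritable, IntWritable, DoubleWritable> {

  // Hypothetical config key: how far to shift this matrix's columns in the output.
  private int columnOffset;

  @Override
  protected void setup(Context ctx) {
    columnOffset = ctx.getConfiguration().getInt("concat.column.offset", 0);
  }

  @Override
  protected void map(IntWritable row, VectorWritable value, Context ctx)
      throws IOException, InterruptedException {
    Vector v = value.get();
    // One key/value pair per cell: this is the "map-increase" -- an m x n input
    // matrix turns into m * n shuffled records.
    for (int c = 0; c < v.size(); c++) {
      ctx.write(new IntWritable(columnOffset + c), new DoubleWritable(v.get(c)));
    }
    // A real job would also have to carry the row index (e.g. in a composite key)
    // so a Partitioner can route each output row to its own Reducer, which then
    // fills a VectorWritable across reduce() calls and writes it in cleanup().
  }
}
```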
The only way to make this scale is to precondition the input matrices into files with ordered rows and use the single-threaded concatenator described above. If you want multiple files, you can partition the matrices into matching sets of rows: part-r-00000 is rows 0->499, part-r-00001 is rows 500->999, and so on. You then run ConcatenateMatrices on each matching pair of part files.
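A minimal sketch of that per-pair step, assuming both matrices are stored as SequenceFiles of <IntWritable, VectorWritable> with matching row ranges (the class and method names are illustrative, not the actual ConcatenateMatrices):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public final class PartPairConcatenator {

  /** Concatenates one pair of matching part files (same row range) into one output file. */
  public static void concatenatePair(Configuration conf, Path left, Path right, Path out)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader leftReader = new SequenceFile.Reader(fs, left, conf);
    SequenceFile.Reader rightReader = new SequenceFile.Reader(fs, right, conf);
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, out, IntWritable.class, VectorWritable.class);
    try {
      IntWritable leftRow = new IntWritable();
      IntWritable rightRow = new IntWritable();
      VectorWritable leftVec = new VectorWritable();
      VectorWritable rightVec = new VectorWritable();
      // Both files hold the same rows in the same order, so a single lockstep
      // pass never needs more than one row from each matrix in memory.
      while (leftReader.next(leftRow, leftVec) && rightReader.next(rightRow, rightVec)) {
        if (leftRow.get() != rightRow.get()) {
          throw new IOException("Row ranges of " + left + " and " + right + " do not match");
        }
        Vector a = leftVec.get();
        Vector b = rightVec.get();
        Vector merged = new DenseVector(a.size() + b.size());
        for (int i = 0; i < a.size(); i++) {
          merged.set(i, a.get(i));
        }
        for (int i = 0; i < b.size(); i++) {
          merged.set(a.size() + i, b.get(i));
        }
        writer.append(leftRow, new VectorWritable(merged));
      }
    } finally {
      leftReader.close();
      rightReader.close();
      writer.close();
    }
  }
}
```

The per-pair jobs are independent, so they can run in parallel across the matching part files without any shuffle.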