Users of Canopy clustering report that the single reducer used in the MapReduce version often takes disproportionately long to process the results of multiple mappers. This patch introduces a new Canopy CLI argument,
-cf (-clusterFilter), which, if present, establishes a lower bound on the numPoints of canopies output from the algorithm. The default value for this filter is 0, in which case all canopies are output. Setting -cf 1 would eliminate any canopies that contain only 1 point from subsequent processing steps; that is, a canopy is kept only if it contains more than the filter value's number of points.
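The filter's effect can be illustrated with a minimal, standalone sketch. This is not the patch's code: the `Canopy` class and the `applyClusterFilter` method below are hypothetical names introduced only for illustration. The strict comparison is an assumption inferred from the two cases described above (0 keeps everything, 1 drops single-point canopies).

```java
import java.util.ArrayList;
import java.util.List;

public class CanopyFilterSketch {

    /** Hypothetical stand-in for a canopy: an id plus the number of points it absorbed. */
    static final class Canopy {
        final int id;
        final int numPoints;

        Canopy(int id, int numPoints) {
            this.id = id;
            this.numPoints = numPoints;
        }
    }

    /**
     * Keeps only canopies holding more than clusterFilter points.
     * Assumption: the comparison is strict, so a filter of 0 keeps every
     * non-empty canopy and a filter of 1 drops single-point canopies.
     */
    static List<Canopy> applyClusterFilter(List<Canopy> canopies, int clusterFilter) {
        List<Canopy> kept = new ArrayList<>();
        for (Canopy c : canopies) {
            if (c.numPoints > clusterFilter) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Canopy> canopies = new ArrayList<>();
        canopies.add(new Canopy(0, 1)); // single-point canopy
        canopies.add(new Canopy(1, 5));
        canopies.add(new Canopy(2, 2));

        // Default filter (0): nothing is dropped.
        System.out.println(applyClusterFilter(canopies, 0).size());
        // Filter of 1: the single-point canopy is dropped.
        System.out.println(applyClusterFilter(canopies, 1).size());
    }
}
```

Dropping small canopies before they reach the single reducer shrinks its input, which is the bottleneck the report describes.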
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Resolution||Fixed [ 1 ]|
|Status||Resolved [ 5 ]||Closed [ 6 ]|
|Transition||Time In Source Status||Execution Times||Last Executer||Last Execution Date|
|Open to Resolved||1d 22h 8m||1||Jeff Eastman||28/Sep/11 21:15|
|Resolved to Closed||133d 17h 45m||1||Sean Owen||09/Feb/12 14:00|