We should display the prediction and id for every node in the graph. Currently, PIC does not return cluster indices for neighbour IDs that do not also appear in the ID column.
As per the definition of PIC clustering given in the code:
PIC takes an affinity matrix between items (or vertices) as input. An affinity matrix
is a symmetric matrix whose entries are non-negative similarities between items.
PIC takes this matrix (or graph) as an adjacency matrix. Specifically, each input row includes:
- idCol: vertex ID
- neighborsCol: neighbors of vertex in idCol
- similaritiesCol: non-negative weights (similarities) of edges between the vertex
in idCol and each neighbor in neighborsCol
"PIC returns a cluster assignment for each input vertex." It appends a new column predictionCol
containing the cluster assignment in [0,k) for each row (vertex).
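
To make the gap concrete, here is a minimal plain-Python sketch (not Spark code) that lists the vertices which occur only in the neighbors column and so would get no prediction row under the current behaviour. The sample rows and the helper name vertices_missing_from_id_column are hypothetical and only mirror the idCol / neighborsCol / similaritiesCol layout quoted above.

```python
# Hypothetical input rows in the (id, neighbors, similarities) layout.
# Vertices 2 and 4 appear only as neighbors, never in the id column.
rows = [
    (0, [1, 2], [0.9, 0.8]),
    (1, [2], [0.7]),
    (3, [4], [0.6]),
]


def vertices_missing_from_id_column(rows):
    """Return vertex IDs that appear as neighbors but never as a row id."""
    ids = {vertex_id for vertex_id, _, _ in rows}
    neighbors = {n for _, nbrs, _ in rows for n in nbrs}
    return sorted(neighbors - ids)


missing = vertices_missing_from_id_column(rows)
print(missing)  # [2, 4]
```

With this input, vertices 2 and 4 are part of the graph but have no row of their own, which is the case where a cluster assignment is expected but not returned.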