  1. Apache MADlib
  2. MADLIB-1229

Duplicated result in PageRank output table with grouping

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.15
    • Component/s: Module: Graph
    • Labels: None

    Description

      In MADlib 1.13, if I run the following query:

      DROP TABLE IF EXISTS vertex, "EDGE";
      CREATE TABLE vertex(
      id INTEGER
      );
      CREATE TABLE "EDGE"(
      src INTEGER,
      dest INTEGER,
      user_id INTEGER
      );
      INSERT INTO vertex VALUES
      (0),
      (1),
      (2);
      INSERT INTO "EDGE" VALUES
      (0, 1, 1),
      (0, 2, 1),
      (1, 2, 1),
      (2, 1, 1),
      (0, 1, 2);
      
      
      DROP TABLE IF EXISTS pagerank_ppr_grp_out;
      DROP TABLE IF EXISTS pagerank_ppr_grp_out_summary;
      SELECT pagerank(
      'vertex', -- Vertex table
      'id', -- Vertex id column
      '"EDGE"', -- "EDGE" table
      'src=src, dest=dest', -- "EDGE" args
      'pagerank_ppr_grp_out', -- Output table of PageRank
      NULL, -- Default damping factor (0.85)
      NULL, -- Default max iters (100)
      NULL, -- Default Threshold 
      'user_id');

      I get the following result:

      madlib=# select * from pagerank_ppr_grp_out order by user_id, id;
       user_id | id |     pagerank
      ---------+----+-------------------
             1 |  0 |              0.05
             1 |  0 |              0.05
             1 |  1 | 0.614906399170753
             1 |  2 | 0.614906399170753
             2 |  0 |             0.075
             2 |  1 |           0.13875
      (6 rows)

      where the row user_id=1, id=0, pagerank=0.05 appears twice.

      We should correct it so that each (user_id, id) pair appears only once in the output table.

       

      In addition, for user_id=1 the PageRank scores should sum to 1. The scores for user_id=1, id=1 and for user_id=1, id=2 should each be 0.475 (so that 0.05 + 0.475 + 0.475 = 1), not 0.614906399170753. We should correct this calculation too.
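      The expected scores for the user_id=1 group can be checked independently with a small power-iteration sketch. This is not MADlib's implementation. The function name `pagerank_sketch`, the uniform starting vector, and the fixed convergence tolerance are assumptions made for illustration only. The damping factor of 0.85 and the edge list come from the query above.

```python
def pagerank_sketch(vertices, edges, damping=0.85, max_iter=1000, tol=1e-12):
    """Plain power-iteration PageRank (hypothetical helper, not MADlib code).

    Assumes every vertex that appears as a source has at least one
    outgoing edge, which holds for the user_id=1 group.
    """
    n = len(vertices)

    # Count outgoing edges per vertex.
    out_degree = {v: 0 for v in vertices}
    for src, _dest in edges:
        out_degree[src] += 1

    # Start from a uniform distribution (an assumption of this sketch).
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(max_iter):
        # Every vertex receives the random-jump share (1 - d) / n.
        new_rank = {v: (1.0 - damping) / n for v in vertices}
        # Each edge passes on an equal share of its source's rank.
        for src, dest in edges:
            new_rank[dest] += damping * rank[src] / out_degree[src]

        delta = max(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break

    return rank


# Edges of the user_id=1 group from the "EDGE" table above.
group1_edges = [(0, 1), (0, 2), (1, 2), (2, 1)]
ranks = pagerank_sketch([0, 1, 2], group1_edges)
print({v: round(r, 6) for v, r in ranks.items()})
```

      Under these assumptions the sketch gives 0.05 for id=0 and 0.475 for each of id=1 and id=2, which sum to 1 and agree with the expected values stated above.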

       

            People

              Assignee: Himanshu Pandey (hpandey)
              Reporter: Jingyi Mei (jingyimei)