Flink / FLINK-35233

HBase lookup result is wrong when lookup cache is enabled

Details

    Description

      HBase table

      rowkey name age
      1 ben 18
      2 ken 19
      3 mark 20

       
      Flink SQL lookup join with the lookup cache enabled

      CREATE TABLE dim_user (
        rowkey STRING,
        info ROW<name STRING, age STRING>,
        PRIMARY KEY (rowkey) NOT ENFORCED
      ) WITH (
        'connector' = 'hbase-2.2',
        'zookeeper.quorum' = 'localhost:2181',
        'zookeeper.znode.parent' = '/hbase',
        'table-name' = 'default:test',
        'lookup.cache' = 'PARTIAL',
        'lookup.partial-cache.max-rows' = '1000',
        'lookup.partial-cache.expire-after-write' = '1h'
      );
      
      CREATE VIEW user_click AS 
      SELECT user_id, proctime() AS proc_time
      FROM (
        VALUES('1'), ('2'), ('3'), ('1'), ('2')
      ) AS t (user_id);
      
      SELECT 
          user_id, 
          info.name, 
          info.age
      FROM user_click INNER JOIN dim_user
      FOR SYSTEM_TIME AS OF user_click.proc_time
      ON dim_user.rowkey = user_click.user_id;

       
      Expected Result

      user_id name age
      1 ben 18
      2 ken 19
      3 mark 20
      1 ben 18
      2 ken 19

       

      Actual Result

      user_id name age
      1 ben 18
      2 ken 19
      3 mark 20
      1 mark 20
      2 mark 20

       
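      The result is wrong when user_id 1 and 2 are looked up the second time: both lookups return name mark and age 20, the values of rowkey 3, which was the last row fetched before the repeated keys.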
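
      To check whether the wrong rows really come from the lookup cache, as the title suggests, the same join can be run against a control table with the cache turned off. The following is a minimal sketch, not part of the original report: the table name dim_user_no_cache is hypothetical, and it assumes the connector accepts 'lookup.cache' = 'NONE' to disable caching. The user_click view is the one defined above.

      ```sql
      -- Hypothetical control table: same HBase table as dim_user,
      -- but with the lookup cache disabled (assumed option value: NONE).
      CREATE TABLE dim_user_no_cache (
        rowkey STRING,
        info ROW<name STRING, age STRING>,
        PRIMARY KEY (rowkey) NOT ENFORCED
      ) WITH (
        'connector' = 'hbase-2.2',
        'zookeeper.quorum' = 'localhost:2181',
        'zookeeper.znode.parent' = '/hbase',
        'table-name' = 'default:test',
        'lookup.cache' = 'NONE'
      );

      -- Same lookup join as in the report, pointed at the control table.
      SELECT
          user_id,
          info.name,
          info.age
      FROM user_click INNER JOIN dim_user_no_cache
      FOR SYSTEM_TIME AS OF user_click.proc_time
      ON dim_user_no_cache.rowkey = user_click.user_id;
      ```

      If this query returns the expected rows while the cached version does not, the problem is likely in how cached rows are stored or returned, not in the HBase read itself.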

          People

            Assignee: Unassigned
            Reporter: tanjialiang