Flink / FLINK-1628

Strange behavior of "where" function during a join

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 0.9, 0.8.2
    • Component/s: API / DataSet

    Description

      Hello!

      If I use the `where` function with a field list during a join, it exhibits strange behavior.

      Here is the sample code that triggers the error: https://gist.github.com/balidani/d9789b713e559d867d5c

      This example joins a DataSet with itself, then counts the number of rows. If I use `.where(0, 1)` the result is (22), which is not correct. If I use `EdgeKeySelector`, I get the correct result (101).

      When I pass a field list to the `equalTo` function (but not `where`), everything works again.

      If I don't include the `groupBy` and `reduceGroup` parts, everything works.
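      The expectation behind this report can be sketched without Flink: a join keyed on field positions 0 and 1 should produce the same pairs as a join keyed by a selector that returns those same two fields. The plain-Java sketch below is only an illustration of that expectation, not Flink's implementation and not the code from the gist. The class `JoinKeySketch`, the helpers `joinCount`, `fields`, and `edgeKey`, and the four sample edges are all hypothetical names and data chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration: not Flink code, and not the reporter's gist.
class JoinKeySketch {

    // Counts the pairs an equi-join of left and right would emit,
    // using one key extractor per input (a simple hash join).
    static <K> long joinCount(List<long[]> left, List<long[]> right,
                              Function<long[], K> leftKey,
                              Function<long[], K> rightKey) {
        // Build phase: number of right-side rows per key.
        Map<K, Long> build = new HashMap<>();
        for (long[] row : right) {
            build.merge(rightKey.apply(row), 1L, Long::sum);
        }
        // Probe phase: each left row pairs with every right row sharing its key.
        long pairs = 0;
        for (long[] row : left) {
            pairs += build.getOrDefault(leftKey.apply(row), 0L);
        }
        return pairs;
    }

    // Positional composite key, playing the role of a field list such as (0, 1).
    static Function<long[], List<Long>> fields(int... positions) {
        return row -> {
            List<Long> key = new ArrayList<>(positions.length);
            for (int p : positions) {
                key.add(row[p]);
            }
            return key;
        };
    }

    // Selector-style key over the same two fields (assumed stand-in for EdgeKeySelector).
    static String edgeKey(long[] edge) {
        return edge[0] + "-" + edge[1];
    }

    public static void main(String[] args) {
        // Made-up edge list; the edge (1, 2) appears twice on purpose.
        List<long[]> edges = Arrays.asList(
                new long[]{1, 2}, new long[]{1, 3},
                new long[]{2, 3}, new long[]{1, 2});

        long byFields = joinCount(edges, edges, fields(0, 1), fields(0, 1));
        long bySelector = joinCount(edges, edges,
                JoinKeySketch::edgeKey, JoinKeySketch::edgeKey);

        // Both key definitions describe the same key, so the counts must match.
        System.out.println(byFields + " " + bySelector); // prints: 6 6
    }
}
```

      Any difference between the two counts, like the 22 versus 101 reported above, means the two key definitions are not being treated as the same key somewhere in the plan.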

      Also, when working with large DataSets, passing a field list to `where` makes it incredibly slow, even though I don't see any exceptions in the log (in DEBUG mode).

      Does anybody know what might cause this problem?

      Thanks!

          People

            Assignee: fhueske Fabian Hueske
            Reporter: balidani Daniel Bali
