Apache Nemo / NEMO-223

Use Row.hashCode in BeamKeyExtractor

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:

      Description

      I may be wrong, but Row.hashCode in Beam 2.6.0 appears to behave nondeterministically.

      Running 'ModifiedTPCHITCase' multiple times returns different results. Logs show that Row(71).hashCode on Node X and Row(71).hashCode on Node Y return different values. Because the hash code decides which reducer task receives a key, the two Rows end up in different reducer tasks, which produces wrong final outputs.
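
      To illustrate why differing hash codes split a key across reducers, here is a minimal sketch of hash partitioning. It is not Nemo's actual partitioner: the class name, the `reducerFor` helper, and the two hash values are made up for illustration.

```java
/** Hypothetical illustration of hash partitioning; not Nemo's actual code. */
final class HashPartitionSketch {

  /** Maps a key's hash code to a reducer index in [0, numReducers). */
  static int reducerFor(final int keyHashCode, final int numReducers) {
    // floorMod keeps the index non-negative even for negative hash codes.
    return Math.floorMod(keyHashCode, numReducers);
  }

  public static void main(final String[] args) {
    final int numReducers = 2;

    // Assumption: the same logical key hashed to different values on two nodes.
    // Both numbers are invented for this sketch.
    final int hashOnNodeX = 102;
    final int hashOnNodeY = 103;

    System.out.println(reducerFor(hashOnNodeX, numReducers));
    System.out.println(reducerFor(hashOnNodeY, numReducers));
  }
}
```

      With these made-up values the same key is sent to reducer 0 from one node and reducer 1 from the other, so neither reducer sees all records for that key.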

      I added the following code to BeamKeyExtractor as a workaround. It would be nice to remove it later and simply use Row.hashCode once the issue is resolved.

      } else if (key instanceof Row) {
        // TODO: revert to Row.hashCode once it is deterministic.
        return Arrays.hashCode(((Row) key).getValues().toArray());
      } else {
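
      The idea behind the workaround can be sketched without Beam on the classpath. In the example below, `FakeRow` is a hypothetical stand-in for Beam's Row that deliberately does not override hashCode, and `keyHash` is an invented helper name. The sketch only shows that hashing the field values gives equal results for equal contents; it does not establish why Row.hashCode differs in Beam 2.6.0.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of value-based key hashing; not the actual BeamKeyExtractor. */
final class ValueHashSketch {

  /** Stand-in for Beam's Row. It does not override hashCode on purpose. */
  static final class FakeRow {
    private final List<Object> values;

    FakeRow(final Object... values) {
      this.values = Arrays.asList(values);
    }

    List<Object> getValues() {
      return values;
    }
  }

  /** Hashes rows by their field values and everything else by hashCode. */
  static int keyHash(final Object key) {
    if (key instanceof FakeRow) {
      return Arrays.hashCode(((FakeRow) key).getValues().toArray());
    }
    return key.hashCode();
  }

  public static void main(final String[] args) {
    // Two distinct instances with the same contents, as if built on two nodes.
    // Assumption: the key field is a Long; the report does not state its type.
    final FakeRow onNodeX = new FakeRow(71L);
    final FakeRow onNodeY = new FakeRow(71L);

    System.out.println(keyHash(onNodeX) == keyHash(onNodeY)); // prints "true"
  }
}
```

      One caveat: Arrays.hashCode is only as stable as the hash codes of the elements. Types such as Long and String have specified hash codes, but a field holding a byte[] or another object with an identity-based hashCode would still hash differently across JVMs.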

       

      [NODE X]
      Key of ValueInGlobalWindow{value=KV{[71], 1:[71, 3, 193199, 3, 45.0, 50940.45, 0.0, 0.07, N, O, 1998-02-23, 1998-03-20, 1998-03-24, DELIVER IN PERSON, SHIP, ironic packages believe blithely a]}, pane=PaneInfo.NO_FIRING} is [71]
      INFO 10-18 20:35:56,575 FileBlock:103 [TaskExecutor thread-1] - Write: 0 with ValueInGlobalWindow{value=KV{[71], 1:[71, 3, 193199, 3, 45.0, 50940.45, 0.0, 0.07, N, O, 1998-02-23, 1998-03-20, 1998-03-24, DELIVER IN PERSON, SHIP, ironic packages believe blithely a]}, pane=PaneInfo.NO_FIRING}

      [NODE Y]
      Key of ValueInGlobalWindow{value=KV{[71], 0:[71, 32, O, 238400.84, 1998-01-24, 4-NOT SPECIFIED, Clerk#000027015, 0, express deposits along the blithely regul]}, pane=PaneInfo.NO_FIRING} is [71]
      INFO 10-18 20:35:56,191 FileBlock:103 [TaskExecutor thread-1] - Write: 1 with ValueInGlobalWindow{value=KV{[71], 0:[71, 32, O, 238400.84, 1998-01-24, 4-NOT SPECIFIED, Clerk#000027015, 0, express deposits along the blithely regul]}, pane=PaneInfo.NO_FIRING}

            People

            • Assignee: Unassigned
            • Reporter: johnyangk John Yang