FLINK-32532

exit code 137 (i.e. OutOfMemoryError) in flink-s3-fs-hadoop module

    Description

      This build fails with exit code 137:
      https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50840&view=logs&j=4eda0b4a-bd0d-521a-0916-8285b9be9bb5&t=2ff6d5fa-53a6-53ac-bff7-fa524ea361a9&l=16093

      Relevant log output:

      Jul 03 15:33:35 [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.267 s - in org.apache.flink.fs.s3hadoop.HadoopS3FileSystemITCase
      Jul 03 15:33:45 [ERROR] Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
      ##[error]Exit code 137 returned from process: file name '/bin/docker', arguments 'exec -i -u 1000  -w /home/agent01_azpcontainer 3e9ac5dd969222db5673644f5c729d323f624390f9dbc3238a1c99b1b3c4679b /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
      Finishing: Test - connect_1
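
      For context, exit codes above 128 conventionally mean "terminated by signal (code - 128)", so 137 corresponds to signal 9 (SIGKILL). That is typically what a process reports after the Linux OOM killer or a forced container stop; it is not by itself proof of a Java OutOfMemoryError. A minimal sketch of that decoding, where the helper name `describe_exit_code` is hypothetical and not part of any Flink or Azure tooling:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code using the shell's 128 + N signal convention."""
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number unknown on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

      On Linux, 137 decodes to SIGKILL, which matches the OOM explanation discussed in the comments.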
      

          Activity

            Matthias Pohl (mapohl) added a comment (edited):

            The failure you describe happened on agent AlibabaCI005-agent01 on Jul 03 at 15:33:45. I checked the CI builds you reported in FLINK-18356. One of them (you reported it in this comment) failed with exit code 137 in the flink-table module on AlibabaCI005-agent04, i.e. on the same VM, on Jul 03 at 15:32:38, about a minute earlier.

            A 137 OOM error crashes every JVM process on the affected machine. We have seen this before, and each time a failing flink-table CI build was involved. That led us to conclude that FLINK-18356 is the most likely cause of the OOM. You might therefore want to close this Jira issue as a duplicate of FLINK-18356. It is important to link the two issues so that we can trace the failures back in case FLINK-18356 turns out not to be the only cause of the OOM.
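
            As an illustration of the correlation above, the sketch below checks whether two failures ran on the same VM and fell within a short time window. It assumes that the VM name is the agent name without its trailing "-agentNN" suffix and uses a five-minute window; both are assumptions made for this example, and the function names are hypothetical. The log timestamps carry no year, so only the difference between them is meaningful.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%b %d %H:%M:%S"  # matches log prefixes such as "Jul 03 15:33:45"


def vm_of(agent: str) -> str:
    """Return the VM part of an agent name (assumed form: <vm>-agentNN)."""
    return agent.rsplit("-", 1)[0]


def likely_related(agent_a: str, time_a: str, agent_b: str, time_b: str,
                   window: timedelta = timedelta(minutes=5)) -> bool:
    """True if both failures ran on the same VM within the given window."""
    if vm_of(agent_a) != vm_of(agent_b):
        return False
    t_a = datetime.strptime(time_a, TIMESTAMP_FORMAT)
    t_b = datetime.strptime(time_b, TIMESTAMP_FORMAT)
    return abs(t_a - t_b) <= window


# The two failures discussed in this issue.
print(likely_related("AlibabaCI005-agent01", "Jul 03 15:33:45",
                     "AlibabaCI005-agent04", "Jul 03 15:32:38"))
```

            The two failures here are 67 seconds apart on the same VM, so the check treats them as likely related.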

            Sergey Nuyanzin added a comment:

            Thanks for looking into this. Yes, you are right; I will close it in favor of FLINK-18356.

            People

              Assignee: Unassigned
              Reporter: Sergey Nuyanzin