BEAM-11027

ZetaSQL Nexmark run takes too long because of excessive logging

Details

    Description

      When run on the Spark runner, the Nexmark ZetaSQL tests take too long and produce a gigantic log file (~1.5 GB) full of netty/gRPC debug output, for example:

      DEBUG com.google.zetasql.io.grpc.netty.NettyClientHandler: [id: 0x555bd369, L:0.0.0.0/0.0.0.0:0 - R:0.0.0.0/0.0.0.0:0] INBOUND HEADERS: streamId=39861 headers=GrpcHttp2ResponseHeaders[grpc-status: 0] padding=0 endStream=true
      DEBUG com.google.zetasql.io.grpc.netty.NettyClientHandler: [id: 0x555bd369, L:0.0.0.0/0.0.0.0:0 - R:0.0.0.0/0.0.0.0:0] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=151
      DEBUG com.google.zetasql.io.grpc.netty.NettyClientHandler: [id: 0x555bd369, L:0.0.0.0/0.0.0.0:0 - R:0.0.0.0/0.0.0.0:0] OUTBOUND HEADERS: streamId=39867 headers=GrpcHttp2OutboundHeaders[:authority: 0.0.0.0:0, :path: /zetasql.local_service.ZetaSqlLocalService/Evaluate, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.18.0, grpc-accept-encoding: gzip] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false
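
      To confirm which loggers account for the bulk of the output, the lines can be tallied by level and logger name. The following is a minimal sketch, assuming every log line starts with "LEVEL fully.qualified.Logger: message" as in the samples above; the function name summarize_log and the file name nexmark.log are hypothetical and not part of Beam.

```shell
# Print "count LEVEL logger" for each distinct level/logger pair,
# most frequent first.
# Assumption: each log line begins with "LEVEL logger.name: message".
summarize_log() {
  awk '$1 ~ /^(TRACE|DEBUG|INFO|WARN|ERROR)$/ {
         sub(/:$/, "", $2)          # drop the trailing colon after the logger name
         count[$1 " " $2]++
       }
       END { for (k in count) print count[k], k }' "$1" | sort -rn
}
```

      For example, `summarize_log nexmark.log | head` would list the noisiest loggers of a captured run, which is enough to check whether com.google.zetasql.io.grpc.netty dominates the file.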

      Because of this, the full run in CI takes 2h30:
      https://ci-beam.apache.org/job/beam_PostCommit_Java_Nexmark_Spark_PR/52/

      This can be reproduced by running:
       
      ./gradlew :sdks:java:testing:nexmark:run -Pnexmark.runner=":runners:spark" -Pnexmark.args=" --runner=SparkRunner --streaming=false --suite=SMOKE --queryLanguage=zetaSql --manageResources=false --monitorJobs=true --enforceEncodability=true --enforceImmutability=true"

            People

              iemejia Ismaël Mejía

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m