Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-7449

memory leak parse_url function

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.16.0
    • None
    • Functions - Drill
    • None

    Description

      Requests with parse_url works well when the number of treated rows is low but produce memory leak when number of rows grows (~ between 500 000 and 1 million) (and for certain number of row sometimes the request works and sometimes it failed with memory leaks)

      Extract from dataset tested:

      {"Attributable":true,"Description":"Website has been identified as malicious by Bing","FirstReportedDateTime":"2018-03-12T18:49:38Z","IndicatorExpirationDateTime":"2018-04-11T23:33:13Z","IndicatorProvider":"Bing","IndicatorThreatType":"MaliciousUrl","IsPartnerShareable":true,"IsProductLicensed":true,"LastReportedDateTime":"2018-03-12T18:49:38Z","NetworkDestinationAsn":15169,"NetworkDestinationIPv4":"172.217.8.193","NetworkDestinationPort":80,"Tags":["us"],"ThreatDetectionProduct":"ES","TLPLevel":"Amber","Url":"http://pasuruanbloggers.blogspot.ru/2012/12/beginilah-cara-orang-jepang-berpacaran.html","Version":1.5}
      {"Attributable":true,"Description":"Website has been identified as malicious by Bing","FirstReportedDateTime":"2018-03-12T18:14:51Z","IndicatorExpirationDateTime":"2018-04-11T23:33:13Z","IndicatorProvider":"Bing","IndicatorThreatType":"MaliciousUrl","IsPartnerShareable":true,"IsProductLicensed":true,"LastReportedDateTime":"2018-03-12T18:14:51Z","NetworkDestinationAsn":15169,"NetworkDestinationIPv4":"216.58.192.193","NetworkDestinationPort":80,"Tags":["us"],"ThreatDetectionProduct":"ES","TLPLevel":"Amber","Url":"http://pasuruanbloggers.blogspot.ru/2012/12/cara-membuat-widget-slideshow-postingan.html","Version":1.5}
      

      Request tested:

      ALTER SESSION SET `store.format`='parquet';
      ALTER SESSION SET `store.parquet.use_new_reader` = true;
      ALTER SESSION SET `store.parquet.compression` = 'snappy';
      ALTER SESSION SET `drill.exec.functions.cast_empty_string_to_null`= true;
      ALTER SESSION SET `store.json.all_text_mode` = true;
      ALTER SESSION SET `exec.enable_union_type` = true;
      ALTER SESSION SET `store.json.all_text_mode` = true;
      
      CREATE TABLE dfs.test.`output_pqt` AS
      (
      SELECT R.parsed.host AS Domain
      FROM ( 
        SELECT parse_url(T.Url) AS parsed
        FROM dfs.test.`file.json` AS T
      ) AS R 
      ORDER BY Domain
      );
      

       
      Result when memory leak:

      Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
      
      Fragment 3:0
      
      Please, refer to logs for more information.
      
      [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
      
        (java.lang.IllegalStateException) Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
          org.apache.drill.exec.memory.BaseAllocator.close():520
          org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
          org.apache.drill.exec.ops.FragmentContextImpl.close():546
          org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
          org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
          org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
          org.apache.drill.common.SelfCleaningRunnable.run():38
          java.util.concurrent.ThreadPoolExecutor.runWorker():1149
          java.util.concurrent.ThreadPoolExecutor$Worker.run():624
          java.lang.Thread.run():748 (state=,code=0)
      java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
      
      Fragment 3:0
      
      Please, refer to logs for more information.
      
      [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
      
        (java.lang.IllegalStateException) Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
          org.apache.drill.exec.memory.BaseAllocator.close():520
          org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
          org.apache.drill.exec.ops.FragmentContextImpl.close():546
          org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
          org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
          org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
          org.apache.drill.common.SelfCleaningRunnable.run():38
          java.util.concurrent.ThreadPoolExecutor.runWorker():1149
          java.util.concurrent.ThreadPoolExecutor$Worker.run():624
          java.lang.Thread.run():748
      
      	at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:538)
      	at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:610)
      	at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
      	at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
      	at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
      	at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
      	at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
      	at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
      	at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
      	at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
      	at org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
      	at sqlline.Commands.execute(Commands.java:971)
      	at sqlline.Commands.sql(Commands.java:906)
      	at sqlline.SqlLine.dispatch(SqlLine.java:740)
      	at sqlline.SqlLine.runCommands(SqlLine.java:1691)
      	at sqlline.Commands.run(Commands.java:1513)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
      	at sqlline.SqlLine.dispatch(SqlLine.java:736)
      	at sqlline.SqlLine.initArgs(SqlLine.java:475)
      	at sqlline.SqlLine.begin(SqlLine.java:531)
      	at sqlline.SqlLine.start(SqlLine.java:270)
      	at sqlline.SqlLine.main(SqlLine.java:201)
      Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
      
      Fragment 3:0
      
      Please, refer to logs for more information.
      
      [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
      
        (java.lang.IllegalStateException) Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
          org.apache.drill.exec.memory.BaseAllocator.close():520
          org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
          org.apache.drill.exec.ops.FragmentContextImpl.close():546
          org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
          org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
          org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
          org.apache.drill.common.SelfCleaningRunnable.run():38
          java.util.concurrent.ThreadPoolExecutor.runWorker():1149
          java.util.concurrent.ThreadPoolExecutor$Worker.run():624
          java.lang.Thread.run():748
      
      	at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
      	at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
      	at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
      	at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
      	at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
      	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
      	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
      	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
      	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
      	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
      	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (256)
      Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
      
      	at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:520)
      	at org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:552)
      	at org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:546)
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:386)
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:214)
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:329)
      	at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
      	at .......(:0
      

      Note that same request including some other field and excluding use of parse_url works perfectly well.
       

      Attachments

        Activity

          People

            Unassigned Unassigned
            benj641 benj
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated: