Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Not A Problem
-
None
-
None
-
None
-
None
Description
Tika-Python uses Tika REST Server to parse both content & metadata. In this case, the CSV file was 600 MB in size. Tika REST Server runs out of Heap Space since it tries to parse Content also. There should an option to make a REST API call to Tika Server just to parse & return metadata.
Jun 22, 2016 6:38:40 PM org.slf4j.impl.JCLLoggerAdapter warn WARNING: /rmeta/text java.lang.RuntimeException: org.apache.cxf.interceptor.Fault: Java heap space at org.apache.cxf.interceptor.AbstractFaultChainInitiatorObserver.onMessage(AbstractFaultChainInitiatorObserver.java:116) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:371) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.cxf.interceptor.Fault: Java heap space at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:163) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:129) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:200) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:99) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) ... 21 more Caused by: java.lang.OutOfMemoryError: Java heap space