Tika / TIKA-894

Add webapp mode for Tika Server, simplifies deployment

Details

    Description

      For use in production services, Tika Server should really be deployed as a WAR file, under a reliable servlet container that knows how to run as a system service, for example Tomcat or JBoss.

      This is especially important on Windows, where I wasted an entire day trying to make TikaServerCli run as some kind of a service.

      Maven makes building a webapp pretty trivial. With the attached patch applied, "mvn war:war" should work. The result seems to run fine in Tomcat, which makes Windows deployment much simpler: install Tomcat, drop the WAR file into Tomcat's webapps directory, and you're away.
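
The "drop the WAR file into Tomcat's webapps directory" step can be sketched as a small shell function. This is only an illustration, not part of the attached patch: the function name deploy_war, the WAR file name and the use of CATALINA_HOME as the Tomcat install location are assumptions, and it relies on Tomcat's automatic deployment of WAR files placed in webapps being enabled.

```shell
#!/bin/sh
# Copy a built WAR file into a Tomcat installation's webapps directory.
# Usage: deploy_war WAR_FILE TOMCAT_DIR
# TOMCAT_DIR is the Tomcat install directory (often exported as CATALINA_HOME).
deploy_war() {
  war_file="$1"
  tomcat_dir="$2"

  # Refuse to continue if the WAR has not been built yet.
  if [ ! -f "$war_file" ]; then
    echo "WAR file not found: $war_file" >&2
    return 1
  fi

  # Refuse to continue if this does not look like a Tomcat installation.
  if [ ! -d "$tomcat_dir/webapps" ]; then
    echo "No webapps directory under: $tomcat_dir" >&2
    return 1
  fi

  cp "$war_file" "$tomcat_dir/webapps/"
}
```

After a successful "mvn war:war", a call might look like: deploy_war target/tika-server.war "$CATALINA_HOME" (the WAR name here is hypothetical; use whatever file Maven actually produced).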

      Attachments

        1. tika-server-webapp.patch
          4 kB
          Graham Charters

        Activity

          f.bosch@genkgo.nl Frederik Bosch added a comment -

          Since Tika is now using Apache CXF, this patch is no longer valid. However, I would like to deploy tika-server as a Tomcat servlet. Would anyone have the correct code for the Servlet.java?

          sergey_beryozkin Sergey Beryozkin added a comment -

          The following fragment may help:
          http://cxf.apache.org/docs/jaxrs-services-configuration.html#JAXRSServicesConfiguration-ConfiguringJAXRSservicesincontainerwithoutSpring

          I guess the simplest option is to use org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet, with jaxrs.serviceClasses pointing to the Tika class and jaxrs.providers pointing to the Tika JAX-RS providers.

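
As a rough sketch of the configuration suggested above (not code from this thread or from the Tika source), the function below scaffolds a WEB-INF/web.xml that registers CXFNonSpringJaxrsServlet with the jaxrs.serviceClasses and jaxrs.providers init parameters. The function name write_web_xml, the servlet name "tika" and the "/*" mapping are assumptions, and the Tika resource and provider class names are deliberately left to the caller, because they are not given here and may differ between Tika versions. Check the CXF page linked above for the exact list syntax these parameters accept.

```shell
#!/bin/sh
# Scaffold WEB-INF/web.xml for a JAX-RS webapp served by CXF without Spring.
# Usage: write_web_xml WEBAPP_DIR SERVICE_CLASSES PROVIDERS
# SERVICE_CLASSES and PROVIDERS are lists of fully qualified class names.
# The caller must supply the real Tika classes; this script does not know them.
write_web_xml() {
  webapp_dir="$1"
  service_classes="$2"
  providers="$3"

  mkdir -p "$webapp_dir/WEB-INF" || return 1

  # Unquoted heredoc delimiter, so the two shell variables are expanded.
  cat > "$webapp_dir/WEB-INF/web.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="3.0">
  <servlet>
    <servlet-name>tika</servlet-name>
    <servlet-class>org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet</servlet-class>
    <init-param>
      <param-name>jaxrs.serviceClasses</param-name>
      <param-value>${service_classes}</param-value>
    </init-param>
    <init-param>
      <param-name>jaxrs.providers</param-name>
      <param-value>${providers}</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>tika</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
</web-app>
EOF
}
```

With Maven's default WAR layout the target directory would be src/main/webapp, for example: write_web_xml src/main/webapp "org.example.PlaceholderResource" "org.example.PlaceholderProvider". Both class names are placeholders to be replaced with the actual Tika resource and provider classes. Note that this servlet instantiates the listed classes itself, so it is an open question whether the Tika classes can be created that way without further changes.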
          f.bosch@genkgo.nl Frederik Bosch added a comment - - edited

          I would love to write the code, but I do not think I am able to. My Java skills are not sufficient. Could you provide me an example?

          ianw Ian Williams added a comment -

          Hi Frederik, did you get anywhere with this? I'd like to run Tika within Tomcat and wondered if you'd got any further?

          f.bosch@genkgo.nl Frederik Bosch added a comment -

          Dear Ian,

          I never got this running, unfortunately. I now run Tika in server mode with the following command:

          cd tika-1.3/tika-server
          nohup java -jar target/tika-server-1.3.jar &

          I do not remember how I installed it; I believe it was with mvn. It works, but you should run it under supervisord or something similar, because it crashes for memory reasons. I was too lazy to set that up, so for now I restart the service from time to time.

          Good luck! Let me know if you find something useful!
          Regards,

          Frederik
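
Where supervisord or a similar tool is not available, the "restart it when it crashes" idea from the comment above can be approximated with a small restart loop. This is a minimal sketch, not a replacement for a real process supervisor; the function name run_with_restart and its parameters are made up for illustration.

```shell
#!/bin/sh
# Run a command and restart it whenever it exits with a non-zero status.
# Usage: run_with_restart MAX_STARTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command exits cleanly, or 1 after MAX_STARTS
# failed runs.
run_with_restart() {
  max_starts="$1"
  delay="$2"
  shift 2

  starts=0
  while [ "$starts" -lt "$max_starts" ]; do
    starts=$((starts + 1))
    echo "start ${starts}/${max_starts}: $*" >&2

    # Capture the exit status without aborting under 'set -e'.
    status=0
    "$@" || status=$?

    if [ "$status" -eq 0 ]; then
      return 0
    fi
    echo "exited with status ${status}" >&2

    # Wait before the next attempt, but not after the last one.
    if [ "$starts" -lt "$max_starts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For the server command quoted above, a call might look like: run_with_restart 100 5 java -jar target/tika-server-1.3.jar (the limit of 100 starts and the 5-second delay are arbitrary). The loop only restarts the process after it has died; it does nothing about the underlying memory growth.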

          ianw Ian Williams added a comment -

          Hi Frederik

          Thank you for getting back to me. I'm interested in the memory issues you experienced. Does tika-server appear to leak memory over time?

          Many thanks
          Ian

          f.bosch@genkgo.nl Frederik Bosch added a comment -

          Dear Ian,

          Well, we are using Tika to extract metadata from PDF documents found on the web. I do not know what is causing the memory problems. My guess (a wild guess) is that memory use increases a little with every PDF until it reaches its maximum, at which point I have to restart the service.

          Regards,
          Frederik

          lewismc Lewis John McGibbney added a comment -

          Can someone assign this to me? I will take it on when I get a free cycle.
          This will be really helpful for TIKA-1301.
          I'll get round to it this weekend, hopefully.

          sergey_beryozkin Sergey Beryozkin added a comment -

          Hi Lewis, I do not see you in the list of Assignees.

          lewismc Lewis John McGibbney added a comment -

          Whoever has admin can add me.
          To be honest, sergey_beryozkin, it is not a big deal. I will get around to it anyway... soon.

          nick Nick Burch added a comment -

          Lewis - I don't have the karma to assign it to you, I think someone else needs to put you in a magic JIRA group first. That said, I think the field is wide-open for you to tackle this!

          sergey_beryozkin Sergey Beryozkin added a comment -

          Hi Lewis,
          are you still interested? Maybe you can find some time before Christmas?
          I tried to do a quick fix but got confused about what should be included as far as the various Tika dependencies/parsers are concerned...
          Cheers, Sergey

          f.bosch@genkgo.nl Frederik Bosch added a comment -

          I am also still interested in Tomcat/WAR support.

          lewismc Lewis John McGibbney added a comment -

          I have a half-baked patch locally for webapp and WAR support, similar to what we have over on Any23.
          I'll try my best to hammer this out soon, folks. Sorry about the ridiculous wait.

          tpalsulich Tyler Bui-Palsulich added a comment -

          lewismc, if you have the time, this would be great to have.

          davemeikle Dave Meikle added a comment -
          • Pushed to 1.11 following 1.10 release

          People

            Assignee: Unassigned
            Reporter: Graham Charters
            Votes: 3
            Watchers: 8