  1. Tika
  2. TIKA-2298

Improve the object recognition parser so that it works without an external RESTful service setup

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.14
    • Fix Version/s: 1.15
    • Component/s: parser
    • Flags:
      Patch

      Description

      When ObjectRecognitionParser was built to do image recognition, Java
      frameworks had no good support for it. All the popular neural network
      toolkits were in C++ or Python. Since nothing ran within the JVM, we tried
      several ways to glue them to Tika (CLI, JNI, gRPC, REST).
      However, this is slowly changing. Deeplearning4j, the most widely known
      neural network library for the JVM, now supports importing models that were
      pre-trained in Python/C++ based kits [5].

      Improvement:
      It would be nice to have an implementation of ObjectRecogniser that
      doesn't require any external setup (such as installing native libraries or
      starting REST services). Reasons: it is easier to distribute and it cuts
      the IO time.

        Activity

        asmehra95 Avtar Singh added a comment -

        I am not able to run the VGG16 model in dl4j.
        When I try to run the full-fledged model, I get this error:
        Exception in thread "main" java.lang.OutOfMemoryError: Cannot allocate new FloatPointer(138357544): totalBytes = 1G, physicalBytes = 2G
        at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:76)
        at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:445)
        at org.nd4j.linalg.api.buffer.FloatBuffer.<init>(FloatBuffer.java:57)
        at org.nd4j.linalg.api.buffer.factory.DefaultDataBufferFactory.createFloat(DefaultDataBufferFactory.java:236)
        at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1301)
        at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1275)
        at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:252)
        at org.nd4j.linalg.cpu.nativecpu.NDArray.<init>(NDArray.java:109)
        at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.create(CpuNDArrayFactory.java:247)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4768)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4726)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3861)
        at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:342)
        at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:274)
        at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:483)
        at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:471)
        at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelAndWeights(KerasModelImport.java:178)
        at modelImport.ModelImportConfig.main(ModelImportConfig.java:18)
        Caused by: java.lang.OutOfMemoryError: Native allocator returned address == 0
        at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:70)
        ... 17 more

        When I run the model that says 'NoTop', it says: Invalid configuration.
        I found in the source code of the helper functions that the JSON file needs fixing.

        I am running on a 6th-gen i5 with 4 GB RAM.
        I tried two OSes: Ubuntu and Windows.
        Is there any way I can run it?
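
        For a sense of scale, the size of the failed request can be worked out from the message itself. The sketch below assumes the number in "FloatPointer(138357544)" is a count of 4-byte float elements rather than a byte count; the class and method names are hypothetical and are not part of dl4j or JavaCPP. Under that assumption the single buffer needs 553,430,176 bytes, a little over half a gigabyte. The element count also matches the commonly quoted VGG16 parameter count, which would be consistent with the graph allocating one flat parameter array during init.

```java
// Sketch only: estimate the native memory a single FloatPointer allocation requests.
// Assumption: the value in "FloatPointer(138357544)" is an element count, not bytes.
// AllocationSize is a hypothetical helper, not part of dl4j or JavaCPP.
final class AllocationSize {

    // Bytes needed for the given number of 32-bit floats (throws on overflow).
    static long bytesForFloats(long floatCount) {
        return Math.multiplyExact(floatCount, (long) Float.BYTES);
    }

    // Convert a byte count to mebibytes for readability.
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long count = 138_357_544L; // taken from the error message above
        long bytes = bytesForFloats(count);
        System.out.printf("%d floats -> %d bytes (%.1f MiB)%n", count, bytes, toMiB(bytes));
    }
}
```

        Since the same message already reports totalBytes = 1G, this one request comes on top of a gigabyte of native memory in use, which is a lot to ask of a machine with 4 GB RAM.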

        thammegowda Thamme Gowda added a comment -

        Avtar Singh
        Please share a link to your code and I will have a look at it!

        Could you also refer to my example code at https://github.com/USCDataScience/dl4j-kerasimport-examples/tree/master/dl4j-import-example and see what flags to pass to the importer (especially flags to disable further training)?

        A PR to that repo with your VGG16 example would be greatly appreciated!


          People

          • Assignee: Unassigned
          • Reporter: asmehra95 Avtar Singh
          • Votes: 0
          • Watchers: 2

            Dates

            • Created:
              Updated:

              Time Tracking

              • Original Estimate: 672h
              • Remaining Estimate: 672h
              • Time Spent: Not Specified
