Solr
  1. Solr
  2. SOLR-8166

Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader

    Details

      Description

      Actually there is no possibility to hand over some additional configuration by document extracting with ExtractingRequestHandler/ExtractingDocumentLoader.

      For example I need to put org.apache.tika.parser.pdf.PDFParserConfig with "extractInlineImages" set to true in ParseContext to trigger extraction/OCR recognizing of embedded images from pdf.

      It would be nice to have possibility to configure created ParseContext due xml-config file like TikaConfig does.

      I would suggest to have following:

      solrconfig.xml:
      <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
      <str name="parseContext.config">parseContext.config</str>
      </requestHandler>

      parseContext.config:

      <entries>
      <entry class="org.apache.tika.parser.pdf.PDFParserConfig" value="org.apache.tika.parser.pdf.PDFParserConfig">
      <property name="extractInlineImages" value="true"/>
      </entry>
      </entries>

      1. SOLR-8166.patch
        19 kB
        Uwe Schindler
      2. SOLR-8166.patch
        19 kB
        Uwe Schindler

        Activity

        Hide
        ASF GitHub Bot added a comment -

        GitHub user abinet opened a pull request:

        https://github.com/apache/lucene-solr/pull/206

        SOLR-8166 provide config for tika's ParseContext

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/abinet/lucene-solr origin/branch_5x

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/lucene-solr/pull/206.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #206


        commit 1fb220f28e77bac4f9e2c99ffc6f4412aeaa013b
        Author: Andriy Binetsky <abinetsky@olmero.ch>
        Date: 2015-10-16T08:54:53Z

        SOLR-8166 provide config for tika's ParseContext


        Show
        ASF GitHub Bot added a comment - GitHub user abinet opened a pull request: https://github.com/apache/lucene-solr/pull/206 SOLR-8166 provide config for tika's ParseContext You can merge this pull request into a Git repository by running: $ git pull https://github.com/abinet/lucene-solr origin/branch_5x Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/206.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #206 commit 1fb220f28e77bac4f9e2c99ffc6f4412aeaa013b Author: Andriy Binetsky <abinetsky@olmero.ch> Date: 2015-10-16T08:54:53Z SOLR-8166 provide config for tika's ParseContext
        Hide
        Uwe Schindler added a comment - - edited

        Hi,
        we disallow using setAccessible inside reflection throughout Lucene/Solr (cause is Java 9 where this is veeeery limited), so your patch would not pass the code quality checks (forbidden-apis).
        I would suggest to add a ParseContextFactory that you can specify in your config and that has to be supplied by the user, implemented as native Java code by the user (using Solr's plugin mechanism).

        Alternatively add setters for all ParseContext methods in your parser.

        Show
        Uwe Schindler added a comment - - edited Hi, we disallow using setAccessible inside reflection throughout Lucene/Solr (cause is Java 9 where this is veeeery limited), so your patch would not pass the code quality checks (forbidden-apis). I would suggest to add a ParseContextFactory that you can specify in your config and that has to be supplied by the user, implemented as native Java code by the user (using Solr's plugin mechanism). Alternatively add setters for all ParseContext methods in your parser.
        Hide
        Uwe Schindler added a comment -

        You are alos using the wrong classloader for doing this stuff. This will break with Solr plugins. You have to pass the SolrResourceLoader down to your parser and this one has to call the class loading methods it provides. Don't use native Classloaders.

        Show
        Uwe Schindler added a comment - You are alos using the wrong classloader for doing this stuff. This will break with Solr plugins. You have to pass the SolrResourceLoader down to your parser and this one has to call the class loading methods it provides. Don't use native Classloaders.
        Hide
        ASF GitHub Bot added a comment -

        Github user abinet commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-148731447

        SOLR-8166 provide config for tika's ParseContext
        I would still prefer to modify existing ExtractingDocumentLoader rather then provide additional plugin with factory.
        setAccessible is awoided now
        And core.resourceClassloader is used

        Show
        ASF GitHub Bot added a comment - Github user abinet commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-148731447 SOLR-8166 provide config for tika's ParseContext I would still prefer to modify existing ExtractingDocumentLoader rather then provide additional plugin with factory. setAccessible is awoided now And core.resourceClassloader is used
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311681

        — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java —
        @@ -79,6 +81,20 @@ public void inform(SolrCore core)

        { throw new SolrException(ErrorCode.SERVER_ERROR, e); }

        }
        +
        + String parseContextConfigLoc = (String) initArgs.get(PARSE_CONTEXT_CONFIG);
        + if (parseContextConfigLoc != null) {
        + File parseContextConfigFile = new File(parseContextConfigLoc);
        + if (parseContextConfigFile.isAbsolute() == false) {
        + parseContextConfigFile = new File(core.getResourceLoader().getConfigDir(), parseContextConfigFile.getPath());
        — End diff –

        Please dont create a file instance here, because with Zookeeper there is no config directory available. Just use the ResourceLoader methods to load the config file as InputStream (using RL#getResource()). Just pass this input stream to the XML parser.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311681 — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java — @@ -79,6 +81,20 @@ public void inform(SolrCore core) { throw new SolrException(ErrorCode.SERVER_ERROR, e); } } + + String parseContextConfigLoc = (String) initArgs.get(PARSE_CONTEXT_CONFIG); + if (parseContextConfigLoc != null) { + File parseContextConfigFile = new File(parseContextConfigLoc); + if (parseContextConfigFile.isAbsolute() == false) { + parseContextConfigFile = new File(core.getResourceLoader().getConfigDir(), parseContextConfigFile.getPath()); — End diff – Please dont create a file instance here, because with Zookeeper there is no config directory available. Just use the ResourceLoader methods to load the config file as InputStream (using RL#getResource()). Just pass this input stream to the XML parser.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311693

        — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java —
        @@ -79,6 +81,20 @@ public void inform(SolrCore core)

        { throw new SolrException(ErrorCode.SERVER_ERROR, e); }

        }
        +
        + String parseContextConfigLoc = (String) initArgs.get(PARSE_CONTEXT_CONFIG);
        + if (parseContextConfigLoc != null) {
        + File parseContextConfigFile = new File(parseContextConfigLoc);
        + if (parseContextConfigFile.isAbsolute() == false)

        { + parseContextConfigFile = new File(core.getResourceLoader().getConfigDir(), parseContextConfigFile.getPath()); + }

        + try {
        + parseContextConfig = new ParseContextConfig(parseContextConfigFile, core.getResourceLoader().getClassLoader());
        — End diff –

        I would directly pass the resourceloader and not the classloader. Resourceloader has easy-to-use methods to load classes, too.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311693 — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java — @@ -79,6 +81,20 @@ public void inform(SolrCore core) { throw new SolrException(ErrorCode.SERVER_ERROR, e); } } + + String parseContextConfigLoc = (String) initArgs.get(PARSE_CONTEXT_CONFIG); + if (parseContextConfigLoc != null) { + File parseContextConfigFile = new File(parseContextConfigLoc); + if (parseContextConfigFile.isAbsolute() == false) { + parseContextConfigFile = new File(core.getResourceLoader().getConfigDir(), parseContextConfigFile.getPath()); + } + try { + parseContextConfig = new ParseContextConfig(parseContextConfigFile, core.getResourceLoader().getClassLoader()); — End diff – I would directly pass the resourceloader and not the classloader. Resourceloader has easy-to-use methods to load classes, too.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311704

        — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ParseContextConfig.java —
        @@ -0,0 +1,113 @@
        +package org.apache.solr.handler.extraction;
        +
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one or more
        + * contributor license agreements. See the NOTICE file distributed with
        + * this work for additional information regarding copyright ownership.
        + * The ASF licenses this file to You under the Apache License, Version 2.0
        + * (the "License"); you may not use this file except in compliance with
        + * the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +import javax.xml.parsers.DocumentBuilder;
        +import javax.xml.parsers.DocumentBuilderFactory;
        +import java.io.File;
        +import java.lang.reflect.Field;
        +import java.lang.reflect.Method;
        +import java.util.HashMap;
        +import java.util.Map;
        +
        +import org.apache.tika.parser.ParseContext;
        +import org.w3c.dom.Document;
        +import org.w3c.dom.Element;
        +import org.w3c.dom.NamedNodeMap;
        +import org.w3c.dom.Node;
        +import org.w3c.dom.NodeList;
        +
        +public class ParseContextConfig {
        + private Map<Class, Object> entries = new HashMap<>();
        +
        + public ParseContextConfig() {
        — End diff –

        Please remove unused constructors.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311704 — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ParseContextConfig.java — @@ -0,0 +1,113 @@ +package org.apache.solr.handler.extraction; + +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import javax.xml.parsers.DocumentBuilder; +import javax.xml.parsers.DocumentBuilderFactory; +import java.io.File; +import java.lang.reflect.Field; +import java.lang.reflect.Method; +import java.util.HashMap; +import java.util.Map; + +import org.apache.tika.parser.ParseContext; +import org.w3c.dom.Document; +import org.w3c.dom.Element; +import org.w3c.dom.NamedNodeMap; +import org.w3c.dom.Node; +import org.w3c.dom.NodeList; + +public class ParseContextConfig { + private Map<Class, Object> entries = new HashMap<>(); + + public ParseContextConfig() { — End diff – Please remove unused constructors.
        Hide
        Uwe Schindler added a comment -

        I left some comments on the pull request.

        Show
        Uwe Schindler added a comment - I left some comments on the pull request.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311723

        — Diff: solr/contrib/extraction/src/test-files/log4j.properties —
        @@ -0,0 +1,31 @@
        +# Logging level
        — End diff –

        Please revert changes to this file. They are unrelated.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311723 — Diff: solr/contrib/extraction/src/test-files/log4j.properties — @@ -0,0 +1,31 @@ +# Logging level — End diff – Please revert changes to this file. They are unrelated.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311743

        — Diff: solr/contrib/extraction/src/test/org/apache/solr/handler/extraction/ParseContextConfigTest.java —
        @@ -0,0 +1,54 @@
        +package org.apache.solr.handler.extraction;
        +
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one or more
        + * contributor license agreements. See the NOTICE file distributed with
        + * this work for additional information regarding copyright ownership.
        + * The ASF licenses this file to You under the Apache License, Version 2.0
        + * (the "License"); you may not use this file except in compliance with
        + * the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +import org.apache.tika.parser.ParseContext;
        +import org.apache.tika.parser.pdf.PDFParserConfig;
        +import org.apache.xerces.dom.DocumentImpl;
        +import org.junit.Test;
        +import org.w3c.dom.Element;
        +
        +import static org.junit.Assert.*;
        +
        +public class ParseContextConfigTest {
        — End diff –

        Please subclass SolrTestCaseJ4, dont create plain tests, as those don't use the test framework, which does additional checks.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311743 — Diff: solr/contrib/extraction/src/test/org/apache/solr/handler/extraction/ParseContextConfigTest.java — @@ -0,0 +1,54 @@ +package org.apache.solr.handler.extraction; + +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import org.apache.tika.parser.ParseContext; +import org.apache.tika.parser.pdf.PDFParserConfig; +import org.apache.xerces.dom.DocumentImpl; +import org.junit.Test; +import org.w3c.dom.Element; + +import static org.junit.Assert.*; + +public class ParseContextConfigTest { — End diff – Please subclass SolrTestCaseJ4, dont create plain tests, as those don't use the test framework, which does additional checks.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on a diff in the pull request:

        https://github.com/apache/lucene-solr/pull/206#discussion_r42311748

        — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ParseContextConfig.java —
        @@ -0,0 +1,113 @@
        +package org.apache.solr.handler.extraction;
        +
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one or more
        + * contributor license agreements. See the NOTICE file distributed with
        + * this work for additional information regarding copyright ownership.
        + * The ASF licenses this file to You under the Apache License, Version 2.0
        + * (the "License"); you may not use this file except in compliance with
        + * the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +import javax.xml.parsers.DocumentBuilder;
        +import javax.xml.parsers.DocumentBuilderFactory;
        +import java.io.File;
        +import java.lang.reflect.Field;
        +import java.lang.reflect.Method;
        +import java.util.HashMap;
        +import java.util.Map;
        +
        +import org.apache.tika.parser.ParseContext;
        +import org.w3c.dom.Document;
        +import org.w3c.dom.Element;
        +import org.w3c.dom.NamedNodeMap;
        +import org.w3c.dom.Node;
        +import org.w3c.dom.NodeList;
        +
        +public class ParseContextConfig {
        + private Map<Class, Object> entries = new HashMap<>();
        +
        + public ParseContextConfig()

        { + }

        +
        + public ParseContextConfig(Element element, ClassLoader loader) throws Exception

        { + extract(element, loader); + }

        +
        + public ParseContextConfig(Document document, ClassLoader loader) throws Exception

        { + this(document.getDocumentElement(), loader); + }

        +
        + public ParseContextConfig(String fileName, ClassLoader loader) throws Exception

        { + this(getBuilder().parse(fileName), loader); + }

        +
        + public ParseContextConfig(File file, ClassLoader loader) throws Exception

        { + this(getBuilder().parse(file), loader); + }

        +
        + private static DocumentBuilder getBuilder() throws Exception

        { + return DocumentBuilderFactory.newInstance().newDocumentBuilder(); + }

        +
        + private void extract(Element element, ClassLoader loader) throws Exception {
        + final NodeList xmlEntries = element.getElementsByTagName("entry");
        + for (int i = 0; i < xmlEntries.getLength(); i++) {
        + final NamedNodeMap xmlEntryAttributes = xmlEntries.item.getAttributes();
        + final String className = xmlEntryAttributes.getNamedItem("class").getNodeValue();
        + final String implementationName = xmlEntryAttributes.getNamedItem("value").getNodeValue();
        +
        + final NodeList xmlProperties = ((Element)xmlEntries.item).getElementsByTagName("property");
        +
        + final Class<?> interfaceClass = loader.loadClass(className);
        + final Class<?> implementationClass = loader.loadClass(implementationName);
        + final Object instance = implementationClass.newInstance();
        +
        + for (int j = 0; j < xmlProperties.getLength(); j++) {
        + final Node xmlProperty = xmlProperties.item(j);
        + final NamedNodeMap xmlPropertyAttributes = xmlProperty.getAttributes();
        +
        + final String propertyName = xmlPropertyAttributes.getNamedItem("name").getNodeValue();
        + final String propertyValue = xmlPropertyAttributes.getNamedItem("value").getNodeValue();
        +
        + final Field declaredField = interfaceClass.getDeclaredField(propertyName);
        + final Class<?> type = declaredField.getType();
        + final Method declaredMethod = interfaceClass.getDeclaredMethod("set" + propertyName.substring(0, 1).toUpperCase() + propertyName.substring(1, propertyName.length()), type);
        — End diff –

        This does not work in Turkey! Don't use String#toUpper/LowerCase() without giving a Locale (Locale.ROOT is needed here)

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/206#discussion_r42311748 — Diff: solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ParseContextConfig.java — @@ -0,0 +1,113 @@ +package org.apache.solr.handler.extraction; + +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import javax.xml.parsers.DocumentBuilder; +import javax.xml.parsers.DocumentBuilderFactory; +import java.io.File; +import java.lang.reflect.Field; +import java.lang.reflect.Method; +import java.util.HashMap; +import java.util.Map; + +import org.apache.tika.parser.ParseContext; +import org.w3c.dom.Document; +import org.w3c.dom.Element; +import org.w3c.dom.NamedNodeMap; +import org.w3c.dom.Node; +import org.w3c.dom.NodeList; + +public class ParseContextConfig { + private Map<Class, Object> entries = new HashMap<>(); + + public ParseContextConfig() { + } + + public ParseContextConfig(Element element, ClassLoader loader) throws Exception { + extract(element, loader); + } + + public ParseContextConfig(Document document, ClassLoader loader) throws Exception { + this(document.getDocumentElement(), loader); + } + + public ParseContextConfig(String fileName, ClassLoader loader) throws Exception { + this(getBuilder().parse(fileName), loader); + } + + public ParseContextConfig(File file, ClassLoader loader) throws Exception { + this(getBuilder().parse(file), loader); + } + + private static DocumentBuilder getBuilder() throws Exception { + return DocumentBuilderFactory.newInstance().newDocumentBuilder(); + } + + private void extract(Element element, ClassLoader loader) throws Exception { + final NodeList xmlEntries = element.getElementsByTagName("entry"); + for (int i = 0; i < xmlEntries.getLength(); i++) { + final NamedNodeMap xmlEntryAttributes = xmlEntries.item .getAttributes(); + final String className = xmlEntryAttributes.getNamedItem("class").getNodeValue(); + final String implementationName = xmlEntryAttributes.getNamedItem("value").getNodeValue(); + + final NodeList xmlProperties = ((Element)xmlEntries.item ).getElementsByTagName("property"); + + final Class<?> interfaceClass = loader.loadClass(className); + final Class<?> implementationClass = loader.loadClass(implementationName); + final Object instance = implementationClass.newInstance(); + + for (int j = 0; j < xmlProperties.getLength(); j++) { + final Node xmlProperty = xmlProperties.item(j); + final NamedNodeMap xmlPropertyAttributes = xmlProperty.getAttributes(); + + final String propertyName = xmlPropertyAttributes.getNamedItem("name").getNodeValue(); + final String propertyValue = xmlPropertyAttributes.getNamedItem("value").getNodeValue(); + + final Field declaredField = interfaceClass.getDeclaredField(propertyName); + final Class<?> type = declaredField.getType(); + final Method declaredMethod = interfaceClass.getDeclaredMethod("set" + propertyName.substring(0, 1).toUpperCase() + propertyName.substring(1, propertyName.length()), type); — End diff – This does not work in Turkey! Don't use String#toUpper/LowerCase() without giving a Locale (Locale.ROOT is needed here)
        Hide
        ASF GitHub Bot added a comment -

        Github user abinet commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-150487320

        I fixed all issues you mentioned. It is also possible to use PropertyUtils from Apache commons-beanutils but we need to introduce new dependency then.

        Show
        ASF GitHub Bot added a comment - Github user abinet commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-150487320 I fixed all issues you mentioned. It is also possible to use PropertyUtils from Apache commons-beanutils but we need to introduce new dependency then.
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-150497999

        Hi,
        a useful alternative to using commons-beanutils is using the JDK internal bean classes. See https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java#L166-L232 for an example. We read the properties of MXBeans in the SystemInfoHandler here. You can get a BeanInfo from the class and then use the property descriptors to get/set properties. And that is what you are doing.
        Because the JDK code is partly buggy for historical reasons, make sure to use the correct flags added with JDK 7 when inspecting and getting the property descriptors (disabling caches which are broken).

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-150497999 Hi, a useful alternative to using commons-beanutils is using the JDK internal bean classes. See https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java#L166-L232 for an example. We read the properties of MXBeans in the SystemInfoHandler here. You can get a BeanInfo from the class and then use the property descriptors to get/set properties. And that is what you are doing. Because the JDK code is partly buggy for historical reasons, make sure to use the correct flags added with JDK 7 when inspecting and getting the property descriptors (disabling caches which are broken).
        Hide
        ASF GitHub Bot added a comment -

        Github user abinet commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-153773456

        All done. it it clean enough for merge?

        Show
        ASF GitHub Bot added a comment - Github user abinet commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-153773456 All done. it it clean enough for merge?
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-153793014

        Hey, yes exactly like that I will review that later. Give me a day or two, I am looking in merging it.

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-153793014 Hey, yes exactly like that I will review that later. Give me a day or two, I am looking in merging it.
        Hide
        Uwe Schindler added a comment -

        Hi,
        I cleaned up your PR (removed changes from unrelated files, refactored the config loading and type checking a bit) and attached as a patch here. I will commit soon.

        I changed the name of the attribute used for the implementation class to "impl=" instead of "value=", which seems wrong.

        I also added the remaining native types to your type converter (float, byte, short).

        Show
        Uwe Schindler added a comment - Hi, I cleaned up your PR (removed changes from unrelated files, refactored the config loading and type checking a bit) and attached as a patch here. I will commit soon. I changed the name of the attribute used for the implementation class to "impl=" instead of "value=", which seems wrong. I also added the remaining native types to your type converter (float, byte, short).
        Hide
        Uwe Schindler added a comment -

        Slightly simplified patch. I was able to remove the long chain of ifs for guessing property type. The Java Beans Framework does this automatically, so we can set/get properties as String easily using the same beans API.

        Show
        Uwe Schindler added a comment - Slightly simplified patch. I was able to remove the long chain of ifs for guessing property type. The Java Beans Framework does this automatically, so we can set/get properties as String easily using the same beans API.
        Hide
        ASF subversion and git services added a comment -

        Commit 1712629 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1712629 ]

        SOLR-8166: Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader

        Show
        ASF subversion and git services added a comment - Commit 1712629 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1712629 ] SOLR-8166 : Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader
        Hide
        ASF subversion and git services added a comment -

        Commit 1712632 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1712632 ]

        Merged revision(s) 1712629 from lucene/dev/trunk:
        SOLR-8166: Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader
        This closes Github PR #206

        Show
        ASF subversion and git services added a comment - Commit 1712632 from Uwe Schindler in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1712632 ] Merged revision(s) 1712629 from lucene/dev/trunk: SOLR-8166 : Introduce possibility to configure ParseContext in ExtractingRequestHandler/ExtractingDocumentLoader This closes Github PR #206
        Hide
        ASF GitHub Bot added a comment -

        Github user uschindler commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-153850780

        Hi, this was merged into SVN. Can you close the pull request, the automatic close did not work...

        Show
        ASF GitHub Bot added a comment - Github user uschindler commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-153850780 Hi, this was merged into SVN. Can you close the pull request, the automatic close did not work...
        Hide
        ASF GitHub Bot added a comment -

        Github user abinet commented on the pull request:

        https://github.com/apache/lucene-solr/pull/206#issuecomment-153851266

        Hi, thank you for support and cooperation. I'll close pull request.

        Show
        ASF GitHub Bot added a comment - Github user abinet commented on the pull request: https://github.com/apache/lucene-solr/pull/206#issuecomment-153851266 Hi, thank you for support and cooperation. I'll close pull request.
        Hide
        ASF GitHub Bot added a comment -

        Github user abinet closed the pull request at:

        https://github.com/apache/lucene-solr/pull/206

        Show
        ASF GitHub Bot added a comment - Github user abinet closed the pull request at: https://github.com/apache/lucene-solr/pull/206
        Hide
        Uwe Schindler added a comment -

        Thanks Andriy!

        Show
        Uwe Schindler added a comment - Thanks Andriy!
        Hide
        ASF subversion and git services added a comment -

        Commit 1712677 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1712677 ]

        SOLR-8166: Add some null checks

        Show
        ASF subversion and git services added a comment - Commit 1712677 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1712677 ] SOLR-8166 : Add some null checks
        Hide
        ASF subversion and git services added a comment -

        Commit 1712678 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1712678 ]

        Merged revision(s) 1712677 from lucene/dev/trunk:
        SOLR-8166: Add some null checks

        Show
        ASF subversion and git services added a comment - Commit 1712678 from Uwe Schindler in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1712678 ] Merged revision(s) 1712677 from lucene/dev/trunk: SOLR-8166 : Add some null checks

          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Andriy Binetsky
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development