SOLR-4361

DIH request parameters with dots throw UnsupportedOperationException

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.1
    • Fix Version/s: 4.9, 6.0
    • Labels: None

      Description

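      If data-config.xml uses placeholders for request parameters whose names contain dots (for example ${dataimporter.request.solr.bceDS.driver}), DIH fails with an UnsupportedOperationException. The current workaround is either to avoid dots in the parameter names or to use the 4.0 DIH jar.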

        Activity

        James Dyer added a comment -

        Example from user list:

        I've just tried to upgrade from 4.0 to 4.1 and I have the following
        exception when reindexing my data:

        Caused by: java.lang.UnsupportedOperationException
            at java.util.Collections$UnmodifiableMap.put(Collections.java:1283)
            at org.apache.solr.handler.dataimport.VariableResolver.currentLevelMap(VariableResolver.java:204)
            at org.apache.solr.handler.dataimport.VariableResolver.resolve(VariableResolver.java:94)
            at org.apache.solr.handler.dataimport.VariableResolver.replaceTokens(VariableResolver.java:144)
            at org.apache.solr.handler.dataimport.ContextImpl.replaceTokens(ContextImpl.java:254)
            at org.apache.solr.handler.dataimport.JdbcDataSource.resolveVariables(JdbcDataSource.java:203)
            at org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:101)
            at org.apache.solr.handler.dataimport.JdbcDataSource.init(JdbcDataSource.java:62)
            at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:394)

        It seems to be related to the use of placeholders in data-config.xml:

        <dataConfig>
          <dataSource type="JdbcDataSource"
                      name="bceDS"
                      driver="${dataimporter.request.solr.bceDS.driver}"
                      url="${dataimporter.request.solr.bceDS.url}"
                      user="${dataimporter.request.solr.bceDS.user}"
                      password="${dataimporter.request.solr.bceDS.password}"
                      batchSize="-1"/>

        solrconfig.xml:

        <requestHandler name="/dataimport"
                        class="org.apache.solr.handler.dataimport.DataImportHandler">
          <lst name="defaults">
            <str name="config">data-config.xml</str>

            <!-- dataSource parameters for data-config.xml -->
            <str name="solr.bceDS.driver">...</str>
            <str name="solr.bceDS.url">...</str>
            <str name="solr.bceDS.user">...</str>
            <str name="solr.bceDS.password">...</str>
          </lst>
        </requestHandler>
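
        Note on the trace above: its top frames show VariableResolver.currentLevelMap calling put on a java.util.Collections$UnmodifiableMap. The following is a minimal Java sketch of that failure mode only. It is not the DIH code; the class and method names are hypothetical, and it assumes the resolver tried to create a namespace entry (here "solr") inside a read-only request-parameter map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of Solr.
final class UnmodifiableParamsDemo {

    /** Returns true if adding a namespace entry to the given map is rejected. */
    static boolean putIsRejected(Map<String, Object> params) {
        try {
            // Assumed behaviour: create a child map for the first name segment.
            params.put("solr", new HashMap<String, Object>());
            return false;
        } catch (UnsupportedOperationException e) {
            // Same exception type as in the reported stack trace.
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        request.put("solr.bceDS.driver", "...");

        // A read-only view rejects the write; a mutable copy accepts it.
        System.out.println(putIsRejected(Collections.unmodifiableMap(request))); // true
        System.out.println(putIsRejected(new HashMap<>(request)));               // false
    }
}
```

        The sketch only shows why a write into an unmodifiable map surfaces as UnsupportedOperationException; how the real resolver reached that write is not shown in this issue.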

        James Dyer added a comment -

        Also, the following workaround was mentioned on the list. It should be protected with a unit test so it doesn't get broken, and added to the wiki if it is not already documented:

        I do something similar, but without the placeholders in db-data-config.xml. You can define the entire datasource in solrconfig.xml, then leave out that element entirely in db-data-config.xml. It seems really odd, but that is how the code works.

        This is working for me in 4.1, so it might be a workaround for you.

        It looks like this:

        <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
          <lst name="defaults">
            <str name="config">db-data-config.xml</str>
            <lst name="datasource">
              <str name="defType">JdbcDataSource</str>
              <str name="driver">com.mysql.jdbc.Driver</str>
              <str name="url">jdbc:mysql://${textbooks.dbhost:nohost}/xxxx</str>
              <str name="user">${textbooks.dbuser:yyyyy}</str>
              <str name="password">${textbooks.dbpass:zzzzzz}</str>
              <str name="batchSize">-1</str>
              <str name="readOnly">true</str>
              <str name="onError">skip</str>
              <str name="netTimeoutForStreamingResults">600</str>
              <str name="zeroDateTimeBehavior">convertToNull</str>
            </lst>
          </lst>
        </requestHandler>

        James Dyer added a comment -

        Here is a fix and a unit test. All DIH tests pass with this. I will commit in a few days.

        Commit Tag Bot added a comment -

        [trunk commit] James Dyer
        http://svn.apache.org/viewvc?view=revision&revision=1455245

        SOLR-4361: DIH to allow handler parameters with dots in the name

        Commit Tag Bot added a comment -

        [branch_4x commit] James Dyer
        http://svn.apache.org/viewvc?view=revision&revision=1455247

        SOLR-4361: DIH to allow handler parameters with dots in the name

        Chris Eldredge added a comment -

        Request to reopen this issue. Testing on lucene_solr_4_2 r1459428, TestURLDataSource is still broken when changing "baseurl" to "base.url" and "${dataimporter.request.baseurl}" to "${dataimporter.request.base.url}".

        James Dyer added a comment -

        Chris,

        Do you have a real-world problem that is still broken, or is this just a problem with modifying TestURLDataSource? This issue is tested in TestVariableResolverEndToEnd. The solrconfig.xml file contains a default parameter "dots.in.hsqldb.driver" with the driver name. The test subsequently references <dataSource ... driver="${dataimporter.request.dots.in.hsqldb.driver}" ... />. Prior to fixing VariableResolver, this test would fail because the driver name would not resolve. With this fix, the test passes.

        The difference is that the "dataimporter.request" namespaces are (in reality) added by DocBuilder#getVariableResolver, which creates a map for the "dataimporter" namespace and then a child map for the "request" namespace. With the fix here, VariableResolver still requires each node in the variable tree to be added individually, rather than taking the shortcut you used in your modified version of TestURLDataSource. However, it is more forgiving of variable names containing dots: if it cannot walk the tree to find the rightmost name, it goes as far as it can and then assumes the rest is a single name with embedded dots.
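
        The lookup rule described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the actual VariableResolver source: the class name, method name, and the driver value "example.Driver" are hypothetical, and namespaces are modelled as plain nested maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of Solr.
final class DottedNameResolver {

    /**
     * Walks nested namespace maps one dot-separated segment at a time.
     * When the next segment is not a child namespace, the walk stops and
     * the remainder is looked up as one key that may itself contain dots.
     * Returns null when nothing matches.
     */
    static Object resolve(Map<String, Object> root, String name) {
        Map<String, Object> level = root;
        String rest = name;
        int dot;
        while ((dot = rest.indexOf('.')) >= 0) {
            Object child = level.get(rest.substring(0, dot));
            if (!(child instanceof Map)) {
                break; // cannot walk any further down the tree
            }
            @SuppressWarnings("unchecked")
            Map<String, Object> next = (Map<String, Object>) child;
            level = next;
            rest = rest.substring(dot + 1);
        }
        return level.get(rest);
    }

    public static void main(String[] args) {
        // Namespaces are added node by node: dataimporter, then request.
        Map<String, Object> request = new HashMap<>();
        request.put("dots.in.hsqldb.driver", "example.Driver");
        Map<String, Object> dataimporter = new HashMap<>();
        dataimporter.put("request", request);
        Map<String, Object> root = new HashMap<>();
        root.put("dataimporter", dataimporter);

        // "dots" is not a namespace under "request", so the remainder
        // "dots.in.hsqldb.driver" is treated as a single key.
        System.out.println(resolve(root, "dataimporter.request.dots.in.hsqldb.driver"));
    }
}
```

        In this sketch the walk consumes "dataimporter" and "request" as namespaces and then falls back to the dotted remainder, which mirrors the TestVariableResolverEndToEnd scenario described above. Edge cases of the real resolver may differ.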

        Chris Eldredge added a comment -

        Yes, upgrading from 4.0 to 4.2 breaks DIH for me. I'm using the URLDataSource with variables in the baseUrl, such as dataimporter.request.server.prefix. The properties are all replaced with empty strings, whereas in 4.0 they're correctly substituted. I'll test using variables that don't contain dots to make sure it's related to this issue.

        James Dyer added a comment -

        If this is still a problem for you, by all means, reopen. Please include the line in your data-config.xml that has the variable that doesn't resolve and also the url you're using (or section from solrconfig.xml that has the variable in "defaults"). Based on TestVariableResolverEndToEnd, which does something very similar to what you describe, I would not expect this to still fail.

        Chris Eldredge added a comment -

        I don't have permission to reopen the issue, but I just confirmed that URLDataSource does not correctly replace variables that contain dots in its baseUrl. However, it substitutes an empty string instead of throwing UnsupportedOperationException.

        Removing the dots still works around the issue.

        I tested against https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_2@1460810

        James Dyer added a comment -

        Reopening to investigate the issue reported by Chris Eldredge.

        Chris, can you provide (at least) the line in your data-config.xml that has the variable that doesn't resolve, and also the URL you're using (or the section from solrconfig.xml that has the variable in "defaults")?

        Chris Eldredge added a comment -

        Snippet of our configuration that stopped working in 4.2:

        solrconfig.xml
        <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
          <lst name="defaults">
            <str name="server.prefix">${server.prefix:}</str>
          </lst>
        </requestHandler>
        
        data-config.xml
        <dataSource type="URLDataSource" baseUrl="http://${dataimporter.request.server.prefix}api.fool.com" />
        

        Changing server.prefix to server-prefix makes it work again.

        Chris Eldredge added a comment -

        By the way, in case it isn't clear, we define server.prefix as a system property and it defaults to "" in production, but it would be something like "test." in pre-production to produce a complete baseUrl like http://test.api.fool.com
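
        To make the expected substitution concrete, here is a small Java sketch of ${...} replacement applied to the baseUrl from the snippet above. It is not DIH's implementation: the class and method names are hypothetical, and the variables are held in a flat map keyed by the full dotted name, which sidesteps the namespace handling discussed in this issue. Unresolved names become an empty string, matching the behaviour reported for 4.2.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of Solr.
final class PlaceholderDemo {

    // Matches ${name} and captures the name between the braces.
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([^}]*)\\}");

    /** Replaces each ${name} with its value, or "" when the name is unknown. */
    static String replaceTokens(String template, Map<String, String> values) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), "");
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String baseUrl = "http://${dataimporter.request.server.prefix}api.fool.com";

        Map<String, String> preProduction = new HashMap<>();
        preProduction.put("dataimporter.request.server.prefix", "test.");

        System.out.println(replaceTokens(baseUrl, preProduction));   // http://test.api.fool.com
        System.out.println(replaceTokens(baseUrl, new HashMap<>())); // http://api.fool.com
    }
}
```

        With the pre-production value "test." the result is http://test.api.fool.com. With an empty prefix, or a name that fails to resolve, both yield http://api.fool.com, so an unresolved prefix is indistinguishable from the production default.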

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Uwe Schindler added a comment -

        Move issue to Solr 4.9.


          People

          • Assignee: James Dyer
          • Reporter: James Dyer
          • Votes: 0
          • Watchers: 2
