Solr / SOLR-7948

MapReduceIndexerTool of solr 5.2.1 doesn't work with hadoop 2.7.1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 5.2.1
    • Fix Version/s: None
    • Component/s: contrib - MapReduce
    • Labels:
      None
    • Environment:

      OS: SUSE 11
      JDK: java version "1.7.0_65"
      Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
      Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
      Hadoop: 2.7.1
      Solr: 5.2.1

      Description

      When I used the MapReduceIndexerTool of Solr 5.2.1 to index files, I got the following errors; with 4.9.0's MapReduceIndexerTool, it did work with Hadoop 2.7.1.
      The exception is as follows:
      INFO - 2015-08-20 11:44:45.155; [ ] org.apache.solr.hadoop.HeartBeater; Heart beat reporting class is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
      INFO - 2015-08-20 11:44:45.161; [ ] org.apache.solr.hadoop.SolrRecordWriter; Using this unpacked directory as solr home: /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip
      INFO - 2015-08-20 11:44:45.162; [ ] org.apache.solr.hadoop.SolrRecordWriter; Creating embedded Solr server with solrHomeDir: /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip, fs: DFS[DFSClient[clientName=DFSClient_attempt_1440040092614_0004_r_000001_0_1678264055_1, ugi=root (auth:SIMPLE)]], outputShardDir: hdfs://127.0.0.1:9000/tmp/outdir/reducers/_temporary/1/_temporary/attempt_1440040092614_0004_r_000001_0/part-r-00001
      INFO - 2015-08-20 11:44:45.194; [ ] org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: '/usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/'
      INFO - 2015-08-20 11:44:45.206; [ ] org.apache.solr.hadoop.HeartBeater; HeartBeat thread running
      INFO - 2015-08-20 11:44:45.207; [ ] org.apache.solr.hadoop.HeartBeater; Issuing heart beat for 1 threads
      INFO - 2015-08-20 11:44:45.418; [ ] org.apache.solr.hadoop.SolrRecordWriter; Constructed instance information solr.home /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip (/usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip), instance dir /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/, conf dir /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/conf/, writing index to solr.data.dir hdfs://127.0.0.1:9000/tmp/outdir/reducers/_temporary/1/_temporary/attempt_1440040092614_0004_r_000001_0/part-r-00001/data, with permdir hdfs://127.0.0.10:9000/tmp/outdir/reducers/_temporary/1/_temporary/attempt_1440040092614_0004_r_000001_0/part-r-00001
      INFO - 2015-08-20 11:44:45.426; [ ] org.apache.solr.core.SolrXmlConfig; Loading container configuration from /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/solr.xml
      INFO - 2015-08-20 11:44:45.474; [ ] org.apache.solr.core.CorePropertiesLocator; Config-defined core root directory: /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip
      INFO - 2015-08-20 11:44:45.503; [ ] org.apache.solr.core.CoreContainer; New CoreContainer 1656436773
      INFO - 2015-08-20 11:44:45.503; [ ] org.apache.solr.core.CoreContainer; Loading cores into CoreContainer [instanceDir=/usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/]
      INFO - 2015-08-20 11:44:45.503; [ ] org.apache.solr.core.CoreContainer; loading shared library: /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/lib
      WARN - 2015-08-20 11:44:45.504; [ ] org.apache.solr.core.SolrResourceLoader; Can't find (or read) directory to add to classloader: lib (resolved as: /usr/local/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1440040092614_0004/container_1440040092614_0004_01_000004/82f2eca9-d6eb-483b-960f-0d3b3b93788c.solr.zip/lib).
      INFO - 2015-08-20 11:44:45.520; [ ] org.apache.solr.handler.component.HttpShardHandlerFactory; created with socketTimeout : 600000,connTimeout : 60000,maxConnectionsPerHost : 20,maxConnections : 10000,corePoolSize : 0,maximumPoolSize : 2147483647,maxThreadIdleTime : 5,sizeOfQueue : -1,fairnessPolicy : false,useRetries : false,
      FATAL - 2015-08-20 11:44:45.526; [ ] org.apache.hadoop.mapred.YarnChild; Error running child : java.lang.VerifyError: Bad return type
      Exception Details:
      Location:
      org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @62: areturn
      Reason:
      Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
      Current Frame:
      bci: @62
      flags: { }
      locals:

      { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }

      stack:

      { 'org/apache/http/impl/client/DefaultHttpClient' }

      Bytecode:
      0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
      0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
      0000020: b600 0a2c b600 0bb6 000c b900 0d02 00bb
      0000030: 0011 592b b700 124e 2d2c b800 102d b0
      Stackmap Table:
      append_frame(@47,Object127)

      at org.apache.solr.handler.component.HttpShardHandlerFactory.init(HttpShardHandlerFactory.java:166)
      at org.apache.solr.handler.component.ShardHandlerFactory.newInstance(ShardHandlerFactory.java:49)
      at org.apache.solr.core.CoreContainer.load(CoreContainer.java:328)
      at org.apache.solr.hadoop.SolrRecordWriter.createEmbeddedSolrServer(SolrRecordWriter.java:163)
      at org.apache.solr.hadoop.SolrRecordWriter.<init>(SolrRecordWriter.java:119)
      at org.apache.solr.hadoop.SolrOutputFormat.getRecordWriter(SolrOutputFormat.java:163)
      at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
      at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
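      The VerifyError above is a classic symptom of two HttpClient versions on the classpath: Solr 5.2.1 was compiled against a newer HttpClient than the one Hadoop 2.7.1 bundles. A quick way to confirm such a clash is to list every HttpComponents jar each distribution ships; a sketch, where the function name and the install-root arguments are illustrative:

```shell
# List every httpclient/httpcore jar under the given install roots, to spot
# two conflicting versions such as 4.2.5 (Hadoop) vs 4.4.1 (Solr).
list_httpclient_jars() {
  for root in "$@"; do
    find "$root" -name 'httpclient-*.jar' -o -name 'httpcore-*.jar'
  done
}

# Example (paths are assumptions for a typical install):
# list_httpclient_jars /usr/local/hadoop /usr/local/solr
```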

        Issue Links

          Activity

          markrmiller@gmail.com Mark Miller added a comment -

          Thanks for the report.

          I actually ran into this issue a couple weeks ago while trying to get the map reduce contrib back up to speed.

          You can see this issue when you try and run the example: https://github.com/markrmiller/solr-map-reduce-example

          I think the issue is that a Kite Morphlines jar is using a Solr class that has changed. If so, the answer is that Kite Morphlines should not use Solr classes outside of its couple of Solr modules. I'll look into getting that changed very soon, and then we will have to update versions.

          markrmiller@gmail.com Mark Miller added a comment -

          I think the issue is that a Kite Morphlines jar is using a Solr class that has changed.

          A quick look through the Kite code doesn't seem to support this. I'll spend some time digging.

          markrmiller@gmail.com Mark Miller added a comment -

          It actually looks like perhaps a class-clash issue with Hadoop.

          You might try mapreduce.user.classpath.first=true in your config. I'll give that a try a little later today.

          davidchiu davidchiu added a comment -

          Do you mean that I should add "mapreduce.job.user.classpath.first=true" into mapred-site.xml?
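          For reference, that property would go into mapred-site.xml roughly like this (a config sketch; per the later comments in this thread, it did not resolve this particular conflict):

```xml
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
```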

          davidchiu davidchiu added a comment - - edited

          I dug into the problem again and found that the httpclient-4.4.1 jar in Solr 5.2.1 conflicts with the httpclient-4.2.5 jar in Hadoop 2.7.1. After I replaced httpclient-4.2.5 in Hadoop 2.7.1 (just under hadoop/common/lib) with httpclient-4.4.1, it went through.

          By the way, there is a bug in HttpClient 4.4.1: in URLEncodedUtils.java, the parse(final String s, final Charset charset) method doesn't validate the s parameter, which can sometimes cause a NullPointerException.
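          The jar swap described in this comment can be sketched as a small shell function (a hedged sketch; the directory layout and exact jar versions are assumptions drawn from this thread, and backing up the originals first is advisable):

```shell
# Swap Hadoop's bundled httpclient/httpcore 4.2.5 jars for the 4.4.1 jars
# that ship with Solr 5.2.1. Arguments: Hadoop install root, Solr install root.
harmonize_httpclient() {
  hadoop_lib="$1/share/hadoop/common/lib"
  solr_lib="$2/server/solr-webapp/webapp/WEB-INF/lib"
  # Remove Hadoop's older jars so they cannot shadow Solr's version.
  rm -f "$hadoop_lib"/httpclient-4.2.5.jar "$hadoop_lib"/httpcore-4.2.5.jar
  # Copy the newer jars bundled with Solr into Hadoop's common lib dir.
  cp "$solr_lib"/httpclient-4.4.1.jar "$solr_lib"/httpcore-4.4.1.jar "$hadoop_lib"/
}
```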

          elyograg Shawn Heisey added a comment - - edited

          By the way, there is a bug in httpclient 4.4.1

          Is there an issue filed in Jira for this bug? Has it been fixed in the 4.5 version?

          edit: I don't see anything that looks like what you described in the 4.5 release notes.

          markrmiller@gmail.com Mark Miller added a comment -

          Sadly, it doesn't seem that mapreduce.job.user.classpath.first or mapreduce.job.classloader can handle this conflict. I'm still experimenting, but jar harmonization might be the only solution. That would be a real bummer - sometimes it is not so easy to do.

          ctargett Cassandra Targett added a comment -

          Mark Miller - did you ever get anywhere with this issue? Are there any other ideas for workarounds?

          markrmiller@gmail.com Mark Miller added a comment -

          In the end, you have to harmonize the httpclient jars it seems.

          I updated my example project to work with Solr 5.2.1, so it does work: https://github.com/markrmiller/solr-map-reduce-example

          In this case it meant adding the following to the script:

          ## Harmonize Conflicting Jar Dependencies
          #######################
          
          # Hadoop uses a lower version than Solr and the flags to use user libs first don't help this conflict
          solr_http_client_version=4.4.1
          
          find $hadoop_distrib -name "httpclient-*.jar" -type f -exec rm {} \;
          find $hadoop_distrib -name "httpcore-*.jar" -type f -exec rm {} \;
          
          solr_client=$solr_distrib/server/solr-webapp/webapp/WEB-INF/lib/httpclient-$solr_http_client_version.jar
          solr_core=$solr_distrib/server/solr-webapp/webapp/WEB-INF/lib/httpcore-$solr_http_client_version.jar
          
          cp $solr_client $hadoop_distrib/share/hadoop/tools/lib
          cp $solr_core $hadoop_distrib/share/hadoop/tools/lib
          
          cp $solr_client $hadoop_distrib/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib
          cp $solr_core $hadoop_distrib/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib
          
          cp $solr_client $hadoop_distrib/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib
          cp $solr_core $hadoop_distrib/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib
          
          cp $solr_client $hadoop_distrib/share/hadoop/common/lib
          cp $solr_core $hadoop_distrib/share/hadoop/common/lib
          
          manis.nesan Mani added a comment -

          Thank you davidchiu & Mark Miller for the suggestions.
          I faced a similar issue, and jar harmonization on the client side helped me resolve it.

          Environment:
          Solr: 5.2.1
          I was running a SolrJ client artifact in JBoss 6.2, and the httpclient jars (4.2.1) that are part of the system dependencies packaged with JBoss conflicted with and overrode the jars (4.4.1) packaged with my artifact.

          Using the exclusions from [1] in jboss-deployment-structure.xml, one can prevent the server from automatically adding the dependencies.
          [1] https://docs.jboss.org/author/display/AS72/Class+Loading+in+AS7
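          Such an exclusion might look like the following (a sketch; the module name org.apache.httpcomponents is an assumption about how JBoss packages these jars — check the server's modules directory for the exact name):

```xml
<jboss-deployment-structure>
  <deployment>
    <exclusions>
      <!-- keep the container's bundled httpclient off this deployment's classpath -->
      <module name="org.apache.httpcomponents"/>
    </exclusions>
  </deployment>
</jboss-deployment-structure>
```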

          davidchiu davidchiu added a comment - - edited

          I found the same issue with Oozie and Hadoop; the Jira issue is https://issues.apache.org/jira/browse/OOZIE-2066.

          Can it inspire us?

          More information:
          http://mail-archives.apache.org/mod_mbox//oozie-user/201508.mbox/<c8c5ca13f44c4b18aaea2c90561d5254@MBX3.impetus.co.in>

          warrensmith Warren Smith added a comment -

          There is a workaround listed here: http://stackoverflow.com/questions/32105513/solr-bad-return-type-error which worked for me.

          Basically, create your own HttpClient and pass that to the CloudSolrClient constructor.


            People

            • Assignee: markrmiller@gmail.com Mark Miller
            • Reporter: davidchiu davidchiu
            • Votes: 1
            • Watchers: 9

              Dates

              • Created:
                Updated:
                Resolved:
