  1. Nutch
  2. NUTCH-1534

cassandra/hector exception: InvalidRequestException(why:column name must not be empty)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Auto Closed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.5
    • Component/s: fetcher, parser
    • Labels:
      None
    • Environment:

      nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1
      running fetch with parse=true
      fetcher.threads.per.queue=2

      Description

      During larger fetches (100k+ URLs), these errors sometimes occur:

      2013-02-19 09:32:09,639 WARN  fetcher.FetcherJob - Attempting to finish item from unknown queue: FetchItem [queueID=http://www.wer-kennt-wen.de, url=http://www.wer-kennt-wen.de/gallery/imageshow/mmfqq4y02q09, u=http://www.wer-kennt-wen.de/gallery/imageshow/mmfqq4y02q09, page=org.apache.nutch.storage.WebPage@7b1ab444 {
        "baseUrl":"null"
        "status":"34"
        "fetchTime":"1361262537305"
        "prevFetchTime":"1361257503835"
        "fetchInterval":"0"
        "retriesSinceFetch":"0"
        "modifiedTime":"0"
        "protocolStatus":"org.apache.nutch.storage.ProtocolStatus@40b98 {
        "code":"16"
        "args":"[Http code=403, url=http://www.wer-kennt-wen.de/gallery/imageshow/mmfqq4y02q09]"
        "lastModified":"0"
      }"
        "content":"null"
        "contentType":"null"
        "prevSignature":"null"
        "signature":"null"
        "title":"null"
        "text":"null"
        "parseStatus":"null"
        "score":"0.0"
        "reprUrl":"null"
        "headers":"{Set-Cookie=WKWSESSID=9d968aeef3a709bc4bba9bb955b93e1e; path=/; domain=.wer-kennt-wen.de, Connection=close, Content-Type=text/html, Cache-Control=no-store, no-cache, must-revalidate, post-check=0, pre-check=0, Date=Tue, 19 Feb 2013 08:28:57 GMT, P3P=CP="CAO OUR", Expires=Thu, 19 Nov 1981 08:52:00 GMT, Server=Apache, Pragma=no-cache}"
        "outlinks":"{}"
        "inlinks":"{}"
        "markers":"{dist=0, _injmrk_=y, _ftcmrk_=1361257998-2045033576, _gnmrk_=1361257998-2045033576}"
        "metadata":"{}"
      }]
      2013-02-19 09:32:09,640 ERROR fetcher.FetcherJob - Unexpected error for http://www.wer-kennt-wen.de/gallery/imageshow/mmfqq4y02q09
      me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:column name must not be empty)
              at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
              at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
              at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
              at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:248)
              at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:245)
              at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
              at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
              at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:245)
              at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:79)
              at org.apache.gora.cassandra.store.CassandraClient.addSubColumn(CassandraClient.java:172)
              at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:360)
              at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:212)
              at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
              at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
              at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
              at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:663)
              at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:557)
      Caused by: InvalidRequestException(why:column name must not be empty)
              at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19479)
              at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
              at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
              at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
              ... 20 more
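
      One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the failing call is CassandraClient.addSubColumn, so a sub-column with an empty name appears to have reached Cassandra. If map-typed WebPage fields (headers, markers, metadata and so on) are written as one sub-column per map key, an empty key in one of those maps would produce exactly this error. The sketch below shows a defensive filter under that assumption. The class EmptyColumnGuard and the method dropEmptyKeys are hypothetical names for illustration and are not part of Nutch, Gora or Hector.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Nutch, Gora or Hector.
public class EmptyColumnGuard {

    // Returns a copy of the map without entries whose key is null or empty,
    // so that no empty sub-column name would be handed to the store.
    static Map<String, String> dropEmptyKeys(Map<String, String> field) {
        Map<String, String> safe = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : field.entrySet()) {
            String key = e.getKey();
            if (key == null || key.isEmpty()) {
                continue; // assumed trigger of "column name must not be empty"
            }
            safe.put(key, e.getValue());
        }
        return safe;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Server", "Apache");
        headers.put("", "stray value"); // simulated malformed header name
        System.out.println(dropEmptyKeys(headers).keySet()); // prints [Server]
    }
}
```

      Logging the dropped entries together with the page URL before flushing might also help confirm whether an empty map key is really the cause.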
      
      

              People

              • Assignee: Unassigned
              • Reporter: Roland von Herget (rherget)
              • Votes: 0
              • Watchers: 2
