Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Environment: Suse Linux 64-bit, JDK 1.6, Hadoop 0.12.2, 25-node cluster
Description
I encounter this error consistently whenever my job has more than 500 maps. Each map reads an XML feed containing about 200,000 documents, then parses and indexes each document. The job fails with the following exception:
java.io.IOException: Resetting to invalid mark
at java.io.BufferedInputStream.reset(BufferedInputStream.java:416)
at java.io.FilterInputStream.reset(FilterInputStream.java:200)
at org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:291)
at org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(StreamXmlRecordReader.java:120)
at org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(StreamXmlRecordReader.java:113)
at org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader.java:75)
at org.apache.hadoop.streaming.StreamXmlRecordReader.&lt;init&gt;(StreamXmlRecordReader.java:65)
at com.gale.searchng.workflow.fetcher.DocFetcherParser$XmlInputFormat.getRecordReader(DocFetcherParser.java:231)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:139)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
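The JDK behavior behind this message can be reproduced in isolation: `BufferedInputStream.mark(readlimit)` only guarantees the mark survives `readlimit` bytes of reading, and once a scan (such as `fastReadUntilMatch` searching for the next record's begin tag) reads past that limit, the mark is discarded and `reset()` throws. The sketch below is a standalone illustration of that contract, not Hadoop code; the buffer size, readlimit, and input bytes are arbitrary values chosen to trigger the failure.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetDemo {
    // Returns the IOException message produced when reset() is called
    // after reading more bytes than the mark's readlimit allowed.
    static String run() {
        // 1 KB of dummy bytes stands in for an input split.
        BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream(new byte[1024]), 16); // 16-byte buffer
        try {
            in.mark(8);                     // mark is only valid for 8 bytes
            for (int i = 0; i < 64; i++) {  // scan well past the readlimit
                in.read();
            }
            in.reset();                     // mark was discarded, so this throws
            return "reset succeeded";
        } catch (IOException e) {
            return e.getMessage();          // "Resetting to invalid mark"
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the readlimit passed to `mark()` were raised above the number of bytes actually scanned (or the scan were bounded by the readlimit), `reset()` would succeed instead.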