Details
Bug
Status: Resolved
Low
Resolution: Fixed
None
linux
Low
Description
I have a JSON file created with sstable2json for a column family of the super column type. Its size is about 1.9 GB. (It is a dump of all keys, because I could not find out how to specify which keys to dump with sstable2json.)
When I tried to create an sstable from the JSON file, it failed with an OutOfMemoryError, as follows.
WARN 00:31:58,595 Schema definitions were defined both locally and in cassandra.yaml. Definitions in cassandra.yaml were ignored.
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
at java.lang.String.intern(Native Method)
at org.codehaus.jackson.util.InternCache.intern(InternCache.java:40)
at org.codehaus.jackson.sym.BytesToNameCanonicalizer.addName(BytesToNameCanonicalizer.java:471)
at org.codehaus.jackson.impl.Utf8StreamParser.addName(Utf8StreamParser.java:893)
at org.codehaus.jackson.impl.Utf8StreamParser.findName(Utf8StreamParser.java:773)
at org.codehaus.jackson.impl.Utf8StreamParser.parseLongFieldName(Utf8StreamParser.java:379)
at org.codehaus.jackson.impl.Utf8StreamParser.parseMediumFieldName(Utf8StreamParser.java:347)
at org.codehaus.jackson.impl.Utf8StreamParser._parseFieldName(Utf8StreamParser.java:304)
at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:140)
at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapObject(UntypedObjectDeserializer.java:93)
at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:65)
at org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:197)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:145)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:23)
at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:1261)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:517)
at org.codehaus.jackson.JsonParser.readValueAs(JsonParser.java:897)
at org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:208)
at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:197)
at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:421)
So what I had to do was split the JSON file with the "split" command, edit each piece so that it was a valid JSON file, and then create an sstable from each of the smaller JSON files.
Could you change json2sstable so that it avoids the OutOfMemoryError?
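For anyone who hits the same problem, the split-and-fix workaround described above could be scripted roughly as follows. This is only a sketch under a loud assumption: that the dump has a lone `{` on the first line, a lone `}` on the last line, and exactly one `"key": value` entry per line in between. The report does not state this layout, so check the dump first. The function name `split_json_dump` and the `chunk-NNNNN.json` file names are hypothetical.

```python
from pathlib import Path


def split_json_dump(src, out_dir, rows_per_chunk):
    """Split a one-row-per-line JSON object dump into smaller valid JSON files.

    Assumed layout (not confirmed by the report): first line "{", last line "}",
    and every line in between holds one '"key": value' entry, comma-terminated
    except for the last one.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []  # paths of the chunk files created so far
    rows = []     # entries buffered for the current chunk

    def flush():
        # Write the buffered entries as one self-contained JSON object.
        if not rows:
            return
        path = out_dir / f"chunk-{len(written):05d}.json"
        with open(path, "w", encoding="utf-8") as out:
            out.write("{\n" + ",\n".join(rows) + "\n}\n")
        written.append(path)
        rows.clear()

    # Read line by line so that only one chunk is held in memory at a time.
    with open(src, encoding="utf-8") as fh:
        for line in fh:
            entry = line.strip()
            if entry in ("", "{", "}"):
                continue  # skip the outer braces and blank lines
            rows.append(entry.rstrip(","))  # drop the separator; re-added on write
            if len(rows) >= rows_per_chunk:
                flush()
    flush()  # write whatever is left over
    return written
```

Each resulting chunk file is a valid JSON object by itself, so it can be passed to json2sstable on its own, as in the manual workaround.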