It would be good to have command-line tools to convert between newline-delimited text and Avro data files.
Add tools that read/write JSON records from/to Avro data files
Add tools that read/write CSV records from/to Avro data files
Add tools that read/write XML records from/to Avro data files
For this issue, I'm imagining a Java tool. A similar C tool would also be tremendously useful.
Some potential details:
Would the HDFS integration introduce a dependency on Hadoop Common's FileSystem?
Yes. Avro already uses the Hadoop 0.20 APIs elsewhere, which are intended to be stable for some time.
This patch provides conversion from text files to Avro data files and back. It supports HDFS, local files, and piping.
Looks good! A few nits:
OptionSpec<Integer> level = p.accepts("level", "compression level")
  .withRequiredArg().ofType(Integer.class);
OptionSet opts = p.parse(...);
compressionLevel = level.value(opts);
This update addresses those changes... thanks!
I think the default compression level should be 1: fast, but compressed.
Also, where do we document that '-' means standard input or standard output? The Util class is package-private, so that doesn't count. Perhaps we should add it to the help string?
Other than that, +1.
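For readers unfamiliar with the '-' convention raised above, here is a minimal sketch of how a tool might map '-' to standard input or standard output. The class DashStreams and its methods are hypothetical names chosen for illustration; this is not the package-private Util class from the patch, and it handles only local files, not HDFS.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not the Util class from the patch. */
final class DashStreams {
  /** The file name that stands for standard input or standard output. */
  static final String STD = "-";

  /** Returns the given stdin stream when name is "-", else opens the local file. */
  static InputStream openInput(String name, InputStream stdin) {
    try {
      return STD.equals(name) ? stdin : Files.newInputStream(Paths.get(name));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Returns the given stdout stream when name is "-", else creates the local file. */
  static OutputStream openOutput(String name, OutputStream stdout) {
    try {
      return STD.equals(name) ? stdout : Files.newOutputStream(Paths.get(name));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Copies every byte from in to out and returns the number of bytes copied. */
  static long copy(InputStream in, OutputStream out) {
    try {
      byte[] buf = new byte[8192];
      long total = 0;
      for (int n; (n = in.read(buf)) != -1; ) {
        out.write(buf, 0, n);
        total += n;
      }
      out.flush();
      return total;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: DashStreams <in|-> <out|->  ('-' means stdin/stdout)");
      return;
    }
    try (InputStream in = openInput(args[0], System.in);
         OutputStream out = openOutput(args[1], System.out)) {
      copy(in, out);
    }
  }
}
```

Note that the usage string itself states what '-' means, which is one way to address the documentation question without exposing the helper class.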
I agree, here are those changes...
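To illustrate the "fast, but compressed" trade-off behind the suggested default of 1, here is a rough sketch that compresses newline-delimited text at two levels. It assumes the level is a deflate level as defined by java.util.zip.Deflater; the class DeflateLevels and the sample data are made up for illustration, and the timings it prints vary by machine.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical demo class; not part of the patch. */
final class DeflateLevels {
  /** Compresses input with the given deflate level (1 = fastest, 9 = smallest). */
  static byte[] deflate(byte[] input, int level) {
    Deflater deflater = new Deflater(level);
    try {
      deflater.setInput(input);
      deflater.finish();
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      while (!deflater.finished()) {
        out.write(buf, 0, deflater.deflate(buf));
      }
      return out.toByteArray();
    } finally {
      deflater.end(); // release native memory
    }
  }

  /** Decompresses data produced by deflate(). */
  static byte[] inflate(byte[] compressed) {
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      while (!inflater.finished()) {
        int n = inflater.inflate(buf);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalArgumentException("truncated or unsupported deflate stream");
        }
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } catch (DataFormatException e) {
      throw new IllegalArgumentException(e);
    } finally {
      inflater.end();
    }
  }

  /** Builds made-up newline-delimited sample text. */
  static byte[] sampleLines(int count) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < count; i++) {
      sb.append("record ").append(i).append(" of some log text\n");
    }
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] text = sampleLines(10_000);
    for (int level : new int[] {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION}) {
      long start = System.nanoTime();
      byte[] packed = deflate(text, level);
      long micros = (System.nanoTime() - start) / 1_000;
      System.out.printf("level %d: %d -> %d bytes in %d us%n",
          level, text.length, packed.length, micros);
    }
  }
}
```

Deflater.BEST_SPEED is level 1 and Deflater.BEST_COMPRESSION is level 9, so the loop compares the proposed default against the slowest setting.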
Checkstyle fails, complaining about tabs when I run 'cd lang/java; ant clean test'.
I think my commit of AVRO-512 may also create conflicts with this patch, so please be sure to run 'svn up' before you re-test and re-submit. Thanks!
Okay, this now plays nicely with checkstyle. I also made it compatible with your patch, I think!
I just committed this. Thanks, Patrick!