I think I've explained this command poorly. It attempts to render whatever exists at a given path as human-readable text. Right now, it handles the SequenceFile and gzip formats; it's not trying to stuff a framework for computation on SequenceFiles into FsShell. I agree that such a toolchain should be independent, but this aspires to something else.
While we're on the subject, though, I'm not sure I fully understand the motivation for this command-line tool. Isn't each of those commands easily implemented in map/reduce? As I see it, there are two ways to generalize the operations Enis suggests, since all of WritableComparable is fair game. Either a) everything is first converted to a string, or b) the framework understands that, when a user-specified InputFormat creates a RecordReader whose key type is comparable to IntWritable, it should select a comparator for its keys such that the user-supplied "70" is greater than "9" (unless the user actually intends a lexicographic ordering). Not to reveal my opinion.
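To make the difference between a) and b) concrete, here is a minimal sketch in plain Java. It deliberately uses no Hadoop classes; the class name OrderingSketch and both comparator names are hypothetical, and the integer parse only stands in for whatever comparator an IntWritable-like key type would supply.

```java
import java.util.Arrays;
import java.util.Comparator;

public class OrderingSketch {
    // Option (a): every key is treated as a string and compared lexicographically.
    static final Comparator<String> LEXICOGRAPHIC = Comparator.naturalOrder();

    // Option (b): the user-supplied text is parsed into the key's numeric type
    // before comparing (assumption: keys are valid decimal integers).
    static final Comparator<String> NUMERIC =
        Comparator.comparingInt((String s) -> Integer.parseInt(s));

    public static void main(String[] args) {
        String[] keys = {"9", "70", "100"};

        String[] asText = keys.clone();
        Arrays.sort(asText, LEXICOGRAPHIC);
        System.out.println(Arrays.toString(asText)); // [100, 70, 9]

        String[] asInts = keys.clone();
        Arrays.sort(asInts, NUMERIC);
        System.out.println(Arrays.toString(asInts)); // [9, 70, 100]
    }
}
```

Under a), "70" sorts before "9"; under b), it sorts after. Picking between the two automatically means knowing the key type up front, which is the hard part.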
In the latter case, code like this belongs in mapred, since merely working out the types is going to be either a hack or a significant effort. In the former case, for more than a single SequenceFile, such code still seems to belong in mapred; that said, piping the output of "text", as implemented, through a general text-processing utility is a reasonable hack for some purposes. For my purposes, I only needed to check the first few records of some of the output, and this suffices. I don't know why a comparable utility like HADOOP-175 never got committed (it would be a good base, though 1) it relies on UTF8 keys, which are now deprecated, and 2) it solves some problems outside the limited domain of this issue), but the fact that no similar utility has been written in the past year makes me wary of over-complicating this. It's for human readability, not processing.