Just to say that the notion of adding a compressed flag to KV is pretty invasive, with ripples across the code base. The messy part is how we know which codec to use when undoing the value. This info will not be in the KV.
I agree. In fact, the Type flag in the KV does not even get persisted in the HFile, IIUC. Given that, our best bet might be to prepend a "magic number" in the value to indicate that it is compressed. In this case, the onus would lie on the put (get) operation to compress (decompress) the value, as J-D proposed initially. As far as the server is concerned, the value will remain an opaque byte array.
The motivation behind the magic number is to be able to determine whether or not the value being read needs to be decompressed. Note that most codecs (including GZIP and LZO) prefix the compressed stream with some sort of a magic number. However, instead of relying on the algorithm-specific number, it might be more convenient to introduce a magic number of our own.
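To make the magic-number idea concrete, here is a minimal client-side sketch. It assumes GZIP as the codec and uses only java.util.zip. The class name MagicValueCodec, the 4-byte MAGIC marker, and the method names are hypothetical, chosen for illustration only; none of them are part of HBase.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class MagicValueCodec {
    // Hypothetical marker bytes; the real value would have to be agreed upon.
    private static final byte[] MAGIC = {(byte) 0xC0, (byte) 0x4D, (byte) 0x50, (byte) 0x01};

    // Called on the put path: prepend the marker, then the GZIP-compressed value.
    public static byte[] encode(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAGIC);
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    // True if the stored value starts with our own marker.
    public static boolean isCompressed(byte[] stored) {
        if (stored.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (stored[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Called on the get path: decompress only if the marker is present,
    // otherwise hand back the value untouched.
    public static byte[] decode(byte[] stored) throws IOException {
        if (!isCompressed(stored)) {
            return stored;
        }
        ByteArrayInputStream in = new ByteArrayInputStream(
                stored, MAGIC.length, stored.length - MAGIC.length);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(in)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }
}
```

One caveat with this sketch: an uncompressed value that happens to begin with the same four bytes would be mistaken for a compressed one, so the marker only reduces ambiguity, it does not eliminate it.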
That would make sense, or it could be in the HCD.
I like the idea of using the HCD, considering that we want all clients to be on the same page, as far as compressing values goes.
Does the above approach sound reasonable? If so, may I take a stab at it?