HBase / HBASE-3732

New configuration option for client-side compression

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      We have a case here where we have to store very fat cells (arrays of integers) which can amount into the hundreds of KBs that we need to read often, concurrently, and possibly keep in cache. Compressing the values on the client using java.util.zip's Deflater before sending them to HBase proved to be in our case almost an order of magnitude faster.

      The reasons are evident: less data is sent to HBase, the memstore contains compressed data, the block cache contains compressed data too, etc.

      I was thinking that it might be something useful to add to a family schema, so that Put/Result do the conversion for you. The actual compression algo should also be configurable.
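      As a rough sketch, the client-side trick the description mentions (deflate the value before handing it to a Put, inflate it after a Get) can already be done entirely outside HBase with java.util.zip; the class and method names below are illustrative, not part of any HBase API:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ValueCompression {

    // Deflate a fat cell value on the client before it goes into a Put.
    public static byte[] compress(byte[] value) {
        Deflater deflater = new Deflater();
        deflater.setInput(value);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(value.length / 2 + 16);
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate a value read back out of a Result.
    public static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 4);
        byte[] buf = new byte[4096];
        while (!inflater.finished()) {
            out.write(buf, 0, inflater.inflate(buf));
        }
        inflater.end();
        return out.toByteArray();
    }
}
```

      Because HBase treats the value as an opaque byte array, nothing server-side needs to change for this; the client just has to remember to decompress on read, which is exactly the bookkeeping the proposed family-schema option would automate.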

        Attachments

        1. compressed_streams.jar (2 kB), attached by Karthick Sankarachary

        Issue Links

          Activity

          Benoit Sigoure added a comment -

          If you want Put/Result to do the conversion for you, that means the client needs to be aware of the schema of the table before it can start using it, right? Because right now HBase clients don't know the schema, so it's something extra that they'd need to lookup separately, unless we add new fields in the .META. table that go along with each and every region.

          Jean-Daniel Cryans added a comment -

          Benoit, that's already the case (and it's something we want to get rid of). See the regioninfo qualifier in .META.

          I agree it is a bit invasive, and it's not just about adding a config option for families, so I'm still wondering if the pain of adding this is worth it. You could also add a layer on top of HTable, like CompressedHTable, but that just seems ugly.

          Benoit Sigoure added a comment -

          Oh yeah I forgot that this was in the info:regioninfo column, my bad.

          Wouldn't it be awesome if this was actually on a key-per-key basis? Is there a spare bit in KeyValue we can steal to indicate "this KV is compressed"? We could not only compress the value, but also the column qualifier and/or the key if they're big too (some applications store data in the column qualifier or, less frequently, in the key).

          stack added a comment -

          Benoît: We can't compress the column qualifier because then columns would sort differently. As to adding a bit to say a KV is compressed, that might be possible. Currently we have a type byte in each KV. The top four bits are unused. I had started a patch to use the top two for 'version' and had done the work to make sure the version was not considered when comparing, adding proper masks, etc. I could revive this work to add in a compression bit.
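          The type-byte idea could look something like the following sketch. The mask values are assumptions for illustration, not the actual KeyValue layout:

```java
public class KvTypeBits {
    // Existing KeyValue type codes (Put, Delete, ...) live in the low bits
    // of the type byte; per the comment above, the top four bits are unused.
    // One spare high bit could flag "this KV's value is compressed".
    // These mask values are hypothetical.
    static final int TYPE_MASK = 0x0F;
    static final int COMPRESSED_FLAG = 0x10;

    public static byte markCompressed(byte typeByte) {
        return (byte) (typeByte | COMPRESSED_FLAG);
    }

    public static boolean isCompressed(byte typeByte) {
        return (typeByte & COMPRESSED_FLAG) != 0;
    }

    // Comparators must strip the flag so sort order is unaffected.
    public static byte baseType(byte typeByte) {
        return (byte) (typeByte & TYPE_MASK);
    }
}
```

          The masking in baseType is the "proper masks" work stack refers to: any comparison path that looks at the type byte has to ignore the flag bit, which is why the change ripples through the code base.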

          Benoit Sigoure added a comment -

          Sounds good Stack.

          Karthick Sankarachary added a comment -

          Does it make sense to perform the compression at the IPC layer, specifically in the HBaseClient and HBaseServer classes? Currently, they both read (write) headers and data through a DataInputStream (DataOutputStream). What if we wrapped those streams such that they compress the bytes flowing through them, based on the yet-to-be-determined config option? As a matter of fact, I was working on something along these lines last year, but didn't follow through on it. Luckily, I still have the compression-based streams that I wrote, and I'm attaching them here just to get your thoughts. If this approach truly makes sense, then I can try to put together a working patch.
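          Wrapping the streams as suggested might look like this sketch, using plain java.util.zip stream wrappers in place of the attached classes, with byte arrays standing in for the socket streams:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class WrappedIpcStreams {

    // Client side: the DataOutputStream that headers and data are written
    // through, wrapped so bytes are deflated on the way out.
    public static byte[] encode(int callId, String header) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(new DeflaterOutputStream(wire));
        out.writeInt(callId);
        out.writeUTF(header);
        out.close(); // flushes and finishes the deflater
        return wire.toByteArray();
    }

    // Server side: the matching DataInputStream, inflating on the way in.
    public static String decode(byte[] wire) throws IOException {
        DataInputStream in = new DataInputStream(
            new InflaterInputStream(new ByteArrayInputStream(wire)));
        return in.readInt() + ":" + in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(42, "header"))); // prints 42:header
    }
}
```

          The appeal is that nothing above the IPC layer changes; the cost, as the next comment notes, is that only the wire transfer gets smaller, not what sits in the memstore or block cache.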

          Jean-Daniel Cryans added a comment -

          That would be orthogonal. Compressing the values before putting them on the network, memstore, block cache or HFile has, IMO, the biggest wins (as I listed in the description). With compression at the IPC layer you win on client-regionserver transfers. Not bad, but different, and at a smaller scale than what I propose.

          Karthick Sankarachary added a comment -

          Oh, I see. I like the idea of keeping the value in a compressed form until the client tries to "get" it. Perhaps we can compress the value depending on whether it's fatter than a certain threshold? Also, given that the value typically accounts for most of the KeyValue's size, do we need to call HFile#getCompressingStream if the value is already compressed up front?

          Karthick Sankarachary added a comment -

          Stack, if you need help with this patch, please let me know, because I can make the time to work on it. This seems like really useful low-hanging fruit.

          Jean-Daniel Cryans added a comment -

          Perhaps we can compress the value depending on whether it's fatter than a certain threshold

          That would make sense, or it could be in the HCD.

          do we need to call HFile#getCompressingStream if the value is already compressed up front

          The fact that the values are compressed should be transparent to the region servers, exactly like when the user is compressing the values themselves (like I described in the description of this jira).

          This seems like a really useful low hanging fruit.

          Not so sure about that. I think there are many easy ways to solve this, but most of them involve polluting the API or doing weird acrobatics in the client. Compressing/decompressing is easy; it's all about where you're going to do it in the code.

          Karthick Sankarachary added a comment -

          Not so sure about that. I think there are many easy ways to solve this, but most of them involve polluting the API or doing weird acrobatics in the client. Compressing/decompressing is easy; it's all about where you're going to do it in the code.

          As Stack suggested, a compression bit in KeyValue#Type, say Compressed(128), could be used to tell whether a value is compressed or not. Alternatively, we could define a Type#PutCompressed value, and have the server handle it the same way as Type#Put. The actual compression (decompression) would occur in the Put (Result) depending on the client-side compression algorithm. For all intents and purposes, this change would be transparent to the end user.

          stack added a comment -

          Just to say that the notion of adding a compressed flag to KV is pretty invasive, with ripples across the code base. Messy is how we know what codec to use when undoing the value. This info will not be in the KV.

          Karthick Sankarachary added a comment -

          Just to say that the notion of adding a compressed flag to KV is pretty invasive, with ripples across the code base. Messy is how we know what codec to use when undoing the value. This info will not be in the KV.

          I agree. In fact, the Type flag in the KV does not even get persisted in the HFile, IIUC. Given that, our best bet might be to prepend a "magic number" in the value to indicate that it is compressed. In this case, the onus would lie on the put (get) operation to compress (decompress) the value, as J-D proposed initially. As far as the server is concerned, the value will remain an opaque byte array.

          The motivation behind the magic number is to be able to determine whether or not the value being read needs to be decompressed. Note that most codecs (including GZIP and LZO) prefix the compressed stream with some sort of a magic number. However, instead of relying on the algorithm-specific number, it might be more convenient to introduce a magic number of our own.

          That would make sense, or it could be in the HCD.

          I like the idea of using the HCD, considering that we want all clients to be on the same page, as far as compressing values goes.

          Does the above approach sound reasonable? If so, may I take a stab at it?
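          The magic-number scheme could be sketched like this; the marker bytes and helper names are invented for illustration:

```java
import java.util.Arrays;

public class MagicPrefix {
    // Hypothetical lead-off bytes marking a client-compressed value.
    static final byte[] MAGIC = {(byte) 0xC0, (byte) 0x4D, (byte) 0x50, (byte) 0x52};

    // Put path: prepend the marker to an already-compressed payload.
    public static byte[] wrap(byte[] compressed) {
        byte[] out = new byte[MAGIC.length + compressed.length];
        System.arraycopy(MAGIC, 0, out, 0, MAGIC.length);
        System.arraycopy(compressed, 0, out, MAGIC.length, compressed.length);
        return out;
    }

    // Get path: a Result.getValue() equivalent would check this before inflating.
    public static boolean isWrapped(byte[] value) {
        if (value.length < MAGIC.length) return false;
        for (int i = 0; i < MAGIC.length; i++) {
            if (value[i] != MAGIC[i]) return false;
        }
        return true;
    }

    public static byte[] unwrap(byte[] value) {
        return Arrays.copyOfRange(value, MAGIC.length, value.length);
    }
}
```

          One caveat worth noting: an uncompressed value that happens to start with the marker bytes would be misread, which is part of why a schema-level (HCD) flag is the safer signal.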

          stack added a comment -

          Do you think we should provide this, Karthick, this auto-compress/decompress with special magic lead-off bytes to flag that compression was done? It seems like something that is easy enough for users to do themselves in the layer above HBase if they need it. I'd think we'd want to see more demand for such a feature before you spent time on it. What do you think?

          Karthick Sankarachary added a comment -

          Fair enough. If possible, can we ask people on the mailing list to cast their votes on this issue (my vote is in already)? To me, this does look like a feature that can only be handled on the client side.

          Jean-Daniel Cryans added a comment -

          I wouldn't call it a vote, since there's no voting process in Apache for requesting features. Contributors submit patches whether the rest of the community likes them or not; then it's up to the committers to get them into SVN or not.

          My opinion is that this is something HBase should be doing by default; there are too many advantages. I agree with Stack that it is easy to do at the application level, but if everyone starts doing it, then it really begs the question as to why HBase isn't doing it in the first place.

          Jonathan Gray added a comment -

          I agree that value compression is easily done at the application level. In cases where you have very large values, compressing that data is something you should always be thinking about.

          Published or contributed code samples could go a long way. Are there things we could add in Put/Get to make this kind of stuff easily pluggable?

          If it can be integrated simply, then this might be okay, but it should probably be part of a larger conversation about compression. And anything that touches KV needs to be thought through.

          I think there could be some substantial savings in HBase-specific prefix or row/family/qualifier compression, both on-disk and in-memory. One idea there would require complicating KeyValue and its comparator; a simpler solution would require short-term memory allocations to reconstitute KVs as they make their way through the KVHeap/KVScanner.

          I've also done some work on supporting a two-level compressed/uncompressed block cache patch (with lzo). I'm waiting to finish until HBASE-3857 goes in as it adds some things that make life easier in the HFile code.

          Jason Rutherglen added a comment -

          Sorry I meant to post the comment at HBASE-3857 to here:

          The FST data structure created in LUCENE-2792 could be used to compress the rowids in the HFile while simultaneously enabling fast lookup.

          stack added a comment -

          Moving out of 0.92.0. Pull it back in if you think different.

          Anoop Sam John added a comment -

          stack The new RPC supports compression for cell blocks, right? We can close this?

          stack added a comment -

          Resolving as "won't fix". You can achieve compression of KVs via other means now, by specifying that the client should use a compression codec on cellblocks at connection setup.
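          For reference, the cellblock route stack describes is a client-side setting. The property below is the one the HBase reference guide documents for RPC compression (shown here with Gzip; the codec must be available on both client and server):

```xml
<!-- hbase-site.xml on the client -->
<property>
  <name>hbase.client.rpc.compressor</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```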

          stack added a comment -

          Anoop Sam John Thanks Anoop.

          Pradeep Gollakota added a comment -

          I'd like to reopen discussion on this ticket. I have a slightly different use case that I'm considering for client side compression (sorry if this isn't the right forum for this question).

          I have a scenario where the clients are in a different network topology than the HBase cluster. The bandwidth between the clients and the cluster is limited. Since the client buffers writes, is there any mechanism in place for compressing the over-the-wire transfers?

          Harsh J added a comment -

          Pradeep,

          Does Stack's comment not address your need?

          You can achieve compression of KVs via other means now, by specifying that the client should use a compression codec on cellblocks at connection setup.

          Pradeep Gollakota added a comment -

          Yes it does. I misread his comment the first time. I also found HBASE-5355 which is exactly the thing that addresses my use case.

          Thanks.


            People

            • Assignee: Unassigned
            • Reporter: Jean-Daniel Cryans
            • Votes: 2
            • Watchers: 8
