Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.2.0 beta 1
    • Component/s: None
    • Labels: None

      Description

For nodes with millions of keys, doing rolling restarts that take over 10 minutes per node can be painful if you have a 100-node cluster. All of our time is spent on index summary computations at startup. It would be great if we could save those to disk as well. Our indexes are quite large.
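
In outline, the fix discussed below caches the sampled index summary on disk as an extra SSTable component and falls back to the normal primary-index scan when that file is missing or corrupt. A minimal sketch of that load-or-rebuild flow (illustrative only: loadSummary/saveSummary follow the patch discussion, while open(...) and buildSummaryFromPrimaryIndex are hypothetical helpers, not the committed API):

    import java.io.File;

    // Sketch: reuse the summary saved by a previous run when possible;
    // otherwise rebuild it the slow way and persist it for the next restart.
    static void open(SSTableReader reader)
    {
        File summaryFile = new File(reader.descriptor.filenameFor(Component.SUMMARY));
        if (summaryFile.exists() && loadSummary(reader, summaryFile))
            return;                           // fast path: no primary-index scan
        buildSummaryFromPrimaryIndex(reader); // today's slow startup path
        saveSummary(reader, summaryFile);     // so the next restart can skip it
    }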

Attachments

1. 0001-CASSANDRA-2392-v6.patch (20 kB, Vijay)
2. 0001-CASSANDRA-2392-V7.patch (21 kB, Vijay)
3. 0001-re-factor-first-and-last.patch (12 kB, Vijay)
4. 0001-save-summaries-to-disk.patch (25 kB, Vijay)
5. 0001-save-summaries-to-disk-v4.patch (19 kB, Vijay)
6. 0002-save-summaries-to-disk.patch (16 kB, Vijay)
7. 0002-save-summaries-to-disk-v2.patch (17 kB, Vijay)
8. 0002-save-summaries-to-disk-v3.patch (17 kB, Vijay)
9. CASSANDRA-2392-v5.patch (20 kB, Pavel Yaskevich)


          Activity

Pavel Yaskevich added a comment -

          Committed

Vijay added a comment -

Rebased. This change breaks CASSANDRA-4023, but that is fine because the cost (deserializing the DecoratedKey for the index summary) is only paid during the upgrade and not seen after that. I am going to work on CASSANDRA-3762; I think we can commit this one independently now.

Jonathan Ellis added a comment -

          For the record, I'm still fine with saying "loading caches will slow down startup, deal with it," but I think we have a good plan of attack on 3762 now and it may be simpler to just do that first, before rebasing this. Which is also fine.

Pavel Yaskevich added a comment -

Great, now that we have this one ready, we really need to finish up CASSANDRA-3762 to find out whether this design makes sense or whether we should go with another strategy.

Vijay added a comment -

The problem with v5 is that when we do loadSummary() we will have ibuilder and dbuilder already populated... when the boolean readIndex is true we will repopulate them again....

Attached v6 fixes it, and it's tested. Thanks!

Pavel Yaskevich added a comment -

Changed SUMMARIES to SUMMARY, {load, save}Summaries to {load, save}Summary, changed a few variable names, added/extended comments and log messages, and added a condition to skip IndexSummary re-population while primary_index is read and the IndexSummary was already successfully loaded in SSTableReader.load(...). Please test and let me know what you think.
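
(For illustration, that skip condition might look roughly like this inside the primary_index read loop; summaryLoaded and maybeAddEntry follow names used in this thread, the rest is hypothetical:)

    // While scanning PRIMARY_INDEX for key-cache pre-load or bloom filter
    // rebuild, don't re-add summary entries if the summary came from disk.
    if (!summaryLoaded)
        indexSummary.maybeAddEntry(decoratedKey, indexPosition);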

Vijay added a comment -

>>> Will you be able to deal with CASSANDRA-3762 in time for the 1.1 release?
Yeah, I have a prototype working; I just have to do a quick benchmark to see if it makes sense....

Pavel Yaskevich added a comment -

Thanks for the patch, going to take a look soon! Will you be able to deal with CASSANDRA-3762 in time for the 1.1 release? That way we will be able to move both it and this one back to 1.1.

Vijay added a comment -

Hope the attached patch has all you needed... I added a check of the disk access mode: if it changed, we recreate the summaries.... For now, ignore that saveSummaries is called when the key cache is read; I will fix that in the other ticket.
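
(A rough sketch of that check, assuming the access mode is written into the saved file; DatabaseDescriptor.getDiskAccessMode() is the real setting, the serialization detail is hypothetical:)

    // On load: if disk_access_mode changed since the summary was saved,
    // the recorded segment boundaries are stale, so force a rebuild.
    String savedMode = iStream.readUTF();
    if (!savedMode.equals(DatabaseDescriptor.getDiskAccessMode().name()))
        return false; // caller falls back to recreating the summary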

Pavel Yaskevich added a comment -

>>> Well, that's for the following: if (!recreatebloom && !cacheLoading && loadSummaries) return; We still need to complete the indexes and builders even though we don't go through all the code in the try.

Then I suggest you skip the primary_index loop phase instead of returning, because using finally that way is neither good practice nor good style.

>>> Not sure it can have different semantics, but I will remove the refactor; not a big deal.

Please do.

>>> I don't think we leak descriptors in either case; it is in the finally block already, and all we are planning to save is a variable assignment, which I will do; not a big deal.

I don't follow here; what I mean is to change the {load, save}Summaries methods' try blocks to look like the code posted below, because in your version you close the stream only if an IOException is thrown:

          +        try
          +        {
          +            iStream = new DataInputStream(new FileInputStream(inMemoryDataFile));
          +            reader.indexSummary = IndexSummary.serializer.deserialize(iStream, reader.descriptor);
          +            ibuilder.deserializeBounds(iStream);
          +            dbuilder.deserializeBounds(iStream);
          +        }
          +        catch (IOException e)
          +        {
          +            // corrupted hence delete it and let it load it now.
          +            if (inMemoryDataFile.exists())
          +                inMemoryDataFile.delete();
          +            return false;
          +        }
          +        finally
          +        {
          +            FileUtils.closeQuietly(iStream);
          +        }
          

          and

          +        try
          +        {
          +            oStream = new DataOutputStream(new FileOutputStream(summaryFile));
          +            IndexSummary.serializer.serialize(reader.indexSummary, oStream);
          +            ibuilder.serializeBounds(oStream);
          +            dbuilder.serializeBounds(oStream);
          +        }
          +        catch (IOException e)
          +        {
          +            // corrupted hence delete it and let it load it now.
          +            if (summaryFile.exists())
          +                summaryFile.delete();
          +        }
          +        finally
          +        {
          +            FileUtils.closeQuietly(oStream);
          +        }
          

This makes sure that we close the stream every time and not only when an IOException is thrown. A file descriptor can be closed even after the original file was deleted, so everything is safe even if an IOException is thrown and the file is deleted before the finally block is called.

Vijay added a comment -

>>> indexSummary.complete() can be moved out from the try block because it doesn't throw IOException
Well, that's for the following:

if (!recreatebloom && !cacheLoading && loadSummaries)
    return;

We still need to complete the indexes and builders even though we don't go through all the code in the try.

>>> is not a guaranteed thing which means that IndexSummary.last has a different semantics
Not sure it can have different semantics, but I will remove the refactor; not a big deal.

>>> Summaries methods are leaking file descriptors
I don't think we leak descriptors in either case; it is in the finally block already, and all we are planning to save is a variable assignment, which I will do; not a big deal.

Pavel Yaskevich added a comment -

Also, the changes shown below are odd, because you get the same behavior by keeping the code as it was - it will just throw an exception somewhere in the try block, run the code in the finally block, and never get to the indexSummary.complete() and {i,d}builder.complete(String) methods, which are no-ops in that case. Btw, indexSummary.complete() can be moved out from the try block because it doesn't throw IOException and is a no-op if the code above it throws, but that is not a big deal anyway.

          +        catch (IOException ex)
          +        {
          +            exception = true;
          +            throw ex;
                   }
                   finally
                   {
          +            // close the file first.
                       FileUtils.closeQuietly(input);
          +            if (!exception)
          +            {
          +                // finalize the load.
          +                indexSummary.complete();
          +                // finalize the state of the reader
          +                ifile = ibuilder.complete(descriptor.filenameFor(Component.PRIMARY_INDEX));
          +                dfile = dbuilder.complete(descriptor.filenameFor(Component.DATA));
          +            }
                   }
          
Pavel Yaskevich added a comment -

>>> But the main idea is to reduce the code and the checks which we have to do just to populate the first and last variables. IMO it is better served in IndexSummary, which already has the needed checks: by using maybeAddEntry() and marking the others private, we don't need extra checks elsewhere to populate the fields... first and last in an index is also a summary.

Correct me if I'm wrong, but as I see it in SSTableReader.load(...), the condition "SSTable.last == IndexSummary.last" is not a guaranteed thing, which means that IndexSummary.last has different semantics from SSTable.last. As for the checks - I don't see many of those, and IndexSummary in its current state does not have anything to do with SSTable's last/first variables, so I don't really understand which checks you are talking about. If you really want to be pedantic about the domain of first/last, I agree that they could belong to the summary of the SSTable, but certainly not to the "index" one.

>>> Because we read from the disk to populate the IndexSummary? If yes, I can make sure that both patches go into the same release.

Because we would end up reading more data (e.g. some of the keys and all index and data positions would be read twice) from different files - primary_index and summary.

Vijay added a comment -

>>> I don't think that "0001-re-factor-first-and-last" is a good idea because by moving first/last variables to IndexSummary
But the main idea is to reduce the code and the checks which we have to do just to populate the first and last variables. IMO it is better served in IndexSummary, which already has the needed checks: by using maybeAddEntry() and marking the others private, we don't need extra checks elsewhere to populate the fields... first and last in an index is also a summary.

>>> one release that could make start-up times even longer than right now
Because we read from the disk to populate the IndexSummary? If yes, I can make sure that both patches go into the same release.

Pavel Yaskevich added a comment -

Here are the last things with v3:

• The {load, save}Summaries methods are leaking file descriptors, because {o, i}Stream is closed only when the method handles an IOException.

          Nit:

          +            FileInputStream input = new FileInputStream(inMemoryDataFile);
          +            iStream = new DataInputStream(input);
          

          and

          +            FileOutputStream input = new FileOutputStream(summaryFile);
          +            oStream = new DataOutputStream(input);
          

          can be changed to

          {i,o}Stream = new Data{Input, Output}Stream(new File{Input, Output}Stream(summaryFile));
          

          because input var is not really needed.

I don't think that "0001-re-factor-first-and-last" is a good idea, because by moving the first/last variables to IndexSummary you change their semantics: they no longer indicate the first and last key that the SSTable keeps, but rather the first/last key covered by the IndexSummary of the individual SSTable. So I think we really should just keep those variables in the old place.

Also, I'm concerned that CASSANDRA-3762 is marked for 1.2 and this one for 1.1, because if we don't get them into one release, that could make start-up times even longer than they are right now, which defeats the point of the current task, because there is a big chance that the key cache will be enabled on the big ColumnFamilies.

Vijay added a comment -

Hi Pavel, v3 has the method name changes. The component is now called Summaries.db.

Jonathan Ellis added a comment -

I think it's fine to acknowledge that loading the key cache will negate the advantages of saved index summaries for this ticket, and open another one to improve the key cache design.

Pavel Yaskevich added a comment -

>>> Renamed and done recommended changes. Except we have all the in-memory data structures in one file instead of multiple files. They are handled differently and will be kind of throw-away data, so we can regenerate them.

I kind of liked it more when the component was Summary, because InMemoryData doesn't really tell what is inside. Please rename SegmentedFile serialize/deserialize to something like serializeBounds/deserializeBounds.

>>> I do see the key cache working in my tests...

Sorry, I wasn't clear when I said that. It seems the summary save/load is pointless in its current form, because even if we have loaded the summary from disk we would still have to loop through the whole PRIMARY_INDEX if pre-cache (which is always enabled by default) or re-create-BloomFilter is enabled, which practically means we spend the same time on I/O there as ibuilder.deserialize and dbuilder.deserialize together. We would need to change the logic in SSTableReader.load(boolean, Set<DecoratedKey>) so that it doesn't have such I/O overhead, because this would make it even slower compared to the time it takes now.

Vijay added a comment -

Hi Pavel, please find the attached patch.
1) Renamed and done recommended changes. Except we have all the in-memory data structures in one file instead of multiple files. They are handled differently and will be kind of throw-away data, so we can regenerate them.
2) I do see the key cache working in my tests...
3) The change is only in 0002; 0001 remains the same.

Thanks!

Pavel Yaskevich added a comment - edited

>>> Will do. The initial idea was to save some disk space, as the keys in some cases can be really long, and with the index the seeks were not that bad in my initial tests, but I will save them in v2.

I'm thinking here from the I/O perspective: if we just read one file sequentially, we get page cache read-ahead working for us, populating it with useful data; but if we read from two files and do random I/O on one of them, that leads to slower I/O plus a page cache populated with useless data, which could cost performance when the node finishes start-up and starts to serve reads. Index intervals are almost always big enough that the space taken by keys is negligible compared to the I/O benefits it would give us.

>>> I am not sure how saving dataPosition will help, as we only have summary entries every 128 keys or more; how will we mark a boundary with it? For example, each row could be 100MB big.

Oh yes, you are right, we really need all the boundary information from the segmented files, my bad.
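
(Sketch of serializing the sampled keys inline so deserialization becomes one sequential read, with no seeks into PRIMARY_INDEX; the IndexSummary accessors here are hypothetical:)

    // Write each sampled entry as (key bytes, index position): reading this
    // back is purely sequential, so page-cache read-ahead works in our favor.
    public void serialize(IndexSummary summary, DataOutputStream out) throws IOException
    {
        out.writeInt(summary.size());
        for (int i = 0; i < summary.size(); i++)
        {
            ByteBufferUtil.writeWithLength(summary.getKey(i), out); // assumed helper
            out.writeLong(summary.getIndexPosition(i));             // hypothetical accessor
        }
    }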

Vijay added a comment -

Hi Pavel,

>>> To avoid any seeks in the PRIMARY_INDEX file upon IndexSummary.deserialize, I suggest saving the key (only the BB part) as well as the index position on IndexSummary.serialize.
Will do. The initial idea was to save some disk space, as the keys in some cases can be really long, and with the index the seeks were not that bad in my initial tests, but I will save them in v2.

>>> I would also suggest saving dataPosition from the primary index into the summaries file to avoid adding serialization to SegmentedFile, because SegmentedFile serialize(...)/deserialize(...) are not really a serialize/deserialize.
I am not sure how saving dataPosition will help, as we only have summary entries every 128 keys or more; how will we mark a boundary with it? For example, each row could be 100MB big.

>>> can you please explain this chunk of code to me?
The idea is to save the summary when SSTable creation/load completes (as there isn't any temporary state for them and they fit in memory). If it is corrupted, deleted, or missing, we will just recalculate it instead of depending on it.

Pavel Yaskevich added a comment -

Thanks for the patch! Here is my review:

• Index summaries load in SSTableReader.load(boolean, Set<DecoratedKey>) breaks key cache pre-load.
• The IndexSummary deserialize(...) method should be made static and return an IndexSummary object. This will also allow dropping the IndexSummary argument from SSTableReader.loadSummaries(...).
• To avoid any seeks in the PRIMARY_INDEX file upon IndexSummary.deserialize, I suggest saving the key (only the BB part) as well as the index position on IndexSummary.serialize.
• I would also suggest saving dataPosition from the primary index into the summaries file to avoid adding serialization to SegmentedFile, because SegmentedFile serialize(...)/deserialize(...) are not really a serialize/deserialize - they just save/read boundaries. This way you would be able to do deserialization and boundary load at the same time, without saving/reading additional information to/from the disk, because only ibuilder needs indexPosition and dbuilder - dataPosition.
• loadSummaries should be renamed to something more appropriate, because that method does not only load index summaries - it also loads the index and data builders; per se it does not really load them, but rather just deserializes boundaries into an existing object, which is not a good practice.
• can you please explain this chunk of code to me?
  +            // don't rename summaries as it is not created yet and created while it is loaded.
  +            for (Component component : Sets.difference(components, Sets.newHashSet(Component.DATA, Component.SUMMARIES)))
                   FBUtilities.renameWithConfirm(tmpdesc.filenameFor(component), newdesc.filenameFor(component));
Vijay added a comment -

Hi Pavel, I separated the refactor and the index-saving patches and tested basic functionality. Thanks!

Vijay added a comment -

Hi Pavel, initially I thought of doing something like the auto-saving cache, but later decided on the other approach. Please take a look at this and let me know... Thanks!

Vijay added a comment -

Update: the current plan for this ticket is to implement something like CASSANDRA-3623 for mmap'ed files and remove the addPotentialBoundary() code.

Jonathan Ellis added a comment -

          To answer the question: yes, let's ignore caches here. Would like to do this for 1.1 as well.

Pavel Yaskevich added a comment -

Looking forward to seeing your patch.

Vijay added a comment -

Cool, then we can do a small refactor of addPotentialBoundary() (used by MmappedSegmentedFile from the indexFile) and ignore the key cache in this patch (which will be taken care of @ CASSANDRA-3143)?

Pavel Yaskevich added a comment -

We already hold a primary index file which has information about index/data segment boundaries; saving boundary information twice would be redundant. Also, we don't want to save the key cache, first because it could be irrelevant by the time the node starts, and second because we are planning to add global key/row caches in 1.1.

Vijay added a comment -

Sorry, I wasn't clear earlier.
In SSTR.load(recreatebloom, keysToLoadInCache) we do a check to addPotentialBoundary() for mmap and the compression builder...
We can save all of them (should we make it configurable?), and for the key cache we can do a testAndLoad...

Basically, testAndLoad (see the sketch below):
Option 1) will check the bloom filter, then check the file for the keys, and if they are there, add them to the cache (if the cache is saved).
Option 2) Or we can save the descriptor with the key-cache file, and testAndLoad can verify that the file exists (which will be cheaper during startup).
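
(A rough sketch of option 2; all names here are illustrative, not the patch's API:)

    // Option 2: the key cache was saved alongside its SSTable descriptor;
    // load it only if that SSTable still exists - a cheap existence check.
    static boolean testAndLoad(Descriptor saved)
    {
        File dataFile = new File(saved.filenameFor(Component.DATA));
        if (!dataFile.exists())
            return false;         // SSTable was compacted away; cache is stale
        loadSavedKeyCache(saved); // hypothetical helper that reads the cache file
        return true;
    }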

Jonathan Ellis added a comment -

          What do you mean by saving ibuilder + dbuilder? Serialize them somehow?

Vijay added a comment -

We might also want to save ibuilder and dbuilder, plus the key cache with the descriptor.


            People

• Assignee: Vijay
• Reporter: Chris Goffinet
• Reviewer: Pavel Yaskevich
• Votes: 1
• Watchers: 1

