Cassandra / CASSANDRA-13676

Some serializers depend on Stream-specific methods


Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Legacy/Core
    • Labels: None
    • Severity: Normal

    Description

      When fixing CASSANDRA-2382, Jonathan Ellis complained that some serializers did (do?) depend on Stream-specific methods. The buggy code is as follows:

      public static class EstimatedHistogramSerializer implements ICompactSerializer<EstimatedHistogram>
      {
          // Tied to the concrete DataOutputStream class rather than the DataOutput interface
          public void serialize(EstimatedHistogram eh, DataOutputStream dos) throws IOException
          {
              long[] offsets = eh.getBucketOffsets();
              long[] buckets = eh.getBuckets(false);
              dos.writeInt(buckets.length);
              for (int i = 0; i < buckets.length; i++)
              {
                  dos.writeLong(offsets[i == 0 ? 0 : i - 1]);
                  dos.writeLong(buckets[i]);
              }
          }

          // Likewise tied to the concrete DataInputStream class on the read side
          public EstimatedHistogram deserialize(DataInputStream dis) throws IOException
          {
              int size = dis.readInt();
              long[] offsets = new long[size - 1];
              long[] buckets = new long[size];

              for (int i = 0; i < size; i++)
              {
                  offsets[i == 0 ? 0 : i - 1] = dis.readLong();
                  buckets[i] = dis.readLong();
              }
              return new EstimatedHistogram(offsets, buckets);
          }
      }
      

      The fixed code is:

      public static class EstimatedHistogramSerializer implements ICompactSerializer2<EstimatedHistogram>
      {
          // Programs against the DataOutput interface, so any implementation will do
          public void serialize(EstimatedHistogram eh, DataOutput dos) throws IOException
          {
              long[] offsets = eh.getBucketOffsets();
              long[] buckets = eh.getBuckets(false);
              dos.writeInt(buckets.length);
              for (int i = 0; i < buckets.length; i++)
              {
                  dos.writeLong(offsets[i == 0 ? 0 : i - 1]);
                  dos.writeLong(buckets[i]);
              }
          }

          // Accepts any DataInput, not just a DataInputStream
          public EstimatedHistogram deserialize(DataInput dis) throws IOException
          {
              int size = dis.readInt();
              long[] offsets = new long[size - 1];
              long[] buckets = new long[size];

              for (int i = 0; i < size; i++)
              {
                  offsets[i == 0 ? 0 : i - 1] = dis.readLong();
                  buckets[i] = dis.readLong();
              }
              return new EstimatedHistogram(offsets, buckets);
          }
      }
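
      Switching to the DataOutput/DataInput interfaces also means the serializer is no longer limited to streams. As an illustration not taken from the ticket: the JDK's java.io.RandomAccessFile implements both DataOutput and DataInput without being a stream, so it satisfies the fixed signatures but could never be passed to the buggy ones. A minimal, self-contained sketch, where writeHistogram/readHistogram are hypothetical stand-ins for the serializer methods above:

      import java.io.DataInput;
      import java.io.DataOutput;
      import java.io.IOException;
      import java.io.RandomAccessFile;
      import java.util.Arrays;

      public class InterfaceRoundTrip
      {
          // Stand-in for the fixed serialize: requires only the DataOutput interface
          static void writeHistogram(DataOutput out, long[] offsets, long[] buckets) throws IOException
          {
              out.writeInt(buckets.length);
              for (int i = 0; i < buckets.length; i++)
              {
                  out.writeLong(offsets[i == 0 ? 0 : i - 1]);
                  out.writeLong(buckets[i]);
              }
          }

          // Stand-in for the fixed deserialize: requires only the DataInput interface
          static long[][] readHistogram(DataInput in) throws IOException
          {
              int size = in.readInt();
              long[] offsets = new long[size - 1];
              long[] buckets = new long[size];
              for (int i = 0; i < size; i++)
              {
                  offsets[i == 0 ? 0 : i - 1] = in.readLong();
                  buckets[i] = in.readLong();
              }
              return new long[][]{ offsets, buckets };
          }

          public static void main(String[] args) throws IOException
          {
              long[] offsets = { 1, 2, 4 };
              long[] buckets = { 7, 5, 3, 1 };
              // RandomAccessFile is a DataOutput/DataInput but not an Output/InputStream,
              // so only the interface-based signatures can accept it
              try (RandomAccessFile file = new RandomAccessFile("histogram.bin", "rw"))
              {
                  writeHistogram(file, offsets, buckets);
                  file.seek(0);
                  System.out.println(Arrays.deepToString(readHistogram(file)));
              }
          }
      }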
      

      I notice that some serializers still depend on Stream-specific methods. For example, the IndexSummary serializer's deserialize method has the following code:

      public IndexSummary deserialize(DataInputStream in, IPartitioner partitioner, int expectedMinIndexInterval, int maxIndexInterval) throws IOException
      {
          int minIndexInterval = in.readInt();
          if (minIndexInterval != expectedMinIndexInterval)
          {
              throw new IOException(String.format("Cannot read index summary because min_index_interval changed from %d to %d.",
                                                  minIndexInterval, expectedMinIndexInterval));
          }

          int offsetCount = in.readInt();
          long offheapSize = in.readLong();
          int samplingLevel = in.readInt();
          int fullSamplingSummarySize = in.readInt();

          int effectiveIndexInterval = (int) Math.ceil((BASE_SAMPLING_LEVEL / (double) samplingLevel) * minIndexInterval);
          if (effectiveIndexInterval > maxIndexInterval)
          {
              throw new IOException(String.format("Rebuilding index summary because the effective index interval (%d) is higher than" +
                                                  " the current max index interval (%d)", effectiveIndexInterval, maxIndexInterval));
          }

          Memory offsets = Memory.allocate(offsetCount * 4);
          Memory entries = Memory.allocate(offheapSize - offsets.size());
          try
          {
              // FBUtilities.copy reads from an InputStream, which is what forces the
              // DataInputStream parameter in this method's signature
              FBUtilities.copy(in, new MemoryOutputStream(offsets), offsets.size());
              FBUtilities.copy(in, new MemoryOutputStream(entries), entries.size());
          }
          catch (IOException ioe)
          {
              offsets.free();
              entries.free();
              throw ioe;
          }
          // our on-disk representation treats the offsets and the summary data as one contiguous structure,
          // in which the offsets are based from the start of the structure. i.e., if the offsets occupy
          // X bytes, the value of the first offset will be X. In memory we split the two regions up, so that
          // the summary values are indexed from zero, so we apply a correction to the offsets when de/serializing.
          // In this case subtracting X from each of the offsets.
          for (int i = 0; i < offsets.size(); i += 4)
              offsets.setInt(i, (int) (offsets.getInt(i) - offsets.size()));
          return new IndexSummary(partitioner, offsets, offsetCount, entries, entries.size(), fullSamplingSummarySize, minIndexInterval, samplingLevel);
      }
      

      Is it worth replacing the Stream-specific inputs as well?
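
      One possible direction, sketched here rather than taken from the ticket: the Stream dependency in the method above appears to come from the FBUtilities.copy calls, which read from an InputStream. A hypothetical helper that copies from a plain DataInput via readFully would let deserialize accept DataInput, mirroring the EstimatedHistogramSerializer fix. DataInputCopy and its copy method below are illustrative names, not existing Cassandra API:

      import java.io.DataInput;
      import java.io.IOException;
      import java.io.OutputStream;

      public final class DataInputCopy
      {
          // Hypothetical replacement for the InputStream-based copy: pulls `length`
          // bytes from any DataInput with readFully and forwards them to `out`
          public static void copy(DataInput in, OutputStream out, long length) throws IOException
          {
              byte[] buffer = new byte[4096];
              while (length > 0)
              {
                  int chunk = (int) Math.min(buffer.length, length);
                  in.readFully(buffer, 0, chunk);
                  out.write(buffer, 0, chunk);
                  length -= chunk;
              }
          }
      }

      With such a helper, the copy calls could become DataInputCopy.copy(in, new MemoryOutputStream(offsets), offsets.size()), and the parameter type could be relaxed from DataInputStream to DataInput.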

          People

            Assignee: Unassigned
            Reporter: Hao Zhong