Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-9559

Change the default "default serde" from ByteArraySerde to null

Details

    • Improvement
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • None
    • 3.0.0
    • streams

    Description

      The current default "default serde" is not particularly useful, and in almost all cases is intended to be overwritten either by setting the config explicitly or by specifying serdes directly in the topology. If a user does not set the config and misses specifying a serde they will get a runtime ClassCastException once they start processing unless they are in fact processing only bytes.

      We should change the default default to null, so that an exception will instead be thrown immediately on startup if a user failed to specify a serde somewhere in their topology and it falls back to the unset default.

      KIP-741: https://cwiki.apache.org/confluence/display/KAFKA/KIP-741%3A+Change+default+serde+to+be+null 

      Attachments

        Activity

          high.lee Just to clarify – this ticket implies a backward incompatible change, and thus we cannot do it before 3.0 release. There is no timeline for a 3.0 major version bump atm – hence, we cannot really work on this ticket atm.

          mjsax Matthias J. Sax added a comment - high.lee  Just to clarify – this ticket implies a backward incompatible change, and thus we cannot do it before 3.0 release. There is no timeline for a 3.0 major version bump atm – hence, we cannot really work on this ticket atm.
          high.lee highluck added a comment -

          mjsax

          oh thank you, for your care!

          i'm understand!

          high.lee highluck added a comment - mjsax oh thank you, for your care! i'm understand!

          mjsax I am willing to work on this issue 

           

          msufyian Muhammad Sufyian added a comment - mjsax  I am willing to work on this issue   

          msufyian Thanks for you interest. As mentioned above, this is a breaking change and can only be done in a major release. Because there are no plans to bump the version to 3.0.0 atm, we cannot work in this ticket right now.

          mjsax Matthias J. Sax added a comment - msufyian  Thanks for you interest. As mentioned above, this is a breaking change and can only be done in a major release. Because there are no plans to bump the version to 3.0.0 atm, we cannot work in this ticket right now.

          lct45 â€“ I think we should also add corresponding checks to get clean error messages when we try to fall-back to the default serdes – atm, we know they can never be `null` but with this change they could be `null` and me might crash with a NPE what is bad user experience. Instead, we should check for the `null` condition and throw an informative `StreamsException` (or some sub-type).

          mjsax Matthias J. Sax added a comment - lct45  â€“ I think we should also add corresponding checks to get clean error messages when we try to fall-back to the default serdes – atm, we know they can never be `null` but with this change they could be `null` and me might crash with a NPE what is bad user experience. Instead, we should check for the `null` condition and throw an informative `StreamsException` (or some sub-type).

          Adding a check for NPE everywhere serdes are used is probably going to be extremely painful. Maybe instead of defaulting to null, we can add a PoisonSerde class as the default which actually provides a useful error message

          ableegoldman A. Sophie Blee-Goldman added a comment - Adding a check for NPE everywhere serdes are used is probably going to be extremely painful. Maybe instead of defaulting to null, we can add a PoisonSerde class as the default which actually provides a useful error message

          Adding a check for NPE everywhere serdes are used is probably going to be extremely painful. 

          I think there is just a single place to add it: StreamsConfig#defaultKeySerde and #defaultValueSerde  

          mjsax Matthias J. Sax added a comment - Adding a check for NPE everywhere serdes are used is probably going to be extremely painful.  I think there is just a single place to add it: StreamsConfig#defaultKeySerde and #defaultValueSerde  

          I don't think we can just throw an exception based on whether `defaultKeySerde` is null, as we always pass this in to initialize the keySerde in AbstractProcessorContext. Then we have about a million (ok, like 30) places where we call AbstractProcessorContext#key/valueSerde to get the default, and then maybe do/don't actually use that default serde by checking whether the operator/whatever has configured a specific serde. So we couldn't just throw an exception on null inside StreamsConfig#defaultKey/ValueSerde, and we can't throw if null inside AbstractProcessorContext#key/valueSerde without doing some refactoring of the downstream logic. For example see how things get passed through WrappingNullableUtils#prepareSerde(specificSerde, contextkeySerde, contextValueSerde)

          So, it's possible but it just might involve more work. Maybe not "Everywhere serdes are used" painful, though

          ableegoldman A. Sophie Blee-Goldman added a comment - I don't think we can just throw an exception based on whether `defaultKeySerde` is null, as we always pass this in to initialize the keySerde in AbstractProcessorContext. Then we have about a million (ok, like 30) places where we call AbstractProcessorContext#key/valueSerde to get the default, and then maybe do/don't actually use that default serde by checking whether the operator/whatever has configured a specific serde. So we couldn't just throw an exception on null inside StreamsConfig#defaultKey/ValueSerde, and we can't throw if null inside AbstractProcessorContext#key/valueSerde without doing some refactoring of the downstream logic. For example see how things get passed through WrappingNullableUtils#prepareSerde(specificSerde, contextkeySerde, contextValueSerde) So, it's possible but it just might involve more work. Maybe not "Everywhere serdes are used" painful, though

          Thanks for looking into the code – I think a corresponding refactoring would still be simpler than adding checks all over the place... And I feel very strong about having a check and throw an informative exception.

          lct45 the refactoring might be tricky so I would propose to do multiple PR for the refactoring upfront, before we do the actual PR to set default to `null` and add the check.

          mjsax Matthias J. Sax added a comment - Thanks for looking into the code – I think a corresponding refactoring would still be simpler than adding checks all over the place... And I feel very strong about having a check and throw an informative exception. lct45  the refactoring might be tricky so I would propose to do multiple PR for the refactoring upfront, before we do the actual PR to set default to `null` and add the check.

          Definitely +1 to refactoring, I found it pretty confusing to track down how this works at the moment. And yes, given how frequent of an issue the default serdes can be, we should absolutely throw an informative exception and not just an NPE – whatever form that may take

          ableegoldman A. Sophie Blee-Goldman added a comment - Definitely +1 to refactoring, I found it pretty confusing to track down how this works at the moment. And yes, given how frequent of an issue the default serdes can be, we should absolutely throw an informative exception and not just an NPE – whatever form that may take
          lct45 Leah Thomas added a comment -

          mjsax are you saying refactoring the code where we get the default serde and maybe/maybe not use it? I definitely agree that we want a clear error indicating what's going wrong, but I think we may be able to achieve that still with the "poison serde" idea that Sophie floated. Was there a reason you preferred the null check over the poison serde?

          lct45 Leah Thomas added a comment - mjsax  are you saying refactoring the code where we get the default serde and maybe/maybe not use it? I definitely agree that we want a clear error indicating what's going wrong, but I think we may be able to achieve that still with the "poison serde" idea that Sophie floated. Was there a reason you preferred the null check over the poison serde?

          refactoring the code where we get the default serde and maybe/maybe not use it?

          Yes, that was my proposal, ie, only try to get the default serde if we really want to use it.

          Was there a reason you preferred the null check over the poison serde?

          IMHO, it would add an (unnecessary?) layer of indirection and thus making the code more complex? To me, it seems cleaner for follow the refactoring idea – it's of course only my personal opinion.

          mjsax Matthias J. Sax added a comment - refactoring the code where we get the default serde and maybe/maybe not use it? Yes, that was my proposal, ie, only try to get the default serde if we really want to use it. Was there a reason you preferred the null check over the poison serde? IMHO, it would add an (unnecessary?) layer of indirection and thus making the code more complex? To me, it seems cleaner for follow the refactoring idea – it's of course only my personal opinion.
          lct45 Leah Thomas added a comment - https://github.com/apache/kafka/pull/10813

          People

            lct45 Leah Thomas
            ableegoldman A. Sophie Blee-Goldman
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: