In https://issues.apache.org/jira/browse/KAFKA-12602, we manually constructed a correct license file for 2.8.0. This file will certainly become wrong again in later releases, so we need to write some kind of script to automate a check.
It crossed my mind to automate the generation of the file, but that seems intractable: each dependency may change its license, and may package its license file, link to it from its pom, or link to it from its repo. I've also found multiple URLs listed with various delimiters, broken links that I have to chase down, etc.
Therefore, it seems like the solution to aim for is simply: list all the jars that we package, and print out a report of each jar that's extra or missing vs. the ones in our `LICENSE-binary` file.
The check should be part of the release script at least, if not part of the regular build (so we keep it up to date as dependencies change).
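As a rough illustration of the comparison described above, here is a minimal Python sketch. It is not an existing Kafka script: the function names, the `ENTRY` pattern, and the `libs_dir` argument are all hypothetical. It assumes each dependency appears in `LICENSE-binary` on its own line as the jar's file name minus `.jar` (for example `commons-cli-1.4`); if the file uses a different layout, the parsing would have to change.

```python
import re
from pathlib import Path

# Assumption: a license entry looks like "name-1.2.3", i.e. the jar file name
# without its ".jar" suffix. This pattern is a guess, not a documented format.
ENTRY = re.compile(r"^[A-Za-z0-9_.+-]+-\d[A-Za-z0-9_.+-]*$")


def packaged_jars(libs_dir):
    """Return the base names (no .jar suffix) of every jar under libs_dir."""
    return {p.stem for p in Path(libs_dir).rglob("*.jar")}


def licensed_entries(license_text):
    """Return the first token of each line that looks like a name-version entry."""
    entries = set()
    for line in license_text.splitlines():
        tokens = line.split()
        if tokens and ENTRY.match(tokens[0]):
            entries.add(tokens[0])
    return entries


def compare(jars, entries):
    """Return (missing, extra): jars absent from the license file, and
    license entries that no longer match a packaged jar."""
    missing = sorted(jars - entries)
    extra = sorted(entries - jars)
    return missing, extra


def report(libs_dir, license_path):
    """Print the report and return True when the two lists agree."""
    jars = packaged_jars(libs_dir)
    entries = licensed_entries(Path(license_path).read_text())
    missing, extra = compare(jars, entries)
    for name in missing:
        print(f"missing from license file: {name}")
    for name in extra:
        print(f"listed but not packaged: {name}")
    return not missing and not extra
```

A release script could call `report(...)` on the unpacked binary distribution and abort when it returns `False`. The project's own jars would show up as "missing" and would need to be filtered out first, and the pattern may also pick up stray tokens that are not dependencies, so treat the output as a prompt for manual review rather than a verdict.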
Here's how I do this manually right now: