[CODEC-280] Base32/64 to allow optional strict/lenient decoding - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.14
Fix Version/s: 1.15
Labels: None

Description

Base32 decodes blocks of 8 characters.

Base64 decodes blocks of 4 characters.

At the end of decoding, some extra characters may be left over that do not fill a complete block. They are decoded using the appropriate bits. The bits that do not sum to form a byte (i.e. fewer than 8 bits) are discarded.
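The arithmetic behind these leftover bits can be sketched as follows. This is an illustrative helper, not code from Commons Codec; the class and method names (`LeftoverBits`, `wholeBytes`, `leftoverBits`) are hypothetical. It relies only on the fact that each Base64 character carries 6 bits and each Base32 character carries 5 bits.

```java
// Hypothetical helper (not part of Commons Codec): counts the bytes and
// leftover bits produced by the trailing characters of a partial block.
final class LeftoverBits {
    static final int BASE64_BITS_PER_CHAR = 6;
    static final int BASE32_BITS_PER_CHAR = 5;

    private LeftoverBits() {
    }

    /** Number of complete bytes that the trailing characters can form. */
    static int wholeBytes(int trailingChars, int bitsPerChar) {
        return (trailingChars * bitsPerChar) / 8;
    }

    /** Number of bits that remain after the complete bytes are extracted. */
    static int leftoverBits(int trailingChars, int bitsPerChar) {
        return (trailingChars * bitsPerChar) % 8;
    }
}
```

For Base64, 1, 2, or 3 trailing characters leave 6, 4, or 2 leftover bits respectively, and a single trailing character yields no bytes at all.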

Currently, if there are more than 8 bits left, the available bytes are extracted and the leftover bits are validated to check that they are zero. If they are not zero, an exception is raised. This functionality was added to ensure that a byte array that is decoded will be re-encoded to the exact same byte array (ignoring input padding).

There are two issues:

If there are fewer than 8 leftover bits, no attempt can be made to obtain the last bytes. However, no exception is raised to indicate that the encoding was invalid (no leftover bits should be unaccounted for).
Raising exceptions for leftover bits is causing reports from users that Codec is not working as it used to. This is true, but only because the user has some badly encoded bytes they want to decode. Since other libraries allow this, it seems that two options for decoding are required.

I suggest changing the decoding so that it operates in two modes: strict and lenient.

Strict will throw an exception whenever there are unaccounted for bits.
Lenient will just discard the extra bits that cannot be used.
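The difference between the two modes can be illustrated with a minimal sketch that decodes only the final partial Base64 block. This is not the Commons Codec implementation; `FinalBlockSketch` and `decodeFinalBlock` are hypothetical names, and the sketch assumes that strict mode also rejects a lone trailing character, since its 6 bits can never form a byte.

```java
// Hypothetical sketch (not the Commons Codec implementation) of decoding
// the last 1 to 3 Base64 characters in lenient or strict mode.
final class FinalBlockSketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    private FinalBlockSketch() {
    }

    static byte[] decodeFinalBlock(String trailing, boolean strict) {
        if (trailing.isEmpty() || trailing.length() > 3) {
            throw new IllegalArgumentException("Expected 1 to 3 trailing characters");
        }
        // Accumulate 6 bits per character.
        int bits = 0;
        for (int i = 0; i < trailing.length(); i++) {
            int value = ALPHABET.indexOf(trailing.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + trailing.charAt(i));
            }
            bits = (bits << 6) | value;
        }
        int totalBits = trailing.length() * 6;
        int byteCount = totalBits / 8;
        int leftover = totalBits % 8;
        int leftoverMask = (1 << leftover) - 1;
        // Strict: reject bits that cannot be accounted for by a valid encoding.
        if (strict && (byteCount == 0 || (bits & leftoverMask) != 0)) {
            throw new IllegalArgumentException("Strict decoding: unaccounted-for trailing bits");
        }
        // Lenient (or valid input): discard the leftover bits, keep whole bytes.
        bits >>>= leftover;
        byte[] out = new byte[byteCount];
        for (int i = byteCount - 1; i >= 0; i--) {
            out[i] = (byte) bits;
            bits >>>= 8;
        }
        return out;
    }
}
```

In this sketch, `"RQ"` decodes to the single byte `'E'` in both modes, whereas `"RR"` decodes to `'E'` in lenient mode and throws in strict mode because its 4 leftover bits are not zero.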

Lenient is the default for backward compatibility, restoring the behaviour of the class to that of versions prior to 1.13.

Strict is enabled using a method:

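Base64 codec = new Base64();
byte[] bytes = new byte[] { 'E' };
// Lenient (default): the 6 trailing bits cannot form a byte and are discarded
Assertions.assertArrayEquals(new byte[0], codec.decode(bytes));
codec.setStrictDecoding(true);
// Strict: the unaccounted-for bits raise an exception
Assertions.assertThrows(IllegalArgumentException.class, () -> codec.decode(bytes));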

Using strict decoding should ensure that a round trip returns the same bytes:

byte[] bytes = ...; // Some valid encoding with no padding characters
Base64 codec = new Base64();
codec.setStrictDecoding(true);
Assertions.assertArrayEquals(bytes, codec.encode(codec.decode(bytes)));
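The round-trip property can also be checked without Commons Codec. The sketch below uses the JDK's `java.util.Base64` (a different class from the `Base64` codec discussed in this issue) purely so that it is self-contained; `RoundTripSketch` and `roundTrips` are hypothetical names. An input counts as round-tripping only if decoding succeeds and re-encoding without padding reproduces the input exactly.

```java
import java.util.Base64;

// Hypothetical sketch using java.util.Base64, not the Commons Codec class.
final class RoundTripSketch {
    private RoundTripSketch() {
    }

    /** Returns true if decode followed by unpadded encode reproduces the input. */
    static boolean roundTrips(String encoded) {
        try {
            byte[] decoded = Base64.getDecoder().decode(encoded);
            String reencoded = Base64.getEncoder().withoutPadding().encodeToString(decoded);
            return reencoded.equals(encoded);
        } catch (IllegalArgumentException e) {
            // A decoder that rejects the input cannot round trip it.
            return false;
        }
    }
}
```

For example, `roundTrips("RQ")` is true, while `roundTrips("RR")` is false: the non-zero trailing bits of `"RR"` cannot survive a decode and re-encode, which is the kind of input strict decoding is meant to reject up front.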

Issue Links

is depended upon by

CODEC-289 Base32/64Input/OutputStream to allow optional strict/lenient decoding

Closed

is related to

CODEC-134 Base32 would decode some invalid Base32 encoded string into arbitrary value

Resolved

CODEC-279 Base64.decode fails on Java11 for certain valid base 64 encoded String

Resolved

CODEC-270 Base32 and Base64 still allow decoding some invalid trailing characters

Resolved

relates to

CODEC-263 Base64.decodeBase64 throw exception

Resolved

links to

GitHub Pull Request #35

People

Assignee: Alex Herbert

Reporter: Alex Herbert

Votes: 0

Watchers: 3

Dates

Created: 21/Jan/20 13:25

Updated: 11/Feb/21 18:13

Resolved: 28/Jan/20 15:20

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

20m