PIG-1781.3.patch, supersedes PIG-1781.2.patch
PIG-1781.2.patch and PIG-1781.3.patch:
- Changed ISOHelper's dateTime() to dateTimeParser(), which more closely matches the pre-patch behavior of accepting a date or time or dateTime.
- Renamed parseDate() to parseDateTime(), and renamed corresponding tests.
- Added a public constant ISOHelper#DEFAULT_DATE_TIME_ZONE to advertise the fact that ISOHelper has its own default time zone for parsing.
- Added code to ISOHelper#parseDateTime to use UTC as default time zone for parsing ambiguous dates.
- Added code to save/restore the System's default time zone. *This is a change in behavior from the pre-patch code.*
- Added unit tests:
  - to illustrate various corner cases of date/time/dateTime and UTC/other-time-zone/no-time-zone parsing;
  - to illustrate how default time zones are managed.
- Ran 'svn diff' at base of pig tree.
"ant clean test" in contrib directory succeeded.
I hope this version addresses everyone's concerns, but let me know if more changes are needed.
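To make the parsing rule above concrete, here is a minimal sketch of "use the time zone in the input if there is one, otherwise fall back to UTC". It is not the patch code: it uses the JDK's java.time classes as a stand-in for the library ISOHelper wraps, the class name LenientIsoParseSketch is hypothetical, and it only handles date and dateTime inputs (time-only input is left out to keep it short).

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

// Hypothetical stand-in for ISOHelper; not the code in the patch.
final class LenientIsoParseSketch {
    // Mirrors the idea of ISOHelper#DEFAULT_DATE_TIME_ZONE: a parser-level default.
    static final ZoneOffset DEFAULT_DATE_TIME_ZONE = ZoneOffset.UTC;

    // Accepts "yyyy-MM-dd", optionally followed by "THH:mm[:ss]" and/or an offset.
    private static final DateTimeFormatter PARSER = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_LOCAL_DATE)
            .optionalStart().appendLiteral('T').append(DateTimeFormatter.ISO_LOCAL_TIME).optionalEnd()
            .optionalStart().appendOffsetId().optionalEnd()
            .toFormatter();

    static OffsetDateTime parseDateTime(String text) {
        TemporalAccessor parsed = PARSER.parse(text);
        LocalDate date = LocalDate.from(parsed);
        // Both queries return null when the input did not carry that part.
        LocalTime time = parsed.query(TemporalQueries.localTime());
        ZoneOffset offset = parsed.query(TemporalQueries.offset());
        return OffsetDateTime.of(
                date,
                time != null ? time : LocalTime.MIDNIGHT,
                // Ambiguous input (no offset) gets the parser default, never the JVM default.
                offset != null ? offset : DEFAULT_DATE_TIME_ZONE);
    }

    public static void main(String[] args) {
        System.out.println(parseDateTime("2011-01-15"));                // no time, no zone
        System.out.println(parseDateTime("2011-01-15T10:30:00"));       // no zone
        System.out.println(parseDateTime("2011-01-15T10:30:00-08:00")); // explicit offset kept
    }
}
```

The point of the sketch is that the JVM's default time zone is never consulted, so the same input string yields the same instant on every machine.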
Regarding the general issue of time zones, I see the options for behavior as follows:
(A) Use the System default time zone;
(B) Use ISOHelper's preferred time zone (UTC) to parse ambiguous times, but don't touch System's default time zone;
(C) Use ISOHelper's preferred time zone (UTC) to parse ambiguous times, and mutate the System time zone to match;
(D) Let the Pig User set a time zone for parsing ambiguous dates, independently of the System default time zone, with a parser default to either UTC or System's default.
The pre-patch 0.8.0 code behaves as option (C). This patch changes the behavior to option (B), with a tiny, unavoidable race condition. (I think option (D) would be best, but that is more work that no one has requested yet.)
Let me know if you disagree with my choice.
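For anyone wondering where the race condition comes from, here is a minimal sketch of a save/restore pattern, assuming the default has to be swapped temporarily around some piece of work. It is an illustration only, not the patch code: it uses java.util.TimeZone from the JDK, and the names DefaultZoneGuardSketch and withDefaultZone are hypothetical.

```java
import java.util.TimeZone;
import java.util.function.Supplier;

// Hypothetical illustration of save/restore; not the code in the patch.
final class DefaultZoneGuardSketch {
    // Runs 'work' with 'temporary' as the JVM default zone, then restores the old default.
    static <T> T withDefaultZone(TimeZone temporary, Supplier<T> work) {
        TimeZone saved = TimeZone.getDefault();   // save
        TimeZone.setDefault(temporary);           // race window opens here
        try {
            return work.get();
        } finally {
            TimeZone.setDefault(saved);           // restore; race window closes
        }
    }

    public static void main(String[] args) {
        String during = withDefaultZone(TimeZone.getTimeZone("UTC"),
                () -> TimeZone.getDefault().getID());
        System.out.println("default during work: " + during);
        System.out.println("default afterwards:  " + TimeZone.getDefault().getID());
    }
}
```

The default is global to the JVM, so any other thread that reads it between the set and the restore sees the temporary value. Option (C) would correspond to never restoring; option (B) restores, which leaves only that short window.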
(I think the best practice is for Pig users to assign time zones as early as possible in the data processing pipeline, and to keep dates tagged with time zones throughout the pipeline.)