Pig / PIG-1430

ISODateTime -> DateTime: DateTime UDFs Should Also Support int/second Unix Times in All Operations

    Details

    • Tags:
      datetimes in pig unix time iso format integer second

      Description

      All functions in contrib.piggybank.java.src.main.java.org.apache.pig.piggybank.evaluation.datetime should seamlessly accept integer Unix/POSIX times, and return Unix time output when given an int, and ISO output when given a chararray.

      Note: Unix/POSIX times are the number of seconds elapsed since midnight proleptic Coordinated Universal Time (UTC) of January 1, 1970, not counting leap seconds. See http://en.wikipedia.org/wiki/Unix_time
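To make the definition above concrete, here is a minimal sketch of converting between Unix seconds and ISO 8601 strings. It uses the JDK's `java.time` API purely for illustration; it is not the piggybank implementation, and the class and method names (`UnixIsoSketch`, `unixToIso`, `isoToUnix`) are hypothetical.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

// Hypothetical helper for illustration only; not part of piggybank.
final class UnixIsoSketch {

    // Unix seconds -> ISO 8601 string, always rendered in UTC ("Z").
    static String unixToIso(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).toString();
    }

    // ISO 8601 string with an offset or "Z" -> Unix seconds.
    // Sub-second precision, if present, is dropped.
    static long isoToUnix(String iso) {
        return OffsetDateTime.parse(iso).toInstant().getEpochSecond();
    }
}
```

Under these assumptions, the epoch itself is second 0 and each day adds 86400 seconds, with no leap seconds counted.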

        Activity

        Russell Jurney added a comment -

        Actually, I think it should interpret int as Unix time in seconds, and long as Unix time in milliseconds. Thoughts?

        Dmitriy V. Ryaboy added a comment -

        I think int vs long doing different things is dangerous.

        iirc ISO specifies to the second, so getting an int or a long should assume seconds.

        I suggest implementing a duplicate set of functions for millisecond timestamps that actually have "Ms" in the name (and, of course, a set of sec-to-millisec, millisec-to-sec, and both-to-ISO conversions).

        -D
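A rough sketch of what the suggestion above could look like: seconds and milliseconds get separately named functions rather than being distinguished by int versus long. All names here (`MsConversionsSketch`, `secToMs`, `msToSec`, `unixToIso`, `unixMsToIso`) are hypothetical, and `java.time` is used only as a convenient stand-in.

```java
import java.time.Instant;

// Hypothetical names for illustration; the "Ms" marker makes the unit explicit.
final class MsConversionsSketch {

    // Seconds -> milliseconds; throws ArithmeticException on overflow.
    static long secToMs(long seconds) {
        return Math.multiplyExact(seconds, 1000L);
    }

    // Milliseconds -> seconds, rounding toward negative infinity so that
    // timestamps before 1970 land in the correct second.
    static long msToSec(long millis) {
        return Math.floorDiv(millis, 1000L);
    }

    // Seconds -> ISO 8601 (UTC).
    static String unixToIso(long seconds) {
        return Instant.ofEpochSecond(seconds).toString();
    }

    // Milliseconds -> ISO 8601 (UTC), keeping the fractional second.
    static String unixMsToIso(long millis) {
        return Instant.ofEpochMilli(millis).toString();
    }
}
```

With this shape, the argument's Java type never changes how a value is interpreted; only the function name does.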

        Russell Jurney added a comment -

        Good idea, will do!

        Russell Jurney added a comment -

        I've been thinking about the feedback at the contributors meeting Monday. I propose that we postpone the addition of a full datetime type (PIG-1314) in favor of the builtins described below. This change is easy, and I can do it immediately and get it into 0.8. The original proposal is quite hard, and I can't really estimate when I could have it completed. I'm not sure we need it. There are many other more important things I would rather do.

        I'd like to remove the piggybank classes org.apache.pig.piggybank.evaluation.datetime.* or at least deprecate them.

        I'd like to add the following builtins, which act on both ISO8601 datetime strings and long Unix times. These could be made into many functions each, but I'd prefer to keep them as short as possible. I suggest we mirror the Oracle date/time functions when possible: http://psoug.org/reference/date_func.html

        • Units

        When listed below, units are defined as one of:

        YEAR
        MONTH
        WEEK
        DAY
        HOUR
        MINUTE
        SECOND

        • Truncations

        TRUNC(date, unit) or TRUNC_DATE(date, unit)

        long/epoch input returns long/epoch output.
        ISO8601 string input returns ISO8601 datetime output.

        • Dates to durations

        DURATION(date, unit)

        long/epoch input returns long output in the unit specified.
        ISO8601 input returns an ISO8601 duration

        • Adding/subtracting durations and dates: use longs.
        • Utilities

        CURRENT_ISOTIME
        CURRENT_UNIXTIME
        ISOTOUNIX
        UNIXTOISO

        The only ugly part of this is that ISO times are second-class citizens in that they cannot be added/subtracted. I'm prepared to live with that.
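As an illustration of the type-preserving behavior proposed for TRUNC (long in, long out; ISO8601 string in, ISO8601 string out), here is a minimal sketch. It is not a Pig builtin: the class `TruncSketch` is hypothetical, the long is assumed to be Unix seconds, truncation is assumed to happen in UTC, and WEEK is assumed to start on Monday. None of those choices are settled in the proposal.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

// Hypothetical sketch of TRUNC(date, unit); not the proposed builtin itself.
final class TruncSketch {

    // The units listed in the proposal.
    enum Unit { YEAR, MONTH, WEEK, DAY, HOUR, MINUTE, SECOND }

    // Shared truncation logic. Assumption: weeks start on Monday.
    private static ZonedDateTime truncate(ZonedDateTime t, Unit unit) {
        ZonedDateTime day = t.truncatedTo(ChronoUnit.DAYS);
        switch (unit) {
            case YEAR:   return day.withDayOfYear(1);
            case MONTH:  return day.withDayOfMonth(1);
            case WEEK:   return day.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            case DAY:    return day;
            case HOUR:   return t.truncatedTo(ChronoUnit.HOURS);
            case MINUTE: return t.truncatedTo(ChronoUnit.MINUTES);
            default:     return t.truncatedTo(ChronoUnit.SECONDS);
        }
    }

    // long/epoch input returns long/epoch output (assumed to be seconds, in UTC).
    static long trunc(long epochSeconds, Unit unit) {
        ZonedDateTime t = Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC);
        return truncate(t, unit).toEpochSecond();
    }

    // ISO8601 string input returns ISO8601 string output (normalized to UTC).
    static String trunc(String iso, Unit unit) {
        ZonedDateTime t = OffsetDateTime.parse(iso).atZoneSameInstant(ZoneOffset.UTC);
        return truncate(t, unit).toInstant().toString();
    }
}
```

Overloading on the argument type keeps a single function name per operation, which matches the stated preference for a short list of builtins.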

        Alan Gates added a comment -

        I think it's fine to start with just putting conversion functions into Pig Latin. What I'd like to clarify, though, is what the desired end state is. Does Pig eventually have a datetime type that does all the datetime stuff you can dream of (timezones, etc.)? Or does Pig only ever have longs or strings to represent times and a set of functions to work with those? Are you proposing the latter, or delaying the former in the interest of getting something into 0.8?

        Olga Natkovich added a comment -

        Unlinking from the release since nobody signed up for the work. Feel free to link it back in if you are interested in working on the JIRA.


          People

          • Assignee: Unassigned
          • Reporter: Russell Jurney
          • Votes: 0
          • Watchers: 2
