  1. Spark
  2. SPARK-8159

Improve expression function coverage (Spark 1.5)

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels: None

      Description

      This is an umbrella ticket to track new expressions we are adding to SQL/DataFrame.

      For each new expression, we should:
      1. Add a new Expression implementation in org.apache.spark.sql.catalyst.expressions
      2. If applicable, implement the code-generated version (by implementing genCode).
      3. Add comprehensive unit tests covering every data type the expression supports.
      4. If applicable, add a new function for DataFrame in org.apache.spark.sql.functions, and python/pyspark/sql/functions.py for Python.
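
The first three steps can be illustrated with a toy model in plain Python. This is only a sketch of the pattern, not Catalyst's actual API: the `Expression`, `BoundReference`, and `Degrees` classes, the `null_safe` helper, and the `gen_code`/`compile_expression`/`check_both_paths` names are all hypothetical stand-ins. It shows an expression with an interpreted path (`eval`), a code-generated path (`gen_code`), and a check that both paths agree, including on a null input.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence

Row = Sequence[Any]


def null_safe(fn: Callable[[Any], Any], value: Any) -> Any:
    """Return None for a None input, otherwise fn(value)."""
    return None if value is None else fn(value)


class Expression:
    """Hypothetical base class: one interpreted path, one generated path."""

    def eval(self, row: Row) -> Any:
        raise NotImplementedError

    def gen_code(self) -> str:
        raise NotImplementedError


@dataclass(frozen=True)
class BoundReference(Expression):
    """Reads the column at a fixed position in the input row."""
    ordinal: int

    def eval(self, row: Row) -> Any:
        return row[self.ordinal]

    def gen_code(self) -> str:
        return f"row[{self.ordinal}]"


@dataclass(frozen=True)
class Degrees(Expression):
    """Toy version of a unary math expression (radians to degrees)."""
    child: Expression

    def eval(self, row: Row) -> Any:
        # Interpreted path: evaluate the child, then apply the function.
        return null_safe(math.degrees, self.child.eval(row))

    def gen_code(self) -> str:
        # Generated path: emit source text instead of computing a value.
        return f"null_safe(math.degrees, {self.child.gen_code()})"


def compile_expression(expr: Expression) -> Callable[[Row], Any]:
    """Turn the generated source into a callable taking a row."""
    source = f"lambda row: {expr.gen_code()}"
    return eval(source, {"math": math, "null_safe": null_safe})


def check_both_paths(expr: Expression, rows: Sequence[Row]) -> List[Any]:
    """Assert that the interpreted and generated paths agree on every row."""
    compiled = compile_expression(expr)
    results = []
    for row in rows:
        interpreted = expr.eval(row)
        generated = compiled(row)
        assert interpreted == generated, (row, interpreted, generated)
        results.append(interpreted)
    return results
```

Calling `check_both_paths(Degrees(BoundReference(0)), [(0.0,), (math.pi,), (None,)])` exercises both paths on an ordinary value and on a null; a real suite would repeat this for each supported data type.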

      For date/time functions, put them in expressions/datetime.scala, and create a DateTimeFunctionSuite.scala for testing.
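
As a rough illustration of the kind of checks such a suite might contain, here is a plain-Python sketch using `unittest`. It is not Spark's implementation or test code: the `last_day` function below is a hypothetical stand-in written with the standard library, and the class name only mirrors the suite name mentioned above. The cases cover an ordinary date, February in leap and non-leap years, and a null input.

```python
import calendar
import unittest
from datetime import date
from typing import Optional


def last_day(d: Optional[date]) -> Optional[date]:
    """Hypothetical stand-in: last day of the month containing d (None in, None out)."""
    if d is None:
        return None
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


class DateTimeFunctionSuite(unittest.TestCase):
    def test_last_day_regular_month(self):
        self.assertEqual(last_day(date(2015, 7, 27)), date(2015, 7, 31))

    def test_last_day_february(self):
        # Leap year and non-leap year boundaries.
        self.assertEqual(last_day(date(2016, 2, 1)), date(2016, 2, 29))
        self.assertEqual(last_day(date(2015, 2, 1)), date(2015, 2, 28))

    def test_last_day_null_input(self):
        # Null inputs should propagate rather than raise.
        self.assertIsNone(last_day(None))
```

The suite can be run with `python -m unittest` from the directory containing the file.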

          Issue Links

          1. date/time function: unix_timestamp | Sub-task | Resolved | Adrian Wang
          2. date/time function: from_unixtime | Sub-task | Resolved | Adrian Wang
          3. date/time function: to_date | Sub-task | Resolved | Adrian Wang
          4. date/time function: year | Sub-task | Resolved | Tarek Auel
          5. date/time function: quarter | Sub-task | Resolved | Tarek Auel
          6. date/time function: month | Sub-task | Resolved | Tarek Auel
          7. date/time function: day / dayofmonth | Sub-task | Resolved | Tarek Auel
          8. date/time function: hour | Sub-task | Resolved | Tarek Auel
          9. date/time function: minute | Sub-task | Resolved | Tarek Auel
          10. date/time function: second | Sub-task | Resolved | Tarek Auel
          11. date/time function: weekofyear | Sub-task | Resolved | Tarek Auel
          12. date/time function: datediff | Sub-task | Resolved | Adrian Wang
          13. date/time function: date_add | Sub-task | Resolved | Adrian Wang
          14. date/time function: date_sub | Sub-task | Resolved | Adrian Wang
          15. date/time function: from_utc_timestamp | Sub-task | Resolved | Adrian Wang
          16. date/time function: to_utc_timestamp | Sub-task | Resolved | Adrian Wang
          17. date/time function: current_date | Sub-task | Resolved | Adrian Wang
          18. date/time function: current_timestamp | Sub-task | Resolved | Adrian Wang
          19. date/time function: add_months | Sub-task | Resolved | Adrian Wang
          20. date/time function: last_day | Sub-task | Resolved | Adrian Wang
          21. date/time function: next_day | Sub-task | Resolved | Adrian Wang
          22. date/time function: trunc | Sub-task | Resolved | Adrian Wang
          23. date/time function: months_between | Sub-task | Resolved | Adrian Wang
          24. date/time function: date_format | Sub-task | Resolved | Tarek Auel
          25. conditional function: if | Sub-task | Resolved | Reynold Xin
          26. conditional functions: greatest | Sub-task | Resolved | Adrian Wang
          27. conditional function: least | Sub-task | Resolved | Adrian Wang
          28. conditional function: nvl | Sub-task | Resolved | Reynold Xin
          29. math function: round | Sub-task | Resolved | Yijie Shen
          30. math function: bin | Sub-task | Resolved | Liang-Chi Hsieh
          31. math function: ceiling | Sub-task | Resolved | Reynold Xin
          32. math function: conv | Sub-task | Resolved | zhichao-li
          33. math function: degrees | Sub-task | Resolved | Reynold Xin
          34. math function: radians | Sub-task | Resolved | Reynold Xin
          35. math function: e | Sub-task | Resolved | Adrian Wang
          36. math function: factorial | Sub-task | Resolved | zhichao-li
          37. math function: hex | Sub-task | Resolved | zhichao-li
          38. math function: pi | Sub-task | Resolved | Adrian Wang
          39. math function: rename log -> ln | Sub-task | Resolved | Reynold Xin
          40. math function: log2 | Sub-task | Resolved | Adrian Wang
          41. math function: log | Sub-task | Resolved | Liang-Chi Hsieh
          42. math function: negative | Sub-task | Resolved | Reynold Xin
          43. math function: positive | Sub-task | Resolved | zhichao-li
          44. math function: pmod | Sub-task | Resolved | zhichao-li
          45. math function: alias power / pow | Sub-task | Resolved | Reynold Xin
          46. math function: shiftleft | Sub-task | Resolved | Tarek Auel
          47. math function: shiftright | Sub-task | Resolved | Tarek Auel
          48. math function: alias sign / signum | Sub-task | Resolved | Reynold Xin
          49. math function: shiftrightunsigned | Sub-task | Resolved | zhichao-li
          50. math function: unhex | Sub-task | Resolved | zhichao-li
          51. conditional function: isnull | Sub-task | Resolved | Reynold Xin
          52. conditional function: isnotnull | Sub-task | Resolved | Reynold Xin
          53. complex function: size | Sub-task | Resolved | Pedro Rodriguez
          54. complex function: array_contains | Sub-task | Resolved | Pedro Rodriguez
          55. complex function: sort_array | Sub-task | Resolved | Cheng Hao
          56. misc function: md5 | Sub-task | Resolved | Qian, Shilei
          57. misc function: sha1 / sha | Sub-task | Resolved | Tarek Auel
          58. misc function: crc32 | Sub-task | Resolved | Tarek Auel
          59. misc function: sha2 | Sub-task | Resolved | Liang-Chi Hsieh
          60. string function: ascii | Sub-task | Resolved | Cheng Hao
          61. string function: base64 | Sub-task | Resolved | Cheng Hao
          62. string function: concat | Sub-task | Resolved | Reynold Xin
          63. string function: concat_ws | Sub-task | Resolved | Reynold Xin
          64. string function: decode | Sub-task | Resolved | Cheng Hao
          65. string function: encode | Sub-task | Resolved | Cheng Hao
          66. string function: format_number | Sub-task | Resolved | Cheng Hao
          67. string function: get_json_object | Sub-task | Resolved | Nathan Howell
          68. string function: instr | Sub-task | Resolved | Cheng Hao
          69. string function: length | Sub-task | Resolved | Cheng Hao
          70. string function: locate | Sub-task | Resolved | Cheng Hao
          71. string function: alias lower/lcase | Sub-task | Resolved | Reynold Xin
          72. string function: alias upper / ucase | Sub-task | Resolved | Reynold Xin
          73. string function: lpad | Sub-task | Resolved | Cheng Hao
          74. string function: ltrim | Sub-task | Resolved | Cheng Hao
          75. string function: printf | Sub-task | Resolved | Cheng Hao
          76. string function: regexp_extract | Sub-task | Resolved | Cheng Hao
          77. string function: regexp_replace | Sub-task | Resolved | Cheng Hao
          78. string function: repeat | Sub-task | Resolved | Cheng Hao
          79. string function: reverse | Sub-task | Resolved | Cheng Hao
          80. string function: rpad | Sub-task | Resolved | Cheng Hao
          81. string function: rtrim | Sub-task | Resolved | Cheng Hao
          82. string function: space | Sub-task | Resolved | Cheng Hao
          83. string function: split | Sub-task | Resolved | Cheng Hao
          84. string function: substr/substring should also support binary type | Sub-task | Resolved | Cheng Hao
          85. string function: substring_index | Sub-task | Resolved | Cheng Hao
          86. string function: trim | Sub-task | Resolved | Cheng Hao
          87. string function: unbase64 | Sub-task | Resolved | Cheng Hao
          88. string function: initcap | Sub-task | Resolved | Cheng Hao
          89. string function: levenshtein | Sub-task | Resolved | Tarek Auel
          90. string function: soundex | Sub-task | Resolved | Cheng Hao
          91. udf_round_3 test fails | Sub-task | Resolved | Yijie Shen
          92. udf_struct test failure | Sub-task | Resolved | Yijie Shen
          93. Add unit tests for abs | Sub-task | Resolved | Reynold Xin
          94. Add unit tests for +, -, *, /, % | Sub-task | Resolved | Reynold Xin
          95. Move sqrt into math | Sub-task | Resolved | Liang-Chi Hsieh
          96. improve unit test for MaxOf and MinOf | Sub-task | Resolved | Wenchen Fan
          97. complex type constructors: struct and named_struct | Sub-task | Resolved | Yijie Shen
          98. Remove e and pi from DataFrame functions | Sub-task | Resolved | Reynold Xin
          99. Add python API for hex/unhex | Sub-task | Resolved | Davies Liu
          100. Date/time function and data type design | Sub-task | Resolved | Reynold Xin
          101. Improve unit test coverage for bitwise expressions | Sub-task | Resolved | Reynold Xin
          102. MonotonicallyIncreasingID and SparkPartitionID should be marked as nondeterministic | Sub-task | Resolved | Reynold Xin
          103. Improve unit test coverage for null expressions | Sub-task | Resolved | Reynold Xin
          104. use UTC Calendar in `stringToDate` | Sub-task | Resolved | Wenchen Fan
          105. add and improve tests for nondeterministic expressions | Sub-task | Resolved | Wenchen Fan
          106. Add unit test for null inputs for date functions | Sub-task | Resolved | Yijie Shen
          107. Add test cases for null inputs for expression unit tests | Sub-task | Resolved | Yijie Shen
          108. Audit expression unit tests to make sure we pass the proper numeric ranges | Sub-task | Resolved | Yijie Shen
          109. Audit expression unit tests to test for non-foldable codegen path | Sub-task | Resolved | Yijie Shen
          110. Add TernaryExpression to simplify implementations | Sub-task | Resolved | Davies Liu
          111. Create Python API for all SQL functions | Sub-task | Resolved | Davies Liu
          112. DateTimeUtils cleanup | Sub-task | Resolved | Yijie Shen
          113. minor bug fix in expressions | Sub-task | Resolved | Yijie Shen

              People

              • Assignee: Reynold Xin (rxin)
              • Reporter: Reynold Xin (rxin)
              • Votes: 0
              • Watchers: 22