[SPARK-17963] Add examples (extend) in each function and improve documentation - ASF JIRA

Details

Type: Documentation
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.1.0
Component/s: SQL
Labels:
None

Description

Currently, function documentation seems inconsistent, and most functions have no examples (extended usage).

For example, some functions have bad indentation, as below:

spark-sql> DESCRIBE FUNCTION EXTENDED approx_count_distinct;
Function: approx_count_distinct
Class: org.apache.spark.sql.catalyst.expressions.aggregate.HyperLogLogPlusPlus
Usage: approx_count_distinct(expr) - Returns the estimated cardinality by HyperLogLog++.
    approx_count_distinct(expr, relativeSD=0.05) - Returns the estimated cardinality by HyperLogLog++
      with relativeSD, the maximum estimation error allowed.

Extended Usage:
No example for approx_count_distinct.

spark-sql> DESCRIBE FUNCTION EXTENDED count;
Function: count
Class: org.apache.spark.sql.catalyst.expressions.aggregate.Count
Usage: count(*) - Returns the total number of retrieved rows, including rows containing NULL values.
    count(expr) - Returns the number of rows for which the supplied expression is non-NULL.
    count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.
Extended Usage:
No example for count.

whereas others are formatted nicely:

spark-sql> DESCRIBE FUNCTION EXTENDED percentile_approx;
Function: percentile_approx
Class: org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile
Usage:
      percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
      column `col` at the given percentage. The value of percentage must be between 0.0
      and 1.0. The `accuracy` parameter (default: 10000) is a positive integer literal which
      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
      better accuracy, `1.0/accuracy` is the relative error of the approximation.

      percentile_approx(col, array(percentage1 [, percentage2]...) [, accuracy]) - Returns the approximate
      percentile array of column `col` at the given percentage array. Each value of the
      percentage array must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is
       a positive integer literal which controls approximation accuracy at the cost of memory.
       Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of
       the approximation.

Extended Usage:
No example for percentile_approx.

Also, spacing is inconsistent in several places, for example, FUNC(a,b) and FUNC(a, b) (note the space between arguments).

It'd be nicer if most of them had a good example with possible argument types.
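One way to spot the argument-spacing inconsistency is a small check over the usage strings. The following is only a sketch: `find_tight_commas` and `normalize_commas` are hypothetical helper names, not part of Spark, and the rule (every comma is followed by whitespace) is an assumption that would also flag commas inside prose or numeric literals.

```python
import re
from typing import List

# Assumption: a "tight" comma is one immediately followed by a non-space character.
_TIGHT_COMMA = re.compile(r",(?=\S)")


def find_tight_commas(usage: str) -> List[int]:
    """Return the character offsets of commas that are not followed by whitespace."""
    return [match.start() for match in _TIGHT_COMMA.finditer(usage)]


def normalize_commas(usage: str) -> str:
    """Insert a single space after every comma that lacks one."""
    return _TIGHT_COMMA.sub(", ", usage)


if __name__ == "__main__":
    for text in ["FUNC(a,b)", "FUNC(a, b)"]:
        print(text, find_tight_commas(text), normalize_commas(text))
```

Running the check on both spellings reports an offset only for the first one, and normalizing maps both to the `FUNC(a, b)` form.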

The suggested format for multi-line usage is as below:

spark-sql> DESCRIBE FUNCTION EXTENDED rand;
Function: rand
Class: org.apache.spark.sql.catalyst.expressions.Rand
Usage:
      rand() - Returns a random column with i.i.d. uniformly distributed values in [0, 1].
        seed is given randomly.

      rand(seed) - Returns a random column with i.i.d. uniformly distributed values in [0, 1].
        seed should be an integer/long/NULL literal.

Extended Usage:
> SELECT rand();
 0.9629742951434543
> SELECT rand(0);
 0.8446490682263027
> SELECT rand(NULL);
 0.8446490682263027

For single-line usage:

spark-sql> DESCRIBE FUNCTION EXTENDED date_add;
Function: date_add
Class: org.apache.spark.sql.catalyst.expressions.DateAdd
Usage: date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Extended Usage:
> SELECT date_add('2016-07-30', 1);
 '2016-07-31'
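The two suggested layouts can be summarized as a small rendering rule. The sketch below is illustrative only: `render_description` is a hypothetical helper, not Spark's actual implementation, and it assumes each usage entry fits on one line (continuation lines such as the `seed` notes in the `rand` example are not handled).

```python
from typing import List, Tuple


def render_description(
    name: str,
    cls: str,
    usages: List[str],
    examples: List[Tuple[str, str]],
) -> str:
    """Render a function description in the layout suggested above."""
    lines = [f"Function: {name}", f"Class: {cls}"]
    if len(usages) == 1:
        # Single usage: keep it on the same line as the "Usage:" label.
        lines.append(f"Usage: {usages[0]}")
    else:
        # Multiple usages: one indented entry each, separated by blank lines.
        lines.append("Usage:")
        for index, usage in enumerate(usages):
            if index > 0:
                lines.append("")
            lines.append("      " + usage)
        lines.append("")
    lines.append("Extended Usage:")
    for query, result in examples:
        # Each example is the query prefixed with "> ", then the result indented by one space.
        lines.append(f"> {query}")
        lines.append(f" {result}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        render_description(
            "date_add",
            "org.apache.spark.sql.catalyst.expressions.DateAdd",
            ["date_add(start_date, num_days) - Returns the date that is num_days after start_date."],
            [("SELECT date_add('2016-07-30', 1);", "'2016-07-31'")],
        )
    )
```

Called with the `date_add` values, this reproduces the single-line layout shown above; passing two or more usage strings switches to the indented multi-line layout.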

Issue Links

is duplicated by

SPARK-17940 Typo in LAST function error message

Resolved

is related to

SPARK-17940 Typo in LAST function error message

Resolved

links to

[Github] Pull Request #15513 (HyukjinKwon)

[Github] Pull Request #15677 (HyukjinKwon)

People

Assignee: Hyukjin Kwon

Reporter: Hyukjin Kwon

Votes: 0

Watchers: 3

Dates

Created: 17/Oct/16 01:57

Updated: 12/Dec/22 18:11

Resolved: 03/Nov/16 03:57