Details
- Documentation
- Status: Resolved
- Major
- Resolution: Fixed
- 2.3.0
- None
Description
It looks like we can generate Spark's SQL function documentation from ExpressionDescription and ExpressionInfo.
I had some time to play with this so I just made a rough version - https://spark-test.github.io/sparksqldoc/
The code I used is below.
In pyspark shell:
from collections import namedtuple

ExpressionInfo = namedtuple("ExpressionInfo", "className usage name extended")

jinfos = spark.sparkContext._jvm.org.apache.spark.sql.api.python.PythonSQLUtils.listBuiltinFunctions()
infos = []
for jinfo in jinfos:
    name = jinfo.getName()
    usage = jinfo.getUsage()
    usage = usage.replace("_FUNC_", name) if usage is not None else usage
    extended = jinfo.getExtended()
    extended = extended.replace("_FUNC_", name) if extended is not None else extended
    infos.append(ExpressionInfo(
        className=jinfo.getClassName(),
        usage=usage,
        name=name,
        extended=extended))

with open("index.md", 'w') as mdfile:
    strip = lambda s: "\n".join(map(lambda u: u.strip(), s.split("\n")))
    for info in sorted(infos, key=lambda i: i.name):
        mdfile.write("### %s\n\n" % info.name)
        if info.usage is not None:
            mdfile.write("%s\n\n" % strip(info.usage))
        if info.extended is not None:
            mdfile.write("```%s```\n\n" % strip(info.extended))
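The rendering half of the pyspark snippet can be tried without a Spark session. The following is only a minimal sketch: the helper names (`fill_name`, `strip_lines`, `render_markdown`) and the two sample records are hypothetical stand-ins for the JVM ExpressionInfo objects, not part of Spark.

```python
from collections import namedtuple

ExpressionInfo = namedtuple("ExpressionInfo", "className usage name extended")


def fill_name(text, name):
    """Replace the _FUNC_ placeholder; None is passed through unchanged."""
    return text.replace("_FUNC_", name) if text is not None else None


def strip_lines(text):
    """Strip leading and trailing whitespace from every line."""
    return "\n".join(line.strip() for line in text.split("\n"))


def render_markdown(infos):
    """Render one '### name' section per function, sorted by name."""
    fence = "`" * 3  # Markdown code fence, built here to keep this example readable
    parts = []
    for info in sorted(infos, key=lambda i: i.name):
        parts.append("### %s\n\n" % info.name)
        if info.usage is not None:
            parts.append("%s\n\n" % strip_lines(info.usage))
        if info.extended is not None:
            parts.append("%s%s%s\n\n" % (fence, strip_lines(info.extended), fence))
    return "".join(parts)


# Hypothetical sample records; real ones come from listBuiltinFunctions().
samples = [
    ExpressionInfo(
        className="example.Upper",
        usage=fill_name("_FUNC_(str) - Returns str in upper case.", "upper"),
        name="upper",
        extended=fill_name("\n    Examples:\n      > SELECT _FUNC_('a');\n  ", "upper"),
    ),
    ExpressionInfo(
        className="example.Abs",
        usage=fill_name("_FUNC_(expr) - Returns the absolute value.", "abs"),
        name="abs",
        extended=fill_name(None, "abs"),
    ),
]

markdown = render_markdown(samples)
print(markdown)
```

Returning a string instead of writing straight to index.md makes the output easy to inspect before it is handed to mkdocs.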
This change had to be made before running the pyspark code above:
+++ b/sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala
@@ -17,9 +17,15 @@
 package org.apache.spark.sql.api.python

+import org.apache.spark.sql.catalyst.analysis.FunctionRegistry
+import org.apache.spark.sql.catalyst.expressions.ExpressionInfo
 import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.types.DataType

 private[sql] object PythonSQLUtils {
   def parseDataType(typeText: String): DataType = CatalystSqlParser.parseDataType(typeText)
+
+  def listBuiltinFunctions(): Array[ExpressionInfo] = {
+    FunctionRegistry.functionSet.flatMap(f => FunctionRegistry.builtin.lookupFunction(f)).toArray
+  }
 }
And then, I ran this:
mkdir docs
echo "site_name: Spark SQL 2.3.0" >> mkdocs.yml
echo "theme: readthedocs" >> mkdocs.yml
mv index.md docs/index.md
mkdocs serve
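The same site layout can also be prepared from Python. This is a rough sketch, assuming a hypothetical helper `prepare_mkdocs_site`; it writes mkdocs.yml from scratch rather than appending with `>>`, so running it twice does not duplicate the keys. `mkdocs serve` still has to be run from the site root afterwards.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_mkdocs_site(root, index_md, site_name="Spark SQL 2.3.0", theme="readthedocs"):
    """Create docs/, write mkdocs.yml, and move the generated index.md into docs/."""
    root = Path(root)
    docs = root / "docs"
    docs.mkdir(parents=True, exist_ok=True)
    # Overwrite the config so repeated runs leave exactly two keys in it.
    (root / "mkdocs.yml").write_text("site_name: %s\ntheme: %s\n" % (site_name, theme))
    target = docs / "index.md"
    shutil.move(str(index_md), str(target))
    return target


# Demo in a temporary directory with a placeholder index.md.
workdir = Path(tempfile.mkdtemp())
source = workdir / "index.md"
source.write_text("### abs\n\n")

target = prepare_mkdocs_site(workdir, source)
config = (workdir / "mkdocs.yml").read_text()
print(config)
```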
Issue Links
- relates to SPARK-14764 Spark SQL documentation should be more precise about which SQL features it supports (Resolved)
- links to