Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Postgres features a number of JSON functions that are missing in Spark: https://www.postgresql.org/docs/9.3/functions-json.html

      Redshift's JSON functions (https://docs.aws.amazon.com/redshift/latest/dg/json-functions.html) have partial overlap with the Postgres list.

      Some of these functions can be expressed as compositions of existing Spark functions. For example, I think json_array_length can be expressed with cardinality and from_json, but there is a caveat related to legacy Hive compatibility (see the demo notebook at https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/5796212617691211/45530874214710/4901752417050771/latest.html for details).
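
      To make the target behavior concrete for triage, here is a minimal pure-Python reference model of what json_array_length is expected to return: the number of elements in the outermost JSON array. This is only a sketch, not Spark code. The NULL-in/NULL-out handling and the error on non-array input are assumptions modeled on Postgres. The corresponding Spark composition would presumably look like cardinality(from_json(col, 'array<string>')), where the 'array<string>' schema string is my assumption and is not verified here.

```python
import json
from typing import Optional


def json_array_length(text: Optional[str]) -> Optional[int]:
    """Reference model: count the elements of the outermost JSON array.

    Hypothetical helper for discussion only; it does not call Spark.
    """
    if text is None:
        # Assumption: SQL NULL input yields NULL output.
        return None
    value = json.loads(text)  # stands in for from_json
    if not isinstance(value, list):
        # Assumption: non-array input is an error, as in Postgres.
        raise ValueError("input is not a JSON array")
    return len(value)  # stands in for cardinality


# Example taken from the Postgres documentation for json_array_length.
print(json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]'))  # 5
```

      Nested arrays and objects count as single elements, which is why the example yields 5 rather than 7. Any Spark-side composition would need to match this model on NULL and non-array inputs as well, which may be where the compatibility caveat matters.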

      I'm filing this ticket so that we can triage the list of Postgres JSON features and decide which ones make sense to support in Spark. After we've done that, we can create individual tickets for specific functions and features.

          People

            Assignee: Unassigned
            Reporter: Josh Rosen (joshrosen)
            Votes: 0
            Watchers: 3