  1. IMPALA
  2. IMPALA-376

Built-in functions for parsing JSON

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Product Backlog
    • Fix Version/s: Impala 3.1.0
    • Component/s: Backend
    • Environment:
      All supported environments

      Description

      Hi,

      Hive comes with some useful built-in UDFs to process JSON objects.

      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

      Namely:

      • get_json_object
      • json_tuple

      To make Impala and Hive tables and queries more interchangeable, I propose porting these UDFs to Impala as built-in functions:

      http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_functions.html

      Example

      Consider the following table raw_log:

      action  event_params
      search  {"keyword":"hotel"}
      visit   {"url":"http://example.com"}

      ...and the following query:

      SELECT get_json_object(event_params, "$.keyword") AS keyword FROM raw_log WHERE action='search';
      

      The query should return the following results:

      keyword
      hotel
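
      The expected behaviour in the example above can be sketched outside the database. The following Python snippet is only an illustration, not Impala's or Hive's implementation: the function is a hypothetical stand-in that handles plain "$.key" paths only and assumes, as Hive documents for get_json_object, that invalid JSON yields NULL (None here). Treating missing keys and JSON null values as NULL is a further assumption of this sketch.

```python
import json


def get_json_object(json_text, path):
    """Hypothetical sketch: supports only simple "$.key1.key2" paths."""
    if json_text is None or path is None or not path.startswith("$"):
        return None
    try:
        value = json.loads(json_text)
    except ValueError:
        return None  # invalid JSON is treated as NULL
    # Walk each dot-separated key that follows the leading "$".
    for key in path[1:].split("."):
        if key == "":
            continue
        if not isinstance(value, dict) or key not in value:
            return None  # missing key is treated as NULL (assumption)
        value = value[key]
    if value is None:
        return None  # JSON null is treated as NULL (assumption)
    # Strings are returned as-is; anything else as JSON text.
    return value if isinstance(value, str) else json.dumps(value)


# The raw_log rows from the example, as (action, event_params) pairs.
raw_log = [
    ("search", '{"keyword":"hotel"}'),
    ("visit", '{"url":"http://example.com"}'),
]

# Equivalent of: SELECT get_json_object(event_params, "$.keyword") ... WHERE action='search'
keywords = [
    get_json_object(params, "$.keyword")
    for action, params in raw_log
    if action == "search"
]
print(keywords)
```

      Under these assumptions, the "visit" row would yield NULL for "$.keyword" if it were not filtered out by the WHERE clause.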

              People

              • Assignee: Quanlong Huang (stigahuang)
              • Reporter: Zoltan Toth-Czifra (tcz)
              • Votes: 9
              • Watchers: 15
