  1. IMPALA
  2. IMPALA-376

Built-in functions for parsing JSON

Details

    • New Feature
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • Product Backlog
    • Impala 3.1.0
    • Backend
    • All supported environments

    Description

      Hi,

      Hive comes with some useful built-in UDFs to process JSON objects.

      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

      Namely:

      • get_json_object
      • json_tuple

      To make Impala and Hive tables and queries more interchangeable, I propose porting these UDFs to Impala as built-in functions:

      http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_functions.html

      Example

      Consider the following table, raw_log:

      action  event_params
      search  {"keyword":"hotel"}
      visit   {"url":"http://example.com"}

      ...and the following query:

      SELECT get_json_object(event_params, "$.keyword") AS keyword FROM raw_log WHERE action='search';
      

      The query should return the following results:

      keyword
      hotel
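
      To make the expected behaviour concrete, here is a minimal Python sketch of the lookup that get_json_object performs on each row. It is only an illustration, not Impala's or Hive's implementation: the helper below is hypothetical, supports just the "$", ".key" and "[index]" path steps, and assumes that malformed JSON or a missing key yields NULL (None here), as Hive documents for its UDF.

```python
import json
import re

# Hypothetical helper mimicking a small subset of Hive's get_json_object.
# Supported path steps: "$" (root), ".key" and "[index]". Anything else -> None.
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def get_json_object(json_text, path):
    """Return the value at `path` as a string, or None if it cannot be resolved."""
    if json_text is None or path is None or not path.startswith("$"):
        return None
    try:
        node = json.loads(json_text)
    except ValueError:
        return None  # assumption: malformed JSON maps to NULL

    pos = 1  # skip the leading "$"
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            return None  # path syntax outside the supported subset
        key, index = match.group(1), match.group(2)
        if key is not None:
            if not isinstance(node, dict) or key not in node:
                return None
            node = node[key]
        else:
            if not isinstance(node, list) or int(index) >= len(node):
                return None
            node = node[int(index)]
        pos = match.end()

    if node is None:
        return None
    # Strings come back unquoted; other values are rendered as JSON text.
    return node if isinstance(node, str) else json.dumps(node)


# The two rows of raw_log from the example, as (action, JSON parameters) pairs.
raw_log = [
    ("search", '{"keyword":"hotel"}'),
    ("visit", '{"url":"http://example.com"}'),
]

# Equivalent of: SELECT get_json_object(..., "$.keyword") ... WHERE action='search'
keywords = [
    get_json_object(params, "$.keyword")
    for action, params in raw_log
    if action == "search"
]
print(keywords)  # ['hotel']
```

      Without the WHERE clause, the "visit" row would produce NULL for keyword, because its JSON object has no such key.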

            People

              stigahuang Quanlong Huang
              tcz Zoltan Toth-Czifra
              Votes: 9
              Watchers: 14
