Hive / HIVE-23149

Consistency of Parsing Object Identifiers

Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved

    Description

      Hive needs to handle object identifiers (databases, tables, columns, views, functions, etc.) more consistently.  I think it makes sense to standardize on the same rules that MySQL/MariaDB use for their column names, so that Hive can be more of a drop-in replacement for them.
       
      The two important things to keep in mind are:
       
      1// Permitted characters in quoted identifiers include the full Unicode Basic Multilingual Plane (BMP), except U+0000
       
      2// If any components of a multiple-part name require quoting, quote them individually rather than quoting the name as a whole. For example, write `my-table`.`my-column`, not `my-table.my-column`.  
       
      https://dev.mysql.com/doc/refman/8.0/en/identifiers.html
      https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html  

       
      That is to say:
       

      -- Select all rows from a table named `default.mytable`
      -- (Yes, the table name itself has a period in it. This is valid)
      SELECT * FROM `default.mytable`;
       
      -- Select all rows from database `default`, table `mytable`
      SELECT * FROM `default`.`mytable`;  
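
To make the two rules concrete, here is a minimal sketch of splitting a name into its parts the MySQL way. It is not Hive's parser. The function name `split_identifier` is hypothetical, and treating a doubled backtick inside quotes as one literal backtick is an assumption borrowed from the MySQL convention.

```python
def split_identifier(name: str) -> list[str]:
    """Split a possibly qualified identifier into its component parts.

    Hypothetical helper, not Hive code. A period separates parts only
    when it appears outside backticks. Inside backticks, a doubled
    backtick is read as one literal backtick (assumed MySQL convention).
    """
    parts: list[str] = []
    buf: list[str] = []
    in_quotes = False
    i = 0
    n = len(name)
    while i < n:
        ch = name[i]
        if in_quotes:
            if ch == "`":
                if i + 1 < n and name[i + 1] == "`":
                    # Escaped backtick inside a quoted part.
                    buf.append("`")
                    i += 2
                    continue
                in_quotes = False
            elif ch == "\x00" or ord(ch) > 0xFFFF:
                # Rule 1: full BMP is permitted, except U+0000.
                raise ValueError("character not permitted in identifier")
            else:
                buf.append(ch)
        elif ch == "`":
            in_quotes = True
        elif ch == ".":
            # Rule 2: an unquoted period ends the current part.
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated backtick quote")
    parts.append("".join(buf))
    if any(part == "" for part in parts):
        raise ValueError("empty identifier part")
    return parts
```

With this sketch, `` `default.mytable` `` comes back as a single part (a table whose name contains a period), while `` `default`.`mytable` `` comes back as two parts (database and table).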
      

       
      This plays out in a couple of ways.  There may be more, but these are the ones I know about already:
       
      1// Hive generates incorrect syntax: HIVE-23128
       
      2// Hive throws an exception if there is a period in the table name.  This is an invalid response, because a table name may contain a period.  More likely than not, the correct outcome is a 'table not found' exception, since the user most likely used backticks incorrectly and meant to specify a database and a table separately. HIVE-16907
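
For the generating side, a rough sketch of emitting names that follow the "quote each part individually" rule could look like the following. This says nothing about the specifics of HIVE-23128. The helper names `quote_identifier` and `qualified_name` are hypothetical, and doubling embedded backticks is assumed from the MySQL convention.

```python
def quote_identifier(part: str) -> str:
    """Backtick-quote one identifier part (hypothetical helper).

    Embedded backticks are doubled, which is assumed from the MySQL
    convention. Empty parts and U+0000 are rejected.
    """
    if not part:
        raise ValueError("empty identifier part")
    if "\x00" in part:
        raise ValueError("U+0000 is not permitted in an identifier")
    return "`" + part.replace("`", "``") + "`"


def qualified_name(*parts: str) -> str:
    """Join parts into a qualified name, quoting each part on its own."""
    return ".".join(quote_identifier(p) for p in parts)
```

Here `qualified_name("default", "mytable")` gives `` `default`.`mytable` ``, while `qualified_name("default.mytable")` gives `` `default.mytable` ``, so the two cases stay distinguishable in the generated SQL.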

      Once the parsing is figured out and backticks can enclose UTF-8 strings, the backend database also needs to actually support the UTF-8 character set. It currently does not: HIVE-1808

            People

              belugabehr David Mollitor