Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-18013

Hive incorrectly rewrites create statements for VIEW

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.1.0
    • None
    • Hive, SQL

    Description

      I am encountering errors when using VIEW s as, when Hive rewrites the create statements unrelated query fragments are inserted, resulting in a malformed view statement that cannot then be parsed. 

      The create statements I am using for each VIEW is as follows;

      CREATE VIEW bar.web_hits
      AS
      SELECT url
      FROM  bar.web_hit_log
      WHERE ip_address NOT IN (
           SELECT ip_address
           FROM  bar.crawler
           WHERE active = true
      )
      AND timestamp_gmt BETWEENaAND “b”;
       
      CREATE VIEW  bar.page_view_agg
      AS
      SELECT url AS page_url, COUNT(*) AS page_count
      FROM  bar.web_hits a
      GROUP BY url
      ORDER BY page_count DESC;
      

      I then have an INSERT statement which selects data from the previously created views and puts it into a table;

      INSERT OVERWRITE TABLE  bar.view_result
      SELECT page_url, page_count
      FROM  bar.page_view_agg;
      

      The following error is produced from running the above query;

      FAILED: SemanticException line 1:52 missing EOF at '.' near 'crawler' in definition of VIEW page_view_agg [
      SELECT `a`.`url` AS `page_url`, COUNT(*) AS`crawler`.`ip_address``page_count`
      FROM ` bar`.`crawler` `a`
      GROUP`crawler`.`active`BY `a`.`url`
      ORDER BY page_count DESC
      ] used as page_view_agg at Line 3:5
      


      Running describe extended page_view_agg produces the following output (edited for simplicity);

      page_url     string
      page_count   bigint
       
      viewOriginalText:
      SELECT url AS page_url, COUNT(*) AS page_count
      FROM  bar.web_hits a
      GROUP BY url
      ORDER BY page_count DESC, 
       
      viewExpandedText:
      SELECT `a`.`url` AS `page_url`, COUNT(*) AS`crawler`.`ip_address``page_count`
      FROM ` bar`.`crawler` `a`
      GROUP`crawler`.`active`BY `a`.`url`
      ORDER BY page_count DESC, 
      

       
      As you can see, it looks like the statement has been corrupted by the insertion of the unnecessary string `crawler`.`ip_address` on line 1.

      Attachments

        Activity

          People

            Unassigned Unassigned
            jgreenstevens Jaydene Green-Stevens
            Votes:
            1 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: