Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-14473 Druid integration II
  3. HIVE-15277

Teach Hive how to create/delete Druid segments

Log workAgile BoardRank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersConvert to IssueMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.2.0
    • 2.2.0
    • Druid integration

    Description

      We want to extend the DruidStorageHandler to support CTAS queries.
      In this implementation Hive will generate druid segment files and insert the metadata to signal the handoff to druid.

      The syntax will be as follows:

      CREATE TABLE druid_table_1
      STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
      TBLPROPERTIES ("druid.datasource" = "datasourcename")
      AS <select `timecolumn` as `___time`, `dimension1`,`dimension2`,  `metric1`, `metric2`....>;
      

      This statement stores the results of query <input_query> in a Druid datasource named 'datasourcename'. One of the columns of the query needs to be the time dimension, which is mandatory in Druid. In particular, we use the same convention that it is used for Druid: there needs to be a the column named '__time' in the result of the executed query, which will act as the time dimension column in Druid. Currently, the time column dimension needs to be a 'timestamp' type column.
      metrics can be of type long, double and float while dimensions are strings. Keep in mind that druid has a clear separation between dimensions and metrics, therefore if you have a column in hive that contains number and need to be presented as dimension use the cast operator to cast as string.
      This initial implementation interacts with Druid Meta data storage to add/remove the table in druid, user need to supply the meta data config as --hiveconf hive.druid.metadata.password=XXX --hiveconf hive.druid.metadata.username=druid --hiveconf hive.druid.metadata.uri=jdbc:mysql://host/druid

      Attachments

        1. file.patch
          177 kB
          Slim Bouguerra
        2. HIVE-15277.2.patch
          246 kB
          Slim Bouguerra
        3. HIVE-15277.patch
          259 kB
          Slim Bouguerra
        4. HIVE-15277.patch
          259 kB
          Slim Bouguerra
        5. HIVE-15277.patch
          258 kB
          Slim Bouguerra
        6. HIVE-15277.patch
          260 kB
          Slim Bouguerra
        7. HIVE-15277.patch
          259 kB
          Slim Bouguerra
        8. HIVE-15277.patch
          255 kB
          Slim Bouguerra
        9. HIVE-15277.patch
          177 kB
          Slim Bouguerra

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            bslim Slim Bouguerra Assign to me
            bslim Slim Bouguerra
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment