Kylin / KYLIN-679

Adding Spark Support to Apache Kylin

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v2.0.0
    • Component/s: Spark Engine
    • Labels: None

    Description

      Challenges in the current architecture:

      High latency when reading data from Hive
      --Fetching data takes several hours when joining big tables
      --Routing to SQL-on-Hadoop is turned off due to performance issues

      Data latency and time-to-market
      --Heavy I/O and network traffic from MapReduce (MR) jobs

      Streaming
      --Process streaming data and pre-calculate cubes

      Where Spark could bring benefits to Kylin:

      Integrating with Spark SQL:
      --Option I: Read data from Spark SQL instead of Hive
      --Option II: Route unsupported queries to Spark SQL
      --Option III: Expose Kylin as an OLAP data source for Spark SQL

      Spark Cube Build Engine
      --An efficient cube generation engine built on Spark

      Spark Streaming
      --Leverage Spark Streaming for streaming OLAP

      HBase?
      --Any ideas?
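
      To make concrete what a cube build engine has to pre-calculate, here is a minimal sketch in plain Python. It is not Kylin's or Spark's implementation. The function name build_cuboids, the sample rows, and the column names are hypothetical and chosen only for illustration. It aggregates one measure for every subset of the dimensions (each subset is one cuboid). In a Spark-based engine the inner loop would roughly correspond to emitting (cuboid key, measure) pairs and reducing them by key across the cluster.

```python
from collections import defaultdict
from itertools import combinations


def build_cuboids(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (one cuboid per subset)."""
    cuboids = {}
    # Walk from the base cuboid (all dimensions) down to the apex (no dimensions).
    for size in range(len(dimensions), -1, -1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)  # group-by key for this cuboid
                totals[key] += row[measure]
            cuboids[dims] = dict(totals)
    return cuboids


# Hypothetical fact rows, for illustration only.
rows = [
    {"country": "US", "year": 2014, "sales": 10},
    {"country": "US", "year": 2015, "sales": 20},
    {"country": "CN", "year": 2015, "sales": 5},
]
cube = build_cuboids(rows, ["country", "year"], "sales")
print(cube[("country",)])  # {('US',): 30, ('CN',): 5}
print(cube[()])            # {(): 35}
```

      With n dimensions this produces 2^n cuboids and rescans the rows for each one, which is why the cost of the build step matters. A real engine would normally derive smaller cuboids from already aggregated larger ones instead of rescanning the source data.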

            People

              Assignee: Unassigned
              Reporter: Luke Han (lukehan)
              Votes: 3
              Watchers: 10
