  1. Spot (Retired)
  2. SPOT-180

[ODM] Support Hive views as source data for ML/OA modules

Details

    • New Feature
    • Status: Closed
    • Minor
    • Resolution: Fixed

    Description

      Currently, Apache Spot can be integrated only with physical Hive tables (Parquet).
      That works well for new projects and parsers but is inconvenient for existing ones.

      Supporting Hive views in Apache Spot can solve at least 3 problems:

      1) Any existing schema can easily be converted to ODM: it is just a mapping between ODM fields and schema-specific fields, with no need to rewrite existing parsers or export/import existing data.
      2) As a result, no data is duplicated.
      3) Two similar data sources (such as Bluecoat and Squid, both proxies) can be unified into one view that represents a single kind of data: proxy.
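
      To illustrate points 1 and 3, here is a minimal sketch that generates a Hive CREATE VIEW statement from per-source field mappings. It is not part of Spot: the table names (bluecoat_raw, squid_raw), the column names, the view name (odm_proxy), and the three ODM field names are all hypothetical and would need to be replaced with the real schema.

```python
from typing import Dict, List


def select_for_source(table: str, mapping: Dict[str, str], odm_fields: List[str]) -> str:
    """Build one SELECT that renames a source table's columns to ODM field names."""
    columns = []
    for field in odm_fields:
        source_column = mapping.get(field)
        # A field the source lacks becomes NULL so every UNION ALL branch has
        # the same column list. Hive may need an explicit CAST on such NULLs.
        expression = f"`{source_column}`" if source_column else "NULL"
        columns.append(f"{expression} AS `{field}`")
    return f"SELECT {', '.join(columns)} FROM {table}"


def build_view_ddl(view: str, odm_fields: List[str], sources: Dict[str, Dict[str, str]]) -> str:
    """Combine all sources into a single view, one SELECT per source table."""
    selects = [select_for_source(table, mapping, odm_fields)
               for table, mapping in sources.items()]
    return f"CREATE VIEW IF NOT EXISTS {view} AS\n" + "\nUNION ALL\n".join(selects)


# Hypothetical ODM fields and source schemas, for illustration only.
ODM_FIELDS = ["event_time", "src_ip", "uri"]
SOURCES = {
    "bluecoat_raw": {"event_time": "ts", "src_ip": "c_ip", "uri": "cs_uri"},
    "squid_raw": {"event_time": "time", "src_ip": "client"},
}

if __name__ == "__main__":
    print(build_view_ddl("odm_proxy", ODM_FIELDS, SOURCES))
```

      Because the view only renames and unions columns, the underlying tables stay where they are, which is what makes point 2 (no duplication of data) possible.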

      One possible impact is performance. However, Hive views can be partitioned as well; see https://cwiki.apache.org/confluence/display/Hive/PartitionedViews

            People

              natedogs911 nathanael Smith
              vladmir Vladimir