Apache Lens (Retired) / LENS-444

cube.fact.is.aggregated not properly documented


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: api, cube
    • Labels: None

    Description

      Consider a measure in a cube:

          <measure name="revenue" type="DOUBLE" default_aggr="SUM"/>
      

      Consider a fact table F that supplies data to this cube and has the column "revenue".

      We run a query:

      lens-shell>query execute cube select userid, count(revenue) from user_activity where time_range_in(dt, '2014-06-25-00', '2014-06-26-00')
      Launching query failed cause:No driver accepted the query, because No candidate fact table available to answer the query, because {"brief":"Columns: [[hive_fact_user_curation_good_traffic]] are missing default aggregate","details":{"user_attributestore_er_fact_adgroup_view,user_attributestore_er_fact_supply_site_burn,user_attributestore_er_fact_demandcategory_click,user_attributestore_er_fact_supplycategory_visits,user_attributestore_er_fact_supply_site_impressions_rendered,user_attributestore_er_fact_adgroup_click,user_attributestore_er_fact_adgroup_impression_time_install,user_attributestore_er_fact_app_impression_time_install,user_attributestore_er_fact_supply_site_impressions_served,user_attributestore_er_fact_adgroup_burn,user_attributestore_er_fact_app_visits,user_attributestore_er_fact_app_click,user_attributestore_er_fact_supply_site_click,user_attributestore_er_fact_adgroup_impressions_rendered":[{"cause":"COLUMN_NOT_FOUND","missingColumns":["totalburn"]}],"hive_fact_user_curation_good_traffic":[{"cause":"MISSING_DEFAULT_AGGREGATE","columnsMissingDefaultAggregate":["hive_fact_user_curation_good_traffic"]}]}}
      

      Lens rejects the fact with the cause MISSING_DEFAULT_AGGREGATE ("columnsMissingDefaultAggregate"). This happens because the query asks for count(revenue) while the default_aggr defined for the measure in the cube is SUM. The query runs fine if it asks for sum(revenue).
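
      A likely reason for the rejection (my assumption, not something the Lens docs state) is that a pre-aggregated fact can only be rolled up again safely with the aggregate it was built with. A minimal Python sketch with made-up rows, where `raw_rows` and `rollup_sum` are hypothetical names, shows SUM surviving a rollup while COUNT does not:

```python
# Hypothetical raw events as (userid, revenue); not taken from any real table.
raw_rows = [("u1", 1.0), ("u1", 2.0), ("u1", 3.0)]


def rollup_sum(rows):
    """Pre-aggregate revenue per user with SUM, as an aggregated fact might."""
    totals = {}
    for userid, revenue in rows:
        totals[userid] = totals.get(userid, 0.0) + revenue
    return list(totals.items())


aggregated_rows = rollup_sum(raw_rows)

# SUM over the rollup matches SUM over the raw events.
sum_raw = sum(revenue for _, revenue in raw_rows)
sum_agg = sum(revenue for _, revenue in aggregated_rows)

# COUNT over the rollup counts rolled-up rows, not the original events.
count_raw = len(raw_rows)
count_agg = len(aggregated_rows)
```

      Here the two sums agree, but the counts differ because the rollup collapsed three events into one row.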

      The error goes away once the property "cube.fact.is.aggregated" is set to false on the fact table F.
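
      The behaviour observed above can be summarised as a toy rule. This is only a sketch of what this report describes, not Lens source code: `fact_can_answer` is a hypothetical helper, and the default of `is_aggregated=True` is inferred from the fact that the property had to be set to false explicitly.

```python
def fact_can_answer(requested_aggr, default_aggr, is_aggregated=True):
    """Toy model of the candidate-fact check as observed in this report.

    A fact marked as not aggregated accepts any aggregate; an aggregated
    fact only accepts the measure's default aggregate.
    """
    if not is_aggregated:
        return True
    return requested_aggr.upper() == default_aggr.upper()


# count(revenue) against an aggregated fact whose default_aggr is SUM
count_on_aggregated = fact_can_answer("count", "SUM")
# sum(revenue) against the same fact
sum_on_aggregated = fact_can_answer("sum", "SUM")
# count(revenue) after setting cube.fact.is.aggregated = false
count_on_raw = fact_can_answer("count", "SUM", is_aggregated=False)
```

      Under this model only the first case is rejected, which matches the three outcomes reported here.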

      IMO the behaviour of an "aggregated fact" is not properly documented and will confuse other users. Let's make it more obvious, either by making it part of the fact schema spec or by documenting it well.

          People

            Assignee: Unassigned
            Reporter: Angad Singh (angadsingh)
            Votes: 0
            Watchers: 2
