Hivemall / HIVEMALL-274

Wrong target variable name in the step-by-step tutorial

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6.0
    • Component/s: None

    Description

      In the step-by-step tutorial, the target column passed to `train_regressor` for regression should be `num_purchases`, not `label`. With that fix, the training and evaluation queries read:

      create table if not exists regressor as
      select
        train_regressor(
          features, -- feature vector
          num_purchases, -- target value
          '-loss_function squared -optimizer AdaGrad -regularization l2' -- hyper-parameters
        ) as (feature, weight)
      from
        training
      ;
      
      with features_exploded as (
        select
          id,
          extract_feature(fv) as feature,
          extract_weight(fv) as value
        from
          training t1 
          LATERAL VIEW explode(features) t2 as fv
      ),
      predictions as (
        select
          t1.id,
          sum(p1.weight * t1.value) as predicted_num_purchases
        from
          features_exploded t1
          LEFT OUTER JOIN regressor p1 ON (t1.feature = p1.feature)
        group by
          t1.id
      )
      select
        rmse(t1.predicted_num_purchases, t2.num_purchases) as rmse,
        mae(t1.predicted_num_purchases, t2.num_purchases) as mae
      from
        predictions t1
      join
        training t2 on (t1.id = t2.id)
      ;
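
      For readers who want to check by hand what the evaluation query computes, here is a minimal Python sketch of the same arithmetic: each prediction is the sum of weight * value over a row's features, and RMSE and MAE compare it with `num_purchases`. The rows, feature names, and weights are hypothetical toy data, not values from the tutorial, and splitting each "name:value" string on its last colon is an assumption about how `extract_feature` and `extract_weight` behave.

```python
import math

# Hypothetical toy rows standing in for the `training` table:
# (id, features, num_purchases). Not taken from the tutorial data.
training = [
    (1, ["price:1.0", "bias:1.0"], 3.0),
    (2, ["price:2.0", "bias:1.0"], 5.0),
]

# Hypothetical stand-in for the `regressor` table: feature -> weight.
regressor = {"price": 2.0, "bias": 0.5}


def predict(features, model):
    """Sum weight * value over a row's features, like the `predictions` CTE."""
    total = 0.0
    for fv in features:
        # Assumption: "name:value" is split on the last colon.
        name, _, value = fv.rpartition(":")
        weight = model.get(name)
        if weight is not None:
            # A feature with no learned weight adds nothing to the sum.
            total += weight * float(value)
    return total


predictions = {row_id: predict(features, regressor) for row_id, features, _ in training}

# Compare each prediction with the true target column (num_purchases).
errors = [predictions[row_id] - target for row_id, _, target in training]

rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
mae = sum(abs(e) for e in errors) / len(errors)
```

      The sketch also shows why the column name matters: the third element of each row plays the role of `num_purchases`, and both the training call and the error metrics have to read the same target column.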
      
      

            People

              Assignee: Makoto Yui (myui)
              Reporter: Aki Ariga (chezou)