  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-8756

numRows and rawDataSize are not collected by the Spark stats [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels:
      None

      Description

      Run the following Hive queries:

      set datanucleus.cache.collections=false;
      set hive.stats.autogather=true;
      set hive.merge.mapfiles=false;
      set hive.merge.mapredfiles=false;
      set hive.map.aggr=true;
      
      create table tmptable(key string, value string);
      INSERT OVERWRITE TABLE tmptable
      SELECT unionsrc.key, unionsrc.value 
      FROM (SELECT 'tst1' AS key, cast(count(1) AS string) AS value FROM src s1
            UNION  ALL  
            SELECT s2.key AS key, s2.value AS value FROM src1 s2) unionsrc;
      DESCRIBE FORMATTED tmptable;
      

      Hive on Spark prints the following table parameters:

      COLUMN_STATS_ACCURATE	true                
      	numFiles            	2                   
      	numRows             	0                   
      	rawDataSize         	0                   
      	totalSize           	225
      

      Hive on MR prints the following table parameters:

      Table Parameters:	 	 
      	COLUMN_STATS_ACCURATE	true                
      	numFiles            	2                   
      	numRows             	26                  
      	rawDataSize         	199                 
      	totalSize           	225 
      

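      As shown above, Hive on Spark does not collect numRows and rawDataSize (both are reported as 0), whereas Hive on MR reports 26 and 199 for the same query. numFiles and totalSize match in both cases.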
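
      A possible cross-check, not part of the original report: recomputing the statistics explicitly with ANALYZE TABLE and describing the table again shows whether the missing values come from the automatic gathering path (hive.stats.autogather) rather than from the table data. This is a sketch under the assumption that tmptable was populated by the queries above; whether an explicit ANALYZE yields correct values on the Spark branch is not confirmed here.

      ```sql
      -- Recompute basic table statistics (numRows, rawDataSize, ...) explicitly
      -- instead of relying on hive.stats.autogather during INSERT OVERWRITE.
      ANALYZE TABLE tmptable COMPUTE STATISTICS;

      -- Independent row count to compare against the numRows parameter.
      SELECT count(*) FROM tmptable;

      -- Re-inspect the table parameters after the explicit recomputation.
      DESCRIBE FORMATTED tmptable;
      ```

      If numRows now matches the count returned by the SELECT, the data was written correctly and only the automatic stats collection during the insert is at fault.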

        Attachments

        1. HIVE-8756.1-spark.patch
          14 kB
          Na Yang
        2. HIVE-8756.2-spark.patch
          66 kB
          Na Yang

              People

              • Assignee:
                nyang Na Yang
                Reporter:
                nyang Na Yang