IMPALA-4193: Improve detection of maximum cpu clock frequency in cpu-info

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:
      None

      Description

      We determine the maximum cpu frequency once during startup in util/cpu-info.cc:92 by reading /proc/cpuinfo and looking for the maximum frequency of any core. This only yields the correct maximum frequency if the system is under enough load to max out at least one core. During benchmark tests, this is often not the case, leading to wrong calculations of the elapsed wall clock time in util/benchmark.cc:63. Here, sw.elapsedTime() is the number of elapsed CPU ticks, and dividing this by a wrong value for the maximum CPU frequency yields wrong results.

      As a fix we can read the maximum cpu frequency from /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq which contains the correct value.
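
To illustrate the effect on the conversion, here is a minimal Python sketch (not Impala's C++ code). `ticks_to_ms` is a hypothetical helper, and the 3600 MHz and 900 MHz figures are made-up example values rather than measurements: a counter that really advances at 3600 MHz, converted with a frequency read while the cores idled at 900 MHz, makes the elapsed time look four times longer than it was.

```python
def ticks_to_ms(ticks: int, freq_mhz: float) -> float:
    """Convert a tick count to milliseconds, assuming the counter runs at freq_mhz."""
    # freq_mhz * 1000 is the number of ticks per millisecond.
    return ticks / (freq_mhz * 1000.0)


# Hypothetical example values, not measurements.
TRUE_FREQ_MHZ = 3600.0      # rate at which the counter really advances
DETECTED_FREQ_MHZ = 900.0   # value read from /proc/cpuinfo on an idle system

ticks = 7_200_000  # ticks counted while the benchmark ran

actual_ms = ticks_to_ms(ticks, TRUE_FREQ_MHZ)
reported_ms = ticks_to_ms(ticks, DETECTED_FREQ_MHZ)
error_factor = reported_ms / actual_ms
print(f"actual: {actual_ms} ms, reported: {reported_ms} ms, factor: {error_factor}")
```

The error is a constant factor (true frequency divided by detected frequency), so it scales every benchmark in a run by the same amount.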

        Activity

        jbapple Jim Apple added a comment -

        I see you have assigned this to yourself. When you send a patch, you might want to rerun the benchmarks to fix the numbers we put in the comments at the top of them. Beware IMPALA-3784, though.

        Also, are you sure /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq is available on older kernels?

        dhecht Dan Hecht added a comment -

        Lars Volker, could you post the output of /proc/cpuinfo and uname -a on a system that shows this problem? In all modern CPUs, the TSC frequency doesn't change with the P-state, so "cpu MHz" should be constant and this shouldn't happen.

        lv Lars Volker added a comment -

        Jim Apple - I assumed that cpufreq came with the sysfs filesystem, so it should be present from kernels 2.4 onwards. That assumption was wrong: I provisioned a bunch of virtual machines with the latest versions of Ubuntu, SLES, and RHEL, and none of them had that file. However, the private physical machines I checked, running CentOS 5.11, did have it.

        I'll do a quick survey of developer machines on dev@ to get more insights.

        jbapple Jim Apple added a comment -

        We probably want a survey of user machines, since CpuInfo::Init is called during impalad startup. However, since that's impossible to get, we'd want to be very careful about having a fallback if cpuinfo_max_freq is not available.

        lv Lars Volker added a comment -

        Dan Hecht - Here's the relevant output from my dev machine. As you can see, all cores are currently stepped down. I also saw this happen on some development clusters I have access to. Am I looking in the wrong place?

        uname -a
        Linux lv-desktop 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        
        cat /proc/cpuinfo
        processor	: 0
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 888.328
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 0
        cpu cores	: 4
        apicid		: 0
        initial apicid	: 0
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 1
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 900.703
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 1
        cpu cores	: 4
        apicid		: 2
        initial apicid	: 2
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 2
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 882.984
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 2
        cpu cores	: 4
        apicid		: 4
        initial apicid	: 4
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 3
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 927.281
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 3
        cpu cores	: 4
        apicid		: 6
        initial apicid	: 6
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 4
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 895.078
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 0
        cpu cores	: 4
        apicid		: 1
        initial apicid	: 1
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 5
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 813.937
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 1
        cpu cores	: 4
        apicid		: 3
        initial apicid	: 3
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 6
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 804.796
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 2
        cpu cores	: 4
        apicid		: 5
        initial apicid	: 5
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        processor	: 7
        vendor_id	: GenuineIntel
        cpu family	: 6
        model		: 60
        model name	: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        stepping	: 3
        microcode	: 0x1c
        cpu MHz		: 800.015
        cache size	: 8192 KB
        physical id	: 0
        siblings	: 8
        core id		: 3
        cpu cores	: 4
        apicid		: 7
        initial apicid	: 7
        fpu		: yes
        fpu_exception	: yes
        cpuid level	: 13
        wp		: yes
        flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
        bugs		:
        bogomips	: 7183.48
        clflush size	: 64
        cache_alignment	: 64
        address sizes	: 39 bits physical, 48 bits virtual
        power management:
        
        jbapple Jim Apple added a comment -

        I think Dan is referring to the following, quoted from Wikipedia:

        For Pentium 4 processors, Intel Xeon processors (family [], models []); for Intel Core Solo and Intel Core Duo processors (family [], model []); for the Intel Xeon processor 5100 series and Intel Core 2 Duo processors (family [], model []); for Intel Core 2 and Intel Xeon processors (family [], display_model []); for Intel Atom processors (family [], display_model []): the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the maximum qualified frequency of the processor.
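
On Linux, this property shows up as the `constant_tsc` and `nonstop_tsc` entries in the `flags` line of /proc/cpuinfo; both appear in the output posted earlier in this thread. The following Python sketch checks for them. The helper names and the shortened sample text are made up for illustration and are not part of Impala.

```python
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags of the first processor listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_invariant_tsc(cpuinfo_text: str) -> bool:
    """True if the TSC is reported as constant-rate and as not stopping in idle states."""
    return {"constant_tsc", "nonstop_tsc"} <= cpu_flags(cpuinfo_text)


# Shortened sample in the format of /proc/cpuinfo (illustrative only).
SAMPLE = """processor\t: 0
cpu MHz\t\t: 888.328
flags\t\t: fpu vme tsc constant_tsc nonstop_tsc aperfmperf
"""

print(has_invariant_tsc(SAMPLE))
```

On a real machine the same check would be `has_invariant_tsc(open("/proc/cpuinfo").read())`.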

        lv Lars Volker added a comment -

        Thanks for the clarification. This seems consistent with the observed behavior that the performance ratio between different benchmarks in the same suite doesn't change between runs, whereas the absolute figures change. Since for the benchmarks only the conversion from ticks to wall clock time is affected, we might as well switch from showing iters/ms to iters/megatick or similar.
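
A small Python sketch of that observation, with hypothetical tick counts and helper names (none of this is Impala code): the assumed frequency changes the absolute iters/ms figures, but the ratio between two benchmarks stays the same, and a per-megatick figure needs no frequency at all.

```python
def iters_per_ms(iters: int, ticks: int, assumed_mhz: float) -> float:
    """Throughput in iterations per millisecond, given an assumed tick rate."""
    elapsed_ms = ticks / (assumed_mhz * 1000.0)
    return iters / elapsed_ms


def iters_per_megatick(iters: int, ticks: int) -> float:
    """Throughput in iterations per million ticks; independent of any frequency."""
    return iters / (ticks / 1_000_000)


# Hypothetical tick counts for two benchmarks that each ran 1000 iterations.
ITERS = 1000
TICKS_A = 2_000_000
TICKS_B = 4_000_000

for mhz in (900.0, 3600.0):
    a = iters_per_ms(ITERS, TICKS_A, mhz)
    b = iters_per_ms(ITERS, TICKS_B, mhz)
    print(f"assumed {mhz} MHz: A={a:.0f} iters/ms, B={b:.0f} iters/ms, A/B={a / b:.1f}")

print(iters_per_megatick(ITERS, TICKS_A), iters_per_megatick(ITERS, TICKS_B))
```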

        lv Lars Volker added a comment -

        Jim Apple - Regarding the fallback I agree. As a test I changed the path to cpuinfo_max_freq to something non-existent and made sure the old behavior is restored. Is there anything else you think I should check to make sure this works on systems where cpuinfo_max_freq cannot be found?
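
For reference, the fallback logic under discussion can be sketched roughly as follows. This is a Python illustration with hypothetical function names, not the C++ in util/cpu-info.cc. It assumes the sysfs file holds a single integer in kHz, which is how the kernel reports cpuinfo_max_freq, and that the old behavior is the highest "cpu MHz" value found in /proc/cpuinfo.

```python
import re
from pathlib import Path
from typing import Optional

SYSFS_MAX_FREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq")
PROC_CPUINFO = Path("/proc/cpuinfo")


def max_mhz_from_sysfs(path: Path = SYSFS_MAX_FREQ) -> Optional[float]:
    """Return the value of cpuinfo_max_freq in MHz, or None if it cannot be read."""
    try:
        # The kernel reports this value in kHz.
        return int(path.read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None


def max_mhz_from_cpuinfo(text: str) -> Optional[float]:
    """Old behavior: the highest 'cpu MHz' value currently reported for any core."""
    found = re.findall(r"^cpu MHz\s*:\s*(\d+(?:\.\d+)?)", text, re.MULTILINE)
    values = [float(v) for v in found]
    return max(values) if values else None


def detect_max_mhz(sysfs_path: Path = SYSFS_MAX_FREQ,
                   cpuinfo_path: Path = PROC_CPUINFO) -> Optional[float]:
    """Prefer sysfs; fall back to /proc/cpuinfo when the sysfs file is unavailable."""
    mhz = max_mhz_from_sysfs(sysfs_path)
    if mhz is not None:
        return mhz
    try:
        return max_mhz_from_cpuinfo(cpuinfo_path.read_text())
    except OSError:
        return None
```

Passing a non-existent `sysfs_path` to `detect_max_mhz` exercises the fallback in the same way as the manual test described above. Unreadable or malformed file contents are treated like a missing file.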

        jbapple Jim Apple added a comment -

        Also, I think the benchmark framework nudges the user into changing the power settings to 'performance' so that no downclocking happens. Doesn't that take care of the issue?

        lv Lars Volker added a comment -

        Yes, that would take care of the issue. I only saw warnings when running benchmarks compiled in debug mode. Can you point me to the place where the scaling governor is mentioned?

        lv Lars Volker added a comment -

        I tried switching to the performance governor with this command:

        echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
        

        Here's the output of two consecutive runs of grep MHz /proc/cpuinfo. While it looks like using the performance governor improves things, the maximum observed frequency still fluctuates.

        $ grep MHz /proc/cpuinfo
        cpu MHz         : 3696.468
        cpu MHz         : 3990.796
        cpu MHz         : 3600.000
        cpu MHz         : 3692.390
        cpu MHz         : 3700.265
        cpu MHz         : 3600.984
        cpu MHz         : 3600.984
        cpu MHz         : 3600.843
        $ grep MHz /proc/cpuinfo
        cpu MHz         : 3600.140
        cpu MHz         : 3774.234
        cpu MHz         : 3600.000
        cpu MHz         : 3600.421
        cpu MHz         : 3600.140
        cpu MHz         : 3600.703
        cpu MHz         : 3601.125
        cpu MHz         : 3600.562
        
        jbapple Jim Apple added a comment -

        I was getting confused. That info is located at /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. I found it here: https://github.com/google/benchmark/blob/cba945e37dd8f336c7c8f5367f3c7d9498d5e09b/src/sysinfo.cc

        We also probably want to disable turbo, which I think is what you're seeing above:

        http://askubuntu.com/a/620114

        echo "1" | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo

        lv Lars Volker added a comment -

        Ok, assuming we don't want to interfere with regular daemon startup, I went ahead and drafted a change to just issue warnings during the benchmark runs if these settings are suboptimal. I think this is less intrusive, achieves the goal, and makes for more accurate benchmark results, since it encourages proper system setup.

        The change is here: https://gerrit.cloudera.org/4528

        Feedback welcome.
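
As a rough idea of what such a check involves, here is a Python sketch. It is not the code from the Gerrit change; the function names and warning texts are hypothetical. It reads the scaling_governor and no_turbo files mentioned earlier in this thread and treats a missing file as "nothing to warn about", since no_turbo only exists when the intel_pstate driver is in use.

```python
import glob
from pathlib import Path
from typing import List, Optional

GOVERNOR_GLOB = "/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"
NO_TURBO_PATH = "/sys/devices/system/cpu/intel_pstate/no_turbo"


def read_trimmed(path: str) -> Optional[str]:
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cpu_setting_warnings(governors: List[str], no_turbo: Optional[str]) -> List[str]:
    """Build warnings from already-read sysfs values."""
    warnings = []
    slow = sorted({g for g in governors if g != "performance"})
    if slow:
        warnings.append("CPU scaling governor is not 'performance' on all cores: "
                        + ", ".join(slow))
    # no_turbo contains "1" when turbo is disabled; None means the file is absent.
    if no_turbo == "0":
        warnings.append("Turbo is enabled; measured frequencies may fluctuate.")
    return warnings


def check_system() -> List[str]:
    """Read the current settings from sysfs and return any warnings."""
    values = map(read_trimmed, sorted(glob.glob(GOVERNOR_GLOB)))
    governors = [g for g in values if g is not None]
    return cpu_setting_warnings(governors, read_trimmed(NO_TURBO_PATH))


if __name__ == "__main__":
    for message in check_system():
        print("WARNING:", message)
```

Keeping the decision logic in `cpu_setting_warnings` separate from the file reads makes it possible to test the warnings without touching sysfs.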

        lv Lars Volker added a comment -

        IMPALA-4193: Warn when benchmarks run with sub-optimal CPU settings

        Change-Id: I5e879cb35cf736f6112c1caed829722a38849794
        Reviewed-on: http://gerrit.cloudera.org:8080/4528
        Reviewed-by: Jim Apple <jbapple@cloudera.com>
        Tested-by: Internal Jenkins


          People

          • Assignee:
            lv Lars Volker
            Reporter:
            lv Lars Volker
          • Votes:
            0
            Watchers:
            3
