  1. Hadoop YARN
  2. YARN-9264 [Umbrella] Follow-up on IntelOpenCL FPGA plugin
  3. YARN-9265

FPGA plugin fails to recognize Intel Processing Accelerator Card



    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.3.0
    • Component/s: None
    • Labels:
    • Hadoop Flags:


      The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).

      There are two major issues.

      Problem #1

      The output of aocl diagnose:

      Device Name:
      Package Pat:
      Vendor: Intel Corp
      Physical Dev Name   Status            Information
      pac_a10_f200000     Passed            PAC Arria 10 Platform (pac_a10_f200000)
                                            PCIe 08:00.0
                                            FPGA temperature = 79 degrees C.
      Call "aocl diagnose <device-names>" to run diagnose for specified devices
      Call "aocl diagnose all" to run diagnose for all devices

      The plugin cannot parse this output, and NodeManager startup fails with the following messages:

      2019-01-25 06:46:02,834 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin: Using FPGA vendor plugin: org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
      2019-01-25 06:46:02,943 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer: Trying to diagnose FPGA information ...
      2019-01-25 06:46:03,085 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule: Using traffic control bandwidth handler
      2019-01-25 06:46:03,108 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl: Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
      2019-01-25 06:46:03,139 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl: FPGA Plugin bootstrap success.
      2019-01-25 06:46:03,247 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: Couldn't find (?i)bus:slot.func\s=\s.*, pattern
      2019-01-25 06:46:03,248 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
      2019-01-25 06:46:03,251 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: Failed to get major-minor number from reading /dev/pac_a10_f300000
      2019-01-25 06:46:03,252 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to bootstrap configured resource subsystems!
      org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: No FPGA devices detected!
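The mismatch can be reproduced outside the NodeManager. The following is a minimal sketch, not the plugin's actual code: the two "old" patterns are copied from the WARN lines above, the sample text is the aocl diagnose output quoted in this issue, and the `PCIE` pattern, the class name `AoclDiagnoseSketch`, and the method `findPcieAddress` are assumptions made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AoclDiagnoseSketch {
    // Patterns as they appear in the WARN log lines of this issue.
    static final Pattern OLD_BUS =
        Pattern.compile("(?i)bus:slot.func\\s=\\s.*,");
    static final Pattern OLD_POWER =
        Pattern.compile("(?i)Total\\sCard\\sPower\\sUsage\\s=\\s.*");

    // Hypothetical pattern (an assumption, not taken from the fix):
    // captures the bus:slot.func triple from a "PCIe 08:00.0" line.
    static final Pattern PCIE = Pattern.compile(
        "(?i)PCIe\\s+(\\p{XDigit}{2}:\\p{XDigit}{2}\\.\\p{XDigit})");

    // The PAC output of "aocl diagnose" quoted in the description.
    static final String SAMPLE = String.join("\n",
        "Device Name:",
        "Package Pat:",
        "Vendor: Intel Corp",
        "Physical Dev Name   Status            Information",
        "pac_a10_f200000     Passed            PAC Arria 10 Platform (pac_a10_f200000)",
        "                                      PCIe 08:00.0",
        "                                      FPGA temperature = 79 degrees C.");

    // Returns the PCIe address if the hypothetical pattern finds one.
    static Optional<String> findPcieAddress(String diagnoseOutput) {
        Matcher m = PCIE.matcher(diagnoseOutput);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Neither old pattern occurs anywhere in the PAC output.
        System.out.println("old bus pattern found:   " + OLD_BUS.matcher(SAMPLE).find());
        System.out.println("old power pattern found: " + OLD_POWER.matcher(SAMPLE).find());
        // The sketch pattern extracts the address from the "PCIe" line.
        System.out.println("PCIe address: " + findPcieAddress(SAMPLE));
    }
}
```

With the sample above, both old patterns report no match, while the sketch pattern extracts 08:00.0. A real fix would also have to cope with the missing power-usage line and with multiple devices, which this sketch does not attempt.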

      Problem #2

      The plugin assumes that the file name under /dev can be derived from the "Physical Dev Name", but this assumption is wrong. For example, it expects the device file to be /dev/pac_a10_f300000, whereas the actual file is /dev/intel-fpga-port.0.
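To make the faulty assumption concrete, here is a small sketch that contrasts the name-derived path with an explicit lookup. It is not the approach taken by the attached patches: the class `DeviceNodeSketch`, both method names, and the idea of an operator-supplied name-to-path map are hypothetical and serve only to illustrate the problem.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

class DeviceNodeSketch {
    // The assumption described above: /dev/<Physical Dev Name>.
    static String naiveDevicePath(String physicalDevName) {
        return "/dev/" + physicalDevName;
    }

    // Hypothetical alternative: consult an explicit mapping instead of
    // guessing, and return empty rather than a path that may not exist.
    static Optional<String> resolveDevicePath(
            String physicalDevName, Map<String, String> knownDevices) {
        return Optional.ofNullable(knownDevices.get(physicalDevName));
    }

    public static void main(String[] args) {
        // Mapping values taken from the example in this issue.
        Map<String, String> known = Collections.singletonMap(
            "pac_a10_f300000", "/dev/intel-fpga-port.0");

        System.out.println(naiveDevicePath("pac_a10_f300000"));
        System.out.println(resolveDevicePath("pac_a10_f300000", known));
        System.out.println(resolveDevicePath("unknown_card", known));
    }
}
```

The point of returning an `Optional` is that an unknown card yields an explicit "not found" instead of a fabricated path that later fails when the major and minor numbers are read.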


        1. YARN-9265-001.patch
          23 kB
          Peter Bacsko
        2. YARN-9265-002.patch
          25 kB
          Peter Bacsko
        3. YARN-9265-003.patch
          26 kB
          Peter Bacsko
        4. YARN-9265-004.patch
          27 kB
          Peter Bacsko
        5. YARN-9265-005.patch
          27 kB
          Peter Bacsko
        6. YARN-9265-006.patch
          27 kB
          Peter Bacsko
        7. YARN-9265-007.patch
          28 kB
          Peter Bacsko
        8. YARN-9265-008.patch
          43 kB
          Peter Bacsko
        9. YARN-9265-009.patch
          46 kB
          Peter Bacsko



            • Assignee:
              pbacsko Peter Bacsko
            • Votes:
              0
            • Watchers:
              8


              • Created: