Intel OpenCL Intel CPU not detected - opencl

I'm running the following code in order to retrieve device information:
#include <CL/cl.h>
#include <cstdlib>
#include <iostream>
#include <vector>
int main(int argc, char *argv[])
{
    // find and print all available OpenCL devices using clGetDeviceInfo
    cl_int err;
    cl_uint num_platforms;
    cl_platform_id *platforms;
    cl_uint num_devices;
    cl_device_id *devices;
    err = clGetPlatformIDs(0, NULL, &num_platforms);
    platforms = (cl_platform_id *)malloc(num_platforms * sizeof(cl_platform_id));
    err = clGetPlatformIDs(num_platforms, platforms, NULL);
    for (int i = 0; i < num_platforms; i++)
    {
        cl_platform_id platform = platforms[i];
        char platform_name[100];
        err = clGetPlatformInfo(platform, CL_PLATFORM_NAME, 100, platform_name, NULL);
        std::cout << "Platform " << i << ": " << platform_name << std::endl;
        cl_uint num_devices;
        err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices);
        devices = (cl_device_id *)malloc(num_devices * sizeof(cl_device_id));
        err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);
        for (int j = 0; j < num_devices; j++)
        {
            cl_device_id device = devices[j];
            char device_name[100];
            err = clGetDeviceInfo(device, CL_DEVICE_NAME, 100, device_name, NULL);
            std::cout << "Device " << j << ": " << device_name << std::endl;
        }
    }
}
Which results in the following output:
Platform 0: Intel(R) OpenCL HD Graphics
Device 0: Intel(R) Iris(R) Plus Graphics [0x8a52]
Platform 1: Clover
With a segfault occurring at the clGetDeviceInfo call for the second platform.
I'm running Ubuntu 20.04.4 LTS on a MS Surface Pro 7 with a
10th Gen Intel® Core™ i7-1065G7 CPU.
clinfo output:
Number of platforms 2
Platform Name Intel(R) OpenCL HD Graphics
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 3.0
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_khr_subgroups cl_intel_spirv_device_side_avc_motion_estimation cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_intel_unified_shared_memory cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation cl_intel_device_side_avc_motion_estimation cl_intel_advanced_motion_estimation cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info cl_intel_subgroup_local_block_io
Platform Host timer resolution 1ns
Platform Extensions function suffix INTEL
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 21.2.6
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Intel(R) OpenCL HD Graphics
Number of devices 1
Device Name Intel(R) Iris(R) Plus Graphics [0x8a52]
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 3.0 NEO
Driver Version 22.17.23034
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 64
Max clock frequency 1100MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 32
Max sub-groups per work group 32
Sub-group sizes (Intel) 8, 16, 32
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 1 / 1
half 8 / 8 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (n/a)
Half-precision Floating-point support (cl_khr_fp16)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 13084180480 (12.19GiB)
Error Correction support No
Max memory allocation 4294959104 (4GiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing No
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 64 bytes
Global 64 bytes
Local 64 bytes
Max size for global variable 65536 (64KiB)
Preferred total size of global vars 4294959104 (4GiB)
Global Memory cache type Read/Write
Global Memory cache size 1048576 (1024KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 268434944 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4 bytes
Pitch alignment for 2D image buffers 4 pixels
Max 2D image size 16384x16384 pixels
Max planar YUV image size 16384x16352 pixels
Max 3D image size 16384x16384x2048 pixels
Max number of read image args 128
Max number of write image args 128
Max number of read/write image args 128
Max number of pipe args 16
Max active pipe reservations 1
Max pipe packet size 1024
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 8
Max constant buffer size 4294959104 (4GiB)
Max size of kernel argument 2048 (2KiB)
Queue properties (on host)
Out-of-order execution Yes
Profiling Yes
Queue properties (on device)
Out-of-order execution No
Profiling No
Preferred size 0
Max size 0
Max queues on device 0
Max events on device 0
Prefer user sync for interop Yes
Profiling timer resolution 52ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Sub-group independent forward progress Yes
IL version SPIR-V_1.2
SPIR versions 1.2
printf() buffer size 4194304 (4MiB)
Built-in kernels block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel;
Motion Estimation accelerator version (Intel) 2
Device-side AVC Motion Estimation version 1
Supports texture sampler use Yes
Supports preemption Yes
Device Extensions cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_khr_subgroups cl_intel_spirv_device_side_avc_motion_estimation cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_intel_unified_shared_memory cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation cl_intel_device_side_avc_motion_estimation cl_intel_advanced_motion_estimation cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info cl_intel_subgroup_local_block_io
Platform Name Clover
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [INTEL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Intel(R) OpenCL HD Graphics
Device Name Intel(R) Iris(R) Plus Graphics [0x8a52]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Intel(R) OpenCL HD Graphics
Device Name Intel(R) Iris(R) Plus Graphics [0x8a52]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel(R) OpenCL HD Graphics
Device Name Intel(R) Iris(R) Plus Graphics [0x8a52]
NOTE: your OpenCL library only supports OpenCL 2.2,
but some installed platforms support OpenCL 3.0.
Programs using 3.0 features may crash
or behave unexpectedly
I have tried to install Intel-SDK OpenCL CPU-runtime with no success.
How do I configure the CPU for OpenCL?
The intended use for the CPU is to be able to debug OpenCL-kernels.

The latest official OpenCL 18.1 CPU Runtime release from Intel is a bugged mess and/or not available.
Intel have lately integrated the OpenCL CPU Runtime into their oneAPI DPC++ Compiler.
One version that I know of that works (I tested it on 10980XE) is this one:
oclcpuexp-2020.10.4.0.15_rel.tar.gz
You may also try the latest version, oclcpuexp-2021.13.11.0.23_rel.tar.gz
I originally found the hint here.

cpuid: reported micro-architecture seems ambiguous

Ubuntu 20.04 LTS. Note (unknown type) reported:
$ cpuid | less
CPU 0:
vendor_id = "GenuineIntel"
version information (1/eax):
processor type = primary processor (0)
family = 0x6 (6)
model = 0xe (14)
stepping id = 0xd (13)
extended family = 0x0 (0)
extended model = 0x9 (9)
(family synth) = 0x6 (6)
(model synth) = 0x9e (158)
(simple synth) = Intel Core (unknown type) (Kaby Lake / Coffee Lake) {Skylake}, 14nm
.
.
.
(uarch synth) = Intel Coffee Lake {Skylake}, 14nm
(synth) = Intel Xeon E-2200 (Coffee Lake R0) {Skylake}, 14nm
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz
stepping : 13
microcode : 0xea
cpu MHz : 3400.000
cache size : 16384 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 22
. . .
So what is this processor? Kaby? Coffee? or Skylake? I ask because I am doing PMU programming. I want to be sure to do the right thing depending on what the architecture is.
This is the http://www.etallen.com/cpuid.html version of the cpuid command, rather than the one that's part of the msr-tools package.
According to Intel's web site, "Intel(R) Xeon(R) E-2278G" is a Coffee Lake, as your cpuid shell command figured out from other CPUID info, perhaps including the brand string.
Specifically, according to wikichip, it's a Coffee Lake Refresh iteration of Coffee Lake E (the Xeon-badged versions of "client" cores), so Coffee Lake ER.
Being Coffee Lake means the IA cores are microarchitecturally identical to Skylake (and Kaby Lake), although it has a newer GPU and has memory controllers rated for higher speeds.
For the CPU cores, the improvement is just a refinement of the silicon process (14nm++).
Except maybe for some hardware fix for vulnerabilities like Meltdown, and L1TF or other related things. And maybe refinement of the Spectre mitigations. But not yet fixing the bug that required disabling the LSD (loop buffer) in microcode, or the JCC erratum, so both those performance problems are still self-inflicted by its microcode to ensure correctness even in corner cases.
The PMU hardware is AFAIK identical in SKL, KBL, and CFL, except perhaps(?) some errata being fixed. The event IDs are all the same, I assume.
Understanding command-line cpuid output:
version information (1/eax):
... family/model/stepping/...
(simple synth) = Intel Core (unknown type) (Kaby Lake / Coffee Lake) {Skylake}, 14nm
I'm guessing that this is saying that the family/model/stepping and extended-model/family fields don't distinguish between KBL and CFL (thus unknown type), but do imply one or the other. And {Skylake} is reminding you that KBL and CFL are both iterations of Skylake.
The "(unknown type)" is probably saying it doesn't know whether it's an i3/5/7/9 or Pentium or Celeron.
On my i7-6700k Skylake desktop, I get the following from cpuid version 20170122 (from an old Arch Linux AUR install). That version was released over a year after the CPU itself, but it may be older than the one you have.
version information (1/eax):
processor type = primary processor (0)
family = Intel Pentium Pro/II/III/Celeron/Core/Core 2/Atom, AMD Athlon/Duron, Cyrix M2, VIA C3 (6)
model = 0xe (14)
stepping id = 0x3 (3)
extended family = 0x0 (0)
extended model = 0x5 (5)
(simple synth) = Intel Core i3-6000 / i5-6000 / i7-6000 / Pentium G4000 / Celeron G3900 / Xeon E3-1200 (Skylake), 14nm
So yeah, (simple synth) is probably what it was able to divine from just the things in that leaf, not the brand string.
From the latest version of the tool, version 20220224,
version information (1/eax):
processor type = primary processor (0)
family = 0x6 (6)
model = 0xe (14)
...
(family synth) = 0x6 (6)
(model synth) = 0x5e (94)
(simple synth) = Intel Core (unknown type) (Skylake-H R0) {Skylake}, 14nm
...
(uarch synth) = Intel Skylake {Skylake}, 14nm
(synth) = Intel Core i*-6000 (Skylake-H R0) {Skylake}, 14nm
So the (unknown type) seems perfectly normal, just a consequence of Intel not varying the family, model, stepping numbers across i3/5/7.
The version of cpuid in your Ubuntu 20.04 LTS does seem to know about Coffee Lake, though. Perhaps it's not new enough to know about Coffee Lake Refresh, since it reported "R0" for your CPU, the same "R0" it reports for my Skylake (which is not a "refresh"; there weren't new iterations of CPUs released as Skylake. They moved straight on to Kaby Lake after SKL.)
Anyway, your version of cpuid does eventually get to a correct and fully accurate description of your CPU, synthesized from various pieces of information.
(synth) = Intel Xeon E-2200 (Coffee Lake R0) {Skylake}, 14nm
I'm not sure exactly what cpuid means by R0. Perhaps "revision 0", which would make sense if it didn't know about "Coffee Lake Refresh". I wonder if a newer cpuid would report R1 or spell out "Refresh"?
It is indeed an E-22xx series Xeon of the Coffee Lake family. The CPU cores are basically identical to Skylake, and it's built in a 14nm(++) process. IDK if Intel tweaked anything at all inside the cores between CFL and CFL-refresh, so for programming it you have all the info you need.
Other instances of "R" in CPUID output mean "registered trademark", as in "Intel(R) Xeon(R)" in the brand string Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz, which Linux pulls directly out of the CPU via the brand-string leaves of the CPUID machine instruction (EAX=80000002h through 80000004h). But I think the R0 is probably being used to mean "first iteration" (0th refresh?), since it's not just copying strings like "Xeon(R)".

Having trouble stopping U-Boot autoboot

Background:
I have an old Seagate BlackArmor NAS 110 that I'm trying to install Debian on by following the instructions here: https://github.com/hn/seagate-blackarmor-nas.
I have a couple of USB to TTL serial adapters (one FTDI chipset and the other Prolific) that I've tried and have run into the same issue with both. I have made the connection to the serial port on the board of the NAS using a multimeter to make sure I've gotten the pinout correct.
Problem:
I'm not able to stop the autoboot process by pressing keys at any point during the boot process. The device also does not seem to respond to any keystrokes, although they are echoed back.
What I've Tried So Far:
Using USB to TTL serial adapters with two different chipsets
Using the adapters on two different computers (MacBook Pro and a ThinkPad)
Using different operating systems (MacOS, Windows 10, Ubuntu 20.04)
Using different terminal programs (Screen, Minicom, Putty)
Turned off hardware and software flow control
Tested output of adapters by shorting RX and TX pins and seeing keystrokes echoed back
Commands seem to be sent to device as when I type I see my commands echoed back (not sure if this is supposed to happen)
I've been at this for a few days and can't figure it out. I've also recorded my screen while experiencing the issue: https://streamable.com/xl43br. Can anyone see where I'm going wrong?
Terminal output while experiencing the problem:
Welcome to minicom 2.7.1
OPTIONS:
Compiled on Nov 15 2020, 08:12:42.
Port /dev/tty.usbserial-AQ00KV6T, 16:51:31
Press Meta-Z for help on special keys
???
__ __ _ _
| \/ | __ _ _ ____ _____| | |
| |\/| |/ _` | '__\ \ / / _ \ | |
| | | | (_| | | \ V / __/ | |
|_| |_|\__,_|_| \_/ \___|_|_|
_ _ ____ _
| | | | | __ ) ___ ___ | |_
| | | |___| _ \ / _ \ / _ \| __|
| |_| |___| |_) | (_) | (_) | |_
\___/ |____/ \___/ \___/ \__| ** uboot_ver:v0.0.5 **
** MARVELL BOARD: MONO LE
U-Boot 1.1.4 (Nov 6 2009 - 11:15:26) Marvell version: 3.4.18
U-Boot code: 00600000 -> 0067FFF0 BSS: -> 006CDE60
Soc: 88F6192 A1 (DDR2)
CPU running @ 800Mhz L2 running @ 400Mhz
SysClock = 200Mhz , TClock = 166Mhz
DRAM CAS Latency = 3 tRP = 3 tRAS = 8 tRCD=3
DRAM CS[0] base 0x00000000 size 128MB
DRAM Total size 128MB 16bit width
Addresses 8M - 0M are saved for the U-Boot usage.
Mem malloc Initialization (8M - 7M): Done
NAND:d32 MB
Marvell Serial ATA Adapter
Integrated Sata device found
CPU : Marvell Feroceon (Rev 1)
Scanning partition header:
Found sign PrEr at c0000
Found sign KrNl at 2c0000
Found sign RoOt at 540000
Streaming disabled
Write allocate disabled
USB 0: host mode
PEX 0: interface detected no Link.
Net: egiga0 [PRIME]
0 any key to stop autoboot: 1
NAND read: device 0 offset 0xc4000, size 0x195200
Reading data from 0x259000 -- 100% complete.
1659392 bytes read: OK
Calculate CRC32:
crc32 checksum Pass
NAND read: device 0 offset 0x2c4000, size 0x21c000
Reading data from 0x4dfe00 -- 100% complete.
2211840 bytes read: OK
Calculate CRC32:
crc32 checksum Pass
## Booting image at 00040000 ...
Image Name: Linux-2.6.22.18
Created: 2009-11-06 3:38:29 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2211388 Bytes = 2.1 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK
Starting kernel ...
Uncompressing Linux.......................................................................................................................................... done, booting the kernel.
Linux version 2.6.22.18 (root@jasonDev.localdomain) (gcc version 4.2.1) #1 Fri Nov 6 11:38:22 CST 2009 v0.0.7
CPU: ARM926EJ-S [56251311] revision 1 (ARMv5TE), cr=00053977
Machine: Feroceon-KW
Using UBoot passing parameters structure
Memory policy: ECC disabled, Data cache writeback
CPU0: D VIVT write-back cache
CPU0: I cache: 16384 bytes, associativity 4, 32 byte lines, 128 sets
CPU0: D cache: 16384 bytes, associativity 4, 32 byte lines, 128 sets
Built 1 zonelists. Total pages: 32512
Kernel command line: console=ttyS0,115200 mtdparts=nand_mtd:0x000a0000@0x0(uboot),0x00010000@0x000a0000(param),0x00200000@0x000c0000(preroot),0x00280000@0x002c0000(uimage),0x01a000000
PID hash table entries: 512 (order: 9, 2048 bytes)
Console: colour dummy device 80x30
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 128MB 0MB 0MB 0MB = 128MB total
Memory: 109056KB available (4048K code, 289K data, 128K init)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
NET: Registered protocol family 16
CPU Interface
-------------
SDRAM_CS0 ....base 00000000, size 128MB
SDRAM_CS1 ....disable
SDRAM_CS2 ....disable
SDRAM_CS3 ....disable
PEX0_MEM ....base e8000000, size 128MB
PEX0_IO ....base f2000000, size 1MB
INTER_REGS ....base f1000000, size 1MB
NFLASH_CS ....base fa000000, size 2MB
SPI_CS ....base f4000000, size 16MB
BOOT_ROM_CS ....no such
DEV_BOOTCS ....no such
CRYPT_ENG ....base f0000000, size 2MB
Marvell Development Board (LSP Version KW_LSP_4.2.7_patch21_with_rx_desc_tuned)-- MONO Soc: 88F6192 A1 LE
Detected Tclk 166666667 and SysClk 200000000
MV Buttons Device Load
Marvell USB EHCI Host controller #0: c05b4600
PEX0 interface detected no Link.
PCI: bus0: Fast back to back transfers enabled
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
NET: Registered protocol family 2
Time: kw_clocksource clocksource has been installed.
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 32768 bytes)
TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd
Freeing initrd memory: 16384K
RTC registered
Use the XOR engines (acceleration) for enhancing the following functions:
o RAID 5 Xor calculation
o kernel memcpy
o kenrel memzero
Number of XOR engines to use: 4
cesadev_init(c00116c4)
mvCesaInit: sessions=640, queue=64, pSram=f0000000
MV Buttons Driver Load
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
fuse init (API version 7.8)
SGI XFS with large block numbers, no debug enabled
io scheduler noop registered
io scheduler anticipatory registered (default)
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250.0: ttyS0 at MMIO 0xf1012000 (irq = 33) is a 16550A
serial8250.0: ttyS1 at MMIO 0xf1012100 (irq = 34) is a 16550A
RAMDISK driver initialized: 2 RAM disks of 16384K size 1024 blocksize
loop: module loaded
Loading Marvell Ethernet Driver:
o Cached descriptors in DRAM
o DRAM SW cache-coherency
o Single RX Queue support - ETH_DEF_RXQ=0
o Single TX Queue support - ETH_DEF_TXQ=0
o TCP segmentation offload enabled
o Receive checksum offload enabled
o Transmit checksum offload enabled
o Network Fast Processing (Routing) supported
o Driver ERROR statistics enabled
o Driver INFO statistics enabled
o Proc tool API enabled
o Rx descripors: q0=256
o Tx descripors: q0=532
o Loading network interface(s):
o egiga0, ifindex = 1, GbE port = 0
Warning: Giga 1 is Powered Off
mvFpRuleDb (c73ab000): 1024 entries, 4096 bytes
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k4-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
Integrated Sata device found
scsi0 : Marvell SCSI to SATA adapter
scsi1 : Marvell SCSI to SATA adapter
NFTL driver: nftlcore.c $Revision: 1.98 $, nftlmount.c $Revision: 1.41 $
NAND device: Manufacturer ID: 0xec, Chip ID: 0x75 (Samsung NAND 32MiB 3,3V 8-bit)
Scanning device for bad blocks
7 cmdlinepart partitions found on MTD device nand_mtd
Using command line partition definition
Creating 7 MTD partitions on "nand_mtd":
0x00000000-0x000a0000 : "uboot"
0x000a0000-0x000b0000 : "param"
0x000c0000-0x002c0000 : "preroot"
0x002c0000-0x00540000 : "uimage"
0x00540000-0x01f40000 : "rootfs"
0x01f40000-0x02000000 : "misc"
0x00000000-0x02000000 : "flash"
ehci_marvell ehci_marvell.70059: Marvell Orion EHCI
ehci_marvell ehci_marvell.70059: new USB bus registered, assigned bus number 1
ehci_marvell ehci_marvell.70059: irq 19, io base 0xf1050100
ehci_marvell ehci_marvell.70059: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
USB Universal Host Controller Interface driver v3.0
usb 1-1: new high speed USB device using ehci_marvell and address 2
usb 1-1: configuration #1 chosen from 1 choice
hub 1-1:1.0: USB hub found
hub 1-1:1.0: 4 ports detected
usbcore: registered new interface driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
mice: PS/2 mouse device common for all mice
i2c /dev entries driver
attach_adapter....
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
raid6: int32x1 73 MB/s
raid6: int32x2 80 MB/s
raid6: int32x4 83 MB/s
raid6: int32x8 74 MB/s
raid6: using algorithm int32x4 (83 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: measuring checksumming speed
arm4regs : 722.800 MB/sec
8regs : 503.200 MB/sec
32regs : 600.000 MB/sec
raid5: using function: arm4regs (722.800 MB/sec)
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
dm_crypt using the OCF package.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
wix gpio_init
Advanced Linux Sound Architecture Driver Version 1.0.14 (Thu May 31 09:03:25 2007 UTC).
ALSA device list:
No soundcards found.
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
RAMDISK: cramfs filesystem found at block 0
RAMDISK: Loading 1620KiB [1 disk] into ram disk... done.
VFS: Mounted root (cramfs filesystem) readonly.
Freeing init memory: 128K
Enter Pre-Root FileSystem:
FW_UPDATE_FLAG_RES:1
BOARDTEST_FALG:0
DSK1_RES:1
DSK2_RES:1
DSK3_RES:1
DSK4_RES:1
DSK1_S_RES:
DSK2_S_RES:
DSK3_S_RES:
DSK4_S_RES:
CHK_RES:1
MD0CHK_RES:1
init started: BusyBox v1.1.1 (2008.10.08-08:58+0000) multi-call binary
Starting pid 396, console /dev/ttyS0: '/etc/init.d/rcS'
Starting network...
Starting inetd... OK
NOT_DEF_RES:0
EXT3-fs: unable to read superblock
FAT: unable to read boot sector
EXT3-fs: unable to read superblock
EXT2-fs: unable to read superblock
FAT: unable to read boot sector
FAT: unable to read boot sector
egiga0: started
admindasdas
So it turns out there is a short somewhere between the RX pin and the +3.3V pin which is not allowing me to send anything to the board. Thank you to those who have commented.

OpenCL on isolated CPU cores

I have an 8-core Linux machine on which I've altered my grub.conf with
isolcpus=4,5,6,7
so that the last four cores are not used by the OS process scheduler. Running the command clinfo shows for the CPU: MAX_COMPUTE_UNITS: 4.
Removing the isolcpus line from my grub.conf file and running clinfo shows for the CPU: MAX_COMPUTE_UNITS: 8. I guess this means that any OpenCL kernel will not use the isolated CPUs. Does anyone know how to force an OpenCL kernel to use the isolated CPUs? More info on my specific OpenCL implementation from clinfo:
NAME: Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz
VENDOR: Intel(R) Corporation
PROFILE: FULL_PROFILE
VERSION: OpenCL 1.2 (Build 8)
EXTENSIONS: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_intel_exec_by_local_thread cl_khr_depth_images cl_khr_3d_image_writes cl_khr_fp64
DRIVER_VERSION: 1.2.0.8

/usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available

When I am running computecpp_info
$ /usr/local/computecpp/bin/computecpp_info
/usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info)
/usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info)
********************************************************************************
ComputeCpp Info (CE 0.7.0)
********************************************************************************
Toolchain information:
GLIBC version: 2.19
GLIBCXX: 20150426
This version of libstdc++ is supported.
********************************************************************************
Device Info:
Discovered 3 devices matching:
platform : <any>
device type : <any>
--------------------------------------------------------------------------------
Device 0:
Device is supported : NO - Device does not support SPIR
CL_DEVICE_NAME : GeForce GTX 750 Ti
CL_DEVICE_VENDOR : NVIDIA Corporation
CL_DRIVER_VERSION : 384.111
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU
--------------------------------------------------------------------------------
Device 1:
Device is supported : UNTESTED - Device not tested on this OS
CL_DEVICE_NAME : Intel(R) HD Graphics
CL_DEVICE_VENDOR : Intel(R) Corporation
CL_DRIVER_VERSION : r5.0.63503
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU
--------------------------------------------------------------------------------
Device 2:
Device is supported : YES - Tested internally by Codeplay Software Ltd.
CL_DEVICE_NAME : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
CL_DEVICE_VENDOR : Intel(R) Corporation
CL_DRIVER_VERSION : 1.2.0.475
CL_DEVICE_TYPE : CL_DEVICE_TYPE_CPU
If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v0.7.0/platform-support-notes
********************************************************************************
and when running clinfo
$ clinfo
clinfo: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by clinfo)
Number of platforms 2
Platform Name NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 1.2 CUDA 9.0.282
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Platform Extensions function suffix NV
Platform Name Intel(R) OpenCL
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 1.2
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir
Platform Extensions function suffix INTEL
Platform Name NVIDIA CUDA
Number of devices 1
Device Name GeForce GTX 750 Ti
Device Vendor NVIDIA Corporation
Device Vendor ID 0x10de
Device Version OpenCL 1.2 CUDA
Driver Version 384.111
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Topology (NV) PCI-E, 01:00.0
Max compute units 5
Max clock frequency 1110MHz
NVIDIA Compute Capability 5.0
Device Partition (core)
Max number of sub-devices 1
Supported partition types None
Max work item dimensions 3
Max work item sizes 1024x1024x64
Max work group size 1024
Preferred work group size multiple 32
Warp size (NV) 32
Preferred / native vector sizes
char 1 / 1
short 1 / 1
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 2094530560 (1.951GiB)
Error Correction support No
Max memory allocation 523632640 (499.4MiB)
Unified memory for Host and Device No
Integrated memory (NV) No
Minimum alignment for any data type 128 bytes
Alignment of base address 4096 bits (512 bytes)
Global Memory cache type Read/Write
Global Memory cache size 81920
Global Memory cache line 128 bytes
Image support Yes
Max number of samplers per kernel 32
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 16384x16384 pixels
Max 3D image size 4096x4096x4096 pixels
Max number of read image args 256
Max number of write image args 16
Local memory type Local
Local memory size 49152 (48KiB)
Registers per block (NV) 65536
Max constant buffer size 65536 (64KiB)
Max number of constant args 9
Max size of kernel argument 4352 (4.25KiB)
Queue properties
Out-of-order execution Yes
Profiling Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Kernel execution timeout (NV) Yes
Concurrent copy and kernel execution (NV) Yes
Number of async copy engines 1
Prefer user sync for interop No
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Platform Name Intel(R) OpenCL
Number of devices 2
Device Name Intel(R) HD Graphics
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 1.2
Driver Version r5.0.63503
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 20
Max clock frequency 1200MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 32
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 1709598311 (1.592GiB)
Error Correction support No
Max memory allocation 854799155 (815.2MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 524288
Global Memory cache line 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 53424947 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 128
Local memory type Local
Local memory size 65536 (64KiB)
Max constant buffer size 854799155 (815.2MiB)
Max number of constant args 8
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 80ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
SPIR versions 1.2
Prefer user sync for interop Yes
printf() buffer size 4194304 (4MiB)
Built-in kernels block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_intel_accelerator cl_intel_advanced_motion_estimation cl_intel_driver_diagnostics cl_intel_motion_estimation cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_required_subgroup_size cl_intel_subgroups cl_intel_va_api_media_sharing cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir
Device Name Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 1.2 (Build 475)
Driver Version 1.2.0.475
Device OpenCL C Version OpenCL C 1.2
Device Type CPU
Device Profile FULL_PROFILE
Max compute units 8
Max clock frequency 3600MHz
Device Partition (core)
Max number of sub-devices 8
Supported partition types by counts, equally, by names (Intel)
Max work item dimensions 3
Max work item sizes 8192x8192x8192
Max work group size 8192
Preferred work group size multiple 128
Preferred / native vector sizes
char 1 / 32
short 1 / 16
int 1 / 8
long 1 / 4
half 0 / 0 (n/a)
float 1 / 8
double 1 / 4 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 8260567040 (7.693GiB)
Error Correction support No
Max memory allocation 2065141760 (1.923GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 262144
Global Memory cache line 64 bytes
Image support Yes
Max number of samplers per kernel 480
Max size for 1D images from buffer 129071360 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 480
Max number of write image args 480
Local memory type Global
Local memory size 32768 (32KiB)
Max constant buffer size 131072 (128KiB)
Max number of constant args 480
Max size of kernel argument 3840 (3.75KiB)
Queue properties
Out-of-order execution Yes
Profiling Yes
Local thread execution (Intel) Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
SPIR versions 1.2
Prefer user sync for interop No
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [NV]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No platform
I tried to find the cause of the "/usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available" warning and to fix it, but I did not find anything useful. Could someone explain it, or point me to a reference that clarifies it?

What to do about "GPU not found. Falling back to CPU device"

I just downloaded AMD-APP-SDK-v2.5-RC2-lnx32. Why do I get "GPU not found." when I try to run a sample program?
%> ./AESEncryptDecrypt
Platform 0 : Advanced Micro Devices, Inc.
Encrypting Image ....
Input Image : input512.bmp
Key : 15 201 51 89 92 34 96 66 11 225 161 96 81 211 108 124
GPU not found. Falling back to CPU device
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Intel(R) Core(TM)2 Duo CPU T5870 @ 2.00GHz
Executing kernel for 1 iterations
-------------------------------------------
Output Filename : output.bmp
==========================================
fglrxinfo:
display: :0.0 screen: 0
OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: ATI Mobility Radeon HD 3400 Series
OpenGL version string: 3.3.10665 Compatibility Profile Context
It sounds like the Radeon HD 3400 hardware does not support "ATI Stream Processing" (and thus OpenCL on the GPU).
See the question "How to enable OpenCL-GPU-processing in Linux (CL_DEVICE_TYPE_GPU)?". Stream processing is only supported on certain cores, and several cores, with different features enabled, are used within a Fxxx model range.
Happy OpenCL-on-CPU coding.

Resources