Parallel execution is not performed in OpenCL with MQL5

I have created an OpenCL kernel in MQL5.
Here is the code:
const string cl_src =
//" int weightsum; \r\n"
" #pragma OPENCL EXTENSION cl_khr_fp64 : enable \r\n"
"__kernel void CalculateSimpleMA( \r\n"
"int rates_total, \r\n"
"int prev_calculated, \r\n"
"int begin, \r\n"
"int InpMAPeriod, \r\n"
"__global double *price, \r\n"
"__global double *ExtLineBuffer \r\n"
") \r\n"
"{ \r\n"
"int i,limit; \r\n"
"int len_price = get_global_id(4); \r\n"
//"int len_Ext = get_global_id(5); \r\n"
" if(prev_calculated==0)// first calculation \r\n"
" { \r\n"
"limit=InpMAPeriod+begin; \r\n"
"for(i=0;i<limit-1;i++) ExtLineBuffer[i]=0.0; \r\n"
"double firstValue=0; \r\n"
"for(i=begin;i<limit;i++) \r\n"
"firstValue+=price[i]; \r\n"
"firstValue/=InpMAPeriod; \r\n"
"ExtLineBuffer[limit-1]=firstValue; \r\n"
"} \r\n"
"else limit=prev_calculated-1; \r\n"
"for(i=limit;i<rates_total;i++) \r\n"
"ExtLineBuffer[i]=ExtLineBuffer[i-1]+(price[i]-price[i-InpMAPeriod])/InpMAPeriod;\r\n"
"} \r\n"
" \r\n"
"__kernel void CalculateEMA( \r\n"
"int rates_total, \r\n"
"int prev_calculated, \r\n"
"int begin, \r\n"
"int InpMAPeriod, \r\n"
"__global double *price, \r\n"
"__global double *ExtLineBuffer \r\n"
") \r\n"
"{ \r\n"
"int i,limit; \r\n"
"double SmoothFactor=2.0/(1.0+InpMAPeriod); \r\n"
"if(prev_calculated==0) \r\n"
"{ \r\n"
"limit=InpMAPeriod+begin; \r\n"
"ExtLineBuffer[begin]=price[begin]; \r\n"
"for(i=begin+1;i<limit;i++) \r\n"
"ExtLineBuffer[i]=price[i]*SmoothFactor+ExtLineBuffer[i-1]*(1.0-SmoothFactor); \r\n"
"} \r\n"
"else limit=prev_calculated-1; \r\n"
"for(i=limit;i<rates_total;i++) \r\n"
"ExtLineBuffer[i]=price[i]*SmoothFactor+ExtLineBuffer[i-1]*(1.0-SmoothFactor); \r\n"
"} \r\n"
" \r\n"
"__kernel void CalculateLWMA( \r\n"
"int rates_total, \r\n"
"int prev_calculated, \r\n"
"int begin, \r\n"
"int InpMAPeriod, \r\n"
"__global double *price, \r\n"
"__global double *ExtLineBuffer \r\n"
") \r\n"
"{ \r\n"
"int i,limit; \r\n"
"int weightsum; \r\n"
"double sum; \r\n"
"if(prev_calculated==0) \r\n"
"{ \r\n"
"weightsum=0; \r\n"
"limit=InpMAPeriod+begin; \r\n"
"for(i=0;i<limit;i++) ExtLineBuffer[i]=0.0; \r\n"
"double firstValue=0; \r\n"
"for(i=begin;i<limit;i++) \r\n"
"{ \r\n"
"int k=i-begin+1; \r\n"
"weightsum+=k; \r\n"
"firstValue+=k*price[i]; \r\n"
"} \r\n"
"firstValue/=(double)weightsum; \r\n"
"ExtLineBuffer[limit-1]=firstValue; \r\n"
"} \r\n"
"else limit=prev_calculated-1; \r\n"
"for(i=limit;i<rates_total;i++) \r\n"
"{ \r\n"
"sum=0; \r\n"
"for(int j=0;j<InpMAPeriod;j++) sum+=(InpMAPeriod-j)*price[i-j];\r\n"
"ExtLineBuffer[i]=sum/weightsum; \r\n"
"} \r\n"
"} \r\n"
" \r\n"
"__kernel void CalculateSmoothedMA( \r\n"
"int rates_total, \r\n"
"int prev_calculated, \r\n"
"int begin, \r\n"
"int InpMAPeriod, \r\n"
"__global double *price, \r\n"
"__global double *ExtLineBuffer \r\n"
") \r\n"
"{ \r\n"
"int i,limit; \r\n"
"if(prev_calculated==0) \r\n"
"{ \r\n"
"limit=InpMAPeriod+begin; \r\n"
"for(i=0;i<limit-1;i++) ExtLineBuffer[i]=0.0; \r\n"
"double firstValue=0; \r\n"
"for(i=begin;i<limit;i++) \r\n"
"firstValue+=price[i]; \r\n"
"firstValue/=InpMAPeriod; \r\n"
"ExtLineBuffer[limit-1]=firstValue; \r\n"
"} \r\n"
"else limit=prev_calculated-1; \r\n"
"for(i=limit;i<rates_total;i++) \r\n"
"ExtLineBuffer[i]=(ExtLineBuffer[i-1]*(InpMAPeriod-1)+price[i])/InpMAPeriod;\r\n"
"} \r\n";
I can run this code and I get results, but it is not as fast as I expected on the GPU, because I am not able to utilize multiple cores or work-items. I tried to execute it the following way:
int offset[2]={0,0};
int work [2]={1024,1024};
if(!CLExecute(cl_krn,2,offset,work))
Print("Kernel not executed",GetLastError());
But the program failed with an error saying that the OpenCL device was not found.
Then I tried this:
int offset[2]={0,0};
int work [2]={1024,1024};
if(!CLExecute(cl_krn))
Print("Kernel not executed",GetLastError());
And my code ran smoothly, but not on multiple cores.
What can I do to make the program run with multiple work-items without failing?
EDITED
What platform? Did you install the proper OpenCL drivers? What is the output of clinfo?
Answer: I am using Windows 10 Pro. I have installed the latest OpenCL drivers from the Intel website. I believe what I installed is the SDK they provide.
Here is the output of clinfo: Gist of clinfo output
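A note on why a larger work size may not help with the kernels as written: every work-item runs the whole kernel body, and each loop iteration reads ExtLineBuffer[i-1], so the loops are serial by construction. The kernels also never index by work-item (get_global_id accepts dimensions 0 to get_work_dim()-1 and returns 0 for anything else, such as 4). The sketch below is plain C++ rather than OpenCL or MQL5, and the names SmaAt and SmaAll are hypothetical. It only illustrates a formulation of the simple moving average in which each output depends on the inputs alone, so that one work-item per index becomes possible:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the simple moving average ending at index i,
// computed only from the input prices, never from earlier outputs.
// The caller guarantees period > 0 and i >= period - 1.
double SmaAt(const std::vector<double>& price, std::size_t i, std::size_t period)
{
    double sum = 0.0;
    for (std::size_t j = 0; j < period; ++j)
        sum += price[i - j];
    return sum / static_cast<double>(period);
}

// Stand-in for a kernel launch: each loop iteration plays the role of one
// work-item whose index would come from get_global_id(0) in OpenCL.
std::vector<double> SmaAll(const std::vector<double>& price, std::size_t period)
{
    std::vector<double> out(price.size(), 0.0);
    if (period == 0 || price.size() < period)
        return out;                       // not enough data for a single window
    for (std::size_t i = period - 1; i < price.size(); ++i)
        out[i] = SmaAt(price, i, period); // iterations are independent of each other
    return out;
}
```

This trades extra arithmetic per element (a full window sum instead of a running update) for independence between elements. Whether that is faster on a given GPU is something to measure, not assume. The EMA and smoothed MA are recursive by definition and would need a different approach.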

Related

Command: delete[] x

I have the following simple code. I dynamically allocate memory for 3 doubles, assign a number to each, and then deallocate the memory. As one can see by running the code, the only difference before and after the deletion (delete[] x) is in the first double of the array. I can't understand why the content of the first element changed while x itself still holds the same memory address.
#include <iostream>
#include <cmath>
int main(int argc, char * argv[])
{
double * x;
x = new double [3];
x[0] = 1; x[1]=3; x[2]=5;
std::cout << x[0] << " " << x[1] << " " << x[2] << "\n";
std::cout << x << "\n";
delete[] x;
std::cout << x[0] << " " << x[1] << " " << x[2] << "\n";
std::cout << x << "\n";
return 0;
}
To my understanding, this is undefined behaviour; x is read after it is deleted.
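To expand on that: delete[] releases the storage but does not modify the pointer variable, which is why x still prints the same address. What the freed bytes contain afterwards is unspecified; an allocator reusing the start of the block for its own bookkeeping is one plausible reason the first element looked different, but nothing guarantees it. A minimal sketch of two ways to avoid the problem follows. The function names MakeValues and ReleaseAndReset are hypothetical:

```cpp
#include <vector>

// Option 1: let a container own the storage, so no manual delete[] is needed
// and the elements cannot be read after they are released.
std::vector<double> MakeValues()
{
    return {1.0, 3.0, 5.0};
}

// Option 2: if a raw array is required, release it and null the pointer
// so that later code can test it instead of reading freed memory.
bool ReleaseAndReset(double*& x)
{
    delete[] x;      // storage is returned; the old pointer value is now invalid
    x = nullptr;     // the variable itself is still ours to overwrite
    return x == nullptr;
}
```

After ReleaseAndReset(x), a check such as if (x) guards any later access, whereas in the original program nothing distinguishes the dangling pointer from a valid one.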

Nested loops in OpenCL kernel

I have recently started studying OpenCL and am trying to convert the following code into an efficient OpenCL kernel:
for(int i = 0; i < VECTOR_SIZE; i++)
{
for(int j = 0; j < 100; j++)
{
C[i] = sqrt(A[i] + sqrt(A[i] * B[i])) * sqrt(A[i] + sqrt(A[i] * B[i]));
}
}
This is what I have come up with so far using different tutorials. My question is: can I somehow get rid of the outer loop in my kernel? Would you say that this is an okay implementation of the above C++ code, and that nothing further can be done to make it more efficient or closer to how an OpenCL program is supposed to look?
Also, all the tutorials that I have read so far have the kernels written in a const char *. What is the reason behind this? Is this the only way OpenCL kernels are written, or do we usually write them in a separate file and then include it in our regular code?
Thanks
const char *RandomComputation =
"__kernel \n"
"void RandomComputation( "
" __global float *A, \n"
" __global float *B, \n"
" __global float *C) \n"
"{ \n"
" //Get the index of the work-item \n"
" int index = get_global_id(0); \n"
" for (int j = 0; j < 100 ; j++) \n"
" { \n"
" C[index] = sqrt(A[index] + sqrt(A[index] * B[index])) * sqrt(A[index] + sqrt(A[index] * B[index])); \n"
"} \n"
"} \n";
When you want to use a nested loop in an OpenCL kernel, use two dimensions, as in this matrix multiplication example:
__kernel void matrixMul(__global float* C,
__global float* A,
__global float* B,
int wA, int wB)
{
int tx = get_global_id(0);
int ty = get_global_id(1);
float value = 0;
for (int k = 0; k < wA; ++k)
{
float elementA = A[ty * wA + k];
float elementB = B[k * wB + tx];
value += elementA * elementB;
}
C[ty * wB + tx] = value; // C has wB columns, so its row stride is wB, not wA
}
If you need a full explanation, see here.
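On the const char * part of the question: the OpenCL runtime compiles kernels from source text at run time, so the host program only needs the source as a string, and tutorials embed it as a literal for convenience. Keeping the kernel in its own .cl file and reading it at startup is a common alternative. A minimal sketch, assuming a hypothetical helper name LoadKernelSource and a file path chosen by the caller:

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads an entire kernel source file into a string.
std::string LoadKernelSource(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        throw std::runtime_error("cannot open kernel file: " + path);
    std::ostringstream contents;
    contents << file.rdbuf();   // copy the whole file into the string stream
    return contents.str();
}
```

The returned string's c_str() can then be handed to clCreateProgramWithSource in place of the literal, which keeps the kernel editable without the quoting and "\n" noise on every line.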

How to print more than one QString on QTextEdit

Let's say we have a variable called X and we do some operations on it. Now I want to print it on the QTextEdit the way cout prints it on the console:
cout << "The value of X is " << X << endl;
But the setText function only accepts a single QString, not both "The value of ..." and X.
You can use a QTextStream to write data into a QString similar to cout:
int X = 42;
QString str;
QTextStream out(&str);
out << "The value of X is " << X << endl;
qDebug() << str;
Output:
"The value of X is 42
"
I would solve this in the following way:
QString text = QString("This is my value: %1").arg(x); // x can be either number or string
textEdit->setText(text);
If your "x" is an integer, for example, you can convert that number into a string and concatenate it with the introductory string like this:
QString myText = "This is my value: " + QString::number(x);
If x=5, this will give you this string:
This is my value: 5
You can now assign myText to your QTextEdit with setText.
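For comparison, the same stream-into-a-string idea exists in the standard library, which can be handy when the value is formatted in code that does not depend on Qt. This is a sketch of that analogue, not a Qt API; the function name DescribeValue is hypothetical:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats the message with a standard string stream,
// mirroring what cout << "The value of X is " << X would print.
std::string DescribeValue(int x)
{
    std::ostringstream out;
    out << "The value of X is " << x;
    return out.str();
}
```

The result can be converted with QString::fromStdString before being passed to setText.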

OpenCL on Xcode error: global variables must have a constant address space qualifier

I tried to compile this code in Xcode, but I get the above error:
const char *KernelSource = "\n"\
"__kernel void pi( \n"\
"__global float* out, \n"\
"uint cntSteps \n"\
") \n"\
"{ \n"\
" const uint idThread = get_local_id(0); \n"\
" uint numprocs = get_global_size(0); \n"\
" const float local_num = (float)cntSteps / numprocs; \n"\
"float step = 1.0 / cntSteps; \n"\
"float sum = 0.0; \n"\
"float x; \n"\
"int localmax = (idThread + 1) * local_num; \n"\
"for(uint i = idThread * local_num; i < localmax; i++) \n"\
"{ \n"\
"x = step * (i + 0.5); \n"\
"sum = sum + 4.0 / (1.0 + x * x); \n"\
"} \n"\
"out[idThread] = sum * step; \n"\
"} \n";
Any idea what is wrong with this kernel?
Here is how I solved it:
1. Create a kernel file, e.g. calc_pi.cl:
// calc_pi.cl
kernel void pi(global float* out, uint cntSteps) {
const uint idThread = get_local_id(0);
uint numprocs = get_global_size(0);
const float local_num = (float)cntSteps / numprocs;
float step = 1.0 / cntSteps;
float sum = 0.0;
float x;
int localmax = (idThread + 1) * local_num;
for(uint i = idThread * local_num; i < localmax; i++) {
x = step * (i + 0.5);
sum = sum + 4.0 / (1.0 + x * x);
}
out[idThread] = sum * step;
}
2. In main.cpp:
// include all necessary headers and the above kernel e.g
#include "calc_pi.cl.h"
//.. more includes
Then work with your kernel as usual.
I hope this helps someone starting to work with OpenCL in Xcode.

Invalid multipart request with 0 mime parts on Google Drive upload

I am trying to upload an image to Google Drive in Qt/C++.
My code:
void googled::newuploadSettings(QNetworkReply *reply){
QByteArray m_boundary;
m_boundary = "--";
m_boundary += QString("42 + 13").toAscii();
QByteArray data = reply->readAll();
qDebug() << data;
QString x = getValue(data,"access_token");
qDebug() << x;
x = "Bearer " + x;
qDebug() << x;
QNetworkRequest request;
QUrl url("https://www.googleapis.com/upload/drive/v2/files?uploadType=multipart");
request.setUrl(url);
request.setRawHeader("Content-Length","200000000");
QString y = "multipart/related; boundary=" + QString("42+13");
qDebug() << y;
request.setRawHeader("Content-Type",y.toAscii());
request.setRawHeader("Authorization",x.toLatin1());
QString str;
str += m_boundary;
str += "\r\n";
str += "Content-Disposition: form-data; title=\"";
str += QString("sp").toAscii();
str += "\"; ";
str += "filename=\"";
str += QFile::encodeName("kashmir");
str += "\"\r\n";
str += "Content-Length: " ;
str += QString("200000000").toAscii();
str += "\r\n";
str += "Content-Type: ";
str += QString("application/json; charset=UTF-8").toAscii();
str += "\r\n";
str += "Mime-version: 1.0 ";
str += "\r\n";
str += "\r\n";
str += "mimeType:image/jpeg";
str += "\r\n";
str += "\r\n\r\n";
str += m_boundary;
str += "Content-Type: ";
str += "image/jpeg";
QByteArray arr;
arr.append(str.toUtf8());
QFile file("/home/saurabh/Pictures/005.jpg");
file.open(QIODevice::ReadOnly);
arr.append(file.readAll());
arr.append(m_boundary);
file.close();
qDebug() << "file";
//qDebug() << str;
qDebug() << arr;
m_netM = new QNetworkAccessManager;
QObject::connect(m_netM, SIGNAL(finished(QNetworkReply *)),
this, SLOT(uploadfinishedSlot(QNetworkReply *)));
m_netM->post(request,arr);
}
You are not quoting the value of the boundary parameter. Use the following Content-Type:
QString y = "multipart/related; boundary=\"" + QString("42+13") + "\"";
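Beyond the quoting, the body in the question has a few other things worth checking against the multipart format (RFC 2046). The boundary written into the body is "42 + 13" with spaces while the header declares "42+13". The second delimiter is not followed by a line break before its Content-Type header. The body also never ends with the closing delimiter, which is the boundary followed by "--". The sketch below shows the expected layout using plain std::string so it can be checked without Qt. The function name BuildMultipartRelated and its parameters are hypothetical, and the two-part metadata-then-media layout is an assumption about what the upload endpoint expects:

```cpp
#include <string>

// Hypothetical helper: assembles a two-part multipart/related body.
// Every delimiter line is "--" + boundary; the final one also ends in "--".
std::string BuildMultipartRelated(const std::string& boundary,
                                  const std::string& metadataJson,
                                  const std::string& mediaType,
                                  const std::string& mediaBytes)
{
    const std::string crlf = "\r\n";
    std::string body;
    body += "--" + boundary + crlf;                        // first delimiter
    body += "Content-Type: application/json; charset=UTF-8" + crlf;
    body += crlf;                                          // blank line ends the part headers
    body += metadataJson + crlf;
    body += "--" + boundary + crlf;                        // second delimiter
    body += "Content-Type: " + mediaType + crlf;
    body += crlf;
    body += mediaBytes + crlf;
    body += "--" + boundary + "--" + crlf;                 // closing delimiter
    return body;
}
```

In the Qt code, the same layout can be built in a QByteArray, with the boundary passed here being exactly the string declared in the Content-Type header.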
