I'm setting up a parallel optimisation environment using IBM CPLEX 12.9, Julia v1.1.0 and JuMP. To start a local optimisation I'm currently using the CPLEX.jl package, which provides the connection (via C calls in the background) to optimise a model locally. Let's call this machine A.
However, I'm trying to start an optimisation on a remote machine, which means that when I start an optimisation on A, Julia should call the CPLEX installation on machine B (which has more memory, CPUs, etc.).
Looking at the CPLEX documentation, I've seen that for a local optimisation we call the function
CPXopenCPLEX(int * status_p)
provided by the library libcplex1290.so. For a remote connection, CPLEX provides another interface to connect to external servers through the function
CPXopenCPLEXremote(char const * transport, int argc, char const *const * argv, int * status_p)
The CPLEX.jl package supports only local optimisation and uses the CPXopenCPLEX() function. Looking into this package, the connection with the local CPLEX installation is made by the following call:
ccall(("CPXopenCPLEX", libcplex), Ptr{Cvoid}, (Ptr{Cint},), stats)
where libcplex = "/opt/ibm/ILOG/CPLEX_Studio129/cplex/bin/x86-64_linux/libcplex1290.so" and stats is an Array{Int32,1}. This call is found in the file cpx_env.jl of the CPLEX.jl package.
What I've tried is to implement a similar function that calls CPXopenCPLEXremote instead of CPXopenCPLEX with the correct values. My Julia 1.1 code is the following:
const libcplex = "/opt/ibm/ILOG/CPLEX_Studio129/cplex/bin/x86-64_linux/libcplex1290remote.so"
# "process" transport: launch the CPLEX worker on the remote machine over ssh
argv = ["/usr/bin/ssh", "IP_OF_REMOTE_MACHINE", "/opt/ibm/ILOG/CPLEX_Studio129/cplex/bin/x86-64_linux/cplex", "-worker=process"]
ret = ccall(("CPXopenCPLEXremote", libcplex), Ptr{Cvoid},
            (Ptr{Cchar}, Cint, Ptr{Ptr{Cchar}}, Ptr{Cint}),
            "processtransport", Int32(4), argv, stats)
The problem is that ret is a null pointer (Ptr{Nothing} @0x0000000000000000), which means that the connection did not succeed.
I'm quite sure that the problem is in the way I'm passing the arguments to ccall() when calling CPXopenCPLEXremote.
Could someone with experience in this type of call help me with the parameters?
I'm also configuring automatic authentication for the ssh connection. For now I have to enter my username and password on each ssh connection from machine A to the remote machine B. (I'll update this question later.)
Thank you all for any help. If it works, I'm going to create a CPLEXremote.jl library for the community.
best regards, Isaias
Many things could go wrong here. I don't know Julia, but here are things you could try outside of Julia to sort this out:
You definitely need a passwordless ssh connection. There is no way to supply a username/password through the CPLEX remote object API. This is mentioned in the documentation here.
On both machines, make sure not only that CPLEX is installed but also that the folder containing the various libcplex*transport.so and libcplex*worker.so libraries is in LD_LIBRARY_PATH. The remote object code has to load these libraries dynamically at runtime.
For debugging purposes set environment variable ILOG_CPLEX_REMOTE_OBJECT_TRACE to 99. This should give more information about the error that happens.
Try adding either -stdio or -namedpipes=. to the command line.
Take a look at the example cplex/examples/src/remotec/parmipopt.c. This basically does what you plan to do. It also involves user functions, so it is a bit more complicated than what you plan.
Look at the example cplex/examples/src/remotec/parbenders.c. It does more complicated things in the solution process, but the setup of the remote solvers is pretty simple. You can run this example by going to cplex/examples/x86-64_linux/static_pic and running make -f Makefile.remote remote-run-parbenders. It is a good idea to start with that and try to modify it so that it does not only run on your localhost but actually connects to the remote machine correctly. This takes Julia out of the picture. Once you have this working, go back to Julia and figure out how to invoke CPLEX from there.
Related
How can I run normal R code on SQL Server without using the Microsoft rx functions? I think the compute context "RxInSqlServer" isn't the right one? But I couldn't find good information about the other compute context options.
Is this possible with this statement?
rxSetComputeContext(ComputeContext)
Or can I only use it to run rx functions? Another option could be to set up the server connection in RStudio or Visual Studio?
My problem is: I want to analyse data from Hadoop via an ODBC connection on the SQL Server, so I would like to use the performance of the remote SQL Server and not the data in SQL Server. And then I want to analyse the Hadoop data with sparklyr.
Summary: I want to use the performance of the remote server and not the SQL Server data. So RStudio should not run locally; it should use the compute power and memory of the remote server.
Thanks!
The concept of a compute context in Microsoft R Server is, “Where will the computation be performed?”
When setting a compute context, you are telling Microsoft R Server that computation will occur either on the local machine (with the "local" or "localpar" compute contexts) or on a remote machine which has Microsoft R Server installed on it. Remote compute contexts are defined by creating a compute context object and then setting the context to that object.
For SQL Server, you would create an RxInSqlServer() object, and then call rxSetComputeContext() on that object. For Hadoop, the object would be created via the RxHadoopMR() call.
In code, it would look something like:
CC <- RxHadoopMR( < context defined here > )
rxSetComputeContext(CC)
To see usage on defining a context, please see documentation (Enter "?RxHadoopMR" in the R Client, no quotes).
Any call to an "rx" function after this will be performed on the Hadoop cluster, with no data being transferred to the client other than the results.
RxInSqlServer() would follow the same pattern.
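As a minimal sketch (the connection string below is a placeholder, not something from the question):
sqlConnString <- "Driver=SQL Server;Server=mySqlHost;Database=myDb;Trusted_Connection=True"  # hypothetical
sqlCC <- RxInSqlServer(connectionString = sqlConnString)
rxSetComputeContext(sqlCC)
# rx calls issued from here on execute inside SQL Server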
Note: To perform any remote computation, Microsoft R Server must be installed on that machine.
If you wish to run a standard R function on a remote compute context, you must wrap that function in a call to rxExec(). rxExec() is designed as an interface to parallelize any open source R function and allow for its execution on a remote context. Please see documentation (enter "?rxExec" in the R Client, no quotes) for usage.
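For example (a sketch only; myAnalysis is a made-up function, and a remote compute context such as the one above is assumed to be set already):
myAnalysis <- function(n) {
  fit <- lm(mpg ~ wt, data = mtcars[sample(nrow(mtcars), n, replace = TRUE), ])  # plain open source R
  coef(fit)
}
results <- rxExec(myAnalysis, n = 20, timesToRun = 4)  # runs the function 4 times on the remote context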
For information on efficient parallelization, please see this blog: https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2016/11/14/performance-optimization-when-using-rxexec-to-parallelize-algorithms/
You called out "without using the Microsoft rx functions", and I am interpreting this as "I would like to use open source R algorithms on data in SQL Server". With Microsoft R Server, you must use rxExec() as the interface to run open source R. If you want to use no rx functions at all, you will need to pull the data to your local machine and then use open source R. To interface with a remote context using Microsoft R Server, the bare minimum is rxExec().
This is how you will be able to achieve the first part of your ask: "How can I run normal R code on SQL Server without using the Microsoft rx functions? I think the compute context 'RxInSqlServer' isn't the right one?"
As for your second ask: "My problem is: I want to analyse data from Hadoop via an ODBC connection on the SQL Server, so I would like to use the performance of the remote SQL Server and not the data in SQL Server. And then I want to analyse the Hadoop data with sparklyr."
First, I'd like to note that with the release of Microsoft R Server 9.1, you can use sparklyr in line with an MRS Spark connection; for some examples, please see this blog: https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2017/04/19/new-features-in-9-1-microsoft-r-server-with-sparklyr-interoperability/
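If I remember the 9.1 interop correctly, the pattern is roughly the following; treat the function names and arguments here as an assumption to be checked against the linked blog:
cc <- rxSparkConnect(interop = "sparklyr")   # assumed: Spark compute context with sparklyr interop enabled
sc <- rxGetSparklyrConnection(cc)            # assumed: returns a sparklyr connection object
# sc can then be used with ordinary sparklyr/dplyr calls
rxSparkDisconnect(cc)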
Secondly, what you are trying to do is very involved. I can think of two ways that this is possible.
One is, if you have SQL Server PolyBase, you can configure SQL Server to create a virtual table referencing data in Hadoop, similar to Hive. After you have referenced your Hadoop data in SQL Server, you would use an RxInSqlServer() compute context on these tables. This would analyse the data in SQL Server and return the results to the client.
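Continuing the earlier sketch (the table and column names are placeholders, and the PolyBase external table is assumed to already exist):
rxSetComputeContext(sqlCC)                        # the RxInSqlServer context from the sketch above
extTable <- RxSqlServerData(table = "hadoopExternalTable", connectionString = sqlConnString)
rxSummary(~ someNumericColumn, data = extTable)   # computed inside SQL Server; only the results come back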
Here is a detailed blog explaining an end-to-end setup on Cloudera and SQL Server: https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2016/10/17/integrating-polybase-with-cloudera-using-active-directory-authentication/
The second, which I would NOT recommend, is untested, hacky, and has the following prerequisites:
1) Your Hadoop cluster must have OpenSSH installed and configured
2) Your SQL Server Machine must have the ability to SSH into your Hadoop Cluster
3) You must be able to place an SSH Key on your SQL Server machine in a directory which the R Services process has the ability to access
And I need to add another disclaimer here: there is no guarantee of this working, and it will likely not work. The software was not designed to operate in this fashion.
You would then do the following:
On your client machine, you would define a custom function which contains the analysis that you wish to perform; this can be an open source R function, rx functions, or a mix.
In this custom function, before calling any other R or rx functions, you would define an RxHadoopMR compute context object which points to your cluster, referencing the SSH key in the directory on the SQL Server machine as if you were executing from that machine (in the same way that you would define the RxHadoopMR object if you were to do a remote Hadoop operation from your client machine).
Within this custom function, immediately after the RxHadoopMR() object is defined, you would call rxSetComputeContext() on it.
Still in this custom function, write the actual script which will operate on the data in Hadoop.
After this function is defined, you would define an RxInSqlServer() compute context object on the client machine.
You would set your compute context to RxInSqlServer().
Then you would call rxExec() with your custom function as an input.
What this will do is execute your custom function on the SQL Server machine, which would hopefully cause it to set its compute context to your Hadoop cluster and pull the data over SSH for analysis on the SQL Server machine, returning the results to the client. A very rough sketch of that nested setup is shown below.
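Again, this is an untested sketch; the host names, directories, key path, data file, and column name are all placeholders:
hadoopAnalysis <- function() {
  hadoopCC <- RxHadoopMR(sshUsername  = "hdfsuser",
                         sshHostname  = "hadoop-edge-node",
                         sshSwitches  = "-i /var/opt/mssql/ssh/id_rsa",   # key readable by R Services
                         hdfsShareDir = "/user/RevoShare/hdfsuser",
                         shareDir     = "/var/RevoShare/hdfsuser")
  rxSetComputeContext(hadoopCC)
  hdfsData <- RxTextData("/data/input.csv", fileSystem = RxHdfsFileSystem())
  rxSummary(~ someNumericColumn, data = hdfsData)
}
rxSetComputeContext(RxInSqlServer(connectionString = sqlConnString))  # connection string as before
rxExec(hadoopAnalysis)   # the body of hadoopAnalysis runs on the SQL Server machine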
With that said, this is not how Microsoft R Server was designed to be used, and if you wish to optimize performance, please use the first option and configure PolyBase.
I have a car navigation system installed in my car and I figured out that it's running VxWorks 6.9.3.
What I'm trying to achieve is to change some hidden settings of the nav system.
Small introduction: the nav system can connect to the internet via Bluetooth. I set up a small web server whose only job is to detect the IP address of a client. I opened that site from the head unit's browser and detected the IP address of the head unit. Then I was able to scan it for open network ports.
It turned out that it has port 23 open, and I'm able to telnet to it.
It didn't require any password or login, and it reports the operating system info: Wind River VxWorks 6.9.3.
I can run various commands there, inspect the filesystem, etc.
But I don't know how I can change anything. I even found a way to transfer files between a USB key and the device.
I found that all the settings I want to change are stored in .sqlite files. Some of them are gzipped and have an .inf file with checksums. The checksum algorithm is proprietary, so I can't transfer the .sqlite files from the device to a USB key, change something, then gzip them and calculate a new checksum.
I think the OS can somehow interact with the .sqlite files in memory without gunzipping them.
So, is there any way to open an SQLite shell on the device using the VxWorks kernel shell?
If so, that would be perfect and enough to achieve everything I want.
If this can't be achieved, can somebody give me some advice on what possibilities I have from the VxWorks kernel shell?
The commands available in the VxWorks shell depend on the loaded applications and the kernel itself. From the shell you can call all "public functions" loaded by VxWorks. You enter the function call in a C-like syntax, and the shell parses the arguments, pushes them onto the stack, and jumps to the address of the function, just like a normal function call in C.
A helpful function to check whether a function exists is lkup "foo", which lists all functions containing "foo" in their name (case sensitive!). But it doesn't tell you anything about the expected parameters. If you do not pass all parameters to the function via the shell, the interpreter pushes some zeroes onto the stack before executing the function call. This may lead to very strange results and may even damage your system (depending on the function)...
If you're able to load a program, you may want to use the functions of symLib to iterate over all symbols in the VxWorks sysSymTbl.
I need to run thousands* of models on 15 machines (each with 4 cores), all Windows. I started to learn the parallel, snow and snowfall packages and read a bunch of intros, but they mainly focus on the setup of the master. There is only a little information on how to set up the worker (slave) nodes on Windows, and it is often contradictory: some say that a SOCK cluster is practically the easiest way to go, others claim that SOCK cluster setup is complicated on Windows (sshd setup) and that the best way to go is MPI.
So, what is the easiest way to set up slave nodes on Windows? MPI, PVM, SOCK or NWS? My possibly naive ideas were (listed by priority):
To use all 4 cores on the slave nodes (required).
Ideally, I need only R with some packages and a slave R script or R function that would listen on some port and wait for tasks from master.
Ideally, nodes can be added/removed dynamically from the cluster.
Ideally, the slaves would connect to the master - so I wouldn't have to list all the slaves IP's in configuration of the master.
Only 1 is 100% required; 2-4 "would be good". Is this too naive a request?
I am sorry, but I have not been able to figure this out from the available docs and tutorials. I would be grateful if you could point me to the right source.
* Note that each of those thousands of models will take at least 7 minutes, so there won't be a big communication overhead.
It's a shame how complex all these APIs (like parallel/snow/snowfall) are to work with: lots of docs, but not what you need... I have found an approach which is very simple and goes straight to the ideas I sketched!! It is Redis and the doRedis R package (as recommended here). Finally, a very simple tutorial exists! I just modified it a bit and got this:
The workers need only R, the doRedis package and this script:
require(doRedis)
redisWorker('jobs', '10.0.0.7') # IP of the server
The master needs a Redis server running (I installed the experimental Windows binaries) and this R code:
require(doRedis)
registerDoRedis('jobs')
foreach(j = 1:10, .combine = sum, .multicombine = TRUE) %dopar% {
    ... # whatever you need to run
}
removeQueue('jobs')
Adding/removing workers is fully dynamic, there is no need to specify IPs on the master, "load balancing" is automatic, it's simple, and there's no need for tons of docs! This solution fulfills all the requirements and even more - as stated in ?registerDoRedis:
The doRedis parallel back end tolerates faults among the worker processes and automatically resubmits failed tasks.
I don't know how complex this would be using parallel/snow/snowfall with SOCK/MPI/PVM/NWS, or whether it would be possible at all, but I guess very complex...
The only disadvantages of using Redis that I found:
It is a database server. I wonder whether this kind of API exists somewhere without the need to install a database server, which I otherwise don't need at all. I guess it must exist!
There is a bug in the current doRedis package ("object '.doRedisGlobals' not found") with no solution yet, and I am not able to install the old working doRedis 1.0.5 package into R 3.0.1.
Does anyone have experience creating a TCP server in C++ for calling R functions and serving the results to clients?
I implemented my own using the POCO C++ libraries, but got an error message which led me to the fact that RInside cannot be used in a multi-threaded application.
I think this is nonsense. OK, R itself is single-threaded, but there should be a way of creating a server with C++ and RInside.
You probably want Rserve, which has been doing this for a decade, rather than starting something new with our RInside -- though you could look at my RInside/Wt example for a webapp...
I'm trying to learn Ada on Linux by porting simple C++ tools to Ada.
Right now I'm trying to write a simple serial communication program that sends modem commands and waits for a signalled file descriptor using the select call.
I can't seem to find the package containing the select call - do I have to look for some platform-specific package here? Where would I find this? Am I even looking for the right thing here?
select() is an OS call specific to Unix, and thus isn't part of Ada's standard library.
You will either need to find a (non-standard) package that provides a Unix system call interface, wrap it yourself using interfacing pragmas, or take a different approach.
For the first option, I can only help a little, since I don't have a Unix system handy. A POSIX package should have it, and I believe you can find one such package (Florist) for GNAT here. I can't speak to its quality.
To make your own bindings, you'd want to check out the facilities provided for this in Appendix B of the LRM. This is kind of an advanced topic though, and should not be attempted unless you either know a lot about how your OS does its subroutine linkages, or are ready to learn.
For "a different approach", look into what your reference guide has to say about Ada's tasking and/or protected objects (not to be confused with the protected keyword in C++). For example, you might prefer to have one task whose sole job is to read incoming data from the serial port. You can synchronize with it between reads via a rendezvous, or to get really sexy, with a queue implemented via a protected object.