Problem: I need to control the order in which tasks are processed in parallel by a foreach loop. Unfortunately, foreach does not support this.
Solution in mind: Use doRedis so that the database holds all tasks that are executed in the foreach loop. To control the order, I want to override getTask via setGetTask so that tasks are fetched in a pre-specified order. However, I could not find much documentation on how to do this.
Additional Information:
There is a small paragraph on setGetTask with an example in the doRedis documentation.
getTask <- function ( queue , job_id , ...)
{
key <- sprintf("
redisEval("local x=redis.call('hkeys',KEYS[1])[1];
if x==nil then return nil end;
local ans=redis.call('hget',KEYS[1],x);
redis.call('hdel',KEYS[1],x);i
return ans",key)
}
setGetTask(getTask)
However, I think the code in the documentation is not syntactically correct (imho it is missing a " and a closing bracket ")"). I thought this was not possible on CRAN, as the documentation code is executed on submission.
Changing the getTask function does not change anything about how the workers get tasks (even when introducing obvious nonsense into the redisEval call, such as changing it to redisEval("dddddddddd(((")).
I only had access to the setGetTask function after installing the package from source, which I downloaded from the official CRAN package page for version 1.1.1 (imho this should be no different from installing it directly from CRAN).
Data: The data frame of tasks to execute looks like this:
taskName;taskQueuePosition;parameter1;parameterN
taskT;1;val1;10
taskK;2;val2;8
taskP;3;val3;7
taskA;4;val4;7
I want to use 'taskQueuePosition' to control the order: tasks with lower numbers should be executed first.
Questions:
Does anybody know any sources where I can get more information on doing this with doRedis or on setGetTask?
Does anybody know how I need to change getTask to achieve the behavior described above?
Are there any other smart ideas for controlling the order of execution in a foreach loop? Preferably ones that still let me use doRedis as the parallel backend (changing this would mean a major change in the processing, for complicated technical infrastructure reasons).
Code (for easy reproduction):
The following assumes that the redis-server is started on the local machine.
Redis DB Filling:
library(doRedis)
library(foreach)
options('redis:num'=TRUE) # needed for proper execution
REDIS_JOB_QUEUE = "jobs"
registerDoRedis(REDIS_JOB_QUEUE)
# filling up the data frame
taskDF = data.frame(taskName=c("taskT","taskK","taskP","taskA"),
taskQueuePosition=c(1,2,3,4),
parameter1=c("val1","val2","val3","val4"),
parameterN=c(10,8,7,7))
foreach(currTask=iter(taskDF, by='row'),
.verbose = T
) %dopar% {
print(paste("Executing task: ",currTask$taskName))
Sys.sleep(currTask$parameterN)
}
removeQueue(REDIS_JOB_QUEUE)
Worker:
library(doRedis)
REDIS_JOB_QUEUE = "jobs"
startLocalWorkers(n=1, queue=REDIS_JOB_QUEUE)
I was able to solve the problem and can now control the order of task execution.
Additional information:
1) There seems to be a typo in the documentation that renders the getTask example unusable. Judging by the default_getTask function in the file task.R of the package, it should probably look something like this:
getTaskDefault <- function ( queue , job_id , ...)
{
key <- sprintf("%s:%s",queue, job_id)
return(redisEval("local x=redis.call('hkeys',KEYS[1])[1];
if x==nil then return nil end;
local ans=redis.call('hget',KEYS[1],x);
redis.call('set', KEYS[1] .. '.start.' .. x, x);
redis.call('hdel',KEYS[1],x);
return ans",key))
}
It seems that the characters after the first percent sign in the first line of the function body got lost. This would explain the unbalanced brackets and quotes.
2) setGetTask still does not have any effect for me. However, when I pass the getTask function through .options.redis while the DB is being filled (as described in the package vignette), it is called successfully.
3) Point 2) means that I do not need the setGetTask function, so I can use the package from CRAN.
----- Questions -----
1) The doRedis vignette describes how a custom getTask can be set successfully.
2 and 3) When the Lua script in the getTask function is modified as shown below, the tasks are drawn from the database in the order in which they were submitted. This is not exactly what I was asking for, but due to time constraints and the fact that I have (or rather had) no idea about Lua scripting, it is imho a satisfactory solution: the order of submission is controlled by the taskQueuePosition column.
getTaskInOrder <- function ( queue , job_id , ...)
{
key <- sprintf("%s:%s",queue, job_id)
return(redisEval("
local tasks=redis.call('hkeys',KEYS[1]); -- get all tasks
local x=tasks[1]; -- get the first available task
if x==nil then -- if there are no tasks left, stop processing
return nil
end;
local xMin = 65535; -- if we have more tasks than 65535, getting the
-- task with the lowest taskID is not guaranteed to be the first one
local i = 1;
-- local iMinFound = -1;
while (x ~= nil) do -- search the array until there are no tasks left
-- print('x: ',x)
local xNum = tonumber(x);
if(xNum<xMin) then
xMin = xNum;
-- iMinFound = i;
end
i=i+1;
-- print('i is now: ',i);
x=tasks[i];
end
-- print('Minimum is task number',xMin,' found at i ', iMinFound)
x=tostring(xMin) -- convert it back to a string (maybe it would
-- be better to keep the original string somewhere,
-- in case we lose some information while converting to a number)
-- print('x is now:',x);
-- print(KEYS[1] .. '.start.' .. x, x);
-- print('');
local ans=redis.call('hget',KEYS[1],x);
redis.call('set', KEYS[1] .. '.start.' .. x, x);
redis.call('hdel',KEYS[1],x);
return ans",key))
}
Important note: I noticed that if a task is aborted, the order is disturbed: the resubmitted task (even though its task number remains the same) will be executed after the originally submitted tasks. This is okay for me.
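The selection step of the Lua script above can also be sketched outside Redis. The following is only an illustration in Python, assuming the task hash is modelled as a plain dict whose keys are task IDs stored as strings; pop_lowest_task is a hypothetical helper and not part of doRedis. Unlike the script, it keeps the original key string for the lookup and has no 65535 ceiling.

```python
def pop_lowest_task(tasks):
    """Remove and return (key, payload) for the numerically smallest key.

    `tasks` is a stand-in for the Redis hash: keys are task IDs stored
    as strings, values are the serialized task payloads.
    Returns None when no tasks are left.
    """
    if not tasks:
        return None
    # Compare the keys numerically, but keep the original string so the
    # lookup does not depend on a number-to-string round trip.
    key = min(tasks, key=int)
    return key, tasks.pop(key)


if __name__ == "__main__":
    queue = {"10": "task J", "2": "task B", "1": "task A"}
    while (item := pop_lowest_task(queue)) is not None:
        print(item)
```

Comparing with key=int matters here: a plain string comparison would sort "10" before "2".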
------ Code (for easy reproduction):------
This leads to the following code example (with 12 entries in the task data frame instead of the original 4):
Redis DB Filling:
library(doRedis)
library(foreach)
options('redis:num'=TRUE) # needed for proper execution
REDIS_JOB_QUEUE = "jobs"
getTaskInOrder <- function ( queue , job_id , ...)
{
...like above
}
registerDoRedis(REDIS_JOB_QUEUE)
# filling up the data frame already in order of tasks to be executed
# otherwise the dataframe has to be sorted by taskQueuePosition
taskDF = data.frame(taskName=c("taskA","taskB","taskC","taskD","taskE","taskF","taskG","taskH","taskI","taskJ","taskK","taskL"),
taskQueuePosition=c(1,2,3,4,5,6,7,8,9,10,11,12),
parameter1=c("val1","val2","val3","val4","val1","val2","val3","val4","val1","val2","val3","val4"),
parameterN=c(5,5,5,4,4,4,4,3,3,3,2,2))
foreach(currTask=iter(taskDF, by='row'),
.verbose = T,
.options.redis = list(getTask = getTaskInOrder)
) %dopar% {
print(paste("Executing task: ",currTask$taskName))
Sys.sleep(currTask$parameterN)
}
removeQueue(REDIS_JOB_QUEUE)
Worker:
library(doRedis)
REDIS_JOB_QUEUE = "jobs"
startLocalWorkers(n=1, queue=REDIS_JOB_QUEUE)
Another note: in case you are processing long jobs, as I do, please be aware of a bug in doRedis 1.1.1 (the current version on CRAN) that leads to tasks being resubmitted (due to a timeout) even though the workers are still working on them.
I want to execute a batch file using PeopleCode in an Application Engine program, but Exec returns a non-zero exit code (Value - 1).
The PeopleCode snippet is below.
Global File &FileLog;
Global string &LogFileName, &Servername, &commandline;
Local string &Footer, &ScriptName;
Local number &ExitCode;
If &Servername = "PSNT" Then
&ScriptName = "D: && D:\psoft\PT854\appserv\prcs\RNBatchFile.bat";
End-If;
&commandline = &ScriptName;
/* Need to commit work or Exec will fail */
CommitWork();
&ExitCode = Exec("cmd.exe /c " | &commandline, %Exec_Synchronous + %FilePath_Absolute);
If &ExitCode <> 0 Then
MessageBox(0, "", 0, 0, ("Batch File Call Failed! Exit code returned by script was " | &ExitCode));
End-If;
Any help on how to resolve this issue would be appreciated.
Your best bet is to trace the execution.
Thoughts:
Can you log on to the process scheduler you are running this on and execute the script successfully?
Is the AE being scheduled or called at run-time?
You should not need to change directory, as you are using a fully qualified path to the script.
You should not need to call "cmd /c", as this creates an additional shell for your application to run within, making debugging harder, etc.
Run a trace and post the output. :) HTH
What about changing the working directory to D: inside of the script instead? You are invoking two commands and I'm wondering what the shell is returning to exec. I'm assuming you wrote your script to give the appropriate return code and that isn't the problem.
I couldn't tell from the question text, but are you looking for a negative result, such as -1? I think return codes are usually positive: 0 for success, some other positive number for failure. Negative numbers may be acceptable, but I am wondering whether Exec doesn't like negative numbers.
Perhaps the PeopleCode ChDir function still works as an alternative to two commands in one line? I haven't tried it for a LONG time.
Another alternative that gives you significant control over the process is to use java.lang.Runtime.exec from PeopleCode: http://jjmpsj.blogspot.com/2010/02/exec-processes-while-controlling-stdin.html.
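To illustrate why a single command with an explicit working directory is easier to reason about than chaining "D: && script", here is a small sketch in Python rather than PeopleCode. It is only an analogy under stated assumptions: run_in_dir is a hypothetical helper, and a child Python process stands in for the batch file. The point is that the exit code you get back belongs to the one command you started, with no intermediate shell in between.

```python
import subprocess
import sys


def run_in_dir(args, workdir):
    """Run one command with `workdir` as its working directory.

    Returns the command's own exit code; no extra shell and no chained
    change-directory command sit between the caller and the child.
    """
    completed = subprocess.run(args, cwd=workdir)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for the batch file: a child process that exits with code 3.
    child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_in_dir(child, "."))
```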
I am attempting to design a front end GUI for a CLI program by the name of eac3to.exe. The problem as I see it is that this program sends all of its output to a cmd window. This is giving me no end of trouble because I need to get a lot of this output into a GUI window. This sounds easy enough, but I am beginning to wonder whether I have found one of AutoIt's limitations?
I can use the Run() function with a Windows internal command such as Dir and then get the output into a variable with the AutoIt StdoutRead() function, but I just can't get the output from an external program such as eac3to.exe - it just doesn't seem to work whatever I do! Just for testing purposes, I don't even need to get the output into a GUI window: just printing it with ConsoleWrite() is good enough, as this proves that I was able to read it into a variable. So at this stage that's all I need to do - get the text (usually about 10 lines) that has been output to a cmd window by my external CLI program into a variable. Once I can do this, the rest will be a lot easier. This is what I have been trying, but it never works:
Global $iPID = Run("C:\VIDEO_EDITING\eac3to\eac3to.exe","", @SW_SHOW)
Global $ScreenOutput = StdoutRead($iPID)
ConsoleWrite($ScreenOutput & @CRLF)
After running this script, all I get from ConsoleWrite() is a blank line, not the text that was output as a result of running eac3to.exe (running eac3to without any arguments just lists a screen of help text covering all the command-line options). That text is what I am trying to get into a variable so that I can put it to use later in the program.
Before I suggest a solution, let me just tell you that AutoIt has one of the best help files out there. Use it.
You are missing the $STDOUT_CHILD flag, which provides a handle to the child's STDOUT stream.
Also, you can't just call Run() and immediately call StdoutRead(). At what point did you give the app some time to do anything and actually print something back to the console?
You need to either use ProcessWaitClose() and read the stream afterwards, or read the stream in a loop. The simplest check would be to put a sleep between Run() and StdoutRead() and see what happens.
#include <AutoItConstants.au3>
Global $iPID = Run("C:\VIDEO_EDITING\eac3to\eac3to.exe","", @SW_SHOW, $STDOUT_CHILD)
; Wait until the process has closed using the PID returned by Run.
ProcessWaitClose($iPID)
; Read the Stdout stream of the PID returned by Run. This can also be done in a while loop. Look at the example for StderrRead.
; If the process doesn't end when finished, you need to put this inside a loop.
Local $ScreenOutput = StdoutRead($iPID)
ConsoleWrite($ScreenOutput & @CRLF)
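The same two requirements (ask for a handle to the child's stdout, then wait before reading) apply in other languages too. As a rough analogy only, here is how the pattern looks in Python; read_child_stdout is a hypothetical helper, and a child Python process stands in for eac3to.exe.

```python
import subprocess
import sys


def read_child_stdout(args):
    """Start a child with its stdout piped, wait for it, return the text."""
    # stdout=PIPE plays the role of $STDOUT_CHILD: without it there is
    # no stream to read from.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, text=True)
    # communicate() waits for the process to end and drains the pipe,
    # so nothing is read before the child has written its output.
    output, _ = proc.communicate()
    return output


if __name__ == "__main__":
    print(read_child_stdout([sys.executable, "-c", "print('help text')"]))
```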
I have a Rails 4.0.0 app set up with a model called episode, which mounts a carrierwave uploader called file_uploader to upload mp3s. I set up my app using carrierwave_backgrounder and resque to background the processing of the uploaded files, which are saved to an sftp server using the carrierwave-ftp gem. On my local machine it works great. On my vps (CentOS 6) it also works great when I just start up the app using rails s or even rails s -e production. However, when I switch to nginx + passenger, it no longer works as expected.
The files are uploaded to the /public/uploads/tmp dir, where they are supposed to be stored temporarily, but they never get moved into the upload dir that I have specified, and none of the other post-processing gets done, like setting content type, removing cache dirs, setting file size and length, etc.
So, yesterday, I switched from using the carrierwave_backgrounder command save_in_background to process_in_background and now it works fine for files stored locally, however, when I switch to sftp storage using the carrierwave-ftp gem, the files get processed, i.e., they are transferred to my sftp server and the path is stored in my model, but then the job hangs in the Resque queue.
The relevant code that is not getting executed is:
process :set_content_type
process :save_content_type_duration_and_size_in_model
Does anyone have any idea why this would work fine using development mode and even production mode but not using nginx + passenger?
Here's all the relevant code below:
episode.rb:
class Episode < ActiveRecord::Base
require 'carrierwave/orm/activerecord'
# require 'mp3info'
mount_uploader :file, FileUploader
process_in_background :file
belongs_to :podcast
validates :name, :podcast, :file, presence: true
default_scope { order("created_at DESC") }
scope :most_recent, ->(max = 5) { limit(max) }
end
file_uploader.rb:
# encoding: utf-8
class FileUploader < CarrierWave::Uploader::Base
include CarrierWave::MimeTypes
include ::CarrierWave::Backgrounder::Delay
storage :sftp
# Override the directory where uploaded files will be stored.
# This is a sensible default for uploaders that are meant to be mounted:
def store_dir
"#{model.podcast.name.to_s.downcase.parameterize}"
end
before :store, :remember_cache_id
after :store, :delete_tmp_dir
# This is the relevant code that is not getting executed
process :set_content_type
process :save_content_type_duration_and_size_in_model
def save_content_type_duration_and_size_in_model
model.content_type = file.content_type if file.content_type
model.file_size = file.size
Mp3Info.open(model.file.current_path) do |media|
model.duration = media.length
end
end
# store! nil's the cache_id after it finishes so we need to remember it for deletion
def remember_cache_id(new_file)
@cache_id_was = cache_id
end
def delete_tmp_dir(new_file)
# make sure we don't delete other things accidentally by checking the name pattern
if @cache_id_was.present? && @cache_id_was =~ /\A[\d]{8}\-[\d]{4}\-[\d]+\-[\d]{4}\z/
FileUtils.rm_rf(File.join(root, cache_dir, @cache_id_was))
end
end
end
config/initializers/carrierwave_backgrounder.rb:
CarrierWave::Backgrounder.configure do |c|
c.backend :resque, queue: :carrierwave
end
config/initializers/carrierwave.rb:
CarrierWave.configure do |config|
config.sftp_host = "ftphost.com"
config.sftp_user = "ftp_user"
config.sftp_folder = "ftp_password"
config.sftp_url = "http://url.com"
config.sftp_options = {
:password => "ftp_password",
:port => 22
}
end
I'm starting Resque with the command: QUEUE=* bundle exec rake environment resque:work &
If you need more info, just ask. Any help would be greatly appreciated.
UPDATE: Well, oddly enough as is often the case, it is now magically working. Not sure what did the trick, so I'm afraid this won't be of any help to anyone else who stumbles on this page.
I have the same issue. My process blocks run in development (rails s) but not under apache2/passenger. It's not pretty, but the way I solved it was to move my process code into the after :cache callback. The process blocks are called between the before and after cache callbacks, so this seemed reasonable to me.
Here's the super weird part: I don't mean calling the functions, I mean copying the code out of your process blocks (or functions) and pasting it directly into your after_cache callback.
I know I'm doing something wrong to cause this situation, but I cannot figure it out. Hope this helps you.
version :office_preview do
# comment out the following since it does nothing under Passenger
#process :office_to_img
end
def office_to_img
# this won't be called under passenger :(
end
after :cache, :after_cache
def after_cache(file)
#for some reason, calling it here doesn't do anything
#office_to_img
# code copied and pasted here from office_to_img
end
I'm trying to write a Lua script that reads input from other processes and analyzes it. For this purpose I'm using io.popen, and it works as expected on Windows, but on Unix (Solaris) reading from io.popen blocks, so the script just waits there until something comes along instead of returning immediately...
As far as I know, I can't change the behavior of io.popen from within the script, and if at all possible I would rather not have to change the C code, because the script would then need to be bundled with the patched binary.
Does that leave me with any command-line solutions?
OK, I got no answers so far, but for posterity, in case someone needs a similar solution, I did more or less the following:
function my_popen(name,cmd)
local process = {}
process.__proc = assert(io.popen(cmd..">"..name..".tmp", 'r'))
process.__file = assert(io.open(name..".tmp", 'r'))
process.lines = function(self)
return self.__file:lines()
end
process.close = function(self)
self.__proc:close()
self.__file:close()
end
return process
end
proc = my_popen("somename","some command")
while true do
--do stuff
for line in proc:lines() do
print(line)
end
--do stuff
end
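For comparison, the same redirect-to-a-file idea can be sketched in Python. This is only an illustration of the technique, not a port of the Lua code: start_with_logfile is a hypothetical helper, and the temp file is created with the standard tempfile module instead of a name chosen by the caller.

```python
import subprocess
import sys
import tempfile


def start_with_logfile(args):
    """Start `args` with its stdout redirected to a temp file.

    Returns (process, reader). Reading from `reader` yields whatever the
    child has written so far and returns an empty result instead of
    blocking when nothing new is available.
    """
    log = tempfile.NamedTemporaryFile(suffix=".tmp", delete=False)
    proc = subprocess.Popen(args, stdout=log)
    log.close()  # the child keeps its own handle to the file
    reader = open(log.name, "r")
    return proc, reader


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "print('one'); print('two')"]
    proc, reader = start_with_logfile(cmd)
    proc.wait()
    for line in reader:
        print(line.rstrip())
    reader.close()
```

The temp file is left on disk in this sketch; a real script would remove it once the reader is closed.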
Your problem seems to be related to buffering. For some reason the pipe waits for some data to be read before it allows the opened program to write more to it, and that amount seems to be less than a line. What you can do is use io.popen(cmd):read("*a") to read everything. This should avoid the buffering problem. Then you can split the returned string into lines with for line in string.gmatch(output, "[^\n]+") do something_with(line) end, where output is the returned string.
Your solution consists of dumping the output of the process to a file and reading that file. You can replace your use of io.popen with os.execute and discard the return value (just check that it's 0).
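The read-everything-then-split approach can be illustrated in Python with the same "[^\n]+" pattern. This is a sketch under assumptions: output_lines is a hypothetical helper, and it only fits commands that terminate, since the whole output is collected before any line is returned.

```python
import re
import subprocess
import sys


def output_lines(args):
    """Run a command to completion and return its non-empty output lines."""
    text = subprocess.run(args, capture_output=True, text=True).stdout
    # Same pattern as the gmatch call: runs of non-newline characters,
    # so empty lines are skipped.
    return re.findall(r"[^\n]+", text)


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "print('a'); print(); print('b')"]
    for line in output_lines(cmd):
        print(line)
```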