R Import - CSV file from password protected URL - in .BAT file

Okay - so here is what I'm trying to do.
I've got this password protected CSV file I'm trying to import into R.
I can import it fine using:
read.csv()
and when I run my code in RStudio everything works perfectly.
However, when I try to run my .R file using a batch file (Windows .bat) it doesn't work. I want to use the .BAT file so that I can set up a scheduled task to run my code every morning.
Here is my .BAT file:
"E:\R-3.0.2\bin\x64\R.exe" CMD BATCH "E:\Control Files\download_data.R" "E:\Control Files\DailyEmail.txt"
And here is my .R file:
url <- "http://username:password@www.url.csv"
data <- read.csv(url, skip=1)
** note, I've put my username/password and the exact location of the CSV in my code. I've used generic stuff here, as this is work related and posting usernames and passwords is probably frowned upon.
As I've said, this code works fine when I use it in RStudio. But fails when I use the .BAT file.
I get the following error message:
Error in download.file(url, "E:/data/data.csv") :
cannot open URL 'websiteurl'
In addition: Warning message:
In download.file(url, "E:/data/data.csv") :
unable to resolve 'username'
Execution halted
** above websiteurl is the http above (I can't post links)
So obviously, the .BAT is having trouble with the username/password? Any thoughts?
* EDIT *
I've gone so far as trying this on Linux, thinking maybe Windows was playing silly buggers.
Just from the terminal, I run Rscript -e "download_data.r" and get the EXACT same error message as I did in Windows. So I suspect this may be a problem with where I'm getting the data? Could the provider be blocking requests from the command line, but not from within RStudio?

I have had similar problems which had to do with file permissions. The .bat file somehow does not have the same privileges as you have when running the code directly from RStudio. Try using Rscript (http://stat.ethz.ch/R-manual/R-devel/library/utils/html/Rscript.html) within your .bat file like
Rscript "E:\Control Files\download_data.R"
What is the purpose of the argument "E:\Control Files\DailyEmail.txt"? Is the program supposed to use it in any way?

So, I've found a solution, which is likely not the most practical for most people, but works for me.
What I did was migrate my project over to a Linux system. Running daily scripts is easier on Linux anyway.
The solution makes use of the wget command on Linux.
You can either run wget right in your shell script, or make use of the system() function in R to run it.
The code looks like:
wget -O /home/user/.../file.csv --user=userid --password='password' http://www.url.com/file.csv
And you can do something like:
syscommand <- "wget -O /home/.../file.csv --user=userid --password='password' http://www.url.com/file.csv"
system(syscommand)
in R to download the CSV to a location on your hard drive, then grab the CSV using read.csv().
Doing it this way gave me some more insight into the potential root cause of the problem. While the system(syscommand) is running, I get the following output:
Connecting to www.website.com (www.website.com)|ip.ad.re.ss|:80... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Reusing existing connection to www.weburl.com:80.
HTTP request sent, awaiting response... 200 OK
Not sure why it has to send the request twice? And why I'm getting a 401 Unauthorized the first try?
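For anyone who wants the download-then-read steps in one place, here is a minimal sketch in R. It assumes wget is on the PATH. The function names build_wget_args and fetch_csv are made up for illustration, and the URL, user and password are the same placeholders as above. As for the 401 followed by 200: that pattern is consistent with wget's documented default of sending the credentials only after the server has asked for them (the --auth-no-challenge option changes this), so the first request goes out without them.

```r
# Build the argument vector for wget. All values are placeholders
# supplied by the caller; nothing here is specific to one provider.
build_wget_args <- function(url, dest, user, password) {
  c("-O", shQuote(dest),
    paste0("--user=", shQuote(user)),
    paste0("--password=", shQuote(password)),
    shQuote(url))
}

# Download the CSV with wget, then read it from disk.
# Extra arguments (e.g. skip = 1) are passed on to read.csv().
fetch_csv <- function(url, dest, user, password, ...) {
  status <- system2("wget", build_wget_args(url, dest, user, password))
  if (status != 0) {
    stop("wget exited with status ", status)
  }
  read.csv(dest, ...)
}
```

Something like fetch_csv("http://www.url.com/file.csv", tempfile(fileext = ".csv"), "userid", "password", skip = 1) would then return the data frame, or stop with wget's exit status if the download failed.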


Vscode-R; permission denied to access request.log file

I am learning R and followed the instructions to program R using Visual Studio Code. I then tried to run the following line of code to learn how to read data.
dat <- read.table("d.data")
View(dat)
where d.data is a data file. I received the following error:
cannot open file '...\.vscode-R/request.log': Permission denied.
I tried using the "Give Access To" command from right-clicking the file in File Explorer, however, I don't think it did anything. How do I grant the program/terminal permission to open the file? It may be significant to note that running the same commands using the radian console works without any issues (I get the data outputted in a separate window).
I found a workaround for this issue by adjusting these settings in VSCode:
"r.alwaysUseActiveTerminal": true,
"r.bracketedPaste": true,
Then, by calling radian in an open cmd terminal, I was able to load everything without any issues.

What is the filepath that a "Read CSV" operator needs to read a file from RapidMiner Server?

I have a RM Server running on a VM (Ubuntu) on top of my Win10 machine.
I have a process to read a .csv file and write its contents on a MySQL database on a MySQL Server which also runs on the same VM.
The problem is that the read file operator does not seem to be able to find the file.
Scenario1.
When I try as location-name in the read csv operator ../data/myFile.csv
and run the process on Server I am getting Failed to execute initialization process: Error executing process /apps/myApp/process/task_read_csv_to_db: The file 'java.io.FileNotFoundException: /root/../data/myFile.csv (No such file or directory)' does not exist.
Scenario2.
When I try as location-name in the read csv operator /apps/myApp/data/myFile.csv
and run the process on Server I am getting Failed to execute initialization process: Error executing process /apps/myApp/process/task_read_csv_to_db: The file 'java.io.FileNotFoundException: /apps/myApp/data/myFile.csv (No such file or directory)' does not exist.
What is the right filepath that I should give to the Read CSV operator?
Just to update with the answer: after David's suggestion, I ended up storing the .csv file outside of /rapidminer-server-home/data/repository, since every remote repository seems to be represented there by an integer instead of its original name, which makes the actual full path of the file unusable.
I would say the issue is that the relative path may vary depending on the location of the JobAgent that is executing your process.
Is /apps/myApp/data/myFile.csv the correct path to the file? If not, I would suggest using the absolute path to the file. Hope this helps.
Best,
David

R: "Error calling capture_console_output: 87" when using terminalExecute()

I am trying to run an executable called swat_edit.exe in R. It works perfectly when I run it directly in the command prompt, and also when I run it directly in the Terminal tab in R. However, when I try to write a function in R to run the executable, I get an error (I get a number of different errors...).
I have tried to use different methods of running the file:
1: I used system("swat_edit"), which returns the following error:
Unhandled Exception: System.IO.IOException: The handle is invalid.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.Console.set_CursorVisible(Boolean value)
at SWEdit.Program.Run(String[] args)
at SWEdit.Program.Main(String[] args)
[1] 17234
2: I used shell("swat_edit"), which returns the exact same error as (1).
3: I used shell.exec("swat_edit"). This works, but it opens the executable in a new window, which then runs for a few seconds and closes (as intended). I need the program to run in the R terminal window so it can run many iterations in the background without disrupting other things. This is not a viable option.
4: I tried using terminalSend(ID,"swat_edit") (from the rstudioapi package). This works in that it sends the command to the terminal window in R. When I move there and hit enter it executes perfectly, running in the terminal window like I want it to. However, I need to run many iterations so this is not viable either. I tried using KeyboardSimulator to go to the Terminal tab and hitting enter (which worked), but this also does not let me use the PC for other purposes while running my code.
5: I tried using terminalExecute("swat_edit"), which returns the following error code:
Error calling capture_console_output: 87
[Process completed]
[Exit code: -532462766]
6: I tried making a python file that runs swat_edit.exe, and then running that file in R. The python file works when I run it by itself, from the command prompt, or from the terminal in R. It does not, however, work when I try to run it in the R terminal using terminalExecute (same error as in (5)).
NOTE: I have another executable called swat.exe (entirely different program) that works with all of the above-mentioned methods.
So in summary: swat_edit.exe runs perfectly in command prompt and R terminal, but does not work when I try to run it using R code (either system(), shell(), or terminalExecute()).
I can't figure out the difference between terminalExecute() and typing the string into terminal and hitting enter, but apparently there is something happening in between...
It will be tedious to reproduce this since it uses external programs, but if anyone has any idea about the error messages or how I can copy a string and run it in the terminal without any interference, that would be greatly appreciated.
EDIT: I found a method that solves my problem. I created a .bat file that runs swat_edit minimized. I was able to run this .bat file with the shell function (or any of the other commands I mentioned) in R. This doesn't answer why I was having the issues I described, and it doesn't let me run swat_edit in the R terminal, but it's good enough for me.
The .bat file was simply the following line:
START /MIN /WAIT C:\~\SWAT_Edit.exe

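A rough sketch of that .bat workaround driven entirely from R, in case it helps someone with many iterations to run. It is Windows only, since shell() does not exist elsewhere. The helper names bat_line and run_minimized are invented here, and the executable path is whatever you pass in. The empty "" after START is the window title argument, which is needed once the path itself is quoted.

```r
# Compose the single line that goes into the wrapper .bat file.
bat_line <- function(exe_path) {
  # START treats its first quoted argument as the window title,
  # so an empty title is given before the quoted executable path.
  sprintf('START "" /MIN /WAIT "%s"', exe_path)
}

# Write the wrapper to a temporary .bat file and run it (Windows only).
run_minimized <- function(exe_path, bat_path = tempfile(fileext = ".bat")) {
  writeLines(bat_line(exe_path), bat_path)
  shell(shQuote(bat_path, type = "cmd"), wait = TRUE)
}
```

Calling run_minimized() inside a loop should then start each run in a minimized window and wait for it to finish before the next one.
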
Delete files from SFTP using R Studio

I need to delete files from an FTP site once I have processed them in R (parsing content). However, nothing I try seems to work.
This is what I've been trying, and variations of it:
library(RCurl)
curlPerform(url="sftp://user:password@sftplocation/folder/", quote="DELE filename.pdf")
curlPerform(url="ftp://xxx.xxx.xxx.xxx/", quote="DELE file.txt", userpwd = "user:pass")
Error is
Error in function (type, msg, asError = TRUE) : Unknown SFTP command
When I run the following code, I get a lovely list of all the files (which is used to download them). So I know the connection is working just great, and the parsing from the downloaded files works great!
curlPerform(url="sftp://user:password@sftplocation/folder/")
Thanks,
Siobhan
To delete over sftp, use rm instead of DELE - which looks like an ftp rather than an sftp command.
Then make sure you have the full file path. This works for me:
curlPerform(
  url = "sftp://me@host.example.com/",
  .opts = list(
    ssh.public.keyfile = pub,
    ssh.private.keyfile = pri),
  verbose = TRUE,
  quote = "rm /home/me/test/test.txt")
Note I've put my credentials in some key files so I don't put the password in plain text in the code.
I'm not convinced this is the best way to do it, since I can't stop it printing the contents of the URL... There might be an option...
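If several processed files need deleting, the same call can be wrapped up roughly like this. It is only a sketch: sftp_rm_commands and delete_remote are names invented here, pub and pri are the key file paths from the example above, and passing more than one command in quote as a character vector is my assumption about how RCurl handles list-valued options, so try it on a test file first.

```r
# Turn full remote paths into SFTP quote commands ("rm <path>").
# Paths must be absolute, as in the working example above.
sftp_rm_commands <- function(paths) {
  stopifnot(is.character(paths), all(startsWith(paths, "/")))
  paste("rm", paths)
}

# Send the rm commands over SFTP using key-based authentication.
delete_remote <- function(url, paths, pub, pri) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required")
  }
  RCurl::curlPerform(
    url = url,
    .opts = list(ssh.public.keyfile = pub,
                 ssh.private.keyfile = pri),
    # Assumption: a character vector is accepted for several commands.
    quote = sftp_rm_commands(paths)
  )
}
```

For example, delete_remote("sftp://me@host.example.com/", "/home/me/test/test.txt", pub, pri) sends the same rm command as the call above.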

osmar package in R (OpenStreetMap)

The osmar package in R has a demo file called demo("navigator"). It is provided to illustrate package capabilities and functions. When I run the script, I hit the following line and error:
R> muc <- get_osm(muc_bbox, src)
sh: osmosis: command not found
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file '/var/folders/81/4k487q0969q1d8rfd1pyhyr40000gs/T//RtmpdgZSOy/file13a473cb904c': No such file or directory
The command is intended to convert an osmosis data object to an osmar object. I have properly installed osmosis for Mac OS X and updated my path definition in the bash shell to point to the osmosis executable.
I'm not sure what the error message means and how best to respond. Any help appreciated.
Brad
Have you restarted R? It looks like osmosis isn't in your path, although you do mention that you set that. Make sure that you can run one of the osmosis commands in Terminal:
osmosis --read-xml SloveniaGarmin.osm --tee 4 --bounding-box left=15 top=46 --write-xml SloveniaGarminSE.osm --bounding-box left=15 bottom=46 --write-xml SloveniaGarminNE.osm --bounding-box right=15 top=46 --write-xml SloveniaGarminSW.osm --bounding-box right=15 bottom=46 --write-xml SloveniaGarminNW.osm
The example is irrelevant, as long as it doesn't say osmosis file not found.
Also, make sure you have gzip in your path. I am almost certain that it is default, but the demo package relies on it to run. Just open a Terminal and type gzip to make sure it is there.
Finally, if you need to debug this, then run this:
library(osmar)
download.file("http://osmar.r-forge.r-project.org/muenchen.osm.gz","muenchen.osm.gz")
system("gzip -d muenchen.osm.gz")
# At this point, check the directory listed by getwd(). It should contain muenchen.osm.
src <- osmsource_osmosis(file = "muenchen.osm",osmosis = "osmosis")
muc_bbox <- center_bbox(11.575278, 48.137222, 3000, 3000)
debug(osmar:::get_osm_data.osmosis)
get_osm(muc_bbox, src)
# Press Enter till you get to
# request <- osm_request(source, what, destination)
# Then type request to get the command it is sending.
After you type Enter once, and then request you will get the string it is sending to your OS. It should be something like:
osmosis --read-xml enableDateParsing=no file=muenchen.osm --bounding-box top=48.1507120588903 left=11.5551240885889 bottom=48.1237319411097 right=11.5954319114111 --write-xml file=<your path>
Try pasting this into your Terminal. It should work from any directory.
Oh, and type undebug(osmar:::get_osm_data.osmosis) to stop debugging. Type Q to exit the debugger.
Hey, I just got this thing working. The problem is not with the system path variable for osmosis. It is with the system call the script makes, which uses the gzip application to unzip the .gz file it has just downloaded. So the error occurs when gzip is not installed on your machine or is not on the system path. Installing gzip and adding it to the path variable will fix this error. Alternatively, you can unzip the file manually to the same path and run the script again.
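Both answers come down to whether R itself can see the external programs, which is quick to check from inside the R session. Here is a small sketch using Sys.which(); the helper name tools_on_path is made up for illustration. Keep in mind that an R session started from a GUI may not inherit the PATH you set in your bash profile, so the result can differ from what Terminal reports.

```r
# Report which external tools the current R session can find.
tools_on_path <- function(tools = c("osmosis", "gzip")) {
  found <- Sys.which(tools)       # "" for anything R cannot find
  setNames(nzchar(found), tools)  # TRUE = found, FALSE = missing
}

print(tools_on_path())

# If something is missing, compare this with `echo $PATH` in Terminal.
print(Sys.getenv("PATH"))
```

A FALSE next to osmosis or gzip means the system() call made by the demo will fail, even if the same command works in Terminal.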
