I am trying to automate some basic git operations from within an R script. I am using RStudio on Windows. This may be helpful, for example, if you wish to update GitHub when a script finishes performing some automated task.
I wrote some simple functions that use R's shell() function and the Windows & command separator to send a chain of commands to the OS terminal:
# Git status.
gitstatus <- function(dir = getwd()){
cmd_list <- list(
cmd1 = tolower(substr(dir,1,2)),
cmd2 = paste("cd",dir),
cmd3 = "git status"
)
cmd <- paste(unlist(cmd_list),collapse = " & ")
shell(cmd)
}
# Git add.
gitadd <- function(dir = getwd()){
cmd_list <- list(
cmd1 = tolower(substr(dir,1,2)),
cmd2 = paste("cd",dir),
cmd3 = "git add --all"
)
cmd <- paste(unlist(cmd_list),collapse = " & ")
shell(cmd)
}
# Git commit.
gitcommit <- function(msg = "commit from Rstudio", dir = getwd()){
cmd_list <- list(
cmd1 = tolower(substr(dir,1,2)),
cmd2 = paste("cd",dir),
cmd3 = paste0("git commit -am ","'",msg,"'")
)
cmd <- paste(unlist(cmd_list),collapse = " & ")
shell(cmd)
}
# Git push.
gitpush <- function(dir = getwd()){
cmd_list <- list(
cmd1 = tolower(substr(dir,1,2)),
cmd2 = paste("cd",dir),
cmd3 = "git push"
)
cmd <- paste(unlist(cmd_list),collapse = " & ")
shell(cmd)
}
My gitstatus, gitadd, and gitpush functions work. The gitcommit function does not work. It generates the following error:
fatal: Paths with -a does not make sense.
Warning message:
In shell(cmd) : 'd: & cd D:/Documents/R/my_path & git commit -am 'commit from Rstudio'' execution failed with error code 128
I know the gitpush function works because, if I commit the changes from the terminal or the Git pane within RStudio, I can then call gitpush successfully.
Any ideas on how to fix this issue?
...
Note: I have Git bash installed, and I can successfully use git from the Windows command terminal and Rstudio. I also tried an alternative strategy which was to have R write a temporary .bat file and then execute this, but this strategy also gets hung up on the commit step.
Solution
The answer lay in the addrepo function of Dirk Eddelbuettel's drat package. It was also necessary to use git2r's config function to ensure that git recognizes R. git2r's functions probably provide a more robust solution for working with git from an R script in the future. In the meantime, here's how I fixed the problem.
Install git2r. Use git2r::config() to ensure git recognizes R.
Starting from Dirk's code, I modified the gitcommit() function to use sprintf() and system() to execute a system command:
# Git commit.
gitcommit <- function(msg = "commit from Rstudio", dir = getwd()){
cmd = sprintf("git commit -m\"%s\"",msg)
system(cmd)
}
The output of sprintf() looks like this:
[1] "git commit -m\"commit from Rstudio\""
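A likely explanation for the original failure is that the Windows command interpreter does not treat single quotes as quoting characters, so 'commit from Rstudio' is split into several words and git reads the extra words as paths. As a minimal sketch (build_commit_cmd is a hypothetical helper, not part of drat or git2r), base R's shQuote() can add the double quotes instead of hand-written escapes:

```r
# Hypothetical helper: build the commit command with Windows-style
# (double-quote) quoting around the message.
build_commit_cmd <- function(msg) {
  paste("git commit -m", shQuote(msg, type = "cmd"))
}

# Prints: git commit -m "commit from Rstudio"
cat(build_commit_cmd("commit from Rstudio"), "\n")
```

The resulting string can be passed to system() in the same way as the sprintf() version.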
Example
#install.packages("git2r")
library(git2r)
# Ensure you have navigated to a directory with a git repo.
dir <- "mypath"
setwd(dir)
# Configure git.
git2r::config(user.name = "myusername",user.email = "myemail")
# Check git status.
gitstatus()
# Download a file.
url <- "https://i.kym-cdn.com/entries/icons/original/000/002/232/bullet_cat.jpg"
destfile <- "bullet_cat.jpg"
download.file(url,destfile)
# Add and commit changes.
gitadd()
gitcommit()
# Push changes to github.
gitpush()
Well, the pic looks wonky, but I think you get the point.
From what I have read in this Ask Ubuntu question, you should be using &&, not &, to separate multiple commands in Git Bash. Try doing that:
gitcommit <- function(msg = "commit from Rstudio", dir = getwd()) {
cmd_list <- list(
cmd1 = tolower(substr(dir, 1, 2)),
cmd2 = paste("cd", dir),
cmd3 = paste0("git commit -am ", "'", msg, "'")
)
cmd <- paste(unlist(cmd_list),collapse = " && ")
shell(cmd)
}
Note that your gitcommit function will output something like this:
/v & cd /var/www/service/usercode/255741827 & git commit -am 'first commit'"
I don't know what purpose the substr(dir, 1, 2) portion serves, but if my suggestion still doesn't work, then try removing it to leave just the cd and git commit commands.
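If the drive-letter switch and the cd step turn out to be the source of trouble, one way to avoid both is git's -C option, which tells git to run as if it had been started in the given directory. The sketch below assumes git is on the PATH; git_args and run_git are hypothetical helper names, and the Windows-style quoting (type = "cmd") is an assumption that matches the question's setup:

```r
# Hypothetical helper: prepend "-C <dir>" so no cd or drive switch is needed.
git_args <- function(args, dir = getwd()) {
  c("-C", shQuote(dir, type = "cmd"), args)
}

# Hypothetical wrapper: run git with the arguments built above.
run_git <- function(args, dir = getwd()) {
  system2("git", git_args(args, dir))
}

# Show the arguments that would be passed for a status call.
print(git_args("status", "D:/Documents/R/my_path"))
```

With this approach a commit would be run_git(c("commit", "-m", shQuote("my message", type = "cmd"))), with no command chaining at all.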
I personally like the syntax and simplicity of the gert package from rOpenSci. Specifically, to commit a repo you would do:
git_add("test.txt")
git_commit("Adding a file", author = "jerry <jerry@gmail.com>")
To push to a remote:
git_push(remote = "origin", repo = ".")
And see all other useful functions with this simple syntax.
Related
I am trying to create Mapbox tiles using mapboxapi::tippecanoe() in R. Unfortunately, my work computer runs Windows 10, which greatly complicates what I am trying to do. Tippecanoe is a Unix executable, so I installed Ubuntu and am running it under the Windows Subsystem for Linux (WSL). To get tippecanoe to launch, I had to edit the source code of mapboxapi::tippecanoe() to pass arguments to WSL. I then ran into an issue where tippecanoe gave me an error that it could not open database files. Some research on GitHub led me to believe that this was related to the open-files limit in Ubuntu. After a lot of digging, I was able to increase ulimit -n to 65535 in my Ubuntu terminal: as soon as I launch Ubuntu and type ulimit -n, I get 65535. However, when I call system2("wsl", "ulimit -n"), I get the default value of 1024. I thought this was due to the user that R was calling in Ubuntu, but running system2("wsl", "whoami") returned the same username for which I had increased both the hard and soft nofile limits. I am really stumped. Apologies for not pasting a reproducible example, but I am not sure how to make one for this situation. Any help would be much appreciated. Thanks!
Well, after a whole lot of tinkering, this ended up being mostly a straightforward R code issue. The ulimit issue may still have been a problem, but what I actually needed to fix was the R code in mapboxapi::tippecanoe(). Because mapboxapi::tippecanoe() uses the system() command to call tippecanoe, I needed to change the call to invoke WSL through a login shell using system2("wsl", "-d Ubuntu -lc 'tippecanoe <arguments to tippecanoe>'"), and I also needed to change the paths that R sent to tippecanoe to be Linux paths instead of Windows paths. If anyone else is having trouble with this, here is the tweaked mapboxapi::tippecanoe() code that actually worked for me:
tippecanoe2<-function (input, output, layer_name, min_zoom = NULL,
max_zoom = NULL, drop_rate = NULL, overwrite = TRUE, other_options = NULL,
keep_geojson = FALSE)
{
check_install <- system2("wsl", "tippecanoe -v") == 0
linux_dir<-paste(getwd(), layer_name, sep="/")#make a directory in your linux directory for the .mbtiles
parsed<-strsplit(linux_dir, split="/") #parse the windows directory path
n<-length(parsed[[1]])
dir_out<-paste("",parsed[[1]][n-1], parsed[[1]][n], sep="/") #construct the linux directory path
dir.create(linux_dir)
op<-options(useFancyQuotes = FALSE)
if (!check_install) {
rlang::abort(c("tippecanoe is not installed or cannot be found by the application you are using to run mapboxapi.",
"If you haven't installed tippecanoe, please visit https://github.com/mapbox/tippecanoe for installation instructions.",
"If you have installed tippecanoe, run `Sys.getenv('PATH')` and make sure your application can find tippecanoe. If it cannot, adjust your PATH accordingly."))
}
opts <- c()
if (!is.null(min_zoom)) {
opts <- c(opts, sprintf("-Z%s", min_zoom))
}
if (!is.null(max_zoom)) {
opts <- c(opts, sprintf("-z%s", max_zoom))
}
if (is.null(min_zoom) && is.null(max_zoom)) {
opts <- c(opts, "-zg")
}
if (!is.null(drop_rate)) {
opts <- c(opts, sprintf("-r%s", drop_rate))
}
else {
opts <- c(opts, "-as")
}
if (overwrite) {
opts <- c(opts, "-f")
}
collapsed_opts <- paste0(opts, collapse = " ")
if (!is.null(other_options)) {
extra_opts <- paste0(other_options, collapse = " ")
collapsed_opts <- paste(collapsed_opts, extra_opts)
}
dir <- linux_dir
if (any(grepl("^sf", class(input)))) {
input <- sf::st_transform(input, 4326)
if (is.null(layer_name)) {
layer_name <- stringi::stri_rand_strings(1, 6)
}
if (keep_geojson) {
outfile <- paste0(layer_name, ".geojson")
path <- file.path(dir_out, outfile)
sf::st_write(input, path, quiet = TRUE, delete_dsn = TRUE,
delete_layer = TRUE)
}
else {
tmp <- tempdir("//wsl$/Ubuntu/tmp")#Here you would need to tweak to the file path for your linux distribution's temporary directory
tempfile <- paste0(layer_name, ".geojson")
path <- file.path(tmp, tempfile)
sf::st_write(input, path, quiet = TRUE, delete_dsn = TRUE,
delete_layer = TRUE)
}
call <- sprintf("tippecanoe -o %s/%s %s %s", dir_out, output,
collapsed_opts, path)
call2<-paste("-d Ubuntu /bin/bash -lc", sQuote(call, op), sep=" ")
system2("wsl", call2)
}
else if (inherits(input, "character")) {
if (!is.null(layer_name)) {
collapsed_opts <- paste0(collapsed_opts, " -l ",
layer_name)
}
call <- sprintf("tippecanoe -o %s/%s %s %s", dir_out, output,
collapsed_opts, input)
call2<-paste("-d Ubuntu /bin/bash -lc", sQuote(call, op), sep=" ")
system2("wsl", call2)
}
}
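The Windows-to-Linux path translation described above can also be sketched on its own. This is not what the function above does; it is a minimal illustration that assumes WSL's default layout, in which Windows drives are mounted under /mnt/<drive letter>. The name win_to_wsl_path is hypothetical, and the helper handles a single path at a time:

```r
# Hypothetical helper: translate one Windows path to its default WSL mount path.
win_to_wsl_path <- function(path) {
  path <- gsub("\\\\", "/", path)  # backslashes -> forward slashes
  m <- regmatches(path, regexec("^([A-Za-z]):/(.*)$", path))[[1]]
  if (length(m) == 0) return(path)  # no drive letter: return unchanged
  paste0("/mnt/", tolower(m[2]), "/", m[3])
}

print(win_to_wsl_path("C:/Users/me/tiles"))
```

Paths that do not start with a drive letter are returned unchanged, so the helper can be applied to paths that are already in Linux form.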
First of all, I use a remote R interpreter.
When I unselect "Disable .Rprofile execution on console start" in the DataSpell settings and save, the IDE throws a weird error when I try to start an R console:
CompositeException (3 nested):
------------------------------
[1]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.ui.remote.JupyterRemoteTreeModelServiceVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
[2]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.remote.modules.JupyterRemoteEphemeralModuleManagerVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
[3]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.ui.remote.JupyterRemoteVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
------------------------------
I tried to give an empty .Rprofile file. Nothing changed. It throws the same error. Anyway, here is my .Rprofile file:
options(java.parameters = "-Xmx4G")
options(download.file.method = "wget")
project_base <- getwd()
print(paste("getwd:", getwd()))
Sys.setenv(R_PACKRAT_CACHE_DIR = "~/.rcache")
#### -- Packrat Autoloader (version 0.7.0) -- ####
source("packrat/init.R")
#### -- End Packrat Autoloader -- ####
# These ensure that the project uses its private library
p <- .libPaths()[[1]]
Sys.setenv(R_LIBS_SITE = p)
Sys.setenv(R_LIBS_USER = p)
Sys.setenv(R_PACKRAT_DEFAULT_LIBPATHS = p)
packrat::set_opts(use.cache = TRUE)
print(paste("whoami:", system("whoami", intern = TRUE)))
print(paste("libpaths:", .libPaths()))
print(paste0("cache_path: ", packrat:::cacheLibDir()))
restore_packrat <- function(restart = FALSE) {
packrat::restore(
overwrite.dirty = TRUE, prompt = F, restart =
restart, dry.run = F
)
}
snapshot_packrat <- function() {
packrat::snapshot(
ignore.stale = TRUE, snapshot.sources = FALSE,
infer.dependencies = FALSE
)
}
I appreciate the help of anyone who faced this issue and solved it.
PS: I also issued a bug report to the developers. If you have the same problem, please upvote the issue and this question.
https://youtrack.jetbrains.com/issue/R-1393
My issue is this: when I run the following code on one laptop with Rscript.exe via Task Scheduler, I get the desired output; that is, the email is sent. But when I run the same code on another machine with Rscript.exe via Task Scheduler, it doesn't run. The second machine is able to send emails (when only the email code is run), so I think the issue is with the following part:
results <- get_everything(query = q, page = 1, page_size = 2, language = "en", sort_by = "popularity", from = Yest, to = Today)
I am unable to find what is the issue here. Can someone please help me with this?
My code is:
library(readxl)
library(float)
library(tibble)
library(string)
library(data.table)
library(gt)
library(tidyquant)
library(condformat)
library(xtable)
library(plyr)
library(dplyr)
library(newsanchor)
library(blastula)
Today <- Sys.Date()
Yest <- Sys.Date()-1
results <- get_everything(query = "Inflation", page = 1, page_size = 2, language =
"en", sort_by = "popularity", from = Yest, to = Today, api_key =
Sys.getenv("NEWS_API_KEY"))
OP <- results$results_df
OP <- OP[-c(1, 5:9)]
colnames(OP) <- c("News Title", "Description", "URL")
W <- print(xtable(OP), type="html", print.results=FALSE, align = "l")
email1 <-
compose_email(
body = md(
c("<tr>", "<td>", "<table>", "<tr>", "<td>", "<b>", "Losers News", "</b>", W,
"</td>", "</tr>", "</table>","</td>", "<td>")
)
)
email1 %>%
smtp_send(
from = "abc@domain.com",
to = "pqr@domain.com",
subject = "Hello",
credentials = creds_key(
"XYZ"
)
)
Whenever you schedule jobs, consider using a command-line shell such as PowerShell or Bash to handle the automation steps and to capture and log errors and messages. Rscript fails on the second machine for a reason you cannot determine, because Task Scheduler does not show you any error messages from the console.
Therefore, consider using PowerShell to run all needed Rscript.exe calls and other commands, and capture all errors in a date-stamped log file. The script below redirects all console output, including messages, to a .log file. When an Rscript command fails, the log will contain the error and any other console output (e.g., head, tail) below the step heading. Check the logs regularly after scheduled jobs.
PowerShell script (save as .ps1 file)
cd "C:\path\to\scripts"
& {
echo "`nAutomation Start: $(Get-Date -format 'u')"
echo "`nSTEP 1: myscript.R - $(Get-Date -format 'u')"
Rscript myscript.R
# ... ADD ANY OTHER COMMANDS ...
echo "`nAutomation End: $(Get-Date -format 'u')"
} 3>&1 2>&1 > "C:\path\to\logs\automation_run_$(Get-Date -format 'yyyyMMdd').log"
Command Line (to be used in Task Scheduler)
Powershell.exe -executionpolicy remotesigned -File myscheduler.ps1
Note: Either change directory in TaskScheduler job settings where myscheduler.ps1 resides or run absolute path in -File argument.
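The same idea can be applied from inside the R script as a complement to the PowerShell log. The following is a minimal sketch, not part of the answer above: run_logged is a hypothetical helper that evaluates an expression, appends start and end lines to a log file, and records the error message if the expression fails.

```r
# Hypothetical helper: evaluate `expr`, logging start, end, and any error.
run_logged <- function(expr, log_file) {
  con <- file(log_file, open = "at")  # append in text mode
  on.exit(close(con))
  writeLines(sprintf("[%s] start", format(Sys.time())), con)
  ok <- tryCatch({
    force(expr)  # the expression is evaluated here, inside tryCatch
    TRUE
  }, error = function(e) {
    writeLines(sprintf("[%s] ERROR: %s", format(Sys.time()),
                       conditionMessage(e)), con)
    FALSE
  })
  writeLines(sprintf("[%s] end (success = %s)", format(Sys.time()), ok), con)
  invisible(ok)
}

# Usage: a failing step leaves its error message in the log.
demo_log <- tempfile(fileext = ".log")
run_logged(stop("API call failed"), demo_log)
print(readLines(demo_log))
```

In the scheduled script, the call to get_everything() could be wrapped this way so that a failure on the second machine leaves a readable message behind.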
A password cannot be specified in unzip (utils) function. The other function I am aware of, getZip (Hmisc), only works for zip files containing one compressed file.
I would like to do something like this to unzip all the files in foo.zip in Windows 8:
unzip("foo.zip", password = "mypass")
I found this question very useful but saw that no formal answers were posted, so here goes:
1. First I installed 7z.
2. Then I added "C:\Program Files\7-Zip" to my environment path.
3. I tested that the 7z command was recognized from the command line.
4. I opened R and typed in system("7z x secure.7z -pPASSWORD") with the appropriate PASSWORD.
I have multiple zipped files and I'd rather not the password show in the source code or be stored in any text file, so I wrote the following script:
file_list <- list.files(path = ".", pattern = ".7z", all.files = T)
pw <- readline(prompt = "Enter the password: ")
for (file in file_list) {
sys_command <- paste0("7z ", "x ", file, " -p", pw)
system(sys_command)
}
which when sourced will prompt me to enter the password, and the zip files will be decompressed in a loop.
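Two details of the loop above can bite: the pattern ".7z" is a regular expression in which the dot matches any character, and a password or file name containing spaces is split by the shell unless it is quoted. Here is a hedged sketch of a more defensive variant; seven_zip_args and extract_all are hypothetical names, the Windows-style quoting (type = "cmd") is an assumption matching the setup above, and -y is 7-Zip's switch for answering Yes to all prompts:

```r
# Hypothetical helper: build quoted arguments for "7z x".
seven_zip_args <- function(archive, password, out_dir = ".") {
  c(
    "x",
    shQuote(archive, type = "cmd"),
    shQuote(paste0("-p", password), type = "cmd"),
    shQuote(paste0("-o", out_dir), type = "cmd"),
    "-y"  # assume Yes on all queries, e.g. overwrite prompts
  )
}

# Hypothetical wrapper: extract every .7z archive found in `dir`.
extract_all <- function(dir = ".", password) {
  archives <- list.files(dir, pattern = "\\.7z$", full.names = TRUE)
  for (archive in archives) {
    system2("7z", seven_zip_args(archive, password, out_dir = dir))
  }
  invisible(archives)
}

# Show the arguments that would be passed for one archive.
print(seven_zip_args("secure.7z", "my pass", "out"))
```

The password can still come from readline(), as in the script above, so it never appears in the source code.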
I found that @Kim's answer worked for me eventually, but not right away. I thought I'd add a few extra links and steps that helped me get there in the end.
Close and reopen R so that the environment path is recognised
If R was already open when you did steps 1-3, you need to close and reopen it for R to recognise the environment path for 7z. @wush978's answer to the question "r system doesn't work when trying 7zip" was informative. I used Sys.getenv("PATH") to check that 7-Zip was included in the environment paths.
Step 4. I opened R and typed in system("7z x secure.7z -pPASSWORD") with the appropriate PASSWORD.
I actually found this didn't work so I modified it slightly following the instructions in this post which also explains how to specify an output directory https://stackoverflow.com/a/16098709/13678913.
If you have already extracted the files the system command prompts you to choose whether you want to replace the existing file with the file from the archive and provides options
(Y)es / (N)o / (A)lways / (S)kip all / A(u)to rename all / (Q)uit?
So the modified step 4 is as follows (the -y switch answers Yes to all prompts, which allows replacement of files):
system("7z e -ooutput_dir secure.zip -pPASSWORD -y")
Putting this all together as a modified set of instructions:
1. Install 7z.
2. Add "C:\Program Files\7-Zip\" to the environment path using the menu options (instructions here: https://www.opentechguides.com/how-to/article/windows-10/113/windows-10-set-path.html).
3. Close and reopen RStudio. Type Sys.getenv("PATH") to check that the path to 7-Zip is recognised in the environment (as per @wush978's answer to the question "r system doesn't work when trying 7zip").
4. Type system("7z e -oC:/My Documents/output_dir secure.zip -pPASSWORD") in the console with the appropriate PASSWORD (as per the instructions here: https://stackoverflow.com/a/16098709/13678913).
And here is a modified version of @Kim's neat script (including a specified output directory and a check for existing files):
My main script
output_dir <- "C:/My Documents/output_dir " #space after directory name is important
zippedfiles_dir <- "C:/My Documents/zippedfiles_dir/"
file_list <- paste0(output_dir , zippedfiles_dir , list.files(path = zippedfiles_dir, pattern = ".zip", all.files = T))
source("unzip7z.R")
Code inside source file unzip7z.R
pw = readline(prompt = "Enter the password: ")
for (file in file_list) {
csvfile <- gsub("\\.zip", "\\.csv", gsub(".*? ", "", file)) #csvfile name (removes output_dir from 'file' and replaces .zip extension with .csv)
#check if csvfile already exists in output_dir, and if it does, replace it with archived version and if it doesn't exist, continue to extract.
if(file.exists(csvfile)) {
sys_command = paste0("7z ", "e -o", file, " -p", pw, " -y") # -y answers Yes to the overwrite prompt
} else {
sys_command = paste0("7z ", "e -o", file, " -p", pw)
}
system(sys_command)
}
password <- "your password"
read.table(
text = system(paste0("unzip -p -P ", password, " yourfile.zip ", "yourfile.csv"),
intern = TRUE
), stringsAsFactors = FALSE, header = TRUE, sep = ","
)
password <- "your password"
system(
command = paste0("unzip -o -P ", password, " ", "yourfile.zip"),
wait = TRUE
)
I am trying to run external tools from the MEME suite. One of these tools (jaspar2meme) produces a text file that is then used as the input of a second tool (fimo). Here is my code:
#!/usr/bin/Rscript
com1 <- "meme/bin/jaspar2meme"
arg1 <- "-bundle jaspar_plant_2014.pfm"
message("jaspar2meme command: ", com1, arg1)
system2(command = com1, args = arg1, stdout = "motif.fimo", wait = T)
com2 <- paste0("meme/bin/fimo")
arg2 <- paste0("--text --oc . --verbosity 1 --thresh 1.0E-4 --bgfile bg.fimo motif.fimo Genes_up_h16.ame")
message("FIMO command: ", com2, arg2)
system2(command = com2, args = arg2, stdout = "fimoresult.txt", wait = T)
When I run this code from within RStudio (via source), it works perfectly: the file motif.fimo is produced by jaspar2meme and used by fimo to produce the resulting file fimoresult.txt.
When I run the same script via Rscript from the shell (bash), motif.fimo is also produced as expected, but it is not found by fimo and fimoresult.txt remains empty.
What I tried so far is to use either system() or system2(), using the wait=T option or not, specifying the full path to motif.fimo but without success.
I finally got it... The locale variables were different in RStudio and Rscript. The motif.fimo file produced by jaspar2meme looked the same in both cases, but apparently was not. Changing the first call to system2() to the following solved my problem:
system2(command = com1, args = arg1, stdout = "motif.fimo", wait = T, env = c("LC_NUMERIC=C"))
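To check whether a locale difference of this kind is in play, the same small report can be run once from RStudio and once via Rscript and the two outputs compared. This is only a diagnostic sketch; locale_report is a hypothetical helper built on base R's Sys.getenv() and Sys.getlocale():

```r
# Hypothetical helper: collect the locale settings visible to this R session.
locale_report <- function() {
  list(
    env = Sys.getenv(c("LANG", "LC_ALL", "LC_NUMERIC")),  # inherited by child processes
    r_numeric = Sys.getlocale("LC_NUMERIC")               # R's own numeric locale
  )
}

print(locale_report())
```

If the two reports differ, passing the relevant variable through the env argument of system2(), as in the fix above, makes the external tool see the same value in both cases.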