I am looking to create a couple of options with optparse in R. I want to feed the R script an argument from bash. This option will first check the length of the argument and then it will check if the argument contains all numbers or a combo of numbers and letters. This should be written as an if then statement.
So what I have so far:
if (opt$TYPE) != "RUNDATE" && str_length(opt$IDNO) != 8 && opt$IDNO != opt$IDNO[0-9]{8} {
print_help(opt_parser)
stop("Not a valid Identity Number.\n", call. = FALSE)
}
if (opt$TYPE) != "LIMSID" && str_length(opt$IDNO) == 7 && opt$IDNO != opt$IDNO[A-Z]{3}[0-9]{4} {
print_help(opt_parser)
stop("Not a valid Identity Number.\n", call. = FALSE)
}
Example: ABC1234 & ABCD123
needs to be checked so that the first three characters are letters, and that the last four characters are numbers. This would allow ABC1234 to proceed while ABCD123 would trigger the error message and stop the function.
I have an idea for the syntax in bash and am looking for its equivalent in R:
[[A-Z]]{3}[0-9]{4}
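A minimal sketch of how the check described above could look in base R, using grepl() with anchored patterns. The helper name is_valid_id is hypothetical, and the rules are an assumption read off the question: RUNDATE means exactly eight digits, LIMSID means three uppercase letters followed by four digits. Note the single brackets around A-Z and the ^ and $ anchors, which force the whole string to match.

```r
# Hypothetical helper (not part of optparse): TRUE when `id` has the shape
# assumed for the given `type`, FALSE otherwise.
is_valid_id <- function(type, id) {
  pattern <- switch(type,
    RUNDATE = "^[0-9]{8}$",          # exactly eight digits
    LIMSID  = "^[A-Z]{3}[0-9]{4}$",  # three letters, then four digits
    stop("Unknown TYPE: ", type, call. = FALSE)
  )
  grepl(pattern, id)
}

is_valid_id("LIMSID", "ABC1234")   # TRUE
is_valid_id("LIMSID", "ABCD123")   # FALSE
is_valid_id("RUNDATE", "20240131") # TRUE

# In the script this would replace the two if statements, roughly:
# if (!is_valid_id(opt$TYPE, opt$IDNO)) {
#   print_help(opt_parser)
#   stop("Not a valid Identity Number.\n", call. = FALSE)
# }
```

Because the pattern is anchored at both ends, a separate length check is not needed: a string of the wrong length cannot match.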
I am to construct a function named read_text_file.
It takes an argument textFilePath that is a single character string and two optional parameters, withBlanks and withComments, that are both single logicals;
textFilePath is the path to the text file (or R script);
if withBlanks or withComments is set to FALSE, then read_text_file() returns the text without blank lines (i.e. lines that contain nothing or only whitespace) or without comment lines (i.e. lines that start with "#"), respectively;
it outputs a character vector of length n where each element corresponds to its respective line of text/code.
I came up with the function below:
read_text_file <- function(textFilePath, withBlanks = TRUE, withComments = TRUE){
# check that `textFilePath`: character(1)
if(!is.character(textFilePath) | length(textFilePath) != 1){
stop("`textFilePath` must be a character of length 1.")}
if(withComments==FALSE){
return(grep('^$', readLines(textFilePath),invert = TRUE, value = TRUE))
}
if(withBlanks==FALSE){
return(grep('^#', readLines(textFilePath),invert = TRUE, value = TRUE))
}
return(readLines(textFilePath))
}
The second if-statement will always be executed leaving the third if-statement unexecuted.
I'd recommend processing an imported object instead of returning it immediately:
read_text_file <- function(textFilePath, withBlanks = TRUE, withComments = TRUE){
  # check that `textFilePath`: character(1)
  if(!is.character(textFilePath) || length(textFilePath) != 1){
    stop("`textFilePath` must be a character of length 1.")
  }
  # read once, then filter the result step by step
  result = readLines(textFilePath)
  if(!withComments){
    result = grep('^\\s*#\\s*', result, invert = TRUE, value = TRUE)
  }
  if(!withBlanks){
    result = grep('^\\s*$', result, invert = TRUE, value = TRUE)
  }
  result
}
The big change is defining the result object, which we modify as needed and then return at the end. This is good because (a) it is more concise, not repeating the readLines command multiple times, and (b) it lets you easily do 0, 1, or more data cleaning steps on result before returning it.
I also made some minor changes:
I don't use return() - it is only needed if you are returning something before the end of the function code, which with these modifications is not necessary.
You had your "comment" and "blank" regex patterns switched, I corrected that.
I changed == FALSE to !, which is a little safer and good practice. You could use isFALSE() if you want more readability.
I added \\s* into your regex patterns in a couple of places, which will match any amount of whitespace (including none).
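To see the effect of the two flags, here is a small self-contained check. It repeats the function from this answer and runs it on a temporary file; the file contents are made up for illustration.

```r
read_text_file <- function(textFilePath, withBlanks = TRUE, withComments = TRUE) {
  if (!is.character(textFilePath) || length(textFilePath) != 1) {
    stop("`textFilePath` must be a character of length 1.")
  }
  result <- readLines(textFilePath)
  if (!withComments) {
    # drop lines whose first non-whitespace character is '#'
    result <- grep("^\\s*#", result, invert = TRUE, value = TRUE)
  }
  if (!withBlanks) {
    # drop lines that are empty or contain only whitespace
    result <- grep("^\\s*$", result, invert = TRUE, value = TRUE)
  }
  result
}

# Illustrative input: comments, an empty line, a whitespace-only line, and code
tmp <- tempfile(fileext = ".R")
writeLines(c("# a comment", "", "x <- 1", "   ", "  # indented comment", "y <- 2"), tmp)

everything  <- read_text_file(tmp)
no_comments <- read_text_file(tmp, withComments = FALSE)
code_only   <- read_text_file(tmp, withBlanks = FALSE, withComments = FALSE)
print(code_only)
```

With both flags set to FALSE only the two assignment lines remain; with only withComments = FALSE the empty and whitespace-only lines are kept.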
So I am trying to learn R on my own and am just working through the online tutorial. I am trying to code a recursive function that prints the first n terms of the Fibonacci sequence and can't get the code to run without the error:
Error in if (nterms <= 0) { : missing value where TRUE/FALSE needed
My code also does not ask me for input before entering the if-else statement, which I think is odd as well. Below is my code; any help is appreciated.
#Define the fibonacci sequence
recurse_fibonacci <- function(n) {
# Define the initial two values of the sequence
if (n <= 1){
return(n)
} else {
# define the rest of the terms of the sequence using recursion
return(recurse_fibonacci(n-1) + recurse_fibonacci(n-2))
}
}
#Take input from the user
nterms = as.integer(readline(prompt="How many terms? "))
# check to see if the number of terms entered is valid
if(nterms <= 0) {
print("please enter a positive integer")
} else {
# This part actually calculates and displays the first n terms of the sequence
print("Fibonacci Sequence: ")
for(i in 0:(nterms - 1)){
print(recurse_fibonacci(i))
}
}
This is a problem with readline in non-interactive mode. There, readline does not wait for input: it returns an empty string immediately, so as.integer("") is NA and the test if (nterms <= 0) fails with the error above. The solution below is the one posted in this other SO post.
Below is a complete answer, with the Fibonacci function slightly modified.
recurse_fibonacci <- function(n) {
  # Define the initial two values of the sequence
  if (n <= 1){
    n
  } else{
    # define the rest of the terms of the sequence using recursion
    Recall(n - 1) + Recall(n - 2)
  }
}
#Take input from the user
cat("How many terms?\n")
repeat{
  nterms <- scan("stdin", what = character(), n = 1)
  if(nchar(nterms) > 0) break
}
nterms <- as.integer(nterms)
# check to see if the number of terms entered is valid
if(nterms <= 0) {
  print("please enter a positive integer")
} else {
  # This part actually calculates and displays the first n terms of the sequence
  print("Fibonacci Sequence: ")
  for(i in 0:(nterms - 1)){
    print(recurse_fibonacci(i))
  }
}
This code is the contents of the file fib.R. Running it in an Ubuntu 20.04 terminal gives
rui#rui:~$ Rscript fib.R
How many terms?
8
Read 1 item
[1] "Fibonacci Sequence: "
[1] 0
[1] 1
[1] 1
[1] 2
[1] 3
[1] 5
[1] 8
[1] 13
rui#rui:~$
To make it work with Rscript replace
nterms = as.integer(readline(prompt="How many terms? "))
with
cat ("How many terms?")
nterms = as.integer (readLines ("stdin", n = 1))
Then you can run it as Rscript fib.R, assuming that the code is in the file fib.R in the current working directory.
Otherwise, execute it with source("fib.R") within an R shell.
Rscript does not run in interactive mode, so readline does not wait for input from the terminal. Check what interactive() returns in both cases: it is FALSE under Rscript, but TRUE when the same code is run within an R shell (with source()).
?readline mentions that it cannot be used in non-interactive mode, whereas readLines("stdin") explicitly reads from stdin.
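One way to combine the points above is to pick the input function based on interactive() and to guard against NA before comparing. This is only a sketch: read_answer and parse_term_count are hypothetical helper names, not functions from base R.

```r
# Hypothetical helper: prompt with readline() in an interactive session,
# otherwise read one line from stdin (the Rscript case).
read_answer <- function(prompt) {
  if (interactive()) {
    readline(prompt)
  } else {
    cat(prompt)
    readLines("stdin", n = 1)
  }
}

# Hypothetical helper: convert the raw answer to a positive integer, or stop
# with a clear message instead of "missing value where TRUE/FALSE needed".
parse_term_count <- function(raw) {
  n <- suppressWarnings(as.integer(raw))
  if (length(n) != 1 || is.na(n) || n <= 0) {
    stop("Please enter a positive integer.", call. = FALSE)
  }
  n
}

parse_term_count("8")  # 8
```

In the script the two would be chained as nterms <- parse_term_count(read_answer("How many terms? ")). Note that as.integer() truncates input such as "3.7" to 3 rather than rejecting it.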
The code works fine but you shouldn't enter it into the terminal as is. My suggestion: put the code into a script file (ending .R) and source it (get help about it with ?source but it's actually pretty straightforward).
In R-Studio you can simply hit the source button.
I have the following code to check whether a dataset exists and whether it contains data. The problem comes from the double quotes around the macro variable.
check_data_ready <- defmacro(tracking_sheet,table_df,table_name,
expr={if (exists(table_df) && is.data.frame(get(table_df)) && dim(table_df)==NULL) {tracking_sheet$DataReady[tracking_sheet$Table==table_name]<-'Ready'
} else {tracking_sheet$DataReady[tracking_sheet$Table==table_name]<-'Check!'}
})
check_data_ready(tracking_sheet,"df","Table")
The error message is
Error in if (exists("df") && is.data.frame(get("df")) && dim("df") ==
So apparently dim("df") is not working; it should be dim(df). I tried using substr to strip the double quotes by taking characters 2 to 3, but it treats d as character 1 and f as character 2, so the quotes are not counted as part of the string. I am lost here. How do I make the code work like this:
if (exists("df") && is.data.frame(get("df")) && dim(df) ==
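A possible way around the quoting issue, sketched as an ordinary function rather than with defmacro (a deliberate swap, not the original approach): since the table is passed by name, get() can fetch the object wherever it is needed, including for the size check. Two further assumptions: "has data" is taken to mean at least one row, and the check uses nrow() instead of comparing dim() with NULL, because x == NULL yields logical(0) rather than TRUE or FALSE. The names orders_df and empty_df are made up for the example.

```r
# Sketch: returns the updated tracking sheet instead of modifying it in place.
check_data_ready <- function(tracking_sheet, table_df, table_name,
                             envir = parent.frame()) {
  ready <- exists(table_df, envir = envir) &&
    is.data.frame(get(table_df, envir = envir)) &&
    nrow(get(table_df, envir = envir)) > 0
  tracking_sheet$DataReady[tracking_sheet$Table == table_name] <-
    if (ready) "Ready" else "Check!"
  tracking_sheet
}

# Illustrative data
tracking_sheet <- data.frame(Table = c("Orders", "Empty", "Missing"),
                             DataReady = NA_character_,
                             stringsAsFactors = FALSE)
orders_df <- data.frame(x = 1:3)
empty_df  <- data.frame()

tracking_sheet <- check_data_ready(tracking_sheet, "orders_df", "Orders")
tracking_sheet <- check_data_ready(tracking_sheet, "empty_df", "Empty")
tracking_sheet <- check_data_ready(tracking_sheet, "no_such_table_df", "Missing")
print(tracking_sheet)
```

One caveat with the name df specifically: exists("df") is TRUE even when no such data frame was created, because stats::df is a function, so the is.data.frame() test is what actually protects the check in that case.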
I'm trying to write an R script that takes in 3 arguments when run with Rscript: the input file name, whether it has a header or not (values are 'header' or 'no_header'), and a positive integer (the number of replacements; it's for a bootstrap application). So, when I run it this way:
Rscript bootstrapWithReplacement.R survival.csv header 50
it should, before running, check if:
1) The script indeed took in 3 parameters;
2) whether the first parameter is a file;
3) whether the second parameter has a 'header' or 'no_header' value, and
4) if the number passed is a positive integer.
Here is my code so far:
pcArgs <- commandArgs()
snOffset <- grep('--args', pcArgs)
inputFile <- pcArgs[snOffset+1]
headerSpec <- pcArgs[snOffset+2] ## header/no_header
numberOfResamples <- pcArgs[snOffset+3] ## positive integer
check.integer <- function(N){
!length(grep("[^[:digit:]]", as.character(N)))
}
if (!file_test("-f",inputFile)) {stop("inputFile not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!exists("headerSpec")) {stop("headerSpec not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!exists("numberOfResamples")) {stop("numberOfResamples not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if ((headerSpec != 'header') == TRUE & (headerSpec != 'no_header') == TRUE) {stop("headerSpec not properly defined. Correct values: 'header' OR 'no_header'.")}
if (check.integer(numberOfResamples) != TRUE | (numberOfResamples>0) != TRUE) {stop("numberOfResamples not properly defined. Must be an integer larger than 0.")}
if (headerSpec == 'header') {
inputData<-read.csv(inputFile)
for (i in 1:numberOfResamples) {write.csv(inputData[sample(nrow(inputData),replace=TRUE),], paste("./bootstrap_",i,"_",inputFile,sep=""), row.names=FALSE)}
}
if (headerSpec == 'no_header') {
inputData<-read.table(inputFile,header=FALSE)
for (i in 1:numberOfResamples) {write.table(inputData[sample(nrow(inputData),replace=TRUE),], paste("./bootstrap_",i,"_",inputFile,sep=""),
sep=",", row.names=FALSE, col.names=FALSE)}
}
My problem is that the check for the existence of the file works, but the checks for the header and the integer don't.
Also, how can I, in the beginning, check if all three arguments have been passed?
Thanks!
As Vincent said, you should use the trailingOnly argument to commandArgs to simplify things.
As Konrad said, never, ever, ever compare directly to TRUE and FALSE.
Also, use assertive for doing assertions.
library(assertive)
library(methods)
cmd_args <- commandArgs(TRUE)
if(length(cmd_args) < 3)
{
  stop("Not enough arguments. Please supply 3 arguments.")
}
inputFile <- cmd_args[1]
if (!file_test("-f", inputFile))
{
  stop("inputFile not defined, or not correctly named.")
}
headerSpec <- match.arg(cmd_args[2], c("header", "no_header"))
numberOfResamples <- as.numeric(cmd_args[3])
assert_all_numbers_are_whole_numbers(numberOfResamples)
assert_all_are_positive(numberOfResamples)
message("Success!")
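If you would rather not depend on assertive, the same checks can be sketched in base R. The function name validate_args and the shape of its return value are hypothetical. It also sidesteps two problems in the question's version: exists("headerSpec") is always TRUE once the variable has been assigned, and the command-line value is a character string, so numberOfResamples > 0 compares strings rather than numbers.

```r
# Hypothetical helper: validate the three trailing command-line arguments and
# return them in parsed form, or stop with a usage message.
validate_args <- function(args) {
  usage <- "Usage: Rscript bootstrapWithReplacement.R <file> <header|no_header> <n>"
  if (length(args) != 3) {
    stop("Expected exactly 3 arguments. ", usage, call. = FALSE)
  }
  if (!file_test("-f", args[1])) {
    stop("Input file not found: ", args[1], call. = FALSE)
  }
  # match.arg() errors on anything that is not one of the choices
  # (it also accepts unambiguous abbreviations such as "head")
  header_spec <- match.arg(args[2], c("header", "no_header"))
  n <- suppressWarnings(as.integer(args[3]))
  if (!grepl("^[0-9]+$", args[3]) || is.na(n) || n < 1) {
    stop("The number of resamples must be a positive integer.", call. = FALSE)
  }
  list(input_file = args[1], has_header = header_spec == "header", n_resamples = n)
}

# Illustrative call with a temporary file standing in for survival.csv
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), tmp)
parsed <- validate_args(c(tmp, "header", "50"))
str(parsed)
```

In the real script the call would be validate_args(commandArgs(trailingOnly = TRUE)).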
I managed to solve all the checks, here's how:
if ((length(pcArgs) == 8) == FALSE) {stop("Not enough arguments. Please supply 3 arguments. Proper use example: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!file_test("-f",inputFile)) {stop("inputFile not defined, or not correctly named. Proper use example: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if ((headerSpec != 'header') == TRUE & (headerSpec != 'no_header') == TRUE) {stop("headerSpec not properly defined. Correct values: 'header' OR 'no_header'.")}
if (check.integer(numberOfResamples) != TRUE | (numberOfResamples>0) != TRUE) {stop("numberOfResamples not properly defined. Must be an integer larger than 0.")}
Thanks everyone!
In my textbook, there is an example very similar to this one that reverses a line from an input file:
void Reverse(ifstream &inFile, int level)
{
int myInput = inFile.get();
if (myInput != '\n' && myInput != EOF) // don't understand this, line 4
Reverse(inFile, level);
if (myInput != EOF)
cout.put(myInput);
}
What I don't get is the line I commented. Given an input file whose contents were:
ABC\n
DEF\0
When the \n is equal to myInput, doesn't it make line 4's conditional statement become false since the first (myInput != '\n') would be false, and the second part (myInput != EOF) be true making that whole line false, and not calling the Reverse function again? Thanks.
The trick to understanding recursion on a very basic level is to trace the
execution and write out the sequence of calls. The following might help you
to understand how this works. I indented each call to Reverse() and
included the line numbers that are executed.
3: myInput = 'A'
5: Reverse()
    3: myInput = 'B'
    5: Reverse()
        3: myInput = 'C'
        5: Reverse()
            3: myInput = '\n' <<<< base condition, recursion stops here
            7: cout.put('\n')
        7: cout.put('C')
    7: cout.put('B')
7: cout.put('A')
So this will output '\nCBA'.
As for the logic in the base condition, remember De Morgan's Laws:
(NOT P) AND (NOT Q)
-> NOT (P OR Q)
(myInput != '\n') && (myInput != EOF)
-> (!(myInput == '\n')) && (!(myInput == EOF))
-> !((myInput == '\n') || (myInput == EOF))
The if statement on line 4 will exclude newlines and the end of file
condition so the recursion stops when it encounters either condition. This is
what causes the recursion to only read one line from the file.
That line is the recursion's base condition. Apparently, this function is designed to print out one line in reverse from the input file at a time. So, once it encounters a \n or an EOF, it has reached the end of the current line, and so does not continue further. Repeated calls to Reverse print out successive lines.
You're right. It won't call Reverse() when the character is a '\n'. So this will only reverse the first line of a file.