printing warnings by R - r

I have code in a file that works on a matrix, and I run it using
source("filecode.r")
Because the matrix that the code works with must have some specific characteristics, I would like to print a message reminding the user that the input matrix must be formatted accordingly.
The code is this:
n<- nrow(aa)
d_ply(aa, 1, function(row){
cu<- dist(as.numeric(row[-1]))
cucu<- as.matrix(cu)
saveRDS(cucu, file = paste0(row$ID, ".rds"))
}, .progress='text', .print = TRUE)
Ideally I would like to add a warning message appearing before the code starts running...like this:
Warning(“1) did you write ‘ID’ in position [1,1] of the input matrix?;
2) is your matrix saved as a .txt?
3) ensure that the matrix file does not have empty rows at the end”)
and I would also like the user to be asked a question like "do you want to go on then?".
Thank you in advance for all suggestions!
Gab

Put that at the beginning of your file:
check <- readline(prompt="Warning!\n(1) did you write 'ID' in position [1,1] of the input matrix? \n(2) is your matrix saved as a .txt?\n(3) ensure that the matrix file does not have empty rows at the end\n\n Do you wish to continue? (y/n)")
if(check == "n") stop("Aborted.")
print(check) #Here would follow your code instead
If you type "y" the following code will be evaluated. If you type "n", the script stops and prints the message inside stop().
You could also make sure that only 'y' and 'n' are accepted by putting the prompt statement inside of a while loop:
check <- NA
while(!(check %in% c('y','n'))) {
check <- readline(prompt="Warning!\n(1) did you write 'ID' in position [1,1] of the input matrix? \n(2) is your matrix saved as a .txt?\n(3) ensure that the matrix file does not have empty rows at the end\n\n Do you wish to continue? (y/n)")
}
if(check == "n") stop("Aborted.")
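The same idea can be wrapped in a small function, which keeps the prompt text in one place. This is only a sketch: `confirm_continue`, `checklist` and the `ask` argument are names made up for illustration, and `ask` exists only so the function can be tried without a keyboard.

```r
# Hypothetical helper: keeps asking until the answer is "y" or "n".
# `ask` defaults to readline(); passing another function is only a
# convenience for trying the helper without typing anything.
confirm_continue <- function(msg, ask = readline) {
  answer <- ""
  while (!(answer %in% c("y", "n"))) {
    answer <- tolower(trimws(ask(paste0(msg, "\nDo you wish to continue? (y/n) "))))
  }
  answer == "y"
}

checklist <- paste(
  "Warning!",
  "(1) did you write 'ID' in position [1,1] of the input matrix?",
  "(2) is your matrix saved as a .txt?",
  "(3) ensure that the matrix file does not have empty rows at the end",
  sep = "\n"
)

# In an interactive session, at the top of the sourced file:
# if (!confirm_continue(checklist)) stop("Aborted.")
```

Note that readline() returns an empty string immediately when R is not running interactively, so a loop like this should only be used in an interactive session (interactive() can be checked first).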

Related

Variable not replacing value in loop in R

This is what I'm trying to do:
I have a large excel sheet I'm importing to R.
The data needs to be cleaned so one of the procedures is to test for character length.
Once the program finds a string that is too long, it needs to prompt the operator for a replacement
The operator inputs an alternative, and the program replaces the original with the input text.
The code I have seems to work procedurally, but the variable I have is not overwriting the original value.
library(tidyr)
library(dplyr)
library(janitor)
library(readxl)
fileToOpen <-read_excel(file.choose(),sheet="Data")
MasterFile <- fileToOpen
#This line checks the remaining bad strings in the column
CPNErrors <- nrow(filter(MasterFile,nchar(Field_to_Check) > 26))
#This line selects the bad field from the first in the list of strings to exceed the limit
TEST <- select(filter(MasterFile,nchar(Field_to_Check) > 26),Field_to_Check)[1,]
#This is the loop -- prompts the operator for a replacement, assigns a variable to the input and then replaces the bad value in the data frame
while (CPNErrors >= 1) {message("Replace ",TEST," with what?"); var=readline();MasterFile$Field_to_Check[MasterFile$Field_to_Check == TEST] <- var;print(var)}
The prompt works and assigns the readline() to the var, but the code will not replace the original string as a variable. When I run the code separately outside the loop, it will replace as long as I input an exact string (no variable assignment), so there's some syntactical thing I'm missing.
I've been searching for hours, and am just starting out in R, so if anyone can offer any assistance I'd greatly appreciate it.
EDIT -- ok... I think I found the source of the problem, but I don't know how to fix it. When I run
MasterFile$Field_to_Check[MasterFile$Field_to_Check == TEST]
It comes with a null result, but if I run
MasterFile$Field_to_Check[MasterFile$Field_to_Check == "Some Text that's in the data frame"]
It comes out with a result. Any idea on why I can't filter this list by the variable? The TEST variable comes out as expected.
Try this approach with a for loop:
CPNErrors <- which(nchar(MasterFile$Field_to_Check) > 26)
for(i in CPNErrors){
var=readline(paste0("Replace ",MasterFile$Field_to_Check[i]," with what? "))
MasterFile$Field_to_Check[i] <- var
}
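Indexing by position, as in the loop above, avoids comparing the column against TEST altogether. Here is a self-contained sketch of that approach on made-up data; `replace_long`, `field` and the `ask` argument are hypothetical names, and the answer is supplied by a stand-in function so the example runs without typing anything.

```r
# Hypothetical helper: visit every position whose string exceeds `limit`
# and overwrite it with whatever `ask` returns (readline() by default).
replace_long <- function(values, limit = 26, ask = readline) {
  too_long <- which(nchar(values) > limit)
  for (i in too_long) {
    values[i] <- ask(paste0("Replace ", values[i], " with what? "))
  }
  values
}

# Made-up data standing in for MasterFile$Field_to_Check
field <- c("short", "this string is definitely longer than 26 chars", "ok")

# Stand-in for the operator's input, so the sketch runs unattended
fixed <- replace_long(field, ask = function(prompt) "REPLACED")
print(fixed)
```

With the real data this would be called as MasterFile$Field_to_Check <- replace_long(MasterFile$Field_to_Check) in an interactive session.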

How to remove the first row from multiple dataframes?

I have multiple dataframes and would like to remove the first row in all of them.
I have tried using a for loop but cannot understand what I am doing wrong
for (i in cities){
i <- i[-1, ]
}
I get the following error code:
Error in i[-1, ] : incorrect number of dimensions
If we assume that the only objects in your workspace are dataframes then this might succeed:
cities <- objects()
for (i in cities) { assign(i, get(i)[-1,])}
Explanation:
Two things were wrong with the original code:
One was already mentioned in the comments: "df" is not the same as df. You need to use get to convert a character value to a "true" R name that is used to retrieve an object having that name. The result of objects() is only a character vector. In R the term "name" means a "language object". See the help page: ?mode. (There is potential confusion about row names and column names, which are always of class "character".) It's not like SAS, which is a macro language that has no such distinction.
The second error was trying to get substitution for the i on the left-hand side of <-. That would have failed even if you were working with actual R names. The assign function is designed to handle character values that are then converted to R names.
Say you get a list of all the tables in your environment, and you call that list cities. You can't just iterate over each value of cities and change things, because in the list they are just character strings.
Here is what you need:
for (i in cities){
tmp <- get(i) # load the actual table
tmp <- tmp[-1, ] # remove first row
assign(i, tmp) # re-assign table to original table name
}
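An alternative that avoids get and assign is to keep the data frames in a named list and use lapply. This is a sketch on made-up data, not the asker's actual objects: `london` and `paris` are invented for illustration.

```r
# Made-up data frames standing in for the ones in the question
london <- data.frame(x = 1:3, y = c("a", "b", "c"))
paris  <- data.frame(x = 4:6, y = c("d", "e", "f"))

# Collect them in a named list, then drop row 1 from every element.
# drop = FALSE keeps the result a data frame even with a single column.
cities <- list(london = london, paris = paris)
cities <- lapply(cities, function(df) df[-1, , drop = FALSE])

print(nrow(cities$london))
```

The cleaned tables are then reached as cities$london, cities[["paris"]] and so on, and the originals in the workspace are left unchanged.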

Writing a loop in R

I have written a loop in R. The code is expected to go through a list of variables defined in a list and then for each of the variables perform a function.
Problem 1 - I cannot loop through the list of variables
Problem 2 - I need to insert each output from the values into Mongo DB
Here is an example of the list:
121715771201463_626656620831011
121715771201463_1149346125105084
Based on this value, I run some code and I want its output to be inserted into MongoDB. Right now only the first value and its corresponding output are inserted.
test_list <-
c("121715771201463_626656620831011","121715771201463_1149346125105084","121715771201463_1149346125105999")
for (i in test_list)
{ //myfunction//
mongo.insert(mongo, DBNS, i)
}
I am able to only pick the values for the first value and not all from the list
Any help is appreciated.
Try this example, which prints the final characters
myfunction <- function(x){ print( substr(x, 27, nchar(x)) ) }
test_list <- c("121715771201463_626656620831011",
"121715771201463_1149346125105084",
"121715771201463_1149346125105999")
for (i in test_list){ myfunction(i) }
for (j in 1:length(test_list)){ myfunction(test_list[j]) }
The final two lines should each produce
[1] "31011"
[1] "105084"
[1] "105999"
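If the output of each call needs to be kept (for example to insert it somewhere afterwards) rather than printed, one option is to collect the results with vapply. This is a sketch only: here myfunction is redefined to return its value instead of printing it, and it is a stand-in for the real per-ID work, which the question does not show.

```r
# Hypothetical stand-in for the real per-ID work: returns the final characters
myfunction <- function(x) substr(x, 27, nchar(x))

test_list <- c("121715771201463_626656620831011",
               "121715771201463_1149346125105084",
               "121715771201463_1149346125105999")

# One result per ID; vapply names each result after the ID it came from
results <- vapply(test_list, myfunction, character(1))
print(unname(results))
```

Each element of results can then be handled inside a second loop, or the per-ID work and the insert can both go in the function passed to vapply.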
It is not clear whether "variable" is the same as "value" here.
If what you mean by variable is actually an element in the list you construct, then I think Ilyas comment above may solve the issue.
If "variable" is instead an object in the workspace, and elements in the list are the names of the objects you want to process, then you need to make sure that you use get. Like this:
for(i in ls()){
cat(paste(mode(get(i)),"\n") )
}
ls() returns a character vector with the names of the objects. The loop above goes through them all and uses get on each name to retrieve the actual object. From there, you can do the processing you want to do (in the example above, I just printed the mode of the object).
Hope this helps somehow.

R reading files to iterate through them, final iteration command will not work due to file handling error

I have written 100 two-columned matrices into their own separate text files and now need to read the contents back into a list of 100 matrices such that I can conduct the command at the bottom of this entry "ComputeStepSize". My code is as follows:
I write each of the 100 matrices in "listofmatrices1" into their own files.
for(i in 1:length(listofmatrices1)){
write.table(listofmatrices1[[i]], file=(paste("traj1", as.character(i), ".txt", sep="")), row.names=FALSE, sep="\t")
}
I use the system() command to make a list of the files so that I can read them with the following commands.
individualmatrices1<-system("ls /Users/Deirdreclarkson/rpractice/traj1*", intern=TRUE)
readTrajectory1 <- function(traj1) {
a <- read.table(traj1, sep="\t", header=TRUE)
return(a)
}
I read the matrices into an empty list using "readTrajectory".
trajectorieslist1<-vector("list", 100)
for (i in 1:length(individualmatrices1)){
val1 <- readTrajectory1(individualmatrices1[i])
trajectorieslist1[[i]]<-val1
}
The first rows of one of the matrices in the list:
X Y
112.4563 112.4563
110.1210 110.1210
109.2143 109.2143
108.1806 108.1806
107.3700 107.3700
I'm iterating through the matrix 2 columns and measuring the difference between each consecutive value.
ComputeStepSize<-function(table){
deltastepY <- diff(table[,2][seq(1,length(table[,2]), 2)])
print(deltastepY)
deltastepX <- diff(table[,1][seq(1,length(table[,2]), 2)])
print(deltastepX)
overalldelta<-sqrt(deltastepY**2+deltastepX**2)
return(overalldelta)
}
Read the individual matrices using a for loop.
for (i in 100){
finalsteplist1<-ComputeStepSize(trajectorieslist1[i])
}
Issue
Error in table[, 2] : incorrect number of dimensions
I don't understand why this is happening as I have told the "ComputeStepSize" command that there are only 2 columns- "diff(table[,2][seq(1,length(table[,2]), 2)])".
Can any one spot where in my file-handling I've gone wrong such that this is happening?
I have assigned the read.table of one of the files to a variable "a" and tried ComputeStepSize(a).This returns a generic debugger:
function (x, ...) UseMethod("print")
but I can't print deltastepX or deltastepY while they are being computed, because the command fails before it gets to them.
Error in table[, 2] : incorrect number of dimensions
… Can any one spot where in my file-handling I've gone wrong such that this
is happening?
You didn't go wrong in your file-handling, you just used a general subscripting operator […] in
finalsteplist1<-ComputeStepSize(trajectorieslist1[i])
(yielding a sublist of the list) instead of the operator [[…]] used to select a single element:
finalsteplist1<-ComputeStepSize(trajectorieslist1[[i]])
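The difference between the two operators is easy to check on a small list. The sketch below uses made-up trajectories and a simplified, hypothetical `step_size` function (consecutive rows, not the every-other-row version in the question). It also loops with seq_along, because `for (i in 100)` runs only once, with i equal to 100, and the assignment inside it would overwrite the result each time.

```r
# Made-up trajectories standing in for the files read from disk
trajectorieslist1 <- list(
  data.frame(X = c(0, 3, 6), Y = c(0, 4, 8)),
  data.frame(X = c(1, 1, 1), Y = c(2, 5, 9))
)

print(class(trajectorieslist1[1]))    # a sublist of length 1
print(class(trajectorieslist1[[1]]))  # the data frame itself

# Simplified step size between consecutive rows (an assumption, see above)
step_size <- function(tab) sqrt(diff(tab[, 1])^2 + diff(tab[, 2])^2)

# seq_along() visits every index, and lapply() keeps one result per matrix
steps <- lapply(seq_along(trajectorieslist1), function(i) {
  step_size(trajectorieslist1[[i]])
})
print(steps)
```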

Need an explanation for a particular R code snippet

The following is the code for which i need an explanation for:
for (i in id) {
data <- read.csv(files[i] )
c <- complete.cases(data)
naRm <- data[c, ]
completeCases <- rbind(completeCases, c(i, nrow(naRm)))
As I understand it, the variable c here stores multiple logical values. The line after that seems foreign to me. How does data[c, ] work?
FYI, I am an R newbie.
complete.cases looks for all rows that are "complete", that is, rows with no missing values. Here is the man page. It returns one logical value per row, and data[c, ] keeps only the rows where that value is TRUE. Thus the completeCases object will tell you the number of "complete" rows in each file you have just read. You really don't need to store the value of i in the rbind call though, as it is just the row number, so it is redundant. A vector would do just fine for this application.
Also looks like you are missing a close brackets or this isn't a complete chunk of code.
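To see the logical indexing on its own, here is a small sketch on made-up data (the data frame below is invented for illustration, and the logical vector is called keep rather than c to avoid masking the c function):

```r
# Made-up data with missing values in rows 2 and 3
data <- data.frame(a = c(1, NA, 3, 4), b = c("x", "y", NA, "z"))

keep <- complete.cases(data)   # one TRUE/FALSE per row
print(keep)

# A logical vector in the row position keeps the rows where it is TRUE;
# the empty column position means "all columns"
naRm <- data[keep, ]
print(nrow(naRm))
```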

Resources