Error: missing value where TRUE/FALSE needed - r

I am trying to delete all values in a list that have the tag ".dsw". My list is a list of files using the function list.files. This is my code:
for (file in GRef) {
if (strsplit(file, "[.]")[[1]][3] == "dsw") {
#GRef=GRef[-file]
for(n in 1:length(GRef)){
if (GRef[n] == file){
GRef=GRef[-n]
}
}
}
}
Where GRef is the list of file names. I get the error listed above, but I don't understand why. I have looked at this post: Error .. missing value where TRUE/FALSE needed, but I don't think it is the same thing.

You shouldn't attempt to modify a vector while you are looping over it. The problem is that you are removing items and then trying to extract them later, which produces the missing values. It's better to identify all the items you want to remove first, then remove them. For example
GRef <- c("a.file.dsw", "b.file.txt", "c.file.gif", "d.file.dsw")
exts <- sapply(strsplit(GRef, "[.]"), `[`, 3)
GRef <- GRef[exts!="dsw"]
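To illustrate why the loop fails: 1:length(GRef) is evaluated once, so after an element is removed the index n can run past the end of the shortened vector. Indexing past the end returns NA, and if() cannot evaluate NA. The sketch below shows that, then filters by extension with tools::file_ext, which is one alternative to splitting on dots. The file names are the same made-up ones as above, and past_end and keep are names chosen for illustration.

```r
GRef <- c("a.file.dsw", "b.file.txt", "c.file.gif", "d.file.dsw")

# Indexing past the end of a vector yields NA; if (NA == "x") then fails
# with "missing value where TRUE/FALSE needed".
past_end <- GRef[length(GRef) + 1]

# Build a logical mask first, then subset once.
keep <- tools::file_ext(GRef) != "dsw"
GRef <- GRef[keep]
print(GRef)
```

Unlike taking the third piece of strsplit(), file_ext looks only at the text after the last dot, so it does not depend on how many dots the name contains.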

Related

Length Zero in r

Greetings I am getting an error of
Error in if (nrow(pair) == 0) { :argument is of length zero
I have checked the other answers, but they do not seem to work on a variable like mine. Please see the code below and assist if you can.
pair<-NULL
if(exists("p.doa.ym")) pair <- rbind(pair, p.doa.ym[,1:2])
if(exists("p.doa.yd")) pair <- rbind(pair, p.doa.yd[,1:2])
if(nrow(pair) == 0) {
print("THERE ARE NO MATCHES FOR TODAY. STOP HERE")
quit()
}
You set pair to NULL, and it stays NULL when neither of the two exists() checks is true. nrow(NULL) is NULL, so the comparison nrow(pair) == 0 has length zero and if() fails. Either check whether pair is NULL first, or initialize pair as an empty data frame.
One option:
if (!is.null(pair)) {
if (nrow(pair)==0) {
# your code
}
}
Another option:
pair=data.frame()
# your code
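The two checks can also be combined into a single condition. This is a minimal sketch, assuming pair is either NULL or a data frame; no_matches is a name introduced here for illustration. Because || short-circuits, nrow(pair) is only evaluated when pair is not NULL.

```r
pair <- NULL  # the case where neither p.doa.ym nor p.doa.yd exists

# is.null() is checked first, so nrow() is never compared when pair is NULL.
no_matches <- is.null(pair) || nrow(pair) == 0

if (no_matches) {
  message("THERE ARE NO MATCHES FOR TODAY. STOP HERE")
}
```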

Need assistance with understanding R for loop error: unexpected '}' in " }"

Just to give some background first:
I currently have 2 data frames (giraffe, leaf) and both of them share the column 'key', where the elements in the leaf data frame are a subset of giraffe. What I needed to do is compare the two data frames and when there are matching elements in both data frames in the 'key' column, the string 'leaf' will be input into another column (project) in the giraffe data frame inside the same row as the matching 'key' element. I've taken the following approach however it seems I have made a small error somewhere and after searching online, I still don't know what it is:
Truth_vector <- is.element((giraffe[,1]),(leaf[,1])) #returns a vector with 3000 elements, most are FALSE except for where the element inside 'key' is present in both data frames
i=1
for (i in 1:length(giraffe[,1])) {
if Truth_vector[i] == TRUE {
giraffe[i,5] <- 'leaf'
}
i = i+1
}
Error: unexpected '}' in "}"
Edit:
I tried implementing the solution as a function however nothing ends up happening, no error messages get returned either. What I've done is:
Project_assign <- function(prjct) {
Truth_vector <- is.element((giraffe[,1]),(prjct[,1]))
giraffe[which(Truth_vector),5] <- 'prjct'
}
Project_assign(leaf)
Edit: This was because everything was being assigned in the function's own environment, not the global environment. Using assign('giraffe',giraffe,envir=.GlobalEnv) solves this, but you should try to avoid the assign function. Instead, I used a for loop going over a list of all the data frames.
You have a couple of issues. First, the if condition needs to be in parentheses; second, you don't need to increment i yourself. This should suffice:
for (i in 1:length(giraffe[,1])) {
if (Truth_vector[i] == TRUE) {
giraffe[i,5] <- 'leaf'
}
}
Of course, this would do it too:
giraffe[which(Truth_vector),5] <- 'leaf'
(assuming Truth_vector is not longer than the number of rows in giraffe)
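A version that refers to columns by name, and that also works inside a function without assign(), might look like the sketch below. The column names key and project come from the question's description. The toy data and the helper name assign_project are assumptions made for illustration. The function returns the modified data frame, and the caller reassigns it.

```r
# Toy stand-ins for the poster's data frames.
giraffe <- data.frame(key = c("a", "b", "c", "d"),
                      project = NA_character_,
                      stringsAsFactors = FALSE)
leaf <- data.frame(key = c("b", "d"), stringsAsFactors = FALSE)

# Hypothetical helper: label rows of df whose key also appears in ref.
assign_project <- function(df, ref, label) {
  df$project[df$key %in% ref$key] <- label
  df  # return the modified copy; the caller's object is untouched
}

giraffe <- assign_project(giraffe, leaf, "leaf")
print(giraffe)
```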

Loop works outside function but in functions it doesn't.

I have been going around in circles for hours with this. It is my first question online about R. I am trying to create a function that contains a loop. The function takes a vector that the user submits, as in pollutantmean(4:6), then loads a set of csv files (in the directory mentioned) and binds them. What is strange (to me) is that if I assign the variable id and then run the loop without using a function, it works! When I put it inside a function so that the user can supply the id vector, it does nothing. Can someone help? Thank you!
pollutantmean<-function(id=1:332)
{
#read files
allfiles<-data.frame()
id<-str_pad(id,3,pad = "0")
direct<-"/Users/ped/Documents/LearningR/"
for (i in id) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
}
Your function is missing a return value. (@Roland)
pollutantmean<-function(id=1:332) {
#read files
allfiles<-data.frame()
id<-str_pad(id,3,pad = "0")
direct<-"/Users/ped/Documents/LearningR/"
for (i in id) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
return(allfiles)
}
Edit:
Your mistake was that you did not specify what the function should return. In R, objects created inside a function live in the function's own environment, so you have to state which object the function returns.
With my comment about accepting my answer, I meant this: (...To mark an answer as accepted, click on the check mark beside the answer to toggle it from greyed out to filled in...).
Consider also lapply with do.call, which does not need an explicit return() as the last line of the function:
pollutantmean <- function(id=1:332) {
id <- str_pad(id,3,pad = "0")
direct_files <- paste0("/Users/ped/Documents/LearningR/", id, ".csv")
# Read the files into a list and row-bind them; the value of the last
# expression is what the function returns
do.call(rbind, lapply(direct_files, read.csv))
}
OK, I got it. I was expecting the objects built inside the function to show up in R's environment, but they don't. R still does all the calculations. Thanks a lot for the replies!
pollutantmean<-function(directory,pollutant,id)
{
#read files
allfiles<-data.frame()
id2<-str_pad(id,3,pad = "0")
direct<-paste("/Users/pedroalbuquerque/Documents/Learning R/",directory,sep="")
for (i in id2) {
path<-paste(direct,"/",i,".csv",sep="")
file<-read.csv(path)
allfiles<-rbind(allfiles,file)
}
#averaging pollutants
mean(allfiles[,pollutant],na.rm = TRUE)
}
pollutantmean("specdata","nitrate",23:35)
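For anyone who wants to try the pattern without the original data, here is a self-contained sketch in base R. It swaps in sprintf("%03d", ...) for str_pad so that no package is needed, and it writes two throwaway CSV files to a temporary directory. The helper name read_monitors, the directory name and the data are all made up for illustration.

```r
# Hypothetical helper: read <directory>/<id>.csv for each id and row-bind them.
read_monitors <- function(directory, id) {
  paths <- file.path(directory, sprintf("%03d.csv", id))
  # No assignment on the last line, so the combined data frame is returned.
  do.call(rbind, lapply(paths, read.csv))
}

# Create two small demo files with made-up values.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:2) {
  write.csv(data.frame(ID = k, nitrate = c(k, NA)),
            file.path(demo_dir, sprintf("%03d.csv", k)),
            row.names = FALSE)
}

# The result only exists in the workspace if the caller assigns it.
allfiles <- read_monitors(demo_dir, 1:2)
nitrate_mean <- mean(allfiles$nitrate, na.rm = TRUE)
print(nitrate_mean)
```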

Warning message: number of items to replace is not a multiple of replacement length

I got warnings when running this code.
For example, when I put
tm1<- summary(tmfit)[c(4,8,9)],
I can get the result, but I need to run this code for each $i$.
Why do I get this error?
Is there any way to do this instead of via a for loop?
Specifically, I have many regressands ($y$) with the same two regressors ($x$'s).
How can I get the results of these regressions (to make some comparisons)?
dreg=read.csv("dayreg.csv")
fundr=read.csv("fundreturnday.csv")
num=ncol(fundr)
exr=dreg[,2]
tm=dreg[,4]
for(i in 2:num)
{
tmfit=lm(fundr[,i]~exr+tm)
tm1[i]<- summary(tmfit)[c(4,8,9)]
}
Any help is highly appreciated
Try storing your result into a list instead of a vector.
dreg=read.csv("dayreg.csv")
fundr=read.csv("fundreturnday.csv")
num=ncol(fundr)
exr=dreg[,2]
tm=dreg[,4]
tm1 = list()
for(i in 2:num)
{
tmfit=lm(fundr[,i]~exr+tm)
tm1[[i]]<- summary(tmfit)[c(4,8,9)]
}
You can look at an element in the list like so
tm1[[2]]
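The warning most likely arises because summary(tmfit)[c(4,8,9)] is a list of three items, while tm1[i] refers to a single slot; tm1[[i]] stores the whole three-item list in that slot. A loop-free alternative is lapply, sketched below. The built-in mtcars data stands in for the poster's CSV files, and the response and regressor columns are arbitrary choices for illustration. Elements 4, 8 and 9 of a summary.lm object are coefficients, r.squared and adj.r.squared, so the sketch selects them by name.

```r
# Arbitrary responses from mtcars, each regressed on the same two regressors.
responses <- c("mpg", "qsec")

fits <- lapply(responses, function(y) {
  fit <- lm(reformulate(c("wt", "hp"), response = y), data = mtcars)
  summary(fit)[c("coefficients", "r.squared", "adj.r.squared")]
})
names(fits) <- responses

# Compare the fits, for example by R-squared.
r2 <- sapply(fits, function(s) s$r.squared)
print(r2)
```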

R - create iterable list/dataframe from unique()

I'd like to get the unique elements from a column. That seems straightforward. Both of these work, but I'm not getting the object type I'd like:
userlist <- as.list(somebigdf$username)
userlist <- unique(userlist)
or
userlist <- unique(somebigdf$username)
When I iterate through, I'm not getting the names:
for(i in 1:length(userlist)){
cat(names(userlist[i]), '\n')
}
Returns blank spaces.
for(i in userlist){
cat(i, '\n')
}
Returns integers.
The above function is just an example. I'll be using that but also matching the returned name in an if-else function.
The object types seem to be integers or an extended data.frame with lots of values for each name - which isn't what I want. I would really just like a list of strings something along the lines of userlist = c( the results from unique).
Edit -
This code will iterate correctly through the names:
for(name in unique(somebigdf$username)){
cat(name, '\n')
}
I'm accepting my own answer. Namely, a working solution - this code will iterate correctly through the names:
for(name in unique(somebigdf$username)){
cat(name, '\n')
}
If someone at a later date has a better answer that seems more in keeping with the question, I will be happy to accept that as the answer.
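If the goal is a plain character vector of the unique names, one option is to convert explicitly. This sketch assumes the integers appeared because username is stored as a factor, which the question does not confirm. The data frame here is made up for illustration.

```r
# Made-up data; username is deliberately a factor.
somebigdf <- data.frame(username = factor(c("bob", "alice", "bob")))

# as.character() turns the factor values into the strings they label.
userlist <- as.character(unique(somebigdf$username))

for (name in userlist) {
  cat(name, "\n")
}
```

The resulting userlist can be indexed or compared with == in an if-else like any other character vector.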

Resources