I want to adjust my function so that my if and else if statements recognize the name of the dataframe used and execute the correct plotting function. These are some mock data structured the same as mine:
df1<-data.frame(A=c(1,2,2,3,4,5,1,1,2,3),
B=c(4,4,2,3,4,2,1,5,2,2),
C=c(3,3,3,3,4,2,5,1,2,3),
D=c(1,2,5,5,5,4,5,5,2,3),
E=c(1,4,2,3,4,2,5,1,2,3),
dummy1=c("yes","yes","no","no","no","no","yes","no","yes","yes"),
dummy2=c("high","low","low","low","high","high","high","low","low","high"))
df1[colnames(df1)] <- lapply(df1[colnames(df1)], factor)
vals <- colnames(df1)[1:5]
dummies <- colnames(df1)[-(1:5)]
step1 <- lapply(dummies, function(x) df1[, c(vals, x)])
step2 <- lapply(step1, function(x) split(x, x[, 6]))
names(step2) <- dummies
tbls <- unlist(step2, recursive=FALSE)
tbls<-lapply(tbls, function(x) x[(names(x) %in% names(df1[c(1:5)]))])
A<-lapply(tbls,"[", c(1,2))
B<-lapply(tbls,"[", c(3,4))
C<-lapply(tbls,"[", c(3,4))
list<-list(A,B,C)
names(list)<-c("A","B","C")
And this is my function:
plot_1<-function (section, subsample) {
data<-list[grep(section, names(list))]
data<-data[[1]]
name=as.character(names(data))
if(section=="A" && subsample=="None"){plot_likert_general_section(df1[c(1:2)],"A")}
else if (section==name && subsample=="dummy1"){plot_likert(data$dummy1.yes, title=paste("How do the",name,"topics rank?"));plot_likert(data$dummy1.no, title = paste("How do the",name,"topics rank?"))}
}
Basically, I want it to plot a certain graph when I specify the section and subsample I'm interested in. For example, if I want to plot section C and subsample dummy1, I just write:
plot_1(section="C", subsample="dummy1")
I want to avoid writing this:
else if (section=="A" && subsample=="dummy1"){plot_likert(data$dummy1.yes, title=paste("How do the A topics rank?"));plot_likert(data$dummy1.no, title = paste("How do the A topics rank?"))}
else if (section=="B" && subsample=="dummy1"){plot_likert(data$dummy1.yes, title=paste("How do the B topics rank?"));plot_likert(data$dummy1.no, title = paste("How do the B topics rank?"))}
else if (section=="C" && subsample=="dummy1"){plot_likert(data$dummy1.yes, title=paste("How do the C topics rank?"));plot_likert(data$dummy1.no, title = paste("How do the C topics rank?"))}
else if (section=="C" && subsample=="dummy2")...
.
.
}
So I tried to extract the dataframe used from the list so that it matches the string of the section typed in the function (data<-list[grep(section, names(list))]) and store its name as a character (name=as.character(names(data))), because I thought that in this way the function would have recognized the string "A", "B" or "C" by itself, without the need for me to specify each condition.
However, when I run it, I get this message: Warning message: In section == name && subsample == "dummy1" : 'length(x) = 4 > 1' in coercion to 'logical(1)'. From what I understand, this is due to the presence of a vector in the condition, but I have no idea how to correct it (I'm still quite new to R).
How can I fix the function so that it does what I want? Thanks in advance!
Well, I can't really test your code without the plot_likert_general_section function or the plot_likert function, but I've done a bit of simplifying and best practices--passing list in as an argument, consistent spaces and assignment operators, etc.--and this is my best guess as to what you want:
plot_1 = function(list, section, subsample) { ## added `list` as an argument
  data = list[[grep(section, names(list))]] # use [[ to extract a single item
  # names(data) has one entry per sub-table, so use the single string `section` in the title
  title = paste("How do the", section, "topics rank?")
  if(subsample == "None"){
    plot_likert_general_section(df1[c(1:2)], section)
  } else {
    yesno = paste(subsample, c("yes", "no"), sep = ".") # e.g. "dummy1.yes", "dummy1.no"
    plot_likert(data[[yesno[1]]], title = title)
    plot_likert(data[[yesno[2]]], title = title)
  }
}
plot_1(list, section = "C", subsample = "dummy1")
I'm not sure if your plot_likert functions use base or grid graphics, but either way you'll need to handle the multiple plots. With base graphics, you can probably use par(mfrow = c(1, 2)) to display both of them; with grid, I'd suggest putting them in a list to return them both, and then maybe using gridExtra::grid.arrange() (or similar) to plot both of them.
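For the base graphics case, a minimal sketch of showing two plots side by side might look like the following. It uses plain barplot() calls with made-up counts as stand-ins for the two plot_likert() calls (an assumption, since those functions aren't available here), and it only applies if the plots really are drawn with base graphics.

```r
# Made-up stand-in data; in the real function these would come from
# the yes/no sub-tables
yes_counts <- c(3, 1, 4, 1, 5)
no_counts <- c(2, 7, 1, 8, 2)

# Split the device into 1 row x 2 columns, remembering the old settings
old_par <- par(mfrow = c(1, 2))

barplot(yes_counts, main = "dummy1 = yes")  # stand-in for the first plot_likert() call
barplot(no_counts, main = "dummy1 = no")    # stand-in for the second plot_likert() call

# Restore the previous layout so later plots are not affected
par(old_par)
```

Saving and restoring the result of par() keeps the layout change local to this pair of plots.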
You're right that the error is due to passing a vector where a single value is expected. Try inserting print statements before the equality test to diagnose why this is.
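As a rough illustration of that diagnosis, the sketch below uses a small hypothetical stand-in for one element of the question's list (the element names follow what split() and unlist() would produce for dummy1 and dummy2; the values are placeholders). It shows why the comparison has length 4, and one possible way to pick the sub-tables for a subsample by name prefix instead.

```r
# Hypothetical stand-in for list[["A"]]: four sub-tables, one per dummy level
data <- list(dummy1.no = 1, dummy1.yes = 2, dummy2.high = 3, dummy2.low = 4)
section <- "A"

name <- names(data)
print(name)                     # four names, not the single string "A"

comparison <- section == name   # element-wise comparison: a logical vector of length 4
print(length(comparison))

# Possible alternative: select the sub-tables whose names start with the subsample
subsample <- "dummy2"
picked <- data[startsWith(names(data), paste0(subsample, "."))]
print(names(picked))
```

Because `&&` expects a single TRUE or FALSE on each side, passing it the length-4 result is what triggers the message.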
Also, be careful with variable names such as list and data, which are also the names of functions that ship with R (see ?list and ?data). I'd also recommend following the tidyverse style guide here: https://style.tidyverse.org/.
I'm trying to create a function that looks up price and car type from a data set. Both will have default arguments. For price, this is easy enough. But for the car type (which I have as factors), I can't find a way to set all factors as default.
The goal is that if you don't set anything in car_type, it will return all possible car types.
search <- function(start_price = 0, end_price = 1000, car_type = ???){
subset_data <- auto[price <= end_price &
price > start_price &
vehicleType == car_type]
return(subset_data)
}
search()
So that the "search()" returns all cars between the prices of 0 and 1000 and of all possible car types. I've tried using vectors and lists, without any luck.
The usual way to approach this is to use NULL as a default and handle that in the function.
search <- function(start_price = 0, end_price = 1000, car_type = NULL){
if (is.null(car_type)) {
car_type <- levels(auto$vehicleType)
}
subset_data <- auto[price <= end_price &
price > start_price &
vehicleType %in% car_type]
return(subset_data)
}
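To try the NULL-default pattern without the original data, here is a self-contained sketch. The auto data frame below is made up, the function is named search_cars (a hypothetical name, chosen so it doesn't shadow base R's search()), and the data is passed in as an argument and indexed with plain data.frame syntax; all of these are assumptions, not part of the original code.

```r
# Made-up example data; the real `auto` data set is not shown in the question
auto <- data.frame(
  price = c(500, 900, 1500, 300),
  vehicleType = factor(c("suv", "coupe", "suv", "kombi"))
)

search_cars <- function(cars_df, start_price = 0, end_price = 1000, car_type = NULL) {
  # NULL means "no restriction": fall back to every level of the factor
  if (is.null(car_type)) {
    car_type <- levels(cars_df$vehicleType)
  }
  keep <- cars_df$price > start_price &
    cars_df$price <= end_price &
    cars_df$vehicleType %in% car_type
  cars_df[keep, , drop = FALSE]
}

all_types <- search_cars(auto)                  # every type priced in (0, 1000]
only_suv <- search_cars(auto, car_type = "suv") # restricted to one type
print(all_types)
print(only_suv)
```

Using %in% rather than == is what lets car_type hold one value, several values, or all levels.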
I am new to R. I want to define an R function, find_element, that takes as its inputs a list and a value of any type, and returns the value of the element in the input list that matches the given value. Thanks for your help.
find_element <- function(arr, val){
count = 0
for(i in arr){
if (i == val){
print(count)
} else
count = count + 1
print ("No Match")
}
}
e.g.
arr <- 1:10
find_element(arr, 10)
# 10
find_element(arr, 12)
# NULL
Just for educational purposes, please, try (although this is not recommended practice in R!):
find_element <- function(arr, val) {
count = 1
for (i in arr) {
if (i == val) {
return(count)
} else
count = count + 1
}
return("No Match")
}
This will yield
arr <- 1:10
find_element(arr, 10)
#[1] 10
find_element(arr, 12)
#[1] "No Match"
Please note:
In R, elements of vectors, etc. are numbered starting with 1.
You have to use return instead of print to indicate the return value of a function (well, I know there's a shortcut, but it's for the purpose of education here).
The final return must come after the for loop.
Built-in function
Also for educational purposes, please, note that Sotos already has shown the R way in his comment:
which(arr == 10)
#[1] 10
which(arr == 12)
#integer(0)
In R, it's almost always better to use the well-documented built-in functions or those from packages. And, yes, try to avoid for loops in R.
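As a small sketch of the built-in route, the two base R functions below cover the common cases: match() gives the position of the first match (or NA when there is none), and which() gives every matching position. The example vectors are made up for illustration.

```r
arr <- 1:10

first_hit <- match(10, arr)  # position of the first element equal to 10
no_hit <- match(12, arr)     # NA, because 12 does not occur in arr

# which() returns all matching positions, which matters when values repeat
all_hits <- which(c(3, 7, 3) == 3)

print(first_hit)
print(no_hit)
print(all_hits)
```

Testing the result with is.na() (for match) or length() == 0 (for which) replaces the "No Match" string from the loop version.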
Learning R online
As pointed out in the (now deleted) answer of engAnt, there are several resources for learning R. https://www.rstudio.com/online-learning/#R lists a number of them.
Consider this dataset:
myData = data.frame(parent = c(1,3,5,7,8,9)
,child = c(2,4,6,8,9,10))
I want to filter this dataset down to only records that belong to parent # 7. In T-SQL, I'd accomplish this using a recursive common table expression. Is there a way to do something like this elegantly in R?
Without a more complex example, it's hard to see exactly what you need, as there is no "recursion" per se here; it's just one level deep. However, to answer your specific question, here are two ways:
myData = data.frame(parent = c(1,3,5,7,8,9),child = c(2,4,6,8,9,10))
#Using Dplyr
library(dplyr)
myData %>% filter(parent == 7)
#Using Base R
myData$child[myData$parent == 7]
Edit:
Per the comments below, I think this simple script will do it. There may be a simpler way, but the repeat loop below is the first that came to mind:
newParent = 7
answer <- c()
repeat{
currentResult <- myData$child[myData$parent == newParent]
if(length(currentResult) != 0){
newParent <- currentResult
answer <- c(answer, newParent)
}else{
break
}
}
> answer
[1] 8 9 10
Edit #2
Here's the same thing rewritten recursively given that the OP was interested in seeing how it is done:
findChild <- function(myData, parent){
currentChild <- myData$child[myData$parent == parent]
if(length(currentChild) != 0){
return(c(currentChild, findChild(myData, currentChild)))
}else{
return()
}
}
findChild(myData, 7)
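If a parent can have more than one child, comparing with == against a vector of parents no longer does the right thing. A possible variant, sketched here under the assumption that the data contains no cycles (a cycle would make it recurse forever), uses %in% instead. The names find_descendants and branching are made up for this example.

```r
find_descendants <- function(edges, parents) {
  # All children of any of the current parents
  children <- edges$child[edges$parent %in% parents]
  if (length(children) == 0) {
    return(numeric(0))
  }
  # Children first, then everything below them
  c(children, find_descendants(edges, children))
}

myData <- data.frame(parent = c(1, 3, 5, 7, 8, 9),
                     child = c(2, 4, 6, 8, 9, 10))
chain <- find_descendants(myData, 7)

# Hypothetical data where parent 7 has two children
branching <- data.frame(parent = c(7, 7, 8), child = c(8, 9, 10))
tree <- find_descendants(branching, 7)

# Filter the original data down to the records under parent 7
records <- myData[myData$parent %in% c(7, chain), ]
print(chain)
print(tree)
print(records)
```

The last step returns the rows themselves rather than just the child ids, which is closer to what the recursive CTE in the question would give.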
UPDATE
Thanks to the help and suggestions of @CarlWitthoft my code was simplified to this:
model <- unlist(sapply(1:length(model.list),
function(i) ifelse(length(model.list[[i]][model.lookup[[i]]] == "") == 0,
NA, model.list[[i]][model.lookup[[i]]])))
ORIGINAL POST
Recently I read an article explaining that vectorizing operations in R, instead of using for loops, is good practice. I have a piece of code with a big for loop that I'm trying to turn into a vectorized operation, but I cannot find the answer. Could someone help me? Is it possible, or do I need to change my approach? My code works fine with the for loop, but I want to try the other way.
model <- c(0)
price <- c(0)
size <- c(0)
reviews <- c(0)
for(i in 1:length(model.list)) {
if(length(model.list[[i]][model.lookup[[i]]] == "") == 0) {
model[i] <- NA
} else {
model[i] <- model.list[[i]][model.lookup[[i]]]
}
if(length(model.list[[i]][price.lookup[[i]]] == "") == 0) {
price[i] <- NA
} else {
price[i] <- model.list[[i]][price.lookup[[i]]]
}
if(length(model.list[[i]][reviews.lookup[[i]]] == "") == 0) {
reviews[i] <- NA
} else {
reviews[i] <- model.list[[i]][reviews.lookup[[i]]]
}
size[i] <- product.link[[i]][size.lookup[[i]]]
}
Basically, model.list is a list from which I want to extract a particular value for each entry. The location of that value is given by the variables model.lookup, price.lookup and reviews.lookup, which contain logical vectors with just one TRUE value that is used to return the desired value from model.list. On every iteration of the for loop, the extracted values are stored in the variables model, price, size and reviews.
Could this be changed to a vector operation?
In general, try to avoid if when not needed. I think your desired output can be built as follows.
model <- unlist(sapply(1:length(model.list), function(i) model.list[[i]][model.lookup[[i]]]))
model[model=='']<-NA
And the same for your other variables. This assumes that all model.lookup[[i]] are of length one. If they aren't, you won't be able to write the output to a single element of model in the first place.
I would also note that you are grossly overcoding, e.g. x<-0 is better than x<-c(0), and don't bother with length evaluation on a single item.
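If some lookups match nothing at all (so the extraction returns a zero-length vector rather than an empty string), one possible variant is to fill in NA per element so the result stays aligned with the input. This is only a sketch: the contents of model.list and model.lookup below are made up, and extract_one is a hypothetical helper name.

```r
# Made-up stand-ins for the question's objects
model.list <- list(
  c("Model X", "$10", "5 reviews"),
  c("$20", "3 reviews"),
  c("Model Z", "$30")
)
model.lookup <- list(
  c(TRUE, FALSE, FALSE),
  c(FALSE, FALSE),        # no match for this entry
  c(TRUE, FALSE)
)

# Return the first selected value, or NA when nothing is selected
extract_one <- function(values, lookup) {
  hit <- values[lookup]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# vapply() guarantees exactly one character value per list entry
model <- vapply(
  seq_along(model.list),
  function(i) extract_one(model.list[[i]], model.lookup[[i]]),
  character(1)
)
print(model)
```

The same helper could be reused with price.lookup and reviews.lookup, and seq_along() avoids the 1:length() pitfall when the list is empty.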