User-defined function does not function, but line-by-line works - r

I am aiming to define a function that essentially copies a row based on the number of selectable values in another table. Each new row will contain combinations of unique selectable values. I ran the following code on one measure, but plan to use a loop for multiple measures after I can successfully define a function. However, the function does not run, but I can chunk the code and it works fine. Thanks in advance!
output_template_v2<-output_template_v1
measure <- "A"
col <- "1"
add_selOptions_to_output<-function(output_template_v2, measure, col, attributes){
if (tolower(str_sub(attributes$Attribute,-4, -1)) == "_sel"){
selOptions<-attributes[attributes$Measure.Name ==measure & attributes$Attribute == col & attributes$Program == "Blue",]
}
if (length(selOptions$Attribute > 0)){
subcopy<- output_template_v2[output_template_v2$Measure == measure,]
output_template_v2<-output_template_v2[output_template_v2$Measure != measure,]
subcopy<-subcopy[rep(1, length(selOptions$Attribute)),]
}
for (i in seq_along(selOptions)){
subcopy[,col][i]<-selOptions$Attribute[i]
}
output_template_v2 <-rbind(output_template_v2, subcopy)
}

The function works, but it does not modify its arguments, because R copies arguments into the function. Instead, the function returns the modified table. This already works, but a function whose last expression is an assignment returns its value invisibly, so nothing is printed when you call it at the console. The assignment in the last line of your function is therefore redundant, so remove it:
library(stringr)  # provides str_sub()

add_selOptions_to_output <- function (output_template_v2, measure, col, attributes) {
  # Condition kept from the question; note that `if` expects a single TRUE/FALSE value
  if (tolower(str_sub(attributes$Product.Attribute, -4L, -1L)) == "_sel") {
    selOptions <- attributes[
      attributes$Catalog.Measure.Name == measure &
        attributes$Product.Attribute == col &
        attributes$Program == "Blue", ]
  }
  # Compare the length to 0 (the comparison belongs outside length())
  if (length(selOptions$Attribute.Values) > 0) {
    subcopy <- output_template_v2[output_template_v2$`Measure #` == measure, ]
    output_template_v2 <- output_template_v2[output_template_v2$`Measure #` != measure, ]
    subcopy <- subcopy[rep(1L, length(selOptions$Attribute.Values)), ]
  }
  # Iterate over the selectable values (rows), not over the columns of selOptions
  for (i in seq_along(selOptions$Attribute.Values)) {
    subcopy[, col][i] <- selOptions$Attribute.Values[i]
  }
  rbind(output_template_v2, subcopy)
}
Either way, you’ll need to assign the return value of the function back to the argument with which you’re calling it, e.g.:
tmpl = add_selOptions_to_output(tmpl, measure, col, attributes)
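The copy-then-return behaviour can be seen in a minimal, self-contained sketch. The `add_first_row` helper and the small `tmpl` data frame below are made up for illustration and are not part of the original code:

```r
# Hypothetical helper: returns the data frame with a copy of its first row appended.
add_first_row <- function(df) {
  rbind(df, df[1, ])  # the last expression is the return value
}

tmpl <- data.frame(Measure = c("A", "B"), Value = c(1, 2))

add_first_row(tmpl)          # the result is returned, but tmpl itself is unchanged
nrow(tmpl)                   # still 2

tmpl <- add_first_row(tmpl)  # assign the result back to keep the change
nrow(tmpl)                   # now 3
```

Only the reassignment in the second-to-last line changes `tmpl`; calling the function on its own leaves the caller's object as it was.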

Related

Detecting first iteration (cycle) in R loop (without counter)

I'd like to detect the first iteration in a loop within a function from inside the body of the loop (i.e., without using some counter variable defined outside the loop), and in the most flexible possible manner.
Here would be one basic solution, just to demonstrate the idea:
# Define the helper before the loop that calls it
isfirst = function(ele, vec) {
  if (ele == vec[1]) {
    print('this is the first cycle!')
  } else {
    print('this is NOT the first cycle!')
  }
}
vect = c('x', 'y', 'z')
for (elem in vect) {
  print(elem)
  isfirst(elem, vect)
}
The "problem" with this is that I want this function to be easily reusable in any loop: that means that it should not need loop-specific arguments such as elem and vect. That is: another loop might use e.g. for (my_item in my_list) etc., and so then the isfirst arguments would need to be modified correspondingly, e.g. isfirst(my_item, my_list). The ideal way would be to just have an isfirst() without any arguments needed.
I'm not sure whether this is even possible, but I welcome any ideas.
(About why I need this: I would simply want to provide users with a function that behaves differently based on whether or not the iteration is the first, and that they can flexibly use in any loop and don't need to make even this small adjustment of changing the arguments.)
Well, here is the closest I could get:
# Define both helpers before the loop that calls them
enum = function(vec) {
  assign("first_iteration", TRUE, envir = .GlobalEnv)
  vec = mapply(c, seq_along(vec), vec, SIMPLIFY = FALSE) # this is just a small extra, not related to the question
  return(vec)
}
isfirst = function() {
  if (first_iteration == TRUE) {
    print('this is the first cycle!')
    assign("first_iteration", FALSE, envir = .GlobalEnv)
  } else {
    print('this is NOT the first cycle!')
  }
}
vect = c('x', 'y', 'z')
for (elem in enum(vect)) {
  print(elem)
  isfirst()
}
But I'm still hoping for a better solution.
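One possible variation, offered only as a sketch and not taken from the original post, keeps the flag inside a closure instead of the global environment. The `make_isfirst` name is hypothetical, and the approach still requires one setup call before each loop, so it does not fully meet the "no adjustment at all" goal:

```r
# Hypothetical factory: returns a function that is TRUE only on its first call.
make_isfirst <- function() {
  first <- TRUE
  function() {
    was_first <- first
    first <<- FALSE  # updates `first` in the enclosing environment, not globally
    was_first
  }
}

isfirst <- make_isfirst()  # create a fresh tracker before each loop
flags <- logical(0)
for (elem in c('x', 'y', 'z')) {
  flags <- c(flags, isfirst())
}
flags  # TRUE for the first iteration only
```

The trade-off is that the state no longer leaks into `.GlobalEnv`, at the cost of having to call `make_isfirst()` again before every new loop.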

How to include logical checks in a custom function

I have written a custom function that performs a mathematical transformation on a column of data with the inputs being the data and one other input (temperature). I would like to have 2 different logical checks. The first one is whether or not any values in the column exceed a certain threshold, because the transformation is different above and below the threshold. The second is a check if the temperature input is above a certain value and in that case, to deliver a warning that values above the threshold are unusual and to check the data.
Right now, I have the function written with a series of if/else statements. However, this produces a warning that only the first element of the vector of TRUE/FALSE values is being used. A simplified example of my function is as follows:
myfun = function(temp,data) {
if(temp > 34){
warning('Temperature higher than expected')
}
if (data > 50) {
result = temp*data
return(result)
} else if(data <= 50) {
result = temp/data
return(result)
}
}
myfun(temp = c(25,45,23,19,10), data = c(30,40,NA,50,10))
As you can see, because it is only using the first value for the if/else statements, it does not properly calculate the return values because it doesn't switch between the two versions of the transformation. Additionally, it's only checking if the first temp value is above the threshold. How can I get it to properly apply the logical check to every value and not just the first?
Edit: simplified the function per @The_Questioner's suggestion and changed < 50 to <= 50.
The main issue with your code is that you are passing all the values to the functions as vectors, but then are doing single element comparisons. You need to either pass the elements one by one to the function, or put some kind of vectorized comparison or for loop into your function. Below is the for loop approach, which is probably the least elegant way to do this, but at least it's easy to understand what's going on.
Another issue is that NA's apparently need to be handled in the data vector before passing to any of your conditional statements, or you'll get an error.
A final issue is what to do when data = 50. Right now you have conditional tests for greater or less than 50, but as you can see, the 4th point in data is 50, so right now you get an NA.
myfun = function(temp, data) {
  result <- rep(NA, length(temp))
  for (t in seq_along(temp)) {
    # Warn for each temperature above the expected range
    if (temp[t] > 34) {
      warning('Temperature higher than expected')
    }
    # Skip NA data values; the result stays NA for them
    if (!is.na(data[t])) {
      if (data[t] > 50) {
        result[t] <- temp[t] * data[t]
      } else if (data[t] < 50) {
        result[t] <- temp[t] / data[t]
      }
    }
  }
  return(result)
}
Output:
> myfun(temp = c(25,45,23,19,10), data = c(30,40,NA,50,10))
[1] 0.8333333 1.1250000 NA NA 1.0000000
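The vectorized route mentioned above could look something like the following sketch. It is not part of the original answer: `myfun_vec` is a hypothetical name, and it assumes the `<= 50` rule from the question's edit, so a data value of exactly 50 is divided rather than left as NA:

```r
# Vectorized sketch: ifelse() applies the test to every element at once.
myfun_vec <- function(temp, data) {
  # One warning if any temperature is above the expected range
  if (any(temp > 34, na.rm = TRUE)) {
    warning('Temperature higher than expected')
  }
  # An NA in `data` gives NA in the result; values <= 50 take the division branch
  ifelse(data > 50, temp * data, temp / data)
}

res <- myfun_vec(temp = c(25, 45, 23, 19, 10), data = c(30, 40, NA, 50, 10))
res
```

Unlike the loop, this version issues a single warning for the whole vector rather than one per offending element.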

Write similar which function

Sorry for the lack of clarity.
myfunction should return the indices of the elements in a vector that satisfy a condition:
myfunction <- function(vector,condition)
{
seq_along(vector)[vector == condition]
}
myfunction(vector == condition)
Error: object 'conditions' not found
I'm not sure exactly what you want your function to perform. Does it need to show which elements in a vector satisfy a condition (which is what which(vector == 10) would do)? If that is your intent, can you just do something like:
myfunction <- function(vector, condition){
which(vector == condition)
}
In any case, as far as I'm aware, you can't put a test condition in the parameter definitions of your function.
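As a small usage sketch (the `find_matches` name and the sample vector are made up for illustration), the `which()` version is called with the vector and the value as two separate arguments. One difference from the `seq_along()` version is how missing values are treated:

```r
# Hypothetical wrapper around which(), mirroring the function above.
find_matches <- function(vector, condition) {
  which(vector == condition)
}

x <- c(5, 10, NA, 10)

find_matches(x, 10)    # positions 2 and 4; the NA element is dropped
seq_along(x)[x == 10]  # positions 2, NA, 4; the NA element is carried through
```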

using 'input$entry' inside paste0 for variable length entry[1:n]

I have a clunky block of Shiny code in server.R that I feel R syntax should let me collapse into one or two lines, and make more flexible at the same time, using some kind of lapply or do.call call:
if(input$parVary == "area" && as.numeric(input$nTraces) > 3 )
{
area <- c(input$area, input$area2, input$area3, input$area4)
} else if(input$parVary == "area" && as.numeric(input$nTraces) > 2 )
{
area <- c(input$area, input$area2, input$area3)
} else if(input$parVary == "area" && as.numeric(input$nTraces) > 1 )
{
area <- c(input$area, input$area2)
} else
{
area <- input$area
}
But I have spent a day and about a billion different combos of lapply, do.calls, reactive, get, c, and observes around
paste0('input$area', 1:as.numeric(input$nTraces))
I just can't seem to find the right combination or figure out the reactive concept I'm missing. It -seems- to be related to the code not ever including individual input$area1, input$area2, etc... explicit text anywhere in the code?
I spoke a little too soon in the comment above. My specific code ended up needing a conditional to handle the list versus single-value case, but @NicE's answer is the one I was looking for. Five sections like
if(input$parVary == "area" && as.numeric(input$nTraces) > 1 )
{
area <- lapply(paste0("area",1:as.numeric(input$nTraces)),function(x) input[[x]])
} else
{
area <- input$area1
}
teamed with later
mySolution <- list()
if(input$nTraces =='1')
{
mySolution <- solveCalc(dat=dat,tox=tox,area=area,temp=temp,model=modelIn)
} else
{
mySolution <- switch(input$parVary,
"model" = lapply(modelIn,solveCalc,dat=dat,tox=tox,area=area,temp=temp),
"temp" = lapply(temp,solveCalc,dat=dat,tox=tox,area=area,model=modelIn),
"tox" = lapply(tox,solveCalc,dat=dat,temp=temp,area=area,model=modelIn),
"Vt" = lapply(dat,solveCalc,tox=tox,temp=temp,area=area,model=modelIn),
"area" = lapply(area,solveCalc,dat=dat,tox=tox,temp=temp,model=modelIn)
)
}
Got me just what I wanted.
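The key idea is that `paste0('input$area', 1)` only builds the text "input$area1", whereas `input[["area1"]]` looks the element up by name. The sketch below illustrates this outside Shiny, using a plain list as a stand-in for `input`; the stand-in and its values are assumptions for illustration only:

```r
# Stand-in for Shiny's `input` object (a plain list is assumed here for illustration).
input <- list(nTraces = "3", area1 = 1.5, area2 = 2.0, area3 = 2.5)

# Build the element names, then look each one up with [[ ]].
ids <- paste0("area", seq_len(as.numeric(input$nTraces)))
area <- vapply(ids, function(id) input[[id]], numeric(1))

unname(area)  # the three area values, in order
```

`vapply()` is used here instead of `lapply()` only so that the result is a numeric vector rather than a list; either works for the lookup itself.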

R -- screening Excel rows according to characteristics of multiple cells

I am trying to eliminate all rows in Excel that have the following features:
First column is an integer
Second column begins with an integer
Third column is empty
The code I have written appears to run indefinitely. CAS.MULT is the name of my dataframe.
for (i in 1:nrow(CAS.MULT)) {
testInteger <- function(x) {
test <- all.equal(x, as.integer(x), check.attributes = FALSE)
if (test == TRUE) {
return (TRUE)
}
else {
return (FALSE)
}
}
if (testInteger(as.integer(CAS.MULT[i,1])) == TRUE) {
if (testInteger(as.integer(substring(CAS.MULT[i,2],1,1))) == TRUE) {
if (CAS.MULT[i,3] == '') {
CAS.MULT <- data.frame(CAS.MULT[-i,])
}
}
}
}
You should be very wary of deleting rows within a for loop; it often leads to undesired behavior, because the remaining rows shift position while the loop counter keeps running up to the original row count. There are a number of ways you could handle this. For instance, you can flag the rows for deletion and then delete them after.
Another thing I noticed is that you are converting your columns to integers before passing them to your function to test if they are integers, so you will be incorrectly returning true for all values passed to the function.
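To illustrate why converting first hides the problem, here is a minimal sketch. The `is_whole` helper is hypothetical and uses a remainder check rather than the `all.equal` test from the question:

```r
# Hypothetical whole-number check based on the remainder after dividing by 1.
is_whole <- function(v) v %% 1 == 0

x <- 3.7

is_whole(x)              # FALSE: 3.7 is not a whole number
is_whole(as.integer(x))  # TRUE: as.integer() has already truncated 3.7 to 3
```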
Maybe something like this would work (without a reproducible example it's hard to say if it will work or not):
# Define the helper once, outside the loop
testInteger <- function(x) {
  num <- suppressWarnings(as.numeric(x))  # non-numeric text becomes NA
  !is.na(num) && num %% 1 == 0
}
toDelete <- integer(0)
for (i in seq_len(nrow(CAS.MULT))) {
  # Flag the row only when all three conditions hold
  if (testInteger(CAS.MULT[i, 1]) &&
      testInteger(substring(CAS.MULT[i, 2], 1, 1)) &&
      isTRUE(CAS.MULT[i, 3] == '')) {
    toDelete <- c(toDelete, i)
  }
}
# Negative indexing with an empty vector would drop every row, so guard against it
if (length(toDelete) > 0) {
  CAS.MULT <- CAS.MULT[-toDelete, ]
}
Hard to be sure without testing my code on your data, but this might work. Instead of a loop, the code below uses logical indexing based on the conditions you specified in your question. This is vectorized (meaning it operates on the entire data frame at once, rather than by row) and is much faster than looping row by row:
# Drop a row only when all three conditions hold (NA counts as not matching)
remove_row <- CAS.MULT[,1] %% 1 == 0 &
  grepl("^[0-9]", CAS.MULT[,2]) &
  CAS.MULT[,3] == ""
CAS.MULT.screened = CAS.MULT[!(remove_row %in% TRUE), ]
For more on checking whether a value is an integer, see this SO question.
One other thing: Just for future reference, for efficiency you should define your function outside the loop, rather than recreating the function every time through the loop.
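To check the screening logic on something small, here is a sketch with a made-up data frame. It assumes a row is dropped only when all three conditions hold at once, as the nested ifs in the question suggest, and it uses `grepl("^[0-9]", ...)` as one way to test whether a string begins with a digit. The `CAS.toy` name, its columns and its values are all invented for illustration:

```r
# Made-up data; names and values are assumptions for illustration only.
CAS.toy <- data.frame(
  id    = c(1, 2.5, 3, 4),
  label = c("7abc", "9xyz", "abc", "5def"),
  note  = c("", "", "", "keep"),
  stringsAsFactors = FALSE
)

# TRUE for rows where column 1 is a whole number, column 2 starts with a digit,
# and column 3 is empty.
remove_row <- CAS.toy[, 1] %% 1 == 0 &
  grepl("^[0-9]", CAS.toy[, 2]) &
  CAS.toy[, 3] == ""

CAS.toy.screened <- CAS.toy[!remove_row, ]
CAS.toy.screened$id  # only the first row (id 1) is removed
```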

Resources