R rolling function based upon columns - r

In R, I am attempting to create a column of a local min/max, based on 2 other columns.
In particular, I want the 3rd column to be a "current" column: when x1 > current or x2 < current, I want to update currentValue; otherwise, it should keep the previous currentValue.
Initially, I set the entire currentValue column to my starting value.
As can be seen, Row 5 should be using the currentValue of 5, and no change should be made. However, the comparison is being made to the value of 2 instead.
Any help would be greatly appreciated as I am unfamiliar with applying custom rolling functions in R. It seems like there should be an elegant solution for this, but a few other similar posts require a lot of code to accomplish this.
> c1 <- c(1,1,2,5,4,3,2,1)
> c2 <- c(2,3,3,6,6,4,4,2)
> c3 <- 2
> tempData <- data.frame(c1,c2,c3)
> names(tempData) <- c("x1", "x2", "currentValue")
> tempData
x1 x2 currentValue
1 1 2 2
2 1 3 2
3 2 3 2
4 5 6 2
5 4 6 2
6 3 4 2
7 2 4 2
8 1 2 2
>
> tempData$currentValue <- ifelse (tempData$x1 > lag(tempData$currentValue), tempData$x1, ifelse(tempData$x2 < lag(tempData$currentValue), tempData$x2, lag(tempData$currentValue)))
> tempData
x1 x2 currentValue
1 1 2 NA
2 1 3 2
3 2 3 2
4 5 6 5
5 4 6 4
6 3 4 3
7 2 4 2
8 1 2 2

I think this code could help you.
Applying the lag function inside the ifelse statement is problematic: I suspect it is not shifting the values of the column the way you expect in your code. Either way, check the following code.
c1 <- c(1,1,2,5,4,3,2,1)
c2 <- c(2,3,3,6,6,4,4,2)
c3 <- 2
tempData <- data.frame(c1,c2,c3)
names(tempData) <- c("x1", "x2", "currentValue")
tempData
tempData$x1.lag <- c(NA, tempData$x1[1:7] )
tempData$x2.lag <- c(NA, tempData$x2[1:7] )
tempData
tempData$currentValue <- ifelse (tempData$x1 > tempData$x1.lag , tempData$x1,
ifelse( tempData$x2 < tempData$x2.lag, tempData$x2, tempData$currentValue))
tempData$x1.lag <- NULL
tempData$x2.lag <- NULL
tempData
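The rule in the question depends on the value computed for the previous row, so a single vectorised ifelse over a lagged copy of the initial column cannot express it. Below is a minimal sketch with an explicit loop, assuming the starting value of 2 from the question; the names start, prev and current are only illustrative.

```r
x1 <- c(1, 1, 2, 5, 4, 3, 2, 1)
x2 <- c(2, 3, 3, 6, 6, 4, 4, 2)
start <- 2  # assumed starting value, taken from the question

current <- numeric(length(x1))
prev <- start
for (i in seq_along(x1)) {
  if (x1[i] > prev) {
    prev <- x1[i]        # x1 rose above the running value
  } else if (x2[i] < prev) {
    prev <- x2[i]        # x2 fell below the running value
  }
  current[i] <- prev     # otherwise the previous value is carried forward
}

tempData <- data.frame(x1, x2, currentValue = current)
tempData$currentValue
```

For the sample data this gives 2 2 2 5 5 4 4 2, so row 5 keeps the value 5 as the question expects.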

Related

variable names in for loop

x_names <-c("x1","x2","x3")
data <- c(1,2,3,4)
fake <- c(2,3,4,5)
for (i in x_names)
{
x = fake
data = as.data.frame(cbind(data,x))
#data <- data %>% rename(x_names = x)
}
I made a toy example. This code will generate a data frame with 1 column called data and 3 columns called x. Instead of calling the columns x, I want them to be named x1, x2, x3 (stored in x_names). I tried using x_names in the code (the commented-out line), but it does not work. Could you help me with it?
We can also use map_dfc from tidyverse:
library(tidyverse)
cbind(data, map_dfc(x_names, ~ tibble(!!.x := fake)))
Output:
data x1 x2 x3
1 1 2 2 2
2 2 3 3 3
3 3 4 4 4
4 4 5 5 5
We can avoid the for loop and use replicate to repeat fake data using setNames to name the dataframe with x_names.
cbind(data, setNames(data.frame(replicate(length(x_names), fake)), x_names))
# data x1 x2 x3
#1 1 2 2 2
#2 2 3 3 3
#3 3 4 4 4
#4 4 5 5 5
Ideally one should avoid growing objects in a loop; however, one way to solve the OP's problem in a loop is
for (i in seq_along(x_names)) {
data = cbind.data.frame(data, fake)
names(data)[i + 1] <- x_names[i]
}
An option is just to assign the 'fake' to create the new columns in base R
data[x_names] <- fake
data
# data x1 x2 x3
#1 1 2 2 2
#2 2 3 3 3
#3 3 4 4 4
#4 4 5 5 5
EDIT: Based on comments from @avid_useR
data
data <- data.frame(data)
If you replace your commented-out line
#data <- data %>% rename(x_names = x)
with
colnames(data)[ncol(data)] <- i
it should set the right colnames.
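Putting that replacement into the loop from the question, a sketch of the complete version might look like this (same toy data as in the question):

```r
x_names <- c("x1", "x2", "x3")
data <- c(1, 2, 3, 4)
fake <- c(2, 3, 4, 5)

for (i in x_names) {
  x <- fake
  data <- as.data.frame(cbind(data, x))
  # The column just added is always the last one, so rename it to i.
  colnames(data)[ncol(data)] <- i
}

names(data)
```

The names come out as data, x1, x2, x3.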

Building all combinations of a vector - looking for a nicer way

I have a simple case which I could solve in an ugly way, but I am sure a much cleverer (and faster) way must exist.
Let's take this vector
d <- 1:6
I want to list all the possible combinations in a "going-forward" way :
1 2
1 3
1 4
1 5
1 6
2 3
2 4
2 5
...
5 6
The first working approach I came up with is the following
n <- 6
combDF <- data.frame()
for( i in 1:(n-1)){
thisVal <- rep(i,n-i)
nextVal <- cumsum(rep(1,n-1)) + 1
nextVal <- nextVal[nextVal > i]
print("---")
print(thisVal)
print(nextVal[nextVal > i])
df <- data.frame(thisVal = thisVal, nextVal = nextVal)
combDF <- rbind(combDF, df)
}
I am sure there must be a cleverer way of doing that.
Loud debugging? I just found this way
as.data.frame(t(combn(d,m=2)))
V1 V2
1 1 2
2 1 3
3 1 4
4 1 5
5 1 6
6 2 3
7 2 4
8 2 5
9 2 6
10 3 4
11 3 5
12 3 6
13 4 5
14 4 6
15 5 6
One approach using expand.grid and subsetting:
d <- 1:6
foo <- expand.grid(a = d, b = d)
foo[foo[, "a"] > foo[, "b"], c("b", "a")]
This may not be the best way to go about things if you have large vectors and memory constraints as the expand.grid call generates a lot of items that are removed, but it is quite readable and communicates intent clearly.
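As a quick sanity check, the combn and expand.grid approaches can be compared directly. This is only a sketch; the names via_combn and via_grid are made up for illustration.

```r
d <- 1:6

# combn returns one combination per column, so transpose to one per row.
via_combn <- t(combn(d, 2))

# expand.grid builds all 36 pairs; keep those with a > b and put b first.
grid <- expand.grid(a = d, b = d)
via_grid <- unname(as.matrix(grid[grid$a > grid$b, c("b", "a")]))

nrow(via_combn)
all(via_combn == via_grid)
```

Both give 15 rows in the same order. For a vector of length n, expand.grid allocates n^2 rows to keep n(n-1)/2 of them, which is the memory cost mentioned above.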

Delete duplicated records within row in a df in R

I would like to get rid of duplicated records in each row of my df:
df <- data.frame(a=c(1,3,5), b =c(1,2,4), c=c(2,3,7))
a b c
1 1 1 2
2 3 2 3
3 5 4 7
I want to get this:
X1 X2 X3
1 1 NA 2
2 3 2 NA
3 5 4 7
Now, I can achieve this using apply:
data.frame(t(apply(df,1, function(row) ifelse(!duplicated(row), row, NA))))
but it seems unlikely that there isn't a more compact (and perhaps efficient) way of achieving this.
Am I missing a command or package here?
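One possible shorter variant, offered as a sketch rather than a definitive answer, is to build a logical matrix of within-row duplicates and assign NA through it. It still uses apply, and the name dup is only illustrative.

```r
df <- data.frame(a = c(1, 3, 5), b = c(1, 2, 4), c = c(2, 3, 7))

# apply() returns one column per input row, so transpose back to df's shape.
dup <- t(apply(df, 1, duplicated))

# A logical matrix with the same dimensions can index the data frame directly.
df[dup] <- NA
df
```

Unlike the data.frame(t(apply(...))) version, this keeps the original column names a, b, c.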

How to split a list and save objects individually?

I am trying to add a new column to multiple data frames, and then replace the original data frame with the new one. This is how I am creating the new data frames:
df1 <- data.frame(X1=c(1,2,3),X2=c(1,2,3))
df2 <- data.frame(X1=c(4,5,6),X2=c(4,5,6))
groups <- list(df1,df2)
groups <- lapply(groups,function(x) cbind(x,X3=x[,1]+x[,2]))
groups
[[1]]
X1 X2 X3
1 1 1 2
2 2 2 4
3 3 3 6
[[2]]
X1 X2 X3
1 4 4 8
2 5 5 10
3 6 6 12
I'm satisfied with how the new data frames have been created. What I'm stuck on is then breaking up my groups list and then saving the list elements back into their respective original data frames.
Desired Output
Essentially, I want to do something like df1,df2 <- groups[[1]],groups[[2]] but that is of course not syntactically valid. I have more than 2 data frames, which is why I'm hoping for a more programmatic approach than simply typing out N lines of code.
for (i in 1:length(groups)){
assign(paste("df",i,sep=""),as.data.frame(groups[[i]]))
}
should do it. Try it out, please.
@Rockbar led me to a general solution as well:
for(i in 1:length(groups)){
assign(names(groups)[i],as.data.frame(groups[[i]]))
}
> df1
X1 X2 X3
1 1 1 2
2 2 2 4
3 3 3 6
> df2
X1 X2 X3
1 4 4 8
2 5 5 10
3 6 6 12
I should note that this only works if the objects in the list are all named. Thank you again @Rockbar for guiding me to this.
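If the list is named when it is created, base R's list2env can write every element back in one call instead of looping over assign. A minimal sketch, assuming the list names match the variable names you want to overwrite:

```r
df1 <- data.frame(X1 = c(1, 2, 3), X2 = c(1, 2, 3))
df2 <- data.frame(X1 = c(4, 5, 6), X2 = c(4, 5, 6))

# Name the elements so each one knows which variable it belongs to.
groups <- list(df1 = df1, df2 = df2)
groups <- lapply(groups, function(x) cbind(x, X3 = x[, 1] + x[, 2]))

# Copy each list element into the current environment under its own name.
invisible(list2env(groups, envir = environment()))

df1
df2
```

lapply preserves the list names, so df1 and df2 are replaced by their three-column versions.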

R - Using for loop to conditionally change values in a dataframe

All of the variables in the data.frame are on the same 1-5 scale.
Example of data.frame
rpi_invert
A B C D
5 2 4 1
3 5 5 2
1 1 3 4
For all values that equal 5, I would like to change them to 1.
For 4, change to 2.
For 2, change to 4.
For 1, change to 5.
Example of data.frame after values have been changed.
rpi_invert
A B C D
1 4 2 5
3 1 1 4
5 5 3 2
What I have tried.
for(b in colnames(rpi_invert)){
rpi_invert[[b]][rpi_invert[[b]] == 5] <- 1
rpi_invert[[b]][rpi_invert[[b]] == 4] <- 2
rpi_invert[[b]][rpi_invert[[b]] == 2] <- 4
rpi_invert[[b]][rpi_invert[[b]] == 1] <- 5
}
This will only change the values in the first row and not the second column.
for(b in colnames(rpi_invert)){
rpi_invert <- ifelse(rpi_invert[[b]] == 5,1,
ifelse(rpi_invert[[b]] == 4,2,
ifelse(rpi_invert[[b]] == 2,4,
ifelse(rpi_invert[[b]] == 1,5,rpi_invert[[b]]))))
}
But this gives me the error:
Error in rpi_invert[[b]] : subscript out of bounds
If I try the same methods on an individual column instead of looping through the data.frame, then both methods work, so I am not sure what the problem is.
I am sure what I am trying to do can be done more efficiently without a for loop probably with some type of apply function but I am not sure how.
Any help will be appreciated please let me know if further information is needed.
You can try (if your data.frame is df):
3-(df-3)
# A B C D
#1 1 4 2 5
#2 3 1 1 4
#3 5 5 3 2
or, same but written a bit differently: 6-df
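Two details explain the failed attempts. In the first loop the replacements run one after another, so values changed to 1 by the first line are changed back to 5 by the last line. In the second loop rpi_invert itself is overwritten with a plain vector, so the next iteration can no longer find the column. The sketch below rebuilds the sample data and compares the arithmetic answer with an explicit lookup; the names recode, by_arithmetic and by_lookup are only illustrative.

```r
rpi_invert <- data.frame(
  A = c(5, 3, 1),
  B = c(2, 5, 1),
  C = c(4, 5, 3),
  D = c(1, 2, 4)
)

# Option 1: arithmetic inversion of a 1-5 scale.
by_arithmetic <- 6 - rpi_invert

# Option 2: lookup vector; position k of recode holds the new value for k.
recode <- c(5, 4, 3, 2, 1)
by_lookup <- rpi_invert
by_lookup[] <- lapply(rpi_invert, function(col) recode[col])

by_arithmetic
all(by_arithmetic == by_lookup)
```

The lookup form does all replacements in one step, so it also works for mappings that are not a simple reflection such as 6 - x.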

Resources