I have a data frame (samples x species) and I want to run this command on each column in turn:
dist <- vegdist(decostand(X,"standardize",MARGIN=2), method="euclidean")
I need the name of the column in each of the new dist-values. So if my columns are called A, B, C, then the result should be dist-values called Dist.A, Dist.B, Dist.C, and so on. I believe this can be done with paste, but I have no clue how.
You can try (if it is a data.frame)
d1[] <- lapply(colnames(d1), function(x) paste(d1[,x], x, sep="."))
Or
d1[] <- Map(function(x,y) paste(x, y, sep="."), d1, colnames(d1))
If it is a matrix
m1[] <- paste(m1, colnames(m1)[col(m1)],sep=".")
data
m1 <- matrix(1:15, ncol=3, dimnames=list(NULL, LETTERS[1:3]))
d1 <- as.data.frame(m1)
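If the goal is one distance object per column, named Dist.A, Dist.B, and so on, a minimal sketch could look like the following. It assumes a toy data frame `X` with made-up values, and it swaps in base R's `scale()` and `dist()` for the `decostand()`/`vegdist()` calls from the question so that it runs without vegan; the naming step with `paste0()` is the same either way.

```r
# Toy stand-in for the samples x species data frame (hypothetical values).
set.seed(1)
X <- data.frame(A = rnorm(5), B = rnorm(5), C = rnorm(5))

# One distance object per column. scale() standardizes the column and
# dist() computes Euclidean distances between samples. With vegan loaded,
# the vegdist(decostand(...)) call from the question could be used here instead.
dists <- lapply(X, function(col) dist(scale(col), method = "euclidean"))

# Prefix each element name with "Dist." (paste0 is paste with sep = "").
names(dists) <- paste0("Dist.", names(X))

names(dists)
```

Individual results can then be accessed as `dists$Dist.A`, `dists[["Dist.B"]]`, and so on, which is usually easier to work with than separate variables in the workspace.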
Related
I'm very new to R, and I heard it's best to replace loops with apply functions. However, I couldn't wrap my head around how to transform the loop in this example. Any help would be appreciated.
file_path is a list of file names
file_path[1] = "/home/user/a.rds"
file_path[2] = "/home/user/b.rds"
...
vector_sum <- rep(0,50000)
for(i in 1:5){
temp_data <- readRDS(file_path[i])
temp_data <- as.matrix(temp_data[,c("loss_amount")])
vector_sum <- vector_sum + temp_data
}
My goal is to loop through all the files, keep only the loss_amount column from each, and add it to vector_sum, so that in the end vector_sum is the element-wise sum of the loss_amount columns from all files.
Using rowSums.
rowSums(sapply(file_path, \(x) readRDS(x)[, 'loss_amount'], USE.NAMES=F))
# [1] 1.2060989 1.4248851 -0.4759345
Data:
set.seed(42)
l <- replicate(3, matrix(rnorm(6), 3, 2, dimnames=list(NULL, c('x', 'loss_amount'))), simplify=F)
dir.create('foo') ## creates `foo` in wd (only warns if it already exists)
Map(\(x, y) saveRDS(x, paste0('foo/', y, '.rds')), l, letters[seq_along(l)])
file_path <- list.files('foo', full.names=TRUE)
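Here is one possible way to solve your problem using lapply. Because you want an element-wise sum (a vector, not a single number), combine the columns with Reduce and `+` rather than with sum:
Reduce(`+`, lapply(file_path, \(fle) readRDS(fle)[, "loss_amount"]))
# or, binding the columns into a matrix first
rowSums(do.call(cbind, lapply(file_path, \(fle) readRDS(fle)[, "loss_amount"])))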
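To check that a loop-free version really reproduces the original loop, something like the sketch below may help. The files, the `loss_demo` directory name, and the four-row data are all made up for illustration and are written to `tempdir()`. Note that calling `sum()` on the collected columns returns a single grand total, whereas the loop builds a vector.

```r
# Write three small throwaway files to a temporary directory (hypothetical data).
set.seed(42)
tmp <- file.path(tempdir(), "loss_demo")
dir.create(tmp, showWarnings = FALSE)
for (nm in c("a", "b", "c")) {
  saveRDS(data.frame(x = rnorm(4), loss_amount = rnorm(4)),
          file.path(tmp, paste0(nm, ".rds")))
}
file_path <- list.files(tmp, pattern = "\\.rds$", full.names = TRUE)

# The original loop: accumulate the column file by file.
vector_sum <- rep(0, 4)
for (f in file_path) vector_sum <- vector_sum + readRDS(f)[, "loss_amount"]

# Loop-free version: read all columns, then add them element-wise.
cols <- lapply(file_path, function(f) readRDS(f)[, "loss_amount"])
reduce_sum <- Reduce(`+`, cols)

all.equal(vector_sum, reduce_sum)  # both are vectors of length 4
length(sum(unlist(cols)))          # sum() collapses everything to one number
```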
I am trying to add the name of the outer list element as a column to each data frame in its sublist, so that I can call rbind afterwards and join all the datasets. In other words, to join the datasets I need to keep the original list name for reference. I accomplished it with a for loop, but I wonder if I could use purrr or the apply functions instead, because my data is far more complex than the example below.
To illustrate what I mean, here is a simple example:
x1 <- data.frame(a = 1:3, b = letters[1:3])
x2 <- data.frame(a = 4:6, b = letters[4:6])
y1 <- data.frame(a = 10:15, b = letters[10:15])
y2 <- data.frame(a = 13:17, b = letters[13:17])
l1 <- list(x1, y1)
l2 <- list(x2, y2)
l <- list(l1, l2)
nm <- c("list1", "list2")
names(l) <- nm
lmod <- l
for (i in 1:length(lengths(l))) {
for(j in 1:length(l[[i]])) {
lmod[[i]][[j]]$nm = names(l[i])
}
}
Using lapply I tried something like:
lmod <- lapply(l, function(x) {
x[[1]] <- names(x)
x[[2]] <- names(x)
return(x)
})
But it did not work at all.
Does anyone have a clue on this one?
Map(lapply, l, list(cbind), nm=names(l))
This can also be written as:
Map(\(x, y)lapply(x, cbind, nm = y),l, names(l))
all.equal(lmod, Map(lapply, l, list(cbind), nm=names(l)))
[1] TRUE
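Since the stated goal is to rbind everything afterwards, here is a hedged sketch of the full round trip on the example data. The helper names `flat` and `combined` are made up for illustration; the idea is to tag each data frame with its outer list name, flatten one level, and bind the rows.

```r
x1 <- data.frame(a = 1:3, b = letters[1:3])
x2 <- data.frame(a = 4:6, b = letters[4:6])
y1 <- data.frame(a = 10:15, b = letters[10:15])
y2 <- data.frame(a = 13:17, b = letters[13:17])
l <- list(list1 = list(x1, y1), list2 = list(x2, y2))

# Add the outer list name as column `nm` to every data frame in each sublist.
lmod <- Map(function(sub, name) lapply(sub, function(d) cbind(d, nm = name)),
            l, names(l))

# Flatten one level (a list of four data frames) and stack the rows.
flat <- unlist(lmod, recursive = FALSE)
combined <- do.call(rbind, flat)
rownames(combined) <- NULL

table(combined$nm)
```

`recursive = FALSE` matters here: without it, `unlist()` would also take the data frames themselves apart into their columns.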
How to apply a function on every second column of a data frame? That is to say, how to modify df2 <- sapply(df1, fun) such that df2 equals df1 but with fun applied to every second column? Here is what I tried:
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
df1 <- data.frame(a,b)
df2 <- sapply(df1[c(TRUE, FALSE)], function(x) x^2)
isTRUE(dim(df1)==dim(df2)) # FALSE
The problem with this code is that it drops all columns to which fun was not applied (dim(df2) # 5 1).
Assigning variables to slices
You can assign new values for subsets of an object. Say for:
x <- c(1,2,3)
x[2] <- 4
Now x will be c(1,4,3). Similarly, you can do this for rows or columns of a matrix or data frame. Here we use the apply function with second argument 2 for columns (1 for rows). I recommend the seq function to generate the indices: from=1, by=2 gives odd and from=2, by=2 gives even indices. Specifying it this way generalises to other subsets and makes it straightforward to check you got it right.
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
df1 <- data.frame(a,b)
df2 <- df1
# drop=FALSE keeps a single selected column as a data frame, which apply() needs
df2[,seq(1, ncol(df2), 2)] <- apply(df2[,seq(1, ncol(df2), 2), drop=FALSE], 2, function(x) x^2)
Loops
Note that you can also do this with a loop:
df2 <- df1
for(col in seq(1, ncol(df2), 2)) df2[,col] <- sapply(df2[,col], function(x) x^2)
Vectorised functions
Since the squared operation is "vectorised" in R, in this case you could also do:
df2 <- df1
for(col in seq(1, ncol(df2), 2)) df2[,col] <- df2[,col]^2
Or use vectorisation completely:
df2 <- df1
df2[,seq(1, ncol(df2), 2)] <- df2[,seq(1, ncol(df2), 2)]^2
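For a function that is not vectorised the way `^` is, the same "assign to a slice" idea can be combined with `lapply()`, which keeps the result a data frame with the original dimensions. This is a minimal sketch: `fun` is a hypothetical stand-in for whatever function is being applied, and a third column `c` is added to the example so that more than one column is affected.

```r
df1 <- data.frame(a = 1:5, b = 6:10, c = 11:15)
fun <- function(x) x^2              # hypothetical stand-in for `fun`

idx <- seq(1, ncol(df1), by = 2)    # every second column: 1, 3, ...
df2 <- df1
df2[idx] <- lapply(df1[idx], fun)   # list-style indexing keeps data frame columns

identical(dim(df1), dim(df2))
```

Using `df2[idx]` rather than `df2[, idx]` avoids the case where a single selected column is silently dropped to a plain vector.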
Consider this code:
df <- as.data.frame(matrix(rnorm(9),3,3))
names(df) <- c("A","B","C")
y <- c(1,2,3)
list1 <- lapply(df, function(x) as.vector(x))
par(mfrow=c(1,3))
lapply(list1, function(x) plot(x,y))
I want to paste the name of each vector in list1 (A, B, C) to the x-axis of the respective x,y-plot.
Can I do it during lapply, or is it necessary to write a loop?
You can do it using 'lapply', while iterating over names:
lapply(names(list1),function(nn){
plot(list1[[nn]],y,xlab=nn)
}
)
Packages (wmtsa, gtools, caret)
I have the following vector and below the following function
z2 <- c(-0.1100, 0.1800, 0.0620, 0.1000,-0.0730,-0.1310, 0.2170,-0.0680,-0.0840,
-0.1350,-0.3070, 0.0670, 0.1360, 0.1000,-0.0150, 0.2450,0.1190,-0.0560, -0.0600,
-0.1400, -0.0420, 0.1250, 0.0060, -0.0280,-0.0620, -0.0010, 0.0880, -0.0180, 0.0720,
0.3160, -0.0270, -0.0460,0.0600, -0.0300, 0.0550, -0.0560, 0.1780, 0.0200, 0.0070,
0.0430)
Wavs4 <- wavMODWT(z2, wavelet="s4", n.levels=ilogb(length(z2), base=2),
position=list(from=1, by=1, units=character()), units=character(),
title.data=character(), documentation=character(), keep.series=FALSE)
MRD4<-wavMRD(Wavs4, level=NULL, osc=NULL)
wavs4access<-as.matrix(MRD4)
wavs4access<-as.matrix(wavs4access)
Dxu4 <- wavs4access [,"D1"]
Dxi4 <- wavs4access [,"D2"]
Dxa4 <- wavs4access [,"D3"]
#From above selected combinations
a <- c("Dxi4","Dxu4","Dxa4")
b <- combinations(3, 2, a, set=TRUE, repeats.allowed=FALSE)
#I extract the coefficients from comb...
d1 <- c(b[[1,1]],b[[1,2]])
d2 <- c(b[[2,1]],b[[2,2]])
d3 <- c(b[[3,1]],b[[3,2]])
#I create a list
dlist<-list(d1=d1,d2=d2,d3=d3)
I would like to apply the above dlist in the below expression pastevar, so I produce an expression that can be recursively looped for all d values with a function, rather than writing three times the below expression and then running the function every time.
pastevar <- paste(c("z2[1:length(z2)] ~ ", paste(d1, collapse=" + ")))
X <- model.matrix(as.formula(pastevar))[,-1]
X <- data.frame(X)
Y <- z2[1:length(z2)]
Thank you
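One way this could be approached is to keep the candidate predictors in a data frame and let `lapply()` build one formula and design matrix per element of `dlist`. The sketch below is only an illustration under several assumptions: the columns are simulated with `rnorm()` instead of coming from `wavMODWT()`/`wavMRD()`, base R's `combn()` replaces `gtools::combinations()`, and the names `dat` and `fits` are made up.

```r
# Simulated stand-ins for z2 and the three detail series (hypothetical data).
set.seed(1)
n <- 40
dat <- data.frame(z2 = rnorm(n), Dxu4 = rnorm(n), Dxi4 = rnorm(n), Dxa4 = rnorm(n))

# All pairs of predictor names, stored as a named list d1, d2, d3.
a <- c("Dxi4", "Dxu4", "Dxa4")
dlist <- setNames(combn(a, 2, simplify = FALSE), paste0("d", 1:3))

# For each pair: build the formula, then the design matrix without the intercept.
fits <- lapply(dlist, function(d) {
  f <- reformulate(d, response = "z2")   # e.g. z2 ~ Dxi4 + Dxu4
  X <- data.frame(model.matrix(f, data = dat)[, -1, drop = FALSE])
  list(formula = f, X = X, Y = dat$z2)
})

fits$d1$formula
```

`reformulate()` builds the formula from the character vector directly, so there is no need to paste a string together and call `as.formula()` on it.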