I want to use an apply statement to do something to each row of a data frame in R.
The following works where I call the function "calc.Sphere.Metrics" with a bunch of parameters and an index i. I store the result in each row.
for(i in 1: dim(position.matrix)[1]){
results.obs[i,] <- calc.Sphere.Metrics(i, culled.mutation.data, position.matrix, protein.metrics, radius)
}
I've tried several apply and mapply statements but am having no luck. What would be the correct way to do this?
EDIT:
As requested, here's a skeleton of calc.Sphere.Metrics
calc.Sphere.Metrics <- function(index, culled.mutation.data, position.matrix, protein.metrics, radius){
results <- matrix(data = 0, nrow = 1, ncol = 8)
colnames(results) <- c("Line.Length","Center", "Start","End","Positions","MutsCount","P.Value", "Within.Range")
results <- as.data.frame(results)
# ...
# Look up a bunch of stuff and fill in each column of results. All the data required is in the parameters passed in and the index.
# ...
return(results)
}
Results has the same number of columns as results.obs in the top function. Hope this helps!
Thanks!
Probably something like this:
results.obs <- do.call(rbind, lapply(seq_len(dim(position.matrix)[1]),
calc.Sphere.Metrics, culled.mutation.data, position.matrix, protein.metrics, radius))
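To see the pattern run end to end, here is a minimal sketch. The function calc_row and the toy data are hypothetical stand-ins, since the body of calc.Sphere.Metrics is not shown in the question; the only assumption carried over is that each call returns a one-row data frame.

```r
# Hypothetical stand-in for calc.Sphere.Metrics: returns a one-row data frame
calc_row <- function(index, position.matrix, radius) {
  data.frame(Index = index,
             Center = position.matrix[index, 1],
             Within.Range = position.matrix[index, 1] <= radius)
}

position.matrix <- matrix(c(1, 5, 9, 2, 4, 6), ncol = 2)  # toy data, 3 rows
radius <- 5

# lapply passes each index as the first argument; the extra arguments follow it
rows <- lapply(seq_len(nrow(position.matrix)), calc_row, position.matrix, radius)

# Bind the one-row data frames into a single data frame, once, at the end
results.obs <- do.call(rbind, rows)
```

Binding once at the end avoids growing results.obs row by row inside the loop.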
Related
Incredibly basic question. I'm brand new to R. I feel bad for asking, but also like someone will crush it:
I'm trying to generate a number of vectors with a for loop, each with a unique name, numbered by iteration. The code I'm attaching throws an error, but I think it explains what I'm trying to do in principle fairly well.
Thanks in advance.
vectorBuilder <- function(num){
for (x in num){
paste0("vec",x) <- rnorm(10000, mean = 0, sd = 1)}
}
numSeries <- 1:10
vectorBuilder(numSeries)
You can write the function to return a named list:
create_vector <- function(n) {
setNames(replicate(n, rnorm(10000), simplify = FALSE),
paste0('vec', seq_len(n)))
}
and call it as:
data <- create_vector(10)
data will be a list of length 10, with each element a vector of length 10000. It is better to keep the data in this list than to create many separate vectors in the global environment. However, if you still want separate vectors, you can use list2env:
list2env(data, .GlobalEnv)
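As a rough illustration of why the list is usually enough, the sketch below repeats the create_vector definition from above and then works with the elements by name. The variable means is only an illustrative name.

```r
# Same definition as above, repeated so this snippet runs on its own
create_vector <- function(n) {
  setNames(replicate(n, rnorm(10000), simplify = FALSE),
           paste0("vec", seq_len(n)))
}

set.seed(1)
data <- create_vector(3)

names(data)         # "vec1" "vec2" "vec3"
length(data$vec2)   # 10000

# One summary per vector, without creating any global variables
means <- sapply(data, mean)
```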
I am trying to build a list of values computed from different data frames named kc2 to kc10. Can anyone advise me on how to write this for loop?
sum_square=append(sum_square,weighted.mean(x=kc2$withinss,w=kc2$size, na.rm=TRUE))
I tried something like this but it didn't work:
for (i in 2:10){
nam1 = paste0("kc",i,"$withinss")
nam2 = paste0("kc",i,"$size")
sum_square = append(sum_square, lapply(c(as.numeric(nam1),as.numeric(nam2)), weighted.mean))
}
There are a lot of problems with the code you posted, so I'll just cut right to the point. In R, when you want to apply a function to multiple objects and collect the result, you should be thinking of using lapply. lapply loops through a list of objects (you can put your data frames into a list), applies the chosen function to each, and then returns the result of each as a list. The below code is in the form of what you want:
# Add data frames to list by name
list_of_data_frames <- list(kc2, kc3, kc4, kc5, kc6, kc7, kc8, kc9, kc10)
# OR add them programmatically
list_of_data_frames <- mget(paste0('kc', seq.int(from = 2, to = 10)))
result <- lapply(list_of_data_frames,
function(x) weighted.mean(x = x$withinss, w = x$size, na.rm=TRUE))
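If a plain numeric vector is preferred over a list, vapply is one option. The sketch below uses two small hypothetical objects in place of the real kc2 to kc10, assuming only that each has withinss and size components.

```r
# Toy stand-ins for the real objects (assumed to have withinss and size)
kc2 <- list(withinss = c(10, 20), size = c(1, 3))
kc3 <- list(withinss = c(4, 6, 8), size = c(2, 2, 2))

objs <- list(kc2 = kc2, kc3 = kc3)

# vapply checks that each result is a single number and returns a named vector
sum_square <- vapply(
  objs,
  function(x) weighted.mean(x = x$withinss, w = x$size, na.rm = TRUE),
  numeric(1)
)
```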
I am a beginner in R and I know the way I have done this is wrong and slow. I would like to fill a matrix, and I need to compute each term. I have tried two for loops; here is the code. Do you know a better way to do it?
KernelGaussianMatrix <- function(x,delta){
Mat = matrix(0,nrow=length(x),ncol=length(x))
for (i in 1:length(x)){
for (j in 1:length(x)){
Mat[i,j] = KernelGaussian(x[i],x[j],delta)
}
}
return(Mat)
}
Thx
You want to use the function outer, as in:
Mat <- outer(x,x,KernelGaussian,delta)
Note that any arguments after the third argument to outer are passed as additional arguments to the function given as the third argument.
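One caveat: outer calls the function once with two long vectors rather than once per pair, so the function has to be vectorized. The sketch below assumes a Gaussian kernel of a common form, since the body of KernelGaussian is not shown in the question; the argument names xi and xj are made up for illustration.

```r
# Hypothetical kernel body (an assumption); it is vectorized in xi and xj
KernelGaussian <- function(xi, xj, delta) exp(-(xi - xj)^2 / (2 * delta^2))

x <- c(0, 1, 2)
delta <- 1

# outer expands x against itself and calls KernelGaussian once on the full vectors
Mat <- outer(x, x, KernelGaussian, delta)

# If the kernel only handled scalars, wrapping it with Vectorize would also work
Mat2 <- outer(x, x, Vectorize(KernelGaussian, c("xi", "xj")), delta)
```

The Vectorize version still loops internally via mapply, so it is convenient rather than fast.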
If a for loop is required to generate the values, then your method is fine.
If the values are already in an array values you can try mat = matrix(values, nrow=n, ncol=p) or something similar.
I am having trouble optimising a piece of R code. The following example code should illustrate my optimisation problem:
Some initialisations and a function definition:
a <- c(10,20,30,40,50,60,70,80)
b <- c("a","b","c","d","z","g","h","r")
c <- c(1,2,3,4,5,6,7,8)
myframe <- data.frame(a,b,c)
values <- vector(length=columns)
solution <- matrix(nrow=nrow(myframe),ncol=columns+3)
myfunction <- function(frame,columns){
athing = 0
if(columns == 5){
athing = 100
}
else{
athing = 1000
}
values[columns+1] = athing
return(values)}
The problematic for-loop looks like this:
columns = 6
for(i in 1:nrow(myframe)){
values <- myfunction(as.matrix(myframe[i,]), columns)
values[columns+2] = i
values[columns+3] = myframe[i,3]
#more columns added with simple operations (i.e. sum)
solution <- rbind(solution,values)
#solution is a large matrix from outside the for-loop
}
The problem seems to be the rbind function. I frequently get error messages regarding the size of solution, which seems to be too large after a while (more than 50 MB).
I want to replace this loop and the rbind with a list and lapply and/or foreach. I have started with converting myframe to a list.
myframe_list <- lapply(seq_len(nrow(myframe)), function(i) myframe[i,])
I have not really come further than this, although I tried applying this very good introduction to parallel processing.
How do I have to reconstruct the for-loop without having to change myfunction? Obviously I am open to different solutions...
Edit: This problem seems to be straight from the 2nd circle of hell from the R Inferno. Any suggestions?
The reason that using rbind in a loop like this is bad practice is that in each iteration you enlarge your solution matrix and then copy it to a new object, which is a very slow process and can also lead to memory problems. One way around this is to create a list whose ith component stores the output of the ith loop iteration. The final step is to call rbind on that list (just once, at the end). This will look something like
my.list <- vector("list", nrow(myframe))
for(i in 1:nrow(myframe)){
# Call all necessary commands to create values
my.list[[i]] <- values
}
solution <- rbind(solution, do.call(rbind, my.list))
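A self-contained sketch of the same idea with lapply in place of the explicit loop follows. The helper make_row is hypothetical: it stands in for the call to myfunction plus the extra column assignments from the question, and the data frame is shortened to three rows.

```r
myframe <- data.frame(a = c(10, 20, 30), b = c("a", "b", "c"), c = c(1, 2, 3))
columns <- 6

# Hypothetical per-row computation standing in for myfunction and the extra columns
make_row <- function(i) {
  values <- numeric(columns + 3)
  values[columns + 1] <- if (columns == 5) 100 else 1000
  values[columns + 2] <- i
  values[columns + 3] <- myframe[i, 3]
  values
}

# Collect one vector per row in a list, then bind them all at once
row_list <- lapply(seq_len(nrow(myframe)), make_row)
solution <- do.call(rbind, row_list)
```

Because rbind is called once on the whole list, the result matrix is allocated a single time instead of being copied on every iteration.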
A bit too long for a comment, so I put it here:
If columns is known in advance:
myfunction <- function(frame){
athing = 0
if(columns == 5){
athing = 100
}
else{
athing = 1000
}
values[columns+1] = athing
return(values)}
apply(myframe, 1, myfunction)
If columns is not given via environment, you can use:
apply(myframe, 1, myfunction, columns) with your original myfunction definition. (MARGIN = 1 applies the function to each row, matching the original loop.)
I have a 2D contingency table in R; it is a table object. I want to transform it into a new table by applying a function on each of its elements.
I looked at sapply, tapply, etc., but they are all aimed at summarising/aggregating the data. I've written my own mapping function which does this, which I reproduce below:
map.table = function(t,fn)
{
rows = dim(t)[1]
columns = dim(t)[2]
x = matrix(nrow=rows, ncol=columns)
rownames(x) = unlist(dimnames(t)[1], use.names=FALSE)
colnames(x) = unlist(dimnames(t)[2], use.names=FALSE)
for(row in seq(from=1, to=rows))
{
for(column in seq(from=1, to=columns))
{
x[row,column] = fn(t[row,column])
}
}
as.table(x)
}
This creates a matrix from scratch, fills up the dimension names, and the elements. Is there a better way of doing this? Is there an R function/package which already does this functionality?
You can probably just use apply:
set.seed(21)
x <- data.frame(a=sample(letters[1:5],20,TRUE),
b=sample(letters[1:5],20,TRUE))
y <- table(x)
z <- as.table(apply(y, 1:2, sqrt))
Why are you going to all this trouble? You should be able to do this:
fn(t)
(But also consider using a different name for your table, since t is a perfectly good base function name. At first I thought you were transposing your rows and columns.)
If your function does not work with vectors there is the possibility you can persuade it to do so. Try this:
vfn <- Vectorize(fn)
t2 <- t  # copy the table first so that t2 exists and keeps its dimnames and class
t2[] <- vfn(t)
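Here is a small runnable sketch of that approach. The table y and the scalar-only function fn are made up for illustration; fn uses `if`, which needs a length-one condition and so cannot be applied to the whole table directly.

```r
# Toy 3 x 2 contingency table
y <- table(c("a", "a", "b", "c"), c("x", "y", "x", "x"))

# Hypothetical scalar-only function
fn <- function(n) if (n > 0) sqrt(n) else NA_real_

vfn <- Vectorize(fn)

# Assigning into y2[] replaces the values but keeps dim, dimnames and class
y2 <- y
y2[] <- vfn(y)
```

The empty-bracket assignment is what preserves the table structure; a plain `y2 <- vfn(y)` would return a bare vector.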