Hi, sorry, I'm still learning and slow at understanding function arguments.
Just wondering could anyone explain what a certain part of a function means:
x = sum(abs(apply(embed(y, 4), 1, prod)))
It does give the following on paper:
# sum(|y_{j}| * |y_{j-1}| * |y_{j-2}| * |y_{j-3}|)
What does the 1 do? I think embed(y, 4) means y together with its first 3 lags, and I know prod is the product.
This function was written for me, and I am trying to modify it to compute:
# sum(|y_{j}|^(3/2) * |y_{j-1}|^(3/2) * |y_{j-2}|^(3/2) * |y_{j-3}|^(3/2))
So, to raise the y's to the power 3/2, should my modified function be:
x = sum(abs(apply(embed((y^3/2), 4), 1, prod)))
or to:
x = sum(abs(apply(embed(y, 4), 3/2, prod)))
or another?
Any help?
Thank you in advance for your input
The 1 is the MARGIN argument of the apply function. This is why I advocate specifying argument names. apply will "loop" over the rows (MARGIN = 1) or columns (MARGIN = 2) of a matrix or array (a data frame is coerced to a matrix first). The object to loop over is passed as X or, if you prefer your calls bare, as the first argument. If you want to raise y to a power, do it before the call to embed, as in the first of your two lines, but write y^(3/2) with parentheses: y^3/2 is parsed as (y^3)/2.
In other words, this command sums the elements of each row:
apply(X = my.object, MARGIN = 1, FUN = sum)
or of each column:
apply(X = my.object, MARGIN = 2, FUN = sum)
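To make the pieces concrete, here is a small sketch with a made-up vector y (the values are only for illustration). It shows what embed() returns, what MARGIN = 1 does, and one way the 3/2 version could be written. It assumes the absolute values are meant to be taken before raising to the power, since a negative number raised to 3/2 is NaN in R.

```r
y <- c(1, -2, 3, -4, 5)                  # made-up data, for illustration only

lagged <- embed(y, 4)                    # each row holds y[j], y[j-1], y[j-2], y[j-3]
row_products <- apply(lagged, 1, prod)   # MARGIN = 1: one product per row
x <- sum(abs(row_products))              # the original statistic

# 3/2 version (assumption: abs() first, exponent in parentheses)
x32 <- sum(apply(embed(abs(y)^(3/2), 4), 1, prod))
```

Because every factor is non-negative after abs(), no outer abs() is needed in the second version.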
I have two lists and I want to subtract one from the other element-wise, in order to replicate the Matlab call bsxfun(@minus, lt, lt2). The two lists look something like the ones below (edit: now works without the pracma package):
# Code
# First list
lt = c(list())
# I use these lines to pre-dim the list...
lt[[1]] = c(rep(list(1)))
lt[[2]] = c(rep(list(1)))
# ... such that I can add matrices in this way:
lt[[1]][[1]] = matrix(c(3),nrow=1, ncol=1,byrow=TRUE)
lt[[2]][[1]] = matrix(c(1),nrow=1, ncol=1, byrow=TRUE)
# Same with the second list:
lt2 = c(list())
lt2[[1]] = c(rep(list(1)))
lt2[[2]] = c(rep(list(1)))
lt2[[1]][[1]] = matrix(c(2,2,2),nrow=3, ncol=1,byrow=TRUE)
lt2[[2]][[1]] = matrix(c(1,1,1),nrow=3, ncol=1,byrow=TRUE)
Element-wise subtraction would mean that the respective element of lt is subtracted from each row of an element of lt2, i.e., 3 is subtracted from each row of lt2[[1]][[1]], resulting in t(c(-1, -1, -1)), and 1 is subtracted from each row of lt2[[2]][[1]], resulting in t(c(0, 0, 0)). It is important to me that the list structure is maintained in the results.
I tried using lapply(lt2, "-", lt) but it does not work. Any suggestions?
I suspect you are looking for something like this skeleton code which subtracts 2 lists element-wise...
x <- list(1,2,3)
y <- list(4,5,6)
mapply('-', y, x, SIMPLIFY = FALSE)
but, as noted, you need two lists of the same length (or at least lengths for which R's recycling rules make sense), as for example...
z <- list(4,5,6,7,8,9)
mapply('-',z,x,SIMPLIFY = FALSE)
You might be looking for something like this where you subtract a constant from each member of the list...
mapply('-',y,2, SIMPLIFY= FALSE)
I figured it out. I had another mistake in the question :/
Converting the second argument with as.numeric worked:
lt3 = lapply(lt2[[1]],"-",as.numeric(lt[[1]]))
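For anyone who needs the whole nested structure back rather than just lt2[[1]], a sketch with two nested Map() calls is below. It rebuilds the lists from the question in a shorter form and assumes every element of lt is a 1 x 1 matrix that should be subtracted from every entry of the matching matrix in lt2.

```r
# Same data as in the question, built directly as nested lists
lt  <- list(list(matrix(3, nrow = 1)), list(matrix(1, nrow = 1)))
lt2 <- list(list(matrix(c(2, 2, 2), nrow = 3)),
            list(matrix(c(1, 1, 1), nrow = 3)))

# Outer Map pairs lt2[[i]] with lt[[i]]; inner Map pairs the matrices inside.
# as.numeric() turns the 1 x 1 matrix into a plain scalar before subtracting.
res <- Map(function(b, a) Map(function(m, s) m - as.numeric(s), b, a), lt2, lt)
```

res has the same list-of-lists shape as lt2, and each matrix keeps its 3 x 1 dimensions.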
Imagine I have an array in R with N dimensions (a matrix would be an array with 2 dimensions) and I want to select the rows from 1 to n of my array. I was wondering if there was a syntax to do this in R without knowing the number of dimensions.
Indeed, I can do
x = matrix(0, nrow = 10, ncol = 2)
x[1:5, ] # to take the 5 first rows of a matrix
x = array(0, dim = c(10, 2, 3))
x[1:5, , ] # to take the 5 first rows of a 3D array
So far I haven't found a way to use this kind of writing to extract rows of an array without knowing its number of dimensions (obviously if I knew the number of dimensions I would just have to put as many commas as needed). The following snippet works but does not seem to be the most native way to do it:
x = array(0, dim = c(10, 2, 3, 4))
apply(x, 2:length(dim(x)), function(y) y[1:5])
Is there a more R way to achieve this?
Your apply solution is the best, actually.
apply(x, 2:length(dim(x)), `[`, 1:5)
or even better, as @RuiBarradas pointed out (please vote for his comment too!):
apply(x, -1, `[`, 1:5)
Coming from Lisp, I can say that R is very lispy.
The apply solution is a very lispy solution,
and therefore a very R-ish one (it follows the functional programming paradigm).
Function slice.index() is easily overlooked (as I know to my cost! see magic::arow()) but can be useful in this case:
x <- array(runif(60), dim = c(10, 2, 3))
array(x[slice.index(x,1) %in% 1:5],c(5,dim(x)[-1]))
HTH, Robin
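Another option, sketched here under the assumption that only the first dimension is subset, is to build the index list programmatically and hand it to `[` with do.call(). take_rows is a hypothetical helper name, not a base R function.

```r
take_rows <- function(a, rows) {
  # One index per dimension: the requested rows, then TRUE ("keep everything")
  # for each remaining dimension
  idx <- c(list(rows), rep(list(TRUE), length(dim(a)) - 1))
  do.call(`[`, c(list(a), idx, list(drop = FALSE)))
}

x <- array(seq_len(240), dim = c(10, 2, 3, 4))
first5 <- take_rows(x, 1:5)
dim(first5)
```

Unlike the apply() version, this does not call a function once per slice, and drop = FALSE keeps all dimensions even when a single row is requested.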
I want to build iris$Sepal.Length from a string, so I can use it in a function to get the Sepal.Length column from the iris data frame. But when I use the paste function, paste("iris$", colnames(iris[3])), the result is a character string (with quotes), like "iris$Sepal.Length". I need the result not to be a character string. I have tried noquote(), as.data.frame() etc. but it doesn't work.
freq <- function(y) {
for (i in iris) {
count <-1
y <- paste0("iris$",colnames(iris[count]))
data.frame(as.list(y))
print(y)
span = seq(min(y),max(y), by = 1)
freq = cut(y, breaks = span, right = FALSE)
table(freq)
count = count +1
}
}
freq(1)
The crux of your problem isn't making that object not be a string, it's convincing R to do what you want with the string. You can do this with, e.g., eval(parse(text = foo)). Isolating out a small working example:
y <- "iris$Sepal.Length"
data.frame(as.list(y)) # does not display iris$Sepal.Length
data.frame(as.list(eval(parse(text = y)))) # DOES display iris$Sepal.Length
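If the goal is only to pick a column by its name, the string does not need to be turned into code at all: double-bracket indexing accepts the name directly. A minimal sketch (col is a hypothetical variable name):

```r
col <- "Sepal.Length"      # column name held as a string
values <- iris[[col]]      # same vector as iris$Sepal.Length

# Loop over column names without building "iris$..." strings
for (col in names(iris)[1:4]) {
  print(range(iris[[col]]))
}
```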
That said, I wanted to point out some issues with your function:
The input variable appears to not do anything (because it is immediately overwritten), which may not have been intended.
The for loop seems broken, since it resets count to 1 on each pass, which I think you didn't mean. Relatedly, it iterates over all i in iris, but then it doesn't use i in any meaningful way other than to keep a count. Instead, you could do something like for (count in 1:length(iris)), which would establish the count variable and iterate it for you as well.
It's generally better to avoid for loops in R entirely; there's a whole family of functions available for applying a function to (e.g.) every column of a data frame. As a very simple version of this, something like apply(iris, 2, table) will apply the table function along margin 2 (the columns) of iris and, in this case, place the results in a list. The idea would be to build your function to do what you want to a single vector, then pass each vector through the function with something from the apply() family. For instance:
cleantable <- function(x) {
  myspan = seq(min(x), max(x)) # if unspecified, by = 1
  myfreq = cut(x, breaks = myspan, right = FALSE)
  table(myfreq)
}
apply(iris[1:4], 2, cleantable) # can only use first 4 columns since 5th isn't numeric
would do what I think you were trying to do on the first 4 columns of iris. This way of programming will be generally more readable and less prone to mistakes.
I would like to convert a for loop into a faster operation such as apply.
Here is my code
for (a in 1:dim(k)[1]) {
  for (b in 1:dim(k)[2]) {
    if ((k[a,b,1,1] == 0) & (k[a,b,1,2] == 0) & (k[a,b,1,3] == 0)) {
      k[a,b,1,1] <- 1
      k[a,b,1,2] <- 1
      k[a,b,1,3] <- 1
    }
  }
}
It's simple code that checks each position (a, b) of the multidimensional array k and, if the three elements k[a,b,1,1], k[a,b,1,2] and k[a,b,1,3] are all equal to 0, sets them to 1.
Is there a way to make it faster? The array k has 1,444,000 elements and the loop takes too long to run. Can anyone help?
Thanks
With apply you can return all your 3-combinations as a numeric vector and then check for your specific condition:
# This creates an array with the same properties as yours
# (named arr so that it does not mask the base function array())
arr <- array(data = sample(c(0, 1), 81, replace = TRUE,
                           prob = c(0.9, 0.1)), dim = c(3, 3, 3, 3))
# This loops over all vectors in the fourth dimension and returns a
# vector of ones if your condition is met
apply(arr, MARGIN = c(1, 2, 3), FUN = function(x) {
  if (sum(x) == 0 && length(unique(x)) == 1)
    return(c(1, 1, 1))
  else
    return(x)
})
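Note that the MARGIN argument specifies the dimensions over which to loop. You want the vectors along the fourth dimension, so you specify c(1, 2, 3).
apply puts the values returned by FUN in the first dimension of its result, so the output has the fourth dimension of the input moved to the front. If the output is stored as result, permute it back with aperm(result, c(2, 3, 4, 1)) before assigning it to the old array; all vectors where the condition is met are then replaced with ones.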
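For the specific check in the question (only index 1 of the third dimension), a loop-free sketch using logical indexing is below. The array here is a small random stand-in for k, and k_orig is kept only so the result can be compared with the original double loop.

```r
set.seed(1)
k <- array(sample(c(0, 1), 4 * 5 * 2 * 3, replace = TRUE, prob = c(0.9, 0.1)),
           dim = c(4, 5, 2, 3))    # small stand-in for the real k
k_orig <- k                        # untouched copy, for comparison only

# Logical matrix: TRUE wherever all three values k[a, b, 1, 1:3] are zero
all_zero <- k[, , 1, 1] == 0 & k[, , 1, 2] == 0 & k[, , 1, 3] == 0

# Set those positions to 1 in each of the three slices
for (ch in 1:3) {
  slice <- k[, , 1, ch]
  slice[all_zero] <- 1
  k[, , 1, ch] <- slice
}
```

The remaining for loop runs only three times regardless of the size of k; the comparisons and assignments inside it are vectorised over all (a, b) positions.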
You should first use the filter function twice (composed), and then the apply (lapply?) function on the filtered array. Maybe you can also reduce the array, because it looks like you're not very interested in the third dimension (always accessing the 1st item). You should probably do some reading about functional programming in R here http://adv-r.had.co.nz/Functionals.html
Note that I'm not an R programmer, but I'm quite familiar with functional programming (Haskell etc.), so this might give you an idea. It might be faster, but that depends a bit on how R is designed (lazy or eager evaluation etc.).
Say I have a function called
myfun <- function(x,y) {median(x,y)} # obviously the actual function is
# something more complicated
Now let's say that in a certain use the y parameter is constant (say c(1,2,3,4,5)). Is there any way I can pass this into apply without wrapping it in another function?
i.e.
instead of
apply(mydf, 2, function(x) myfun(x, c(1,2,3,4,5)))
to pass something like
apply(mydf, 2, myfun(,(c(1,2,3,4,5))))
This is purely cosmetic and I know it won't make much difference to the running time. I just want to know if an option like this is possible, because wrapping my function in a function each time seems inefficient.
I think this should work:
apply(mydf, 2, myfun, y = c(1,2,3,4,5))
Remains untested as I can't access R right now.
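This works because apply() forwards any extra arguments given after FUN to the function. A quick check on a toy data frame, with a hypothetical stand-in for myfun (the real function in the question is not shown):

```r
mydf <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))   # toy data
myfun <- function(x, y) sum(x * y)                   # hypothetical stand-in

# Extra named argument is passed through apply()'s "..." to myfun
via_dots <- apply(mydf, 2, myfun, y = c(1, 2, 3))

# Equivalent call with an explicit wrapper function
via_wrapper <- apply(mydf, 2, function(x) myfun(x, c(1, 2, 3)))
```

Naming the argument (y = ...) makes sure the column itself is matched to x.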