I want to use the function "nestedness(M)" from the "bipartite" R package. It calculates an index from a matrix (M). I have an array with 1000 matrices and I want to apply this function 1000 times, with a different input matrix each time. I have tried the apply family of functions but could not get them to work. I don't know how to vary the input of a function when it is not a number but a matrix. Any help to put me on the right track would be much appreciated.
Let's say you have an array that is 3x3x3, i.e. 3 matrices that are each 3
rows and 3 columns. The dimensions of an array are c("row", "column",
"slice"). You can use apply over any of these dimensions. In your
case over the 3rd dimension will calculate your function over each
matrix. Here is the example array:
a <- array(1:27, dim = c(3,3,3))
Now calculate the max function for each slice (dimension 3) of the array
apply(a, 3, max)
[1] 9 18 27
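The same pattern works with any function that takes a matrix. Below is a minimal sketch; my_index is a hypothetical stand-in for nestedness() (so the example runs without the bipartite package), and it is assumed that the matrices are stacked along the third dimension of the array.

```r
a <- array(1:27, dim = c(3, 3, 3))

# Hypothetical stand-in for nestedness(): any function that takes a matrix
my_index <- function(M) sum(diag(M)) / sum(M)

# apply() passes each slice a[, , k] to the function as a matrix
res <- apply(a, 3, my_index)

# Equivalent explicit version: index the third dimension yourself
res2 <- sapply(seq_len(dim(a)[3]), function(k) my_index(a[, , k]))

all.equal(res, res2)
```

If the function returns something other than a single number per matrix, using lapply() over the slice indices instead keeps each result as a separate list element.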
I created one dimensional array of random numbers between 0 and 1 using
myData <- runif(1000, 0.0, 1.0);
How can I create an n-dimensional array of 1000 nodes? For example, 10-dimensional random points.
The function array() will create an array of arbitrary number of dimensions. The dim argument specifies the dimensions. So, to create an array of 1000 points in a 10x10x10 array, we use
a <- array(runif(1000), dim=c(10,10,10))
This can be extrapolated to any number of dimensions you wish.
For example
a <- array(runif(1000), dim=c(2,2,2,5,5,5))
creates a 6 dimensional array with 1000 (=2*2*2*5*5*5) points. There is no way to decompose 1000 points into a 10 dimensional array, unless some of the dimensions have length 1.
To access values of the array, you can use standard subsetting with [ ], and specify the correct number of dimensions within the brackets. E.g.
a[1,2,1,5,3,2]
# [1] 0.3232738
It is worth noting that a matrix in R is simply a special case of an array with two dimensions.
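A short sketch of that last point, plus one possible reading of the question. It is an assumption here that "10 dimensional random points" may mean 1000 points with 10 coordinates each; in that case a 1000 x 10 matrix (one row per point) would do, rather than an array with 10 dimensions.

```r
# A two-dimensional array is a matrix
m <- array(runif(6), dim = c(2, 3))
is.matrix(m)

# Assumed interpretation: 1000 points, each with 10 coordinates in [0, 1]
pts <- matrix(runif(1000 * 10), nrow = 1000, ncol = 10)
dim(pts)
pts[1, ]   # the 10 coordinates of the first point
```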
I'm trying to fill a 10 x 1500 matrix with a loop.
I have to fill that matrix with 150 small 10 x 10 matrices. I have tried to implement this with a double loop, but without success. My problem is that each 10 x 10 matrix is the result of a scalar product.
At the beginning it seemed easy, but then I realized I couldn't work out how to index the 10 x 1500 matrix so that it holds the 150 small 10 x 10 matrices.
Here is what I did:
es_var is a 1 x 150 matrix, which I converted to a vector to simplify the scalar product (at least in my opinion).
diax is a 10 x 10 matrix.
I want to multiply the whole 10 x 10 diax matrix by each value of the es_var vector.
I am having trouble because I can't get R to fill one 10 x 10 block at a time. Thus in the end I get a 10 x 1500 matrix, but it is the same 10 x 10 matrix repeated 150 times.
Here is my code
es_var1 = as.vector(es_var)
v = matrix(0, 10, 10*N)
for (i in 1:N){
v[,] = es_var1[i] * diax
}
Can somebody help me figure this out, please? I spent the whole day trying. And I need to do it without using built-in functions, since this is a small part of a big math demonstration I have to implement.
If I understand your requirement correctly, you can accomplish this with the following line:
v <- matrix(diax,10,1500)*rep(es_var1,each=100);
This constructs a 10x1500 matrix with the 10x10 diax matrix as the initial values, cycled sufficiently to cover the complete 10x1500 size. Then, to apply the es_var1 multiplication, you can replicate each of its elements 100 times, such that they will naturally align with each consecutive 10x10 small matrix during vectorized multiplication.
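For comparison, here is a sketch of the loop from the question with the block indexing filled in, checked against the one-liner. The values of es_var1 and diax are random stand-ins (the real ones are not shown in the question), and N is assumed to be 150.

```r
N <- 150
set.seed(42)
es_var1 <- runif(N)                      # stand-in for as.vector(es_var)
diax    <- matrix(rnorm(100), 10, 10)    # stand-in 10 x 10 matrix

# Loop version: block i occupies columns (i-1)*10 + 1 through i*10
v <- matrix(0, 10, 10 * N)
for (i in 1:N) {
  cols <- ((i - 1) * 10 + 1):(i * 10)
  v[, cols] <- es_var1[i] * diax
}

# Vectorized version from the answer
v2 <- matrix(diax, 10, 10 * N) * rep(es_var1, each = 100)

all.equal(v, v2)
```

The original loop assigned to v[,] on every pass, which overwrites the whole matrix each time; restricting the assignment to the ten columns of block i is the only change.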
I'm a novice R user, who's learning to use this coding language to deal with data problems in research. I am trying to understand how knowledge evolves within an industry by looking at patenting in subclasses. So far I managed to get the following:
# kn.matrices<-with(patents, table(Class,year,firm))
# kn.ind <- with(patents, table(Class, year))
patents is my datafile, with Subclass, app.yr, and short.name as three of the 14 columns
# for (k in 1:37)
# kn.firms = assign(paste("firm", k ,sep=''),kn.matrices[,,k])
There are 37 different firms (in the real dataset, here only 5)
This has given 37 firm-specific and 1 industry-specific 2635 by 29 matrices (in the real dataset). All firm-specific matrices are called firmk with k going from 1 until 37.
I would like to perform many operations in each of the firm-specific matrices (e.g. compare the numbers in app.yr 't' with the average of the 3 previous years across all rows) so I am looking for a way that allows me to loop the operations for every matrix named firm1,firm2,firm3...,firm37 and that generates new matrices with consistent naming, e.g. firm1.3yearcomparison
Hopefully I framed this question in an appropriate way. Any help would be greatly appreciated.
Following comments I'm trying to add a minimal reproducible example
year<-c(1990,1991,1989,1992,1993,1991,1990,1990,1989,1993,1991,1992,1991,1991,1991,1990,1989,1991,1992,1992,1991,1993)
firm<-(c("a","a","a","b","b","c","d","d","e","a","b","c","c","e","a","b","b","e","e","e","d","e"))
class<-c(1900,2000,3000,7710,18000,19000,36000,115000,212000,215000,253600,383000,471000,594000)
These three vectors thus represent columns in a spreadsheet that forms the "patents" matrix mentioned before.
It looks like you already have a 3-dimensional array with all your data. You can basically view this as your firm matrices all piled one on top of the other. You don't need to split it into separate matrices and loop over them. Instead, you can use R's apply() function and array subsetting. See the help topic for the apply() family (?apply) for how to do what you want. Here are a few basic examples:
# returns the sums of all columns for all matrices
apply(kn.matrices, 3, colSums)
# extract the 5th row of all matrices
kn.matrices[5, , ]
# extract the 5th column of all matrices
kn.matrices[, 5, ]
# extract the 5th matrix
kn.matrices[, , 5]
# mean of 5th column for all matrices
colMeans(kn.matrices[, 5, ])
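As a sketch of how the 3-year comparison from the question might be done without creating firm1, firm2, ... as separate objects: the toy array kn, the helper three_year_cmp and the naming scheme below are all hypothetical, and the comparison is assumed to mean "value in year t minus the mean of the three preceding years".

```r
# Hypothetical toy data: 4 classes x 6 years x 3 firms
set.seed(1)
kn <- array(rpois(4 * 6 * 3, lambda = 2), dim = c(4, 6, 3),
            dimnames = list(Class = paste0("c", 1:4),
                            year  = 1989:1994,
                            firm  = c("a", "b", "c")))

# For one Class-by-year matrix: each year from the 4th onwards
# minus the row-wise mean of the three preceding years
three_year_cmp <- function(m) {
  yrs  <- 4:ncol(m)
  prev <- sapply(yrs, function(j) rowMeans(m[, (j - 3):(j - 1), drop = FALSE]))
  m[, yrs, drop = FALSE] - prev
}

# One result matrix per firm, kept together in a named list
cmp <- lapply(seq_len(dim(kn)[3]), function(k) three_year_cmp(kn[, , k]))
names(cmp) <- paste0(dimnames(kn)$firm, ".3yearcomparison")

names(cmp)
```

Keeping the results in a named list means each one can still be reached by name, e.g. cmp[["a.3yearcomparison"]], without using assign().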
I have created two vectors in R, using statistical distributions to build the vectors.
The first is a vector of locations on a string of length 1000. That vector has around 10 values and is called mu.
The second vector is a list of numbers, each one representing the number of features at each location mentioned above. This vector is called N.
What I need to do is generate a random distribution for all features (N) at each location (mu)
After some fiddling around, I found that this code works correctly:
for (i in 1:length(mu)){
a <- rnorm(N[i],mu[i],20)
feature.location <- c(feature.location,a)
}
This produces the right output - a list of numbers of length sum(N), and each number is a location figure which correlates with the data in mu.
I found that this only worked when I used concatenate to get the values into a vector.
My question is; why does this code work? How does R know to loop sum(N) times but for each position in mu? What role does concatenate play here?
Thanks in advance.
To try and answer your question directly, c(...) is not "concatenate", it's "combine". That is, it combines its arguments into a vector. So c(1,2,3) is a vector with 3 elements.
Also, rnorm(n,mu,sigma) is a function that returns a vector of n random numbers sampled from the normal distribution. So at each iteration, i,
a <- rnorm(N[i],mu[i],20)
creates a vector a containing N[i] random numbers sampled from Normal(mu[i],20). Then
feature.location <- c(feature.location,a)
appends the elements of that vector to the vector built up in the previous iterations. So at the end, you have a vector with sum(N) elements.
I guess you're sampling from a series of locations, each a variable number of times.
I'm guessing your data looks something like this:
set.seed(1) # make reproducible
N <- ceiling(10*runif(10))
mu <- sample(seq(1000), 10)
> N;mu
[1] 3 4 6 10 3 9 10 7 7 1
[1] 206 177 686 383 767 496 714 985 377 771
Now you want to take a sample from rnorm of length N[i], with mean mu[i] and sd=20, and store all the results in a vector.
The method you're using (growing the vector) is not recommended, as the vector is re-copied in memory each time elements are added. (See Circle 2, "Growing Objects", of The R Inferno, although for small examples like this, it's not so important.)
First, initialize the storage vector:
f.l <- NULL
for (i in 1:length(mu)){
a <- rnorm(n=N[i], mean=mu[i], sd=20)
f.l <- c(f.l, a)
}
Then, each time, a stores your sample of length N[i] and c() combines it with the existing f.l by adding it to the end.
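If growing the vector is a concern, one alternative is to allocate the full length up front and fill it block by block. This is only a sketch: the values of N and mu are hypothetical, and every N[i] is assumed to be at least 1.

```r
set.seed(1)
N  <- c(3, 4, 6)          # hypothetical number of features per location
mu <- c(200, 500, 800)    # hypothetical locations

f.l    <- numeric(sum(N))   # allocate the full length once
ends   <- cumsum(N)         # last index of each block
starts <- ends - N + 1      # first index of each block

for (i in seq_along(mu)) {
  f.l[starts[i]:ends[i]] <- rnorm(n = N[i], mean = mu[i], sd = 20)
}

length(f.l)
```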
A more efficient approach is
unlist(mapply(rnorm, N, mu, MoreArgs=list(sd=20)))
which replaces the explicit loop. unlist() is needed because mapply() returns a list of vectors of varying lengths.
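Since rnorm() also accepts a vector of means, another option is a single call in which each mu[i] is repeated N[i] times. A minimal sketch, with hypothetical values for N and mu:

```r
set.seed(1)
N  <- c(3, 4, 6)          # hypothetical number of features per location
mu <- c(200, 500, 800)    # hypothetical locations

# rep(mu, times = N) gives one mean per draw: mu[1] three times, mu[2] four times, ...
f.v <- rnorm(sum(N), mean = rep(mu, times = N), sd = 20)

length(f.v)

# Per-location averages, to check that the draws cluster around each mu[i]
tapply(f.v, rep(seq_along(mu), times = N), mean)
```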
I am trying to get a matrix that contains the distances between the points in two lists.
Each list of points contains the latitude and longitude, and the distance can be calculated between any two points using the function distCosine in the geosphere package.
> Points_a
lon lat
1 -77.69271 45.52428
2 -79.60968 43.82496
3 -79.30113 43.72304
> Points_b
lon lat
1 -77.67886 45.48214
2 -77.67886 45.48214
3 -77.67886 45.48214
4 -79.60874 43.82486
I would like to get a matrix out that would look like:
d_11 d_12 d_13
d_21 d_22 d_23
d_31 d_32 d_33
d_41 d_42 d_43
I am struggling to think of a way to generate the matrix without just looping over Points_a and Points_b and calculating each combination, can anyone suggest a more elegant solution?
You can use this:
outer(seq(nrow(Points_a)),
seq(nrow(Points_b)),
Vectorize(function(i, j) distCosine(Points_a[i,], Points_b[j,]))
)
(based on a tip by @CarlWitthoft)
Judging by the desired output you posted, you may want the transpose t() of this, or simply swap _a and _b above.
EDIT: some explanation:
seq(nrow(Points_x)): creates a sequence from 1 to the number of rows of Points_x;
distCosine(Points_a[i,], Points_b[j,]): expression to compute the distance between points given by row i of Points_a and row j of Points_b;
function(i, j): makes the above an unnamed function in two parameters;
Vectorize(...): ensures that, given inputs i and j of length greater than one, the unnamed function above is called once for each pair of corresponding elements of the two vectors (see ?Vectorize for more info);
outer(x, y, f): creates "expanded" vectors x and y such that all combinations of their elements are present, and calls f once with these expanded vectors as input (see ?outer). The result is then reassembled into a matrix.
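Here is a self-contained sketch of the same outer()/Vectorize() pattern that runs without geosphere. The function hav is a hypothetical stand-in for distCosine (a haversine distance on a sphere, with an assumed radius of 6378137 m), and the rows and columns are arranged to match the 4 x 3 layout asked for in the question.

```r
Points_a <- data.frame(lon = c(-77.69271, -79.60968, -79.30113),
                       lat = c(45.52428, 43.82496, 43.72304))
Points_b <- data.frame(lon = c(-77.67886, -77.67886, -77.67886, -79.60874),
                       lat = c(45.48214, 45.48214, 45.48214, 43.82486))

# Hypothetical stand-in for distCosine: haversine distance in metres
hav <- function(p1, p2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (p2$lat - p1$lat) * to_rad
  dlon <- (p2$lon - p1$lon) * to_rad
  h <- sin(dlat / 2)^2 +
    cos(p1$lat * to_rad) * cos(p2$lat * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(h))
}

# Rows index Points_b, columns index Points_a, giving a 4 x 3 matrix
d <- outer(seq_len(nrow(Points_b)), seq_len(nrow(Points_a)),
           Vectorize(function(i, j) hav(Points_b[i, ], Points_a[j, ])))

dim(d)
```

If the geosphere package is loaded, its distm() function, called as distm(Points_b, Points_a, fun = distCosine), should build this kind of matrix directly.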