I have a vector with 49 numeric values. I want to have a 7x7 numeric matrix instead.
Is there some sort of convenient automatic conversion statement I can use, or do I have to do 7 separate column assignments of the correct vector subsets to a new matrix? I hope that there is something like the opposite of c(myMatrix), with the option of giving the number of rows and/or columns I want to have, of course.
Just use matrix:
matrix(vec,nrow = 7,ncol = 7)
One advantage of using matrix, rather than simply altering the dimension attribute as Gavin points out, is that you can specify whether the matrix is filled by row or by column using the byrow argument of matrix.
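As a small illustration of the byrow argument, here is a sketch using a hypothetical six-element vector (the names v, by_col and by_row are only for illustration):

```r
v <- 1:6

# Default: values fill the matrix column by column
by_col <- matrix(v, nrow = 2)

# byrow = TRUE: values fill the matrix row by row instead
by_row <- matrix(v, nrow = 2, byrow = TRUE)

by_col[1, ]  # first row when filled by column: 1 3 5
by_row[1, ]  # first row when filled by row:    1 2 3
```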
A matrix is really just a vector with a dim attribute (for the dimensions). So you can add dimensions to vec using the dim() function and vec will then be a matrix:
vec <- 1:49
dim(vec) <- c(7, 7) ## (rows, cols)
vec
> vec <- 1:49
> dim(vec) <- c(7, 7) ## (rows, cols)
> vec
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    8   15   22   29   36   43
[2,]    2    9   16   23   30   37   44
[3,]    3   10   17   24   31   38   45
[4,]    4   11   18   25   32   39   46
[5,]    5   12   19   26   33   40   47
[6,]    6   13   20   27   34   41   48
[7,]    7   14   21   28   35   42   49
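Setting dim and calling matrix should give the same object when the matrix is filled by column. A quick check, which also shows the way back from a matrix to a plain vector (the names v and m are illustrative):

```r
v <- 1:49

# Approach 1: attach a dim attribute to a copy of the vector
m <- v
dim(m) <- c(7, 7)

# Approach 2: build the matrix with matrix(); compare the two
same <- identical(m, matrix(v, nrow = 7, ncol = 7))

# Dropping the dimensions again recovers the original vector
back <- as.vector(m)
```

Here same is expected to be TRUE, and back holds the values of v in their original order.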
I have read a table from a text file into R with read.table, and I want to multiply the prices in one of its rows by some number. I have tried the following:
x=matrx(, nrow=184,ncol=1)
for (i in 1:184){x[i]=c[2,i+1]*z}
where c is a table of 3 columns and 185 rows and z is a number, say 10. This multiplication does not work. Why is that?
Also, should I read the tables in as matrices, or is it the same? If not, is there a way to convert them, or to read them in with some command other than read.table?
In R, a matrix is just a vector of numbers that has dimensions. The matrix function adds the dimensions to the vector. The values in the vector fill each column, in order, in the matrix. Consider:
> x = 5:16
> x
[1] 5 6 7 8 9 10 11 12 13 14 15 16
> xMatrix = matrix(x, nrow = 4)
> xMatrix
     [,1] [,2] [,3]
[1,]    5    9   13
[2,]    6   10   14
[3,]    7   11   15
[4,]    8   12   16
Now, values in the vector and the matrix can be accessed using indices:
> x[10:11]
[1] 14 15
> xMatrix[2:3,3]
[1] 14 15
Interestingly, the matrix is still a vector (it's just a vector with dimensions), and can be treated as such:
> xMatrix[10:11]
[1] 14 15
Multiplication in R is vectorized. If you multiply two vectors together, the first values of the two vectors are multiplied together, the second values are multiplied together, and so on. So:
> x*x
[1] 25 36 49 64 81 100 121 144 169 196 225 256
If one vector is shorter than the other, the shorter vector is "recycled":
> x * 1:2
[1] 5 12 7 16 9 20 11 24 13 28 15 32
If you use a single value (a vector of length = 1), then that value is recycled:
> x * 10
[1] 50 60 70 80 90 100 110 120 130 140 150 160
So, putting it all together, you can multiply each value in an entire matrix by a constant simply by using the code:
> xMatrix * 2
     [,1] [,2] [,3]
[1,]   10   18   26
[2,]   12   20   28
[3,]   14   22   30
[4,]   16   24   32
If you want to multiply only one row of a matrix by a value, and update the contents of the matrix, use:
xMatrix[2,] = xMatrix[2,] * 10
to update the second row. Result:
> xMatrix
     [,1] [,2] [,3]
[1,]    5    9   13
[2,]   60  100  140
[3,]    7   11   15
[4,]    8   12   16
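For the read.table case in the question, the object is a data frame rather than a matrix, but the same vectorized multiplication applies to its columns. A minimal sketch, assuming a hypothetical data frame named prices with made-up columns item and price standing in for the real table:

```r
# Hypothetical stand-in for the result of read.table()
prices <- data.frame(item = c("a", "b", "c"), price = c(1.5, 2, 4))
z <- 10

# Vectorized: multiplies every price at once, no for loop needed
prices$scaled <- prices$price * z

# If a matrix is really wanted, convert the numeric columns only;
# including the character column would turn everything into text
m <- as.matrix(prices[, c("price", "scaled")])
```

Keeping the data as a data frame is usually fine for this kind of arithmetic; the conversion is only needed for functions that require a matrix.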
I have a numeric vector of integers that I want to transform into "bins".
I want these bins to be used as sampling frames from which I can then sample again, uniformly.
So far I can do both using findInterval but I am looking for a way to do it with cut.
Let's consider a random vector of integers which will be split into equally sized intervals of length 10:
df = sample(1:100,10)
df
[1] 81 11 38 95 45 14 10 61 96 88
Using findInterval I get the bins and an approximate way of sampling:
breaks = seq(1,max(df+1),by=10)
b <- findInterval(df, breaks)
b
[1] 9 2 4 10 5 2 1 7 10 9
# If b is 1 or 10 (the first or last bin), use ifelse() to prevent leaking outside [1,100]
sam <- round(runif(10,ifelse(b==1,10*b-9,10*b-10),ifelse(b==10,10*b,10*b+10)))
sam
[1] 85 14 39 94 50 16 7 63 93 85
Using cut I get the intervals:
breaks = seq(1,max(df+1),by=10)
cut(df,breaks,right=TRUE)
 [1] (71,81] (1,11]  (31,41] <NA>    (41,51] (11,21] (1,11]  (51,61] <NA>    (81,91]
Levels: (1,11] (11,21] (21,31] (31,41] (41,51] (51,61] (61,71] (71,81] (81,91]
But I don't know how to use those values as intervals from which to sample.
If there is another approach, I would be interested to know!
Good Question! I will give you a completely different approach.
So basically you want to perform Latin Hypercube sampling, i.e. stratified uniform sampling over the interval [0,100] with bins of width 10.
For this, it would be easier to install the lhs package and use its randomLHS function to perform the stratified sampling.
First step: Generate one uniform draw from each of the 10 deciles (strata), as many times as you want. In this example, let's do it 5 times:
library(lhs)
X <- randomLHS(10, 5)
> X
            [,1]       [,2]       [,3]       [,4]       [,5]
 [1,] 0.92154144 0.22185959 0.49953326 0.66248165 0.79035832
 [2,] 0.47571700 0.05894016 0.55883326 0.34875162 0.98831829
 [3,] 0.57738486 0.64525528 0.04955733 0.50939147 0.46297294
 [4,] 0.17578838 0.83843074 0.27138703 0.87421301 0.16401042
 [5,] 0.03850768 0.40746004 0.69518073 0.23487653 0.55537945
 [6,] 0.83942905 0.52957416 0.84952231 0.14031915 0.84956654
 [7,] 0.22802502 0.79911728 0.76789194 0.09788194 0.08667802
 [8,] 0.61821268 0.93088726 0.30789950 0.95831993 0.36903120
 [9,] 0.70391230 0.11445154 0.97976851 0.42027836 0.61097786
[10,] 0.31385709 0.33557430 0.18389684 0.70124986 0.27601550
Second step: Although the output of X is stratified, the columns are still unsorted. Therefore, when we show the final stratified draws, we sort them.
Y <- apply(X,2, function(x) sort(round(x*100)))
> Y
      [,1] [,2] [,3] [,4] [,5]
 [1,]    4    6    5   10    9
 [2,]   18   11   18   14   16
 [3,]   23   22   27   23   28
 [4,]   31   34   31   35   37
 [5,]   48   41   50   42   46
 [6,]   58   53   56   51   56
 [7,]   62   65   70   66   61
 [8,]   70   80   77   70   79
 [9,]   84   84   85   87   85
[10,]   92   93   98   96   99
NB: I have rounded only to make the strata obvious; there is no need to call round if you are happy to have non-integer draws as output.
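Returning to the original question about cut: one possible way to sample within the bins it produces is to ask cut for integer bin codes (labels = FALSE) and use them to index the breaks. This is only a sketch, under the assumption that the breaks cover the whole range of the data (here seq(0, 100, by = 10), which differs from the breaks used in the question); the names x, bin and draws are illustrative.

```r
set.seed(42)
x <- sample(1:100, 10)

# Breaks chosen to cover every possible value, so no bin code is NA
breaks <- seq(0, 100, by = 10)

# labels = FALSE returns the bin index instead of a factor like "(70,80]"
bin <- cut(x, breaks, labels = FALSE, right = TRUE)

# One uniform draw per element, bounded by the edges of its own bin
draws <- runif(length(x), min = breaks[bin], max = breaks[bin + 1])
```

Because min and max are vectors, runif draws each value from the interval belonging to the corresponding element of x.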
I have extracted tables from a PDF file with the tabulizer package. After extracting them, I want to rbind the tables, which are returned as a list of tables with different lengths.
table1 <- extract_tables("\\AC002_2017.pdf")
final <- do.call(rbind, table1)
But it gives me the following error:
Error in (function (..., deparse.level = 1) :
number of columns of matrices must match (see arg 2)
How can I rbind it?
Format of data is as follows
[[1]]
     [,1] [,2] [,3] [,4]
[1,]   20   45   34   34
[2,]   23   34   67   43
[3,]   22   23   42   34
[4,]   45   44   56   54
[5,]   12   11   12   14
[6,]   34   33   45   32
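The error means the matrices in the list do not all have the same number of columns. One possible workaround, sketched here in base R, is to pad the narrower matrices with NA columns before binding. This assumes the missing columns are trailing ones, which may not hold for real PDF tables; the list tables and the helper pad are hypothetical stand-ins.

```r
# Hypothetical stand-in for the list returned by extract_tables()
tables <- list(matrix(1:8, ncol = 4), matrix(1:6, ncol = 3))

# Widest table determines the target number of columns
max_cols <- max(vapply(tables, ncol, integer(1)))

# Append NA columns on the right until a matrix has max_cols columns
pad <- function(m) {
  cbind(m, matrix(NA, nrow = nrow(m), ncol = max_cols - ncol(m)))
}

final <- do.call(rbind, lapply(tables, pad))
```

If the tables differ because columns were split or merged during extraction, padding will misalign the data, so it is worth inspecting each element of the list first.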
In R, I want to make a for loop in which I want to select the n lowest values, then the n lowest values excluding lowest value, then the n lowest values excluding the 2 lowest values etc.
Here's an example to clarify:
set.seed(1)
x <- round(rnorm(10,20,15))
n <- 4
I want to get:
7 8 11 15
8 11 15 23
11 15 23 25
15 23 25 27
23 25 27 29
25 27 29 31
27 29 31 44
I tried the following code, but then I do not get the last row (the one that includes the highest value). I could get it by adding another line of code to the for loop, but I was wondering whether this could be done more efficiently.
y <- matrix(data=NA, nrow=length(x)+1-n, ncol=n)
for (i in 1:(length(x)-n)) {y[i,] <- sort(x)[i:(i+n-1)]}
Thanks
set.seed(1)
x <- round(rnorm(10,20,15))
n <- 4
Get the pattern:
rbind(sort(x)[1:4], sort(x)[2:5], sort(x)[3:6], sort(x)[4:7], sort(x)[5:8], sort(x)[6:9], sort(x)[7:10])
Now generalize the pattern with sapply so that it works for any x and n:
matrix(c( sapply(1:(length(x)+1-n), function(i) sort(x)[i:(i+n-1)] )),nrow=length(x)+1-n, byrow=TRUE)
     [,1] [,2] [,3] [,4]
[1,]    7    8   11   15
[2,]    8   11   15   23
[3,]   11   15   23   25
[4,]   15   23   25   27
[5,]   23   25   27   29
[6,]   25   27   29   31
[7,]   27   29   31   44
A more concise version:
t(sapply(1:(length(x)+1-n), function(i) sort(x)[i:(i+n-1)] ))
     [,1] [,2] [,3] [,4]
[1,]    7    8   11   15
[2,]    8   11   15   23
[3,]   11   15   23   25
[4,]   15   23   25   27
[5,]   23   25   27   29
[6,]   25   27   29   31
[7,]   27   29   31   44
Note that sapply returns each result as a column, so a transpose fixes the inconvenience.
Note to Rob: the apply family (apply, mapply, sapply, tapply etc.) can often replace an explicit for loop. It is worth using wherever it keeps the code shorter and clearer.
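As an alternative to sapply, base R's embed function (from the stats package, loaded by default) builds the same sliding windows directly. This is a sketch rather than the answerer's method; here x simply holds the ten values that appear in the desired output, typed out by hand, and windows is an illustrative name.

```r
# The ten values from the example, in arbitrary order
x <- c(11, 23, 7, 44, 25, 8, 27, 31, 29, 15)
n <- 4

# embed() returns each window with the newest value first,
# so the columns are reversed to get ascending order within a row
windows <- embed(sort(x), n)[, n:1]

windows[1, ]  # 7 8 11 15
```

The result has length(x) - n + 1 rows, including the last one that the original loop missed.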
I have the following problem:
myvec <- c(1:3)
mymat <- as.matrix(cbind(a = 6:15, b = 16:25, c= 26:35))
mymat
       a  b  c
 [1,]  6 16 26
 [2,]  7 17 27
 [3,]  8 18 28
 [4,]  9 19 29
 [5,] 10 20 30
 [6,] 11 21 31
 [7,] 12 22 32
 [8,] 13 23 33
 [9,] 14 24 34
[10,] 15 25 35
I want to multiply mymat by myvec and construct a new vector such that
sum(6*1, 16*2, 26*3)
sum(7*1, 17*2, 27*3)
....................
sum(15*1, 25*2, 35*3)
Sorry, this is a simple question, but I do not know the answer...
Edit: typo corrected
The %*% operator in R does matrix multiplication:
> mymat %*% myvec
      [,1]
 [1,]  116
 [2,]  122
...
[10,]  170
An alternative, but longer, way is this one:
rowSums(t(apply(mymat, 1, function(x) myvec*x)), na.rm=TRUE)
This is the only way I found that can ignore NAs inside the matrix.
Matrices are vectors in column major order:
colSums( t(mymat) * myvec )
(Edited after hopefully reading question correctly this time.)
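The approaches above should all agree on the example data. The sketch below compares them and adds a sweep-based variant (not from the answers) that also tolerates NA values; the names r1, r2, r3, mymat_na and r_na are illustrative.

```r
myvec <- 1:3
mymat <- cbind(a = 6:15, b = 16:25, c = 26:35)

r1 <- drop(mymat %*% myvec)        # matrix product, dims dropped to a vector
r2 <- colSums(t(mymat) * myvec)    # relies on column-major recycling
r3 <- rowSums(sweep(mymat, 2, myvec, `*`))  # scale each column, then sum rows

# With a missing value, %*% would return NA for that row;
# rowSums(..., na.rm = TRUE) skips the missing term instead
mymat_na <- mymat
mymat_na[1, "b"] <- NA
r_na <- rowSums(sweep(mymat_na, 2, myvec, `*`), na.rm = TRUE)

r1[1]    # 116
r_na[1]  # 84, i.e. 6*1 + 26*3 with the NA term dropped
```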