I have a probability distribution X and I would like to draw samples of 100 observations each.
For a single sample I use sample(X,size=100,replace=TRUE). I would like to plot the PDF of the sample mean based on 100, 1000, and 10000 samples, so I tried to build a matrix of observations with matrix(sample(X,size=100,replace=TRUE),nrow=100,ncol=100), but that puts the same sample in every column. Any ideas on how to create a new sample for each column?
How about this? Replace rnorm with your sample call. replicate re-evaluates the expression on every repetition, so each column gets a new sample:
replicate(3,rnorm(10))
# [,1] [,2] [,3]
# [1,] -0.439366440511456290974 0.349113310500896667499 2.10467702915785226381
# [2,] 0.788892611945899879800 0.572377925929974273878 0.92566383997665424577
# [3,] 0.098359807623723205516 -0.642162545019581476602 0.28636140673186011307
# [4,] -3.063133170307587249681 1.322694510750672014510 0.66340500173312999532
# [5,] 0.255018412772398617161 1.492588176987205361712 1.11444057062233659039
# [6,] -1.069621910039232570711 -1.460604130070508821504 -0.81534768620081377044
# [7,] -1.036421328330551894226 1.525817374339748067058 0.47070620500783272311
# [8,] -0.139135286049327872027 -0.065015174557339946992 0.21483758566831215320
# [9,] -0.370005496738202488416 1.573987068922320986530 -1.21431499328084857581
#[10,] -0.070508137614489943545 1.657541962601124518883 0.45886687983031809734
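Applied to the question, a minimal sketch might look like the following. X here is a made-up stand-in population (the question does not show the real one), and sample_means is a hypothetical helper name. The original matrix() call repeats one sample because matrix() recycles the 100 values it is given to fill all 100 x 100 cells.

```r
set.seed(42)                      # only so the sketch is reproducible
X <- rexp(1000)                   # assumed stand-in for the real distribution

# Draw n_samples samples of `size` observations each (one per column)
# and return the mean of every sample.
sample_means <- function(n_samples, size = 100) {
  draws <- replicate(n_samples, sample(X, size = size, replace = TRUE))
  colMeans(draws)
}

means_100   <- sample_means(100)
means_1000  <- sample_means(1000)
means_10000 <- sample_means(10000)

# Kernel density estimates of the sample mean for each number of samples
plot(density(means_10000), main = "Density of the sample mean")
lines(density(means_1000), lty = 2)
lines(density(means_100), lty = 3)
```

Each call returns one mean per column, so the length of the result equals the number of samples requested.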
I have a dataset which has 2 variables and want to write an R function as follows: If I input one pair (one row) arbitrarily from the dataset into this function, I want to extract the corresponding index of the dataset through the function.
I used,
data = rnorm2d(10,rho = 0.4)
x = c(data[10,1],data[10,2])
print(match(x, data))
generated dataset is:
[,1] [,2]
[1,] -0.1792099 1.3007178
[2,] 0.3280193 0.6615251
[3,] -0.4390389 -1.9611801
[4,] -1.3096660 -0.9117184
[5,] 0.5165317 -0.3229271
[6,] -1.0963584 -1.1492360
[7,] 0.3447118 0.5357070
[8,] -0.8919166 0.4934032
[9,] -0.2199690 0.5788579
[10,] -0.9864628 0.6880458
But this gave me an output as follows:
[1] 10 20
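That output has a simple explanation: match(x, data) treats the matrix as one long vector stored column by column, so data[10,1] is found at position 10 and data[10,2] at position 20. One possible way to get the row index instead is to compare both columns at once. This is only a sketch: cbind(rnorm(10), rnorm(10)) is an assumed stand-in for the rnorm2d call, and row_index is a hypothetical helper name.

```r
set.seed(1)
data <- cbind(rnorm(10), rnorm(10))   # assumed stand-in for rnorm2d(10, rho = 0.4)
x <- data[10, ]                       # the pair taken from row 10

# Return the index of every row whose two entries both equal the pair.
row_index <- function(pair, m) {
  which(m[, 1] == pair[1] & m[, 2] == pair[2])
}

idx <- row_index(x, data)
idx
```

Exact comparison with == is safe here only because the pair is copied straight from the matrix; for values that were recomputed, a tolerance-based comparison would be the safer choice.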
I want to get the regression coefficients for each data frame in a list of data frames over a rolling window, but I am getting a very different result from what I am looking for.
My data and the code I have tried look like this:
library("zoo") ## for rollapply()
data <- list(mtcars,mtcars,mtcars)
fapplyFunction <- function(x){
coef(lm(mpg ~ drat, data=as.data.frame(x)))}
coef_list <- lapply(data, rollapply, 20, fapplyFunction, partial = FALSE, by.column = FALSE)
I would like to get the regression results for each rolling window, for each list element, as a list that I can later bind.
I am new to R. Any help would be much appreciated.
Providing a data.frame as the first rollapply argument will apply FUN to every column of the data.frame separately. Operating on data from two columns simultaneously can be achieved by moving a rolling window across the sequence of row numbers in the data.frame.
lapply(data, function(x)
rollapply(1:nrow(x), 20, function(i) coef(lm(mpg ~ drat, data = x[i, ]))))
#[[1]]
# (Intercept) drat
# [1,] -11.70889 8.981350
# [2,] -12.09923 9.124252
# [3,] -11.47530 9.015324
# [4,] -11.91551 9.124458
# [5,] -12.51405 9.094820
# [6,] -12.10843 8.994363
# [7,] -15.57941 9.937651
# [8,] -14.06719 9.511583
# [9,] -14.42693 9.684131
#[10,] -11.68393 8.789089
#[11,] -12.12158 8.954089
#[12,] -13.12850 9.243443
#[13,] -12.81957 9.095040
#
#[[2]]
# (Intercept) drat
# [1,] -11.70889 8.981350
# [2,] -12.09923 9.124252
# [3,] -11.47530 9.015324
# [4,] -11.91551 9.124458
# [5,] -12.51405 9.094820
# [6,] -12.10843 8.994363
# [7,] -15.57941 9.937651
# [8,] -14.06719 9.511583
# [9,] -14.42693 9.684131
#[10,] -11.68393 8.789089
#[11,] -12.12158 8.954089
#[12,] -13.12850 9.243443
#[13,] -12.81957 9.095040
#
#[[3]]
# (Intercept) drat
# [1,] -11.70889 8.981350
# [2,] -12.09923 9.124252
# [3,] -11.47530 9.015324
# [4,] -11.91551 9.124458
# [5,] -12.51405 9.094820
# [6,] -12.10843 8.994363
# [7,] -15.57941 9.937651
# [8,] -14.06719 9.511583
# [9,] -14.42693 9.684131
#[10,] -11.68393 8.789089
#[11,] -12.12158 8.954089
#[12,] -13.12850 9.243443
#[13,] -12.81957 9.095040
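For comparison, the same rolling-window idea can be sketched in base R without zoo. This is a swapped-in alternative, not the rollapply call above; roll_coef is a hypothetical helper name, and the last line shows one way to bind the per-data-frame results, as the question asks.

```r
data <- list(mtcars, mtcars, mtcars)

# Fit mpg ~ drat on every block of `width` consecutive rows and
# return one row of coefficients per window.
roll_coef <- function(df, width = 20) {
  starts <- seq_len(nrow(df) - width + 1)
  fits <- lapply(starts, function(s) {
    coef(lm(mpg ~ drat, data = df[s:(s + width - 1), ]))
  })
  do.call(rbind, fits)
}

coef_list <- lapply(data, roll_coef)

# Stack the three coefficient matrices into a single matrix
all_coefs <- do.call(rbind, coef_list)
```

With the 32 rows of mtcars and a width of 20 this gives 13 windows per data frame, the same number of rows as in the output above.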
I have a list of matrices and a list of vectors, and I want to divide the columns of each matrix by the corresponding vector element.
For example, given
set.seed(230)
data <- list(cbind(c(NA, rnorm(6)),c(rnorm(6),NA)), cbind(runif(7), runif(7)))
divisors <- list(c(0.5,2), c(3,4))
I'm looking for a vectorized function that produces output that looks the same as
for(i in 1:length(data)){
for(j in 1:ncol(data[[i]])){data[[i]][,j] <- data[[i]][,j] / divisors[[i]][j]}
}
i.e.
[[1]]
[,1] [,2]
[1,] NA 0.28265752
[2,] -0.46967014 -0.07132588
[3,] 0.20253439 -0.37432527
[4,] 0.65736410 0.06630705
[5,] 0.72349294 0.67202129
[6,] 0.88532648 -0.80892508
[7,] 0.08162027 NA
[[2]]
[,1] [,2]
[1,] 0.26597435 0.18120979
[2,] 0.31213250 0.16493883
[3,] 0.19250804 0.14104145
[4,] 0.21196882 0.10172964
[5,] 0.10389773 0.04979742
[6,] 0.02754329 0.15064043
[7,] 0.25771766 0.23042586
The closest I have been able to come is
Map(`/`, data, divisors)
But that divides rows (rather than columns) of the matrix by the vector. Any help appreciated.
Transpose your matrices before and after:
lapply(Map(`/`, lapply(data, t), divisors), t)
# [[1]]
# [,1] [,2]
# [1,] NA 0.28265752
# [2,] -0.46967014 -0.07132588
# [3,] 0.20253439 -0.37432527
# [4,] 0.65736410 0.06630705
# [5,] 0.72349294 0.67202129
# [6,] 0.88532648 -0.80892508
# [7,] 0.08162027 NA
#
# [[2]]
# [,1] [,2]
# [1,] 0.26597435 0.18120979
# [2,] 0.31213250 0.16493883
# [3,] 0.19250804 0.14104145
# [4,] 0.21196882 0.10172964
# [5,] 0.10389773 0.04979742
# [6,] 0.02754329 0.15064043
# [7,] 0.25771766 0.23042586
I prefer the transpose approach above, but another option is to expand your divisor vectors into matrices with the same dimensions as the matrices in data:
div_mat <- Map(matrix, data = divisors, nrow = sapply(data, nrow), ncol = 2, byrow = TRUE)
Map("/", data, div_mat)
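A third option, sketched here with the data from the question, is sweep(), which applies an operation along one margin of a matrix. The name result is chosen only for illustration.

```r
set.seed(230)
data <- list(cbind(c(NA, rnorm(6)), c(rnorm(6), NA)), cbind(runif(7), runif(7)))
divisors <- list(c(0.5, 2), c(3, 4))

# sweep(m, 2, d, "/") divides column j of m by d[j];
# Map pairs each matrix with its own divisor vector.
result <- Map(function(m, d) sweep(m, 2, d, "/"), data, divisors)
```

This avoids the double transpose and does not hard-code the number of columns.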
So basically I want to split a randomly generated matrix into two matrices, one for training and one for testing.
a <- s[sample(nrow(s),size=3,replace=FALSE),]
b <- s[-a,]
> s
[,1] [,2]
[1,] 0.69779187 -0.75869384
[2,] -0.46857477 -0.33813598
[3,] 0.53903809 -0.95950598
[4,] -0.33312675 -0.49951164
[5,] 0.88500834 0.08256923
[6,] 0.63664652 0.87420720
[7,] 0.61614134 0.77893294
[8,] 0.36956134 0.07586245
[9,] -0.03678593 -0.23743987
[10,] -0.27057064 -0.86067063
> a
[,1] [,2]
[1,] 0.8850083 0.08256923
[2,] 0.6366465 0.87420720
[3,] -0.2705706 -0.86067063
> b
[,1] [,2]
The idea here is to generate a 10 x 2 matrix, randomly pick 3 rows from it as training data, and then output the training matrix and the remaining rows as the testing matrix.
Does anyone have suggestions on how to remove the rows of a from s?
The issue is that you're trying to index s with a matrix a, rather than the randomly selected indices. Modifying your code to the following should do the trick:
i <- sample(nrow(s),size=3,replace=FALSE)
a <- s[i,]
b <- s[-i,] # Note the indexing with i, rather than a
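As a self-contained sketch of the same fix: s below is an assumed stand-in for the 10 x 2 matrix in the question, and train and test are illustrative names. The original s[-a,] returns an empty matrix because R truncates fractional subscripts toward zero, so with the values shown every entry of -a becomes index 0, which selects nothing.

```r
set.seed(1)
s <- matrix(rnorm(20), ncol = 2)   # assumed stand-in for the 10 x 2 matrix

# Pick the row numbers first, then use them for both subsets.
i <- sample(nrow(s), size = 3, replace = FALSE)
train <- s[i, , drop = FALSE]      # the 3 sampled rows
test  <- s[-i, , drop = FALSE]     # the remaining 7 rows
```

drop = FALSE is optional here; it just keeps the result a matrix even if a subset ever ends up with a single row.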
I have a matrix with 2 columns, and I'd like to turn it into a matrix with specified dimensions.
> t <- matrix(rnorm(20), ncol=2, nrow=10)
[,1] [,2]
[1,] 1.4938530 1.2493088
[2,] -0.8079445 1.8715868
[3,] 0.5775695 -0.9277420
[4,] 0.4415969 2.6357908
[5,] 0.3209226 -1.1306049
[6,] 0.5109251 -0.8661100
[7,] 1.9495571 0.2092941
[8,] 0.7816373 1.1517466
[9,] 0.0300595 -0.1351532
[10,] 0.7550894 0.7778869
What I'd like to do is something like:
> tt <- matrix(t, ncol=4, nrow=5)
[,1] [,2] [,3] [,4]
[1,] 1.4938530 1.2493088 -0.8079445 1.8715868
[2,] 0.5775695 -0.9277420 0.4415969 2.6357908
[3,] etc.
I tried to do things with modulo but my head hurts too much for me to try even one more minute.
You can transpose your first matrix, so that data is stored in the order you want, and then fill the second matrix by row:
tt <- matrix(t(t), ncol=4, nrow=5, byrow = TRUE)
t
# [,1] [,2]
# [1,] -1.4162465950 0.01532476
# [2,] -0.2366332875 -0.04024386
# [3,] 0.5146631983 -0.34720239
# [4,] 1.9243922633 -0.24016160
# [5,] 1.6161165230 0.63187438
# [6,] -0.3558181508 -0.73199138
# [7,] 0.7459405376 0.01934826
# [8,] -1.0428581093 -2.04422042
# [9,] 0.0003166344 0.98973993
#[10,] 0.6390745275 -0.65584930
tt
# [,1] [,2] [,3] [,4]
# [1,] -1.4162465950 0.01532476 -0.2366333 -0.04024386
# [2,] 0.5146631983 -0.34720239 1.9243923 -0.24016160
# [3,] 1.6161165230 0.63187438 -0.3558182 -0.73199138
# [4,] 0.7459405376 0.01934826 -1.0428581 -2.04422042
# [5,] 0.0003166344 0.98973993 0.6390745 -0.65584930
When you work with a matrix in R, you can think of it as a vector whose data is stored column by column. Extracting data by row is therefore not as straightforward as extracting by column, which follows the storage order. After transposing the first matrix, the data is stored in the order you want to read it, and filling the second matrix by row becomes straightforward.
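A small sketch with integers makes the storage order easier to see. The matrix m is made up for illustration so the values can be followed by eye.

```r
m <- matrix(1:20, ncol = 2)   # 10 x 2: column 1 holds 1..10, column 2 holds 11..20

# The underlying vector runs down column 1 first, then column 2
stored <- as.vector(m)

# After transposing, the storage order is m[1,1], m[1,2], m[2,1], m[2,2], ...
stored_t <- as.vector(t(m))

# Filling by row places two original rows side by side in each new row
tt <- matrix(t(m), ncol = 4, nrow = 5, byrow = TRUE)
tt[1, ]   # 1 11 2 12
tt[5, ]   # 9 19 10 20
```

Each row of tt is two consecutive rows of m joined together, which is the layout asked for in the question.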