Different regression output using dynlm and lm - r

I ran a regression first using lm and then using dynlm (from the package dynlm). Here is what I did using lm:
Euribor3t <- ts(diff(Euribor3))
OIS3t <- ts(diff(Ois3))
x <- ts(diff(Eurepo3-Ois3))
Vstoxxt <- ts(diff(Vstoxx))
CDSt <- ts(diff(CDS))
omo2 <- ts(diff(log(Open.Market.Operations)))
l1 <- (lag(Euribor3t, k=-1))
axx <- ts.intersect(Euribor3t, OIS3t, x, Vstoxxt, CDSt, omo2, l1)
reg1 <- lm(Euribor3t~OIS3t+CDSt+x+Vstoxxt+omo2+l1, data=axx)
summary(reg1)
and for dynlm:
zooX = zoo(test[, -1])
lmx <- dynlm(d(Euribor3)~d(Ois3)+d(CDS)+d(Eurepo3-Ois3)+d(Vstoxx)+d(log(Open.Market.Operations))+d(L(Euribor3, 1)), data=zooX)
summary(lmx)
These two approaches give me exactly the same output. However, if I add a subset to both regressions from 1 to 24 (all else equal):
Euribor3t <- ts(diff(Euribor3))
OIS3t <- ts(diff(Ois3))
x <- ts(diff(Eurepo3-Ois3))
Vstoxxt <- ts(diff(Vstoxx))
CDSt <- ts(diff(CDS))
omo2 <- ts(diff(log(Open.Market.Operations)))
l1 <- (lag(Euribor3t, k=-1))
axx <- ts.intersect(Euribor3t, OIS3t, x, Vstoxxt, CDSt, omo2, l1)
reg1 <- lm(Euribor3t~OIS3t+CDSt+x+Vstoxxt+omo2+l1, data=axx, subset=1:24)
summary(reg1)
zooX = zoo(test[, -1])
lmx <- dynlm(d(Euribor3)~d(Ois3)+d(CDS)+d(Eurepo3-Ois3)+d(Vstoxx)+d(log(Open.Market.Operations))+d(L(Euribor3, 1)), data=zooX[1:24])
summary(lmx)
The two outputs differ from each other. What might be the problem causing the deviation in my regression outputs?
Here is the data sample I experimented with:
Date Euribor3 Ois3 Eurepo3 Vstoxx CDS Open.Market.Operations
1 03.01.2005 2.154 2.089 2.09 14.47 17.938 344999
2 04.01.2005 2.151 2.084 2.09 14.51 17.886 344999
3 05.01.2005 2.151 2.087 2.08 14.42 17.950 333998
4 06.01.2005 2.150 2.085 2.08 13.80 17.950 333998
5 07.01.2005 2.146 2.086 2.08 13.57 17.913 333998
6 10.01.2005 2.146 2.087 2.08 12.92 17.958 333998
7 11.01.2005 2.146 2.089 2.08 13.68 17.962 333998
8 12.01.2005 2.145 2.085 2.08 14.05 17.886 339999
9 13.01.2005 2.144 2.084 2.08 13.64 17.568 339999
10 14.01.2005 2.144 2.085 2.08 13.57 17.471 339999
11 17.01.2005 2.143 2.085 2.08 13.20 17.365 339999
12 18.01.2005 2.144 2.085 2.08 13.17 17.214 347999
13 19.01.2005 2.143 2.086 2.08 13.63 17.143 354499
14 20.01.2005 2.144 2.087 2.08 14.17 17.125 354499
15 21.01.2005 2.143 2.087 2.08 13.96 17.193 354499
16 24.01.2005 2.143 2.086 2.08 14.11 17.283 354499
17 25.01.2005 2.144 2.086 2.08 13.63 17.083 354499
18 26.01.2005 2.143 2.086 2.08 13.32 17.348 347999
19 27.01.2005 2.144 2.085 2.08 12.46 17.295 352998
20 28.01.2005 2.144 2.084 2.08 12.81 17.219 352998
21 31.01.2005 2.142 2.084 2.08 12.72 17.143 352998
22 01.02.2005 2.142 2.083 2.08 12.36 17.125 352998
23 02.02.2005 2.141 2.083 2.08 12.25 17.000 357499
24 03.02.2005 2.144 2.088 2.08 12.38 16.808 357499
25 04.02.2005 2.142 2.084 2.08 11.60 16.817 357499
26 07.02.2005 2.142 2.084 2.08 11.99 16.798 359999
27 08.02.2005 2.141 2.083 2.08 11.92 16.804 355500
28 09.02.2005 2.142 2.080 2.08 12.19 16.589 355500
29 10.02.2005 2.140 2.080 2.08 12.04 16.500 355500
30 11.02.2005 2.140 2.078 2.08 11.99 16.429 355500
31 14.02.2005 2.139 2.078 2.08 12.52 16.042 355500

You are not allowing dynlm to use the same amount of data as lm. The dynlm model contains two fewer observations.
dim(model.frame(reg1))
# [1] 24 7
dim(model.frame(lmx))
# [1] 22 7
The reason is that with lm you are transforming the variables (differencing) using the entire data set (31 observations) and only then taking the subset, while in dynlm you are passing only 24 observations and, hence, dynlm does the differencing on those 24 observations. Because observations are lost after differencing and lagging, the resulting number of rows is not the same in both cases.
In dynlm you should use data=zooX[1:26]. In this way the same subset is used and the same result is obtained:
reg1 <- lm(Euribor3t~OIS3t+CDSt+x+Vstoxxt+omo2+l1, data=axx, subset=1:24)
lmx <- dynlm(d(Euribor3)~d(Ois3)+d(CDS)+d(Eurepo3-Ois3)+d(Vstoxx)+
d(log(Open.Market.Operations))+d(L(Euribor3, 1)), data=zooX[1:26])
all.equal(as.vector(fitted(reg1)), as.vector(fitted(lmx)))
# [1] TRUE
all.equal(coef(reg1), coef(lmx), check.attributes=FALSE)
# [1] TRUE
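The row counts can be reproduced without the original data. The following is a minimal sketch in base R, assuming a made-up series `y` of 31 observations and a hypothetical helper `rows_after()`; it only counts how many complete rows survive one difference plus one lag when the raw series is truncated first:

```r
set.seed(1)
y <- cumsum(rnorm(31))  # hypothetical raw series, 31 observations

# Number of complete rows left after differencing the first n_raw
# observations and aligning the result with its own first lag.
rows_after <- function(n_raw) {
  dy <- ts(diff(y[seq_len(n_raw)]))
  nrow(ts.intersect(dy, l1 = stats::lag(dy, k = -1)))
}

rows_after(31)  # full data: 29 rows, of which lm() keeps the first 24
rows_after(24)  # raw data cut to 24 first: only 22 rows remain
rows_after(26)  # raw data cut to 26 first: 24 rows, matching subset = 1:24
```

Under these assumptions, keeping 24 rows of the already differenced and lagged data corresponds to handing the first 26 raw observations to dynlm.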

Related

How to calculate average for columns?

I need to find the average of every 6 months, starting from V1 to V15. Because I know there are 15 columns, the code below works. But the real data will have more than 15 columns, and I need generic code that handles any number of columns.
The logic I am using is: take the average of columns 1:6 and print it, then 2:7, and so on up to column 15, since I know there are 15 columns. The actual data will have more columns.
csv file:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
2 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.11 0.04 0.04 0.04 0.04 0.04 0.04 0.04
3 3.29 3.56 3.97 3.23 2.96 2.35 0.06 1.72 2.19 1.92 1.84 2.87 2.57 2.24 3.06
4 11.79 15.01 14.76 13.19 18.29 4.51 16.24 11.92 10.49 13.05 12.74 12.95 12.25 14.46 14.27
5 20.11 21.76 21.92 23.67 19.87 25.59 23.04 16.67 22.78 21.32 20.85 21.57 21.99 22.69 22.96
6 24.85 26.56 29.45 24.96 25.91 16.31 27.51 22.56 28.35 26.96 26.53 28.23 28.24 29.85 29.79
7 29.02 32.75 29.95 27.7 29.6 17.91 32.08 25.71 33.16 31.56 30.89 32.68 34.05 36.26 33.27
8 32.83 33.09 17.03 33.23 31.22 39.71 35.43 28.77 37.09 34.18 34.05 36.98 37.16 38.74 37.32
9 32.86 36.34 35.47 33.6 35 42.79 37.22 30.62 38.74 35.83 36.17 39.48 39.18 42.87 39.54
10 36.02 37.66 36.15 34.79 36.84 22.19 38.9 32.62 40.28 37.87 38.09 41.04 41.62 44.94 42.18
11 36.96 39.22 19.13 36.68 37.43 46.26 40.84 33.88 41.31 39.09 39.14 43.46 42.75 47.2 43.8
12 37.34 40.87 35.91 37.66 39.22 46.95 42.26 35.19 42.93 41 40.61 44.73 45.2 48.14 44.49
13 38.92 38.37 41.01 39.01 41 48.89 43.8 37.16 44.1 42.46 41.3 45.47 46.65 50.48 47.6
14 21.67 43.16 20.98 39.84 42 49.62 44.35 37.46 44.63 43.15 42.64 48.48 48.53 53.55 48.57
a <- t(apply(mat,1,function(x){ c(mean(x[1:6]),mean(x[2:7]),mean(x[3:8]),mean(x[4:9]),mean(x[5:10]),mean(x[6:11]),mean(x[7:12]),mean(x[8:13]),mean(x[9:14]),mean(x[10:15])) }))
Please help. thanks in Advance.
We can do this with a rolling mean (rollmean from the zoo package):
library(zoo)
t(apply(df1, 1, function(x) rollmean(x, 6)))
Using base R:
n=6
d=lapply(1:(ncol(data)-(n-1)),function(x) x:(x+n-1))
sapply(d,function(w) rowMeans(data[,w]))
Another base solution:
rowlingRowMeans <- function(matrix, n_meanrows){
  out <- NULL
  # one window per starting column, so the bound depends on ncol(), not nrow()
  for(z in 1:(ncol(matrix)-n_meanrows+1)){
    out <- cbind(out, rowMeans(matrix[,z:(z+n_meanrows-1)]))
  }
  return(out)
}
mat <- matrix(rnorm(15*14, 1,10), ncol=15, nrow=14)
rowlingRowMeans(mat, 6)
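A small check of the windowed approach, as a sketch in base R. The matrix `m` and the names `starts` and `roll` are made up for illustration; the number of windows is derived from `ncol(m)`, so nothing depends on there being exactly 15 columns:

```r
m <- matrix(1:16, nrow = 2, byrow = TRUE)  # 2 rows, 8 columns
n <- 6                                     # window width

# One window per possible starting column: 1:6, 2:7, 3:8
starts <- seq_len(ncol(m) - n + 1)
roll <- sapply(starts, function(s) rowMeans(m[, s:(s + n - 1), drop = FALSE]))

roll
# row 1:  3.5  4.5  5.5
# row 2: 11.5 12.5 13.5
```

If some values are missing, `rowMeans(..., na.rm = TRUE)` would ignore them within each window; whether that is appropriate depends on the data.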

R: Variable length differ

I want to regress a differenced dependent variable on differenced independent variables and on one non-differenced variable.
I tried the following lines in R:
xt <- ts(xx)
yt <- ts(yy)
zt <- ts(zz)
bt <- ts(bb)
mt <- ts(mm)
xtd <- diff(xt)
ytd <- diff(yt)
ztd <- diff(zt)
btd <- diff(bt)
axx <- ts.intersect(xtd, ytd, ztd, btd, mt)
reg1 <- lm(xtd~ytd+ztd+btd+mt, axx)
summary(reg1)
Without the command ts.intersect() an error message pops up, saying that the variable lengths differ, found for the variable mt. That makes sense, since mt isn't differenced. My questions are:
i) Is this a correct way to deal with different variable lengths? and ii) Is there a more efficient way? Many thanks in advance.
Date xx yy zz bb mm
1 03.01.2005 0.065 0.001 14.4700 17.938 345001.0
2 04.01.2005 0.067 0.006 14.5100 17.886 345001.0
3 05.01.2005 0.064 -0.007 14.4200 17.950 334001.0
4 06.01.2005 0.065 -0.005 13.8000 17.950 334001.0
5 07.01.2005 0.060 -0.006 13.5700 17.913 334001.0
6 10.01.2005 0.059 -0.007 12.9200 17.958 334001.0
7 11.01.2005 0.057 -0.009 13.6800 17.962 334001.0
8 12.01.2005 0.060 -0.005 14.0500 17.886 340001.0
9 13.01.2005 0.060 -0.004 13.6400 17.568 340001.0
10 14.01.2005 0.059 -0.005 13.5700 17.471 340001.0
11 17.01.2005 0.058 -0.005 13.2000 17.365 340001.0
12 18.01.2005 0.059 -0.005 13.1700 17.214 340001.0
13 19.01.2005 0.057 -0.006 13.6300 17.143 354501.0
14 20.01.2005 0.057 -0.007 14.1700 17.125 354501.0
15 21.01.2005 0.056 -0.007 13.9600 17.193 354501.0
16 24.01.2005 0.057 -0.006 14.1100 17.283 354501.0
17 25.01.2005 0.058 -0.006 13.6300 17.083 354501.0
18 26.01.2005 0.057 -0.006 13.3200 17.348 348001.0
19 27.01.2005 0.059 -0.005 12.4600 17.295 353001.0
20 28.01.2005 0.060 -0.004 12.8100 17.219 353001.0
21 31.01.2005 0.058 -0.004 12.7200 17.143 353001.0
22 01.02.2005 0.059 -0.003 12.3600 17.125 353001.0
23 02.02.2005 0.058 -0.003 12.2500 17.000 357501.0
24 03.02.2005 0.056 -0.008 12.3800 16.808 357501.0
25 04.02.2005 0.058 -0.004 11.6000 16.817 357501.0
26 07.02.2005 0.058 -0.004 11.9900 16.798 357501.0
27 08.02.2005 0.058 -0.003 11.9200 16.804 355501.0
28 09.02.2005 0.062 0.000 12.1900 16.589 355501.0
29 10.02.2005 0.060 0.000 12.0400 16.500 355501.0
30 11.02.2005 0.062 0.002 11.9900 16.429 355501.0
The short answer is yes: you need to use ts.intersect() when some of your variables are differenced and some are not.
You can probably clean up the code a little so that fewer lines are repeated, but (especially if these are all your variables) it doesn't really make a difference.
For example, you might recode all columns as time series in one step by doing ts.d=ts(d[2:6]).
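A minimal sketch of that one-step conversion, assuming a made-up data frame `d` with the same column names as in the question (the values are random, not the asker's data). All columns become one multivariate series, the four differenced regressors are produced by a single `diff()` call, and `ts.intersect()` aligns them with the undifferenced `mm`:

```r
set.seed(42)
# Hypothetical stand-in for the asker's data: 30 rows, five numeric columns
d <- data.frame(xx = rnorm(30), yy = rnorm(30), zz = rnorm(30),
                bb = rnorm(30), mm = rnorm(30))

ts_d  <- ts(d)                                    # all columns at once
diffs <- diff(ts_d[, c("xx", "yy", "zz", "bb")])  # 29 rows, times 2..30
axx   <- ts.intersect(diffs, mm = ts_d[, "mm"])   # drops time 1 of mm

# Set the column names explicitly rather than relying on the defaults
colnames(axx) <- c("xtd", "ytd", "ztd", "btd", "mt")

reg1 <- lm(xtd ~ ytd + ztd + btd + mt, data = as.data.frame(axx))
nrow(axx)  # 29
```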

Error while producing an ARMA model using the TSA package in R

Has anyone else had this problem, or even better, does anyone know why this is giving me an error?
I'm attempting to create an ARMA model of order 3, 3. I'm using the TSA package.
stocks_arma <- arma(stocks$close, order = c(3,3))
I'm getting this warning:
Warning message:
In arma(VIXts, order = c(3, 3)) : Hessian negative-semidefinite
I understand that a negative-semidefinite Hessian is a bad thing because we usually want global mins/maxes. However, I don't understand why this is happening. I am unsure whether this is a mathematical issue or a syntactical issue.
My data is a very modest vector of 1000 entries. Here is one-tenth of it:
15.14 15.31 15.08 15.24 16.41 17.99 17.92 16.65 16.68 18.61 18.49 19.08 17.58 18.42 17.59 16.69 18.60 17.81 18.12 18.33 18.83 16.62 16.97 15.03 15.07 15.22 15.27 16.14 15.59 16.29 16.37 15.11 14.33 14.55 15.43 15.71 16.32 15.73 14.84 16.81 15.43 14.15 13.98 14.07 13.88 14.18 14.59 14.51 14.05 15.80 16.41 16.28 14.38 15.63 17.74 17.98 17.47 17.83 17.06 16.49 16.35 15.18 15.96 15.11 15.02 14.02 13.45 14.29 14.63 14.85 13.70 14.74 15.28 15.32 15.99 15.95 15.64 17.57 18.96 18.93 18.03 16.70 17.53 19.34 20.47 18.62 16.27 15.45 16.16 16.48 17.11 16.74 18.36 17.95 18.72 18.05 17.10 17.50 16.66 16.80 17.08 19.71 19.45 19.72 20.38
There is nothing overtly fishy about the values at all.
Any insight is very much appreciated.

Mapping spatial Distributions in R

My data set includes 17 stations and for each station there are 24 hourly temperature values.
I would like to map each station's value for a given hour, and do so for all the hours.
What I want to do is something like the image.
The data is in the following format:
N2 N3 N4 N5 N7 N8 N10 N12 N13 N14 N17 N19 N25 N28 N29 N31 N32
1 1.300 -0.170 -0.344 2.138 0.684 0.656 0.882 0.684 1.822 1.214 2.046 2.432 0.208 0.312 0.530 0.358 0.264
2 0.888 -0.534 -0.684 1.442 -0.178 -0.060 0.430 -0.148 1.420 0.286 1.444 2.138 -0.264 -0.042 0.398 -0.196 -0.148
3 0.792 -0.564 -0.622 0.998 -0.320 1.858 -0.036 -0.118 1.476 0.110 0.964 2.048 -0.480 -0.434 0.040 -0.538 -0.322
4 0.324 -1.022 -1.128 1.380 -0.792 1.042 -0.054 -0.158 1.518 -0.102 1.354 2.386 -0.708 -0.510 0.258 -0.696 -0.566
5 0.650 -0.774 -0.982 1.124 -0.540 3.200 -0.052 -0.258 1.452 0.028 1.022 2.110 -0.714 -0.646 0.266 -0.768 -0.532
6 0.670 -0.660 -0.844 1.248 -0.550 2.868 -0.098 -0.240 1.380 -0.012 1.164 2.324 -0.498 -0.474 0.860 -0.588 -0.324
MeteoSwiss
1 -0.6
2 -1.2
3 -1.0
4 -0.8
5 -0.4
6 -0.2
where N2, N3, ..., MeteoSwiss are the stations and each row gives the stations' temperature values for one hour.
id Longitude Latitude
2 7.1735 45.86880001
3 7.17254 45.86887001
4 7.171636 45.86923601
5 7.18018 45.87158001
7 7.177229 45.86923001
8 7.17524 45.86808001
10 7.179299 45.87020001
12 7.175189 45.86974001
13 7.179379 45.87081001
14 7.175509 45.86932001
17 7.18099 45.87262001
19 7.18122 45.87355001
25 7.15497 45.87058001
28 7.153399 45.86954001
29 7.152649 45.86992001
31 7.154419 45.87004001
32 7.156099 45.86983001
MeteoSwiss 7.184 45.896
I define a toy example more or less resembling your data:
vals <- matrix(rnorm(24*17), nrow=24)
cds <- data.frame(id=paste0('N', 1:17),
Longitude=rnorm(n=17, mean=7.1),
Latitude=rnorm(n=17, mean=45.8))
vals <- as.data.frame(t(vals))
names(vals) <- paste0('H', 1:24)
The sp package defines several classes and methods to store and
display spatial data. For your example you should use the
SpatialPointsDataFrame class:
library(sp)
mySP <- SpatialPointsDataFrame(coords=cds[,-1], data=data.frame(vals))
and the spplot method to display the information:
spplot(mySP, as.table=TRUE,
col.regions=bpy.colors(10),
alpha=0.8, edge.col='black')
You may also find the spacetime package useful
(paper at JSS).

pca in R with princomp() and using svd() [duplicate]

Possible Duplicate:
Comparing svd and princomp in R
How to perform PCA using 2 methods (princomp() and svd of the correlation matrix) in R?
I have a data set like:
438,498,3625,3645,5000,2918,5000,2351,2332,2643,1698,1687,1698,1717,1744,593,502,493,504,445,431,444,440,429,10
438,498,3625,3648,5000,2918,5000,2637,2332,2649,1695,1687,1695,1720,1744,592,502,493,504,449,431,444,443,429,10
438,498,3625,3629,5000,2918,5000,2637,2334,2643,1696,1687,1695,1717,1744,593,502,493,504,449,431,444,446,429,10
437,501,3625,3626,5000,2918,5000,2353,2334,2642,1730,1687,1695,1717,1744,593,502,493,504,449,431,444,444,429,10
438,498,3626,3629,5000,2918,5000,2640,2334,2639,1696,1687,1695,1717,1744,592,502,493,504,449,431,444,441,429,10
439,498,3626,3629,5000,2918,5000,2633,2334,2645,1705,1686,1694,1719,1744,589,502,493,504,446,431,444,444,430,10
440,5000,3627,3628,5000,2919,3028,2346,2330,2638,1727,1684,1692,1714,1745,588,501,492,504,451,433,446,444,432,10
444,5021,3631,3634,5000,2919,5000,2626,2327,2638,1698,1680,1688,1709,1740,595,500,491,503,453,436,448,444,436,10
451,5025,3635,3639,5000,2920,3027,2620,2323,2632,1706,1673,1681,1703,753,595,499,491,502,457,440,453,454,442,20
458,5022,3640,3644,5000,2922,5000,2346,2321,2628,1688,1666,1674,1696,744,590,496,490,498,462,444,458,461,449,20
465,525,3646,3670,5000,2923,5000,2611,2315,2631,1674,1658,1666,1688,735,593,495,488,497,467,449,462,469,457,20
473,533,3652,3676,5000,2925,5000,2607,2310,2623,1669,1651,1659,1684,729,578,496,487,498,469,454,467,476,465,20
481,544,3658,3678,5000,2926,5000,2606,2303,2619,1668,1643,1651,1275,723,581,495,486,497,477,459,472,484,472,20
484,544,3661,3665,5000,2928,5000,2321,2304,5022,1647,1639,1646,1270,757,623,493,484,495,480,461,474,485,476,20
484,532,3669,3662,2945,2926,5000,2326,2306,2620,1648,1639,1646,1270,760,533,493,483,494,507,461,473,486,476,20
482,520,3685,3664,2952,2927,5000,2981,2307,2329,1650,1640,1644,1268,757,533,492,482,492,513,459,474,485,474,20
481,522,3682,3661,2955,2927,2957,2984,1700,2622,1651,1641,1645,1272,761,530,492,482,492,513,462,486,483,473,20
480,525,3694,3664,2948,2926,2950,2995,1697,2619,1651,1642,1646,1269,762,530,493,482,492,516,462,486,483,473,20
481,515,5018,3664,2956,2927,2947,2993,1697,2622,1651,1641,1645,1269,765,592,489,482,495,531,462,499,483,473,20
479,5000,3696,3661,2953,2927,2944,2993,1702,2622,1649,1642,1645,1269,812,588,489,481,491,510,462,481,483,473,20
480,506,5019,3665,2941,2929,2945,2981,1700,2616,1652,1642,1645,1271,814,643,491,480,493,524,461,469,484,473,20
479,5000,5019,3661,2943,2930,2942,2996,1698,2312,1653,1642,1644,1274,811,617,491,479,491,575,461,465,484,473,20
479,5000,5020,3662,2945,2931,2942,2997,1700,2313,1654,1642,1644,1270,908,616,490,478,489,503,460,460,478,473,10
481,508,5021,3660,2954,2936,2946,2966,1705,2313,1654,1643,1643,1270,1689,678,493,477,483,497,467,459,476,473,10
486,510,522,3662,2958,2938,2939,2627,1707,2314,1659,1643,1639,1665,1702,696,516,476,477,547,465,457,470,474,10
479,521,520,3663,2954,2938,2941,2957,1712,2314,1660,1643,1638,1660,1758,688,534,475,475,489,461,456,465,474,10
480,554,521,3664,2954,2938,2941,2632,1715,2313,1660,1643,1637,1656,1761,687,553,475,474,558,462,453,465,476,10
481,511,5023,3665,2954,2937,2941,2627,1707,2312,1660,1641,1636,1655,1756,687,545,475,475,504,463,458,470,477,10
482,528,524,3665,2953,2937,2940,2629,1706,2312,1657,1640,1635,1654,1756,566,549,475,476,505,464,459,468,477,10
So I am doing this:
x <- read.csv("C:\\data_25_1000.txt",header=F,row.names=NULL)
p1 <- princomp(x, cor = TRUE) ## using correlation matrix
p1
Call:
princomp(x = x, cor = TRUE)
Standard deviations:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16
1.9800328 1.8321498 1.4147367 1.3045541 1.2016116 1.1708212 1.1424120 1.0134829 1.0045317 0.9078734 0.8442308 0.8093044 0.7977656 0.7661921 0.7370972 0.7075442
Comp.17 Comp.18 Comp.19 Comp.20 Comp.21 Comp.22 Comp.23 Comp.24 Comp.25
0.7011462 0.6779179 0.6671614 0.6407627 0.6077336 0.5767217 0.5659030 0.5526520 0.5191375
25 variables and 1000 observations.
For the second method suppose I have the correlation matrix of "C:\data_25_1000.txt"
which is:
1.0 0.3045 0.1448 -0.0714 -0.038 -0.0838 -0.1433 -0.1071 -0.1988 -0.1076 -0.0313 -0.157 -0.1032 -0.137 -0.0802 0.1244 0.0701 0.0457 -0.0634 0.0401 0.1643 0.3056 0.3956 0.4533 0.1557
0.3045 0.9999 0.3197 0.1328 0.093 -0.0846 -0.132 0.0046 -0.004 -0.0197 -0.1469 -0.1143 -0.2016 -0.1 -0.0316 0.0044 -0.0589 -0.0589 0.0277 0.0314 0.078 0.0104 0.0692 0.1858 0.0217
0.1448 0.3197 1 0.3487 0.2811 0.0786 -0.1421 -0.1326 -0.2056 -0.1109 0.0385 -0.1993 -0.1975 -0.1858 -0.1546 -0.0297 -0.0629 -0.0997 -0.0624 -0.0583 0.0316 0.0594 0.0941 0.0813 -0.1211
-0.0714 0.1328 0.3487 1 0.6033 0.2866 -0.246 -0.1201 -0.1975 -0.0929 -0.1071 -0.212 -0.3018 -0.3432 -0.2562 0.0277 -0.1363 -0.2218 -0.1443 -0.0322 -0.012 0.1741 -0.0725 -0.0528 -0.0937
-0.038 0.093 0.2811 0.6033 1 0.4613 0.016 0.0655 -0.1094 0.0026 -0.1152 -0.1692 -0.2047 -0.2508 -0.319 -0.0528 -0.1839 -0.2758 -0.2657 -0.1136 -0.0699 0.1433 -0.0136 -0.0409 -0.1538
-0.0838 -0.0846 0.0786 0.2866 0.4613 0.9999 0.2615 0.2449 0.1471 0.0042 -0.1496 -0.2025 -0.1669 -0.142 -0.1746 -0.1984 -0.2197 -0.2631 -0.2675 -0.1999 -0.1315 0.0469 0.0003 -0.1113 -0.1217
-0.1433 -0.132 -0.1421 -0.246 0.016 0.2615 1 0.3979 0.3108 0.1622 -0.0539 0.0231 0.1801 0.2129 0.1331 -0.1325 -0.0669 -0.0922 -0.1236 -0.1463 -0.1452 -0.2422 -0.0768 -0.1457 0.036
-0.1071 0.0046 -0.1326 -0.1201 0.0655 0.2449 0.3979 1 0.4244 0.3821 0.119 -0.0666 0.0163 0.0963 -0.0078 -0.1202 -0.204 -0.2257 -0.2569 -0.2334 -0.234 -0.2004 -0.138 -0.0735 -0.1442
-0.1988 -0.004 -0.2056 -0.1975 -0.1094 0.1471 0.3108 0.4244 0.9999 0.5459 0.0498 -0.052 0.0987 0.186 0.2576 -0.052 -0.1921 -0.2222 -0.1792 -0.0154 -0.058 -0.1868 -0.2232 -0.3118 0.0186
-0.1076 -0.0197 -0.1109 -0.0929 0.0026 0.0042 0.1622 0.3821 0.5459 0.9999 0.2416 0.0183 0.063 0.0252 0.186 0.0519 -0.1943 -0.2241 -0.2635 -0.0498 -0.0799 -0.0553 -0.1567 -0.2281 -0.0263
-0.0313 -0.1469 0.0385 -0.1071 -0.1152 -0.1496 -0.0539 0.119 0.0498 0.2416 1 0.2601 0.1625 -0.0091 -0.0633 0.0355 0.0397 -0.0288 -0.0768 -0.2144 -0.2581 0.1062 0.0469 -0.0608 -0.0578
-0.157 -0.1143 -0.1993 -0.212 -0.1692 -0.2025 0.0231 -0.0666 -0.052 0.0183 0.2601 0.9999 0.3685 0.3059 0.1269 -0.0302 0.1417 0.1678 0.2219 -0.0392 -0.2391 -0.2504 -0.2743 -0.1827 -0.0496
-0.1032 -0.2016 -0.1975 -0.3018 -0.2047 -0.1669 0.1801 0.0163 0.0987 0.063 0.1625 0.3685 1 0.6136 0.2301 -0.1158 0.0366 0.0965 0.1334 -0.0449 -0.1923 -0.2321 -0.1848 -0.1109 0.1007
-0.137 -0.1 -0.1858 -0.3432 -0.2508 -0.142 0.2129 0.0963 0.186 0.0252 -0.0091 0.3059 0.6136 1 0.4078 -0.0615 0.0607 0.1223 0.1379 0.0072 -0.1377 -0.3633 -0.2905 -0.1867 0.0277
-0.0802 -0.0316 -0.1546 -0.2562 -0.319 -0.1746 0.1331 -0.0078 0.2576 0.186 -0.0633 0.1269 0.2301 0.4078 1 0.0521 -0.0345 0.0444 0.0778 0.0925 0.0596 -0.2551 -0.1499 -0.2211 0.244
0.1244 0.0044 -0.0297 0.0277 -0.0528 -0.1984 -0.1325 -0.1202 -0.052 0.0519 0.0355 -0.0302 -0.1158 -0.0615 0.0521 1 0.295 0.2421 -0.06 0.0921 0.243 0.0953 0.0886 0.0518 -0.0032
0.0701 -0.0589 -0.0629 -0.1363 -0.1839 -0.2197 -0.0669 -0.204 -0.1921 -0.1943 0.0397 0.1417 0.0366 0.0607 -0.0345 0.295 0.9999 0.4832 0.2772 0.0012 0.1198 0.0411 0.1213 0.1409 0.0368
0.0457 -0.0589 -0.0997 -0.2218 -0.2758 -0.2631 -0.0922 -0.2257 -0.2222 -0.2241 -0.0288 0.1678 0.0965 0.1223 0.0444 0.2421 0.4832 1 0.2632 0.0576 0.0965 -0.0043 0.0818 0.102 0.0915
-0.0634 0.0277 -0.0624 -0.1443 -0.2657 -0.2675 -0.1236 -0.2569 -0.1792 -0.2635 -0.0768 0.2219 0.1334 0.1379 0.0778 -0.06 0.2772 0.2632 1 0.2036 -0.0452 -0.142 -0.0696 -0.0367 0.3039
0.0401 0.0314 -0.0583 -0.0322 -0.1136 -0.1999 -0.1463 -0.2334 -0.0154 -0.0498 -0.2144 -0.0392 -0.0449 0.0072 0.0925 0.0921 0.0012 0.0576 0.2036 0.9999 0.2198 0.1268 0.0294 0.0261 0.3231
0.1643 0.078 0.0316 -0.012 -0.0699 -0.1315 -0.1452 -0.234 -0.058 -0.0799 -0.2581 -0.2391 -0.1923 -0.1377 0.0596 0.243 0.1198 0.0965 -0.0452 0.2198 1 0.2667 0.2833 0.2467 0.0288
0.3056 0.0104 0.0594 0.1741 0.1433 0.0469 -0.2422 -0.2004 -0.1868 -0.0553 0.1062 -0.2504 -0.2321 -0.3633 -0.2551 0.0953 0.0411 -0.0043 -0.142 0.1268 0.2667 1 0.4872 0.3134 0.1663
0.3956 0.0692 0.0941 -0.0725 -0.0136 0.0003 -0.0768 -0.138 -0.2232 -0.1567 0.0469 -0.2743 -0.1848 -0.2905 -0.1499 0.0886 0.1213 0.0818 -0.0696 0.0294 0.2833 0.4872 0.9999 0.4208 0.1317
0.4533 0.1858 0.0813 -0.0528 -0.0409 -0.1113 -0.1457 -0.0735 -0.3118 -0.2281 -0.0608 -0.1827 -0.1109 -0.1867 -0.2211 0.0518 0.1409 0.102 -0.0367 0.0261 0.2467 0.3134 0.4208 1 0.0592
0.1557 0.0217 -0.1211 -0.0937 -0.1538 -0.1217 0.036 -0.1442 0.0186 -0.0263 -0.0578 -0.0496 0.1007 0.0277 0.244 -0.0032 0.0368 0.0915 0.3039 0.3231 0.0288 0.1663 0.1317 0.0592 0.9999
I have also computed svd of this correlation matrix and got:
> s = svd(Correlation_25_1000)
$d
[1] 3.9205298 3.3567729 2.0014799 1.7018614 1.4438704 1.3708223 1.3051053 1.0271475 1.0090840 0.8242341 0.7127256 0.6549736 0.6364299 0.5870503 0.5433123 0.5006188 0.4916060
[18] 0.4595726 0.4451043 0.4105769 0.3693401 0.3326079 0.3202462 0.3054243 0.2695037
$u
matrix
$v
matrix
My question is: how can I use $d, $u and $v to get the principal components?
Could I use prcomp()? If so, how?
Try this one
princomp
princomp(USArrests, cor = TRUE)$loadings
Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4
Murder   -0.536  0.418 -0.341  0.649
Assault  -0.583  0.188 -0.268 -0.743
UrbanPop -0.278 -0.873 -0.378  0.134
Rape     -0.543 -0.167  0.818
svd
svd(cor(USArrests))$u
[,1] [,2] [,3] [,4]
[1,] -0.5358995 0.4181809 -0.3412327 0.64922780
[2,] -0.5831836 0.1879856 -0.2681484 -0.74340748
[3,] -0.2781909 -0.8728062 -0.3780158 0.13387773
[4,] -0.5434321 -0.1673186 0.8177779 0.08902432
eigen
eigen(cor(USArrests))$vectors
[,1] [,2] [,3] [,4]
[1,] -0.5358995 0.4181809 -0.3412327 0.64922780
[2,] -0.5831836 0.1879856 -0.2681484 -0.74340748
[3,] -0.2781909 -0.8728062 -0.3780158 0.13387773
[4,] -0.5434321 -0.1673186 0.8177779 0.08902432
For a correlation matrix, princomp, svd, and eigen all produce the same loadings.
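To connect this back to $d, $u and $v: for a correlation matrix, $d holds the component variances, so the princomp standard deviations are their square roots. This agrees with the numbers in the question, where 1.9800328^2 is approximately 3.92053, the first value of $d. A minimal sketch using the built-in USArrests data; the names `R`, `p`, `s`, `Z` and `scores` are made up for illustration, and loadings are compared in absolute value because the sign of each component is arbitrary:

```r
R <- cor(USArrests)
p <- princomp(USArrests, cor = TRUE)
s <- svd(R)

# Singular values of the correlation matrix are the component variances
all.equal(unname(p$sdev^2), s$d)

# Columns of $u (equal to $v here, up to sign) are the loadings
all.equal(abs(unname(unclass(p$loadings))), abs(s$u))

# Component scores: standardized data projected onto the loadings
Z <- scale(USArrests)
scores <- Z %*% s$u
apply(scores, 2, var)  # reproduces s$d
```

Note that `scale()` standardizes with the n - 1 divisor while princomp uses n, so these scores may differ from `p$scores` by a constant factor as well as in sign.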
