Calculate period return from monthly returns - r
This may sound naive, but I can't seem to find the solution. I need to calculate 1-, 3- and 5-year returns, and my dataset consists of monthly returns rather than prices. The dataset I'm working on is similar to managers from PerformanceAnalytics:
library(PerformanceAnalytics)
data(managers)
tail(managers)
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6 EDHEC LS EQ SP500 TR US 10Y TR US 3m TR
2006-07-31 -0.0144 -0.0131 0.0102 -0.0120 -0.0164 -0.0225 -0.0031 0.00620 0.01580 0.00423
2006-08-31 0.0161 -0.0113 0.0253 -0.0183 0.0169 0.0193 0.0114 0.02380 0.02190 0.00441
2006-09-30 0.0068 -0.0231 0.0072 0.0197 0.0132 -0.0177 0.0001 0.02580 0.01140 0.00456
2006-10-31 0.0427 0.0167 0.0183 0.0518 0.0266 0.0189 0.0194 0.03260 0.00584 0.00381
2006-11-30 0.0117 0.0206 0.0269 0.0373 0.0038 0.0300 0.0200 0.01900 0.01419 0.00430
2006-12-31 0.0115 -0.0062 0.0110 0.0206 0.0317 0.0215 0.0153 0.01403 -0.01550 0.00441
I looked into Return.cumulative from the PerformanceAnalytics package, but it has no argument for specifying the number of periods. ROC from TTR lets you specify the number of periods, but it expects prices rather than returns. What would be the best way to do this? Thank you in advance!
Based on what you want and what you already know about ROC from TTR, I will only provide the data preparation part, which converts the returns into a price series:
#Sample Data
df=read.table(text=' Date HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
2006-07-31 -0.0144 -0.0131 0.0102 -0.0120 -0.0164 -0.0225
2006-08-31 0.0161 -0.0113 0.0253 -0.0183 0.0169 0.0193
2006-09-30 0.0068 -0.0231 0.0072 0.0197 0.0132 -0.0177
2006-10-31 0.0427 0.0167 0.0183 0.0518 0.0266 0.0189
2006-11-30 0.0117 0.0206 0.0269 0.0373 0.0038 0.0300
2006-12-31 0.0115 -0.0062 0.0110 0.0206 0.0317 0.0215 ',header=T,stringsAsFactors=F)
# Convert returns to prices, assuming every series starts at an initial value of 100
for (i in 2:dim(df)[2]) {
  # for log returns use {x * exp(y)} instead
  B <- Reduce(function(x, y) {x * (1 + y)}, df[, i], init = 100, accumulate = TRUE)
  if (i == 2) {
    Price <- B
  } else {
    Price <- cbind(Price, B)
  }
}
# Drop the initial row of 100s and attach the dates
Price <- data.frame(cbind(df$Date, Price[-1, ]))
names(Price) <- names(df)
> Price
Date HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
1 2006-07-31 98.56 98.69 101.02 98.8 98.36 97.75
2 2006-08-31 100.146816 97.574803 103.575806 96.99196 100.022284 99.636575
3 2006-09-30 100.8278143488 95.3208250507 104.3215518032 98.902701612 101.3425781488 97.8730076225
4 2006-10-31 105.133162021494 96.9126828290467 106.230636201199 104.025861555502 104.038290727558 99.7228074665652
5 2006-11-30 106.363220017145 98.909084095325 109.088240315011 107.906026191522 104.433636232323 102.714491690562
6 2006-12-31 107.586397047342 98.295847773934 110.288210958476 110.128890331067 107.744182500887 104.922853261909
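You can then use a standard package to compute or annualize the period returns, or write your own calculation. Note that cbind() with the character Date column stores the prices as strings, so convert those columns back with as.numeric() first.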
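Once the returns are a price index, an n-month return is the ratio of the index to its value n months earlier, minus one. Below is a minimal base-R sketch of that step, assuming simple (not log) monthly returns; the helper name period_return is hypothetical and not part of any package.

```r
# Monthly simple returns for one series (HAM1 from the sample data)
r <- c(-0.0144, 0.0161, 0.0068, 0.0427, 0.0117, 0.0115)

# Price index: every series starts at 100 before the first return
price <- 100 * cumprod(1 + r)

# Hypothetical helper: n-month return computed from a price index
period_return <- function(p, n) {
  out <- rep(NA_real_, length(p))
  if (length(p) > n) {
    idx <- (n + 1):length(p)
    out[idx] <- p[idx] / p[idx - n] - 1
  }
  out
}

# 3-month returns; the first three entries are NA because no earlier price exists
ret3 <- period_return(price, 3)
print(ret3)
```

For 1-, 3- and 5-year figures on a full monthly history, n would be 12, 36 and 60. Keeping the initial row of 100s in the index would give one additional window at the start.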
Using @Wen's data:
df=read.table(text=' Date HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
2006-07-31 -0.0144 -0.0131 0.0102 -0.0120 -0.0164 -0.0225
2006-08-31 0.0161 -0.0113 0.0253 -0.0183 0.0169 0.0193
2006-09-30 0.0068 -0.0231 0.0072 0.0197 0.0132 -0.0177
2006-10-31 0.0427 0.0167 0.0183 0.0518 0.0266 0.0189
2006-11-30 0.0117 0.0206 0.0269 0.0373 0.0038 0.0300
2006-12-31 0.0115 -0.0062 0.0110 0.0206 0.0317 0.0215 ',header=T,stringsAsFactors=F)
You can use the rollapply function from the zoo package.
library(zoo)
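roll <- function(x, n, stat) {
  # Too few observations for a single full window: return all NA
  if (length(x) <= n) return(rep(NA_real_, length(x)))
  # Convert returns to growth factors (1 + r)
  x <- x + 1
  # Offsets -1, ..., -n: apply stat to the n observations before each row
  rollapply(x, list(-seq(n)), stat, fill = NA)
}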
df2 <- transform(df, four_month_return_HAM1 = ave(HAM1, FUN = function(x) roll(x, 4, prod)-1))
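Change 4 to the number of periods over which you want the cumulative return. For one year this would be 12, which gives the 12-month rolling returns. Note that the offsets list(-seq(n)) cover the n months before each row, so the current month is not part of the window.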
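The same compounding can be written in base R, which makes the arithmetic explicit. The sketch below assumes simple monthly returns and, unlike the offsets used with rollapply, a window that ends at (and includes) the current month. The helper names trailing_return and annualize are hypothetical, and annualizing with the exponent 12/n is the usual geometric convention rather than something stated in the answer.

```r
# Hypothetical helper: compounded return over the n months ending at each row
trailing_return <- function(r, n) {
  out <- rep(NA_real_, length(r))
  if (length(r) >= n) {
    for (i in n:length(r)) {
      out[i] <- prod(1 + r[(i - n + 1):i]) - 1
    }
  }
  out
}

# Hypothetical helper: convert an n-month cumulative return to a yearly rate
annualize <- function(cum_ret, n_months) {
  (1 + cum_ret)^(12 / n_months) - 1
}

r <- c(-0.0144, 0.0161, 0.0068, 0.0427, 0.0117, 0.0115)
four_month <- trailing_return(r, 4)
print(four_month)
```

With a long enough history, trailing_return(r, 36) gives the 3-year cumulative return and annualize(trailing_return(r, 36), 36) expresses it per year.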
Related
Compute euclidean distance for PCA in R
I did a PCA (using eigenvalue and smartPCA) and now I am trying to compute the Euclidean distance to one population. For example, in this dataset, I would like to compute in R the Euclidean distance of all populations to population 6.
Individual PCA1 PCA2 PCA3 PCA4 PCA5 PCA6 PCA7 PCA8 PCA9 PCA10
1: Pop1 0.0346 0.0095 -0.0022 -0.0018 -0.0033 0.0002 0.0042 -0.0003 0.0028 0.0268
2: Pop2 0.0370 0.0095 -0.0027 0.0015 -0.0027 -0.0024 0.0038 0.0012 0.0053 0.0210
3: Pop3 0.0379 0.0100 -0.0030 0.0021 -0.0017 -0.0043 0.0033 0.0005 0.0036 0.0144
4: Pop4 0.0352 0.0092 -0.0031 -0.0021 -0.0029 -0.0005 0.0038 -0.0003 0.0047 0.0349
5: Pop5 0.0342 0.0089 -0.0027 -0.0013 -0.0031 -0.0008 0.0032 -0.0017 -0.0009 0.0265
6: Pop6 0.0342 -0.0524 -0.0503 -0.1028 -0.4785 -0.0244 0.0279 0.0038 -0.0264 -0.0022 -0.0265
I found a thread on how to do it in Python but can't find how to do it in R. I tried dist(), but I don't understand the output or how to compare against only one population (I have 500+ populations in total). Thanks!
How to color the branches and tick labels in the heatmap.2?
I have made a heat map using the heatmap.2 function from gplots in R, but I don't know how to color the branches and tick labels by group (e.g. if I cut the tree into four groups, as in my second figure). I have checked that it is possible to color the dendrogram alone using the dendextend package. There is also a heatmap with a colored dendrogram here: selecting number of leaf nodes of dendrogram in heatmap.2 in R, but I can't apply it to my example. Can somebody help me with this issue?
Update
This is my heat map:
and I would like to have one like this, with branches and tick labels colored according to their four groups (this figure was edited in Illustrator to explain the question):
Here is the data and code that I have used:
Data
YEAR varA varB varC varD varE varF var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 var11 var12 var13 var14 var15 var16 var17 2005 1.175290887 1.535846033 1.531113178 -1.10297075 0.0284 26 -25.5470 -24.2101 24.7900 3.3345 0.0468 0.5058 0.0087 1.7378 0.0703 2.7070 0.0183 0.0340 0.0177 0.0176 0.0240 0.0015 0.0292 2004 0.834733204 0.64917365 -0.403174087 0.116169692 0.033 50 -24.4170 -22.2574 27.3400 3.4106 0.1151 0.5822 0.0085 1.8133 0.0762 3.2604 0.0114 0.0178 0.0086 0.0086 0.0824 0.0018 0.0308 2003 1.297607635 1.224946337 0.4486378 0.227557968 0.0544 181 -24.5080 -23.2790 27.4200 3.5092 0.1052 0.5239 0.0038 0.9815 0.0681 2.7465 0.0074 0.0099 0.0025 0.0025 0.0142 0.0015 0.0298 2002 1.043780072 0.650695815 -0.337133061 0.016766696 0.0374 227 -22.6110 -21.7828 30.0200 3.6270 0.1119 0.5753 0.0106 0.7916 0.0805 3.0434 0.0069 0.0086 0.0109 0.0108 0.0313 0.0017 0.0288 2001 0.781864124 0.534881678 -0.740527443 0.171745261 0.0074 20 -23.9170 -23.2327 3.8007 0.1243 0.6216 0.0553 1.2333 0.3414 2.9606 0.0074 0.0384 0.0079 0.0082 0.0570 0.0018 0.0360 2000 0.742528229 0.667207042 -0.614740091 0.189253192 0.0257 88 -22.6420 -21.4066 30.8900 3.1693 0.0287 0.6244 0.0070 1.0256 0.1336 2.7033 0.0063 0.0102 0.0185 0.0186 0.0248 0.0015
0.0278 1999 0.701222612 1.059869033 0.772334853 0.290190993 0.0476 312 -22.4730 -21.8328 26.6600 3.0578 0.0719 0.6363 0.0032 0.7183 0.0649 2.5445 0.0066 0.0070 0.0063 0.0063 0.0095 0.0016 0.0252 1998 0.904634938 1.16455833 0.646654191 0.086214161 0.0546 332 -23.2070 -22.4399 26.1400 3.2344 0.0656 0.7096 0.0046 0.6709 0.0718 2.5656 0.0072 0.0166 0.0132 0.0131 0.0144 0.0016 0.0275 1997 0.965775183 1.362520795 0.653268963 0.007038426 0.0791 509 -23.4830 -22.4253 26.0400 3.0278 0.0438 0.7575 0.0081 0.5002 0.0657 2.5755 0.0077 0.0072 0.0083 0.0083 0.0108 0.0017 0.0252 1996 0.956113049 1.439534042 0.618648101 -0.334351083 0.0411 245 -23.4290 -23.0417 27.3000 2.9331 0.0363 0.9229 0.0050 0.4819 0.1306 2.7239 0.0072 0.0166 0.0027 0.0026 0.0174 0.0018 0.0240 1995 1.786742729 1.732091021 2.654237394 0.190377371 0.0842 646 -22.7600 -22.0212 24.2100 3.1562 0.0202 1.1728 0.0072 0.6133 0.0772 3.1313 0.0080 0.0051 0.0035 0.0035 0.0055 0.0022 0.0266 1994 0.811695681 0.670904284 0.76646691 2.163378723 0.0394 203 -22.4920 -21.3677 28.6500 3.2475 0.0132 1.7476 0.0084 0.9386 0.1880 3.8856 0.0082 0.0120 0.0129 0.0129 0.0151 0.0026 0.0280 1993 0.754876913 0.302624208 -0.927234708 -0.108263802 0.017 66 -22.3880 -21.2900 32.8400 3.5853 0.0008 1.6626 0.0221 1.2307 0.4173 3.8864 0.0079 0.0379 0.0199 0.0196 0.0225 0.0028 0.0319 1992 0.723058507 0.818965047 -0.52053294 0.384656566 0.0345 155 -21.4920 -20.8724 32.2000 3.3116 0.0068 1.5673 0.0104 0.9411 0.2245 4.0228 0.0075 0.0123 0.0308 0.0306 0.0112 0.0027 0.0287 1991 1.024225427 0.71537408 0.22672288 -0.029575009 0.0297 235 -23.4850 -22.7000 27.8400 4.4024 0.0097 1.6126 0.0698 0.6344 0.2832 4.4160 0.0108 0.0127 0.0184 0.0184 0.0122 0.0030 0.0356 1990 0.873807193 1.168599747 1.317306687 -0.335682786 0.0533 172 -23.7170 -23.5029 25.8100 4.0497 0.0170 1.5207 0.0065 0.5232 0.1734 4.6765 0.0104 0.0164 0.0131 0.0130 0.0093 0.0030 0.0332 1989 0.71498833 1.065965836 0.650281646 -0.048038841 0.0663 214 -23.5000 -23.1053 26.3500 4.1139 0.0159 1.6162 
0.0096 0.5199 0.1426 4.7752 0.0106 0.0083 0.0098 0.0099 0.0076 0.0031 0.0341 1988 1.188282778 1.133076429 -0.167816244 0.448030288 0.007 64 -23.3750 -21.9900 29.3900 3.6893 0.0278 1.8392 0.0939 0.5658 1.2390 5.1103 0.0086 0.0775 0.0203 0.0202 0.0339 0.0034 0.0340 1987 0.788798159 0.276008942 -0.934596308 -0.039259431 0.012 65 -22.9540 -22.7758 28.3800 3.6375 0.0011 1.8331 0.0768 0.6187 0.6081 5.0475 0.0088 0.0554 0.0183 0.0180 0.0159 0.0038 0.0381 1986 0.757757883 1.395817348 0.455252572 -0.001274532 0.0125 47 -22.6120 -22.9011 29.7400 3.7060 0.0172 1.5279 0.0151 0.5897 0.6168 4.4917 0.0085 0.0160 0.0257 0.0256 0.0276 0.0033 0.0410 1985 1.128413419 0.321849225 -0.904189697 -0.05362552 0.0705 291 -22.7200 -21.9357 28.4100 3.5887 0.0100 1.4955 0.0022 0.3538 0.1471 4.3125 0.0091 0.0157 0.0042 0.0042 0.0041 0.0029 0.0292 1984 1.015352865 1.014625668 0.39294569 -0.267936245 0.0419 121 -23.5170 -23.1678 25.6200 4.5018 0.0353 1.8985 0.0022 0.3420 0.2620 4.9867 0.0113 0.0069 0.0058 0.0058 0.0051 0.0033 0.0356 1983 0.393985784 0.474743555 -0.368393191 -0.222845745 0.0161 49 -24.5600 -23.9514 30.5300 3.0978 0.0270 0.9467 0.0421 0.3287 0.5616 3.1256 0.0075 0.0553 0.0154 0.0155 0.0084 0.0022 0.0323 1982 0.503744901 0.524683063 -0.946225504 0.016766696 0.0118 10 -23.5970 -24.0037 30.3100 2.7288 0.0011 1.2154 0.0097 0.3022 0.8415 4.3594 0.0083 0.0254 0.0075 0.0076 0.0134 0.0029 0.0304 1981 0.872025585 1.496555573 0.658923526 -0.175816424 0.0489 343 -23.8320 -23.4716 28.1100 4.6585 0.0128 1.9205 0.0031 0.2999 0.2278 5.6588 0.0134 0.0067 0.0072 0.0071 0.0087 0.0036 0.0437 1980 2.165460373 3.419095697 3.741300435 0.250364758 0.0644 626 -24.5010 -24.0323 28.7300 3.8474 0.0122 1.4827 0.0019 0.2164 0.1859 4.3602 0.0104 0.0056 0.0050 0.0050 0.0064 0.0028 0.0337 1979 1.00201444 0.453601121 0.109577407 0.73158507 0.0281 301 -23.6070 -22.9149 27.9100 4.5765 0.0467 1.6919 0.0344 0.1940 0.3453 5.1064 0.0132 0.0162 0.0078 0.0077 0.0554 0.0032 0.0389 1978 0.829984787 0.2021646 -0.724630653 
-0.178430782 0.0000 1977 0.939170906 0.192142351 -1.029656979 0.50745842 0.0068 30 -24.3510 -22.5760 29.4900 6.1029 0.3417 2.4069 0.0938 0.2824 1.3937 6.6441 0.0136 0.0609 0.0395 0.0391 0.6074 0.0045 0.0591 1976 0.741090851 0.151474404 -0.439448642 0.359471579 0.056 396 -23.7450 -22.7680 28.3700 4.3464 0.0431 1.6901 0.0234 0.2937 0.2160 5.1366 0.0113 0.0147 0.0082 0.0081 0.0317 0.0034 0.0389 1975 1.061884929 0.396763153 -1.075320241 0.433356946 0.0299 322 -23.4320 -22.9732 25.7800 5.0301 0.1740 2.2028 0.0311 0.3131 0.4254 5.8683 0.0131 0.0160 0.0182 0.0182 0.2093 0.0038 0.0443 1974 1.052548763 0.491883924 0.28198823 -0.562241025 0.0215 267 -23.3350 -22.7075 26.4100 5.3407 0.1187 2.2436 0.0231 0.2984 0.5378 5.8795 0.0127 0.0208 0.0127 0.0128 0.0821 0.0038 0.0466 1973 0.519163031 1.120525721 0.960322396 -0.84893256 0.0129 49 -23.4350 -23.0556 31.3500 6.4341 0.1105 2.4298 0.0484 0.2783 0.9249 5.8779 0.0129 0.0428 0.0124 0.0123 0.1293 0.0038 0.0499 1972 0.703961551 1.359485416 -0.306513069 -1.150818704 0.0228 247 -23.7840 -23.3257 28.3000 6.3520 0.1096 2.6043 0.0439 0.4126 0.5335 6.3320 0.0154 0.0279 0.0061 0.0062 0.0874 0.0042 0.0593 1971 0.714252707 1.621333793 -1.065184704 0.003023451 0.0274 196 -23.2140 -22.2731 31.3800 5.1332 0.0873 1.9259 0.0872 0.3598 0.4714 4.9337 0.0112 0.0234 0.0073 0.0073 0.0688 0.0034 0.0426 1970 1.022643019 1.491401283 0.088239434 -0.973528472 0.025 206 -22.9870 -21.9506 30.6200 5.0770 0.0698 2.1145 0.1825 0.3537 0.4990 5.3274 0.0129 0.0873 0.0098 0.0098 0.0316 0.0040 0.0479 1969 2.157784838 1.796722133 0.731152565 -0.193891705 0.0547 505 -24.2820 -23.9048 26.2400 5.0183 0.0637 2.2673 0.0127 0.2893 0.2420 5.1038 0.0129 0.0244 0.0069 0.0069 0.0154 0.0037 0.0440 1968 0.913026742 1.271215847 0.196849717 -1.068149218 0.0132 112 -22.9850 -21.9397 32.2300 4.0568 0.0498 2.0576 0.0965 0.2188 0.9468 5.3597 0.0080 0.0513 0.0157 0.0154 0.0507 0.0039 0.0371 1967 0.749350643 0.439194622 -1.316546028 0.306149455 0.0209 196 -23.7020 -22.8580 30.5400 
4.5873 0.0703 1.9639 0.4981 0.2136 0.6086 5.1528 0.0100 0.0934 0.0103 0.0102 0.0235 0.0042 0.0415 1966 0.732785384 0.74795644 -0.681581292 1.265096245 0.0189 204 -23.3746 -22.7452 30.0600 4.8598 0.0542 1.8172 0.0437 0.2605 0.6557 5.2782 0.0131 0.0118 0.0081 0.0080 0.0203 0.0036 0.0418 1965 0.613725701 0.507953446 -1.91048851 0.825418348 0.0073 75 -24.2131 -22.5251 30.1900 5.5445 0.0691 1.9367 0.9303 0.2240 1.6461 5.5971 0.0119 0.1519 0.0318 0.0322 0.0436 0.0053 0.0467 1964 0.761469549 0.591007527 -0.715988774 -0.038091331 0.0000 1963 0.863218851 0.888615198 -0.331691877 -0.251436807 0.0123 121 -25.0690 -24.5964 27.4600 6.3232 0.0777 2.0383 0.1999 0.2465 0.9724 5.8291 0.0133 0.0349 0.0130 0.0131 0.0240 0.0044 0.0519 1962 1.194332086 1.123299319 1.400311402 -0.006545299 0.0296 250 -23.6850 -23.4588 29.3800 5.7280 0.0771 1.8900 0.0077 0.1952 0.4429 5.7635 0.0122 0.0047 0.0064 0.0063 0.0121 0.0041 0.0471 1961 0.685968021 0.396586649 -0.75076967 0.0168 201 -26.3352 -26.3457 5.5119 0.0726 1.9270 0.0180 0.1741 0.7887 5.7523 0.0121 0.0080 0.0119 0.0119 0.0208 0.0043 0.0496 1960 0.881343621 0.681729796 -0.466014418 0.0242 250 -25.5025 -25.2769 29.1200 6.5630 0.1133 2.2199 0.1176 0.2603 0.5894 6.4430 0.0159 0.0392 0.0062 0.0061 0.0308 0.0051 0.0647 1959 0.976463783 0.856497076 -0.769653776 0.0046 109 -24.9889 -25.0234 28.1000 7.4239 0.0760 3.3692 3.7315 0.4288 2.8041 7.8173 0.0178 0.6213 0.0559 0.0554 0.0902 0.0115 0.0722 1958 1.267054108 0.846073161 -0.698278256 0.0069 41 -24.5183 -25.8900 24.7200 8.4312 0.0602 3.1824 0.6086 0.4111 1.6313 7.3141 0.0165 0.0977 0.0280 0.0279 0.0575 0.0046 0.0709 1957 0.811849325 0.818326511 -1.087269506 0.0126 95 -23.4967 -23.5870 32.4900 5.6488 0.0761 2.6156 0.2207 0.4425 1.0305 7.3572 0.0159 0.0726 0.0380 0.0377 0.0437 0.0059 0.0573 1956 0.837065839 1.0007592 0.424525891 0.0115 76 -23.4403 -22.9419 32.1500 5.6087 0.0844 2.8347 0.3853 0.3125 1.1162 8.0455 0.0167 0.0696 0.0158 0.0157 0.0306 0.0058 0.0565 1955 2.044375189 1.828578166 0.0218 
128 -24.9729 -24.2108 26.9000 7.4702 0.1659 4.0858 0.2619 0.3952 0.7023 9.7602 0.0222 0.0635 0.0111 0.0111 0.0338 0.0070 0.0731 1954 0.737033129 1.060103924 0.0029 8 -25.6604 -25.1068 28.9700 7.8034 0.0884 4.0907 1.8003 0.4834 5.0243 8.9409 0.0243 0.4037 0.0541 0.0529 0.2932 0.0091 0.0813 1953 0.619590578 0.647436408 0.0075 109 31.0400 1952 0.671851137 1.325676852 0.00562 41 33.1100 1951 0.894632264 1.397998867 0.00374 95 35.1800 1950 0.793048089 0.55195169 0.00186 76 -24.6750 -24.0405 37.2500 6.8214 0.1632 3.3876 1.0452 0.4622 1.7704 7.9556 0.0223 0.2316 0.0594 0.0592 0.3935 0.0066 0.0673 1949 0.70029018 1.053010492 0.0061 23 -25.2148 -26.0272 31.0900 5.8770 0.0532 3.0895 0.1231 0.4304 2.1365 7.9355 0.0165 0.1047 0.0204 0.0201 0.0735 0.0060 0.0578 1948 1.051413064 0.611568416 0.0105 86 -25.9116 -25.3761 29.6500 4.0905 0.0930 2.3578 0.7431 0.1757 1.3103 7.2889 0.0122 0.1378 0.0138 0.0136 0.0408 0.0056 0.0441 1947 0.706745895 0.323498221 0.0108 129 -26.5485 -25.8733 29.7700 5.7245 0.1294 3.2072 0.0524 0.2021 1.2550 9.1257 0.0150 0.1170 0.0155 0.0155 0.0393 0.0060 0.0588 1946 1.550656194 1.598435187 0.0164 381 -27.4603 -26.6368 28.0600 5.8659 0.1405 2.7682 0.0353 0.2424 0.3504 8.4089 0.0130 0.0437 0.0075 0.0075 0.0176 0.0057 0.0516 1945 0.877065687 0.539494611 0.0199 169 -26.7543 -26.0271 24.5700 6.2789 0.1407 2.9213 0.0309 0.3404 0.2888 7.9661 0.0131 0.0460 0.0079 0.0079 0.0185 0.0054 0.0507 1944 0.630508563 0.833959181 0.0116 20 -26.8748 -25.0203 29.4600 7.8427 0.0963 3.3664 0.8484 0.4187 0.4954 6.6868 0.0172 0.1799 0.0114 0.0114 0.0185 0.0066 0.0697 1943 0.948762137 0.552892235 0.0392 309 -24.8697 -26.9799 24.9700 7.2577 0.1020 3.2354 0.1611 0.3774 0.7706 8.0918 0.0196 0.0457 0.0060 0.0060 0.0120 0.0055 0.0699 1942 0.950673449 1.135547963 0.0148 18 -22.5094 -22.8155 28.5600 7.6926 0.1348 3.3979 0.6492 0.3347 1.3499 8.7744 0.0190 0.1142 0.0095 0.0095 0.0208 0.0072 0.0710 1941 1.185071356 1.263733805 0.0107 10 -24.3510 -22.5329 29.8200 6.2710 0.1459 3.3306 0.0560 
0.3519 1.0068 9.4886 0.0179 0.0185 0.0196 0.0198 0.1190 0.0066 0.0613 1940 1.262322422 0.924262914 0.0168 133 -25.2962 -25.0828 26.2600 7.9568 0.1977 3.2329 0.0803 0.3561 3.2999 9.5743 0.0200 0.0232 0.0125 0.0125 0.0538 0.0065 0.0702 1939 1.114823086 1.548939022 0.0158 25 -25.5439 -24.3820 27.9800 4.2674 0.1624 2.3578 0.4553 0.3042 2.2656 7.3905 0.0087 0.0741 0.0100 0.0100 0.3075 0.0059 0.0413 1938 0.639727143 0.569847918 0.0115 5 -23.4696 -22.7480 5.0000 0.0751 2.6663 0.4021 0.2049 0.4997 7.9594 0.0121 0.0753 0.0093 0.0092 0.0819 0.0068 0.0485 1937 0.844930794 1.201811673 0.0269 13 -24.2616 -24.5915 25.9500 4.5623 0.0912 2.3393 0.0227 0.3172 0.2136 7.5512 0.0108 0.0093 0.0080 0.0079 0.1586 0.0049 0.0397 1936 0.603048989 0.528796963 0.0167 4 -23.4819 -23.1849 29.0200 7.1722 0.0600 2.7679 0.0126 0.2080 1.1025 7.5967 0.0175 0.0076 0.0094 0.0095 0.0608 0.0052 0.0569 1935 0.739921482 0.980951812 0.0369 402 -25.3542 -25.7692 30.5500 4.8218 0.0563 2.1489 0.0084 0.2337 1.3120 6.8994 0.0154 0.0044 0.0081 0.0081 0.0329 0.0047 0.0404 1934 0.936808475 1.350050919 0.0289 166 -26.1766 -24.8557 26.5700 4.2794 0.0626 2.1503 0.0112 0.3330 1.5501 6.8375 0.0072 0.0045 0.0248 0.0249 0.0818 0.0046 0.0362 1933 0.822006233 0.980858486 0.0187 215 -25.2825 -24.7483 27.0600 4.0682 0.0719 2.1376 0.0170 0.3042 3.6465 6.7130 0.0085 0.0074 0.0071 0.0071 0.0790 0.0047 0.0380 1932 1.128679304 1.122260931 0.0302 318 -26.5160 -24.7148 29.8100 3.4429 0.0475 2.1194 0.0111 0.2919 2.6147 7.5700 0.0093 0.0039 0.0069 0.0071 0.0472 0.0047 0.0336 1931 1.013960586 0.485124456 0.0189 13 -24.7074 -24.9517 30.7100 3.9828 0.0677 2.2806 0.0183 0.2268 3.7269 9.1548 0.0074 0.0089 0.0073 0.0073 0.0687 0.0057 0.0383 1930 1.148649752 1.029163891 0.0203 175 -26.8323 -26.0809 29.1800 3.0899 0.0697 3.5321 0.0158 0.3735 1.8765 13.0435 0.0121 0.0145 0.0103 0.0104 0.0397 0.0086 0.0506 1929 0.99387758 1.204846613 0.0376 104 -26.6411 -26.0890 28.1500 4.2733 0.0412 2.6675 0.0078 0.2893 0.1528 9.4824 0.0094 0.0112 0.0075 
0.0075 0.0083 0.0064 0.0354 1928 0.905609551 0.772378969 0.0331 233 -25.8461 -26.2246 32.3600 5.8361 0.0440 2.8293 0.0095 0.2231 0.1736 8.7255 0.0186 0.0087 0.0074 0.0075 0.0091 0.0063 0.0476 1927 0.85672722 0.215215241 0.0171 152 -25.9555 -25.9299 28.1500 8.1915 0.1054 2.9585 0.0298 0.2692 0.3361 7.8459 0.0158 0.0135 0.0113 0.0112 0.2221 0.0057 0.0717 1926 0.932350398 0.425876672 0.0165 132 -27.7161 -26.9161 22.1900 7.5864 0.0875 3.2115 0.0256 0.2381 0.3483 8.4859 0.0152 0.0123 0.0127 0.0127 0.1256 0.0061 0.0618 1925 0.809324244 0.603492919 0.0174 48 -24.5765 -24.8562 28.9600 6.3520 0.0226 2.7524 0.0175 0.2355 0.3303 7.8838 0.0120 0.0130 0.0096 0.0096 0.0174 0.0058 0.0534 1924 1.735408827 1.991986688 0.027 253 -25.9985 -24.8571 31.4900 6.1000 0.1097 2.6762 0.0284 0.2676 2.2755 7.9132 0.0158 0.0089 0.0107 0.0106 0.2161 0.0054 0.0668 1923 0.787925712 1.573404755 0.0203 150 -24.6288 -25.1568 29.9300 5.6860 0.0967 2.5993 0.0231 0.2137 3.8395 9.0800 0.0128 0.0101 0.0098 0.0098 0.1010 0.0060 0.0536 1922 0.799163043 0.0208 334 -24.4215 -24.3729 28.8900 5.3341 0.0924 2.6394 0.0133 0.2462 3.8226 7.8138 0.0114 0.0069 0.0149 0.0150 0.0729 0.0054 0.0497 1921 0.77243578 0.0226 443 -23.4421 -23.8877 29.4300 6.1139 0.0805 3.2761 0.0156 0.2522 4.2754 10.1551 0.0128 0.0040 0.0195 0.0197 0.1065 0.0067 0.0623 1920 0.787155209 0.0385 278 -24.2587 -23.9798 29.2400 5.9896 0.0727 3.0804 0.0110 0.2266 3.7709 9.9680 0.0133 0.0038 0.0268 0.0269 0.0544 0.0067 0.0567 1919 0.836725864 0.0276 341 -24.7950 -24.8537 27.3900 6.5779 0.0798 3.1646 0.0126 0.2276 4.7733 10.8125 0.0149 0.0052 0.0154 0.0154 0.0604 0.0073 0.0629 1918 0.838156697 0.0058 392 -25.9260 -24.5236 30.6200 6.0259 0.0939 3.5283 0.0448 0.4603 6.5956 12.5834 0.0114 0.0238 0.0598 0.0605 0.2763 0.0095 0.0823 1917 0.966249549 0.0208 58 -25.5352 -24.7604 28.3400 5.8498 0.0925 2.8573 0.0143 0.2275 3.3143 9.2387 0.0118 0.0090 0.0238 0.0239 0.0445 0.0065 0.0535 1916 1.352618036 0.0152 567 -24.0530 -23.6626 27.6400 6.3964 0.0549 3.1876 
0.0166 0.2559 6.1909 11.3232 0.0119 0.0088 0.0303 0.0302 0.0696 0.0078 0.0620 1915 0.56838431 0.0354 153 -23.6817 -23.9420 29.7600 5.9449 0.0494 3.1254 0.0118 0.2632 3.6600 10.8684 0.0125 0.0096 0.0234 0.0234 0.0455 0.0075 0.0580 1914 1.653698335 0.0096 355 -25.3230 -25.5543 30.4100 6.1042 0.0305 3.3067 0.0310 0.3592 11.7772 11.9468 0.0103 0.0189 0.0230 0.0230 0.0825 0.0083 0.0603 1913 0.673176646 0.018 479 -25.2734 -25.9128 31.0800 6.1167 0.1001 3.5575 0.0227 0.3392 8.3156 12.0722 0.0131 0.0069 0.0294 0.0291 0.0844 0.0083 0.0681 1912 1.168563731 0.0026 57 -25.4911 -25.0984 30.9900 8.2413 0.1793 5.4744 0.1320 0.7542 53.7132 17.0050 0.0120 0.1196 0.0562 0.0570 0.3436 0.0120 0.1118 1911 1.458277945 0.0119 43 -25.0742 -25.1744 29.2000 8.5525 0.0326 4.2884 0.0276 0.4920 13.5179 14.3376 0.0117 0.0126 0.0152 0.0153 0.0453 0.0096 0.0817 1910 1.653698335 0.0096 355 -25.3230 -25.5543 30.4100 6.1042 0.0305 3.3067 0.0310 0.3592 11.7772 11.9468 0.0103 0.0189 0.0230 0.0230 0.0825 0.0083 0.0603
Code
# reading data
test <- read.delim("clipboard", sep="")
rnames <- test[,1]
test <- data.matrix(test[,2:ncol(test)]) # to matrix
rownames(test) <- rnames
test <- scale(test, center=T, scale=T) # data standardization
test <- t(test) # transpose
## Creating a color palette & color breaks
my_palette <- colorRampPalette(c("forestgreen", "yellow", "red"))(n = 299)
col_breaks = c(seq(-1,-0.5,length=100), # forestgreen
  seq(-0.5,0.5,length=100), # yellow
  seq(0.5,1,length=100)) # red
# distance & hierarchical clustering
distance= dist(test, method ="euclidean")
hcluster = hclust(distance, method ="ward.D")
# Creating Heat Map
library(gplots) # provides heatmap.2
heatmap.2(test, main = paste( "test"),
  trace="none",
  margins =c(5,7),
  col=my_palette,
  breaks=col_breaks,
  dendrogram="row",
  Rowv = as.dendrogram(hcluster),
  Colv = "NA",
  key.xlab = "Concentration (index)",
  cexRow =0.6,
  cexCol = 0.8,
  na.rm = TRUE
)
Solution: use the color_branches function from the dendextend package (or the set function with the "branches_k_color", "k", and "value" parameters).
First we need to get the data into R and create the relevant objects (this part is the same as the code in the question):
test <- read.delim("clipboard", sep="")
rnames <- test[,1]
test <- data.matrix(test[,2:ncol(test)]) # to matrix
rownames(test) <- rnames
test <- scale(test, center=T, scale=T) # data standardization
test <- t(test) # transpose
## Creating a color palette & color breaks
my_palette <- colorRampPalette(c("forestgreen", "yellow", "red"))(n = 299)
col_breaks = c(seq(-1,-0.5,length=100), # forestgreen
  seq(-0.5,0.5,length=100), # yellow
  seq(0.5,1,length=100)) # red
# distance & hierarchical clustering
distance= dist(test, method ="euclidean")
hcluster = hclust(distance, method ="ward.D")
Next, we get the dendrogram and the heatmap ready:
dend1 <- as.dendrogram(hcluster)
# Get the dendextend package
if(!require(dendextend)) install.packages("dendextend")
library(dendextend)
# get some colors
cols_branches <- c("darkred", "forestgreen", "orange", "blue")
# Set the colors of 4 branches
dend1 <- color_branches(dend1, k = 4, col = cols_branches)
# or with:
# dend1 <- set(dend1, "branches_k_color", k = 4, value = cols_branches)
# get the colors of the tips of the dendrogram:
# col_labels <- cols_branches[cutree(dend1, k = 4)] # this may need tweaking in various cases - the following is a more general solution.
# The following code will work on its own once I upload dendextend 0.18.6 to CRAN - but that can
# take several good weeks until that happens. In the meantime
# Either use devtools::install_github('talgalili/dendextend')
# Or just the following:
source("https://raw.githubusercontent.com/talgalili/dendextend/master/R/attr_access.R")
col_labels <- get_leaves_branches_col(dend1)
# But due to the way heatmap.2 works - we need to fix it to be in the
# order of the data!
col_labels <- col_labels[order(order.dendrogram(dend1))]
# Creating Heat Map
if(!require(gplots)) install.packages("gplots")
library(gplots)
heatmap.2(test, main = paste( "test"),
  trace="none",
  margins =c(5,7),
  col=my_palette,
  breaks=col_breaks,
  dendrogram="row",
  Rowv = dend1,
  Colv = "NA",
  key.xlab = "Concentration (index)",
  cexRow =0.6,
  cexCol = 0.8,
  na.rm = TRUE,
  RowSideColors = col_labels, # to add nice colored strips
  colRow = col_labels # to add nice colored labels - only for gplots 2.17.0 and higher
)
Which produces this plot:
For more details on the package, you can have a look at its vignette.
P.S.: coloring the labels depends on the parameters of heatmap.2, so this should be requested from the maintainer of gplots (i.e. greg at warnes.net).
Update: this answer now includes the new "colRow" parameter in gplots 2.17.0.
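The reordering step col_labels[order(order.dendrogram(dend1))] can look cryptic. order.dendrogram() returns, for each leaf from left to right, the index of the original data row, and order() of that permutation is its inverse, which maps values stored in leaf order back to data order. A small base-R sketch of this on made-up data (the matrix m and all variable names are illustrative only):

```r
set.seed(1)
# Toy data: 8 rows with known names
m <- matrix(rnorm(40), nrow = 8, dimnames = list(paste0("row", 1:8), NULL))

hc <- hclust(dist(m), method = "ward.D")
dend <- as.dendrogram(hc)

# For each leaf (left to right), the index of the data row it represents
leaf_order <- order.dendrogram(dend)

# Something stored in leaf order, here simply the row names
labels_in_leaf_order <- rownames(m)[leaf_order]

# order() of a permutation is its inverse, so this restores data order
back_to_data_order <- labels_in_leaf_order[order(leaf_order)]
print(back_to_data_order)
```

The same indexing is what puts the leaf colors into the row order that RowSideColors and colRow expect.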
This is the maintainer of the gplots package. I've added two new arguments to the gplots::heatmap.2 function, 'colRow' and 'colCol', to control the colors of the row and column labels. These will be part of gplots 2.17.0, which should be submitted to CRAN in the next day or so.
Separate fields in R when no delimiter exists
I have a dataset like the following:
structure(list(Info = c("Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042 ",
"Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148 ",
"Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173 ",
"Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349 ",
"Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497 ",
"Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153 "
)), .Names = "Info", row.names = c(NA, 6L), class = "data.frame")
It currently has only one column and looks like this:
Info
1 Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042
2 Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148
3 Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173
4 Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349
5 Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497
6 Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153
I would like it to have 7 columns and look like this:
Species V1 V2 V3 V4 V5 V6
1 Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042
2 Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148
3 Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173
4 Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349
5 Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497
6 Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153
This problem has been giving me headaches because the species name is not always two words. The original text file is not delimited, so I have been unable to read it in as delimited data. I have only been able to get it in as a single column of strings. Does anyone have any suggestions?
Try using gsub to put a comma before every number in the "Info" column of a data frame, which we will assume is named "dat", and then re-read the result with read.csv:
> read.csv(text=gsub("( [-[:digit:].])", ",\\1", dat$Info), header=FALSE)
V1 V2 V3 V4 V5 V6 V7
1 Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042
2 Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148
3 Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173
4 Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349
5 Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497
6 Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153
Thank you for describing your use case. I might be able to use this myself in the future.
Suppose ds is your data:
ds <- structure(list(Info = c("Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042 ",
"Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148 ",
"Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173 ",
"Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349 ",
"Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497 ",
"Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153 "
)), .Names = "Info", row.names = c(NA, 6L), class = "data.frame")
You can then do something like
ds$Info <- gsub(" (-?[0-9])", ", \\1", ds$Info)
do.call(rbind, strsplit(ds$Info, ", "))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,] "Acacia melanoceras" "0.0369" "0.0427" "0.0267" "0.0298" "0.0501" "0.0042 "
#[2,] "Acalypha diversifolia van" "0.0670" "0.0439" "0.0281" "0.0427" "0.0464" "-0.0148 "
#[3,] "Acalypha macrostachya vin" "0.0657" "0.0621" "0.0441" "0.0522" "0.0473" "-0.0173 "
#[4,] "Adelia triloba" "0.0481" "0.0350" "0.0202" "0.0174" "0.0286" "-0.0349 "
#[5,] "Aegiphila panamensis" "0.0437" "0.0312" "0.0166" "0.0148" "0.0194" "-0.0497 "
#[6,] "Alchornea costaricensis" "0.0568" "0.0781" "0.0502" "0.0221" "0.0734" "-0.0153 "
You first look for a space followed by a number and insert a comma before it. Then you split the strings on the commas and combine the resulting vectors. After that you're nearly done: you can convert the object to a data.frame, convert the relevant columns to numeric, and add column names yourself.
EDIT: As seen in BondedDust's answer, using read.csv is much more elegant:
read.csv(text = ds$Info, header = FALSE)
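The final conversion that this answer leaves to the reader can be sketched in base R as follows. This is only an illustration on a cut-down character matrix of the same shape as the do.call(rbind, ...) result; the object names m and out are made up, and trimws() is used to drop the trailing space visible in the last column.

```r
# A small character matrix shaped like the rbind() result above
m <- rbind(
  c("Acacia melanoceras", "0.0369", "0.0427", "0.0042 "),
  c("Acalypha diversifolia van", "0.0670", "0.0439", "-0.0148 ")
)

# First column holds the species names
out <- data.frame(Species = m[, 1], stringsAsFactors = FALSE)

# Remaining columns become numeric V1, V2, ...
value_cols <- seq_len(ncol(m) - 1)
out[paste0("V", value_cols)] <- lapply(
  value_cols,
  function(j) as.numeric(trimws(m[, j + 1]))
)

str(out)
```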
Here's my suggestion:
1) Split by ' ',
2) paste the genus and species names together (I assume you have 6 numeric columns), and
3) make a (character) data.frame.
4) Finally, convert the columns to numeric and
5) set Species as the column name.
df <- structure(list(Info = c("Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042 ",
"Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148 ",
"Acalypha macrostachya vin 0.0657 0.0621 0.0441 0.0522 0.0473 -0.0173 ",
"Adelia triloba 0.0481 0.0350 0.0202 0.0174 0.0286 -0.0349 ",
"Aegiphila panamensis 0.0437 0.0312 0.0166 0.0148 0.0194 -0.0497 ",
"Alchornea costaricensis 0.0568 0.0781 0.0502 0.0221 0.0734 -0.0153 "
)), .Names = "Info", row.names = c(NA, 6L), class = "data.frame")
df
# split
sp <- strsplit(df$Info, ' ')
sp
# make (character) data.frame
require(plyr)
newdf <- ldply(sp, function(x) {
  l <- length(x)
  dta <- x[(l-5):l]
  spec <- paste(x[1:(l-6)], collapse = ' ')
  out <- c(spec, dta)
  return(out)
})
# make numeric cols
newdf[ , 2:7] <- apply(newdf[ , 2:7], 2, function(x) as.numeric(x))
names(newdf)[1] <- 'Species'
str(newdf)
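A variation on the answers above that needs no extra packages is to locate the first numeric field with regexpr() and cut each string at that point. The sketch below is only an illustration and assumes that species names never contain a space followed by a digit or a minus sign; the object names are made up.

```r
info <- c(
  "Acacia melanoceras 0.0369 0.0427 0.0267 0.0298 0.0501 0.0042 ",
  "Acalypha diversifolia van 0.0670 0.0439 0.0281 0.0427 0.0464 -0.0148 "
)

# Position of the first space that is followed by a number
pos <- regexpr(" -?[0-9]", info)

# Everything before that space is the species name
species <- substr(info, 1, pos - 1)

# Everything after it is whitespace-separated numbers
values <- read.table(text = substring(info, pos + 1))

result <- data.frame(Species = species, values, stringsAsFactors = FALSE)
str(result)
```

read.table() names the numeric columns V1 to V6 by default, which matches the layout requested in the question.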
Mapping spatial Distributions in R
My data set includes 17 stations, and for each station there are 24 hourly temperature values. I would like to map each station's value for a given hour, and do so for all the hours. What I want to do is something like the image. The data is in the following format:
N2 N3 N4 N5 N7 N8 N10 N12 N13 N14 N17 N19 N25 N28 N29 N31 N32 MeteoSwiss
1 1.300 -0.170 -0.344 2.138 0.684 0.656 0.882 0.684 1.822 1.214 2.046 2.432 0.208 0.312 0.530 0.358 0.264 -0.6
2 0.888 -0.534 -0.684 1.442 -0.178 -0.060 0.430 -0.148 1.420 0.286 1.444 2.138 -0.264 -0.042 0.398 -0.196 -0.148 -1.2
3 0.792 -0.564 -0.622 0.998 -0.320 1.858 -0.036 -0.118 1.476 0.110 0.964 2.048 -0.480 -0.434 0.040 -0.538 -0.322 -1.0
4 0.324 -1.022 -1.128 1.380 -0.792 1.042 -0.054 -0.158 1.518 -0.102 1.354 2.386 -0.708 -0.510 0.258 -0.696 -0.566 -0.8
5 0.650 -0.774 -0.982 1.124 -0.540 3.200 -0.052 -0.258 1.452 0.028 1.022 2.110 -0.714 -0.646 0.266 -0.768 -0.532 -0.4
6 0.670 -0.660 -0.844 1.248 -0.550 2.868 -0.098 -0.240 1.380 -0.012 1.164 2.324 -0.498 -0.474 0.860 -0.588 -0.324 -0.2
where N2, N3, ..., MeteoSwiss are the stations and each row holds the stations' temperature values for one hour. The station coordinates are:
id Longitude Latitude
2 7.1735 45.86880001
3 7.17254 45.86887001
4 7.171636 45.86923601
5 7.18018 45.87158001
7 7.177229 45.86923001
8 7.17524 45.86808001
10 7.179299 45.87020001
12 7.175189 45.86974001
13 7.179379 45.87081001
14 7.175509 45.86932001
17 7.18099 45.87262001
19 7.18122 45.87355001
25 7.15497 45.87058001
28 7.153399 45.86954001
29 7.152649 45.86992001
31 7.154419 45.87004001
32 7.156099 45.86983001
MeteoSwiss 7.184 45.896
I define a toy example more or less resembling your data:
vals <- matrix(rnorm(24*17), nrow=24)
cds <- data.frame(id=paste0('N', 1:17), Longitude=rnorm(n=17, mean=7.1), Latitude=rnorm(n=17, mean=45.8))
vals <- as.data.frame(t(vals))
names(vals) <- paste0('H', 1:24)
The sp package defines several classes and methods to store and display spatial data. For your example you should use the SpatialPointsDataFrame class:
library(sp)
mySP <- SpatialPointsDataFrame(coords=cds[,-1], data=data.frame(vals))
and the spplot method to display the information:
spplot(mySP, as.table=TRUE, col.regions=bpy.colors(10), alpha=0.8, edge.col='black')
You may also find the spacetime package useful (paper at JSS).
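The toy example transposes the values so that each station is a row, which is the layout SpatialPointsDataFrame needs. A rough base R sketch of that preparation step on a small subset of the question's data might look like the following. The names temps, coords, by_station and stations are hypothetical, and matching the "N2"-style column names to the numeric ids by stripping the "N" prefix is my assumption about how the two tables relate.

```r
# Subset of the question's data: 2 hours (rows) x 3 stations (columns)
temps <- data.frame(N2 = c(1.300, 0.888),
                    N3 = c(-0.170, -0.534),
                    N4 = c(-0.344, -0.684))

# Matching rows of the coordinate table from the question
coords <- data.frame(id = c(2, 3, 4),
                     Longitude = c(7.1735, 7.17254, 7.171636),
                     Latitude = c(45.86880001, 45.86887001, 45.86923601))

# Transpose: one row per station, one column per hour
by_station <- as.data.frame(t(temps))
names(by_station) <- paste0("H", seq_len(ncol(by_station)))

# Assumption: column "N2" corresponds to id 2, and so on
by_station$id <- as.numeric(sub("^N", "", rownames(by_station)))

# Join coordinates and hourly values by station id
stations <- merge(coords, by_station, by = "id")
stations
```

The Longitude and Latitude columns of stations could then serve as coords, and the H columns as data, in the SpatialPointsDataFrame call shown in the answer.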
pca in R with princomp() and using svd() [duplicate]
Possible Duplicate: Comparing svd and princomp in R
How can I perform PCA using two methods (princomp() and svd of the correlation matrix) in R?
I have a data set like:
438,498,3625,3645,5000,2918,5000,2351,2332,2643,1698,1687,1698,1717,1744,593,502,493,504,445,431,444,440,429,10
438,498,3625,3648,5000,2918,5000,2637,2332,2649,1695,1687,1695,1720,1744,592,502,493,504,449,431,444,443,429,10
438,498,3625,3629,5000,2918,5000,2637,2334,2643,1696,1687,1695,1717,1744,593,502,493,504,449,431,444,446,429,10
437,501,3625,3626,5000,2918,5000,2353,2334,2642,1730,1687,1695,1717,1744,593,502,493,504,449,431,444,444,429,10
438,498,3626,3629,5000,2918,5000,2640,2334,2639,1696,1687,1695,1717,1744,592,502,493,504,449,431,444,441,429,10
439,498,3626,3629,5000,2918,5000,2633,2334,2645,1705,1686,1694,1719,1744,589,502,493,504,446,431,444,444,430,10
440,5000,3627,3628,5000,2919,3028,2346,2330,2638,1727,1684,1692,1714,1745,588,501,492,504,451,433,446,444,432,10
444,5021,3631,3634,5000,2919,5000,2626,2327,2638,1698,1680,1688,1709,1740,595,500,491,503,453,436,448,444,436,10
451,5025,3635,3639,5000,2920,3027,2620,2323,2632,1706,1673,1681,1703,753,595,499,491,502,457,440,453,454,442,20
458,5022,3640,3644,5000,2922,5000,2346,2321,2628,1688,1666,1674,1696,744,590,496,490,498,462,444,458,461,449,20
465,525,3646,3670,5000,2923,5000,2611,2315,2631,1674,1658,1666,1688,735,593,495,488,497,467,449,462,469,457,20
473,533,3652,3676,5000,2925,5000,2607,2310,2623,1669,1651,1659,1684,729,578,496,487,498,469,454,467,476,465,20
481,544,3658,3678,5000,2926,5000,2606,2303,2619,1668,1643,1651,1275,723,581,495,486,497,477,459,472,484,472,20
484,544,3661,3665,5000,2928,5000,2321,2304,5022,1647,1639,1646,1270,757,623,493,484,495,480,461,474,485,476,20
484,532,3669,3662,2945,2926,5000,2326,2306,2620,1648,1639,1646,1270,760,533,493,483,494,507,461,473,486,476,20
482,520,3685,3664,2952,2927,5000,2981,2307,2329,1650,1640,1644,1268,757,533,492,482,492,513,459,474,485,474,20
481,522,3682,3661,2955,2927,2957,2984,1700,2622,1651,1641,1645,1272,761,530,492,482,492,513,462,486,483,473,20
480,525,3694,3664,2948,2926,2950,2995,1697,2619,1651,1642,1646,1269,762,530,493,482,492,516,462,486,483,473,20
481,515,5018,3664,2956,2927,2947,2993,1697,2622,1651,1641,1645,1269,765,592,489,482,495,531,462,499,483,473,20
479,5000,3696,3661,2953,2927,2944,2993,1702,2622,1649,1642,1645,1269,812,588,489,481,491,510,462,481,483,473,20
480,506,5019,3665,2941,2929,2945,2981,1700,2616,1652,1642,1645,1271,814,643,491,480,493,524,461,469,484,473,20
479,5000,5019,3661,2943,2930,2942,2996,1698,2312,1653,1642,1644,1274,811,617,491,479,491,575,461,465,484,473,20
479,5000,5020,3662,2945,2931,2942,2997,1700,2313,1654,1642,1644,1270,908,616,490,478,489,503,460,460,478,473,10
481,508,5021,3660,2954,2936,2946,2966,1705,2313,1654,1643,1643,1270,1689,678,493,477,483,497,467,459,476,473,10
486,510,522,3662,2958,2938,2939,2627,1707,2314,1659,1643,1639,1665,1702,696,516,476,477,547,465,457,470,474,10
479,521,520,3663,2954,2938,2941,2957,1712,2314,1660,1643,1638,1660,1758,688,534,475,475,489,461,456,465,474,10
480,554,521,3664,2954,2938,2941,2632,1715,2313,1660,1643,1637,1656,1761,687,553,475,474,558,462,453,465,476,10
481,511,5023,3665,2954,2937,2941,2627,1707,2312,1660,1641,1636,1655,1756,687,545,475,475,504,463,458,470,477,10
482,528,524,3665,2953,2937,2940,2629,1706,2312,1657,1640,1635,1654,1756,566,549,475,476,505,464,459,468,477,10
So I am doing this:
x <- read.csv("C:\\data_25_1000.txt",header=F,row.names=NULL)
p1 <- princomp(x, cor = TRUE) ## using correlation matrix
p1
Call:
princomp(x = x, cor = TRUE)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8    Comp.9   Comp.10   Comp.11   Comp.12   Comp.13   Comp.14   Comp.15   Comp.16
1.9800328 1.8321498 1.4147367 1.3045541 1.2016116 1.1708212 1.1424120 1.0134829 1.0045317 0.9078734 0.8442308 0.8093044 0.7977656 0.7661921 0.7370972 0.7075442
  Comp.17   Comp.18   Comp.19   Comp.20   Comp.21   Comp.22   Comp.23   Comp.24   Comp.25
0.7011462 0.6779179 0.6671614 0.6407627 0.6077336 0.5767217 0.5659030 0.5526520 0.5191375
25 variables and 1000 observations.
For the second method suppose I have the correlation matrix of "C:\data_25_1000.txt" which is:
1.0 0.3045 0.1448 -0.0714 -0.038 -0.0838 -0.1433 -0.1071 -0.1988 -0.1076 -0.0313 -0.157 -0.1032 -0.137 -0.0802 0.1244 0.0701 0.0457 -0.0634 0.0401 0.1643 0.3056 0.3956 0.4533 0.1557
0.3045 0.9999 0.3197 0.1328 0.093 -0.0846 -0.132 0.0046 -0.004 -0.0197 -0.1469 -0.1143 -0.2016 -0.1 -0.0316 0.0044 -0.0589 -0.0589 0.0277 0.0314 0.078 0.0104 0.0692 0.1858 0.0217
0.1448 0.3197 1 0.3487 0.2811 0.0786 -0.1421 -0.1326 -0.2056 -0.1109 0.0385 -0.1993 -0.1975 -0.1858 -0.1546 -0.0297 -0.0629 -0.0997 -0.0624 -0.0583 0.0316 0.0594 0.0941 0.0813 -0.1211
-0.0714 0.1328 0.3487 1 0.6033 0.2866 -0.246 -0.1201 -0.1975 -0.0929 -0.1071 -0.212 -0.3018 -0.3432 -0.2562 0.0277 -0.1363 -0.2218 -0.1443 -0.0322 -0.012 0.1741 -0.0725 -0.0528 -0.0937
-0.038 0.093 0.2811 0.6033 1 0.4613 0.016 0.0655 -0.1094 0.0026 -0.1152 -0.1692 -0.2047 -0.2508 -0.319 -0.0528 -0.1839 -0.2758 -0.2657 -0.1136 -0.0699 0.1433 -0.0136 -0.0409 -0.1538
-0.0838 -0.0846 0.0786 0.2866 0.4613 0.9999 0.2615 0.2449 0.1471 0.0042 -0.1496 -0.2025 -0.1669 -0.142 -0.1746 -0.1984 -0.2197 -0.2631 -0.2675 -0.1999 -0.1315 0.0469 0.0003 -0.1113 -0.1217
-0.1433 -0.132 -0.1421 -0.246 0.016 0.2615 1 0.3979 0.3108 0.1622 -0.0539 0.0231 0.1801 0.2129 0.1331 -0.1325 -0.0669 -0.0922 -0.1236 -0.1463 -0.1452 -0.2422 -0.0768 -0.1457 0.036
-0.1071 0.0046 -0.1326 -0.1201 0.0655 0.2449 0.3979 1 0.4244 0.3821 0.119 -0.0666 0.0163 0.0963 -0.0078 -0.1202 -0.204 -0.2257 -0.2569 -0.2334 -0.234 -0.2004 -0.138 -0.0735 -0.1442
-0.1988 -0.004 -0.2056 -0.1975 -0.1094 0.1471 0.3108 0.4244 0.9999 0.5459 0.0498 -0.052 0.0987 0.186 0.2576 -0.052 -0.1921 -0.2222 -0.1792 -0.0154 -0.058 -0.1868 -0.2232 -0.3118 0.0186
-0.1076 -0.0197 -0.1109 -0.0929 0.0026 0.0042 0.1622 0.3821 0.5459 0.9999 0.2416 0.0183 0.063 0.0252 0.186 0.0519 -0.1943 -0.2241 -0.2635 -0.0498 -0.0799 -0.0553 -0.1567 -0.2281 -0.0263
-0.0313 -0.1469 0.0385 -0.1071 -0.1152 -0.1496 -0.0539 0.119 0.0498 0.2416 1 0.2601 0.1625 -0.0091 -0.0633 0.0355 0.0397 -0.0288 -0.0768 -0.2144 -0.2581 0.1062 0.0469 -0.0608 -0.0578
-0.157 -0.1143 -0.1993 -0.212 -0.1692 -0.2025 0.0231 -0.0666 -0.052 0.0183 0.2601 0.9999 0.3685 0.3059 0.1269 -0.0302 0.1417 0.1678 0.2219 -0.0392 -0.2391 -0.2504 -0.2743 -0.1827 -0.0496
-0.1032 -0.2016 -0.1975 -0.3018 -0.2047 -0.1669 0.1801 0.0163 0.0987 0.063 0.1625 0.3685 1 0.6136 0.2301 -0.1158 0.0366 0.0965 0.1334 -0.0449 -0.1923 -0.2321 -0.1848 -0.1109 0.1007
-0.137 -0.1 -0.1858 -0.3432 -0.2508 -0.142 0.2129 0.0963 0.186 0.0252 -0.0091 0.3059 0.6136 1 0.4078 -0.0615 0.0607 0.1223 0.1379 0.0072 -0.1377 -0.3633 -0.2905 -0.1867 0.0277
-0.0802 -0.0316 -0.1546 -0.2562 -0.319 -0.1746 0.1331 -0.0078 0.2576 0.186 -0.0633 0.1269 0.2301 0.4078 1 0.0521 -0.0345 0.0444 0.0778 0.0925 0.0596 -0.2551 -0.1499 -0.2211 0.244
0.1244 0.0044 -0.0297 0.0277 -0.0528 -0.1984 -0.1325 -0.1202 -0.052 0.0519 0.0355 -0.0302 -0.1158 -0.0615 0.0521 1 0.295 0.2421 -0.06 0.0921 0.243 0.0953 0.0886 0.0518 -0.0032
0.0701 -0.0589 -0.0629 -0.1363 -0.1839 -0.2197 -0.0669 -0.204 -0.1921 -0.1943 0.0397 0.1417 0.0366 0.0607 -0.0345 0.295 0.9999 0.4832 0.2772 0.0012 0.1198 0.0411 0.1213 0.1409 0.0368
0.0457 -0.0589 -0.0997 -0.2218 -0.2758 -0.2631 -0.0922 -0.2257 -0.2222 -0.2241 -0.0288 0.1678 0.0965 0.1223 0.0444 0.2421 0.4832 1 0.2632 0.0576 0.0965 -0.0043 0.0818 0.102 0.0915
-0.0634 0.0277 -0.0624 -0.1443 -0.2657 -0.2675 -0.1236 -0.2569 -0.1792 -0.2635 -0.0768 0.2219 0.1334 0.1379 0.0778 -0.06 0.2772 0.2632 1 0.2036 -0.0452 -0.142 -0.0696 -0.0367 0.3039
0.0401 0.0314 -0.0583 -0.0322 -0.1136 -0.1999 -0.1463 -0.2334 -0.0154 -0.0498 -0.2144 -0.0392 -0.0449 0.0072 0.0925 0.0921 0.0012 0.0576 0.2036 0.9999 0.2198 0.1268 0.0294 0.0261 0.3231
0.1643 0.078 0.0316 -0.012 -0.0699 -0.1315 -0.1452 -0.234 -0.058 -0.0799 -0.2581 -0.2391 -0.1923 -0.1377 0.0596 0.243 0.1198 0.0965 -0.0452 0.2198 1 0.2667 0.2833 0.2467 0.0288
0.3056 0.0104 0.0594 0.1741 0.1433 0.0469 -0.2422 -0.2004 -0.1868 -0.0553 0.1062 -0.2504 -0.2321 -0.3633 -0.2551 0.0953 0.0411 -0.0043 -0.142 0.1268 0.2667 1 0.4872 0.3134 0.1663
0.3956 0.0692 0.0941 -0.0725 -0.0136 0.0003 -0.0768 -0.138 -0.2232 -0.1567 0.0469 -0.2743 -0.1848 -0.2905 -0.1499 0.0886 0.1213 0.0818 -0.0696 0.0294 0.2833 0.4872 0.9999 0.4208 0.1317
0.4533 0.1858 0.0813 -0.0528 -0.0409 -0.1113 -0.1457 -0.0735 -0.3118 -0.2281 -0.0608 -0.1827 -0.1109 -0.1867 -0.2211 0.0518 0.1409 0.102 -0.0367 0.0261 0.2467 0.3134 0.4208 1 0.0592
0.1557 0.0217 -0.1211 -0.0937 -0.1538 -0.1217 0.036 -0.1442 0.0186 -0.0263 -0.0578 -0.0496 0.1007 0.0277 0.244 -0.0032 0.0368 0.0915 0.3039 0.3231 0.0288 0.1663 0.1317 0.0592 0.9999
I have also computed the svd of this correlation matrix and got:
> s = svd(Correlation_25_1000)
$d
 [1] 3.9205298 3.3567729 2.0014799 1.7018614 1.4438704 1.3708223 1.3051053 1.0271475 1.0090840 0.8242341 0.7127256 0.6549736 0.6364299 0.5870503 0.5433123 0.5006188 0.4916060
[18] 0.4595726 0.4451043 0.4105769 0.3693401 0.3326079 0.3202462 0.3054243 0.2695037
$u
matrix
$v
matrix
My question is: how can I use $d, $u and $v to get the principal components? Could I use prcomp()? If so, how?
Try this one.
princomp
princomp(USArrests, cor = TRUE)$loadings
Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4
Murder   -0.536  0.418 -0.341  0.649
Assault  -0.583  0.188 -0.268 -0.743
UrbanPop -0.278 -0.873 -0.378  0.134
Rape     -0.543 -0.167  0.818
svd
svd(cor(USArrests))$u
           [,1]       [,2]       [,3]        [,4]
[1,] -0.5358995  0.4181809 -0.3412327  0.64922780
[2,] -0.5831836  0.1879856 -0.2681484 -0.74340748
[3,] -0.2781909 -0.8728062 -0.3780158  0.13387773
[4,] -0.5434321 -0.1673186  0.8177779  0.08902432
eigen
eigen(cor(USArrests))$vectors
           [,1]       [,2]       [,3]        [,4]
[1,] -0.5358995  0.4181809 -0.3412327  0.64922780
[2,] -0.5831836  0.1879856 -0.2681484 -0.74340748
[3,] -0.2781909 -0.8728062 -0.3780158  0.13387773
[4,] -0.5434321 -0.1673186  0.8177779  0.08902432
For a correlation matrix, princomp, svd, and eigen all produce the same results.
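To connect this back to the $d, $u and $v in the question, here is a short sketch on USArrests (not on the asker's file). It assumes the usual relationship that, for a correlation matrix, the square roots of $d are the component standard deviations and the columns of $u are the loadings. The names sdev_svd and scores_svd are mine.

```r
x <- USArrests
s <- svd(cor(x))

# Standard deviations of the components: square roots of the singular values
sdev_svd <- sqrt(s$d)

p1 <- princomp(x, cor = TRUE)   # eigen decomposition of the correlation matrix
p2 <- prcomp(x, scale. = TRUE)  # SVD of the centred and scaled data

all.equal(sdev_svd, unname(p1$sdev))
all.equal(sdev_svd, p2$sdev)

# Scores: project the standardised data onto the loadings in $u.
# The sign of each column is arbitrary, so compare absolute values.
scores_svd <- scale(x) %*% s$u
all.equal(abs(unname(scores_svd)), abs(unname(p2$x)))
```

So yes, prcomp() can be used: with scale. = TRUE it works on the standardised data directly and gives the same standard deviations. Note that princomp() standardises with divisor n rather than n - 1, so its scores differ from the ones above by a constant factor, while the loadings and standard deviations agree.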