I have a table of numbers and can plot a 3D histogram of it in Excel.
Here is my histogram in Excel:
How can I do the same in R with plot3d?
Their example uses three values (x, y, z) for each point.
Here are their dataset and histogram in R:
But I have only one value per bar.
My table:
-2.88 -1.76 -0.41 -2.25 -0.83 -0.62 -1.25 -2.68 -2.41 -1.74 -2.51 -0.78 -1.97 -2.67 -1.41 -1.56 0.49 -1.54 -1.37 -1.47 -2.32 0.66
-2.39 -1.98 -0.65 -2.33 -1.98 -1.19 -2.44 -2.13 -2.16 -2.44 -2.20 -1.77 -0.60 -0.73 -0.77 -1.59 -1.01 -1.37 -1.68 -0.92 -1.28 -0.12
-1.99 -2.48 -0.43 -1.75 -1.81 -2.37 -1.08 -1.18 -0.80 -3.30 -2.04 -1.96 -0.65 -2.44 -0.83 -1.67 -0.48 -1.03 -1.76 0.04 -1.30 -0.71
-2.73 -2.22 -0.98 -1.24 -2.21 -1.29 -1.37 -0.89 -0.86 -2.22 -1.32 -2.13 -1.04 -1.12 -0.60 -1.58 0.20 0.01 -1.81 -0.17 -0.38 -1.74
-1.63 -1.29 -1.31 -1.94 -2.39 -1.20 -1.66 -0.14 -0.96 -1.10 -0.40 -1.29 -0.44 -0.26 0.01 -2.71 -0.55 0.17 -3.44 -0.95 0.75 -1.08
-0.95 -0.15 -1.13 -1.18 -1.74 0.09 -1.12 -0.37 -0.80 -0.44 -1.18 -1.53 -1.28 0.36 -0.56 -1.54 -0.58 0.71 -1.53 -0.57 -0.91 -1.29
-0.67 0.02 -1.82 -0.84 -2.11 -0.38 -1.12 -0.57 -0.81 -1.04 -1.22 -0.93 -1.29 -0.26 0.02 -0.76 -0.28 -0.24 -0.43 -0.37 -1.30 -1.61
-3.45 -2.79 -0.44 -2.25 -0.81 -1.00 -1.20 -2.90 -1.96 -2.79 -2.91 -0.58 -1.65 -3.10 -1.23 -2.20 -0.15 -1.60 -1.51 -0.97 -2.35 0.38
-3.03 -3.12 -0.62 -2.01 -2.25 -1.84 -2.29 -2.51 -1.86 -2.93 -2.32 -1.63 -0.35 -1.05 -1.09 -2.04 -0.79 -1.18 -2.39 -0.54 -0.60 -0.71
-2.78 -2.60 -0.49 -1.69 -1.96 -2.10 -1.70 -1.26 -0.37 -2.80 -2.40 -2.23 -0.61 -2.26 -0.80 -2.11 -0.17 -0.21 -2.61 -0.09 -1.18 -1.26
-3.13 -1.96 -1.19 -1.17 -2.76 -0.87 -1.96 -0.22 -0.49 -2.75 -1.81 -2.48 -1.26 -1.04 0.08 -2.52 0.21 0.80 -2.28 -0.14 -0.27 -1.69
-1.52 -1.85 -1.36 -1.42 -2.28 -0.49 -1.58 -0.34 -1.11 -0.59 -0.74 -1.63 -0.58 -0.23 0.12 -2.97 0.17 0.68 -3.14 -0.64 0.21 -1.70
-1.05 -0.42 -1.50 -1.46 -2.32 -0.57 -0.63 -0.17 -0.79 -0.92 -1.52 -1.69 -1.25 0.34 -0.46 -1.94 0.27 0.82 -1.48 0.35 -1.25 -1.89
-1.03 0.28 -1.39 -0.82 -2.44 -0.75 -0.86 -0.69 -1.07 -1.38 -1.46 -1.09 -1.71 -0.50 0.59 -1.42 -0.54 -0.13 -0.86 -0.14 -1.28 -1.84
UPD:
I tried inserting the full dataset into one of the examples, just to see how plot3 handles a huge number of bars. It gets pretty stuck.
I also don't see the negative bars. I expected positive bars to appear above 0 and negative bars below it, as in my first picture.
So I realize that I first need to render a large amount of data to be able to choose the right library.
I also assume that fully real-time 3D rendering may be impossible for that amount of data, so it is fine if the library renders just a single picture, as hist3d does.
m <- structure(c(-2.88, -1.76, -0.41, -2.25, -0.83, -0.62, -1.25, -2.68, -2.41, -1.74, -2.51, -0.78, -1.97, -2.67, -1.41, -1.56, 0.49, -1.54, -1.37, -1.47, -2.32, 0.66,
-2.39, -1.98, -0.65, -2.33, -1.98, -1.19, -2.44, -2.13, -2.16, -2.44, -2.20, -1.77, -0.60, -0.73, -0.77, -1.59, -1.01, -1.37, -1.68, -0.92, -1.28, -0.12,
-1.99, -2.48, -0.43, -1.75, -1.81, -2.37, -1.08, -1.18, -0.80, -3.30, -2.04, -1.96, -0.65, -2.44, -0.83, -1.67, -0.48, -1.03, -1.76, 0.04, -1.30, -0.71,
<=-=-=-=-=-=-=-=-=-=-=-skipped ==============>>
-2.64, -0.89, -1.60, -2.28, -3.56, -0.84, 0.31, 0.48, -0.31, 0.03, -2.42, 0.92, -3.10, -2.35, 0.03, -2.56, -0.91, 1.01, -5.90, -0.40, 2.95, -1.32,
-3.06, -0.69, -0.74, -2.46, -4.16, 0.46, 0.97, 0.46, -0.47, -0.79, -3.12, 1.09, -3.53, -1.08, -0.25, -1.26, -0.57, 0.67, -4.76, 0.01, -0.08, -1.56,
-2.70, -0.89, -0.97, -2.40, -5.45, -1.26, 1.65, 0.24, -1.60, -1.79, -2.05, 0.18, -3.01, -0.39, 0.47, -2.21, -0.50, 0.77, -3.05, 0.81, -0.36, -1.98), .Dim = c(700L, 22L))
library(graph3d)
dat <- cbind(
expand.grid(x = 1:700, y = 1:22),
z = c(m)
)
graph3d(
dat,
~x, ~y, ~z,
type = "bar"
)
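One thing worth checking in the snippet above: if the values were pasted row by row (the first pasted line matches the first row of the table), then structure(..., .Dim = ...) will scramble them, because R fills matrices column by column. A minimal sketch with only the first three values of the first two table rows, assuming row-wise input:

```r
# First three values of the first two table rows, typed row by row
vals <- c(-2.88, -1.76, -0.41,
          -2.39, -1.98, -0.65)

# structure(.Dim = ...) fills column by column, so the rows get mixed up
m_wrong <- structure(vals, .Dim = c(2L, 3L))

# matrix(..., byrow = TRUE) keeps each typed line as one matrix row
m <- matrix(vals, nrow = 2, byrow = TRUE)

# Long format for bar plotting: x (row index) varies fastest, matching c(m)
dat <- cbind(
  expand.grid(x = seq_len(nrow(m)), y = seq_len(ncol(m))),
  z = c(m)
)
```

With byrow = TRUE, m[1, 2] is -1.76 as in the table, whereas m_wrong[1, 2] ends up as -0.41.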
Please help me plot a histogram from the full txt file, with positive bars going up and negative bars going down.
My full txt file is here: https://pastebin.com/2zyyRDy8
I've read my txt file into res_cut, but its data structure differs from your examples: mine has 700 obs. of 23 variables.
res_cut <- read.delim("d:/result_cut.txt",sep = "\t", header = FALSE)
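A possible way to get from res_cut to the plain numeric matrix that the examples use is sketched below. It assumes (this is a guess, not something confirmed by the file) that the 23rd variable is an empty column caused by a trailing tab on each line. The two-line text is a small stand-in for the real file.

```r
# Tiny stand-in for the tab-separated file; each line ends with a tab
txt <- "-2.88\t-1.76\t-0.41\t\n-2.39\t-1.98\t-0.65\t\n"
res_cut <- read.delim(text = txt, sep = "\t", header = FALSE)

# Drop columns that contain only NA (assumed to come from the trailing tab)
all_na <- vapply(res_cut, function(col) all(is.na(col)), logical(1))
m <- as.matrix(res_cut[, !all_na, drop = FALSE])
dimnames(m) <- NULL

# m is now a numeric matrix and could be passed on, e.g. plot3D::hist3D(z = m)
```

If the extra column turns out to hold real data instead, inspect it with str(res_cut) before dropping anything.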
With the graph3d package:
m <- structure(c(-2.88, -2.39, -1.99, -2.73, -1.63, -0.95, -0.67,
-3.45, -3.03, -2.78, -3.13, -1.52, -1.05, -1.03, -1.76, -1.98,
-2.48, -2.22, -1.29, -0.15, 0.02, -2.79, -3.12, -2.6, -1.96,
-1.85, -0.42, 0.28, -0.41, -0.65, -0.43, -0.98, -1.31, -1.13,
-1.82, -0.44, -0.62, -0.49, -1.19, -1.36, -1.5, -1.39, -2.25,
-2.33, -1.75, -1.24, -1.94, -1.18, -0.84, -2.25, -2.01, -1.69,
-1.17, -1.42, -1.46, -0.82, -0.83, -1.98, -1.81, -2.21, -2.39,
-1.74, -2.11, -0.81, -2.25, -1.96, -2.76, -2.28, -2.32, -2.44,
-0.62, -1.19, -2.37, -1.29, -1.2, 0.09, -0.38, -1, -1.84, -2.1,
-0.87, -0.49, -0.57, -0.75, -1.25, -2.44, -1.08, -1.37, -1.66,
-1.12, -1.12, -1.2, -2.29, -1.7, -1.96, -1.58, -0.63, -0.86,
-2.68, -2.13, -1.18, -0.89, -0.14, -0.37, -0.57, -2.9, -2.51,
-1.26, -0.22, -0.34, -0.17, -0.69, -2.41, -2.16, -0.8, -0.86,
-0.96, -0.8, -0.81, -1.96, -1.86, -0.37, -0.49, -1.11, -0.79,
-1.07, -1.74, -2.44, -3.3, -2.22, -1.1, -0.44, -1.04, -2.79,
-2.93, -2.8, -2.75, -0.59, -0.92, -1.38, -2.51, -2.2, -2.04,
-1.32, -0.4, -1.18, -1.22, -2.91, -2.32, -2.4, -1.81, -0.74,
-1.52, -1.46, -0.78, -1.77, -1.96, -2.13, -1.29, -1.53, -0.93,
-0.58, -1.63, -2.23, -2.48, -1.63, -1.69, -1.09, -1.97, -0.6,
-0.65, -1.04, -0.44, -1.28, -1.29, -1.65, -0.35, -0.61, -1.26,
-0.58, -1.25, -1.71, -2.67, -0.73, -2.44, -1.12, -0.26, 0.36,
-0.26, -3.1, -1.05, -2.26, -1.04, -0.23, 0.34, -0.5, -1.41, -0.77,
-0.83, -0.6, 0.01, -0.56, 0.02, -1.23, -1.09, -0.8, 0.08, 0.12,
-0.46, 0.59, -1.56, -1.59, -1.67, -1.58, -2.71, -1.54, -0.76,
-2.2, -2.04, -2.11, -2.52, -2.97, -1.94, -1.42, 0.49, -1.01,
-0.48, 0.2, -0.55, -0.58, -0.28, -0.15, -0.79, -0.17, 0.21, 0.17,
0.27, -0.54, -1.54, -1.37, -1.03, 0.01, 0.17, 0.71, -0.24, -1.6,
-1.18, -0.21, 0.8, 0.68, 0.82, -0.13, -1.37, -1.68, -1.76, -1.81,
-3.44, -1.53, -0.43, -1.51, -2.39, -2.61, -2.28, -3.14, -1.48,
-0.86, -1.47, -0.92, 0.04, -0.17, -0.95, -0.57, -0.37, -0.97,
-0.54, -0.09, -0.14, -0.64, 0.35, -0.14, -2.32, -1.28, -1.3,
-0.38, 0.75, -0.91, -1.3, -2.35, -0.6, -1.18, -0.27, 0.21, -1.25,
-1.28, 0.66, -0.12, -0.71, -1.74, -1.08, -1.29, -1.61, 0.38,
-0.71, -1.26, -1.69, -1.7, -1.89, -1.84), .Dim = c(14L, 22L))
library(graph3d)
dat <- cbind(
expand.grid(x = 1:14, y = 1:22),
z = c(m)
)
graph3d(
dat,
~x, ~y, ~z,
type = "bar"
)
You could use hist3D from the plot3D package with the z parameter:
m <- structure(c(-2.88, -2.39, -1.99, -2.73, -1.63, -0.95, -0.67,
-3.45, -3.03, -2.78, -3.13, -1.52, -1.05, -1.03, -1.76, -1.98,
-2.48, -2.22, -1.29, -0.15, 0.02, -2.79, -3.12, -2.6, -1.96,
-1.85, -0.42, 0.28, -0.41, -0.65, -0.43, -0.98, -1.31, -1.13,
-1.82, -0.44, -0.62, -0.49, -1.19, -1.36, -1.5, -1.39, -2.25,
-2.33, -1.75, -1.24, -1.94, -1.18, -0.84, -2.25, -2.01, -1.69,
-1.17, -1.42, -1.46, -0.82, -0.83, -1.98, -1.81, -2.21, -2.39,
-1.74, -2.11, -0.81, -2.25, -1.96, -2.76, -2.28, -2.32, -2.44,
-0.62, -1.19, -2.37, -1.29, -1.2, 0.09, -0.38, -1, -1.84, -2.1,
-0.87, -0.49, -0.57, -0.75, -1.25, -2.44, -1.08, -1.37, -1.66,
-1.12, -1.12, -1.2, -2.29, -1.7, -1.96, -1.58, -0.63, -0.86,
-2.68, -2.13, -1.18, -0.89, -0.14, -0.37, -0.57, -2.9, -2.51,
-1.26, -0.22, -0.34, -0.17, -0.69, -2.41, -2.16, -0.8, -0.86,
-0.96, -0.8, -0.81, -1.96, -1.86, -0.37, -0.49, -1.11, -0.79,
-1.07, -1.74, -2.44, -3.3, -2.22, -1.1, -0.44, -1.04, -2.79,
-2.93, -2.8, -2.75, -0.59, -0.92, -1.38, -2.51, -2.2, -2.04,
-1.32, -0.4, -1.18, -1.22, -2.91, -2.32, -2.4, -1.81, -0.74,
-1.52, -1.46, -0.78, -1.77, -1.96, -2.13, -1.29, -1.53, -0.93,
-0.58, -1.63, -2.23, -2.48, -1.63, -1.69, -1.09, -1.97, -0.6,
-0.65, -1.04, -0.44, -1.28, -1.29, -1.65, -0.35, -0.61, -1.26,
-0.58, -1.25, -1.71, -2.67, -0.73, -2.44, -1.12, -0.26, 0.36,
-0.26, -3.1, -1.05, -2.26, -1.04, -0.23, 0.34, -0.5, -1.41, -0.77,
-0.83, -0.6, 0.01, -0.56, 0.02, -1.23, -1.09, -0.8, 0.08, 0.12,
-0.46, 0.59, -1.56, -1.59, -1.67, -1.58, -2.71, -1.54, -0.76,
-2.2, -2.04, -2.11, -2.52, -2.97, -1.94, -1.42, 0.49, -1.01,
-0.48, 0.2, -0.55, -0.58, -0.28, -0.15, -0.79, -0.17, 0.21, 0.17,
0.27, -0.54, -1.54, -1.37, -1.03, 0.01, 0.17, 0.71, -0.24, -1.6,
-1.18, -0.21, 0.8, 0.68, 0.82, -0.13, -1.37, -1.68, -1.76, -1.81,
-3.44, -1.53, -0.43, -1.51, -2.39, -2.61, -2.28, -3.14, -1.48,
-0.86, -1.47, -0.92, 0.04, -0.17, -0.95, -0.57, -0.37, -0.97,
-0.54, -0.09, -0.14, -0.64, 0.35, -0.14, -2.32, -1.28, -1.3,
-0.38, 0.75, -0.91, -1.3, -2.35, -0.6, -1.18, -0.27, 0.21, -1.25,
-1.28, 0.66, -0.12, -0.71, -1.74, -1.08, -1.29, -1.61, 0.38,
-0.71, -1.26, -1.69, -1.7, -1.89, -1.84), .Dim = c(14L, 22L))
plot3D::hist3D(z=m)
Related
I used a for loop to create a correlation matrix because I needed polychor to generate polychoric correlations, and I was only able to get polychor to correlate two variables at a time. Anyway, I created my own correlation table with the following code:
for(i in 1:ncol(gd2)) {
for (j in 1:ncol(gd2)) {
corVal
The table looks like this:
head(dtnew)
Better Afraid Alive Bored Drop Empty Energy Happy Help Home Hope Memory Satis Spirit Worth TOT
1: 1.00 0.32 0.29 0.39 0.36 0.46 0.25 0.43 0.39 0.13 0.46 0.39 0.50 0.45 0.48 0.67
2: 0.32 1.00 0.25 0.20 0.24 0.30 0.23 0.30 0.43 0.15 0.44 0.28 0.31 0.29 0.34 0.62
3: 0.29 0.25 1.00 0.26 0.28 0.46 0.38 0.60 0.35 0.19 0.41 0.10 0.49 0.53 0.43 0.65
4: 0.39 0.20 0.26 1.00 0.36 0.56 0.31 0.36 0.39 0.16 0.32 0.23 0.39 0.35 0.44 0.67
5: 0.36 0.24 0.28 0.36 1.00 0.44 0.41 0.37 0.43 0.31 0.35 0.22 0.42 0.37 0.40 0.72
6: 0.46 0.30 0.46 0.56 0.44 1.00 0.32 0.55 0.51 0.18 0.45 0.17 0.62 0.52 0.64 0.75
>
But longer.
Here is the dput()
structure(list(Better = c(1, 0.32, 0.29, 0.39, 0.36, 0.46, 0.25,
0.43, 0.39, 0.13, 0.46, 0.39, 0.5, 0.45, 0.48, 0.67), Afraid = c(0.32,
1, 0.25, 0.2, 0.24, 0.3, 0.23, 0.3, 0.43, 0.15, 0.44, 0.28, 0.31,
0.29, 0.34, 0.62), Alive = c(0.29, 0.25, 1, 0.26, 0.28, 0.46,
0.38, 0.6, 0.35, 0.19, 0.41, 0.1, 0.49, 0.53, 0.43, 0.65), Bored = c(0.39,
0.2, 0.26, 1, 0.36, 0.56, 0.31, 0.36, 0.39, 0.16, 0.32, 0.23,
0.39, 0.35, 0.44, 0.67), Drop = c(0.36, 0.24, 0.28, 0.36, 1,
0.44, 0.41, 0.37, 0.43, 0.31, 0.35, 0.22, 0.42, 0.37, 0.4, 0.72
), Empty = c(0.46, 0.3, 0.46, 0.56, 0.44, 1, 0.32, 0.55, 0.51,
0.18, 0.45, 0.17, 0.62, 0.52, 0.64, 0.75), Energy = c(0.25, 0.23,
0.38, 0.31, 0.41, 0.32, 1, 0.48, 0.37, 0.36, 0.31, 0.14, 0.4,
0.43, 0.38, 0.74), Happy = c(0.43, 0.3, 0.6, 0.36, 0.37, 0.55,
0.48, 1, 0.45, 0.21, 0.49, 0.22, 0.69, 0.84, 0.49, 0.8), Help = c(0.39,
0.43, 0.35, 0.39, 0.43, 0.51, 0.37, 0.45, 1, 0.2, 0.51, 0.32,
0.5, 0.44, 0.6, 0.73), Home = c(0.13, 0.15, 0.19, 0.16, 0.31,
0.18, 0.36, 0.21, 0.2, 1, 0.23, 0.13, 0.13, 0.15, 0.26, 0.63),
Hope = c(0.46, 0.44, 0.41, 0.32, 0.35, 0.45, 0.31, 0.49,
0.51, 0.23, 1, 0.38, 0.48, 0.47, 0.59, 0.73), Memory = c(0.39,
0.28, 0.1, 0.23, 0.22, 0.17, 0.14, 0.22, 0.32, 0.13, 0.38,
1, 0.25, 0.24, 0.31, 0.66), Satis = c(0.5, 0.31, 0.49, 0.39,
0.42, 0.62, 0.4, 0.69, 0.5, 0.13, 0.48, 0.25, 1, 0.66, 0.6,
0.78), Spirit = c(0.45, 0.29, 0.53, 0.35, 0.37, 0.52, 0.43,
0.84, 0.44, 0.15, 0.47, 0.24, 0.66, 1, 0.51, 0.77), Worth = c(0.48,
0.34, 0.43, 0.44, 0.4, 0.64, 0.38, 0.49, 0.6, 0.26, 0.59,
0.31, 0.6, 0.51, 1, 0.77), TOT = c(0.67, 0.62, 0.65, 0.67,
0.72, 0.75, 0.74, 0.8, 0.73, 0.63, 0.73, 0.66, 0.78, 0.77,
0.77, 0.89)), row.names = c(NA, -16L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x000001d7adc21ef0>)
I would like to generate a visual using corrplot. However, when I try, I get an error:
Error in is.finite(tmp) : default method not implemented for type 'list'
My data is indeed of type list. I have tried using 'unlist'. Not sure what else to try.
There is a problem with your dput() output, possibly because you have a data.table. I can read it by deleting ", .internal.selfref = <pointer: 0x000001d7adc21ef0>" from the last line so that it ends with class = c("data.table", "data.frame")). Printing that out shows a problem with the last row/column (TOT): the bottom value in that column should be 1.00, but it is 0.89. We can trim that row and column and use as.matrix (my mistake in the earlier comment) to convert the data frame:
gd3 <- gd2[-16, -16]
corrplot(as.matrix(gd3))
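To see why the conversion matters, here is a small base R sketch. The three-variable table dt is a made-up stand-in for the real one; the point is only that a data frame is a list of columns, while as.matrix() returns a numeric matrix.

```r
# Small correlation table stored as a data frame (stand-in for the real table)
dt <- data.frame(Better = c(1, 0.32, 0.29),
                 Afraid = c(0.32, 1, 0.25),
                 Alive  = c(0.29, 0.25, 1))

# A data frame is a list of columns, which is.finite() cannot handle
err <- tryCatch(is.finite(dt), error = function(e) conditionMessage(e))

# as.matrix() yields a plain numeric matrix; reuse column names as row names
M <- as.matrix(dt)
rownames(M) <- colnames(M)
```

Here err holds the error message raised for the list input, and M is the kind of object a function expecting a correlation matrix can work with.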
library(corrplot)
M <- cor(df)
head(round(M,2))
corrplot(M, method="number")
I'm having difficulty doing a canonical correlation (CC) analysis in R.
The assignment which I'm doing is from "Applied Multivariate Analysis" by Sharma, exercise 13.7, if you're familiar with it.
Basically, I'm asked to conduct a CCA on a set of variables. There are seven X variables, but only five Y variables, thus R complains that the dimensions are not compatible. See the image below for a visual representation of the data called CETNEW.
Edited (Changed from image to dput):
structure(list(
`...1` = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "Y1", "Y2", "Y3", "Y4", "Y5"),
`...2` = c(2.72, 1.2, 0.82, 0.92, 1.19, 1, 1.45, 0.68, 0.98, 0.57, 1.07, 0.91),
`...3` = c(1.2, 3.78, 0.7, 1.04, 1.06, 1.32, 1.31, 0.56, 1, 0.79, 1.13, 1.38),
`...4` = c(0.82, 0.7, 1.7, 0.59, 0.83, 1.08, 1.01, 0.65, 0.78, 0.66, 0.93, 0.77),
`...5` = c(0.92, 1.04, 0.59, 3.09, 1.06, 0.93, 1.47, 0.62, 1.26, 0.51, 0.94, 0.85),
`...6` = c(1.19, 1.06, 0.83, 1.06, 2.94, 1.36, 1.66, 0.68, 1.16, 0.77, 1.37, 1.11),
`...7` = c(1, 1.32, 1.08, 0.93, 1.36, 2.94, 1.56, 0.9, 1.23, 0.78, 1.65, 1.31),
`...8` = c(1.45, 1.31, 1.01, 1.47, 1.66, 1.56, 3.11, 1.03, 1.7, 0.81, 1.63, 1.44),
`...9` = c(0.68, 0.56, 0.65, 0.62, 0.68, 0.9, 1.03, 1.71, 0.99, 0.65, 0.86, 0.72),
`...10` = c(0.98, 1, 0.78, 1.26, 1.16, 1.23, 1.7, 0.99, 3.07, 0.61, 1.43, 1.28),
`...11` = c(0.57, 0.79, 0.66, 0.51, 0.77, 0.78, 0.81, 0.65, 0.61, 2.83, 1.04, 0.84),
`...12` = c(1.07, 1.13, 0.93, 0.94, 1.37, 1.65, 1.63, 0.86, 1.43, 1.04, 2.83, 1.6),
`...13` = c(0.91, 1.38, 0.77, 0.85, 1.11, 1.31, 1.44, 0.72, 1.28, 0.84, 1.6, 4.01)),
row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"))
What I've Done so Far
CETNEW <- CETNEW[,-1] #To remove the non-numeric values
Create two variables (criterion and predictor variables) as:
CETNEWx <- CETNEW[1:7,]
CETNEWy <- CETNEW[8:12,]
Then I've been using various packages such as CCA, CCP and candisc. From CCA:
ccCETNEW <- cc(CETNEWx,CETNEWy)
Yields the following error message:
Error in cov(X, Y, use = "pairwise") : incompatible dimensions
The matcor function also from CCA, yields the following error message:
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 7, 5
Thus, it would seem that it all boils down to the differing dimensions. I've talked to my professor about it, but since he uses SAS, which apparently handles this situation without complaint, he could not help me.
Please, if you're familiar with canonical correlation and have had a similar problem before, any help regarding this topic is highly appreciated.
If you look at your data, notice the first column is divided into X and Y labels. That suggests to me that your data are transposed. If so, each column is an observation and the X and Y labels indicate various measurements taken on each observation. Canonical correlations are performed on two groups of measurements/variables from a single set of observations. First, here is the transposed data:
CETNEW.T <- structure(list(X1 = c(2.72, 1.2, 0.82, 0.92, 1.19, 1, 1.45, 0.68,
0.98, 0.57, 1.07, 0.91), X2 = c(1.2, 3.78, 0.7, 1.04, 1.06, 1.32,
1.31, 0.56, 1, 0.79, 1.13, 1.38), X3 = c(0.82, 0.7, 1.7, 0.59,
0.83, 1.08, 1.01, 0.65, 0.78, 0.66, 0.93, 0.77), X4 = c(0.92,
1.04, 0.59, 3.09, 1.06, 0.93, 1.47, 0.62, 1.26, 0.51, 0.94, 0.85
), X5 = c(1.19, 1.06, 0.83, 1.06, 2.94, 1.36, 1.66, 0.68, 1.16,
0.77, 1.37, 1.11), X6 = c(1, 1.32, 1.08, 0.93, 1.36, 2.94, 1.56,
0.9, 1.23, 0.78, 1.65, 1.31), X7 = c(1.45, 1.31, 1.01, 1.47,
1.66, 1.56, 3.11, 1.03, 1.7, 0.81, 1.63, 1.44), Y1 = c(0.68,
0.56, 0.65, 0.62, 0.68, 0.9, 1.03, 1.71, 0.99, 0.65, 0.86, 0.72
), Y2 = c(0.98, 1, 0.78, 1.26, 1.16, 1.23, 1.7, 0.99, 3.07, 0.61,
1.43, 1.28), Y3 = c(0.57, 0.79, 0.66, 0.51, 0.77, 0.78, 0.81,
0.65, 0.61, 2.83, 1.04, 0.84), Y4 = c(1.07, 1.13, 0.93, 0.94,
1.37, 1.65, 1.63, 0.86, 1.43, 1.04, 2.83, 1.6), Y5 = c(0.91,
1.38, 0.77, 0.85, 1.11, 1.31, 1.44, 0.72, 1.28, 0.84, 1.6, 4.01
)), class = "data.frame", row.names = c(NA, -12L))
Now the analysis runs fine:
library("CCA")
str(CETNEW.T)
# 'data.frame': 12 obs. of 12 variables:
# $ X1: num 2.72 1.2 0.82 0.92 1.19 1 1.45 0.68 0.98 0.57 ...
# $ X2: num 1.2 3.78 0.7 1.04 1.06 1.32 1.31 0.56 1 0.79 ...
# $ X3: num 0.82 0.7 1.7 0.59 0.83 1.08 1.01 0.65 0.78 0.66 ...
# $ X4: num 0.92 1.04 0.59 3.09 1.06 0.93 1.47 0.62 1.26 0.51 ...
# $ X5: num 1.19 1.06 0.83 1.06 2.94 1.36 1.66 0.68 1.16 0.77 ...
# $ X6: num 1 1.32 1.08 0.93 1.36 2.94 1.56 0.9 1.23 0.78 ...
# $ X7: num 1.45 1.31 1.01 1.47 1.66 1.56 3.11 1.03 1.7 0.81 ...
# $ Y1: num 0.68 0.56 0.65 0.62 0.68 0.9 1.03 1.71 0.99 0.65 ...
# $ Y2: num 0.98 1 0.78 1.26 1.16 1.23 1.7 0.99 3.07 0.61 ...
# $ Y3: num 0.57 0.79 0.66 0.51 0.77 0.78 0.81 0.65 0.61 2.83 ...
# $ Y4: num 1.07 1.13 0.93 0.94 1.37 1.65 1.63 0.86 1.43 1.04 ...
# $ Y5: num 0.91 1.38 0.77 0.85 1.11 1.31 1.44 0.72 1.28 0.84 ...
X <- CETNEW.T[, 1:7]
Y <- CETNEW.T[, 8:12]
ccCETNEW <- cc(X, Y)
ccCETNEW is a list with 5 parts containing the results.
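To illustrate the shape requirement without the CCA package, here is a rough sketch using base R's cancor() from stats instead of cc(). The data are simulated (an assumption purely for illustration): 7 X variables and 5 Y variables measured on the same 50 observations. Differing column counts are fine; differing row counts are not.

```r
set.seed(1)
n <- 50  # number of observations shared by both variable sets

# Simulated data: 7 X variables and 5 Y variables on the same n rows
X <- matrix(rnorm(n * 7), nrow = n, dimnames = list(NULL, paste0("X", 1:7)))
Y <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("Y", 1:5)))
Y[, 1] <- Y[, 1] + X[, 1]  # introduce some dependence between the sets

# Works: same number of rows, different number of columns
fit <- cancor(X, Y)
fit$cor  # one canonical correlation per pair, min(7, 5) = 5 of them

# Fails: splitting by rows gives sets with 7 and 5 rows
bad <- tryCatch(cancor(X[1:7, ], Y[1:5, ]), error = function(e) e)
```

The failing call mirrors the row-wise split in the question, where the two sets ended up with 7 and 5 rows.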
Currently we compute and sort data of stocks (X1 to X10). Historical data is stored in Excel and R for the time period 1950-1980, 1980-1999 and for 1950-1999.
The dataset:
date X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 1950-01-01 5.92 6.35 4.61 4.08 5.47 3.90 2.35 1.49 2.27 0.82
2 1950-02-01 2.43 2.16 2.10 1.58 -0.05 1.14 1.51 1.52 2.02 1.12
3 1950-03-01 -0.81 0.21 -1.67 -0.02 -0.79 0.18 -0.22 1.03 0.12 1.75
4 1950-04-01 5.68 6.45 5.41 5.94 6.10 5.87 3.82 3.34 3.44 3.97
5 1950-05-01 3.84 1.60 1.64 3.33 2.54 2.12 4.46 2.83 3.82 4.75
6 1950-06-01 -9.88 -10.56 -8.02 -7.86 -7.27 -7.44 -7.13 -7.76 -6.32 -5.04
7 1950-07-01 9.09 8.76 7.31 5.88 3.84 4.61 3.09 3.07 1.41 0.42
598 1999-10-01 -0.95 -1.88 -1.25 -0.52 1.65 0.72 5.41 4.38 5.58 6.59
599 1999-11-01 11.57 9.15 8.17 7.14 6.15 4.95 5.78 4.21 1.55 2.15
600 1999-12-01 12.32 14.97 9.29 11.77 11.09 5.89 11.88 11.26 6.23 5.64
The main question is, we would like to compute/plot efficient frontiers for these 4 time periods to see how the efficient frontier has evolved in 1 graph. Are there ways to do this in R?
The efficient frontier is the set of optimal portfolios that offers the highest expected return for a defined level of risk or the lowest risk for a given level of expected return.
In modern portfolio theory, the efficient frontier (or portfolio frontier) is an investment portfolio which occupies the 'efficient' parts of the risk-return spectrum. Formally, it is the set of portfolios which satisfy the condition that no other portfolio exists with a higher expected return but with the same standard deviation of return.
So, how would one go about computing this in R?
dput sample data (first 50 rows)
> dput(head(data,50))
structure(list(X__1 = structure(c(-631152000, -628473600, -626054400,
-623376000, -620784000, -618105600, -615513600, -612835200, -610156800,
-607564800, -604886400, -602294400, -599616000, -596937600, -594518400,
-591840000, -589248000, -586569600, -583977600, -581299200, -578620800,
-576028800, -573350400, -570758400, -568080000, -565401600, -562896000,
-560217600, -557625600, -554947200, -552355200, -549676800, -546998400,
-544406400, -541728000, -539136000, -536457600, -533779200, -531360000,
-528681600, -526089600, -523411200, -520819200, -518140800, -515462400,
-512870400, -510192000, -507600000, -504921600, -502243200), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), X__2 = c(5.92, 2.43, -0.81, 5.68,
3.84, -9.88, 9.09, 4.93, 3.99, -0.5, 3.09, 15.77, 8.22, 0.36,
-7.36, 3.84, -2.81, -7.12, 3.57, 6.59, 1.04, -1.41, -1.42, -0.53,
1.86, -3.25, 0.68, -4.4, 0.57, 2.5, -0.36, -0.74, -1.11, -0.58,
3.22, 0.33, 5.01, 2.75, -1.25, -2.13, 1.3, -4.42, 0.25, -5.56,
-4.09, 2.71, 2.01, -3.15, 8.48, -0.16), X__3 = c(6.35, 2.16,
0.21, 6.45, 1.6, -10.56, 8.76, 4.63, 3.52, -1.2, 3.36, 10.98,
8.41, 0.81, -4.01, 3.56, -4.27, -6.11, 4.7, 5.3, 2.73, -3.07,
-0.13, 0.6, 1.1, -2.77, 2.37, -4.5, 1.87, 3.18, 1.51, 0.43, -1.91,
-1.52, 4.91, 1.43, 3.4, 3.03, -2.25, -2, 0.34, -4.75, 2.24, -6.53,
-1.87, 1.97, 1.78, -2.96, 7.38, 0.43), X__4 = c(4.61, 2.1, -1.67,
5.41, 1.64, -8.02, 7.31, 4.56, 5.18, -0.46, 3.52, 10.78, 8.46,
0.28, -4.88, 4.26, -3.25, -6.76, 6.78, 4.99, 3.86, -2.57, 0.59,
0.16, 1.75, -2.04, 2.49, -5.29, 1.76, 2.88, 0.76, 0.67, -1.67,
-1.45, 5.69, 2.95, 3.66, 1.15, -1.58, -2.34, 0.51, -3.82, 0.72,
-6.25, -2.33, 3.1, 2.19, -2.63, 7.3, 1.82), X__5 = c(4.08, 1.58,
-0.02, 5.94, 3.33, -7.86, 5.88, 4.68, 5.99, 0.75, 2.68, 9.29,
8, 1.08, -3.13, 4.21, -3.35, -5.01, 5.77, 4.85, 2.73, -3.44,
0.27, 1.56, 1.62, -2.35, 2.93, -4.62, 2.36, 2.56, 0.86, 0.16,
-1.8, -2.04, 5.12, 2.72, 3.21, 1.21, -2.17, -1.84, 0.32, -3.63,
1.47, -5.16, -0.65, 3.33, 1.34, -1.36, 6.24, 1.19), X__6 = c(5.47,
-0.05, -0.79, 6.1, 2.54, -7.27, 3.84, 6.29, 4.46, -0.24, 2.42,
6.12, 8.63, 0.88, -3.31, 4.56, -2.14, -5.62, 5.73, 5.36, 2.44,
-1.88, 0.83, 0.65, 1.47, -1.81, 2.31, -4.48, 2.56, 2.69, 0.9,
0.34, -0.62, -1.58, 6.59, 0.86, 3.58, 1.92, -1.85, -2.79, 0.7,
-3.4, 1.26, -5.26, -1.18, 4.26, 1.35, -0.97, 6.66, 1.77), X__7 = c(3.9,
1.14, 0.18, 5.87, 2.12, -7.44, 4.61, 4.57, 6.14, -0.84, 4.22,
8.37, 7.44, 0.69, -4.26, 4.13, -2.24, -6.75, 5.81, 4.35, 1.98,
-2.87, 0.93, 0.61, 1.27, -2.18, 2.97, -4.09, 2.27, 2.96, 1.16,
-0.38, -2.37, -0.71, 5.53, 2.45, 1.3, 0.31, -0.47, -2.03, 0.14,
-3.26, 1.79, -5.5, -1.47, 4.18, 1.96, -1.35, 7.06, 1.69), X__8 = c(2.35,
1.51, -0.22, 3.82, 4.46, -7.13, 3.09, 5.01, 5.84, -1.05, 3.81,
7.54, 6.46, 0.71, -3.56, 4.42, -1.87, -4.52, 7.3, 3.66, 2.11,
-2.92, 2.25, 2.17, 1.32, -1.71, 3.17, -4.63, 2.59, 3.89, 0.49,
0.21, -1.71, -1.18, 4.95, 3.21, 1.41, 0.89, -1.02, -2.89, 0.59,
-2.67, 1.47, -4.62, -0.69, 4.07, 2.83, -1.44, 6.11, 1.58), X__9 = c(1.49,
1.52, 1.03, 3.34, 2.83, -7.76, 3.07, 3.72, 6.21, -1.66, 3.46,
6.14, 7.17, 2.13, -3.19, 4.59, -2.65, -3.5, 7.43, 3.5, 2.41,
-2.73, 1.35, 1.97, 1.72, -1.8, 4.06, -5.35, 2.57, 3.14, 1.89,
-0.86, -1.73, -0.95, 6.07, 1.73, 1.09, 0.37, -1.34, -2.48, 0.31,
-3.2, 1.34, -4.99, -0.18, 4.35, 3.03, 0.09, 5.65, 2.39), X__10 = c(2.27,
2.02, 0.12, 3.44, 3.82, -6.32, 1.41, 4.54, 5.55, -0.97, 3.8,
5.69, 5.65, 1.78, -2.6, 4.21, -1.29, -2.63, 7.15, 3.52, 1.85,
-2.32, 0.96, 2.74, 1.9, -2.6, 3.83, -4.31, 3.15, 2.76, 0.93,
-0.39, -1.86, -1.57, 7.05, 2.36, -0.33, -0.23, -0.54, -2.6, 0.61,
-2.37, 2.12, -3.76, 0.47, 3.98, 3.03, 0.2, 5.63, 1.26), X__11 = c(0.82,
1.12, 1.75, 3.97, 4.75, -5.04, 0.42, 4.96, 4.32, 0.25, 2.26,
4.71, 5.05, 1.63, -1.53, 5.12, -2.59, -1.92, 6.89, 4.48, -0.09,
-2.49, 0.26, 4.03, 1.37, -2.82, 4.95, -5.1, 3.4, 4.29, 0.89,
-1.06, -2.18, -0.31, 5.76, 3.32, -1.04, -0.63, -1.78, -2.97,
0.55, -1.3, 2.75, -4.47, 0.48, 4.83, 2.85, 0.27, 4.4, 1.93)), .Names = c("date",
"X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8",
"X9", "X10"), row.names = c(NA, 50L), class = c("tbl_df",
"tbl", "data.frame"))
After some correspondence in the comments with @Jonathan, I widened the example data from 3 columns to 12 columns with some sampling. The code in the "With short-selling" section of the blog scales well for 10K observations:
# using code at:
# https://www.r-bloggers.com/a-gentle-introduction-to-finance-using-r-efficient-frontier-and-capm-part-1/
# https://datashenanigan.wordpress.com/2016/05/24/a-gentle-introduction-to-finance-using-r-efficient-frontier-and-capm-part-1/
library(data.table)
calcEFParams <- function(rets)
{
retbar <- colMeans(rets, na.rm = TRUE) # mean return of each asset
covs <- var(rets, na.rm = TRUE) # calculates the covariance of the returns
invS <- solve(covs)
i <- matrix(1, nrow = length(retbar))
alpha <- t(i) %*% invS %*% i
beta <- t(i) %*% invS %*% retbar
gamma <- t(retbar) %*% invS %*% retbar
delta <- alpha * gamma - beta * beta
retlist <- list(alpha = as.numeric(alpha),
beta = as.numeric(beta),
gamma = as.numeric(gamma),
delta = as.numeric(delta))
return(retlist)
}
# load data
link <- "https://raw.githubusercontent.com/DavZim/Efficient_Frontier/master/data/mult_assets.csv"
df <- data.table(read.csv(link))
df2 <- df[,lapply(.SD, sample),]
df3 <- cbind(df, df2)
df4 <- df3[,lapply(.SD, sample),]
df5 <- cbind(df3, df4)
With the microbenchmark package loaded, the performance is as follows:
> library(microbenchmark)
> microbenchmark(calcEFParams(df5), times = 10)
Unit: milliseconds
expr min lq mean median uq max neval
calcEFParams(df5) 2.692514 2.764053 2.795127 2.777547 2.805447 3.024349 10
It seems that David Zimmermann's code is scalable and efficient enough!
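To get from these four parameters to several frontiers in one graph, something along the following lines might work. It is only a sketch: the returns are simulated, the period boundaries are my assumption based on the ranges named in the question, and it uses the usual closed-form frontier with short-selling allowed, sd(r) = sqrt((alpha * r^2 - 2 * beta * r + gamma) / delta).

```r
# Same four quantities as calcEFParams, written in base R
ef_params <- function(rets) {
  mu   <- colMeans(rets)
  invS <- solve(cov(rets))
  ones <- rep(1, length(mu))
  alpha <- drop(ones %*% invS %*% ones)
  beta  <- drop(ones %*% invS %*% mu)
  gamma <- drop(mu %*% invS %*% mu)
  list(alpha = alpha, beta = beta, gamma = gamma,
       delta = alpha * gamma - beta^2)
}

# Smallest standard deviation achievable for each target return
frontier_sd <- function(p, target) {
  sqrt((p$alpha * target^2 - 2 * p$beta * target + p$gamma) / p$delta)
}

# Simulated monthly returns standing in for X1..X10 (assumption)
set.seed(42)
dates <- seq(as.Date("1950-01-01"), by = "month", length.out = 600)
rets <- matrix(rnorm(600 * 4, mean = 0.8, sd = 4), ncol = 4)
rets <- sweep(rets, 2, c(0, 0.2, 0.4, 0.6), "+")  # different mean per asset

# Row selectors per period; the 1980 cut-off date is an assumption
periods <- list(
  "1950-1980" = dates <  as.Date("1980-01-01"),
  "1980-1999" = dates >= as.Date("1980-01-01"),
  "1950-1999" = rep(TRUE, length(dates))
)

targets <- seq(0, 2, length.out = 50)
fronts <- lapply(periods, function(rows) {
  frontier_sd(ef_params(rets[rows, , drop = FALSE]), targets)
})

# One curve per period on shared axes
plot(NULL, xlim = range(unlist(fronts)), ylim = range(targets),
     xlab = "Risk (sd)", ylab = "Expected return")
for (k in seq_along(fronts)) lines(fronts[[k]], targets, col = k)
legend("topleft", legend = names(fronts), col = seq_along(fronts), lty = 1)
```

With the real data, rets would be the numeric return columns and dates the date column; everything else stays the same.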
v1 = c(2, 2.01, 2.02, 2.03, 2.04, 2.05, 2.06, 2.07, 2.08, 2.09, 2.1,
2.11, 2.12, 2.13, 2.14, 2.15, 2.16, 2.17, 2.18, 2.19, 2.2, 2.21,
2.22, 2.23, 2.24, 2.25, 2.26, 2.27, 2.28, 2.29, 2.3, 2.31, 2.32,
2.33, 2.34, 2.35, 2.36, 2.37, 2.38, 2.39, 2.4, 2.41, 2.42, 2.43,
2.44, 2.45, 2.46, 2.47, 2.48, 2.49, 2.5, 2.51, 2.52, 2.53, 2.54,
2.55, 2.56, 2.57, 2.58, 2.59, 2.6, 2.61, 2.62, 2.63, 2.64, 2.65,
2.66, 2.67, 2.68, 2.69, 2.7, 2.71, 2.72, 2.73, 2.74, 2.75, 2.76,
2.77, 2.78, 2.79, 2.8, 2.81, 2.82, 2.83, 2.84, 2.85, 2.86, 2.87,
2.88, 2.89, 2.9, 2.91, 2.92, 2.93, 2.94, 2.95, 2.96, 2.97, 2.98,
2.99)
> intersect(v1, seq(2, 2.99, 0.01))
[1] 2.00 2.01 2.02 2.04 2.05 2.06 2.08 2.09 2.10 2.12 2.13 2.14 2.16 2.17 2.19 2.20 2.21 2.23 2.24 2.25 2.26 2.27
[23] 2.28 2.29 2.30 2.31 2.33 2.34 2.35 2.37 2.38 2.39 2.41 2.42 2.44 2.45 2.46 2.48 2.49 2.50 2.51 2.52 2.53 2.54
[45] 2.55 2.56 2.57 2.58 2.59 2.60 2.62 2.63 2.64 2.66 2.67 2.69 2.70 2.71 2.72 2.73 2.74 2.75 2.76 2.77 2.78 2.79
[67] 2.80 2.81 2.82 2.83 2.84 2.85 2.87 2.88 2.89 2.91 2.92 2.94 2.95 2.96 2.97 2.98 2.99
I have a vector of length 100 called v1. I want to see the intersection of v1 and a seq(2, 2.99, 0.01) vector (should be just v1 itself). But I get a vector that is only 83 elements long? And clearly 2.03, 2.15 ... are not in the intersection. How is that possible?
This is a floating-point precision issue in R. See the Floating Point Guide for more information.
You can confirm that this is the cause because rounding both vectors first returns what you're looking for:
v1 = c(2, 2.01, 2.02, 2.03, 2.04, 2.05, 2.06, 2.07, 2.08, 2.09, 2.1,
2.11, 2.12, 2.13, 2.14, 2.15, 2.16, 2.17, 2.18, 2.19, 2.2, 2.21,
2.22, 2.23, 2.24, 2.25, 2.26, 2.27, 2.28, 2.29, 2.3, 2.31, 2.32,
2.33, 2.34, 2.35, 2.36, 2.37, 2.38, 2.39, 2.4, 2.41, 2.42, 2.43,
2.44, 2.45, 2.46, 2.47, 2.48, 2.49, 2.5, 2.51, 2.52, 2.53, 2.54,
2.55, 2.56, 2.57, 2.58, 2.59, 2.6, 2.61, 2.62, 2.63, 2.64, 2.65,
2.66, 2.67, 2.68, 2.69, 2.7, 2.71, 2.72, 2.73, 2.74, 2.75, 2.76,
2.77, 2.78, 2.79, 2.8, 2.81, 2.82, 2.83, 2.84, 2.85, 2.86, 2.87,
2.88, 2.89, 2.9, 2.91, 2.92, 2.93, 2.94, 2.95, 2.96, 2.97, 2.98,
2.99)
v2 <- seq(2, 2.99, 0.01)
v1 <- round(v1,2) #rounds to 2 decimal places
v2 <- round(v2,2)
intersect(v1,v2) #returns v1
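Two other ways to sidestep the exact comparison, sketched here as alternatives to rounding: compare within a tolerance using all.equal(), or do the set operation on an integer scale. In this sketch v1 is rebuilt by parsing two-decimal strings, which I assume is equivalent to typing the literals by hand.

```r
v2 <- seq(2, 2.99, 0.01)

# Rebuild the typed literals 2.00, 2.01, ... by parsing two-decimal strings
v1 <- as.numeric(sprintf("%.2f", v2))

# Element-wise comparison within a small numerical tolerance
same_within_tol <- isTRUE(all.equal(v1, v2))

# Exact set operations on an integer scale (hundredths), then scale back
h1 <- as.integer(round(v1 * 100))
h2 <- as.integer(round(v2 * 100))
common <- intersect(h1, h2) / 100
```

The integer version is handy when you need intersect(), setdiff() or %in% rather than a single yes/no answer.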
I am having trouble extracting rownames from an object.
When I type in rownames(object), I get NULL, but if I type in object, I get the matrix of information. If it helps, when I type in class(object), it tells me that it is a list. What I am looking for is a way to obtain the row names shown on the side. Thanks!
>
alpha84 alpha91 alpha98 alpha105 alpha112 alpha119
YBR088C 1.08 0.27 0.04 -0.51 -0.80 -0.89
YDL003W 0.62 -0.01 -0.36 -0.04 -0.55 -0.55
YDR097C 0.64 0.18 -0.05 0.03 -0.76 -0.66
YDR507C 0.53 0.13 0.07 0.14 -0.56 -0.41
YER070W 0.73 0.20 0.00 0.11 -0.53 -0.72
YER095W 0.28 -0.05 -0.11 -0.13 -0.87 -0.90
YER111C 0.37 -0.19 -0.11 -0.54 -0.34 -0.47
YGR189C 0.81 0.12 0.15 -0.39 -0.60 -1.20
YKL045W 0.46 -0.27 -0.10 -0.23 -0.42 -1.21
YLR183C 0.96 0.14 0.28 -0.17 -0.14 -0.68
YML027W 0.50 -0.01 0.11 -0.33 -0.44 -0.94
YMR179W 0.42 0.04 -0.40 -0.47 -0.12 -0.61
YNL300W 0.79 0.33 0.54 -0.09 -0.31 -1.01
YOR074C 0.73 0.09 -0.27 -0.22 -0.62 -0.80
YPL163C 1.61 0.84 0.82 -0.09 -0.48 -0.97
YPL256C 1.10 0.56 0.18 -0.32 -0.38 -1.04
structure(list(`4` = structure(list(alpha0 = c(-1.15, -1.22,
-0.72, -1.76, -1.46, -0.57, -1.21, -0.32, -0.8, -1.7, -1.72,
-1.3, -1.24, -1.14, -2.42, -1.41), alpha7 = c(-0.86, -0.74, -0.85,
-0.34, -0.76, 0.42, -0.26, -0.65, 0.01, -1.46, -0.66, 0.07, -0.78,
-0.31, -2.15, -0.69), alpha14 = c(1.21, 1.34, 0.54, 0.18, 1.08,
1.03, 1.36, 0.87, 0.86, 0.93, 1.73, 0.98, 0.31, 0.57, 0.66, 1.39
), alpha21 = c(1.62, 1.5, 1.04, 1.07, 1.5, 1.35, 1.37, 1.1, 0.84,
1.12, 1.29, 1.12, 1.46, 1.08, 1.98, 1.98), alpha28 = c(1.12,
0.63, 0.84, 0.37, 0.74, 0.64, 0.54, 1.17, 0.51, 0.91, 0.51, 0.13,
1.11, 1.17, 1.55, 0.74), alpha35 = c(0.16, 0.29, 0.24, 0.32,
0.47, 0.42, 0.18, 0.44, 0.14, 0.11, 0.28, 0.19, 0.62, 0.57, 0.78,
0.21), alpha42 = c(-0.44, -0.55, -0.64, -0.5, -0.7, -0.4, -0.85,
0.37, -0.4, 0, 0.23, -0.58, 0.07, 0.31, 0.14, -0.36), alpha49 = c(-0.93,
-0.65, -0.83, -0.25, -0.68, -0.9, -0.82, -0.93, -0.64, -0.73,
-0.55, -0.63, -0.23, -0.74, -0.94, -1.32), alpha56 = c(-1.23,
-0.76, -0.36, -0.48, -1.03, -0.73, -0.75, -1.45, -0.8, -0.9,
-0.97, -0.9, -0.58, -0.68, -1.03, -1.5), alpha63 = c(-0.62, -0.88,
-0.7, -0.25, -0.55, -0.47, 0.07, -0.57, 0.41, -0.46, -0.48, 0.09,
-1.01, -0.1, -1.5, -1.07), alpha70 = c(0.62, 0.69, 0.99, 0.79,
0.35, 0.2, 0.89, 0.15, 0.88, 0.85, 0.57, 0.54, -0.24, -0.38,
-0.03, 0.35), alpha77 = c(1.3, 1.25, 1.08, 0.97, 1.24, 0.78,
0.78, 0.92, 0.75, 0.93, 0.88, 1.44, 0.23, 0.75, 1.25, 1.57),
alpha84 = c(1.08, 0.62, 0.64, 0.53, 0.73, 0.28, 0.37, 0.81,
0.46, 0.96, 0.5, 0.42, 0.79, 0.73, 1.61, 1.1), alpha91 = c(0.27,
-0.01, 0.18, 0.13, 0.2, -0.05, -0.19, 0.12, -0.27, 0.14,
-0.01, 0.04, 0.33, 0.09, 0.84, 0.56), alpha98 = c(0.04, -0.36,
-0.05, 0.07, 0, -0.11, -0.11, 0.15, -0.1, 0.28, 0.11, -0.4,
0.54, -0.27, 0.82, 0.18), alpha105 = c(-0.51, -0.04, 0.03,
0.14, 0.11, -0.13, -0.54, -0.39, -0.23, -0.17, -0.33, -0.47,
-0.09, -0.22, -0.09, -0.32), alpha112 = c(-0.8, -0.55, -0.76,
-0.56, -0.53, -0.87, -0.34, -0.6, -0.42, -0.14, -0.44, -0.12,
-0.31, -0.62, -0.48, -0.38), alpha119 = c(-0.89, -0.55, -0.66,
-0.41, -0.72, -0.9, -0.47, -1.2, -1.21, -0.68, -0.94, -0.61,
-1.01, -0.8, -0.97, -1.04)), .Names = c("alpha0", "alpha7",
"alpha14", "alpha21", "alpha28", "alpha35", "alpha42", "alpha49",
"alpha56", "alpha63", "alpha70", "alpha77", "alpha84", "alpha91",
"alpha98", "alpha105", "alpha112", "alpha119"), row.names = c("YBR088C",
"YDL003W", "YDR097C", "YDR507C", "YER070W", "YER095W", "YER111C",
"YGR189C", "YKL045W", "YLR183C", "YML027W", "YMR179W", "YNL300W",
"YOR074C", "YPL163C", "YPL256C"), class = "data.frame")), .Names = "4")
You have a list of one element. This single element is a data.frame.
If you are after the rownames from this object, then index the list appropriately:
rownames(object[[1]])
## [1] "YBR088C" "YDL003W" "YDR097C" "YDR507C" "YER070W" "YER095W" "YER111C" "YGR189C" "YKL045W" "YLR183C" "YML027W"
## [12] "YMR179W" "YNL300W" "YOR074C" "YPL163C" "YPL256C"
For a more general list of data.frames
# get rownames from all data.frames in a list
lapply(object, rownames)
If you want a data.frame, not a list containing a data.frame, then you could simply assign the first element to a separate object:
object.df <- object[[1]]
If it is a list of data.frames, it is probably more idiomatic R to keep it in the list and use lapply to work on each element.
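A small self-contained version of the above, using a two-row stand-in for the data (values taken from the printed output in the question):

```r
# A list holding a single data.frame, named "4" as in the dput() output
df <- data.frame(alpha84 = c(1.08, 0.62),
                 alpha91 = c(0.27, -0.01),
                 row.names = c("YBR088C", "YDL003W"))
object <- list(`4` = df)

# The list itself has no row names
top <- rownames(object)

# The data.frame inside it does
first <- rownames(object[[1]])

# Row names of every data.frame in the list, kept as a named list
all_names <- lapply(object, rownames)
```

Here top is NULL, while first and all_names[["4"]] both hold the gene identifiers.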