Doing PCA with varimax rotation in R

My code has gone south.
I'm importing a 578x17 data sheet from a CSV file using:
Data=read.csv("Data.csv", header=TRUE, sep=',', dec='.', row.names= 1 , stringsAsFactors=TRUE)
My correlation and covariance matrices are the same.
When I try to do a PCA and a PCA with a Varimax Rotation, I get the same results:
PCA=princomp(x = Data, cor = TRUE, scores = TRUE)
Varimax<-princomp(Data, rotation="varimax")
When I try to do a Varimax rotation in a different way, I get:
varimax<-varimax(PCA$rotation[,1:5])
Error in if (nc < 2) return(x) : argument is of length zero
I'm not sure whether the problem is my code, or my .csv file, but any help would be greatly appreciated!
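A likely cause of the error, sketched on toy data: princomp() returns its loadings in $loadings, whereas $rotation is the component name used by prcomp(), so PCA$rotation is NULL and varimax() fails on it. princomp() also has no rotation argument, so rotation="varimax" does not rotate anything. The data frame toy below is a made-up stand-in for Data, and keeping three components is an arbitrary choice.

```r
# Toy data standing in for the asker's Data (assumption: numeric columns only)
set.seed(1)
toy <- as.data.frame(matrix(rnorm(100 * 6), ncol = 6))

pca <- princomp(toy, cor = TRUE, scores = TRUE)

# princomp objects have no $rotation component (that name belongs to prcomp)
no_rotation <- is.null(pca$rotation)

# Take the first three columns of the loadings as a plain matrix
raw_load <- unclass(pca$loadings)[, 1:3]

# varimax() from the stats package returns the rotated loadings
# and the orthogonal rotation matrix that produced them
rot <- varimax(raw_load)
print(rot$loadings)
```

The rotated loadings are in rot$loadings and the rotation matrix is in rot$rotmat.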

varimax rotation in PCA with vegan's rda()
The basics of this answer have been taken from:
https://stats.stackexchange.com/questions/59213/how-to-compute-varimax-rotated-principal-components-in-r
Suppose that the data matrix is named mydata:
library(vegan)
library(pracma)
pca.env = rda(mydata, scale=TRUE)
loading = scores(pca.env, choices=c(1,2))$species # choices determines which PCs are taken
rloading = varimax(loading)$loadings
iloading = t(pinv(rloading))
scores = scale(mydata) %*% iloading
# Biplot of the rotated loadings and scores
r = range(c(rloading, scores))
plot(scores, xlim = r, ylim= r, xlab= "PC1 ", ylab= "PC2 ")
arrows(0,0, rloading[,1], rloading[,2], length=0.1, col=2)
text(rloading[,1], rloading[,2], labels = colnames(mydata), pos=3, col=2)
text(scores[,1], scores[,2], labels = rownames(mydata), pos = 3)
abline(h=0, lty=3)
abline(v=0, lty=3)
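The same steps can be sketched with base R only, following the linked Cross Validated answer. This is an illustration under assumptions, not the vegan workflow itself: prcomp() replaces rda(), the built-in USArrests data stand in for mydata, two components are kept arbitrarily, and the pseudo-inverse is written out with solve() instead of pracma's pinv().

```r
# USArrests is a built-in data set used here only as a stand-in for mydata
mydata <- USArrests
n_comp <- 2  # number of components to rotate (arbitrary choice)

pca <- prcomp(mydata, center = TRUE, scale. = TRUE)

# Scale the eigenvectors by the component standard deviations to get loadings
raw_load <- pca$rotation[, 1:n_comp] %*% diag(pca$sdev[1:n_comp])

# Rotate the loadings and drop the "loadings" class to get a plain matrix
rot_load <- unclass(varimax(raw_load)$loadings)

# For a full-column-rank matrix L, t(pinv(L)) equals L %*% solve(t(L) %*% L)
inv_load <- rot_load %*% solve(crossprod(rot_load))

# Rotated component scores from the standardized data
rot_scores <- scale(mydata) %*% inv_load

# Covariance matrix of the rotated scores
print(round(cov(rot_scores), 10))
```

With loadings scaled this way, the rotated scores are standardized, so their covariance matrix is a quick sanity check on the rotation.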

Related

R: Plot lines are very thick

When using matplot to plot a matrix using:
matplot(t, X[,1:4], col=1:4, lty = 1, xlab="Time", ylab="Stock Value")
my graph comes out as:
How do I reduce the line thickness? I previously used a different method and my graph was fine:
I have tried manipulating lwd, but to no avail.
I even tried plot(t, X[1:4097,1]), yet the plotted line is still very thick. Is something wrong with my R?
EDIT: Here is the code I used to produce the matrix X:
####Inputs mean return, volatility, time period and time step
mu=0.25; sigma=2; T=1; n=2^(12); X0=5;
#############Generating trajectories for stocks
##NOTE: Seed is fixed. Changing seed will produce
##different trajectories
dt=T/n
t=seq(0,T,by=dt)
set.seed(201)
X <- matrix(nrow = n+1, ncol = 4)
for(i in 1:4){
X[,i] <- c(X0,mu*dt+sigma*sqrt(dt)*rnorm(n,mean=0,sd=1))
X[,i] <- cumsum(X[,i])
}
colnames(X) <- paste0("Stock", seq_len(ncol(X)))
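I just needed to add type = "l" to matplot(...). matplot() defaults to type = "p", so each of the 4097 values per series was drawn as a separate point symbol, which looks like a very thick line. It plots fine now.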
matplot(t, X[,1:4], col=1:4, type = "l", xlab="Time", ylab="Stock Value")

Plotting several variables on the same scale in R

I've tried over and over to solve this issue, but I can't get it right. I have estimated a Beta-t-EGARCH model and a GARCH-t model in R, and now I need to plot the results on the same plot. The final result is horrible, since the variables don't share the same scale on the y axis. I'm new to R, so please don't blame me :).
Here's the code:
library(quantmod)
library(betategarch)
library(fGarch)
library(ggplot2)
getSymbols("GOOG",src="yahoo")
google_ret <- abs(periodReturn(GOOG, period="daily", subset=NULL, type="log"))-mean(abs(periodReturn(GOOG, period="daily", subset=NULL, type="log")))
googcomp <- tegarch(google_ret, asym=FALSE, skew=FALSE)
goog1stdev <- fitted(googcomp)
#now we try to fit a standard GARCH-t model
googgarch <- garchFit(data=google_ret, cond.dist="sstd")
googgarch2 <- garchFit(data=google_ret, cond.dist="sstd", include.mean = FALSE, include.delta = FALSE, include.skew = FALSE, include.shape = FALSE, leverage = FALSE, trace = TRUE)
volatility <- volatility(googgarch2, type = "sigma")
plot(google_ret)
par(new=TRUE)
plot(googgarch2, which=2)
par(new=TRUE)
plot(goog1stdev, col="red")
The final result is a plot completely out of scale on the y axis, with variables that have lower values plotted above higher ones. Thanks a lot to anybody that wants to help me!
The recommended approach is to plot them as different plots stacked on top of each other:
layout(matrix(1:3,3))
plot(google_ret)
plot(googgarch2, which=2)
plot(goog1stdev, col="red")
You can get rid of the whitespace with calls to par("mar") to adjust margin sizes:
opar=par(mar=par("mar") -c(1,0,3,0)) # opar will then let you restore the previous values
# ... plotting code ...
par(opar)
I don't know your domain very well, but if you can use shifted y-ordinates, then this produces a somewhat cleaned-up version with overlaid plots:
png()
plot(google_ret, ylim=c(0,1), ylab="Google Returns (black); GGarch + 0.5 (blue); STD x10 + 0.3 (red)")
par(new=TRUE)
plot(googgarch2@data +.5, type="l", col="blue",axes=FALSE, ylab="", main="",ylim=c(0, 1)) ;abline(h=.5, col="blue")
par(new=TRUE);
plot( 10*coredata(goog1stdev) + .3, col="red", type="l", axes=FALSE, main="",ylim=c(0,1), ylab=""); abline(h=.3, col="red")
dev.off()
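If the series really are measured in the same units, another option is to avoid par(new=TRUE) altogether: every plot() call issued after par(new=TRUE) sets up its own axis range, which is why the curves end up on different scales. A minimal sketch that computes one shared ylim and overlays with lines(); the series abs_ret, vol_a and vol_b are simulated stand-ins, not the GOOG data or the fitted models.

```r
# Simulated stand-ins for the three series (assumption: same length, same units)
set.seed(42)
n <- 200
abs_ret <- abs(rnorm(n, sd = 0.02))             # e.g. absolute returns
vol_a   <- 0.015 + 0.005 * sin(seq_len(n) / 20) # e.g. one fitted volatility
vol_b   <- 0.012 + 0.004 * cos(seq_len(n) / 15) # e.g. another fitted volatility

# One y-range that covers every series
common_ylim <- range(abs_ret, vol_a, vol_b)

# The first call sets up the axes; lines() adds to them without rescaling
plot(abs_ret, type = "l", ylim = common_ylim,
     xlab = "Observation", ylab = "Value")
lines(vol_a, col = "blue")
lines(vol_b, col = "red")
legend("topright", legend = c("abs_ret", "vol_a", "vol_b"),
       col = c("black", "blue", "red"), lty = 1)
```

For xts or fGarch objects the numeric values would first have to be extracted (for example with coredata(), as in the overlay above) before they can be passed to lines().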

Plot side-by-side Impulse Response in R package "vars"

library(vars)
data(Canada)
var_fit <- VAR(Canada, p = 1)
var_irf <- irf(var_fit, impulse = c("U", "rw"), response = "prod")
How do I plot the two Impulse Responses in a figure side-by-side?
Normally, I'd use par(mfrow = c(1,2)), but it doesn't work as expected. Any help?
I ran into the same problem and solved it "manually". Here is an example for a VAR(1) model with two variables:
impulse<-irf(model)
irf1<-data.frame(impulse$irf$y1[,1],impulse$Lower$y1[,1],
impulse$Upper$y1[,1])
irf2<-data.frame(impulse$irf$y1[,2],impulse$Lower$y1[,2],
impulse$Upper$y1[,2])
par(mfrow=c(1,2), bg="azure2")
matplot(irf1, type="l", lwd=2, col="blue2",
ylab=expression(y[1]), lty=c(1,2,2))
matplot(irf2, type="l", lwd=2, col="red2",
ylab=expression(y[2]), lty=c(1,2,2))
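Applied to the question's own objects, the same manual approach can be wrapped in a small helper. This is only a sketch: plot_irf_panel() is a hypothetical function, and it assumes the irf object has irf, Lower and Upper components that are lists of matrices indexed by impulse name, which is the structure the answer's code relies on. The usage part runs only if the vars package is installed.

```r
# Hypothetical helper: plot one impulse response with its confidence band.
# Assumes irf_obj$irf, irf_obj$Lower and irf_obj$Upper are lists of matrices
# indexed by impulse name, with one column per response variable.
plot_irf_panel <- function(irf_obj, impulse, column = 1, ylab = "", ...) {
  band <- cbind(
    irf   = irf_obj$irf[[impulse]][, column],
    lower = irf_obj$Lower[[impulse]][, column],
    upper = irf_obj$Upper[[impulse]][, column]
  )
  # Solid line for the response, dashed lines for the band
  matplot(band, type = "l", lty = c(1, 2, 2), col = "black",
          xlab = "Horizon", ylab = ylab,
          main = paste(impulse, "->", ylab), ...)
  abline(h = 0, lty = 3)
  invisible(band)
}

# Usage with the objects from the question (skipped if vars is not installed)
if (requireNamespace("vars", quietly = TRUE)) {
  data(Canada, package = "vars")
  var_fit <- vars::VAR(Canada, p = 1)
  var_irf <- vars::irf(var_fit, impulse = c("U", "rw"), response = "prod")
  op <- par(mfrow = c(1, 2))
  plot_irf_panel(var_irf, "U", ylab = "prod")
  plot_irf_panel(var_irf, "rw", ylab = "prod")
  par(op)
}
```

Because each panel is drawn with an ordinary matplot() call, par(mfrow = c(1, 2)) lays the two panels out side by side.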

quantile plot, two data - issues with fitting the line in R

I am trying to plot two sets of p-values from two different data frames and compare them to the normal distribution in a QQ plot in R.
Here is the code that I am using:
## Taking values from 1st dataframe to plot
Rlogp = -log10(trialR$PVAL)
Rindex <- seq(1, nrow(trialR))
Runi <- Rindex/nrow(trialR)
Rloguni <- -log10(Runi)
## Taking values from 2nd dataframe to plot on existing plot
Nlogp = -log10(trialN$PVAL)
Nlogp = sort(Nlogp)
Nindex <- seq(1, nrow(trialN))
Nuni <- Nindex/nrow(trialN)
Nloguni <- -log10(Nuni)
Nloguni <- sort(Nloguni)
qqplot(Rloguni, Rlogp, xlim=range(0,6), ylim=range(0,6), col=rgb(100,0,0,50,maxColorValue=255), pch=19, lwd=2, bty="l",xlab ="", ylab ="")
qqline(Rloguni, Rlogp,distribution=qnorm, lty="dashed")
par(new=TRUE, cex.main=4.8, col.axis="white")
plot(Nloguni, Nlogp, xlim=range(0,6), ylim=range(0,6), col=rgb(0,0,100,50,maxColorValue=255), pch=19, lwd=2, bty="l",xlab ="", ylab ="")
The code plots the graph, but I am not sure about the qqline, as it seems a bit offset. Can someone tell me whether I am doing this correctly, or whether something needs to change?
The target plot should look something like this, without the third set of data values.
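One possible reason the line looks offset: qqline() takes a single sample as its first argument and draws a line through that sample's quartiles against a theoretical distribution (normal by default), so its second positional argument is not a second sample. For p-values compared with their expected uniform quantiles on the -log10 scale, a common reference is the identity line y = x. A minimal sketch, assuming simulated p-values in place of trialR$PVAL and trialN$PVAL and a hypothetical helper qq_points():

```r
# Hypothetical helper: expected vs observed -log10(p) under a uniform null
qq_points <- function(p) {
  n <- length(p)
  data.frame(
    expected = -log10(seq_len(n) / n),  # same i/n quantiles as in the question
    observed = -log10(sort(p))          # smallest p pairs with smallest quantile
  )
}

set.seed(7)
p_r <- runif(1000)  # stand-in for trialR$PVAL
p_n <- runif(800)   # stand-in for trialN$PVAL

qq_r <- qq_points(p_r)
qq_n <- qq_points(p_n)

# Both data sets go on the same axes, so par(new = TRUE) is not needed
lim <- c(0, 6)
plot(qq_r$expected, qq_r$observed, xlim = lim, ylim = lim, pch = 19,
     col = rgb(100, 0, 0, 50, maxColorValue = 255),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)", bty = "l")
points(qq_n$expected, qq_n$observed, pch = 19,
       col = rgb(0, 0, 100, 50, maxColorValue = 255))
abline(0, 1, lty = "dashed")  # identity line: observed equals expected
```

Points that rise above the dashed line at the right-hand end indicate p-values smaller than a uniform distribution would produce.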

Visualize data using histogram in R

I am trying to visualize some data and in order to do it I am using R's hist.
Below are my data:
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now as the more cunning of you might have guessed these are the absolute values of some model's coefficients.
What I need is a histogram that will have for axes:
x will be the number (count or length) of coefficients which is 19 in total, along with their names.
y will show values of each column (as breaks?) having a ylim="" set, according to min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, after consulting ?hist, I have been playing with the code below without success, so I am starting again from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the set. I also find it hard to use the breaks argument effectively.
Any suggestion will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot? For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
> coef <- lm(mpg ~ ., data = mtcars)$coef
> library(lattice)
> barchart(coef, col = 'lightblue', horizontal = FALSE,
ylim = range(coef), xlab = '',
scales = list(y = list(labels = coef),
x = list(labels = names(coef))))
A base R dotchart might be good too,
> dotchart(coef, pch = 19, xlab = 'value')
> text(coef, seq(coef), labels = round(coef, 3), pos = 2)
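Because the absolute coefficients in the question span more than five orders of magnitude, a linear axis would hide most of the bars. As a sketch of the bar-plot idea applied to the values printed in the question (the log-scale axis and the name jan_coef are assumptions about what is useful here, not part of either answer):

```r
# Coefficients copied from the Janmodelnorm$coef printout in the question
jan_coef <- c(
  "(Intercept)" = 1.165610e+00, LON = -1.277929e-01, LAT = -4.349831e-01,
  ME = -3.602961e-01, RAT = -7.189458e+00, DS = -1.856908e-04,
  DSA = 1.352052e-05, DSI = 4.811291e-05, DRNS = -1.055744e-02,
  DREW = -2.756525e-04, ASPNS = -2.202706e-01, ASPEW = -4.199914e-02,
  SI = 4.684091e-02, CUR = -8.634340e-01, W_180_270 = -2.479175e-02,
  W_0_360 = 2.409628e-01, W_90_180 = 5.459076e-03, W_0_180 = -9.892580e-03,
  NDVI = -5.378456e-02
)

# Widen the bottom margin so the rotated coefficient names fit
op <- par(mar = c(7, 4, 2, 1))

# One bar per coefficient, absolute value on a log10 y axis, names rotated
mids <- barplot(abs(jan_coef), log = "y", las = 2, col = "lightblue",
                ylab = "|coefficient| (log scale)")
par(op)
```

barplot() returns the bar midpoints (stored in mids here), which can be reused to add text labels or error bars above each bar.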

Resources