I have been running a DCC-GARCH model in R; the results are presented as a matrix.
I would like to extract the second column as a vector to plot, with the date on the x-axis.
For the moment, if I define
DCCrho = dccresults$DCC[,2]
then head(DCCrho) yields this:
1 0.9256281
2 0.9256139
3 0.9245794
...
Is there a way to redefine this as a simple vector of numerical values?
Is there any other option to graph the DCC results with the date on the x-axis?
Thanks a lot!
While trying this
x <- cbind(DCCrho, com_30[,2])
head(x)
and this:
matplot(DCCrho ~ x[,2], x, xaxt = "n", type='l')
yields the following error message:
"Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
invalid first argument"
Apparently it was a matter of vector length: the dates and the DCC results need to be vectors of the same length.
One also needs to pass both the dates and DCCrho to the plotting call, as shown below.
matplot(com_30$date, DCCrho, xaxt = "n", type='l')
axis(1, com_30$date, format(com_30$date, "%y"), cex.axis = .7)
I'm assuming that when you said, "I would like to extract the second row..." you actually meant "column", because you did the following: dccresults$DCC[,2]. Also, as pointed out in a previous comment, the code isn't reproducible, so it's difficult to propose an answer with certainty. However, I'll do my best.
You said you wanted DCCrho to be a "simple vector of numerical values". I'm assuming that this is largely a matter of the way the values are displayed. Does DCCrho = as.vector(dccresults$DCC[,2]) look better?
As for the error message, I think it's because in matplot(x, y, ...), x can't be a formula. Try matplot(DCCrho, x[,2]).
If you just want to plot the DCCrho value across some index, you could try something like the following:
Y <- as.vector(dccresults$DCC[,2])
X <- seq_along(Y)
plot(X,Y)
Does that work? Aside from the arbitrary time index, what did you intend to reference as "time"? I don't see a part of the code that you supplied (e.g., a column in dccresults$DCC) that would be an obvious candidate for use as a "date".
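If a date vector of the same length is available, the two ideas above (coerce to a plain numeric vector, then plot against the dates) can be combined. The following is only a sketch: dcc_mat and dates are made-up stand-ins for dccresults$DCC and com_30$date, since the original objects are not reproducible.

```r
# Hypothetical stand-ins for the objects in the question
set.seed(1)
dates   <- seq(as.Date("2010-01-01"), by = "month", length.out = 24)
dcc_mat <- cbind(1, runif(24, 0.90, 0.95))   # column 2 plays the role of dccresults$DCC[, 2]

# Drop any matrix/row-name structure and keep a plain numeric vector
rho <- as.numeric(dcc_mat[, 2])

# Dates and values must have the same length before plotting
stopifnot(length(rho) == length(dates))

# Line plot with a custom date axis
plot(dates, rho, type = "l", xaxt = "n", xlab = "Date", ylab = "DCC correlation")
axis.Date(1, at = seq(min(dates), max(dates), by = "6 months"), format = "%Y-%m")
```

plot() is enough here because there is a single series; matplot() only becomes useful when several columns are drawn at once.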
I am relatively new to R and I am struggling with an error message related to qqplot. Some sample data are at the bottom. I am trying to do a qqplot on some azimuth data, i.e. like compass directions. I've looked around here and at the ?qqplot R documentation, but I don't see a solution I can understand in either. I don't understand the syntax for the function or the format the data are supposed to be in, or probably both. First I tried loading the data as a single column of values, i.e. just the "Azimuth" column.
azimuth <- read.csv(file.choose(), header=TRUE)
qqplot(azimuth$Azimuth)
returns the following error,
Error in sort(y) : argument "y" is missing, with no default
Then I tried including the corresponding dip angles along with the azimuth data and received the same error. I also tried,
qqnorm(azimuth)
but this returned the following error,
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
Dataframe "azimuth":
Azimuth Altitude
23.33211466 -6.561729793
31.51267873 4.801537153
29.04577711 5.24504954
23.63450905 14.03342708
29.12535459 7.224141678
20.76972007 47.95686329
54.89253987 4.837417689
56.57958227 13.12587996
13.09845182 -7.417776178
26.45155154 31.83546988
29.15718557 25.47767069
28.09084746 14.61603384
28.93436865 -1.641785416
28.77521371 17.30536039
29.58690392 -2.202076058
0.779859221 12.92044019
27.1359178 12.20305106
23.57084707 11.97925859
28.99803063 3.931326877
dput() version:
azimuth <-
structure(list(Azimuth = c(23.33211466, 31.51267873, 29.04577711,
23.63450905, 29.12535459, 20.76972007, 54.89253987, 56.57958227,
13.09845182, 26.45155154, 29.15718557, 28.09084746, 28.93436865,
28.77521371, 29.58690392, 0.779859221, 27.1359178, 23.57084707,
28.99803063), Altitude = c(-6.561729793, 4.801537153, 5.24504954,
14.03342708, 7.224141678, 47.95686329, 4.837417689, 13.12587996,
-7.417776178, 31.83546988, 25.47767069, 14.61603384, -1.641785416,
17.30536039, -2.202076058, 12.92044019, 12.20305106, 11.97925859,
3.931326877)), .Names = c("Azimuth", "Altitude"), class = "data.frame", row.names = c(NA, -19L))
Try qqPlot, with a capital P (from the car package).
Or maybe you just want to create the graph. Have you tried this?
qqnorm(azimuth$Azimuth);qqline(azimuth$Azimuth)
It seems that the qqplot function takes two input parameters, x and y as follows:
qqplot(x, y, plot.it = TRUE, xlab = "your x-axis label", ylab="your y-axis label", ...)
When you made your call as given above, you only gave one vector, hence R complained that the y argument was missing. Check your input data set and see if you can find what x and y should be for your call to qqplot.
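To make the difference concrete, here is a small sketch using the first six rows of the sample data from the question. qqnorm() compares one sample against the normal distribution, while qqplot() compares two samples against each other; pairing Azimuth with Altitude in qqplot() is only for illustration and may not be the comparison you actually want.

```r
# First six rows of the sample data from the question
az  <- c(23.33211466, 31.51267873, 29.04577711, 23.63450905, 29.12535459, 20.76972007)
alt <- c(-6.561729793, 4.801537153, 5.24504954, 14.03342708, 7.224141678, 47.95686329)

# One sample against the normal distribution: pass a numeric vector, not the data frame
qq <- qqnorm(az)
qqline(az)

# Two samples against each other: qqplot() needs both x and y
qqplot(az, alt, xlab = "Azimuth quantiles", ylab = "Altitude quantiles")
```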
How does the following code work? I got the example while reading the R help page ?curve, but I have not understood it.
for(ll in c("", "x", "y", "xy"))
curve(log(1+x), 1, 100, log = ll,
sub = paste("log= '", ll, "'", sep = ""))
In particular, I am accustomed to numeric values as arguments in a for-loop, as in
for(ll in 1:10)
But what is the following command saying:
for(ll in c("","x","y","xy"))
c("","x","y","xy") looks like a string vector? How does c("","x","y","xy") work inside the curve
function, in log(1+x) (what is x here? the string "x" in c("","x","y","xy")?) and in log=ll?
Apparently, there are no answers on Stack Overflow about how the curve function in R works, and especially about the log argument, so this might be a good chance to delve into it a bit more (I liked the question, by the way):
First of all the easy part:
c("","x","y","xy") is a string vector or more formally a character vector.
for(ll in c("","x","y","xy")) will start a loop of 4 iterations and each time ll will be '','x','y','xy' respectively. Unfortunately, the way this example is built you will only see the last one plotted which is for ll = 'xy'.
Let's dive into the source code of the curve function to answer the rest:
First of all, what does the x represent in log(1+x)?
log(1+x) is a function. x represents a vector of numbers that gets created inside the curve function in the following part (from source code):
x <- exp(seq.int(log(from), log(to), length.out = n)) #if the log argument is 'x' or
x <- seq.int(from, to, length.out = n) #if the log argument is not 'x'
#in our case from and to are 1 and 100 respectively
As long as the n argument is the default the x vector will contain 101 elements. Obviously the x in log(1+x) is totally different to the 'x' in the log argument.
As for y, it is always created as (from the source code):
y <- eval(expr, envir = ll, enclos = parent.frame()) # expr is log(1+x) here; this ll is a list internal to curve that holds the x vector, not the loop variable from the example
# i.e. you get a y value for each x value in the x vector that was calculated just before
Second, what is the purpose of the log argument?
The log argument decides which of the x or y axes will be on a log scale: the x-axis if the log argument is 'x', the y-axis if it is 'y', both axes if it is 'xy', and neither if it is ''.
It needs to be mentioned here that the log scaling of the axes is done by the plot function that curve calls internally; in that sense curve is only a wrapper around plot.
This is also why, when the log argument contains 'x' (see above), the x vector is built as the exponential of equally spaced log values: once plot draws the x-axis on a log scale, the points end up evenly spaced along it.
P.S. The source code of the curve function can be seen by typing graphics::curve in the console.
I hope this makes a bit of sense now!
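A quick way to check the points above is to draw the four variants side by side and inspect what curve returns (it invisibly returns a list with the x and y values it plotted). The 2x2 layout via par(mfrow = ...) is my addition so that all four plots stay visible; it is not part of the original help example.

```r
# Show all four log settings at once instead of overwriting the same plot
op <- par(mfrow = c(2, 2))
for (ll in c("", "x", "y", "xy")) {
  curve(log(1 + x), 1, 100, log = ll, sub = paste0("log = '", ll, "'"))
}
par(op)

# curve() invisibly returns the evaluated points
res_lin <- curve(log(1 + x), 1, 100, log = "")
res_log <- curve(log(1 + x), 1, 100, log = "x")

length(res_lin$x)          # number of points, n = 101 by default
range(diff(res_lin$x))     # equal steps on the linear scale
range(diff(log(res_log$x)))  # equal steps on the log scale when log contains "x"
```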
In the following R code, I try to create 30 histograms of the variable allowed.clean by the factor zip_cpt (which has 30 levels).
For each of these histograms, I also want to add mean and sample size--they need to be calculated for each level of the factor zip_cpt. So I used panel.text to do this.
After running this code, I got an error message inside each histogram which reads "Error using packet 21..."x" is missing, with..." (I am not able to read the whole error message because it is cut off in the panel). I guess there's something wrong with the object x. Is it because mean(x) and length(x) don't actually apply to the data at each level of the factor zip_cpt?
I appreciate any help!
histogram(~allowed.clean|zip_cpt,data=cpt.IC_CAB1,
type='density',
nint=100,
breaks=NULL,
layout=c(10,3),
scales= list(y=list(relation="free"),
x=list(relation="free")),
panel=function(x,...) {
mean.values <-mean(x)
sample.n <- length(x)
panel.text(lab=paste("Sample size = ",sample.n))
panel.text(lab=paste("Mean = ",mean.values))
panel.histogram(x,col="pink", ...)
panel.mathdensity(dmath=dnorm, col="black",args=list(mean=mean(x, na.rm = TRUE),sd=sd(x, na.rm = TRUE)), ...)})
A discussion I found online is helpful for adding customized text (e.g., basic statistics) on each of the histograms:
https://stat.ethz.ch/pipermail/r-help/2007-March/126842.html
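A likely cause of the error is that panel.text is called with only a label: it also needs x and y coordinates telling it where to draw inside the panel. Within the panel function, mean(x) and length(x) do refer to the data of that panel only. Below is a minimal sketch on made-up data (the data frame d and its columns are hypothetical, not the cpt.IC_CAB1 data), placing the text in the top-left corner using the panel limits.

```r
library(lattice)

# Made-up data standing in for allowed.clean and zip_cpt
set.seed(42)
d <- data.frame(value = rnorm(300, mean = 50, sd = 10),
                group = factor(rep(c("a", "b", "c"), each = 100)))

p <- histogram(~ value | group, data = d, type = "density",
  panel = function(x, ...) {
    panel.histogram(x, col = "pink", ...)
    # x holds only the observations of the current panel
    lab <- sprintf("n = %d\nmean = %.1f", length(x), mean(x, na.rm = TRUE))
    # panel.text needs coordinates; use the top-left corner of the panel
    lim <- current.panel.limits()
    panel.text(x = lim$xlim[1], y = lim$ylim[2], labels = lab,
               adj = c(-0.1, 1.2), cex = 0.8)
  })
print(p)
```

Drawing the histogram first and the text afterwards keeps the label on top of the bars.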
This question might have been repeated. But even after going through the previous links, I am not able to solve this. I have a file as follows:
data <- read.table("data.txt", header=TRUE)
Samp1 Samp2 Samp3
cg00000029 0.79015390399987 0.8301816 0.8966661
cg00000108 0.970260858767027 0.9655997 0.9699428
cg00000109 0.948456317952246 0.9209855 0.9325146
cg00000165 0.267769194351135 0.2370634 0.3867273
I wish to create a density plot out of the column (say Samp1). When I use the following
plot(density(na.omit(data$Samp1)), col="black")
I get the following error:
Error in density.default(na.omit(data$Samp1)) : argument 'x' must be numeric
Can anyone help me figure out how to correct this? I have created density plots for similar files, but did not get this error. It is only for this file.
Your help appreciated.
Thanks in advance..
Well, for some reason your data is non-numeric: have you tried using as.numeric() to coerce it into the right type?
Edit: Using unlist() to convert it out of a list type seems to be the answer
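For what it's worth, here is a sketch of how to diagnose and fix this kind of error. I am assuming the column was read in as a factor or as text (for example because of a stray non-numeric entry in the file); the small data frame below is made up to reproduce that situation and is not the original data.txt.

```r
# Made-up column that was read in as a factor instead of numbers
df <- data.frame(Samp1 = c("0.790", "0.970", "0.948", "0.268"),
                 stringsAsFactors = TRUE)

# Inspect the type first: density() needs a numeric vector
str(df$Samp1)

# Go through as.character() so the labels are converted, not the factor codes
vals <- as.numeric(as.character(df$Samp1))

d <- density(na.omit(vals))
plot(d, col = "black")
```

If the column turns out to be a list rather than a factor, unlist() followed by as.numeric() is the analogous step.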
I was getting the same problem and was able to solve it by doing:
a<- density(treeDF$real)
plot(a$x, a$y, col="turquoise2")
b<- density(treeDF$rpart)
lines(b$x, b$y, col="deeppink3")
c<- density(treeDF$rpart_Fourier)
lines(c$x, c$y, col="blue")
legend("topright", legend = c("real value", "rpart()", "rpart() + Fourier tem"),
col=c("turquoise2", "deeppink3", "blue"), lty=1:1, cex=0.8, box.lty = 0)
How can I have a data set of only time intervals (no dates) in R, like the following:
TREATMENT_A TREATMENT_B
1:01:12 0:05:00
0:34:56 1:08:09
and compute mean times, etc., and draw boxplots with time intervals on the y-axis?
I am new to R, and I searched for this but found no example on the net.
Thanks
The chron-package has a 'times' class that supports arithmetic. You could also do all of that with POSIXct objects and format the date-time output to not include the date. I thought axis.POSIXct function has a format argument that should let you have time outputs. However, it does not seem to get dispatched properly, so I needed to construct the axis "by hand."
# Two groups of random date-times around the current time
dft <- data.frame(x = factor(sample(1:2, 100, replace = TRUE)),
                  y = Sys.time() + rnorm(100) * 4000)
boxplot(y ~ x, data = dft, yaxt = 'n')
# Tick positions every 3000 seconds across the range of y, labelled as times only
ticks <- seq(from = range(dft$y)[1], to = range(dft$y)[2], by = 3000)
axis(2, at = ticks, labels = format(ticks, format = "%H:%M:%S"))
There did turn out to be an appropriate method, Axis.POSIXt (to which I thought boxplot should have been turning for plotting, but it did not seem to recognize the class of the 'y' argument):
boxplot(y~x, data=dft, yaxt='n')
Axis(side=2, x=range(dft$y), format ="%H:%M:%S")
Regarding your request for something "simpler", take a look at this ggplot2-based solution, using the dft data frame defined above with POSIXct times. (I did try with a chron times object but got a message saying ggplot did not support that class):
require(ggplot2); p <- ggplot(dft, aes(x,y))
p + geom_boxplot()
Check out the "lubridate" package, and the "hms" function within it.
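If you would rather avoid extra packages, a base-R sketch is to convert the "h:mm:ss" strings to seconds, do the arithmetic on plain numbers, and format the results back for display. The helper names to_seconds and fmt_hms are my own, not functions from any package; the data are the two rows from the question.

```r
# Data from the question, as character strings
trt <- data.frame(TREATMENT_A = c("1:01:12", "0:34:56"),
                  TREATMENT_B = c("0:05:00", "1:08:09"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: "h:mm:ss" -> seconds
to_seconds <- function(s) {
  parts <- strsplit(s, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical helper: seconds -> "h:mm:ss" (rounded to whole seconds)
fmt_hms <- function(sec) {
  sec <- round(sec)
  sprintf("%d:%02d:%02d",
          as.integer(sec %/% 3600),
          as.integer((sec %% 3600) %/% 60),
          as.integer(sec %% 60))
}

secs <- as.data.frame(lapply(trt, to_seconds))
mean_a <- fmt_hms(mean(secs$TREATMENT_A))   # mean time for treatment A

# Boxplot on the numeric scale, with the y-axis labelled as times
boxplot(secs, yaxt = "n", ylab = "Time (h:mm:ss)")
ticks <- pretty(range(unlist(secs)))
axis(2, at = ticks, labels = fmt_hms(ticks), las = 1)
```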