I am trying to add text (the nonlinearity p-value) to a plot of a restricted cubic spline regression model. When I run this in RStudio it works fine, but as soon as I add code to save the plot as a .tiff file, I get the following:
Error in strwidth(legend, units = "user", cex = cex, font = text.font) :
plot.new has not been called yet
This is my code (for a restricted cubic spline regression model for the association between determinant labtsh and endpoint ecard with 3 knots):
spline <- cph(Surv(ecard_f,ecard_n)~rcs(labtsh,3)+age+gender+ckdepi+smoking, data=smart)
tiff(file ="H:/documents/test.tiff", width = 1000, height =1000, units = "px", res = 145)
plot(Predict(spline,labtsh,fun=exp), pch = 20, las = 1,
conf.int=T,
main="Relationship TSH and myocardial infarction",
xlab="TSH (mIU/L)",
ylab="Hazard Ratio")
my.p <- anova(spline)[2,3] ##this is the value I want to add in the plot
rp = vector('expression',2)
rp[1] = substitute(expression(Non-linearity))[2]
rp[2] = substitute(expression(italic(p)-value == MYVALUE),
list(MYVALUE = format(my.p, digits = 3)))[2]
legend('topleft', legend = rp, bty = 'n')
dev.off()
As I said, if I remove the tiff() call and the dev.off() line, the code works fine.
I have searched for answers to similar questions here on Stack Overflow, but none of them seem to help.
I have 21 variables in 4 distinct forms (original data, averages, log and log1p transformations). I would like to plot them in one large (long) figure of 4 columns and 21 rows. I have tried many settings: different "width" and "height" values, units and resolutions, and the png, jpeg and pdf devices, in many combinations. I still can't produce the figure and keep getting the same old message:
Error in plot.new() : figure margins too large
Sometimes, depending on the values I choose, I get a poor figure, but when I try to open it Windows tells me that I don't have permission to open the file. I was told that R must be writing a corrupted file. I'm brand new to R and don't know what else to do. Any tips or reading material that could lead to a solution?
Here is my code. If the data frame with the variables is relevant, let me know and I'll figure out a way to provide it (it's huge).
#### 4 Histograms ####
#install.packages("png")
library(png)
jpeg(filename = "histograms.jpeg", width = 30, height = 60, units = "cm", pointsize = 12, bg = "white", res = 120)
par(pty = "m", mfrow = c(21, 4), cex.lab = 1 , cex.main = 0.9);
for(i in 1:ncol(climData_num_2)){
# All clim data
hist((climData_num_2)[,i],prob=TRUE,breaks =40, col ="thistle", xlab=(colnames((climData_num_2)[i])),main = paste("Histogram of" , colnames((climData_num_2)[i])))
curve(dnorm(x, mean=mean(((climData_num_2)[,i]),na.rm=TRUE), sd=sd(((climData_num_2)[,i]),na.rm=TRUE)), col="blue",lwd=2, add=TRUE)
lines(density(sort((climData_num_2)[,i])), col="red",lwd=2)  # lines() always adds to the current plot; it has no add argument
# Average variables
hist((av_cdn_2)[,i],prob=TRUE,breaks =40, col ="thistle", xlab=(colnames((av_cdn_2)[i])),main = paste("Histogram of" , colnames((av_cdn_2)[i])))
curve(dnorm(x, mean=mean(((av_cdn_2)[,i]),na.rm=TRUE), sd=sd(((av_cdn_2)[,i]),na.rm=TRUE)), col="blue",lwd=2, add=TRUE)
lines(density(sort((av_cdn_2)[,i])), col="red",lwd=2)
# Log transformed
hist(log(av_cdn_2)[,i],prob=TRUE,breaks =40, col ="thistle", xlab=(colnames((av_cdn_2)[i])),main = paste("Histogram of log" , colnames((av_cdn_2)[i])))
curve(dnorm(x, mean=mean((log(av_cdn_2)[,i]),na.rm=TRUE), sd=sd((log(av_cdn_2)[,i]),na.rm=TRUE)), col="blue",lwd=2, add=TRUE)
lines(density(sort(log(av_cdn_2)[,i])), col="red",lwd=2)
# log1p transformed
hist(log1p(av_cdn_2)[,i],prob=TRUE,breaks =40, col ="thistle", xlab=(colnames((av_cdn_2)[i])),main = paste("Histogram of log1p" , colnames((av_cdn_2)[i])))
curve(dnorm(x, mean=mean(log1p((av_cdn_2)[,i]),na.rm=TRUE), sd=sd(log1p((av_cdn_2)[,i]),na.rm=TRUE)), col="blue",lwd=2, add=TRUE)
lines(density(sort(log1p(av_cdn_2[,i]))), col="red",lwd=2)
}
dev.off()
All suggestions are appreciated.
Thank you in advance.
Edit:
data.frames to run the code:
av_cdn_2<-data.frame(replicate(21,sample(0:1000,30,rep=TRUE)))
climData_num_2<-data.frame(replicate(21,sample(0:1000,30,rep=TRUE)))
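"figure margins too large" is raised when the margins need more room than the space available to each panel. The following is only a rough sketch, assuming that is what happens here with 21 rows squeezed into 60 cm: it shrinks the margins with par(mar = ...) and makes the device taller. The data frame dat, the 90 cm height and the margin values are illustrative assumptions, and pdf() is used only so the sketch runs without image libraries; the same par() settings should apply to jpeg().

```r
# Stand-in data with the same shape as the data frames above (assumption)
set.seed(1)
dat <- data.frame(replicate(21, sample(0:1000, 30, replace = TRUE)))

out <- tempfile(fileext = ".pdf")

# Taller device: 30 cm x 90 cm, converted to inches for pdf()
pdf(file = out, width = 30 / 2.54, height = 90 / 2.54)

# Small margins (bottom, left, top, right, in lines) leave room for each panel
par(mfrow = c(21, 4), mar = c(2, 2, 1.5, 0.5), cex.main = 0.9)

for (i in seq_len(ncol(dat))) {
  for (k in 1:4) {  # four placeholder panels per variable
    hist(dat[, i], prob = TRUE, breaks = 40, col = "thistle",
         xlab = "", main = names(dat)[i])
  }
}
dev.off()
```

If the panels are still too cramped, increasing the height further or splitting the figure over several files are the obvious next things to try.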
I'm trying to follow the instructions given in a PDF to construct a co-expression network. One of the first steps is constructing a dendrogram. This is the code.
The link to LiverFemale3600.csv is here in a zipped file.
# Load the WGCNA package
library(WGCNA);
options(stringsAsFactors = FALSE);
#Read in the female liver data set
femData = read.csv("LiverFemale3600.csv");
datExpr0 = as.data.frame(t(femData[, -c(1:8)]));
names(datExpr0) = femData$substanceBXH;
rownames(datExpr0) = names(femData)[-c(1:8)];
sampleTree = hclust(dist(datExpr0), method = "average");
par(cex = 0.6);
par(mar = c(0,4,2,0))
plot(sampleTree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5, cex.axis = 1.5, cex.main = 2)
Here plot() doesn't draw anything in RStudio. The plot window is blank, but no error is returned either.
When I run show(sampleTree), I get the following:
> show(sampleTree)
Call:
hclust(d = dist(datExpr0), method = "average")
Cluster method : average
Distance : euclidean
Number of objects: 135
Run just the plot line if you want your plot to appear in RStudio's plotting frame. Otherwise running the par lines will open a separate plotting window, and the graph will not appear in your normal plotting frame in RStudio.
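A minimal sketch of one way to check this, assuming a stray graphics device is the cause: close every open device with graphics.off(), then run the par() and plot() calls together so they target the same device. USArrests stands in for the liver data here, which is an assumption made purely for illustration.

```r
# Close any stray graphics devices so the next plot goes to the default device
graphics.off()

# USArrests is a stand-in for datExpr0 (assumption, not the original data)
sampleTree <- hclust(dist(USArrests), method = "average")

# par() opens the default device if none is open; plot() then draws on it
op <- par(cex = 0.6, mar = c(0, 4, 2, 0))
plot(sampleTree, main = "Sample clustering to detect outliers",
     sub = "", xlab = "", cex.lab = 1.5, cex.axis = 1.5, cex.main = 2)
par(op)  # restore the previous graphical settings
```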
I've recently been analysing my data and want to make the graphs a little nicer, but I'm failing at this.
I have a data set with 144 sites and 5 environmental variables. It's basically about the substrate composition around an island and the fish abundance. On this island there is supposed to be a difference in substrate composition between the north side and the south side. Right now I am doing a PCA, and the biplot function works quite well, but I would like to change the plot a bit.
I need a plot where the sites are just points and not numbered, arrows point to the different variables, and the sites are colored according to their location (north or south side). So I tried everything I could find.
Most examples were with the dune data and suggested something like this:
library(vegan)
library(biplot)
data(dune)
mod <- rda(dune, scale = TRUE)
biplot(mod, scaling = 3, type = c("text", "points"))
So according to this I would just need to specify text and points, and R would label the variables and draw points for the sites. When I do this, however, I get the error:
Error in plot.default(x, type = "n", xlim = xlim, ylim = ylim, col = col[1L], :
formal argument "type" matched by multiple actual arguments
I have no idea how to get around this.
The next strategy I found is to build the plot manually like this:
require("vegan")
data(dune, dune.env)
mod <- rda(dune, scale = TRUE)
scl <- 3 ## scaling == 3
colvec <- c("red2", "green4", "mediumblue")
plot(mod, type = "n", scaling = scl)
with(dune.env, points(mod, display = "sites", col = colvec[Use],
scaling = scl, pch = 21, bg = colvec[Use]))
text(mod,display="species", scaling = scl, cex = 0.8, col = "darkcyan")
with(dune.env, legend("bottomright", legend = levels(Use), bty = "n",
col = colvec, pch = 21, pt.bg = colvec))
This works fine so far as well: I get different colors and points, but now the arrows are missing. I read that this should be easy to fix by putting display = "bp" in the text line. But this doesn't work either. Every time I use "bp", R says:
Error in match.arg(display) :
argument "display" is missing, with no default
So I'm kind of desperate now. I looked through all the answers here and I don't understand why display = "bp" and type = c("text", "points") are not working for me.
If anyone has an idea I would be super grateful.
https://www.dropbox.com/sh/y8xzq0bs6mus727/AADmasrXxUp6JTTHN5Gr9eufa?dl=0
This is the link to my Dropbox folder. It contains my R script and the csv files. The one named environmentalvariables_Kon1 also contains the north/south side data.
So if anyone could help me, that would be awesome. I really don't know what to do anymore.
Best regards,
Nancy
You can add arrows with arrows(). See the code for vegan:::biplot.rda to see how it works in the original function.
With your plot, add
g <- scores(mod, display = "species", scaling = scl)  # use the same scaling as the plot
len <- 1
arrows(0, 0, len * g[, 1], len * g[, 2], length = 0.05, col = "darkcyan")
You might want to adjust the value of len to make the arrows longer or shorter.
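The same idea (points for sites, colors by group, arrows from the origin for variables) can be sketched in base R without vegan. This is only an illustration under stated assumptions: iris stands in for the site-by-variable table, Species stands in for the north/south factor, and prcomp() replaces rda().

```r
# Base-R sketch: iris stands in for the site-by-variable table (assumption)
pca   <- prcomp(iris[, 1:4], scale. = TRUE)
sites <- pca$x[, 1:2]         # site scores on PC1 and PC2
vars  <- pca$rotation[, 1:2]  # variable loadings on PC1 and PC2
len   <- 3                    # arrow stretch factor, tune by eye

colvec <- c("red2", "green4", "mediumblue")
grp    <- iris$Species        # grouping factor (stand-in for north/south side)

# Empty plot first, then sites as colored points
plot(sites, type = "n", asp = 1)
points(sites, pch = 21, col = colvec[grp], bg = colvec[grp])

# Variables as arrows from the origin, labelled just beyond the arrow tips
arrows(0, 0, len * vars[, 1], len * vars[, 2], length = 0.05, col = "darkcyan")
text(1.1 * len * vars[, 1], 1.1 * len * vars[, 2], labels = rownames(vars),
     col = "darkcyan", cex = 0.8)

legend("bottomright", legend = levels(grp), bty = "n",
       col = colvec, pch = 21, pt.bg = colvec)
```

Indexing colvec by the factor works because R uses the factor's integer codes, so each level gets one color.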
I've used the 'PCA' function from the 'FactoMineR' package to obtain principal component scores. I've read through the package documentation and similar questions on this forum, but I can't figure out how to change the line type of the arrows that represent supplementary variables on the variables factor map. By default these are blue dashed lines, and I cannot find a way to make them solid.
I don't use ggplot and would really like to know whether there is a solution for this kind of plot:
plot(res, choix="var")
Does anyone know how to do this?
For example, here is some code:
library(FactoMineR)
data("decathlon")
res <- PCA(decathlon,quanti.sup = 10:12,quali.sup = 13) #this would generate an automatic plot but I'd prefer working on a personal plot
windows()
plot(res,choix="var", shadow = T, title="", cex = 1.2, cex.lab = 1.3)
The dashed line parameter is hard-coded, so you can't alter it when calling the function. Here is the exact line of code the function is calling:
arrows(0, 0, coord.quanti[q, 1], coord.quanti[q, 2], length = 0.1, angle = 15, code = 2, lty = 2, col=coll2[q])
If you need to change it, you would have to get the code from GitHub and change lty=2 to lty=1 on line 350 (shown above), or make it an optional input parameter of the function and set lty from that parameter. Then you would call your modified plot.PCA(res,choix="var", shadow = T, title="", cex = 1.2, cex.lab = 1.3)
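To illustrate the "optional input parameter" idea without touching the package source, here is a small base-R sketch. draw_var_arrows() is a hypothetical helper, not a FactoMineR function, and the mtcars correlations are stand-in coordinates chosen only so the example runs on its own.

```r
# Hypothetical helper (not part of FactoMineR): draws variable arrows with a
# caller-supplied line type instead of a hard-coded lty = 2
draw_var_arrows <- function(coord, lty = 1, col = "blue") {
  arrows(0, 0, coord[, 1], coord[, 2],
         length = 0.1, angle = 15, code = 2, lty = lty, col = col)
  invisible(coord)
}

# Stand-in coordinates: correlations of four mtcars variables with PC1 and PC2
pca   <- prcomp(mtcars[, 1:4], scale. = TRUE)
coord <- cor(mtcars[, 1:4], pca$x[, 1:2])

# Empty frame with a unit circle, as on a variables factor map
plot(0, 0, type = "n", xlim = c(-1, 1), ylim = c(-1, 1), asp = 1,
     xlab = "Dim 1", ylab = "Dim 2")
symbols(0, 0, circles = 1, inches = FALSE, add = TRUE)

drawn <- draw_var_arrows(coord, lty = 1)  # solid arrows; lty = 2 gives dashed
```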
Hoping for some pointers or some experienced insight, as I'm literally losing my mind over this. I've been trying for 2 full days to find the right settings to get clean, simple line plots out of the gbm.plot function (packages dismo & gbm).
Here's where I start. bty=n in par to turn off the box & leave me with only left & bottom axes. Gbm.plot typically spits out one plot per explanatory variable, so usually 6 plots etc, but I'm tweaking it to do one per variable & looping it. I've removed the loop & lots of other code so it's easy to see what's going on.
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1, #this is part of the multiple plots thing, calls the explanatory variable
lwd=8, #this controls the width of the main result line ONLY
rug=F)
dev.off()
So this is what the starting condition looks like. Aim: make the axes & ticks thicker. That's it.
Putting "lwd=20" in par does nothing.
Adding axes=F into gbm.plot() turns the axes and their numbers off. So I conclude that the control of these axes is handled by gbm.plot, not par. Here's where it gets frustrating and crap. Accepted wisdom from searches says that lwd should control this, but it only controls the wiggly centre line, as per my note above. So maybe I could add axis(side=1, lwd=8) into gbm.plot()?
It runs but inexplicably adds a smoother! (which is very thin & hard to see on the web but it's there, I promise). It adds these warnings:
In if (smooth & is.vector(predictors[[j]])) { ... :
the condition has length > 1 and only the first element will be used
Fine, R's going to be a dick for seemingly no reason, I'll keep plugging the leaks as they come up. New code with axis as before and now smoother turned off:
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1,
lwd=8,
rug=F,
smooth=F,
axis(side=1,lwd=8))
dev.off()
Gives error:
Error in axis(side = 1, lwd = 8) : plot.new has not been called yet
So it's CLEARLY drawing axes within plot since I can't affect the axes from par and I can turn them off in plot. I can do what I want and make one axis bold, but that results in a smoother and warnings. I can turn the smoother off, but then it fails because it says plot.new hadn't been called. And this doesn't even account for the other axis I have to deal with, which also causes the plot.new failure if I call 2 axis sequentially and allow the smoother.
Am I the butt of a big joke here, or am I missing something obvious? It took me long enough to work out that par is supposed to be before all plots unless you're outputting them with png etc in which case it has to be between png & plot - unbelievably this info isn't in ?par. I know I'm going off topic by ranting, sorry, but yeah, 2 full days. Has this been everyone's experience of plotting in R?
I'm going to open the vodka in the freezer. I appreciate I've not put the full reproducible code here, apologies, I can do if absolutely necessary, but it's such a huge timesuck to get to reproducible stage and I'm hoping someone can see a basic logical/coding failure screaming out at them from what I've given.
Thanks guys.
EDIT: reproducibility
core data csv: https://drive.google.com/file/d/0B6LsdZetdypkWnBJVDJ5U3l4UFU
(I've tried to make these data reproducible before and I can't work out how to do so)
library(dismo)  # provides gbm.step() and gbm.plot()
samples<-read.csv("data.csv", header = TRUE, row.names=NULL)
my_gbm_model<-gbm.step(data=samples, gbm.x=1:6, gbm.y=7, family = "bernoulli", tree.complexity = 2, learning.rate = 0.01, bag.fraction = 0.5)
Here's what will widen your axis ticks:
..... , lwd.ticks=4 , ...
I predict (on the basis of no testing, because I keep getting errors with the limited code you have provided) that it will get handled correctly either in gbm.plot or in a subsequent axis call. There will need to be a subsequent axis call, two of them in fact (because, as you noted, 'lwd' gets passed around indiscriminately):
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1,
lwd=8,
rug=F,
smooth=F, axes=FALSE)  # suppress the default axes; they are drawn by the axis() calls below
axis(1, lwd.ticks=4, lwd=4)
# the only way to prevent `lwd` from also affecting plot line
axis(2, lwd.ticks=4, lwd=4)
dev.off()
This is what I see with a simple example:
png()
Speed <- cars$speed
Distance <- cars$dist
plot(Speed, Distance,
panel.first = lines(stats::lowess(Speed, Distance), lty = "dashed"),
pch = 0, cex = 1.2, col = "blue", axes=FALSE)
axis(1, lwd.ticks=4, lwd=4)
axis(2, lwd.ticks=4, lwd=4)
dev.off()