I have the following script:
FGM = function (n,r,z){
x = r*sqrt(n)/(2*z)
Px = 1-pnorm(x)
}
re = 10000
data = data.frame(abs(rnorm(re,0,1)), abs(rnorm(re,0,1)), abs(rnorm(re,0,1)))
colnames(data) = c("n","r","z")
data$Px = FGM(data$n,data$r,data$z)
data$x = data$r*sqrt(data$n)/(2*data$z)
par(mar=c(4.5,4.5,1,1))
plot(data$x,data$Px, xlim = c(0,3), pch = 19, cex = 0.1, xaxs="i", yaxs="i",
xlab = expression(paste("Standardized mutational size (",italic(x), ")")),
ylab = expression(paste("P"[a],"(",italic(x),")")))
This recreates the graph found here (box 2). As the script shows, I do this by plotting 10000 small black points for various values of n, z, and r. This seems like an ugly workaround; I think I should be able to just give R my function
FGM = function (n,r,z){
x = r*sqrt(n)/(2*z)
Px = 1-pnorm(x)
}
and have it plot a line on a graph. However, a few hours of scouring the web have been unproductive. I tried a few approaches with abline and lines, but nothing worked. Is there a way of doing it with these functions, or with another function?
Tried this...
plot(data$x,data$Px, xlim = c(0,3), ylim = c(0,0.5), xaxs="i", yaxs="i",
xlab = expression(paste("Standardized mutational size (",italic(x), ")")),
ylab = expression(paste("P"[a],"(",italic(x),")")), type = "n")
curve(1-pnorm(r*sqrt(n)/(2*z)), add=T)
Error in curve(1 - pnorm(r * sqrt(n)/(2 * z)), add = T) :
'expr' must be a function, or a call or an expression containing 'x'
@PaulRegular offered this solution, but it still plots based on the data, not the formula itself. I'm looking for a solution that draws the curve properly without a large value of "re". Using the following with "re" set to 10, you can see what I mean:
data <- data[order(data$x),]
lines(data$x, data$Px, lwd=1)
You can pass a function of just one variable to plot. I guess that you are looking for:
plot(function(x) 1-pnorm(x),0,3)
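The error from curve() in the question arises because curve() expects an expression written in terms of x (or a function of one argument), and r, n and z are not x. Since the probability depends on n, r and z only through x = r*sqrt(n)/(2*z), the curve can be drawn directly in x. A minimal sketch, reusing the axis labels and the 0 to 3 range from the question (n = 201 evaluation points is an arbitrary choice):

```r
# Draw P_a(x) = 1 - pnorm(x) straight from the formula; no simulated data needed.
par(mar = c(4.5, 4.5, 1, 1))
pts <- curve(1 - pnorm(x), from = 0, to = 3, n = 201,
             xaxs = "i", yaxs = "i",
             xlab = expression(paste("Standardized mutational size (", italic(x), ")")),
             ylab = expression(paste("P"[a], "(", italic(x), ")")))

# curve() invisibly returns the x and y values it evaluated
head(data.frame(x = pts$x, Pa = pts$y))
```

To overlay the line on an existing plot instead, the same expression can be passed with add = TRUE.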
Try sorting your data by x, then add the line:
data <- data[order(data$x),]
lines(data$x, data$Px, lwd=2)
This sounds like a really basic question, and it probably is, but I can't figure out how to change the line width when plotting a locfit object. If you do a simple test, such as:
plot(locfit::locfit(~rnorm(n = 1000)))
and compare it with
plot(locfit::locfit(~rnorm(n = 1000)), lwd = 2.0)
You will see that the plotted line has the same thickness in both plots. So using lwd does not work when plotting a locfit object? Is there any workaround?
Thanks!
You could use the fitted model to predict the output and draw it with lines() on an empty plot, which makes it possible to change the line width with lwd, like this:
library(locfit)
#> locfit 1.5-9.7 2023-01-02
set.seed(7)
x <- rnorm(n = 1000) # draw the sample once so the fit and the grid share the same data
fit <- locfit(~x)
plot(fit)
xvalues <- seq(min(x), max(x), length.out = 100)
pred <- predict(fit, xvalues)
plot(1, type="n", xlab="", ylab="", xlim=c(-3, 3), ylim=c(0, 0.4))
lines(xvalues, pred, lwd = 10)
Created on 2023-02-09 with reprex v2.0.2
There is not currently a way to do that in the existing function. When you call plot() on the locfit object, it calls preplot.locfit() on your object, and then plot.preplot.locfit(), which calls plot.locfit.1d(). The relevant lines from the code are:
plot(xev[ord], yy[ord], type = "n", xlab = xlab, ylab = ylab,
main = main, xlim = range(x$xev[[1]]), ylim = ylim,
...)
}
lines(xev[ord], yy[ord], type = type, lty = lty, col = col)
As you can see, the ... goes through to the plot function, but the line actually gets added with lines(), which is not passed the extra arguments in ... (so lwd never reaches it).
We have been told to make a histogram with a density line using our given data. I think I can make the histogram correctly. However, we were told to use bw='sj' in our density function, and I do not understand how to put this to use.
I tried putting it in the hist() function, as I thought it was a parameter of that function, but I get warnings that say:
Warning messages:
1: In plot.window(xlim, ylim, "", ...) : "bw" is not a graphical parameter
2: In title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...) :
"bw" is not a graphical parameter
3: In axis(1, ...) : "bw" is not a graphical parameter
4: In axis(2, ...) : "bw" is not a graphical parameter
This is part of my code that deals with the problem in R.
# histogram 1
rdi4p <- data_shhs[,'rdi4p']
hist(rdi4p ,probability=TRUE,col=rgb(0,0,1,1/4),breaks=30,
xlab="rdi4p",
main="Histogram 1",col.axis="blue")
lines(x=density(x= rdi4p),type="l",col="blue",lwd=3)
Of course, I don't have your data to work on (in particular, we would need to know what rdi4p and sj were to make this fully reproducible), so I'll make up my own values for these variables:
set.seed(1) # Make example reproducible
rdi4p <- rnorm(1000) # Vector of 1000 samples from normal distribution
sj <- diff(range(rdi4p))/30 # 1/30 of the range of vector rdi4p
Now we draw the histogram using your code:
hist(rdi4p, probability = TRUE, col = rgb(0, 0, 1, 1/4), breaks = 30,
xlab = "rdi4p", main = "Histogram 1", col.axis = "blue")
and then we add the line. Note that we have to pass the parameter bw = sj to the density function, which is itself sitting inside the call to lines:
lines(x = density(x = rdi4p, bw = sj), type = "l", col = "blue", lwd = 3)
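One caveat on the above: if the assignment meant bw='sj' literally, as a character string, then no separate sj variable is needed. density() accepts the name of a bandwidth selector as a string, and "SJ" selects the Sheather-Jones method (the same value bw.SJ() computes). A sketch under that assumption, using the made-up rdi4p sample as a stand-in for the real data:

```r
set.seed(1)
rdi4p <- rnorm(1000)                  # made-up sample standing in for the real rdi4p

hist(rdi4p, probability = TRUE, col = rgb(0, 0, 1, 1/4), breaks = 30,
     xlab = "rdi4p", main = "Histogram 1", col.axis = "blue")

# bw = "SJ" asks density() to pick the bandwidth with the Sheather-Jones selector
dens <- density(rdi4p, bw = "SJ")
lines(dens, col = "blue", lwd = 3)

dens$bw                               # the bandwidth that was chosen
```

Either way, the key point stands: bw belongs in the call to density(), not hist().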
I am trying to plot a few graphs using loops. I describe the details below.
First, I have a function that calculates the y-variable (called effect, for the vertical axis):
effect<- function (x, y){
exp(-0.35*log(x)
+0.17*log(y)
-0.36*sqrt(log(x)*log(y)/100))
}
Now I run the following code and use par(new=TRUE) to plot the lines in the same graph. I use axes=F and xlab="" to get a plot without axes or labels, so that the labels are not redrawn on top of each other each time the loop runs, which looks ugly.
for (levels in seq(exp(8), exp(10), length.out = 5)){
x = seq(exp(1),exp(10), length.out = 20)
prc= effect(levels,x)
plot(x, prc,xlim = c(0,max(x)*1.05), ylim=c(0.0,0.3),
type="o", xlab = "",ylab = "", pch = 16,
col = "dark blue", lwd = 2, cex = 1, axes = F)
label = as.integer(levels) #x variable
text(max(x)*1.03,max(prc), label )
par(new=TRUE)
}
Finally, I duplicate the plot command, this time using the xlab and ylab options:
plot(x, prc, xlab = "X-label", ylab = "effect",
xlim = c(0,max(x)*1.05), ylim = c(0,0.3),
type="l", col ='blue')
I have several other plots along similar lines, using complex equations. I have two questions:
Is there a better option to get the same plot with smoother lines?
Is there an easier option, with fewer lines of code, to achieve the same result, where I can place the text (levels) for each line on the right with a white background behind it?
I found working with the plot function tedious and time-consuming, so I finally used ggplot2 to plot instead. There was plenty of help available online, which I used.
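For anyone who prefers to stay in base graphics, here is a rough sketch of one way to address both questions without par(new=TRUE): evaluate effect() on a denser grid for smoother lines, draw all curves with a single matplot() call, and paint a white rectangle behind each label. The grid size of 200, the widened xlim, and the label positions are arbitrary choices for illustration, not part of the original code.

```r
effect <- function(x, y) {
  exp(-0.35 * log(x) + 0.17 * log(y) - 0.36 * sqrt(log(x) * log(y) / 100))
}

lvls <- seq(exp(8), exp(10), length.out = 5)
x    <- seq(exp(1), exp(10), length.out = 200)   # denser grid gives smoother lines

# One column of effect values per level
prc <- sapply(lvls, function(l) effect(l, x))

# A single call draws every line, and the axes and labels exactly once
matplot(x, prc, type = "l", lty = 1, lwd = 2, col = "darkblue",
        xlim = c(0, max(x) * 1.1), ylim = c(0, 0.3),
        xlab = "X-label", ylab = "effect")

# Label each line at its right-hand end, on a white background
labs <- as.character(as.integer(lvls))
xpos <- max(x) * 1.05
ypos <- prc[nrow(prc), ]
w <- strwidth(labs) / 2
h <- strheight(labs) / 2
rect(xpos - w, ypos - h, xpos + w, ypos + h, col = "white", border = NA)
text(xpos, ypos, labs)
```

Because the axes are drawn only once, there is no need to suppress and then redraw the labels as in the loop version.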
I am quite new to R programming and have been given the task of representing some data in a boxplot. We were only provided with the five-figure summary of the data, i.e. the lowest value, lower quartile, median, upper quartile, and highest value. We are also told the number of samples (n).
I read that bxp is a function similar to boxplot, but that it draws the boxplot from this five-figure summary.
However, I know varwidth can be used to make the width of the boxes proportional to n, yet it does not seem to work here, as all boxes are the same width. This is what I need help with.
MORSEYear1 <- c(18.2,58.5,64.4,73.4,91.1)
MORSEYear2 <- c(22.3,56.4,64.3,75.7,97.4)
MORSEYear3 <- c(29.1,57.9,66.6,73.4,86.0)
MathStatYear1 <- c(46.8,54.8,66.1,71.4,84.1)
MathStatYear2 <- c(35.1,47.8,57.8,65.7,82.8)
MathStatYear3 <- c(32.6,56.3,61.1,75.6,89.4)
# stats must be a 5 x 1 matrix; matrix()'s 2nd and 3rd arguments are nrow and ncol, not the data range
MORSE1 <- list(stats=matrix(MORSEYear1, ncol=1), n=139)
MORSE2 <- list(stats=matrix(MORSEYear2, ncol=1), n=132)
MORSE3 <- list(stats=matrix(MORSEYear3, ncol=1), n=131)
MS1 <- list(stats=matrix(MathStatYear1, ncol=1), n=21)
MS2 <- list(stats=matrix(MathStatYear2, ncol=1), n=20)
MS3 <- list(stats=matrix(MathStatYear3, ncol=1), n=14)
bxp(MORSE1, xlim = c(0.5,6.5),ylim = c(0,100),varwidth= TRUE, main = "Graph comparing distribution of marks across different years of MORSE and MathStat",ylab = "Marks", xlab = "Course and year of study (Course,Year)", axes = FALSE)
par(new=T)
bxp(MORSE2, xlim = c(-0.5,5.5), ylim = c(0,100),axes= TRUE, varwidth=TRUE)
par(new=T)
bxp(MORSE3, xlim = c(-1.5,4.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS1, xlim = c(-2.5,3.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS2, xlim = c(-3.5,2.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS3, xlim = c(-4.5,1.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
NOTE: My supervisor said to use par(new=T) and change the xlim to plot multiple graphs using bxp(), if someone could verify if this is the best method or not that would be great!
Thanks
Stumbled upon the same problem, without much experience with R.
The varwidth argument of the bxp() function requires multiple boxplots being plotted at once. Adding to an initial plot does not count, as no readjustment is possible after the fact.
The question is how to construct a multidimensional z argument for bxp(). To answer this, a look at the result of something like boxplot(c(c(1,1),c(2,2))~c(c(11,11),c(22,22))) helps.
First, a generic example with made-up data to aid anyone that lands here:
# data
d1 <- c(1,2,3,4,5)
d2 <- c(1,2,3,5,8,13,21,34)
# summaries (generated with quantile and structured accordingly)
z1 <- list(
stats=matrix(quantile(d1, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d1)
)
z2 <- list(
stats=matrix(quantile(d2, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d2)
)
# merging the summaries appropriately
z <- list(
stats=cbind(z1$stats,z2$stats),
n=c(z1$n,z2$n)
)
# check result
print(z)
# call bxp with needed parameters ("at" can/should also be used here)
bxp(z=z,varwidth=TRUE)
In the case of the original question, one should merge MORSE# and MS#. The code is far from optimal - there might be a better way to merge and a function for this can be written, but the aim is ugly clarity and simplicity:
z <- list(
stats=cbind(MORSE1$stats, MORSE2$stats, MORSE3$stats, MS1$stats, MS2$stats, MS3$stats),
n=c(MORSE1$n, MORSE2$n, MORSE3$n, MS1$n, MS2$n, MS3$n)
)
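Putting the pieces together for the original question, a self-contained sketch might look like the following. It binds the six five-number summaries into a single 5 x 6 stats matrix and calls bxp() once, so that varwidth can compare the n values. The group labels passed via names are made up for illustration.

```r
# Five-number summaries (min, lower quartile, median, upper quartile, max)
stats <- cbind(
  c(18.2, 58.5, 64.4, 73.4, 91.1),   # MORSE year 1
  c(22.3, 56.4, 64.3, 75.7, 97.4),   # MORSE year 2
  c(29.1, 57.9, 66.6, 73.4, 86.0),   # MORSE year 3
  c(46.8, 54.8, 66.1, 71.4, 84.1),   # MathStat year 1
  c(35.1, 47.8, 57.8, 65.7, 82.8),   # MathStat year 2
  c(32.6, 56.3, 61.1, 75.6, 89.4)    # MathStat year 3
)

z <- list(
  stats = stats,
  n     = c(139, 132, 131, 21, 20, 14),
  names = c("MORSE,1", "MORSE,2", "MORSE,3", "MS,1", "MS,2", "MS,3")  # hypothetical labels
)

# One call draws all six boxes, so their widths can scale with sqrt(n)
bxp(z, varwidth = TRUE, ylim = c(0, 100),
    ylab = "Marks", xlab = "Course and year of study (Course,Year)")
```

This also removes the need for par(new=T) and the shifting xlim values, since all boxes share one set of axes.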
Run-on question following this problem setting axis widths in gbm.plot; I'm now using plot.gbm directly and don't seem to be able to remove the y axis label, which seems to be set within the plot.gbm function code.
png(filename="name.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res=NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, lwd=8, bty="n", mgp=c(1.6,0.5,0))
plot.gbm(my_gbm_model,1,return.grid=FALSE, write.title=F,lwd=8, ylab=F, axes=F, ylabel=FALSE, ylabel="")
axis(1, lwd.ticks=8, lwd=8, labels=FALSE)
axis(2, lwd.ticks=8, lwd=8, labels=NA, ylab=FALSE,ylabel=FALSE)
dev.off()
Result:
The y axis label is still there despite all my attempts to remove it through par, plot, and axis. I could try burrowing into the function and changing this (and similar) lines:
print(stripplot(X1 ~ temp | X2 * X3, data = X.new,
xlab = x$var.names[i.var[i[1]]],
ylab = paste("f(", paste(x$var.names[i.var[1:3]], collapse = ","), ")", sep = ""),
...))
...but I've been advised against such practices. Any thoughts on why this might not be working? Is it simply that the function overrides the setting?
Reproducibility:
#core data csv: https://drive.google.com/file/d/0B6LsdZetdypkWnBJVDJ5U3l4UFU
#(I've tried to make these data reproducible before and I can't work out how to do so)
library(dismo)
samples <- read.csv("data.csv", header = TRUE, row.names=NULL)
my_gbm_model <- gbm.step(data=samples, gbm.x=1:6, gbm.y=7, family = "bernoulli",
tree.complexity = 2, learning.rate = 0.01, bag.fraction = 0.5)
The problem is that plot.gbm just isn't a very R-like function. Since anyone can submit a package to CRAN, authors are not required to follow traditional R patterns, and that looks like what happened here. If you step through plot.gbm with your sample data, you see that ultimately the plotting is done with
plot(X$X1, X$y, type = "l", xlab = x$var.names[i.var], ylab = ylabel)
and the ylabel is set immediately before with no option to disable it. The authors simply provided no standard way to suppress the ylab for this particular plotting function.
In this case, the easiest way might just be to reduce the left margin so the label prints off the plot. Something like this seems to work:
par("mar"=c(5,2.2,4,2)+.1, fig=c(0,1,0.1,1), las=1, lwd=8, bty="n")
plot.gbm(my_gbm_model,1,return.grid=FALSE, write.title=F,lwd=8, ylab="", axes=F)
axis(1, lwd.ticks=8, lwd=8, labels=FALSE)
axis(2, lwd.ticks=8, lwd=8, labels=FALSE)
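Another option, which avoids hiding the label in the margin, is to let plot.gbm hand back the values instead of drawing them. The question's own call passes return.grid=FALSE; my understanding is that with return.grid=TRUE the function returns a data frame of predictor values plus a y column and draws nothing, so you can plot it yourself with full control. I have not verified this against the poster's model, so the sketch below uses a made-up data frame (grid) as a stand-in for that return value, and the column layout is an assumption.

```r
# Stand-in for: grid <- plot.gbm(my_gbm_model, 1, return.grid = TRUE)
# (assumed layout: first column = predictor values, column "y" = fitted function)
grid <- data.frame(X1 = seq(0, 10, length.out = 50))
grid$y <- sin(grid$X1 / 2)            # made-up values, for illustration only

# Plot with base graphics, where ylab = "" and axes = FALSE behave as expected
plot(grid[[1]], grid$y, type = "l", lwd = 8,
     xlab = names(grid)[1], ylab = "", axes = FALSE)
axis(1, lwd.ticks = 8, lwd = 8, labels = FALSE)
axis(2, lwd.ticks = 8, lwd = 8, labels = FALSE)
```

With the real model, the first line would be replaced by the return.grid=TRUE call shown in the comment.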