Using plot with title containing text, formulas and variables - r

I want to make the title of my plot contain text, formulas and variables. Consider the toy example where I want the title to read as:
Histogram of normal distribution with (mu/sigma) equal to (value of mu/sigma)
(where the first bracket is to be rendered as a formula)
Based on some questions around this site, I tried the following code:
x <- rnorm(1000)
mu <- 1
sigma <- 0
hist(x, main=bquote("Histogram of normal distribution with " *frac(mu,sigma)* " equal to ", .(mu/sigma) ) )
Now the problem is that the value of mu/sigma is not shown in the title.
How can I get the last bit to show?

Here's one way to do it:
title(main=substitute(paste("Histogram of normal distribution with ",
frac(mu,sigma), " equal to ", frac(m,s)),
list(m=mu, s=sigma)))
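If the title should show the computed value of mu/sigma rather than a second fraction, one possible variant is to keep bquote() but join every piece with `*`, so that .(mu/sigma) stays inside the expression instead of being passed as a second argument. This is a minimal sketch, not the answerer's method; the values of mu and sigma are made up for illustration.

```r
# Illustrative values (assumptions, not the asker's)
x <- rnorm(1000)
mu <- 2
sigma <- 4

# Every piece is juxtaposed with `*`. .(mu / sigma) is evaluated when bquote()
# runs, while frac(mu, sigma) stays symbolic and is drawn as a fraction.
ttl <- bquote("Histogram of normal distribution with " * frac(mu, sigma) *
              " equal to " * .(mu / sigma))

hist(x, main = ttl)
```

Passing the expression through main= in the hist() call replaces the default title, whereas a separate title() call draws on top of whatever title is already there.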

Related

How to make PDF output and code output look right in Rmarkdown

I'm working with a large collection of R Markdown files that are collectively used to create a book (PDF output) using bookdown. For long lines of code (that will show up in the book), I've been writing it out close to column 80, at which point I insert a hard line break, tab (4 spaces) over, and continue the code. This is what I think looks good in the text (for readers). However, I'm running into problems when this is (for example) part of a ggplot() function. Here's a quick (meaningless) example:
testx <- seq(1:100)
testy <- rnorm(testx)
testdat <- as.data.frame(cbind(testx, testy))
g <- ggplot(testdat, aes(x = testx/100, y = testy))
g <- g + geom_point() + geom_line()
g <- g + labs(title = "The Posterior Distribution",
caption = "The mode and 95% HPD intervals are the dot and horizontal line at
the bottom, respectively.",
x = expression(beta[1]~(slope)))
g
In this example, I need to break the caption argument across two lines in the source because it is long, but I do NOT want a line break in the rendered caption.
Is there a way to do this?
If I understood correctly, you can use paste0.
caption = paste0("The mode and 95% HPD intervals are the dot",
" and horizontal line at the bottom, respectively.")

Scatter plot in R doesn't use the x values in the variable indicated in the plot statement

I am trying to make a scatter plot in R between two numeric variables, and it uses the observation number as the x variable. This is the problem I'm trying to fix: I would like to have a scatter plot that uses the values of the x variable I indicated in the plot statement.
Yes, both the X variable and the Y variable are numeric.
I've attached a screenshot showing the data setup (Galton height data), the fact that the father and son variables are both numeric, and the resulting plot.
Here's the code that sets up the data and runs the scatter plot:
#install.packages("dplyr")
library('dplyr')
#tidyverse is name of package used for class
library(tidyverse)
remove.packages('HistData')
install.packages('HistData')
library(HistData)
data("GaltonFamilies")
childNum <- galton_heights[,6]
gender <- galton_heights[,8]
#Different code to get son height
#If we wanted to follow the lesson exactly, we would
#use the following
son_data <- GaltonFamilies[GaltonFamilies$gender == "male" & GaltonFamilies$childNum == 1,]
son <- son_data$childHeight
#Now we can compare the oldest child's height (if they happen to be male) with that of the father:
GaltonFamilies %>% summarize(mean(father), sd(father), mean(son), sd(son))
GaltonFamilies$father2 <- as.numeric(GaltonFamilies$father)
#galton_heights$father <- as.numeric(levels(galton_heights$father))[galton_heights$father]
plot(GaltonFamilies$father,GaltonFamilies$son)
plot(GaltonFamilies$father2, GaltonFamilies$son, main="Scatterplot Example",
xlab="Father ", ylab="Son ")
Edit: the filter statement creating son_data wasn't working when I ran the above code fresh. I don't know why. I've replaced it with a way to get son_data without the filter.
son_data <- GaltonFamilies[GaltonFamilies$gender == "male" & GaltonFamilies$childNum == 1,]
There is no GaltonFamilies$son. See also: Random data added when using `plot` in R
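To see why a non-existent column produces a plot against the observation number, here is a minimal sketch with a made-up data frame (the heights below are invented, not taken from GaltonFamilies). `$` on a missing column returns NULL, and plot(x, NULL) then draws x against its index.

```r
# Made-up heights for illustration only
heights <- data.frame(father      = c(78.5, 75.5, 75.0, 74.0, 73.0),
                      childHeight = c(73.2, 73.5, 71.0, 70.5, 72.0))

is.null(heights$son)  # TRUE: there is no column called "son"

# Behaves like plot(heights$father): father on the y axis, index on the x axis
plot(heights$father, heights$son)

# Naming a column that exists gives the intended scatter plot
plot(heights$father, heights$childHeight, xlab = "Father", ylab = "Son")
```

In the question's code the son heights live in son_data$childHeight, so both vectors would need to come from the same rows, for example plot(son_data$father, son_data$childHeight).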

plot.lm(): extracting numbers labelled in the diagnostic Q-Q plot

For the simple example below, you can see that there are certain points that are identified in the ensuing plots. How can I extract the row numbers identified in these plots, especially the Normal Q-Q plot?
set.seed(2016)
maya <- data.frame(rnorm(100))
names(maya)[1] <- "a"
maya$b <- rnorm(100)
mara <- lm(b~a, data=maya)
plot(mara)
I tried using str(mara) to see if I could find a list there, but I can't see any of the numbers from the Normal Q-Q plot there. Thoughts?
I have edited your question using set.seed(2016) for reproducibility. To answer your question, I need to explain how to produce the Q-Q plot you see.
se <- sqrt(sum(mara$residuals^2) / mara$df.residual) ## Pearson residual standard error
hii <- lm.influence(mara, do.coef = FALSE)$hat ## leverage
std.resi <- mara$residuals / (se * sqrt(1 - hii)) ## standardized residuals
## these three lines can be replaced by: std.resi <- rstandard(mara)
Now, let's compare the Q-Q plot we generate ourselves and that generated by plot.lm:
par(mfrow = c(1,2))
qqnorm(std.resi, main = "my Q-Q"); qqline(std.resi, lty = 2)
plot(mara, which = 2) ## only display Q-Q plot
The same, right?
Now, the only issue left is how the numbers are labelled. Those labelled points mark the largest 3 absolute standardised residuals. Consider:
x <- sort(abs(std.resi), decreasing = TRUE)
id <- as.integer(names(x))
id[1:3]
# [1] 23 8 12
Now, if you look at the graph closely, you can see that those three numbers are exactly what is shown. Knowing this, you can also check out, for example, id[1:5].
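The same steps can be wrapped into a small helper. This is only a sketch: the function name qq_labels is made up, and it assumes the default of three labelled points (the id.n argument of plot.lm).

```r
set.seed(2016)
maya <- data.frame(a = rnorm(100))
maya$b <- rnorm(100)
mara <- lm(b ~ a, data = maya)

# Row names of the n largest absolute standardized residuals
# (qq_labels is a hypothetical helper, not part of base R)
qq_labels <- function(model, n = 3) {
  r <- rstandard(model)
  names(sort(abs(r), decreasing = TRUE))[seq_len(n)]
}

top3 <- qq_labels(mara)
maya[top3, ]  # the corresponding rows of the data
```

The helper returns row names as character strings; wrap the result in as.integer() if numeric indices are needed.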

rpart plot text shorter

I am using the prp function from the rpart.plot package to plot a tree. For categorical data like states, it gives a really long list of values, which makes the plot less readable. Is there any way to wrap the text over two or more lines if it exceeds some length?
Here's an example that wraps long split labels over multiple
lines. The maximum length of each line is 25 characters. Change the
25 to suit your purposes. (This example is derived from Section 6.1 in
the rpart.plot vignette.)
library(rpart)       # provides rpart() and the cu.summary data
library(rpart.plot)  # provides prp()
tree <- rpart(Price/1000 ~ Mileage + Type + Country, cu.summary)
split.fun <- function(x, labs, digits, varlen, faclen)
{
    # replace commas with spaces (needed for strwrap)
    labs <- gsub(",", " ", labs)
    for(i in seq_along(labs)) {
        # split labs[i] into multiple lines
        labs[i] <- paste(strwrap(labs[i], width=25), collapse="\n")
    }
    labs
}
prp(tree, split.fun=split.fun)
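The wrapping step can be tried on its own, without fitting a tree. A minimal sketch, assuming a made-up label of comma-separated state names; wrap_labels is a hypothetical helper name:

```r
# Wrap each label so that no line exceeds `width` characters
wrap_labels <- function(labs, width = 25) {
  labs <- gsub(",", " ", labs)   # strwrap() only breaks at whitespace
  vapply(labs,
         function(lab) paste(strwrap(lab, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

# Made-up split label for illustration
lab <- "State = Alabama,Alaska,Arizona,Arkansas,California,Colorado"
wrapped <- wrap_labels(lab)
cat(wrapped, "\n")
```

Note that a single word longer than the width is not broken, so very long level names can still overflow.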

Automatically update graph title with parameter

I am not very familiar with R. I am using R to plot the Poisson distribution for different values of lambda (from 1 to 10) and to display the plots together for comparison.
I would like to add a title to each graph automatically according to lambda: "lambda = 1" for plot 1, "lambda = 2" for plot 2, and so on. This is my code. It outputs the 10 graphs correctly, but I am not sure how to add the corresponding lambda to each title automatically. Could someone give me a hint?
Also, is it possible to use one font size (say, "small") for plots 1 to 5 and a different font size for plots 6 to 10?
Thanks
the_data_frame<-data.frame(matrix(ncol=10,nrow=21))
lam<-seq(1,10,1)
lam
x<-seq(0,20,1)
x
for (i in 1:10){
the_data_frame[i]<-exp(-lam[i])*lam[i]**x/gamma(x+1)
}
the_data_frame<-cbind(the_data_frame, x)
par(mfrow=c(5,2))
for (i in 1:10){
plot(the_data_frame[[i]]~the_data_frame[[11]], the_data_frame)
}
You can simplify the problem. Using one loop over the lambda values, you compute y at each iteration using the Poisson formula and then plot it. I use the main argument to add a title to each plot. Here I am using bquote to get the lambda value in plotmath format.
For example, for 4 values of lambda, you get:
x <- seq(0, 20, 1); lam <- c(0.5, 1, 2, 4)
par(mfrow = c(2, 2))
lapply(lam, function(lamd) {
    # Poisson pmf: exp(-lambda) * lambda^x / x!
    y <- exp(-lamd) * lamd^x / gamma(x + 1)
    plot(x, y, main = bquote(paste(lambda, '=', .(lamd))), type = 'l')
})
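As a cross-check, and to cover the font-size part of the question, here is a minimal sketch. It assumes the standard Poisson formula exp(-lambda) * lambda^x / x!, compares it with R's built-in dpois(), and uses cex.main to vary the title size. The name pois_pmf and the particular sizes are illustrative choices, not part of the answer above.

```r
x <- 0:20
lam <- c(0.5, 1, 2, 4)

# Poisson probability mass function written out by hand
pois_pmf <- function(x, lambda) exp(-lambda) * lambda^x / gamma(x + 1)

# Should agree with the built-in density up to floating-point error
all.equal(pois_pmf(x, 2), dpois(x, 2))

op <- par(mfrow = c(2, 2))
for (i in seq_along(lam)) {
  # Smaller title for the first half of the plots, larger for the rest
  size <- if (i <= length(lam) / 2) 0.8 else 1.2
  plot(x, pois_pmf(x, lam[i]), type = "h", ylab = "probability",
       main = bquote(lambda == .(lam[i])), cex.main = size)
}
par(op)  # restore the previous layout
```

cex.main scales only the main title; cex.lab and cex.axis would be the analogous settings for axis labels and tick labels.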
This might help:
for (i in 1:10){
plot(the_data_frame[[i]]~the_data_frame[[11]], the_data_frame,
main=paste("lambda=", i, sep=""))
}
library(ggplot2)
xval <- rep(0:20,10)
lambda <- rep(1:10,21)
yval <- exp(-lambda)*lambda**xval/gamma(xval+1)
the_new_data_frame <- data.frame(cbind(xval,lambda,yval))
plot1 <- ggplot(the_new_data_frame, aes(xval, yval)) + geom_line(aes(colour=factor(lambda)))
plot1
plot1 + facet_grid(~lambda)
Were you looking for an interactive window where you can type in the text and update the figure title? If so, you may want to look at the tcltk package. See
http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/modalDialog.html
