R Statistics Distributions Plotting - r

I am having some trouble with a homework I have at Statistics.
I am required to graphical represent the density and the distribution function in two inline plots for a set of parameters at my choice ( there must be minimum 4 ) for Student, Fisher and ChiS repartitions.
Let's take only the example of Student Repartition.
From what I have searched on the internet, I have come with this:
First, I need to generate some random values.
x <- rnorm( 20, 0, 1 )
Question 1: I need to generate 4 of this?
Then I have to plot these values with:
plot(dt( x, df = 1))
plot(pt( x, df = 1))
But, how to do this for four set of parameters? They should be represented in the same plot.
Is this the good approach to what I came so far?
Please, tell me if I'm wrong.

To plot several densities of a certain distribution, you have to first have a support vector, in this case x below.
Then compute the values of the densities with the parameters of your choice.
Then plot them.
In the code that follows, I will plot 4 Sudent-t pdf's, with degrees of freedom 1 to 4.
x <- seq(-5, 5, by = 0.01) # The support vector
y <- sapply(1:4, function(d) dt(x, df = d))
# Open an empty plot first
plot(1, type = "n", xlim = c(-5, 5), ylim = c(0, 0.5))
for(i in 1:4){
lines(x, y[, i], col = i)
}
Then you can make the graph prettier, by adding a main title, changing the axis titles, etc.
If you want other distributions, such as the F or Chi-squared, you will use x strictly positive, for instance x <- seq(0.0001, 10, by = 0.01).

Related

Converting a plotted line into a model object

Let's say I have a series of points between which I want to plot straight lines:
x <- c(0, 2, 4, 7, 12)
y <- c(0, 0, 4, 5, 0)
plot(x, y, type = 'l')
How would I go about turning this plotted line into a simple model object? For instance, something with which I would be able to use the stats::predict() function to do something like this:
model.object <- ???
predict(model.object, data.frame(x = 3))
Output:
2
Or, at the very least, is there some way R can identify the slopes and intercepts of each of these lines between the points so I could manually create a piecewise function using if-statements?
While it's a bit different than predict, you can use approxfun to do interpolation between points
f <- approxfun(x, y)
f(3)
# [1] 2
Note that it just takes a vector of x values rather than a data.frame to make predictions.

Undertanding Sample Sizes in qqplots with R

I'm trying to plot QQ graphs with the MASS Boston data set and comparing how the plots will change with increased random data points. I'm looking at the R documentation on qqnorm() but it doesn't seem to let me select an n value as a parameter? I'd like to plot the QQplots of “random” samples of size 10, 100, and 1000 samples all from a normal distribution for the same variable all in a 3x1 matrix.
Example would be if I wanted to look at the QQplot for Boston Crime, how would I get
qqnorm(Boston$crim) #find how to set n = 10
qqnorm(Boston$crim) #find how to set n = 100
qqnorm(Boston$crim) #find how to set n = 1000
Also if someone could elaborate when to use qqplot() vs qqnorm(), I'd appreciate it.
I'm inclined to believe that I should use qqplot() as such, as it does seem to give me the output I want, but I want to make sure that using rnorm(n) and then using that variable as a second argument is okay to do:
x <- rnorm(10)
y <- rnorm(100)
z <- rnorm(1000)
par(mfrow = c(1,3))
qqplot(Boston$crim, x)
qqplot(Boston$crim, y)
qqplot(Boston$crim, z)
The question is not clear but to plot samples of a vector, define a vector N of sample sizes and loop through it. The lapply loop will sample from the vector, plot with the Q-Q line and return the qqnorm plot.
data(Boston, package = "MASS")
set.seed(2021)
N <- c(10, 100, nrow(Boston))
qq_list <- lapply(N, function(n){
subtitle <- paste("Sample size:", n)
i <- sample(nrow(Boston), n, replace = FALSE)
qq <- qqnorm(Boston$crim[i], sub = subtitle)
qqline(Boston$crim[i])
qq
})

How do I plot a power function 1-\phi(4.65-x/2) in R?

I want to plot a [power function][1] in R, namely 1-\phi(4.65-z/2). This can be written as \int_{-\infty}^{4.65-z/2}\frac{1}{2\pi} \exp(-\frac{x^2}{2}}) in latex.
Can someone explain how to plot this? Is there a specific command for the phi function?
This function, \Phi, is a cumulative distribution function of a standard normal random variable, and yes, there is a function for that in R: dnorm. Hence,
z <- seq(-2, 20, length = 1000)
plot(z, 1 - dnorm(4.65 - z / 2), type = 'l')
# or also just curve(1 - dnorm(4.65 - x / 2), -2, 20)

how do i fit unique curves on each unique plot in a for loop

I have written this code (see below) for my data frame kleaf.df to combine multiple plots of variable press_mV with each individual plot for unique ID
I need some help fitting curves to my plots. when i run this code i get the same fitted curve (the curve fitted for the first plot) on ALL the plots where i want each unique fitted curve on each unique plot.
thanks in advance for any help given
f <- function(t,a,b) {a * exp(b * t)}
par(mfrow = c(5, 8), mar = c(1,1,1,1), srt = 0, oma = c(1,6,5,1))
for (i in unique(kleaf.df$ID))
{
d <- subset(kleaf.df, kleaf.df$ID == i)
plot(c(1:length(d$press_mV)),d$press_mV)
#----tp:turning point. the last maximum value before the values start to decrease
tp <- tail(which( d$press_mV == max(d$press_mV) ),1)
#----set the end points(A,B) to fit the curve to
A <- tp+5
B <- A+20
#----t = time, p = press_mV
# n.b:shift by 5 accomadate for the time before attachment
t <- A:B+5
p <- d$press_mV[A:B]
fit <- nls(p ~ f(t,a,b), start = c(a=d$press_mV[A], b=-0.01))
#----draw a curve on plot using the above coefficents
curve(f(x, a=co[1], b=co[2]), add = TRUE, col="green", lwd=2)
}

Easiest way to plot inequalities with hatched fill?

Refer to the above plot. I have drawn the equations in excel and then shaded by hand. You can see it is not very neat. You can see there are six zones, each bounded by two or more equations. What is the easiest way to draw inequalities and shade the regions using hatched patterns ?
To build up on #agstudy's answer, here's a quick-and-dirty way to represent inequalities in R:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
with(as.list(c(b,a)),{
id <- yB<=yA
# b<a area
polygon(x = c(xB[id], rev(xA[id])),
y = c(yB[id], rev(yA[id])),
density=10, angle=0, border=NULL)
# a>b area
polygon(x = c(xB[!id], rev(xA[!id])),
y = c(yB[!id], rev(yA[!id])),
density=10, angle=90, border=NULL)
})
If the area in question is surrounded by more than 2 equations, just add more conditions:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
d <- curve(0.5*x^2+0.2, add = TRUE) # Third curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
names(d) <- c('xD','yD')
with(as.list(c(a,b,d)),{
# Basically you have three conditions:
# curve a is below curve b, curve b is below curve d and curve d is above curve a
# assign to each curve coordinates the two conditions that concerns it.
idA <- yA<=yD & yA<=yB
idB <- yB>=yA & yB<=yD
idD <- yD<=yB & yD>=yA
polygon(x = c(xB[idB], xD[idD], rev(xA[idA])),
y = c(yB[idB], yD[idD], rev(yA[idA])),
density=10, angle=0, border=NULL)
})
In R, there is only limited support for fill patterns and they can only be
applied to rectangles and polygons.This is and only within the traditional graphics, no ggplot2 or lattice.
It is possible to fill a rectangle or polygon with a set of lines drawn
at a certain angle, with a specific separation between the lines. A density
argument controls the separation between the lines (in terms of lines per inch)
and an angle argument controls the angle of the lines.
here an example from the help:
plot(c(1, 9), 1:2, type = "n")
polygon(1:9, c(2,1,2,1,NA,2,1,2,1),
density = c(10, 20), angle = c(-45, 45))
EDIT
Another option is to use alpha blending to differentiate between regions. Here using #plannapus example and gridBase package to superpose polygons, you can do something like this :
library(gridBase)
vps <- baseViewports()
pushViewport(vps$figure,vps$plot)
with(as.list(c(a,b,d)),{
grid.polygon(x = xA, y = yA,gp =gpar(fill='red',lty=1,alpha=0.2))
grid.polygon(x = xB, y = yB,gp =gpar(fill='green',lty=2,alpha=0.2))
grid.polygon(x = xD, y = yD,gp =gpar(fill='blue',lty=3,alpha=0.2))
}
)
upViewport(2)
There are several submissions on the MATLAB Central File Exchange that will produce hatched plots in various ways for you.
I think a tool that will come handy for you here is gnuplot.
Take a look at the following demos:
feelbetween
statistics
some tricks

Resources