I am not an expert on stats, but I've been trying to plot a CDF from an array of points. I've tried both R and Python. This is my example set of points: (1.5, 1.5, 2.5, 3.5, 3.5, 3.5, 4.5, 5.5, 5.5, 6)
Using the ecdf function in R, I managed to get this:
This was my code:
data <- c(1.5,1.5,2.5,3.5,3.5,3.5,4.5,5.5,5.5,6)
plot(ecdf(data))
Is there a way to get the same plotted as a histogram or would that be fundamentally wrong?
Is this what you mean?
par(mar = c(5,5,2,5))
data <- c(1.5,1.5,2.5,3.5,3.5,3.5,4.5,5.5,5.5,6)
h <- hist(
data,
breaks = seq(0, 10, 1),
xlim = c(0,10))
par(new = T)
ec <- ecdf(data)
plot(x = h$mids, y=ec(h$mids)*max(h$counts), col = rgb(0,0,0,alpha=0), axes=F, xlab=NA, ylab=NA)
lines(x = h$mids, y=ec(h$mids)*max(h$counts), col ='red')
axis(4, at=seq(from = 0, to = max(h$counts), length.out = 11), labels=seq(0, 1, 0.1), col = 'red', col.axis = 'red')
mtext(side = 4, line = 3, 'Cumulative Density', col = 'red')
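If the goal is a histogram whose bars themselves show the cumulative distribution, one possible sketch is to compute the histogram without plotting it, replace the counts with their running sum, and plot the modified object. Rescaling to proportions and the axis labels are assumptions made here for illustration, not part of the answer above.

```r
data <- c(1.5, 1.5, 2.5, 3.5, 3.5, 3.5, 4.5, 5.5, 5.5, 6)

# Compute the bins without drawing anything yet
h <- hist(data, breaks = seq(0, 10, 1), plot = FALSE)

# Replace per-bin counts with the cumulative proportion of observations
h$counts <- cumsum(h$counts) / length(data)

# Draw the modified histogram object; freq = TRUE plots the 'counts' component
plot(h, freq = TRUE, xlim = c(0, 10), ylim = c(0, 1),
     main = "Cumulative histogram", xlab = "data",
     ylab = "Cumulative proportion")

# Overlay the step-function ECDF for comparison
plot(ecdf(data), add = TRUE, col = "red")
```

Because hist() uses right-closed bins by default, each bar height should match the ECDF evaluated at the bar's right edge.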
Given the following empty plot:
plot(1, type="n", xlab="x1", ylab="x2", xlim=c(0, 10), ylim=c(0, 10), axes = F)
axis(1, seq(0,10,1), pos = 0)
axis(2, seq(0,10,1), pos = 0)
lines(x = c(0,10), y = c(10,10))
lines(x = c(10,10), y = c(0,10))
I would like to plot a smooth curve in which x1*x2 = 38, assuming x1 and x2 are both between 0 and 10.
What kind of function could I use to accomplish this?
You may try
plot(1, type="n", xlab="x1", ylab="x2", xlim=c(0, 10), ylim=c(0, 10), axes = F)
axis(1, seq(0,10,1), pos = 0)
axis(2, seq(0,10,1), pos = 0)
lines(x = c(0,10), y = c(10,10))
lines(x = c(10,10), y = c(0,10))
t <- seq(from = 3.8, to = 10, by = .1)
lines(x = t, y = 38/t)
Using curve.
curve(38/x, xlim=c(0, 10), ylim=c(0, 10), xlab='x1', ylab='x2')
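The two answers can also be combined: curve() accepts add = TRUE, so it can draw onto an existing plot, and restricting it to x1 >= 3.8 keeps x2 = 38/x1 inside the 0 to 10 range. This is only a sketch; the names k, x_min and pts are introduced here for illustration.

```r
# Set up an empty 10 x 10 plotting region
plot(1, type = "n", xlab = "x1", ylab = "x2",
     xlim = c(0, 10), ylim = c(0, 10))

k <- 38           # the constant product x1 * x2
x_min <- k / 10   # smallest x1 for which x2 = k / x1 is still <= 10

# Add the hyperbola to the existing plot; curve() invisibly returns
# the points it evaluated
pts <- curve(k / x, from = x_min, to = 10, add = TRUE)
```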
I want to plot density lines without showing the histogram, I used this code:
hist(www, prob=TRUE, xlab = "X", main = "Plot",xlim=c(0,11), ylim=c(0,1), breaks =100)
lines(density(x, adjust=5), col="red", lwd=2)
lines(density(y, adjust=5), col="blue", lwd=2)
lines(density(z, adjust=5), col="green", lwd=2)
And the result is shown in the picture.
How can I remove the Histogram? Thank you in advance!
You could use plot(density(...)) instead of hist:
set.seed(123)
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0.5, 2)
z <- rnorm(100, 1, 1)
dens <- lapply(list(x=x, y=y, z=z), density)
ran <- apply(do.call(rbind, sapply(dens, function(i) list(data.frame(x=range(i$x), y=range(i$y))))), 2, range)
plot(dens[[1]], xlim=ran[,1], ylim=ran[,2], type = 'n', main="Density")
lapply(seq_along(dens), function(i) lines(dens[[i]], col=i))
legend("topright", names(dens), col=seq_along(dens), lty=1)
Created on 2021-01-31 by the reprex package (v1.0.0)
Even easier is plotting with the ggplot2 package:
library(ggplot2)
dat <-data.frame(group=unlist(lapply(c("x", "y", "z"), function(i) rep(i, length(get(i))))),
value=c(x, y, z))
ggplot(dat, aes(x=value, colour=group))+
geom_density()
Using three toy vectors, try this:
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0.5, 2)
z <- rnorm(100, 1, 1)
plot(density(x, adjust = 5), col = "red", lwd = 2,
xlim = c(-20, 20), ylim = c(0, 0.25), xlab = "X")
par(new=T)
plot(density(y, adjust = 5), col = "blue", lwd = 2,
xlim = c(-20, 20), ylim = c(0, 0.25), xlab = "")
par(new=T)
plot(density(z, adjust = 5), col = "green", lwd = 2,
xlim = c(-20, 20), ylim = c(0, 0.25), xlab = "")
You will need to adjust xlim and ylim so that all three curves fit inside the plotting region.
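Rather than tuning the limits by hand, they can be derived from the density objects themselves. The following is a minimal sketch using the same toy vectors; the names dens, cols, xlim and ylim are chosen here for illustration, and lines() is used for the second and third curves so that the axes are drawn only once.

```r
set.seed(123)
x <- rnorm(100, 0, 1)
y <- rnorm(100, 0.5, 2)
z <- rnorm(100, 1, 1)

# One density estimate per vector, all with the same bandwidth adjustment
dens <- lapply(list(x, y, z), density, adjust = 5)
cols <- c("red", "blue", "green")

# Limits wide enough to contain every curve
xlim <- range(unlist(lapply(dens, function(d) d$x)))
ylim <- c(0, max(unlist(lapply(dens, function(d) d$y))))

# First curve sets up the axes, the rest are added on top
plot(dens[[1]], xlim = xlim, ylim = ylim, col = cols[1], lwd = 2,
     main = "Density", xlab = "X")
for (i in 2:3) lines(dens[[i]], col = cols[i], lwd = 2)
```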
I am struggling to customise the jump size on the x-axis in R.
Current code:
par(mfrow = c(2,2))
r.star.ts.sp <- ts(r.star.sp, frequency = 4, start = c(1978,1), end = c(2018, 1))
# Big drop in r* around the 123rd quarter, equivalent to 2008:Q4 / 2009:Q1
trendgrowth.ts.sp <- ts(trendgrowth.sp, frequency = 4, start = c(1978,1), end = c(2018, 1))
plot.ts(r.star.ts.sp,
ylim = c(-3, 4), xlab = " ", ylab = " ", axes = F, col = "blue")
lines(trendgrowth.ts.sp, lty = 2, col = "red")
abline(h = 0, lty = 2)
title(main ="r* and Trend Growth", line = 0.5, font.main = 3)
box()
axis(4)
axis(1)
legend("bottomleft", legend = c("r*", "Trend Growth (g)"),
bty = "n", lty = c(1,2), col = c("blue", "red"), horiz = F, text.col = "black",
cex = 1, pt.cex = .5, inset = c(0.02, 0.02))
# -------------------------------------- #
# Plot output gap and real rate gap
# -------------------------------------- #
outputgap.ts.sp <- ts(outputgap.sp, frequency = 4, start = c(1978,1), end = c(2018, 1))
realrategap.ts.sp <- ts(realrategap.sp, frequency = 4, start = c(1978,1), end = c(2018, 1))
plot.ts(outputgap.ts.sp, ylim = c(-20, 15), xlab=" ", ylab=" ", axes = F, col="blue")
lines(realrategap.ts.sp, lty = 2, col = "red")
abline(h = 0, lty = 2)
legend("topright", legend = c("Output Gap", "Real Rate Gap"),
bty = "n", lty = c(1,2), col = c("blue", "red"), horiz = F, text.col = "black",
cex = 1, pt.cex = .5, inset = c(0.02, 0.02))
title(main = "Output Gap and Real Rate Gap", line = 0.5, font.main = 3)
box()
axis(side = 4)
axis(side = 1)
How would one specify the years on the x-axis from 1975 to 2020 with jumps of 5 years?
Furthermore (off-topic), I need two plots next to each other, but I feel that par(mfrow = c(2,2)) is not the correct statement. However, changing it to par(mfrow = c(1,2)) creates abnormally large figures.
Thanks!
The OP has requested to specify the years on the x-axis from 1975 to 2020 with jumps of 5 years.
This can be achieved by
axis(1, at = seq(1975L, 2020L, by = 5L))
However, the result may depend on the mfrow parameter. Here is a dummy example using par(mfrow = c(2, 2)):
Note that the x-axis of the left graph was created by axis(1) while the x-axis of the right graph was created by axis(1, at = seq(1975L, 2020L, by = 5L)). Also note the large white space below the two graphs.
With par(mfrow = c(1, 2)) the result becomes
Here, the right graph shows unlabeled ("minor") tick marks. This is explained in the mfrow section of ?par: in a layout with exactly two rows and columns, the base value of "cex" is reduced by a factor of 0.83. So in the 2 x 2 layout the font size is reduced by 17%, which allows all tick marks to be labeled without overplotting. In the 1 x 2 layout the labels are drawn at full size, and axis() omits any label that would overlap its neighbour.
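For a side-by-side layout, a possible sketch with dummy data is shown below. The series dummy is simulated, since the OP's data are not available, and rotating the labels with las = 2 is just one assumed way to make every 5-year label fit; reducing cex.axis would be another.

```r
set.seed(1)
# Simulated quarterly series covering 1978:Q1 to 2018:Q1 (161 observations)
dummy <- ts(cumsum(rnorm(161)), frequency = 4,
            start = c(1978, 1), end = c(2018, 1))
years <- seq(1975, 2020, by = 5)

op <- par(mfrow = c(1, 2))   # two plots next to each other

for (i in 1:2) {
  plot(dummy, xlim = range(years), axes = FALSE,
       xlab = "", ylab = "", col = "blue")
  abline(h = 0, lty = 2)
  box()
  axis(4)
  # Ticks every 5 years; las = 2 draws the labels perpendicular to the
  # axis so that neighbouring labels do not collide
  axis(1, at = years, las = 2)
}

par(op)                      # restore the previous layout settings
```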
I'm plotting a CDF of some data, and I've added a logarithmic scale on the "x" axis.
The tick spacing is exactly as I want it to be, but I'd like to be able to add
some tick marks at specific points.
I don't want to change the distribution of the ticks in my plot, from n by n to m by m, I want simply to have, among the ticks from n by n, some further tick marks on some values.
I'd like to have it reflected in both x and y axis, so that I can fit a grid into these new marks throughout the graph.
So far I have the graph, and the grid -- I don't mind about having the grid behind or upon the graph, I just want to add some custom ticks.
# Cumulative Distribution
pdf("g1_3.pdf")
plot(x = f$V2, y = cumsum(f$V1), log = "x", pch = 3,
xlab = "Frequency", ylab = "P(X <= x)",
panel.first = grid(equilogs = FALSE))
axis(1, at = c(40, 150))
abline(h = 0.6, v = 40, col = "lightgray", lty = 3)
abline(h = 0.6, v = 150, col = "lightgray", lty = 3)
dev.off()
UPDATE: The graph I have so far:
Considering the initial script and the tips given by @BenBolker, I had to use:
axis(side = 1, at = c([all the ticks you want]))
in order to add the ticks in the graph. Here's the final result:
# Cumulative Distribution
pdf("g1_3.pdf")
plot(x = f$V2, y = cumsum(f$V1), log = "x", pch = 3,
xlab = "Frequency", ylab = "P(X <= x)", axes = FALSE)
ticks = c(1, 5, 10, 40, 150, 500, 1000)
axis(side = 1, at = ticks)
axis(side = 2)
abline(h = seq(0, 1, 0.2), v = ticks, col = "lightgray", lty = 3)
box()
dev.off()  # close the PDF device so the file is written
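If the default ticks should be kept and only a few extra ones added, axTicks() can report where the default ticks were placed, so the grid can follow both sets. The sketch below uses simulated data, because the OP's data frame f is not available; freq, p, extra and grid_v are names made up for this example.

```r
set.seed(42)
# Simulated positive values and their cumulative proportions
freq <- sort(unique(round(10^runif(50, 0, 3))))
p <- seq_along(freq) / length(freq)

# Default axes, log-scaled x
plot(freq, p, log = "x", pch = 3,
     xlab = "Frequency", ylab = "P(X <= x)")

# Extra ticks in addition to the default ones
extra <- c(40, 150)
axis(1, at = extra)

# Grid through the default tick positions plus the extra ones
grid_v <- sort(unique(c(axTicks(1), extra)))
abline(v = grid_v, h = axTicks(2), col = "lightgray", lty = 3)
```

Note that axis() skips labels that would overlap ones already drawn, so an extra tick close to a default one may appear without a label.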
When I draw grid lines on a plot using abline() the grid lines are drawn over the data.
Is there a way to draw the abline() lines behind the data? I feel this would look better.
Example:
x <- seq(0, 10)
y <- x
plot(x, y, col = 'red', type = 'o', lwd = 3, pch = 15)
abline(h = seq(0, 10, .5), col = 'lightgray', lty = 3)
abline(v = seq(0, 10, .5), col = 'lightgray', lty = 3)
The plot produced has the gray grid lines going over the data (red line). I would like the red line to be on top of the gray lines.
The panel.first argument of plot() takes an expression that is evaluated after the axes are set up but before the data are drawn, so you can put your abline() calls in there.
plot(1:4, panel.first =
c(abline(h = 1:4, lty = 2, col = 'grey')
,abline(v = 1:4, lty = 2, col = 'grey')))
Use plot() to set up the plotting window, but use type = "n" to not plot any data. Then do your abline() calls, or use grid(), and then plot the data using whatever low-level function is appropriate (here points() is fine).
x <- seq(0, 10)
y <- x
plot(x, y, type = "n")
abline(h = seq(0, 10, .5), col = 'lightgray', lty = 3)
abline(v = seq(0, 10, .5), col = 'lightgray', lty = 3)
points(x, y, col = 'red', type = 'o', lwd = 3, pch = 15)
or
## using `grid()`
plot(x, y, type = "n")
grid()
points(x, y, col = 'red', type = 'o', lwd = 3, pch = 15)
See ?grid for details of how to specify the grid as per your abline() version.
Plot first with type="n" to establish coordinates. Then put in the grid lines, then plot again with your regular plot type:
plot(x, y, col = 'red', type = 'n', lwd = 3, pch = 15)
abline(h = seq(0, 10, .5), col = 'lightgray', lty = 3)
abline(v = seq(0, 10, .5), col = 'lightgray', lty = 3)
par(new=TRUE)
plot(x, y, col = 'red', type = 'o', lwd = 3, pch = 15)
I admit that I have always thought the name for that par parameter was "backwards."
Another way of creating grid lines is to set tck=1 when plotting or in the axis function (you may still want to plot the points using points() after creating the grid lines).
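A minimal sketch of the tck = 1 approach, under the assumption that a grid at every second unit is wanted (the name grid_at is made up here): the axes are drawn first with full-length tick marks, then the data are added on top.

```r
x <- seq(0, 10)
y <- x
grid_at <- seq(0, 10, 2)

# Set up the coordinates without drawing data or axes
plot(x, y, type = "n", axes = FALSE)

# tck = 1 stretches each tick mark across the whole plot region,
# so the ticks act as grid lines
axis(1, at = grid_at, tck = 1, col = "lightgray", lty = 3)
axis(2, at = grid_at, tck = 1, col = "lightgray", lty = 3)
box()

# Draw the data last so it sits on top of the grid
points(x, y, col = "red", type = "o", lwd = 3, pch = 15)
```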