The following is my code, and I would like to get the dpear and ppear plots to appear on the histogram, just as I have done with the density curve, but I am running into some issues trying to do that. If someone could point me in the right direction, that would be great!
library(readr)
library(fitdistrplus)
library(moments)
library(PearsonDS)
library(ggplot2)
newdata <- read_csv("Downloads/newctdata - Sheet1.csv")
data <- rpearson(1000, moments = c(mean = 0.5205263, variance = 0.3940497, skewness = 1.747905, kurtosis = 5.706342))
ppar <- pearsonFitML(data)
print(unlist(ppar))
print(unlist(pearsonFitM(moments = empMoments(data))))
pIpars <- list(a = 0.44883385, b = 2.22621271, location = 0.04565093, scale = 3.20779382)
t <- seq(0, 2.5, by = 0.1)
dpearsonI(t, params = pIpars)
dpear <- plot(t, dpearsonI(t, params = pIpars), type = "l")
ppearsonI(t, params = pIpars)
ppear <- plot(t, ppearsonI(t, params = pIpars), type = "l")
hist(newdata$Mean, prob = TRUE, xlab = "Mean Duration of Asymptomatic Infection in Women", ylab = "Frequency", col = "steelblue", breaks = 12, cex.main = 1.3, cex.axis = 1.5, cex.lab = 1.5)
lines(density(newdata$Mean), col = "tomato", lwd = 4)
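One possible way to get both curves onto the histogram is to add them with lines() instead of calling plot() again, since plot() starts a new figure each time. The sketch below is only an illustration: the sample x is simulated with rpearsonI() as a stand-in for newdata$Mean (which is not available here), the parameters are the ones from the question, and rescaling the CDF to the height of the histogram is just one design choice among several.

```r
library(PearsonDS)

# Parameters copied from the question.
pIpars <- list(a = 0.44883385, b = 2.22621271,
               location = 0.04565093, scale = 3.20779382)

# Assumption: simulated stand-in for newdata$Mean.
set.seed(1)
x <- rpearsonI(1000, params = pIpars)

# Evaluate the fitted curves on a fine grid inside the support.
t <- seq(0.05, 2.5, by = 0.01)
dens <- dpearsonI(t, params = pIpars)
cdf <- ppearsonI(t, params = pIpars)

# Leave room on the right for a second axis.
par(mar = c(5, 4, 4, 4) + 0.1)
h <- hist(x, freq = FALSE, breaks = 12, col = "steelblue",
          xlab = "Simulated value", main = "")
lines(density(x), col = "tomato", lwd = 4)

# Fitted density: already on the same scale as a density histogram.
lines(t, dens, col = "black", lwd = 3)

# The CDF runs from 0 to 1, so it is stretched to the tallest bar
# and given its own axis on the right-hand side.
top <- max(h$density)
lines(t, cdf * top, col = "darkgreen", lwd = 3, lty = 2)
axis(4, at = seq(0, 1, 0.25) * top, labels = seq(0, 1, 0.25))
mtext("Cumulative probability", side = 4, line = 2.5)
```

With the real data, x would be replaced by newdata$Mean and pIpars by the fitted parameters.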
I am working with the R programming language. I am trying to plot some categorical and continuous data that I am working with, but I am getting an error that tells me that such plots are only possible with "only numeric variables".
library(survival)
library(ggplot2)
data(lung)
data = lung
data$sex = as.factor(data$sex)
data$status = as.factor(data$status)
data$ph.ecog = as.factor(data$ph.ecog)
str(data)
#plot
mycolours <- rainbow(length(unique(data$sex)), end = 0.6)
# png("gally.png", 500, 400, type = "cairo", pointsize = 14)
par(mar = c(4, 4, 0.5, 0.75))
plot(NULL, NULL, xlim = c(1, 5), ylim = range(data[, 1:6]) + c(-0.2, 0.2),
bty = "n", xaxt = "n", xlab = "Variable", ylab = "Standardised value")
axis(1, 1:5, labels = colnames(data)[1:6])
abline(v = 1:5, col = "#00000033", lwd = 2)
abline(h = seq(-2.5, 2.5, 0.5), col = "#00000022", lty = 2)
for (i in 1:nrow(data)) lines(as.numeric(data[i, 1:6]), col = mycolours[as.numeric(data$sex[i])])
legend("topright", c("Female", "Male"), lwd = 2, col = mycolours, bty = "n")
# dev.off()
Does anyone know if this is possible to do with both categorical and continuous data?
Thanks
Sources: R: Parallel Coordinates Plot without GGally
Yup. You just have to be careful with the values. Remember how the factors are coded internally: they are just spicy integer variables with value labels (similar to names). You can losslessly cast it to character or to numeric. For the sake of plotting, you need numbers for line coordinates, so the factor-y nature of your variables will come at the end.
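To illustrate the point about internal coding, here is a small base-R sketch with made-up values (the vectors below are hypothetical, not taken from lung). It shows the integer codes behind a factor and the usual pitfall when the labels themselves look like numbers.

```r
# A factor is stored as integer codes plus a vector of level labels.
sex <- factor(c(2, 1, 2, 1), levels = c(1, 2), labels = c("Male", "Female"))
codes <- as.integer(sex)        # 2 1 2 1: positions in levels(sex)
labs <- as.character(sex)       # "Female" "Male" "Female" "Male"

# When the labels are themselves numbers, the codes are NOT the values.
ecog <- factor(c(0, 2, 1, 0))
ecog.codes <- as.integer(ecog)                  # 1 3 2 1
ecog.values <- as.numeric(as.character(ecog))   # 0 2 1 0

# The codes can serve as plotting coordinates; labels are recovered via levels().
recovered <- levels(sex)[codes]
```

For plotting lines, either the codes or the converted values will do, as long as the axis labels come from the levels.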
Remember that the quality of your visualisation and its information content depend on the order of the variables in your data set. For factors, labels are absolutely necessary. Help the reader by making, in small steps, some completely custom improvements that are impossible in ggplot2!
I wrote a custom function allowing anyone to add super-legible text on top of the values that are not so obvious to interpret. Give meaningful names, choose appropriate font size, pass all those extra parameters to the custom function as an ellipsis (...)!
Here you can see that most of the dead patients are male and most of the censored ones are female (in lung, sex is coded 1 = male, 2 = female). Adding some points with slight jitter might give the reader an idea of the distributions of these variables.
library(survival)
data(lung)
# Data preparation
lung.scaled <- apply(lung, 2, scale)
drop.column.index <- which(colnames(lung) == "sex")
lung.scaled <- lung.scaled[, -drop.column.index] # Dropping the split variable
split.var <- lung[, drop.column.index]
lung <- lung[, -drop.column.index]
mycolours <- rainbow(length(unique(split.var)), end = 0.6, v = 0.9, alpha = 0.4)
# png("gally.png", 500, 400, type = "cairo", pointsize = 14)
par(mar = c(5.5, 4, 0.5, 0.75))
plot(NULL, NULL, xlim = c(1, ncol(lung.scaled)), ylim = range(lung.scaled, na.rm = TRUE) + c(-0.2, 0.2),
bty = "n", xaxt = "n", xlab = "", ylab = "Standardised value")
axis(1, 1:ncol(lung.scaled), labels = colnames(lung), cex.axis = 0.95, las = 2)
abline(v = 1:ncol(lung), col = "#00000033", lwd = 2)
abline(h = seq(round(min(lung.scaled, na.rm = TRUE)), round(max(lung.scaled, na.rm = TRUE)), 0.5), col = "#00000022", lty = 2)
for (i in 1:nrow(lung.scaled)) lines(as.numeric(lung.scaled[i, ]), col = mycolours[as.numeric(split.var[i])])
legend("topleft", c("Male", "Female"), lwd = 3, col = mycolours, bty = "n") # lung codes sex as 1 = male, 2 = female
# Labels for some categorical variables with a white halo for readability
labels.with.halo <- function(varname, data.scaled, labels, nhalo = 32, col.halo = "#FFFFFF44", hscale = 0.04, vscale = 0.04, ...) {
offsets <- cbind(cos(seq(0, 2*pi, length.out = nhalo + 1)) * hscale, sin(seq(0, 2*pi, length.out = nhalo + 1)) * vscale)[-(nhalo + 1), ]
ind <- which(colnames(data.scaled) == varname)
yvals <- sort(unique(data.scaled[, ind]))
for (i in 1:nhalo) text(rep(ind, length(yvals)) + offsets[i, 1], yvals + offsets[i, 2], labels = labels, col = col.halo, ...)
text(rep(ind, length(yvals)), yvals, labels = labels, ...)
}
labels.with.halo("status", lung.scaled, c("Censored", "Dead"), pos = 3)
labels.with.halo("ph.ecog", lung.scaled, c("Asymptomatic", "Symp. but ambul.", "< 50% bed", "> 50% bed"), pos = 3, cex = 0.9)
# dev.off()
I have created the following fan chart using the fanplot package. I'm trying to add axis ticks and labels to the y-axis, but it only shows the decimals and not the full number. I'm looking for a way to display the full number (e.g. 4.59 and 4.61) on the y-axis.
I am also unsure how to specify the breaks and the number of decimal places for the y-axis labels using plot(). In ggplot2 this would look something like scale_y_continuous(breaks = seq(min(data.ts$Index), max(data.ts$Index), by = 0.02)). Any ideas on how to specify the y-axis breaks and the number of decimal places using base plot() in R?
Here is a reproducible copy of my dataset data.ts
structure(c(4.6049904235401, 4.60711076016453, 4.60980084146652,
4.61025389170935, 4.60544515681515, 4.60889021700954, 4.60983993107244,
4.61091608826696, 4.61138799159174, 4.61294431148318, 4.61167545843765,
4.61208284263432, 4.61421991328081, 4.61530485425155, 4.61471465043043,
4.6155992084451, 4.61195799200607, 4.61178486640435, 4.61037927954796,
4.60744590947049, 4.59979957741728, 4.59948551500254, 4.60078678080182,
4.60556092645471, 4.60934962087565, 4.60981147563749, 4.61060477704678,
4.61158365084251, 4.60963435263623, 4.61018215733317, 4.61209710959768,
4.61231368335184, 4.61071363571141, 4.61019496497916, 4.60948652606191,
4.61068813487859, 4.6084092003352, 4.60972706132393, 4.60866915174087,
4.61192565195909, 4.60878767339377, 4.61341471281265, 4.61015272152397,
4.6093479714315, 4.60750965935653, 4.60768790690338, 4.60676463096309,
4.60746490411374, 4.60885670935448, 4.60686846708382, 4.60688947889575,
4.60867708110485, 4.60448791268212, 4.60387348166032, 4.60569806689426,
4.6069320880709, 4.6087143894128, 4.61059688801283, 4.61065399116698,
4.61071421014339), .Tsp = c(2004, 2018.75, 4), class = "ts")
and here is a reproducible version of the code I'm using
# # Install and Load Packages
## pacman::p_load(forecast,fanplot,tidyverse,tsbox,lubridate,readxl)
# Create an ARIMA Model using the auto.arima function
model <- auto.arima(data.ts)
# Simulate forecasts for 4 quarters (1 year) ahead
forecasts <- simulate(model, n=4)
# Create a data frame with the parameters needed for the uncertainty forecast
table <- ts_df(forecasts) %>%
rename(mode=value) %>%
mutate(time0 = rep(2019,4)) %>%
mutate(uncertainty = sd(mode)) %>%
mutate(skew = rep(0,4))
y0 <- 2019
k <- nrow(table)
# Set Percentiles
p <- seq(0.05, 0.95, 0.05)
p <- c(0.01, p, 0.99)
# Simulate a qsplitnorm distribution
fsval <- matrix(NA, nrow = length(p), ncol = k)
for (i in 1:k)
fsval[, i] <- qsplitnorm(p, mode = table$mode[i],
sd = table$uncertainty[i],
skew = table$skew[i])
# Create Plot
plot(data.ts, type = "l", col = "#75002B", lwd = 4,
xlim = c(y0 - 2,y0 + 0.75), ylim = range(fsval, data.ts),
xaxt = "n", yaxt = "n", ylab = "",xlab='',
main = '')
title(ylab = 'Log AFSI',main = 'Four-Quarter Ahead Forecast Fan - AFSI',
xlab = 'Date')
rect(y0 - 0.25, par("usr")[3] - 1, y0 + 2, par("usr")[4] + 1,
border = "gray90", col = "gray90")
fan(data = fsval, data.type = "values", probs = p,
start = y0, frequency = 4,
anchor = data.ts[time(data.ts) == y0 - .25],
fan.col = colorRampPalette(c("#75002B", "pink")),
ln = NULL, rlab = NULL)
# Add axis labels and ticks
axis(1, at = (y0 - 2):(y0 + 2), tcl = 0.5) # parentheses needed: ":" binds tighter than "-" and "+"
axis(1, at = seq(y0-2, y0 + 2, 0.25), labels = FALSE, tcl = 0.25)
abline(v = y0 - 0.25, lty = 1)
abline(v = y0 + 0.75, lty = 2)
axis(2, at = range(fsval, data.ts), las = 2, tcl = 0.5)
range(blah) will only return two values (the minimum and maximum). The at parameter of axis() requires a sequence of points at which you require axis labels. Hence, these are the only two y values you have on your plot. Take a look at using pretty(blah) or seq(min(blah), max(blah), length.out = 10).
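As a rough sketch of that suggestion in base R: the four values below are copied from data.ts in the question, while the plot itself is only a stand-in for the fan chart. pretty() picks round tick positions, and sprintf() controls the number of decimals shown.

```r
# A few values taken from data.ts in the question.
y <- c(4.6049904235401, 4.60711076016453, 4.59948551500254, 4.6155992084451)

at.range <- range(y)         # only two tick positions: min and max
at.pretty <- pretty(y)       # several "round" positions covering the data
at.seq <- seq(min(y), max(y), length.out = 5)  # evenly spaced alternative

# Fixed number of decimals for the labels.
tick.labels <- sprintf("%.3f", at.pretty)

plot(y, type = "l", yaxt = "n", ylab = "", ylim = range(at.pretty))
axis(2, at = at.pretty, labels = tick.labels, las = 2, tcl = 0.5)
```

Passing at.seq instead of at.pretty gives evenly spaced ticks that start and end exactly at the data limits, at the cost of less round numbers.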
The suggestions of @Feakster are worth looking at, but the problem here is that the y-axis margin isn't wide enough for the labels. You could do either of two things. You could round the labels so they fit within the margin; for example, you could replace this
axis(2, at = range(fsval, data.ts), las = 2, tcl = 0.5)
with this
axis(2, at = range(fsval, data.ts),
labels = sprintf("%.3f", range(fsval, data.ts)), las = 2, tcl = 0.5)
Or, alternatively you could increase the y-axis margin before you make the plot by specifying:
par(mar=c(5,5,4,2)+.1)
plot(data.ts, type = "l", col = "#75002B", lwd = 4,
xlim = c(y0 - 2,y0 + 0.75), ylim = range(fsval, data.ts),
xaxt = "n", yaxt = "n", ylab = "",xlab='',
main = '')
Then everything below that should work. The mar element of par sets the number of lines of margin on each side of the plot, in the order bottom, left, top, right. The default is c(5, 4, 4, 2) + 0.1.
We can get marginal effects of a linear model with margins::margins() and can select variables of interest with option variables.
fit <- lm(mpg ~ factor(vs) + gear:factor(vs) + qsec, mtcars)
library(margins)
marg1 <- margins(fit, variables="vs")
> summary(marg1)
 factor    AME     SE      z      p   lower   upper
    vs1 4.8023 2.6769 1.7940 0.0728 -0.4443 10.0490
The package has an implemented method, plot.margins, so we can plot the marginal effects
plot(marg1)
The at option allows us to specify the values at which to calculate the marginal effects:
marg2 <- margins(fit, variables="vs", at=list(gear=c(3, 4, 5)))
> summary(marg2)
 factor   gear    AME     SE      z      p   lower   upper
    vs1 3.0000 2.8606 3.3642 0.8503 0.3952 -3.7332  9.4544
    vs1 4.0000 5.6849 2.6713 2.1282 0.0333  0.4493 10.9206
    vs1 5.0000 8.5093 3.8523 2.2089 0.0272  0.9588 16.0597
However, attempting to plot these specified margins will yield an error:
plot(marg2)
Error in `[.data.frame`(summ, , names(attributes(x)[["at"]]), drop = FALSE) :
undefined columns selected
Since the margins package claims to be "an R-port of Stata's 'margins' command ", I'd expect a plot similar to the one Stata gives:
So, how can we plot the predicted margins when they are specified with at?
edit:
Note that this is not really an ordinary interaction plot, since
with(mtcars[mtcars$gear %in% c(3, 4, 5), ],
interaction.plot(gear, vs, mpg, pch=rep(1, 2), type="b"))
gives a different output:
The error comes from what seems to be a bug in the plot method for objects of class "margins", plot.margins.
This is an attempt to correct it. The changes are marked in the function body. Either run the definition below directly, or save it in a file "plotmargins.R" and then call source("plotmargins.R").
plot.margins <-
function (x, pos = seq_along(marginal_effects(x, with_at = FALSE)),
which = colnames(marginal_effects(x, with_at = FALSE)), labels = gsub("^dydx_",
"", which), horizontal = FALSE, xlab = "", ylab = "Average Marginal Effect",
level = 0.95, pch = 21, points.col = "black", points.bg = "black",
las = 1, cex = 1, lwd = 2, zeroline = TRUE, zero.col = "gray",
...)
{
pars <- list(...)
summ <- summary(x, level = level, by_factor = TRUE)
MEs <- summ[, "AME", drop = TRUE]
lower <- summ[, ncol(summ) - 1L]
upper <- summ[, ncol(summ)]
r <- max(upper) - min(lower)
#--- changes start here
nms <- intersect(names(summ), names(attributes(x)[["at"]]))
at_levels <- unique(summ[, nms, drop = FALSE])
#--- changes end here
n_at_levels <- nrow(at_levels)
if (n_at_levels > 1) {
pos2 <- rep(pos, each = n_at_levels)
pos2 <- pos2 + seq(from = -0.2, to = 0.2, length.out = n_at_levels)
}
else {
pos2 <- pos
}
if (isTRUE(horizontal)) {
xlim <- if ("xlim" %in% names(pars))
xlim
else c(min(lower) - 0.04 * r, max(upper) + 0.04 * r)
ylim <- if ("ylim" %in% names(pars))
xlim
else c(min(pos2) - (0.04 * min(pos2)), max(pos2) + (0.04 *
max(pos2)))
}
else {
xlim <- if ("xlim" %in% names(pars))
xlim
else c(min(pos2) - (0.04 * min(pos2)), max(pos2) + (0.04 *
max(pos2)))
ylim <- if ("ylim" %in% names(pars))
xlim
else c(min(lower) - 0.04 * r, max(upper) + 0.04 * r)
}
if (isTRUE(horizontal)) {
plot(NA, xlim = xlim, ylim = ylim, yaxt = "n", xlab = ylab,
ylab = xlab, las = las, ...)
if (isTRUE(zeroline)) {
abline(v = 0, col = zero.col)
}
points(MEs, pos2, col = points.col, bg = points.bg, pch = pch)
axis(2, at = pos, labels = as.character(labels), las = las)
mapply(function(pos, upper, lower, lwd) {
segments(upper, pos, lower, pos, col = points.col,
lwd = lwd)
}, pos2, upper, lower, seq(max(lwd), 0.25, length.out = length(MEs)))
}
else {
plot(NA, xlim = xlim, ylim = ylim, xaxt = "n", xlab = xlab,
ylab = ylab, las = las, ...)
if (isTRUE(zeroline)) {
abline(h = 0, col = zero.col)
}
points(pos2, MEs, col = points.col, bg = points.bg, pch = pch)
axis(1, at = pos, labels = as.character(labels), las = las)
mapply(function(pos, upper, lower, lwd) {
segments(pos, upper, pos, lower, col = points.col,
lwd = lwd)
}, pos2, upper, lower, seq(max(lwd), 0.25, length.out = length(MEs)))
}
invisible(x)
}
Now your code and the graph.
source("plotmargins.R")
marg2 <- margins(fit, variables = "vs",
at = list(gear = c(3, 4, 5)))
plot(marg2)
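The marked change in plot.margins can be illustrated in isolation with a toy data frame. Everything below is hypothetical: summ only mimics the shape of the summary table (the AME values are copied from the output above), and at.names stands in for names(attributes(x)[["at"]]), assuming it contains a name that has no matching column in the summary.

```r
# Toy stand-in for summary(marg2).
summ <- data.frame(factor = "vs1",
                   gear = c(3, 4, 5),
                   AME = c(2.8606, 5.6849, 8.5093))

# Hypothetical names of the "at" attribute; "index" has no column in summ.
at.names <- c("gear", "index")

# Indexing with a name that is not a column fails with
# "undefined columns selected".
res <- try(summ[, at.names, drop = FALSE], silent = TRUE)
failed <- inherits(res, "try-error")

# Keeping only the names present in both avoids the error.
nms <- intersect(names(summ), at.names)
at_levels <- unique(summ[, nms, drop = FALSE])
n_at_levels <- nrow(at_levels)
```

Here n_at_levels counts the distinct at combinations, which the function then uses to spread the points around each position.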
I am trying to draw three curves on the same axes in one plot. The curves are drawn correctly, but the y-axis tick labels overlap each other.
Here is my plot.
h1<-hazard.plot.w2p(beta=beta.spreda,eta=eta.spreda,time=exa1.dat$time,line.colour="orange")
h2<-hazard.plot.w2p(beta=1.007629,eta=32.56836,time=exa1.dat$time,line.colour="red")
h3<-hazard.plot.w2p(beta=1.104483,eta=36.53923,time=exa1.dat$time,line.colour="green")
Function used to run this code:
hazard.plot.w2p <- function(beta, eta, time, line.colour, nincr = 500) {
max.time <- max(time, na.rm = F)
t <- seq(0, max.time, length.out = nincr)
r <- numeric(length(t))
for (i in 1:length(t)) {
r[i] <- failure.rate.w2p(beta, eta, t[i])
}
plot(t, r, type = 'l', bty = 'l',
col = line.colour, lwd = 2,
main = "", xlab = "Time",
ylab = "Failure rate",
las = 1, adj = 0.5,
cex.axis = 0.85, cex.lab = 1.2)
par(new=TRUE)
}
Sample DataSet:
fail time
a 4.55
a 4.65
a 5.21
b 3.21
a 1.21
a 5.65
a 7.12
This is the output I am getting:
Here's a test, though I don't know if it works (lacking some of your functions/variables):
hazard.plot.w2p <- function(beta, eta, time, line.colour, nincr = 500,
add = FALSE) {
max.time <- max(time, na.rm = F)
t <- seq(0, max.time, length.out = nincr)
r <- failure.rate.w2p(beta, eta, t)
if (!add) {
plot(NA, type = 'n', bty = 'l', xlim=range(t), ylim=range(r),
main = "", xlab = "Time", ylab = "Failure rate",
las = 1, adj = 0.5, cex.axis = 0.85, cex.lab = 1.2)
}
lines(t, r, col = line.colour, lwd = 2)
}
failure.rate.w2p <- function(beta,eta,time) (beta/eta) * (time/eta)^(beta-1)
h1 <- hazard.plot.w2p(beta = 1.002, eta = 30, time = exa1.dat$time, line.colour = "orange")
h2 <- hazard.plot.w2p(beta = 1.007629, eta = 32.56836, time = exa1.dat$time, line.colour = "red", add = TRUE)
h3 <- hazard.plot.w2p(beta = 1.104483, eta = 36.53923, time = exa1.dat$time, line.colour = "green", add = TRUE)
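One caveat with the add = TRUE approach is that the y-axis limits come from the first curve only, so later curves can be clipped. A possible alternative, sketched here under assumptions, is to compute all the rates first and draw them in one call with matplot(), which chooses limits that cover every curve. The time values are the sample data from the question, and beta = 1.002, eta = 30 are the same placeholders used above for beta.spreda and eta.spreda.

```r
# Weibull (two-parameter) failure rate, vectorised over time.
failure.rate.w2p <- function(beta, eta, time) (beta / eta) * (time / eta)^(beta - 1)

# Sample times from the question.
time <- c(4.55, 4.65, 5.21, 3.21, 1.21, 5.65, 7.12)

# One entry per curve; the first pair is a placeholder.
betas <- c(1.002, 1.007629, 1.104483)
etas <- c(30, 32.56836, 36.53923)
cols <- c("orange", "red", "green")

t <- seq(0, max(time, na.rm = TRUE), length.out = 500)

# One column of rates per parameter pair.
rates <- sapply(seq_along(betas),
                function(i) failure.rate.w2p(betas[i], etas[i], t))

# A single call draws all curves on shared axes, so there is only
# one set of y-axis tick labels and nothing overlaps.
matplot(t, rates, type = "l", lty = 1, lwd = 2, col = cols,
        bty = "l", las = 1, xlab = "Time", ylab = "Failure rate",
        cex.axis = 0.85, cex.lab = 1.2)
legend("bottomright", legend = paste("beta =", betas),
       col = cols, lwd = 2, bty = "n")
```

This also removes the need for par(new = TRUE), which was what stacked several sets of axis labels on top of each other in the original function.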