I have been trying to display the odds of receiving a psychotropic medication for a list of psychiatric diagnoses but have not been able to show the entire range (on a log scale) due to the limitations of the x axis.
Looking at the forestplot documentation, it appears that the clip argument is what sets the x-axis limits. However, whenever I set its upper value above 54, the tick labels along the bottom stop at 4 and no higher values are shown. This is a problem because I need to plot values as high as 221 (the upper confidence limit of my highest odds ratio).
I am using the following code:
library(forestplot)
# Odds ratios and confidence limits for each diagnosis
base_data <- tibble::tibble(mean = c(19.92 , 41.46, 11.67, 11.69, 25.44, 105.89, 145.45),
lower = c(17.09, 34.70, 9.04, 10.92, 19.78, 67.40, 95.64),
upper = c(23.22, 49.54, 15.07, 12.51, 32.73, 166.37, 221.22),
study = c("Autism", "Conduct Problems", "Tic Disorder", "ADHD",
"OCD", "Schizophrenia", "Manic Bipolar"),
OR = c("19.92" , "41.46", "11.67", "11.69", "25.44", "105.89", "145.45"))
base_data |>
forestplot(labeltext = c(study, OR),
clip = c(0.1, 54),
xlog = TRUE) |>
fp_set_style(box = "royalblue",
line = "darkblue",
summary = "royalblue") |>
fp_add_header(study = c("", "Study"),
OR = c("", "OR")) |>
fp_append_row(mean = 60.22,
lower = 41,
upper = 83,
study = "Summary",
OR = "60.22",
is.summary = TRUE) |>
fp_set_zebra_style("#EFEFEF")
Which creates this graph:
If I set the clip to 220 I am able to plot this but the x axis will stop at 4 as shown below:
Does anyone know how to get past this issue and set the xlimit ticks to a very high number (e.g. 100+) while still using a log scale?
Keeping it on a log scale would mean an equal distance between 1, 10, and 100. It would show the entire range of values (up to the final value of 221) while still making differences at the lower end visible.
Any help is extremely appreciated. Thank you so much!
According to the docs:
xlog: The xlog outputs the axis in log() format but the input data
should be in antilog/exp format
So you could transform your data using exp. To add labels you can use xticks. Here is some reproducible code:
library(forestplot)
base_data$mean <- exp(base_data$mean)
base_data$lower <- exp(base_data$lower)
base_data$upper <- exp(base_data$upper)
base_data |>
forestplot(labeltext = c(study, OR),
xlog = TRUE,
xticks = c(0, 50, 100, 150, 200, 250))|>
fp_set_style(box = "royalblue",
line = "darkblue",
summary = "royalblue") |>
fp_add_header(study = c("", "Study"),
OR = c("", "OR")) |>
fp_append_row(mean = 60.22,
lower = 41,
upper = 83,
study = "Summary",
OR = "60.22",
is.summary = TRUE) |>
fp_set_zebra_style("#EFEFEF")
Created on 2023-01-24 with reprex v2.0.2
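Since the goal is equal spacing between 1, 10 and 100, one option is to compute tick positions as powers of ten that cover the whole range of the confidence limits. The sketch below uses only base R and the lower/upper values from the question; log10_ticks is a made-up helper name, not a forestplot function.

```r
# Hypothetical helper (not part of forestplot): powers of ten that cover
# the full range of the confidence limits.
log10_ticks <- function(lower, upper) {
  stopifnot(all(lower > 0), all(upper > 0))
  lo <- floor(log10(min(lower)))     # largest power of ten at or below the minimum
  hi <- ceiling(log10(max(upper)))   # smallest power of ten at or above the maximum
  10^(lo:hi)
}

# Confidence limits from the question
lower <- c(17.09, 34.70, 9.04, 10.92, 19.78, 67.40, 95.64)
upper <- c(23.22, 49.54, 15.07, 12.51, 32.73, 166.37, 221.22)

ticks <- log10_ticks(lower, upper)
print(ticks)
```

The resulting vector could then be tried as the xticks argument, with clip set at least as wide as range(ticks). Whether xticks expects values on the original or on the log scale when xlog = TRUE is something I am assuming needs checking against the documentation of the installed forestplot version.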
I am trying to plot a heatmap (colored by odds ratios) using ggplot2. The odds ratio values range from 0-200. I would like my heatmap legend to show markings corresponding to certain values (0.1, 1, 10, 50, 100, 200). This is the code I am using but my legend does not label all the values (see below)
Code below:
map is a sample data frame with columns: segments, OR, tissue type
library(ggplot2)
library(dplyr)   # mutate() and %>%
library(scales)  # rescale()
segments <- c("TssA", "TssBiv", "BivFlnk", "EnhBiv","ReprPC", "ReprPCWk", "Quies", "TssAFlnk", "TxFlnk", "Tx", "TxWk", "EnhG", "Enh", "ZNF/Rpts", "Het")
OR <- c(1.4787622, 46.99886002, 11.74417278, 4.49223136, 204.975818, 1.85228517, 0.85762414, 0.67926846, 0.33696213, 0.06532777, 0.10478027, 0.07462983, 0.06501252, 1.32922162, 0.32638438)
df <- data.frame(segments, OR)
map <- df %>% mutate(tissue = 'colon')
ggplot(map, aes(tissue,segments, fill = OR))+ geom_tile(colour="gray80")+
theme_bw()+coord_equal()+
scale_fill_gradientn(colours=c("lightskyblue1", "white","navajowhite","lightsalmon", "orangered2", "indianred1"),
values=rescale(c(0.1, 1, 10, 50, 100, 200)), guide="colorbar", breaks=c(0.1, 1, 10, 50, 150, 200))
I am looking for my legend to look something similar to this (using the values I specified):
With your map data, first transform OR to log(OR).
Also, you might want white to correspond to OR = 1. If so, this approach can achieve that, though you may need to try different limits values with your real data.
map_1 <-map %>% mutate(OR = log(OR))
OR_max <- max(map$OR, na.rm = TRUE)
log_list <- c(0.2, 1, 10, 50, 200) %>% log
ggplot(map_1, aes(tissue,segments, fill = OR))+ geom_tile(colour="gray80")+
theme_bw()+coord_equal()+
scale_fill_gradientn(
colours = c("red3", "white", "navy"),
values=rescale(log_list),
guide="colorbar",
breaks=log_list,
limits = c(1/OR_max, OR_max) %>% log,
labels = c("0.2", "1", "10", "50", "200")  # must match the values in log_list
)
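To see what the values argument is being given, here is a minimal base R sketch that places each break on the 0-1 colour scale after a log10 transform. The helper name log_positions is made up for this example; it is roughly what rescale() does when it is passed logged breaks and no explicit range.

```r
# Hypothetical helper: position of each break on a 0-1 colour scale
# after a log10 transform (limits default to the range of the breaks).
log_positions <- function(breaks, limits = range(breaks)) {
  stopifnot(all(breaks > 0), all(limits > 0))
  lo <- log10(limits[1])
  hi <- log10(limits[2])
  (log10(breaks) - lo) / (hi - lo)
}

# Breaks requested in the question
breaks <- c(0.1, 1, 10, 50, 100, 200)
pos <- log_positions(breaks)
print(round(pos, 3))
```

An alternative worth trying is to leave OR untransformed and ask the fill scale itself for a log transform (the trans = "log10" argument of the continuous scales, named transform in newer ggplot2 releases, as far as I know), so that breaks can be given directly as 0.1, 1, 10, and so on.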
I want to know how to control the colour key scale in a scatter plot with three variables (x, y and z) in the openair package.
When I make a scatter plot, the key range is chosen automatically from the data. I want to fix the key scale to 0-100 % RH.
scatterPlot(data,
x="O3",y="SOC",z="RH",col="jet",linear="FALSE",cex=0.8,fontsize=35,
xlim=c(0,0.05),ylim=c(0,20),key.footer = "RH(%)", xlab="O3 (ppm)",
ylab="SOC(ug/m3)",labelFontsize=13)
To force the key range, you can add pseudo data that lie outside the plot limits and have RH = 0. For example:
data <- data.frame("O3"=c(0.01, 0.02, 0.03, 0.04, 0.03),
"SOC"=c(5, 3, 4, 2, 8),
"RH"=c(100, 52, 75, 83, 63))
newdata <- rbind(data, data.frame("O3"=-1, "SOC"=-1, "RH"=0))
scatterPlot(newdata, x="O3",y="SOC",z="RH",col="jet",linear="FALSE",cex=0.8,fontsize=35,
xlim=c(0,0.05),ylim=c(0,20),key.footer = "RH(%)", xlab="O3 (ppm)",
ylab="SOC(ug/m3)",labelFontsize=13)
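The same trick can be wrapped in a small helper that adds one anchor row for each end of the desired key range. This is only a sketch in base R: pad_z_range is a made-up name, and it assumes that the off-plot coordinate (-1 by default) really lies outside the xlim and ylim passed to scatterPlot.

```r
# Hypothetical helper: append two off-plot rows so that column `z`
# spans exactly `z_range` (for example 0 to 100 for RH).
pad_z_range <- function(data, x, y, z, z_range, off_plot = -1) {
  anchors <- data[rep(1, 2), , drop = FALSE]  # copy a row to keep all columns and types
  anchors[[x]] <- off_plot                    # assumed to lie outside xlim
  anchors[[y]] <- off_plot                    # assumed to lie outside ylim
  anchors[[z]] <- z_range                     # one row per end of the range
  out <- rbind(data, anchors)
  rownames(out) <- NULL
  out
}

data <- data.frame("O3"  = c(0.01, 0.02, 0.03, 0.04, 0.03),
                   "SOC" = c(5, 3, 4, 2, 8),
                   "RH"  = c(100, 52, 75, 83, 63))
padded <- pad_z_range(data, x = "O3", y = "SOC", z = "RH", z_range = c(0, 100))
print(range(padded$RH))
```

The padded data frame would then be passed to scatterPlot in place of newdata. Adding both ends means the key still covers 0 to 100 when the real data happen not to contain a value of 100.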
I would like to make some plots from my data. Unfortunately, it is hard to predict how many plots I will need, because that depends on the data and may vary. That is why I would like the solution to be easily adjustable. Most often, however, each plot will be made from a group of 3 rows.
So, I would like to plot from rows 1-3, 4-6, 7-9, etc.
This is data:
> dput(DF_final)
structure(list(AC = c(0.0031682160632777, 0.00228591145206846,
0.00142094444568728, 0.000661218113472149, 0.0010078157353918,
0.000400289437089513, 40.4634784175177, 40.5055070858594, 0.0183737773741582
), SD = c(0.00250647379467532, 0.0013244185401148, 0.000469332241199189,
0.000294558308707343, 0.000385553400676202, 0.000104447914881357,
11.0693842400794, 8.78768774254084, 0.00696532251341454), ln_AC = c(-5.75458660556339,
-6.08099044923792, -6.556433525855, -7.32142679754668, -6.89996992823399,
-7.8233226797995, 3.70039979980691, 3.70143794229703, -3.99683077355773
), ln_SD = c(-5.98887837626238, -6.62678175351058, -7.66419963690747,
-8.13003358225542, -7.86083085139947, -9.16682203300101, 2.40418312097106,
2.17335162163583, -4.96681136795312), Percent_AC = c(126.401324043689,
172.597361244303, 302.758754023937, 224.477834753288, 261.394591157605,
383.243109777925, 365.544076706723, 460.934756361151, 263.789326894369
), Percent_SD = c(100, 100, 100, 100, 100, 100, 100, 100, 100
), TP = c(0, 40, 80, 0, 40, 80, 0, 40, 80)), row.names = c("Tim_0",
"Tim_40", "Tim_80", "Jack_0", "Jack_40", "Jack_80", "Tom_0",
"Tom_40", "Tom_80"), class = "data.frame")
Column ln_AC should be on the y axis and column TP on the x axis. First of all, I would like to have each group on a separate graph, next to each other (keeping in mind that the number of plots may be high at some point), and, if possible, also everything on the same graph. It should be a point plot with a trend line.
Is it also possible to show the slope, the SD of the slope, and R^2 from the linear regression on the plot?
I managed to do it for a single plot, but the regression line looks strange...
The code below was used to generate this plot and regression line.
fit <- lm(DF_final$ln_AC~DF_final$TP, data=DF_final)
plot(DF_final[1:3,7], DF_final[1:3,3], type = "p", ylim = c(-10,0), xlim=c(0,100), col = "red")
lines(DF_final$TP, fitted(fit), col="blue")
In base R (without so many packages), you can do:
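# split by the name prefix (Tim, Jack, Tom), which here means every 3 rows
DF = split(DF_final,gsub("_[^ ]*","",rownames(DF_final) ))
# to split strictly every 3 rows regardless of names, you can also do
# DF = split(DF_final, (seq_len(nrow(DF_final)) - 1) %/% 3)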
To store your values:
slopes = vector("numeric",length(DF))  # one entry per group, however many there are
names(slopes) = names(DF)
rsq = vector("numeric",length(DF))
names(rsq) = names(DF)
To plot:
par(mfrow=c(1,length(DF)))  # one panel per group
for(i in names(DF)){
fit <- lm(ln_AC~TP, data=DF[[i]])
plot(DF[[i]]$TP, DF[[i]]$ln_AC, type = "p", col = "red",main=i)
abline(fit, col="blue")
slopes[i]=round(fit$coefficients[2],digits=2)
rsq[i]=round(summary(fit)$r.squared,digits=2)
mtext(side=1,paste("slope=",slopes[i],"\nrsq=",rsq[i]),
padj=-2,cex=0.7)
}
And your values:
slopes
 Jack   Tim   Tom
-0.01 -0.01 -0.10
rsq
Jack  Tim  Tom
0.29 0.99 0.75
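The question also asked for the SD of the slope, which the loop above does not store. A minimal sketch of one way to collect slope, its standard error and R^2 in a single data frame is below; it assumes that "SD slope" means the standard error reported by summary(lm()), and it retypes TP and ln_AC from DF_final rounded to 4 decimals so that the snippet runs on its own.

```r
# TP and ln_AC from DF_final above, rounded to 4 decimals for brevity.
dat <- data.frame(
  person = rep(c("Tim", "Jack", "Tom"), each = 3),
  TP     = rep(c(0, 40, 80), times = 3),
  ln_AC  = c(-5.7546, -6.0810, -6.5564,
             -7.3214, -6.9000, -7.8233,
              3.7004,  3.7014, -3.9968)
)

# One row of regression statistics per person, for any number of persons.
fit_stats <- do.call(rbind, lapply(split(dat, dat$person), function(d) {
  s <- summary(lm(ln_AC ~ TP, data = d))
  data.frame(
    person   = d$person[1],
    slope    = coef(s)["TP", "Estimate"],
    slope_se = coef(s)["TP", "Std. Error"],  # assumed meaning of "SD slope"
    r2       = s$r.squared
  )
}))
print(fit_stats, digits = 3)
```

With only three points per person each fit has a single residual degree of freedom, so the standard errors should be read with caution.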
If I understand correctly, the reason you want 3 observations per graph is that you have different individuals (Jack, Tim, Tom). Is that so?
If you don't want to worry about that number, you can do this:
library(ggplot2)
library(reshape2)  # provides melt()
data <- DF_final   # the data frame from the question
# move rownames to column
data$person <- rownames(data)
data$person <- gsub("\\_.*","",data$person) # remove TP from names
# better to use library(data.table) for this step
data <- melt(data,id.vars=c("person","TP","ln_AC"))
ggplot(data,aes(x=TP, y=ln_AC)) + geom_point() +
geom_smooth(method = "lm") + facet_grid(~person)
This results in a plot like @giocomai's, but it also works if you have 4, 5, 6 or any other number of persons in your data.
---- Edit
If you want to add R2 values, you can do something like this. Note that it may not be the best or most elegant solution, but it works.
data <- data.frame(...)
data$person <- rownames(data)
data$person <- gsub("\\_.*","",data$person)
# run lm for all persons and save them in a data.frame
nomi <- unique(data$person)
#lmStats <- data.frame()
lmStats <- sapply(nomi,
function(ita){
model <- lm(ln_AC~TP,data= data[which(data$person == ita),])
lmStat <- summary(model)
# I only save r2, but you can get all the statistics you need
lmRow <- data.frame("r2" = lmStat$r.squared )
#lmStats <- rbind(lmStats,lmRow)
}
)
lmStats <- do.call(rbind,lmStats)
# format the output,and create a dataframe we will use to annotate facet_grid
lmStats <- as.data.frame(lmStats)
rownames(lmStats) <- gsub("\\..*","",rownames(lmStats))
lmStats$person <- rownames(lmStats)
colnames(lmStats)[1] <- "r2"
lmStats$r2 <- round(lmStats$r2,2)
lmStats$TP <- 40
lmStats$ln_AC <- 0
lmStats$lab <- paste0("r2= ",lmStats$r2)
# melt and add r2 column to the data (not necessary, but I like to have everything I plot in the data)
data <- melt(data,id.vars=c("person","TP","ln_AC"))
data$r2 <- lmStats[match(data$person,rownames(lmStats)),1]
ggplot(data,aes(x=TP, y=ln_AC)) + geom_point() +
geom_smooth(method = "lm") + facet_grid(~person) +
geom_text(data=lmStats,label=lmStats$lab)
An easier way (fewer steps) would be to use facet_grid(~r2), so that you have the R-squared value in the title.
If I understand correctly what you mean, and assuming you will always have three observations per graph, your main issue is creating a categorical variable to separate them. Here's one way to accomplish it. Depending on the layout you prefer, you may want to check facet_wrap instead of facet_grid.
library("dplyr")
library("ggplot2")
DF_final <- structure(list(AC = c(0.0031682160632777, 0.00228591145206846,
0.00142094444568728, 0.000661218113472149, 0.0010078157353918,
0.000400289437089513, 40.4634784175177, 40.5055070858594, 0.0183737773741582
), SD = c(0.00250647379467532, 0.0013244185401148, 0.000469332241199189,
0.000294558308707343, 0.000385553400676202, 0.000104447914881357,
11.0693842400794, 8.78768774254084, 0.00696532251341454), ln_AC = c(-5.75458660556339,
-6.08099044923792, -6.556433525855, -7.32142679754668, -6.89996992823399,
-7.8233226797995, 3.70039979980691, 3.70143794229703, -3.99683077355773
), ln_SD = c(-5.98887837626238, -6.62678175351058, -7.66419963690747,
-8.13003358225542, -7.86083085139947, -9.16682203300101, 2.40418312097106,
2.17335162163583, -4.96681136795312), Percent_AC = c(126.401324043689,
172.597361244303, 302.758754023937, 224.477834753288, 261.394591157605,
383.243109777925, 365.544076706723, 460.934756361151, 263.789326894369
), Percent_SD = c(100, 100, 100, 100, 100, 100, 100, 100, 100
), TP = c(0, 40, 80, 0, 40, 80, 0, 40, 80)), row.names = c("Tim_0",
"Tim_40", "Tim_80", "Jack_0", "Jack_40", "Jack_80", "Tom_0",
"Tom_40", "Tom_80"), class = "data.frame")
DF_final %>%
mutate(id = as.character(sapply(1:(nrow(DF_final)/3), rep, 3))) %>%
ggplot(aes(x=TP, y=ln_AC)) +
geom_point() +
geom_smooth(method = "lm") +
facet_grid(~id)
Created on 2020-02-06 by the reprex package (v0.3.0)
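If the number of rows is not guaranteed to be a multiple of 3, the grouping variable can also be built with integer division. A small base R sketch (block_id is a made-up helper name):

```r
# Hypothetical helper: group label for every block of `size` consecutive rows.
# A final, incomplete block simply gets its own label.
block_id <- function(n_rows, size = 3) {
  (seq_len(n_rows) - 1) %/% size + 1
}

ids_9  <- block_id(9)    # three complete blocks
ids_10 <- block_id(10)   # the tenth row starts a fourth block
print(ids_9)
print(ids_10)
```

It could replace the sapply() call inside mutate(), for example id = as.character(block_id(n())), and the block size is then a single argument to change.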
I have got dive depth data for seabirds over several trips and I would like to find the modes for each trip, plot the density functions and a line corresponding to the modes. So far, here's the code I have been using:
library(sm)  # provides sm.density.compare()
maxdepths<-read.csv("maximum_depths.csv", header=T)
maxdepths_ind21<-maxdepths[maxdepths$bird=="21",]
# create value labels
trip.f <- factor(maxdepths_ind21$trip, levels= c(21.1,21.2,21.3,21.4,21.5,21.6,21.7,21.8),
labels = c("Trip1", "Trip2", "Trip3", "Trip4", "Trip5", "Trip6", "Trip7", "Trip8"))
# plot densities
z<-sm.density.compare(maxdepths_ind21$maxdep, maxdepths_ind21$trip,model="equal")
sm.density.compare(maxdepths_ind21$maxdep, maxdepths_ind21$trip, xlab="Maximum depth (m)", xlim=c(0, 90), axes=F)
title(main="Maximum dive depth by trip, individual 21")
axis(side = 1, at = seq(0, 90, by = 5))
axis(side = 2, at = seq(0, 0.1, by = 0.01))
# add legend via mouse click
colfill<-c(2:(1+length(levels(trip.f))))  # one colour per trip
legend(locator(1), levels(trip.f), fill=colfill)
The result looks good, ie I've got one curve per trip with different colours/line types per trip.
I would now like to draw a line for each trip at the depth where its density function reaches its maximum, and also to find those values. I am aware of this thread
R: getting data (instead of plot) back from sm.density.compare
and I have tried assigning the result of sm.density.compare to an object and then calling it, like so:
z<-sm.density.compare(maxdepths_ind21$maxdep, maxdepths_ind21$trip,model="equal")
z
I was looking for the values of the modes within this output but I got confused by all the values that are returned.
Any help would be much appreciated!
TIA
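One possible way to get the modes without digging through the sm output is to estimate each trip's density separately with base R's density() and take the x value at which the estimate peaks. This is a sketch under assumptions: it uses stats::density rather than sm.density.compare, so the bandwidth (and therefore the exact peak) may differ slightly from the curves drawn by sm, and the depths below are simulated only to make the snippet runnable.

```r
# Hypothetical helper: mode (location of the density peak) for each group.
density_modes <- function(x, group, ...) {
  sapply(split(x, group), function(v) {
    d <- density(v, ...)      # stats::density, Gaussian kernel by default
    d$x[which.max(d$y)]       # x value at which the estimated density is highest
  })
}

# Simulated depths for two trips, only to show the call.
set.seed(1)
depth <- c(rnorm(200, mean = 20, sd = 3), rnorm(200, mean = 60, sd = 5))
trip  <- rep(c("Trip1", "Trip2"), each = 200)

modes <- density_modes(depth, trip)
print(round(modes, 1))
```

With the real data the call would be density_modes(maxdepths_ind21$maxdep, maxdepths_ind21$trip), and abline(v = modes, lty = 2) after the sm.density.compare plot would draw one vertical line per trip.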