Merge information on the same plot - r

I have some data that looks like this:
Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A
... .... ... ... ....
I would like to generate a plot (with ggplot) with Expression1 and Expression2 on the y and x axes respectively and dots coloured in a gradient of a single color according to the CellType column and at the same time distinguishing between Patient A and B on the same plot.
Can anyone help me please?
ggplot(myDF, aes(Expression1, Expression2)) + geom_point(aes(colour = CellType)) + scale_colour_gradient2(low = "black", mid = "white", high = "red") + ggtitle("First_attempt")
I don't know how to add a gradient for Patient
Thank you in advance

The below seems to work fine:
dt <- data.table::fread('Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A ')
library(ggplot2)
ggplot(dt) + geom_point(aes(x = Expression2, y = Expression1,
color = CellType, shape = Patient))
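The question asks for a gradient of a single colour, while the answer above relies on the default colour scale. A minimal sketch of one way to get that, assuming a light-to-dark red ramp (the two colour names are arbitrary choices, not from the original post), with Patient still shown as point shape:

```r
library(ggplot2)

# Sample rows copied from the question
dt <- data.frame(
  Expression1 = c(9.34, 8.2, 2.12, 2.10, 2.12, 2.11),
  Expression2 = c(8.23, 3.2, 5.3, 1.3, 1.5, 9.5),
  CellType    = c(3.2, 10.9, 12.9, 2.9, 2.9, 6.9),
  Patient     = c("A", "B", "B", "B", "A", "A")
)

p <- ggplot(dt, aes(x = Expression2, y = Expression1)) +
  geom_point(aes(colour = CellType, shape = Patient), size = 3) +
  # Two-point gradient between a light and a dark shade of one hue
  # (assumed colours; any pair of valid R colour names works)
  scale_colour_gradient(low = "mistyrose", high = "darkred") +
  ggtitle("CellType as a single-hue gradient, Patient as shape")
p
```

If shapes are hard to tell apart with many points, adding `facet_wrap(~ Patient)` to the same plot would put each patient in its own panel instead.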

Related

Problems with scatterplot error bars in ggplot2

I have a question about how to make a scatterplot with error bars. I'm working with stable isotopes, so I have data on D13C and D15N for faunal samples. I want to obtain a plot like the attached one (target.png), but without the convex hulls.
Instead, I obtain a plot like this one (CNPlot.png).
I'm using this script:
a<-read.table("Means.txt", header = TRUE)
theme_set(theme_classic(base_size = 16))
ggplot(a, aes(x=D13C, y=D15N)) +
geom_errorbar(aes(ymax=D13C+D13C.ds, ymin=D13C-D13C.ds), width=0.15,alpha=.8)+
geom_errorbarh(aes(xmax=D15N+D15N.ds, xmin=D15N-D15N.ds), height=0.15,alpha=.8)+
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))
I have two datasets. I'm using the one named Means, but I also have another one named CN_fauna, which holds the raw data.
Means:
Species D13C D13C.ds D15N D15N.ds
Bird -16.4 7.1 7.6 1.5
SH -18.5 1.7 5.5 2.7
CH -14.8 2.9 8.8 0.6
Deer -19.2 0.7 4.8 1.04
CN_fauna:
taxa D13C D15N
Bird -24.1 7.9
Bird -9.9 9
Bird -15.2 5.9
SH -17.0 9.6
SH -16.6 7.3
SH -20.3 4.6
SH -20.3 2.6
SH -20.3 2.7
SH -18.6 6.6
CH -16.9 9.4
CH -11.5 8.2
CH -16.1 8.8
Deer -18.6 3.0
Deer -19.1 6.0
Deer -18.3 5.4
Deer -17.9 5.4
Deer -19.2 5.6
Deer -20.4 5.6
Deer -19.5 6.1
Deer -20.3 5.9
Deer -18.7 5.4
Deer -19.7 3.8
Deer -19.2 3.4
Deer -19.9 4.1
Deer -18.4 4.3
Deer -20.1 4.1
I do not understand why the scales of the error bars are different in my plot. Any help is more than welcome.
Not to take away from the sage advice that @stefan provided (reproducible questions get better responses faster)...
I could be wrong, but I think your errorbar data is on the wrong axis. Is that what you were trying to create?
If so, you need to change your xmin to ymin and so on for the two errorbar layers, so that the D13C limits go with the x axis and the D15N limits go with the y axis. It would look something like this:
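# Means is the summary table read from Means.txt (called a in the question)
ggplot(Means, aes(x = D13C, y = D15N)) +
# horizontal bars: D13C is on the x axis (geom_errorbar accepts xmin/xmax in ggplot2 >= 3.3.0)
geom_errorbar(aes(xmax = D13C + D13C.ds,
xmin = D13C - D13C.ds), width=0.15,alpha=.8)+
# vertical bars: D15N is on the y axis; the cap size is set with width, not height
geom_errorbar(aes(ymax = D15N + D15N.ds,
ymin = D15N - D15N.ds), width=0.15,alpha=.8)+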
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))
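The Means table looks like a per-species summary of CN_fauna (an assumption; the Bird and CH means do match the raw rows). A small base R sketch of deriving such a table from the raw data, so the error bar limits always come from the same source as the points. The object names `cn_fauna` and `summary_df` are hypothetical, and only two taxa are copied here to keep the example short:

```r
# Raw measurements for two taxa, copied from the CN_fauna table in the question
cn_fauna <- data.frame(
  taxa = c("Bird", "Bird", "Bird", "CH", "CH", "CH"),
  D13C = c(-24.1, -9.9, -15.2, -16.9, -11.5, -16.1),
  D15N = c(7.9, 9.0, 5.9, 9.4, 8.2, 8.8)
)

# Per-taxon mean and standard deviation of both isotopes
means <- aggregate(cbind(D13C, D15N) ~ taxa, data = cn_fauna, FUN = mean)
sds   <- aggregate(cbind(D13C, D15N) ~ taxa, data = cn_fauna, FUN = sd)

# Rename the sd columns to match the naming used in Means
names(sds)[-1] <- c("D13C.ds", "D15N.ds")

# One row per taxon: taxa, D13C, D15N, D13C.ds, D15N.ds
summary_df <- merge(means, sds, by = "taxa")
print(summary_df)
```

The resulting data frame has the same columns as Means (with `taxa` in place of `Species`), so it could be passed to the plotting code above after renaming that column.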

Ordering a variable by a specific year in ggplot bar chart R

I have a question related to ordering specific values of a bar chart created with ggplot.
My data "df" is the following:
city X2020 X2021
1 Stuttgart 2.9 3.1
2 Munich 2.3 2.4
3 Berlin 2.2 2.3
4 Hamburg 3.8 4.0
5 Dresden 3.3 3.0
6 Dortmund 2.5 2.6
7 Paderborn 1.7 1.8
8 Essen 2.6 2.6
9 Heidelberg 3.0 3.2
10 Karlsruhe 2.5 2.4
11 Kiel 2.6 2.7
12 Ravensburg 3.3 2.7
I want exactly the kind of bar chart shown below, but the cities should be ordered only by their 2021 value. I tried reorder() inside ggplot as recommended, but the result is not what I expect: some cities end up in an odd position and I do not understand what R is doing here. My code is the following:
df_melt <- melt(df, id = "city")
ggplot(df_melt, aes(value, reorder(city, -value), fill = variable)) +
geom_bar(stat="identity", position = "dodge")
str(df_melt)
'data.frame': 24 obs. of 3 variables:
$ city : chr "Stuttgart" "Munich" "Berlin" "Hamburg" ...
$ variable: Factor w/ 2 levels "X2020","X2021": 1 1 1 1 1 1 1 1 1 1 ...
$ value : num 2.9 2.3 2.2 3.8 3.3 2.5 1.7 2.6 3 2.5 ...
https://i.stack.imgur.com/rJQMV.png
I think this gets messy because the variable "value" holds the values of both 2020 and 2021, and R possibly takes the mean of both (I don't know!). But I have no idea how to deal with this further. I hope somebody can help me with my concern.
Thanks!
You could try sorting your df with arrange and then use fct_inorder to ensure that the city levels are in the order that you want.
library(tidyverse)
df <- read_table(" city X2020 X2021
1 Stuttgart 2.9 3.1
2 Munich 2.3 2.4
3 Berlin 2.2 2.3
4 Hamburg 3.8 4.0
5 Dresden 3.3 3.0
6 Dortmund 2.5 2.6
7 Paderborn 1.7 1.8
8 Essen 2.6 2.6
9 Heidelberg 3.0 3.2
10 Karlsruhe 2.5 2.4
11 Kiel 2.6 2.7
12 Ravensburg 3.3 2.7 ")
#> Warning: Missing column names filled in: 'X1' [1]
df %>%
select(-X1) %>%
pivot_longer(-city) %>%
arrange(desc(name), -value) %>%
mutate(
city = fct_inorder(city)
) %>%
ggplot(aes(city, value, fill = name)) +
geom_col(position = "dodge")
Created on 2021-07-13 by the reprex package (v1.0.0)
I just want to add to the previous answer that you can also take this plot and use coord_flip() to achieve the final result you were looking for. 😉
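The asker's guess about the odd ordering is right: reorder() summarises the values of each group with mean() by default, so both years contribute. A base R sketch with three of the cities from the question shows the difference, along with one alternative that sets the factor levels from the 2021 column directly. The column names `city_2021` and `year` are introduced here for illustration only:

```r
library(ggplot2)

# Three cities copied from the question's data
df <- data.frame(
  city  = c("Stuttgart", "Hamburg", "Dresden"),
  X2020 = c(2.9, 3.8, 3.3),
  X2021 = c(3.1, 4.0, 3.0)
)

# Long format built by hand to keep the example free of extra packages
long <- data.frame(
  city  = rep(df$city, 2),
  year  = rep(c("X2020", "X2021"), each = nrow(df)),
  value = c(df$X2020, df$X2021)
)

# Default behaviour: levels follow the mean of both years per city
default_order <- reorder(long$city, -long$value)
levels(default_order)

# Alternative: take the level order from the 2021 column only
long$city_2021 <- factor(long$city, levels = df$city[order(-df$X2021)])
levels(long$city_2021)

p <- ggplot(long, aes(value, city_2021, fill = year)) +
  geom_col(position = "dodge")
p
```

With these three cities, Dresden and Stuttgart swap places between the two orderings. On a vertical axis the first level is drawn at the bottom, so wrapping the levels in `rev()` would put the largest 2021 value at the top.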

Plotting sales over time in R

I am trying to show the top 100 sales on a scatterplot by year. I used the below code to take top 100 games according to sales and then set it as a data frame.
top100 <- head(sort(games$NA_Sales,decreasing=TRUE), n = 100)
as.data.frame(top100)
I then tried to plot this with the below code:
ggplot(top100)+
aes(x=Year, y = Global_Sales) +
geom_point()
I get the error below when using the subset top100:
Error: data must be a data frame, or other object coercible by fortify(), not a numeric vector
If I use the actual games dataset, I get the attached plot.
Any ideas?
As pointed out in the comments by @CMichael, there are several issues in your code.
In the absence of a reproducible example, I used the iris dataset to explain what is wrong with your code.
top100 <- head(sort(games$NA_Sales,decreasing=TRUE), n = 100)
By doing that you are extracting only a single column, as a plain numeric vector.
The same command with the iris dataset:
> head(sort(iris$Sepal.Length, decreasing = TRUE), n = 20)
[1] 7.9 7.7 7.7 7.7 7.7 7.6 7.4 7.3 7.2 7.2 7.2 7.1 7.0 6.9 6.9 6.9 6.9 6.8 6.8 6.8
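So, first, you no longer have two dimensions to plot in ggplot2. Second, the column names are not kept during the extraction, so you cannot ask ggplot2 to plot Year and Global_Sales afterwards. Note also that as.data.frame(top100) is never assigned back, so top100 is still a numeric vector, which is what the error message reports.
To solve your issue, order the whole data frame and then take the top rows (shown here with the iris dataset):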
top100 = as.data.frame(head(iris[order(iris$Sepal.Length, decreasing = TRUE), 1:2], n = 100))
And you get a data.frame of this type:
> str(top100)
'data.frame': 100 obs. of 2 variables:
$ Sepal.Length: num 7.9 7.7 7.7 7.7 7.7 7.6 7.4 7.3 7.2 7.2 ...
$ Sepal.Width : num 3.8 3.8 2.6 2.8 3 3 2.8 2.9 3.6 3.2 ...
> head(top100)
Sepal.Length Sepal.Width
132 7.9 3.8
118 7.7 3.8
119 7.7 2.6
123 7.7 2.8
136 7.7 3.0
106 7.6 3.0
You can then plot it:
library(ggplot2)
ggplot(top100, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point()
Based on what you provided in your example, I would suggest:
top100 <- as.data.frame(head(games[order(games$NA_Sales,decreasing=TRUE),c("Year","Global_Sales")], 100))
However, if this does not do what you need, consider providing a reproducible example of your dataset (see "How to make a great R reproducible example").
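Since the original `games` data is not available, here is a sketch of the same fix on a stand-in data frame. The values are randomly generated purely for illustration; only the column names Year, NA_Sales and Global_Sales come from the question:

```r
library(ggplot2)

# Hypothetical stand-in for `games`; all numbers are invented
set.seed(1)
games <- data.frame(
  Year     = sample(1990:2015, 300, replace = TRUE),
  NA_Sales = round(runif(300, 0, 40), 2)
)
games$Global_Sales <- games$NA_Sales + round(runif(300, 0, 40), 2)

# Order whole rows by NA_Sales, then keep the first 100 rows.
# Indexing with [rows, columns] keeps a data frame, so column names survive.
top100 <- head(
  games[order(games$NA_Sales, decreasing = TRUE),
        c("Year", "Global_Sales", "NA_Sales")],
  100
)

p <- ggplot(top100, aes(x = Year, y = Global_Sales)) +
  geom_point()
p
```

Keeping NA_Sales in the subset is optional, but it makes it easy to check that the rows really are the top 100 by that column.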

How to add labels at the top of vlines in ggplot2 and add these in separate legends

I have created a dummy dataframe representative of my data-
SQ AgeGroup Prop LCI UCI
2010-1 0 to 18 4.3 4.2 4.4
2010-1 19 to 25 5.6 5.3 5.6
2010-1 26 and over 7.8 7.6 7.9
2010-2 0 to 18 4.1 3.9 4.2
2010-2 19 to 25 5.8 5.6 5.9
2010-2 26 and over 8.1 7.9 8.3
2010-3 0 to 18 4.2 4 4.4
2010-3 19 to 25 5.5 5.2 5.6
2010-3 26 and over 7.6 7.4 7.7
2010-4 0 to 18 3.9 3.6 4.1
2010-4 19 to 25 5.2 5 5.4
2010-4 26 and over 7.4 7.2 7.6
2011-1 0 to 18 4.3 4.1 4.5
2011-1 19 to 25 5.7 5.5 5.8
2011-1 26 and over 8.2 8 8.3
2011-2 0 to 18 4.1 4 4.5
2011-2 19 to 25 5.7 5.5 5.9
2011-2 26 and over 8.2 8 8.4
2011-3 0 to 18 4.4 4.2 4.6
2011-3 19 to 25 5.7 5.5 7.9
2011-3 26 and over 8.2 8 8.4
which creates an image that looks like this-
I have used the following code-
library(readxl)
library(dplyr)
library(epitools)
library(gtools)
library(reshape2)
library(binom)
library(pivottabler)
library(readxl)
library(phecharts)
library(ggplot2)
library(RODBC)
rm(list=ls())
df<-read_xlsx("Dummydata.xlsx")
pd<-position_dodge(width=0.3)
limits <- aes(ymax =df$UCI , ymin = df$LCI)
p<-ggplot(df, aes(x = SQ, y =Prop, group=AgeGroup, colour= AgeGroup)) +
geom_line(position=pd)+
geom_point(size=2.0, position=pd)+
geom_errorbar(limits, width = 0.55, size=0.4, position= pd)+
labs(
y = "Percentage",
x = "Study Quarter")
p<-p +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
scale_y_continuous(name="Percentage",breaks=c(0,2,4,6,8,10),limits=c(0,10))+ # limits need to change with every plot
scale_fill_manual(values = pal)+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,size=16))+
theme(axis.text.y=element_text(size=16))+
theme(legend.text = element_text(size=18))+
theme(legend.title=element_text(size=16))+
theme(legend.title=element_blank())+
theme(legend.position="bottom")+
theme(axis.title = element_text(size=22))
p + geom_vline(xintercept = c(2,4,6), linetype="dotted",
color = "black", size=1.0, show.legend = TRUE)
However, what I want is for each of the three vertical lines to have a label (L1, L2 and L3) at its top, plus a separate legend at the bottom where I can explain what these lines stand for. Something like this-
L1: Launch of x
L2: Launch of y
L3: Launch of z
Can someone please help with this?
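One possible approach, sketched under assumptions rather than taken from an accepted answer: keep the line positions and their descriptions in a small data frame, map `linetype` to the description inside `aes()` so that `geom_vline()` gets its own legend, and draw the short labels with `geom_text()`. The `events` table and its column names are hypothetical, and only part of the dummy data is copied to keep the example short:

```r
library(ggplot2)

# A few rows of the dummy data from the question
df <- data.frame(
  SQ       = rep(c("2010-1", "2010-2", "2010-3", "2010-4"), each = 2),
  AgeGroup = rep(c("0 to 18", "19 to 25"), times = 4),
  Prop     = c(4.3, 5.6, 4.1, 5.8, 4.2, 5.5, 3.9, 5.2),
  LCI      = c(4.2, 5.3, 3.9, 5.6, 4.0, 5.2, 3.6, 5.0),
  UCI      = c(4.4, 5.6, 4.2, 5.9, 4.4, 5.6, 4.1, 5.4)
)

# Hypothetical event table: position on the discrete x axis,
# short label for the plot, long description for the legend
events <- data.frame(
  x     = c(2, 4),
  label = c("L1", "L2"),
  event = c("L1: Launch of x", "L2: Launch of y")
)

pd <- position_dodge(width = 0.3)

p <- ggplot(df, aes(SQ, Prop, group = AgeGroup, colour = AgeGroup)) +
  geom_line(position = pd) +
  geom_point(size = 2, position = pd) +
  geom_errorbar(aes(ymin = LCI, ymax = UCI), width = 0.55, position = pd) +
  # linetype mapped inside aes() creates a second, separate legend
  geom_vline(data = events, aes(xintercept = x, linetype = event),
             colour = "black") +
  # short labels just above the top of each line
  geom_text(data = events, aes(x = x, y = 10, label = label),
            inherit.aes = FALSE, vjust = -0.3) +
  scale_y_continuous(limits = c(0, 10), breaks = seq(0, 10, 2)) +
  scale_linetype_manual(name = NULL, values = rep("dotted", nrow(events))) +
  # allow the labels to be drawn outside the panel area
  coord_cartesian(clip = "off") +
  labs(x = "Study Quarter", y = "Percentage") +
  theme(legend.position = "bottom", plot.margin = margin(20, 5, 5, 5))
p
```

Passing `xintercept` as a plain argument, as in the question, treats the lines as annotations without a legend, which is why the mapping has to go through `aes()` here.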

R - How to conditionally color the values of a data.frame and make a plot

I have this data frame, created with the function table.CalendarReturns from the PerformanceAnalytics package (the row names are the years and the column names are the months):
> tabRet
gen feb mar apr may jun jul aug sep oct nov dec System
2004 0.0 3.5 2.9 2.0 1.1 -0.4 1.2 3.3 0.9 1.8 3.0 -0.6 20.1
2005 1.6 2.3 -1.2 4.0 0.0 1.6 -1.4 2.4 0.7 2.9 2.9 0.4 17.3
2006 0.8 2.7 0.3 1.4 6.2 -2.6 2.1 2.8 0.5 0.3 0.7 3.1 19.6
2007 1.3 0.1 1.4 0.1 1.6 -1.0 1.0 1.5 -0.7 1.0 1.3 -0.7 7.0
2008 1.4 -1.2 2.2 1.2 -0.3 -0.8 2.2 0.4 1.1 0.1 4.4 -1.3 9.7
2009 4.8 3.2 1.6 3.5 0.7 1.7 2.1 2.2 2.5 1.9 1.5 2.8 32.4
2010 3.5 0.5 0.4 1.3 1.8 3.8 3.7 3.0 1.1 1.2 3.9 3.4 31.2
2011 4.3 2.1 1.6 -0.8 3.9 1.5 4.0 5.4 2.3 2.9 0.2 1.5 33.0
2012 1.1 1.9 -0.1 2.3 1.0 3.6 1.5 0.7 0.0 1.5 1.2 0.5 16.3
2013 0.8 2.5 1.2 1.4 0.0 1.7 2.3 1.7 0.5 0.2 1.3 0.6 15.1
2014 0.1 0.7 0.3 -0.7 1.0 1.0 0.2 0.9 -0.7 2.3 1.4 1.4 8.2
2015 2.3 1.0 1.1 3.1 4.5 -0.7 -0.3 2.3 2.4 0.4 -1.3 1.0 16.7
2016 2.1 2.5 0.9 1.0 0.2 NA NA NA NA NA NA NA 7.0
I would like to create a plot with something like this for the colors of the numbers: ifelse(values< 0,'red','black').
I tried with the addtable2plot function, from the plotrix package with bad results.
Any tips about this problem? Thank you in advance guys.
EDIT:
I need something like this but with the negative numbers in red:
textplot(Hmisc::format.df(tabRet, na.blank=TRUE, numeric.dollar=FALSE, cdec=rep(1,dim(tabRet)[2])), rmar = 0.8, cmar = 1, max.cex=.9, halign = "center", valign = "center", row.valign="center", wrap.rownames=20, wrap.colnames=10, col.colnames="Darkblue",col.rownames="Darkblue", mar = c(0,0,4,0)+0.1)
title(main="Calendar Monthly Returns",col.main="Darkblue", cex.main=1)
Simply add col.data=ifelse(tabRet<0,'red','black'), after col.rownames="Darkblue", to your code
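To see what that ifelse() call produces, here is a small sketch on three rows and two columns copied from tabRet. The name `cell_cols` is introduced for illustration, and replacing NA colours with "black" is an assumption about how blank cells should be handled:

```r
# Three rows and two columns copied from tabRet in the question
tabRet <- data.frame(
  gen = c(0.0, 1.6, 2.1),
  jun = c(-0.4, 1.6, NA),
  row.names = c("2004", "2005", "2016")
)

# Comparing a data frame with a number gives a logical matrix,
# so ifelse() returns a character matrix of the same shape
cell_cols <- ifelse(tabRet < 0, "red", "black")

# Cells that are NA in the data are NA in the colour matrix too;
# give them an explicit colour so every cell has one
cell_cols[is.na(cell_cols)] <- "black"

print(cell_cols)
```

The result has one colour per cell, which is the shape a per-cell colour argument such as col.data expects.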
Here is a solution using geom_tile{ggplot2} with a reproducible example:
# load libraries
library(ggplot2)
library(ggthemes)
library(data.table)
library(PerformanceAnalytics)
# load data
data(managers)
df <- as.data.frame(t(table.CalendarReturns(managers[,c(1,7,8)])))
# Convert row names into first column
df <- setDT(df, keep.rownames = TRUE)[]
setnames(df, "rn", "month")
# reshape your data
df2 <- melt(df, id.var = "month")
# Plot
ggplot(df2, aes(x=month, y=variable)) +
geom_tile( fill= "white", color = "white") +
geom_text(aes(label=value, color= value < 0)) +
scale_color_manual(guide = "none", values = c("TRUE" = "red", "FALSE" = "black")) + # named values: negatives (TRUE) in red
theme_pander( ) +
theme(axis.text = element_text(face = "bold")) +
ggtitle("Calendar Monthly Returns")
You can also choose to fill the tiles instead of the text.
ggplot(df2) +
geom_tile( aes(x=month , y=variable, fill= value < 0), color = "gray70", alpha = 0.7) +
scale_fill_manual(guide = "none", values = c("TRUE" = "red", "FALSE" = "black")) + # named values: negatives (TRUE) in red
theme_pander()
In any case, this answer provides a general approach to conditional colors in ggplot.
ggplot(mtcars, aes(wt, mpg)) +
geom_point( aes(color= ifelse(mpg > 20, "A", "B")) ) +
scale_color_manual(guide = "none", values = c("red", "black"))
You can also do this using base:
plot(mtcars$wt, mtcars$mpg,
col=ifelse( mtcars$mpg > 20 ,"red", "black") )
If this is what you are looking for:
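library(data.table)
library(tidyr)
library(ggplot2)
# Simulated data: one row per year (gen), one column per month
d<-data.table(gen=seq(2004,2016,1),Jan=round(rnorm(13)),Feb=round(rnorm(13)),Mar=round(rnorm(13)))
# Reshape to long format, keeping the year as an identifier column
gd<-gather(d,key="month",value="value",-gen)
# Flag negative values so they can be filled in red
gd$col<-ifelse(gd$value<0,"negative","non-negative")
ggplot(data=gd,aes(x=month,y=factor(gen)))+geom_tile(aes(fill=col))+
scale_fill_manual(values=c("negative"="red","non-negative"="black"))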
I am not sure this is exactly the solution, but I hope the code helps. It takes a data.frame and draws it as a conditionally formatted heat map, with the value of each cell printed in its tile.
Data in a data.frame: https://i.stack.imgur.com/jETa8.png
library(RcmdrMisc) # readXL()
library(reshape2) # melt()
library(plyr) # ddply()
library(ggplot2)
pmensual <- readXL("D:precipitacion mensual
2021/Base de Datos Precipitación mensual .xls",
rownames=FALSE, header=TRUE, na="", sheet="Preci_mes",
stringsAsFactors=TRUE)
nba.m <- melt(pmensual)
# transform() with no extra arguments returns each group unchanged
nba.s <- ddply(nba.m, .(variable), transform)
ggplot(nba.s, aes(variable, Nombre)) +
geom_tile(aes(fill = value), colour = "white") +
#scale_colour_gradientn(colours = c("blue", "green", "red"))+
scale_fill_gradient2(low = "white", high = "red", mid = "blue", midpoint = 400) +
geom_text(aes(label = round(value, 1)), size= 3.5, vjust = 0.4)+
scale_x_discrete("Month", expand = c(0, 0)) +
scale_y_discrete("Stations", expand = c(0, 0)) +
theme( legend.position="bottom", legend.title=element_text(), panel.background =
element_rect(fill = "white", colour = "gray"),
panel.grid.major = element_line(colour = "gray"))+
labs(title = "Heat map of precipitation in the study area",
subtitle = "Accumulated monthly precipitation" , fill = "Precipitation (mm)" )
Result: https://i.stack.imgur.com/FVYV4.png
