Volcano plot - colors - r

I am trying to plot a volcano plot with ggplot2. I would like to have three different colors based on the following criteria:
qvalue <0.05 and meth.diff > 25% = Red
qvalue <0.05 and meth.diff < -25% (minus 25%) = Green
qvalue <0.05 and meth.diff between +25 and -25 = Gray
Similar questions have been asked here before and I tried following them but keep getting error messages. Any suggestions would be highly appreciated.
Here is the raw data file:
chr start end strand pvalue qvalue meth.diff
16 chr1 37801 38100 * 2.246550e-05 4.487042e-04 -36.485769
17 chr1 38101 38400 * 5.699781e-06 1.376471e-04 55.755181
29 chr1 49501 49800 * 1.453030e-18 2.442391e-16 -18.381131
35 chr1 62701 63000 * 5.547627e-03 3.686303e-02 -31.871711
54 chr1 122401 122700 * 3.917230e-03 2.845933e-02 63.443366
57 chr1 130201 130500 * 8.941091e-04 9.253737e-03 -8.347167
myDiff1p$threshold = factor(ifelse(myDiff1p$meth.diff>25 & myDiff1p$qvalue< 0.05, 1,
ifelse(myDiff1p$meth.diff<-25 & myDiff1p$qvalue< 0.05,-1,0)))
ggplot(data=myDiff1p, aes(x=meth.diff, y=-log10(qvalue))) +
geom_point(aes(color=myDiff1p$threshold), alpha=0.4, size=1.75)+
geom_vline(xintercept=c(-25,25), color="red", alpha=1.0)+
geom_hline(yintercept=2, color="blue", alpha=1.0)+
xlab("Differential Methylation")+
ylab("-log10 (qvalue)")+
theme_bw()+
xlim(c(-75, 75)) +
ylim(c(0, 300))
Error: Discrete value supplied to continuous scale

You have an almost unnoticeable mistake in this line:
myDiff1p$threshold = factor(ifelse(myDiff1p$meth.diff>25 & myDiff1p$qvalue< 0.05, 1,
ifelse(myDiff1p$meth.diff<-25 & myDiff1p$qvalue< 0.05,-1,0)))
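Because there is no space in myDiff1p$meth.diff<-25, R parses it as an assignment (myDiff1p$meth.diff <- 25 & ...) rather than the comparison myDiff1p$meth.diff < -25. As a result, meth.diff gets overwritten with a logical vector, which is why ggplot2 then complains about a discrete value on a continuous scale. You will need to reload or rebuild myDiff1p before re-running the corrected code.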
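A minimal sketch of the parsing difference, using a small made-up data frame d (not the asker's data). quote() captures each expression without evaluating it, so the top-level operator can be inspected directly:

```r
# Hypothetical toy data, only for illustration
d <- data.frame(meth.diff = c(-30, 10, 40), qvalue = c(0.01, 0.01, 0.01))

# quote() returns the parsed call; its first element is the top-level operator
no_space   <- quote(d$meth.diff<-25 & d$qvalue < 0.05)
with_space <- quote(d$meth.diff < -25 & d$qvalue < 0.05)
op_no_space   <- as.character(no_space[[1]])    # "<-": an assignment
op_with_space <- as.character(with_space[[1]])  # "&": a logical test

# Running the version without the space overwrites the column as a side effect
res <- ifelse(d$meth.diff > 25 & d$qvalue < 0.05, 1,
              ifelse(d$meth.diff<-25 & d$qvalue < 0.05, -1, 0))
class(d$meth.diff)  # "logical": the numeric values are gone
```

Writing the comparison with spaces, or with parentheses as in meth.diff < (-25), avoids the ambiguity.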
Here's what I recommend:
library(dplyr)
myDiff1p <- myDiff1p %>%
mutate(threshold = factor(case_when(meth.diff > 25 & qvalue < 0.05 ~ "cond1",
meth.diff < -25 & qvalue < 0.05 ~ "cond2",
TRUE ~ "cond3")))
ggplot(data=myDiff1p, aes(x=meth.diff, y=-log10(qvalue))) +
geom_point(aes(color=threshold), alpha=0.4, size=1.75)+
geom_vline(xintercept=c(-25,25), color="red", alpha=1.0)+
geom_hline(yintercept=2, color="blue", alpha=1.0)+
xlab("Differential Methylation")+
ylab("-log10 (qvalue)")+
theme_bw()+
xlim(c(-75, 75)) +
ylim(c(0, 300)) +
scale_color_manual(name = "Threshold",
values = c("cond1" = "red", "cond2" = "green", "cond3" = "grey"))
I labelled the threshold factor by condition and defined the mapping between condition and colour in a named vector in scale_color_manual(). Also, as a matter of personal preference, I think dplyr::case_when() looks neater than nested ifelse() statements.
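As a quick check of the classification, here is a sketch that applies the same case_when() rules to the six rows shown in the question (values rounded, and the other columns dropped, purely for brevity) and confirms that every factor level has a matching name in the colour vector:

```r
library(dplyr)

# The six sample rows from the question, rounded; other columns omitted
myDiff1p <- data.frame(
  qvalue    = c(4.49e-04, 1.38e-04, 2.44e-16, 3.69e-02, 2.85e-02, 9.25e-03),
  meth.diff = c(-36.49, 55.76, -18.38, -31.87, 63.44, -8.35)
)

myDiff1p <- myDiff1p %>%
  mutate(threshold = factor(case_when(
    meth.diff > 25  & qvalue < 0.05 ~ "cond1",
    meth.diff < -25 & qvalue < 0.05 ~ "cond2",
    TRUE                            ~ "cond3")))

# Number of rows per condition
counts <- table(myDiff1p$threshold)

# Every level should appear as a name in the colour vector
cols <- c("cond1" = "red", "cond2" = "green", "cond3" = "grey")
levels_ok <- all(levels(myDiff1p$threshold) %in% names(cols))
```

Running table() like this before plotting is a cheap way to catch a category that is unexpectedly empty.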

Related

How to color the selected portion of the diagonal line of scatterplot?

I am trying to select the points along the diagonal line, but if you look at the plot, my selection also picks up points below the diagonal line.
IBD$COLOR <- ifelse((IBD$Z0 < 0.5 &
IBD$Z0 > 0.10 &
IBD$Z1 < 0.9 &
IBD$Z1 > 0.5), "OK", "BAD")
I want to plot the points in blue without selecting the points below the diagonal line. What would be the proper way to select IBD$COLOR here?
ggplot(IBD, aes(x=Z0, y=Z1))+ geom_point(aes(color=COLOR)) + ggtitle("Replication dataset - 2441")
I believe you're suggesting there should be three colors: blue/red for ok/bad, and perhaps gray for dots not on or near the diagonal. For that, I suggest your ifelse should be a bit more complex to incorporate "distance" from the diagonal as well.
Here's some fake data to mimic your plot:
## generate fake data
set.seed(42)
dat <- data.frame(Z0=runif(10000), Z1=runif(10000))
dat <- dat[(dat$Z0 + dat$Z1) < 1,]
## your processing picks up here
dat$COLOR <- with(dat, ifelse((Z1 + Z0) < 0.95, "Boring",
ifelse(0.1 < Z0 & Z0 < 0.5 & 0.5 < Z1 & Z1 < 0.9, "OK", "Bad")))
ggplot(dat, aes(Z0, Z1)) +
geom_point(aes(color = COLOR)) +
scale_color_manual(values = c(Boring="gray", OK="blue", Bad="red"))
If you want to control the order of the COLOR legend (it will be sorted alphabetically by default), then you will need to use factors, perhaps
dat$COLOR <- factor(dat$COLOR, levels = c("OK", "Bad", "Boring"))
before plotting.
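A small illustration of the ordering point, using a made-up character vector rather than the real data:

```r
# Hypothetical labels, only to show how factor levels are ordered
status <- c("OK", "Bad", "Boring", "OK")

# Without levels=, the levels are the sorted unique values
default_order <- levels(factor(status))

# With levels=, the order is exactly the one supplied
explicit_order <- levels(factor(status, levels = c("OK", "Bad", "Boring")))
```

Here default_order is Bad, Boring, OK, while explicit_order is OK, Bad, Boring; the legend follows the level order.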
If you're using dplyr, it may be simpler to use case_when to manage the processing (and factorizing), perhaps:
library(dplyr)
dat %>%
mutate(
COLOR = case_when(
(Z1 + Z0) < 0.95 ~ "Boring",
between(Z0, 0.1, 0.5) & between(Z1, 0.5, 0.9) ~ "OK",
TRUE ~ "Bad"),
COLOR = factor(COLOR, levels = c("OK", "Bad", "Boring"))
) %>%
ggplot(aes(Z0, Z1)) +
geom_point(aes(color = COLOR)) +
scale_color_manual(values = c(Boring="gray", OK="blue", Bad="red"))
incorporating both the ease of between(.) and case_when. (Note that between uses closed ends, so this is actually equivalent to 0.1 <= Z0 & Z0 <= 0.5, etc.)
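The closed-interval behaviour of between() can be seen on a few made-up values that sit exactly on the boundaries:

```r
library(dplyr)

# Hypothetical values, including both endpoints of the interval
z <- c(0.05, 0.1, 0.3, 0.5, 0.7)

closed_ends <- between(z, 0.1, 0.5)   # endpoints included
open_ends   <- 0.1 < z & z < 0.5      # endpoints excluded
```

If the strict inequalities from the original ifelse() matter for your data, keep the explicit < comparisons instead of between().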

volcano plot in R: adding details: coloring common factors only

I have a problem with coloring the genes that are common to two data sets (whole_colon and volcano).
The code below works well, but I would like to add one more detail, which is quite tricky.
I would like to apply a different color (red would be great) to the common genes, i.e. only where this condition is satisfied: (whole_colon$genes==volcano$genes).
I tried splitting the groups into specified_increased / specified_decreased, but sadly it didn't work out.
Here's my code.
Big thanks in advance.
#volcano plot using ggplot2
library(data.table)
# Adding group to decipher if the gene is significant or not:
whole_colon <- data.frame(whole_colon)
whole_colon["group"] <- "NotSignificant"
whole_colon[which(whole_colon['FDR'] < 0.05 & whole_colon['logFC'] > 1.5),"group"] <- "Increased"
whole_colon[which(volcano['FDR'] < 0.05 & volcano['logFC'] > 1.5),"group"] <- "colon_Increased_specialized"
whole_colon[which(volcano['FDR'] < 0.05 & volcano['logFC'] < -1.5),"group"] <- "colon_Decreased_specialized"
with(subset(whole_colon , FDR<0.05), points(logFC, -log10(FDR), pch=20,col="red"), whole_colon$genes==volcano$genes)
library(ggplot2)
ggplot(whole_colon, aes(x = logFC, y = -log10(FDR), color = group))+
scale_colour_manual(values = cols) +
ggtitle(label = "Volcano Plot", subtitle = "colon specific volcano plot") +
geom_point(size = 2.5, alpha = 1, na.rm = T) +
theme_bw(base_size = 14) +
theme(legend.position = "right") +
xlab(expression(log[2]("logFC"))) +
ylab(expression(-log[10]("FDR"))) +
geom_hline(yintercept = 1.30102, colour="#990000", linetype="dashed") +
geom_vline(xintercept = 1.5849, colour="#990000", linetype="dashed") +
geom_vline(xintercept = -1.5849, colour="#990000", linetype="dashed")+
scale_y_continuous(trans = "log1p")
This gives me a broken image like this. (I want the whole_colon data to be fully plotted, with points colored reddish when the same gene also appears in the volcano data.)
Here are some data subset from whole_colon and volcano
whole_colon:
genes logFC FDR group
1 CST1 9.554742 5.64e-45 Increased
3 OTOP2 -9.408177 5.76e-32 Decreased
4 COL11A1 6.825363 1.00e-31 Increased
5 INHBA 6.271879 2.07e-30 Increased
6 MMP7 7.594926 2.07e-30 Increased
7 BEST4 -7.756451 8.30e-30 Decreased
8 COL10A1 7.634386 1.82e-23 Increased
9 MMP11 4.767644 2.70e-23 Increased
10 GUCA2B -6.346156 2.17e-21 Decreased
11 KRT6B 11.801550 5.37e-20 Increased
12 WNT2 9.485133 6.47e-20 Increased
13 COL8A1 3.974965 6.47e-20 Increased
volcano:
genes logFC FDR group
1 INHBA 6.271879 2.070000e-30 Increased
2 COL10A1 7.634386 1.820000e-23 Increased
3 WNT2 9.485133 6.470000e-20 Increased
4 COL8A1 3.974965 6.470000e-20 Increased
5 THBS2 4.104176 2.510000e-19 Increased
6 BGN 3.524484 5.930000e-18 Increased
7 COMP 11.916956 2.740000e-17 Increased
9 SULF1 3.540374 1.290000e-15 Increased
10 CTHRC1 3.937028 4.620000e-14 Increased
11 TRIM29 3.827088 1.460000e-11 Increased
12 SLC6A20 5.060538 5.820000e-11 Increased
13 SFRP4 5.924330 8.010000e-11 Increased
14 CDH3 5.330732 8.940000e-11 Increased
15 ESM1 6.491496 3.380000e-10 Increased
614 TDP2 -1.801368 0.002722461 NotSignificant
615 EPHX2 -1.721039 0.002722461 NotSignificant
616 RAVER2 -1.581812 0.002749728 NotSignificant
617 BMP6 -2.702780 0.002775460 Increased
619 SCNN1G -4.012111 0.002870500 Increased
620 SLC52A3 -1.868920 0.002931197 NotSignificant
621 VIPR1 -1.556238 0.002945578 NotSignificant
622 SUCLG2 -1.720993 0.003059717 NotSignificant
The example dataset provided is incomplete: there is no overlap, so it is quite hard to color-code from it directly. Try the following. The key is that you cannot use == here; use %in% instead, which returns a logical vector saying whether each gene in whole_colon is also in volcano:
whole_colon=structure(list(genes = structure(c(5L, 11L, 3L,
7L, 10L, 1L,
2L, 9L, 6L, 8L, 12L, 4L, 13L, 14L), .Label = c("BEST4", "COL10A1",
"COL11A1", "COL8A1", "CST1", "GUCA2B", "INHBA", "KRT6B", "MMP11",
"MMP7", "OTOP2", "WNT2", "ABC", "DEF"), class = "factor"), logFC = c(9.554742,
-9.408177, 6.825363, 6.271879, 7.594926, -7.756451, 7.634386,
4.767644, -6.346156, 11.80155, 9.485133, 3.974965, 0.5, -0.5),
FDR = c(5.64e-45, 5.76e-32, 1e-31, 2.07e-30, 2.07e-30, 8.3e-30,
1.82e-23, 2.7e-23, 2.17e-21, 5.37e-20, 6.47e-20, 6.47e-20,
1, 1), group = c("Increased", "Decreased", "Increased", "specific_Increased",
"Increased", "Decreased", "specific_Increased", "Increased",
"Decreased", "Increased", "specific_Increased", "specific_Increased",
"NotSignificant", "NotSignificant")), row.names = c("1",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"2"), class = "data.frame")
Set the groups:
#set the decreased and increased like you did:
whole_colon["group"] <- "NotSignificant"
whole_colon[which(whole_colon['FDR'] < 0.05 & whole_colon['logFC'] > 1.5),"group"] <- "Increased"
whole_colon[which(whole_colon['FDR'] < 0.05 & -whole_colon['logFC'] > 1.5),"group"] <- "Decreased"
whole_colon[which(whole_colon['FDR'] < 0.05 & whole_colon['logFC'] > 1.5 & whole_colon$genes %in% volcano$genes),"group"] <- "specific_Increased"
whole_colon[which(whole_colon['FDR'] < 0.05 & whole_colon['logFC'] < -1.5 & whole_colon$genes %in% volcano$genes),"group"] <- "specific_Decreased"
and plot:
cols = c("grey","blue","blue","red","red")
names(cols) = c("NotSignificant","Increased","Decreased",
"specific_Increased","specific_Decreased")
library(ggplot2)
ggplot(whole_colon, aes(x = logFC, y = -log10(FDR), color = group))+
scale_colour_manual(values = cols) +
ggtitle(label = "Volcano Plot", subtitle = "colon specific volcano plot") +
geom_point(size = 2.5, alpha = 1, na.rm = T) +
theme_bw(base_size = 14) +
theme(legend.position = "right") +
xlab(expression(log[2]("logFC"))) +
ylab(expression(-log[10]("FDR"))) +
geom_hline(yintercept = 1.30102, colour="#990000", linetype="dashed") +
geom_vline(xintercept = 1.5849, colour="#990000", linetype="dashed") +
geom_vline(xintercept = -1.5849, colour="#990000", linetype="dashed")+
scale_y_continuous(trans = "log1p")
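To see why %in% is needed rather than ==, here is a short sketch on two made-up gene vectors (names borrowed from the sample data, but the vectors themselves are only illustrative):

```r
# Hypothetical gene lists of equal length
a <- c("CST1", "INHBA", "WNT2", "MMP7")
b <- c("INHBA", "COL10A1", "WNT2", "COL8A1")

# == compares position by position, so INHBA is missed
same_position <- a == b

# %in% asks whether each element of a occurs anywhere in b
is_common <- a %in% b

common_genes <- a[is_common]
```

With vectors of different lengths, == additionally recycles the shorter one, which makes the position-wise result even less meaningful for this purpose.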
I think I solved this problem; it only took one more line.
After following @StupidWolf's advice and redefining cols a little, I got the image I wanted.
cols<- c(red="red", orange="orange", NotSignificant="darkgrey", Increased= "#00B2FF" ,Decreased="#00B2FF", specific_Increased="#ff4d00", specific_Decreased="#ff4d00" )
head(cols)

volcano plot error (using ggplot2): drawn without data

I'm here again with another problem.
I'm currently making a volcano plot of DEG data using ggplot2.
The problem is that the plot is drawn without any data points, which is weird.
For a more accurate diagnosis: my data (volcano) consists of 948 DEGs (|logFC|>1, FDR<0.05).
library(ggplot2)
volcano["group"] <- "NotSignificant"
volcano[which(volcano['FDR'] < 0.01 & abs(volcano['logFC']) > 2 ),"group"] <- "Increased"
volcano[which(volcano['FDR'] < 0.01 & abs(volcano['logFC']) < -2 ),"group"] <- "Decreased"
# creating color palette
cols <- c("red" = "red", "orange" = "orange", "NotSignificant" = "darkgrey",
"Increased" = "#00B2FF", "Decreased" = "#00B2FF")
##I didn't even get to use those beautiful colors.
FDR_threshold <- 0.01
logFC_threshold <- 2
deseq.threshold <- as.factor(abs(volcano$logFC) >= logFC_threshold &
volcano$FDR < FDR_threshold)
xi <- which(deseq.threshold == TRUE)
deseq.threshold <- as.factor(abs(volcano$logFC) > 2 & volcano$FDR < 0.05)
# Make a basic ggplot2 object
vol <- ggplot(volcano, aes(x = logFC, y =-log10(FDR), colour=deseq.threshold))
# inserting manual colours as per colour palette and more
vol +
scale_colour_manual(values = cols) +
ggtitle(label = "Volcano Plot", subtitle = "colon specific volcano plot") +
geom_point(size = 2.5, alpha = 1, na.rm = T) +
theme_bw(base_size = 14) +
theme(legend.position = "none") +
xlab(expression(log[2]("logFC"))) +
ylab(expression(-log[10]("FDR"))) +
geom_hline(yintercept = 1, colour="#990000", linetype="dashed") +
geom_vline(xintercept = 0.586, colour="#990000", linetype="dashed") +
geom_vline(xintercept = -0.586, colour="#990000", linetype="dashed")+
scale_y_continuous(trans = "log1p")
Here is a small sample of my dataset, volcano:
genes logFC FDR group
1 INHBA 6.271879 2.070000e-30 Increased
2 COL10A1 7.634386 1.820000e-23 Increased
3 WNT2 9.485133 6.470000e-20 Increased
4 COL8A1 3.974965 6.470000e-20 Increased
5 THBS2 4.104176 2.510000e-19 Increased
6 BGN 3.524484 5.930000e-18 Increased
7 COMP 11.916956 2.740000e-17 Increased
9 SULF1 3.540374 1.290000e-15 Increased
10 CTHRC1 3.937028 4.620000e-14 Increased
11 TRIM29 3.827088 1.460000e-11 Increased
12 SLC6A20 5.060538 5.820000e-11 Increased
13 SFRP4 5.924330 8.010000e-11 Increased
14 CDH3 5.330732 8.940000e-11 Increased
15 ESM1 6.491496 3.380000e-10 Increased
614 TDP2 -1.801368 0.002722461 NotSignificant
615 EPHX2 -1.721039 0.002722461 NotSignificant
616 RAVER2 -1.581812 0.002749728 NotSignificant
617 BMP6 -2.702780 0.002775460 Increased
619 SCNN1G -4.012111 0.002870500 Increased
620 SLC52A3 -1.868920 0.002931197 NotSignificant
621 VIPR1 -1.556238 0.002945578 NotSignificant
622 SUCLG2 -1.720993 0.003059717 NotSignificant
I think your issue comes from mapping deseq.threshold to colour in aes(). Its levels are TRUE and FALSE, and neither is a name in cols, so scale_colour_manual() has no colour to assign to the points. Instead, I think you should use the group column for the colour.
BTW, your threshold for significant genes has a mistake: "Decreased" is assigned to genes whose absolute logFC is less than -2, which is not possible, since an absolute value is never negative.
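A short sketch of that threshold problem, using four rows copied from the sample in the question and the asker's own cut-offs (FDR < 0.01, |logFC| > 2). The corrected version tests the sign of logFC directly instead of using abs():

```r
# Four rows taken from the sample data in the question
volcano <- data.frame(
  genes = c("INHBA", "TDP2", "BMP6", "SCNN1G"),
  logFC = c(6.271879, -1.801368, -2.702780, -4.012111),
  FDR   = c(2.07e-30, 0.002722461, 0.002775460, 0.002870500)
)

# abs() is never negative, so this condition can never select a row
n_impossible <- sum(abs(volcano$logFC) < -2)

# Signed comparisons separate the two directions
volcano$group <- "NotSignificant"
volcano$group[volcano$FDR < 0.01 & volcano$logFC >  2] <- "Increased"
volcano$group[volcano$FDR < 0.01 & volcano$logFC < -2] <- "Decreased"
```

With the signed comparisons, BMP6 and SCNN1G end up as "Decreased" rather than "Increased".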
Here, I used an example of an output of DEG:
library(data.table)
volcano = fread("https://gist.githubusercontent.com/stephenturner/806e31fce55a8b7175af/raw/1a507c4c3f9f1baaa3a69187223ff3d3050628d4/results.txt", header = TRUE)
colnames(volcano) <- c("Gene","logFC","pvalue","FDR")
# Adding group to decipher if the gene is significant or not:
volcano <- data.frame(volcano)
volcano["group"] <- "NotSignificant"
volcano[which(volcano['FDR'] < 0.01 & volcano['logFC'] > 1 ),"group"] <- "Increased"
volcano[which(volcano['FDR'] < 0.01 & volcano['logFC'] < -1 ),"group"] <- "Decreased"
So, my example dataframe looks like this (I changed the thresholds you are using slightly to get more significant genes):
> head(volcano)
Gene logFC pvalue FDR group
1 DOK6 0.5100 1.861e-08 0.0003053 NotSignificant
2 TBX5 -2.1290 5.655e-08 0.0004191 Decreased
3 SLC32A1 0.9003 7.664e-08 0.0004191 NotSignificant
4 IFITM1 -1.6870 3.735e-06 0.0068090 Decreased
5 NUP93 0.3659 3.373e-06 0.0068090 NotSignificant
6 EMILIN2 1.5340 2.976e-06 0.0068090 Increased
Now, you can plot:
library(ggplot2)
ggplot(volcano, aes(x = logFC, y = -log10(FDR), color = group))+
scale_colour_manual(values = cols) +
ggtitle(label = "Volcano Plot", subtitle = "colon specific volcano plot") +
geom_point(size = 2.5, alpha = 1, na.rm = T) +
theme_bw(base_size = 14) +
theme(legend.position = "none") +
xlab(expression(log[2]("logFC"))) +
ylab(expression(-log[10]("FDR"))) +
geom_hline(yintercept = 1, colour="#990000", linetype="dashed") +
geom_vline(xintercept = 0.586, colour="#990000", linetype="dashed") +
geom_vline(xintercept = -0.586, colour="#990000", linetype="dashed")+
scale_y_continuous(trans = "log1p")
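The mismatch between deseq.threshold and the palette can be checked without plotting. A minimal sketch, with the cols vector from the question and short made-up stand-ins for the two candidate colour variables:

```r
# Palette as defined in the question
cols <- c("red" = "red", "orange" = "orange", "NotSignificant" = "darkgrey",
          "Increased" = "#00B2FF", "Decreased" = "#00B2FF")

# Hypothetical stand-ins for the two variables that could be mapped to colour
deseq.threshold <- as.factor(c(TRUE, FALSE, TRUE))
group <- c("Increased", "NotSignificant", "Decreased")

# A named manual scale matches data values against names(cols)
threshold_matches <- levels(deseq.threshold) %in% names(cols)
group_matches     <- unique(group) %in% names(cols)
```

Here threshold_matches is all FALSE and group_matches is all TRUE, which is consistent with group being the right column to map to colour.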

How can I add a second geom_point in ggplot based on a subset if sometimes that subset is purposefully empty?

I have this code that works pretty well. Basically what it does is run through all the states and chambers of that state to make a plot of each:
lapply(unique(finaldat$st), function(s){
chambs <- unique(finaldat$chamber[finaldat$st == s])
p <- list(NULL)
for(c in 1:length(chambs)){
p[[c]] <- finaldat %>% filter(st == s & chamber == chambs[c]) %>%
ggplot(aes(x = average, y = score, col = color))+
geom_point(aes(size= Total,alpha = 0.5)) +
stat_smooth(method = "lm") +
geom_point(data=subset(finaldat,st==s & chamber == chambs[c] & highlight>0),aes(col="yellow")) +
ggtitle(paste(s,chambs[c],year)) +
scale_size(range = c(.5,3.5)) +
scale_color_manual(labels = c("1","2","3"),
values = c("blue","red","yellow"))
filename = filename <- paste(s,chambs[c],year)
ggsave(paste("Plots/",filename,".png"), width = 10, height = 7)
}
return(p)
})
Works fine for the first few states but I run into issues with the second geom_point line which basically serves as a way to highlight certain people in yellow:
... + geom_point(data=subset(finaldat,st==s & chamber == chambs[c] & highlight>0),aes(col="yellow")) + ...
Because it's a subset of the data frame, some iterations produce an empty data frame: sometimes, by design, no one has a value higher than zero in the highlight variable. When that happens I get an error and the loop stops. I'm trying to find a way to simply skip that one line when the subset is empty, but I couldn't come up with a good ifelse statement or anything else to make that happen.
Here's an example of what the data looks like:
st chamber average score color Total highlight
AK Upper .64 54 1 849 1
AK Upper .84 91 1 743 0
AK Upper .35 14 2 442 0
AK Upper .95 54 1 641 4
AK Lower .64 54 1 849 0
AK Lower .84 91 1 743 0
AK Lower .35 14 2 442 0
AK Lower .95 54 1 641 0
Etc throughout all the states/chambers -- but in this example, the highlight would work for AK Upper but not AK Lower. So basically when the loop gets to AK Lower I just need it to ignore that second geom_point because the empty subset will cause it to error. Any ideas?
lapply(unique(finaldat$st), function(s){
  chambs <- unique(finaldat$chamber[finaldat$st == s])
  p <- list(NULL)
  for(c in 1:length(chambs)){
    # build the base plot first
    gg <- finaldat %>% filter(st == s & chamber == chambs[c]) %>%
      ggplot(aes(x = average, y = score, col = color))+
      geom_point(aes(size= Total,alpha = 0.5)) +
      stat_smooth(method = "lm")
    # one possible test: add the highlight layer only when the subset has rows
    hl <- subset(finaldat,st==s & chamber == chambs[c] & highlight>0)
    if (nrow(hl) > 0) {
      gg <- gg + geom_point(data=hl,aes(col="yellow"))
    }
    # finish the plot and store it
    p[[c]] <- gg +
      ggtitle(paste(s,chambs[c],year)) +
      scale_size(range = c(.5,3.5)) +
      scale_color_manual(labels = c("1","2","3"),
                         values = c("blue","red","yellow"))
    filename <- paste(s,chambs[c],year)
    # pass the plot explicitly, since it is never printed inside the loop
    ggsave(paste("Plots/",filename,".png"), plot = p[[c]], width = 10, height = 7)
  }
  return(p)
})
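Another way to express the same idea is a small helper that returns either a layer or NULL, relying on the fact that adding NULL to a ggplot leaves it unchanged. This is only a sketch: highlight_layer is a hypothetical name, and the data frame is a cut-down toy version of the sample rows in the question (the color and Total columns are left out).

```r
library(ggplot2)

# Toy rows modelled on the sample data in the question
finaldat <- data.frame(
  chamber   = rep(c("Upper", "Lower"), each = 4),
  average   = rep(c(0.64, 0.84, 0.35, 0.95), 2),
  score     = rep(c(54, 91, 14, 54), 2),
  highlight = c(1, 0, 0, 4, 0, 0, 0, 0)
)

# Hypothetical helper: a highlight layer, or NULL when nothing is highlighted
highlight_layer <- function(df) {
  hl <- df[df$highlight > 0, , drop = FALSE]
  if (nrow(hl) == 0) return(NULL)
  geom_point(data = hl, colour = "yellow", size = 3)
}

# One plot per chamber; the Lower plot simply gets no highlight layer
plots <- lapply(split(finaldat, finaldat$chamber), function(df) {
  ggplot(df, aes(x = average, y = score)) +
    geom_point() +
    highlight_layer(df)   # adding NULL is a no-op
})
```

This keeps the plotting pipeline as a single expression, at the cost of hiding the emptiness check inside the helper.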

Add a circular line chart inside a circular chart point with different magnitude data

I have a file called mitodata that has 8388 rows, and I wrote an example to explain my problem:
mitodata <-"Chr gene bp foldchange p
chrM chrM-1-2 2 -1.5 0.02
chrM chrM-3-4 4 1.5 0.05
chrM chrM-5-6 6 -1.2 0.0005
chrM chrM-7-8 8 1.3 0.02
chrM chrM-9-10 10 -1.6 0.007"
mitodata<-read.table(text=mitodata,header=T)
I can easily add some information to this data and use it to plot a circular graph showing my p-values in the manner of a Manhattan plot, using ggplot2 as follows:
# Load ggplot2
library(ggplot2)
# First, add a label for this example:
addgenelabel <- function(bp,gene) { gene <- ifelse(bp < 2, gene <- "Control-Region", ifelse(bp < 4, gene <- "tRNA", ifelse (bp < 6, gene <- "12S", ifelse(bp < 8, gene <- "tRNA", ifelse(bp < 10, gene <- "16S", gene <- "Control-Region")))))}
# Add gene names to each SNP
mitodata$gene <- addgenelabel(mitodata$bp,mitodata$gene)
# Creates and stores negative log p as a new variable
mitodata$neglogp <- -1*log10(mitodata$p)
# Adds a significance threshold line at negative log of 0.05
mitodata$neglogpline <- -1*log10(0.05)
# Adds -3 label to y axis
mitodata$extraline <- -3
# Set colors for each gene
colours <- c("Control-Region" <- "deeppink", "tRNA" <- "green", "12S" <- "mediumaquamarine", "16S" <- "sienna4","red")
lines <- data.frame(x = seq(0,10,by=1),y = 0)
lines$gene <- addgenelabel(lines$x,lines$gene)
And after this, I can plot using this command:
# Plot everything and GO
p<- ggplot(mitodata, aes(x = bp,y = neglogp,color = gene)) +
geom_point(size=0.5, alpha=1)+ coord_polar(direction = -1) +
geom_line(aes(x,1.30,color = "black"),data = lines, linetype="dotted") +
#facet_grid(.~pheno) +
geom_line(aes(y=extraline)) +
geom_point(aes(x,y,color = gene),data=lines, size=2) +
scale_colour_manual(values = colours,"Genes",breaks = c("Control-Region","tRNA","12S","16S"),
labels = c("Control-Region","tRNA","12S","16S")) +
theme(legend.justification=c(1,1), legend.position=c(0.98,0.98)) + theme(legend.text=element_text(size=10)) +
xlab("Mitochondrial CpG Location") +
ylab("-log(p-value)") +
ggtitle("Negative Log P-value of Mitochondrial CpG Hits")
ggsave("solarplot.tiff", w=12, h=12, dpi=600, compression = "lzw")
Doing this with my 8388 rows of data, I obtained a graph like this:
After I converted the p-values to negative log p, the p-values and the fold changes were of the same magnitude, so they overlapped when plotted. I would like to add an inner line chart to this graph showing the foldchange column, without overlapping the p-value information. Something like this:
Note: it would be nice to have this chart change color according to the defined "gene" around the circle.
