Update: this issue was solved; the updated code is at the end of the post.
I am trying to create a swimmer plot to visualize individual patient duration of treatment with a drug administered at multiple dose levels (DLs). Each patient is assigned to only one DL, but multiple patients can be assigned to a given DL (e.g. 3 patients at DL1, 3 patients at DL2, etc.). I would like to color code the bars in the swimmer plot according to DL.
I am using the swimplot package for R and have been following the guide located here (https://cran.r-project.org/web/packages/swimplot/vignettes/Introduction.to.swimplot.html).
This guide has been sufficient for most things I have tried, up until I tried to change the colors of the bars in the plot and corresponding legend. Following the section in that guide titled "Modifying Colours and shapes" under "Making the plots more aesthetically pleasing with ggplot manipulations", I was able to change the bar colors in the legend, but not the bars themselves.
Example here
I have been using the following code.
library(ggplot2)
library (swimplot)
library (gdata)
library (readxl)
ClinicalTrial.Arm <- read_excel("Swimmer_Test_Data1.xls")
ClinicalTrial.Arm <- as.data.frame(ClinicalTrial.Arm)
arm_plot <- swimmer_plot(df=ClinicalTrial.Arm,id='id',end='End_trt',width=.85) +
  scale_fill_manual(name="Arm",values=c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600")) +
  scale_color_manual(name="Arm",values=c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600"))
arm_plot
I have tried a number of things to fix this, but I am quite new to R and don't think I know enough to troubleshoot effectively. I have tried various syntax changes (e.g. removing quotation marks) and have tried using geom_bar(), but wasn't sure what to map to x and y (it also seems like I shouldn't need to do this).
I have also tried using the following code, but get an error.
Colors <- c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600")
arm_plot <- swimmer_plot(df=ClinicalTrial.Arm,id='id',end='End_trt',width=.85, fill = Colors)+ scale_fill_manual(name="Arm",values=c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600"))+ scale_color_manual(name="Arm",values=c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600"))
Error in `check_aesthetics()`:
! Aesthetics must be either length 1 or the same as the data (20): fill
Run `rlang::last_error()` to see where the error occurred.
Any help here would be greatly appreciated.
Solved! Updated, working code
library(ggplot2)
library (swimplot)
library (gdata)
library (readxl)
ClinicalTrial.Arm <- read_excel("Swimmer_Test_Data1.xls")
ClinicalTrial.Arm <- as.data.frame(ClinicalTrial.Arm)
Colors <- c("DL1" ="#003f5c", "DL2"="#374c80","DL3"="#7a5195","DL4"="#bc5090","DL5"="#ef5675","DL6"="#ff764a","DL7"="#ffa600")
arm_plot <- swimmer_plot(df=ClinicalTrial.Arm,id='id',end='End_trt', name_fill = "Arm", width=.85) + scale_fill_manual(name="Arm",values = Colors) +
scale_color_manual(name="Arm",values=Colors)
To make your code work, you first have to map a variable to the fill aesthetic. With swimplot this is done via the name_fill argument:
Note: As I use the ClinicalTrial.Arm dataset from the swimplot package, I adjusted your color palette to match the three categories of the Arm column in this dataset.
library(ggplot2)
library(swimplot)
#pal <- c("DL1" = "#003f5c", "DL2" = "#374c80", "DL3" = "#7a5195", "DL4" = "#bc5090", "DL5" = "#ef5675", "DL6" = "#ff764a", "DL7" = "#ffa600")
pal <- c("Arm A" = "#003f5c", "Arm B" = "#bc5090", "Off Treatment" = "#ffa600")
swimmer_plot(df = ClinicalTrial.Arm, id = "id", end = "End_trt", name_fill = "Arm", width = .85) +
scale_fill_manual(name = "Arm", values = pal)
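One thing worth checking before calling scale_fill_manual(): the names of the palette have to match the values in the column passed to name_fill. Levels without a matching palette entry get no colour from the manual scale and are typically drawn in grey. A minimal base R sketch of that check, using a small made-up data frame (`toy`, its values, and the extra level "DL8" are illustrative only, not the asker's data):

```r
# Hypothetical stand-in for the trial data: one row per patient.
toy <- data.frame(
  id      = 1:6,
  Arm     = c("DL1", "DL1", "DL2", "DL3", "DL3", "DL8"),
  End_trt = c(4, 7, 3, 9, 5, 6)
)

pal <- c("DL1" = "#003f5c", "DL2" = "#374c80", "DL3" = "#7a5195")

# Levels present in the data but missing from the palette
missing_cols <- setdiff(unique(toy$Arm), names(pal))
missing_cols

# Palette entries that no row in the data uses
unused_cols <- setdiff(names(pal), unique(toy$Arm))
unused_cols
```

If `missing_cols` is not empty, either add those levels to the palette or fix the spelling in the data before plotting.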
I am trying to plot the following graph:
This plot was made using a command in R; however, I need to change the x-axis. As you can see, the x-axis starts at 0 and ends at 46. I want the x-axis to start at 1972 and end at 2018, i.e. seq(1972, 2018). The data used for this graph is the following:
For regime one
structure(c(0.996336942021931, 0.982749831853788, 0.25257000136794,
0.707797489518183, 0.339372705184362, 0.999209103898399, 0.348786927897612,
0.821500770877589, 0.569473419352121, 0.544946043345147, 0.15347485404411,
0.987921203799956, 0.00247541125926418, 0.999925918450173, 0.996940249283586,
0.0141234625702467, 0.105466117156579, 0.999992944275275, 0.991723355647765,
0.0958472062267191, 0.0362729940372193, 0.999999790503447, 0.0750715811130157,
0.999975836828039, 0.998991768987905, 0.327943641159186, 5.05723080618291e-05,
0.999999999869691, 0.995538324405397, 0.123355227931813, 0.999776636825943,
0.00875781169836433, 0.696284480883101, 0.854839147672286, 0.113243492249383,
0.00984853715078062, 0.442061195271808, 0.999959859676686, 0.0249739384218217,
0.715262186931097, 0.269481397703521, 0.708458897302807, 0.0444979324520481,
0.000133950914911277, 0.997976154782607, 0.191386380576805, 0.99775339928206,
0.97921531595208, 0.27690132186733, 0.671995422154737, 0.458800347851363,
0.999155966774432, 0.417000082142666, 0.838969001100901, 0.576424593247709,
0.439169303472056, 0.227227711549776, 0.978527102362448, 0.00408165810824898,
0.999955057843957, 0.994643622809094, 0.00847570472458959, 0.163000467960203,
0.999995704786608, 0.987482614312069, 0.0569007267419926, 0.0585312256476362,
0.999999671060746, 0.118213072794827, 0.99998536150034, 0.998897081324845,
0.212968271334585, 8.35316288758489e-05, 0.999999999920876, 0.993537683112221,
0.188538497918178, 0.999604116439039, 0.00905848219612739, 0.769430430615986,
0.794457999021984, 0.0665707154963958, 0.00776458004359329, 0.5668500474175,
0.999931021995446, 0.0265573724408095, 0.661699294173752, 0.296009575623967,
0.587638579198176, 0.0251758869152202, 0.000220356219397782,
0.997352716237698, 0.191386380576805), .Dim = c(46L, 2L))
for regime 2:
structure(c(0.00366305797806813, 0.0172501681462116, 0.74742999863206,
0.292202510481817, 0.660627294815638, 0.000790896101601132, 0.651213072102388,
0.178499229122411, 0.430526580647879, 0.455053956654853, 0.846525145955889,
0.0120787962000438, 0.997524588740736, 7.40815498269273e-05,
0.00305975071641352, 0.985876537429753, 0.894533882843421, 7.05572472485335e-06,
0.00827664435223535, 0.904152793773281, 0.963727005962781, 2.09496553467159e-07,
0.924928418886985, 2.41631719608902e-05, 0.00100823101209502,
0.672056358840815, 0.999949427691938, 1.30308744399533e-10, 0.00446167559460289,
0.876644772068187, 0.00022336317405711, 0.991242188301636, 0.303715519116899,
0.145160852327714, 0.886756507750617, 0.990151462849219, 0.557938804728191,
4.01403233139628e-05, 0.975026061578178, 0.284737813068903, 0.730518602296479,
0.291541102697193, 0.955502067547952, 0.999866049085089, 0.00202384521739295,
0.808613619423195, 0.00224660071793958, 0.0207846840479196, 0.72309867813267,
0.328004577845263, 0.541199652148637, 0.000844033225568314, 0.582999917857334,
0.161030998899099, 0.423575406752291, 0.560830696527944, 0.772772288450224,
0.0214728976375518, 0.995918341891751, 4.49421560426429e-05,
0.00535637719090558, 0.99152429527541, 0.836999532039797, 4.29521339242403e-06,
0.0125173856879312, 0.943099273258007, 0.941468774352364, 3.28939253926857e-07,
0.881786927205173, 1.46384996596921e-05, 0.00110291867515508,
0.787031728665414, 0.999916468371124, 7.91243531099699e-11, 0.00646231688777926,
0.811461502081822, 0.00039588356096145, 0.990941517803873, 0.230569569384014,
0.205542000978016, 0.933429284503604, 0.992235419956407, 0.4331499525825,
6.89780045536876e-05, 0.973442627559191, 0.338300705826248, 0.703990424376033,
0.412361420801824, 0.97482411308478, 0.999779643780602, 0.00264728376230197,
0.808613619423195), .Dim = c(46L, 2L))
I know that the red line can be plotted using geom_line, but I do not know how to plot the black bars (maybe using geom_bar?), or how to combine the two in one plot.
Thanks for your help
It's actually plotted using base R (good old times). Using your first data set (regime one), stored as Regime1:
plot(Regime1[,1],type="h",xaxt="n",ylab="",cex.axis=0.6,xlab="",xlim=c(0,46))
lines(Regime1[,2],col="red")
mtext("Smoothed Probabilities",2,padj=-5,col="red",cex=0.7)
mtext("Fitted Probabilities",4,padj=1,cex=0.7)
axis(side=1,at=c(0,20,46),labels=c(1972,1992,2018))
Your x-axis runs from 0 to 46, so you turn off the default x-axis ticks with xaxt="n" and then use axis() to place ticks at 0, 20 and 46 with the labels 1972, 1992 and 2018.
The result also depends on your plotting device, so you might have to change the padj values in the mtext() calls to position the axis labels.
In ggplot2, you can create a data.frame with an index column holding the years you need and call geom_segment() to draw the vertical lines:
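The tick positions passed to axis() can also be computed from the years instead of being typed by hand. A small sketch, assuming (as in the code above) that position 0 corresponds to 1972; `start_year`, `tick_years`, `tick_at` and the random `fitted` vector are names and data made up for this example:

```r
start_year <- 1972                            # assumption: x = 0 corresponds to 1972
tick_years <- c(seq(1972, 2012, by = 10), 2018)  # years to label, final year added explicitly
tick_at    <- tick_years - start_year         # positions on the 0..46 scale

# Random placeholder standing in for the first column of Regime1
set.seed(1)
fitted <- runif(46)

plot(fitted, type = "h", xaxt = "n", xlab = "", ylab = "", xlim = c(0, 46))
axis(side = 1, at = tick_at, labels = tick_years)
```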
library(ggplot2)
Regime1 = data.frame(Regime1)
colnames(Regime1) = c("Fitted","Smoothed")
Regime1$index = 1:nrow(Regime1)+1972
ggplot(Regime1,aes(x=index))+
geom_segment(aes(xend=index,y=0,yend=Fitted,col="Fitted")) +
geom_line(aes(y=Smoothed,col="Smoothed")) + theme_minimal() +
scale_color_manual(values=c("black","red"))
For a ggplot2 solution, you are going to need a data.frame or tibble with 4 columns (Regime, Year, Smoothed, and Fitted). Based on the data you provided, this would have 92 rows.
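A sketch of how such a data frame could be built from the two 46 x 2 matrices in the question. It assumes the matrices are stored as `Regime1` and `Regime2`, that the first column holds the fitted and the second the smoothed probabilities, and that row i corresponds to year 1972 + i. None of this is stated in the question, so adjust as needed. Random placeholders stand in for the real numbers here, and `to_long` is a helper invented for this example:

```r
set.seed(42)
Regime1 <- matrix(runif(92), nrow = 46, ncol = 2)  # placeholder for the real data
Regime2 <- 1 - Regime1                             # in the question's data the two regimes sum to 1

# Turn one 46 x 2 matrix into a long data frame with a regime label
to_long <- function(m, regime) {
  data.frame(
    Regime   = regime,
    Year     = 1972 + seq_len(nrow(m)),  # assumption: row i is year 1972 + i
    Fitted   = m[, 1],                   # assumption: column 1 = fitted
    Smoothed = m[, 2]                    # assumption: column 2 = smoothed
  )
}

example.dat <- rbind(to_long(Regime1, "Regime 1"), to_long(Regime2, "Regime 2"))
str(example.dat)
```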
Now assuming you use those column names (and storing your data into the variable example.dat), a ggplot2 solution is
library(ggplot2)
library(magrittr)  # provides the %>% pipe
example.dat %>%
ggplot( aes(x=Year) ) +
geom_line( aes(y=Smoothed), color="red" ) +
geom_linerange( aes(ymax=Fitted), ymin=0 ) +
facet_wrap( ~ Regime, ncol=1 )
Then you might need to adjust some of the scales to get the best plot.
Does anyone know how this kind of chart is plotted? It looks like a heat map, but instead of color, the size of each cell indicates the magnitude. I want to plot a figure like this but I don't know how to do it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (a vector of the same length as x and y if the sizes differ), and c is a length(x)-by-3 matrix with one RGB color per row. The tick labels can be set with set(gca,...) or with xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA'}; % etc., one label per column
xticks(1:X)
xticklabels(xlab)
xtickangle(90) % rotate the tick labels, not the axis label
ylab={'RAPGEB6'}; % etc., one label per row
yticks(1:Y)
yticklabels(ylab)
EDIT: xticks, yticks and the related functions are only available in R2016b and later; on older versions use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available in R2014b and later
set(gca,'YTick',1:Y,'YTickLabel',ylab)
In R, you can use ggplot2, which allows you to map your values (gene expression in your case?) onto the size variable. Here is a simulation that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide="none") + ##hide scale
scale_size_continuous(guide="none") ##hide another scale
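As a possible alternative to tiles, square point markers can be sized by the value directly. This is only a sketch under some assumptions: the absolute value is used because a size cannot be negative, the long format is built with base R instead of reshape, and the object names `dat_long` and `p` are made up for this example.

```r
library(ggplot2)

set.seed(1)
my_data <- matrix(rnorm(8 * 26), nrow = 8, ncol = 26,
                  dimnames = list(gene = paste0("gene", 1:8), cancer = LETTERS))

# Long format in base R: one row per gene/cancer pair
dat_long <- as.data.frame(as.table(my_data))
names(dat_long)[3] <- "value"   # as.table() names the value column "Freq"

p <- ggplot(dat_long, aes(x = cancer, y = gene)) +
  # shape 15 is a filled square; its area scales with |value| (assumption)
  geom_point(aes(size = abs(value)), shape = 15, colour = "red") +
  scale_size_area(max_size = 6, guide = "none") +
  theme_minimal()
p
```

scale_size_area() makes the area of each square proportional to the mapped value, so a value of zero gives an invisible square; max_size probably needs tuning to the size of the output device.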
In R, the corrplot package can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')