Replace point shape with subject name on ggplot - r

I am trying to use ggplot to make a plot that meets the following requirements:
Use the points to identify subjects.
Use color to identify models. I have 6 models, so each subject should appear 6 times on the plot.
The plot is expected to look something like this:
I can map color to model, but I cannot find a way to use the subject names as point shapes.
Example data
structure(list(subject = c("S1", "S8", "S3", "S9"), alphamean = c(0.224104019995071,
0.195354811041001, 0.5675953626788, 0.491972414993715), lambdamean = c(0.35985383877637,
0.268124038994992, 0.92122181060701, 0.43561465728315), model = c("a",
"b", "c", "d")), row.names = c(NA, -4L), class = c("data.table",
"data.frame"))
My attempts
data %>%
ggplot(aes(x = alphamean, y = lambdamean)) +
geom_point(aes(color=model,shape=subject)) +
scale_shape_manual(values = paste0('S',1:40))

You could use geom_text instead of geom_point:
library(ggplot2)
library(magrittr)
data %>%
ggplot(aes(x = alphamean, y = lambdamean)) +
geom_text(aes(color=model,label=subject))

Related

Plot a bar plot on R, grouped in 2s

Looking to plot grouped bar plots
data:
structure(list(Main = c(0.468893939007605, 0.0629924918425918,
0.561410474480681), Total = c(0.388090040532888, -0.0706047151157143,
0.483298239353565)), class = "data.frame", row.names = c(NA,
-3L))
intended output should look like this:
My current plot code which does not make sense to me:
barplot(main_total$Main, main_total$Total)
ggplot would be preferred, but I have trouble coding it. Any help will be appreciated. Thank you.
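It's because barplot reads its second positional argument as the bar widths, not as a second series. For grouped bars it wants a single matrix with one row per series, which is the transpose of your data.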
m <- as.matrix(main_total)
Use t to transpose the matrix.
b <- barplot(t(m), beside=TRUE, ylab="Value",
ylim=c(round(min(m), 1), round(max(m), 1)), col=3:4)
axis(1, colMeans(b), c("H", "M", "S"))
legend("topleft", legend=c("Main", "Total"), fill=3:4)
box()
Gives
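To see why the transpose matters, here is a minimal base R sketch using the values from the question. It assumes the default bar width and spacing, and it passes plot = FALSE so that only the bar midpoints are computed; the name centres is introduced here purely for illustration.

```r
# Values from the question: one row per group, one column per series.
main_total <- data.frame(
  Main  = c(0.468893939007605, 0.0629924918425918, 0.561410474480681),
  Total = c(0.388090040532888, -0.0706047151157143, 0.483298239353565)
)
m <- as.matrix(main_total)

# barplot() treats each COLUMN of its matrix as one cluster of bars,
# so the series (Main, Total) have to be the rows: hence t(m).
dim(m)     # groups x series
dim(t(m))  # series x groups

# plot = FALSE skips the drawing and only returns the bar midpoints,
# arranged with one column per cluster.
b <- barplot(t(m), beside = TRUE, plot = FALSE)

# The mean of each column is the centre of that cluster,
# which is where the group labels are placed by axis().
centres <- colMeans(b)
centres
```

This is the same b that the answer feeds to colMeans() when it positions the "H", "M" and "S" labels.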
You didn't include the grouping variable in your example, so I made one up below; this should give you the idea.
df <- structure(list(Main = c(0.468893939007605, 0.0629924918425918,
0.561410474480681), Total = c(0.388090040532888, -0.0706047151157143,
0.483298239353565)), class = "data.frame", row.names = c(NA,
-3L))
df$Group <- c('H','M','S') # Assign group variables
library(reshape2) # Data frame needs to be in long format
df.m <- melt(df,id.vars = "Group")
df.m
library(ggplot2)
ggplot(df.m, aes(x=Group,y=value)) +
geom_bar(aes(fill=variable),stat = 'identity',position = 'dodge')
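If reshape2 is not available, the same long layout can be sketched in base R with stack(). This is an alternative I am swapping in, not what the answer uses; the column names value and variable are chosen only to match what melt() would return, and long is an illustrative name.

```r
# Same data as in the answer, with the made-up grouping column.
df <- data.frame(
  Main  = c(0.468893939007605, 0.0629924918425918, 0.561410474480681),
  Total = c(0.388090040532888, -0.0706047151157143, 0.483298239353565)
)
df$Group <- c("H", "M", "S")

# stack() concatenates the chosen columns into one "values" column
# and records the source column name in "ind".
stacked <- stack(df[c("Main", "Total")])

# Repeat the group labels once per stacked column and rename
# the columns so they line up with melt()'s output.
long <- data.frame(Group = rep(df$Group, times = 2), stacked)
names(long) <- c("Group", "value", "variable")
long
```

The resulting long data frame has one row per bar, which is the shape ggplot needs for fill = variable with position = 'dodge'.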

r does not allow the x axis to display the title (now with added data)

The question was how to get R to display titles on the x- and y-axes when the plot is rotated. mtext was not allowing this to happen. The question then became how to do this with the data at hand.
Here is my edited code and data.
Small segment of my Data:
library(ggplot2)
x <- structure(list(
CS1 = c(51.176802507837, 11.289327763008, 10.8584547767754, 5.37665764546685, 6.47159365761892),
CS2 = c(34.9956506731101, 45.7147446193383, 23.788413903316, 42.4969135802469, 18.8998879103283),
CS3 = c(3.59556251631428, 5.59228312932411, 11.7117536894149, 15.7240944017563, 9.72486977228754),
CS4 = c(0.830633241559198, 2.57358541893362, 3.05352639873916, 7.01238591916558, 2.98276253547777),
CS5 = c(6.6094547746612, 7.67873290538655, 9.93544994944388, 8.49609094535301, 6.71423210935406)),
class = c("tbl_df", "tbl", "data.frame"))
Now some code to make a ggplot.
xplot<-ggplot(x, aes(y = test, y = CS2, group = test))+
geom_boxplot()+
labs(y = "Intensity",
x = "Variable")+
scale_x_discrete()
xplot
Try using ggplot from the tidyverse.
Now that you have some data:
library(tidyverse)
x <-structure(list(
CS1 = c(51.176802507837, 11.289327763008, 10.8584547767754, 5.37665764546685, 6.47159365761892),
CS2 = c(34.9956506731101, 45.7147446193383, 23.788413903316, 42.4969135802469, 18.8998879103283),
CS3 = c(3.59556251631428, 5.59228312932411, 11.7117536894149, 15.7240944017563, 9.72486977228754),
CS4 = c(0.830633241559198, 2.57358541893362, 3.05352639873916, 7.01238591916558, 2.98276253547777),
CS5 = c(6.6094547746612, 7.67873290538655, 9.93544994944388, 8.49609094535301, 6.71423210935406)),
row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame")
)
Now gather that data into two columns
x1 <- gather(x, test, values, CS1:CS5)
Now plot
xplot<-ggplot(x1, aes(x = test, y = values, group = test))+
geom_boxplot()+
labs(y = "Intensity",
x = "Variable")
xplot + coord_flip()

Stacked bar plot in violin plot shape

Maybe this is a stupid idea, or maybe it's a brain wave. I have a dataset of lipid classes in 4 different species. The data is proportional, and the sums are 1000. I want to visualise the differences in proportions for each class in each species. Generally a stacked bar would be the way to go here, but there are several classes, and it becomes uninterpretable since only the bottom class shares a baseline (see below).
And this appears to be the best option of a bad bunch, with pie and donut charts being nothing short of sneered at.
I was then inspired by this creation Symmetrical, violin plot-like histogram?, which creates a sort of stacked distribution violin plot (see below).
I am wondering if this could somehow be converted into a stacked violin, such that each segment represents a whole variable. In the case of my data, species A and D would be 'fat' around the TAG segment, and 'skinnier' at the STEROL segment. This way the proportions are depicted horizontally, and always have a common baseline. Thoughts?
Data:
structure(list(Sample = c("A", "A", "A", "B", "B", "B", "C",
"C", "C", "D", "D"), WAX = c(83.7179798600773, 317.364310355766,
20.0147496567679, 93.0194886619568, 78.7886829173726, 79.3445694220837,
91.0020522660375, 88.1542855137005, 78.3313314713951, 78.4449591023115,
236.150030864875), TAG = c(67.4640254081232, 313.243238213156,
451.287867136276, 76.308508343969, 40.127554151831, 91.1910102221636,
61.658394708941, 104.617259648364, 60.7502685224869, 80.8373642262043,
485.88633863193), FFA = c(41.0963382465756, 149.264019576272,
129.672579626868, 51.049208042632, 13.7282635713804, 30.0088572108344,
47.8878116348504, 47.9564218319094, 30.3836532949481, 34.8474205480686,
10.9218910757234), `DAG1,2` = c(140.35876401479, 42.4556176551009,
0, 0, 144.993393432366, 136.722412691012, 0, 140.027443968931,
137.579074961889, 129.935353616471, 46.6128854387559), STEROL = c(73.0144390122309,
24.1680929257195, 41.8258704279641, 78.906816661241, 67.5678558060943,
66.7150537517493, 82.4794113296791, 76.7443442992891, 68.9357008866253,
64.5444668132533, 29.8342694785768), AMPL = c(251.446564854412,
57.8713327050339, 306.155806819949, 238.853696442419, 201.783872969561,
175.935515655693, 234.169038776536, 211.986239116884, 196.931330316831,
222.658181144794, 73.8944654414811), PE = c(167.99718650752,
43.3839497916674, 22.1937177530762, 150.315149187176, 153.632530721031,
141.580725482114, 164.215442147509, 155.113323256627, 143.349000132624,
128.504657216928, 50.6281347160092), PC = c(174.904702096271,
52.2494387772846, 28.8494085790995, 191.038328534942, 190.183655117756,
175.33290326259, 199.2632149392, 175.400682364295, 176.64926273487,
163.075864395099, 66.071984352649), LPC = c(0, 0, 0, 120.508804125665,
109.194191312608, 103.16895230176, 119.324634197247, 0, 107.09037767833,
97.151732936871, 0)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -11L), .Names = c("Sample", "WAX", "TAG",
"FFA", "DAG1,2", "STEROL", "AMPL", "PE", "PC", "LPC"))
This is essentially a horizontal bar plot:
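library(ggplot2)
library(reshape2)
# DF is the data frame from the question (the structure(...) output above)
DFm <- melt(DF, id.vars = "Sample")
# Append a negated copy so every bar is mirrored around zero
DFm1 <- DFm
DFm1$value <- -DFm1$value
DFm <- rbind(DFm, DFm1)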
ggplot(DFm, aes(x = "A", y = value / 10, fill = variable, color = variable)) +
geom_bar(stat = "identity", position = "dodge") +
coord_flip() +
theme_minimal() +
facet_wrap(~ Sample, nrow = 1, switch = "x") +
theme(axis.text = element_blank(),
axis.title = element_blank(),
panel.grid = element_blank())

Overlaying plots with a horizontal date in R

I am attempting to overlay two plots using ggplot2. I can graph them individually, but I want to overlay them to show a comparison. They have the same y axis. The y axis is a score from 0 to 100, and the x axis is a specific date in the month (from a range of 3 weeks).
Here is what I have tried:
data <- read.table(text = Level5avg, header = TRUE)
data2 <- read.table(text = Level6avg, header = TRUE)
colnames(data) = c("x","y")
colnames(data2) = c("x","y")
ggplot(rbind(data.frame(data2, group="a"), data.frame(data, group="b")), aes(x=x,y=y)) +
stat_density2d(geom="tile", aes(fill = group, alpha=..density..), contour=FALSE) + scale_fill_manual(values=c("b"="#FF0000", "a"="#00FF00")) + geom_point() + theme_minimal()
When I do this, I get a strange graph that has several dots, but I'm not sure if my code is right, since I can't distinguish the data. I want to add 3 more (small) datasets to the plot, if it is possible. If it is possible, how do I make it into a line graph in order to distinguish the datasets?
Note: I was under the impression ggplot would work for my purposes because of this post (and several other posts on this site advised using ggplot as opposed to Lattice). I'm not sure if what I want is possible, so I came here.
Data sets:
dput(data)
structure(list(x = structure(1:6, .Label = c("10/27/2015", "10/28/2015",
"10/29/2015", "10/30/2015", "10/31/2015", "11/1/2015"), class = "factor"),
y = c(0, 12.5, 0, 0, 11, 43)), .Names = c("x", "y"), class = "data.frame",
row.names = c(NA, -6L))
dput(data2)
structure(list(x = structure(1:3, .Label = c("10/28/2015", "10/31/2015",
"11/1/2015"), class = "factor"), y = c(0, 0, 41.5)), .Names = c("x",
"y"), class = "data.frame", row.names = c(NA, -3L))
I've now managed to get my overlay, but is there a way to organize the horizontal axis? The dates have no order.
It seems to me that the answer that you are basing your plots on uses density plots that are not useful for your data. If you are just looking for some line plots with points, you could do the following (note I created a dataframe outside of the ggplot() call to make it look a little cleaner):
data$group <- "b"
data2$group <- "a"
df <- rbind(data2,data)
df$x <- as.Date(df$x,"%m/%d/%Y")
ggplot(df,aes(x=x,y=y,group=group,color=group)) + geom_line() +
geom_point() + theme_minimal()
Note that by converting the date, the dates end up in the right order all on their own.
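A small base R sketch of why the conversion fixes the ordering, rebuilt from the question's data. As far as I can tell, rbind() on data frames extends the factor levels in the order they are first encountered, which is why the axis looked shuffled; the date column name is only illustrative.

```r
# The two data sets from the question, with x stored as a factor.
data <- data.frame(
  x = factor(c("10/27/2015", "10/28/2015", "10/29/2015",
               "10/30/2015", "10/31/2015", "11/1/2015")),
  y = c(0, 12.5, 0, 0, 11, 43)
)
data2 <- data.frame(
  x = factor(c("10/28/2015", "10/31/2015", "11/1/2015")),
  y = c(0, 0, 41.5)
)

df <- rbind(data2, data)

# The factor levels set the order of a discrete axis. Here they follow
# the order in which the labels were encountered, not the calendar.
levels(df$x)

# A Date column is a continuous value, so it sorts chronologically.
df$date <- as.Date(as.character(df$x), format = "%m/%d/%Y")
range(df$date)
```

Plotting against the Date column also gives ggplot a continuous date scale, so gaps between dates are drawn to scale.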

ggplot scatter plot of two groups with superimposed means with X and Y error bars

How can I generate a ggplot2 scatterplot of two groups with the means indicated together with X and Y error bars, like this?
Here is a reduced example (using dput to recreate the data.frame df) with two groups of cells and three measures, and I'd like to say plot Peak against Rise, or Peak against Decay. That much is straightforward, but I would like to add points indicating the group means with X and Y error bars (+/- sem).
Is there a way to do this within ggplot2, or do I need to generate means and sem values first? This post drew my attention to geom_errorbarh but I'm still uncertain as to the best way to proceed.
library(ggplot2)
df<-structure(list(Group = c("A", "A", "A", "A", "A", "A", "A",
"A", "B", "B", "B", "B", "B", "B", "B", "B"), Peak = c(102.975,
37.805, 64.996, 66.36, 199.354, 7.425, 34.137, 366.59, 10.165,
14.833, 702.525, 39.086, 8.286, 122.783, 105.762, 37.018), Rise = c(0.346855,
0.24165, 0.24028, 0.461548, 0.194016, 0.164047, 0.484375, 0.307861,
0.438538, 0.488083, 0.549423, 0.365448, 0.511551, 0.33596, 0.331467,
0.270096), Decay = c(1.3874, 1.07407, 1.88787, 2.64408, 1.1462,
0.615963, 4.04641, 1.48701, 3.61397, 4.1838, 1.92746, 3.64329,
4.21354, 0.812695, 1.14611, 1.28279)), .Names = c("Group",
"Peak", "Rise", "Decay"), class = "data.frame", row.names = c(NA,
-16L))
ggplot(df, aes(Peak, Rise)) +
geom_point(aes(colour=Group)) +
theme_bw(14)
I have tried something like:
library(doBy)
sem <- function(x) sqrt(var(x)/length(x))
z<-summaryBy(Peak+Rise+Decay~Group, data=df, FUN=c(mean,sem))
z
to get the values, but easily (and flexibly) incorporating them into the ggplot code is defeating me.
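I tend to use plyr for these kinds of summaries. The standard errors are computed first, because summarise evaluates its arguments in order and Peak and Rise are overwritten by their means:
library(plyr)
z <- ddply(df,.(Group),summarise,
PeakSE = sqrt(var(Peak)/length(Peak)),
RiseSE = sqrt(var(Rise)/length(Rise)),
Peak = mean(Peak),
Rise = mean(Rise))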
ggplot(df,aes(x = Peak,y = Rise)) +
geom_point(aes(colour = Group)) +
geom_point(data = z,aes(colour = Group)) +
geom_errorbarh(data = z,aes(xmin = Peak - PeakSE,xmax = Peak + PeakSE,y = Rise,colour = Group,height = 0.01)) +
geom_errorbar(data = z,aes(ymin = Rise - RiseSE,ymax = Rise + RiseSE,x = Peak,colour = Group))
I confess I was a little disappointed that I had to manually tweak the crossbar height. But thinking about it, I guess that could be fairly challenging to implement.
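For reference, the same per-group means and standard errors can be sketched in base R, without plyr or doBy. This uses the sem definition from the question and, to keep it short, only the first three Peak values of each group; the names peak_mean, peak_sem, xmin and xmax are illustrative.

```r
# Standard error of the mean, as defined in the question:
# sqrt(var(x) / n), which equals sd(x) / sqrt(n).
sem <- function(x) sqrt(var(x) / length(x))

# First three rows of each group from the question's data.
df <- data.frame(
  Group = c("A", "A", "A", "B", "B", "B"),
  Peak  = c(102.975, 37.805, 64.996, 10.165, 14.833, 702.525)
)

# One mean and one SEM per group.
peak_mean <- tapply(df$Peak, df$Group, mean)
peak_sem  <- tapply(df$Peak, df$Group, sem)

# Summary table with the horizontal error bar limits precomputed.
z <- data.frame(
  Group = names(peak_mean),
  Peak  = as.vector(peak_mean),
  xmin  = as.vector(peak_mean - peak_sem),
  xmax  = as.vector(peak_mean + peak_sem)
)
z
```

A table like z can be passed as the data argument of the error bar layers, with xmin and xmax mapped directly.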
