R scatter plot by shape, colour and fill

I'm very new to R and I'm trying to build a scatter plot that codes my data according to shape, colour and fill. I want 5 different colours and 3 different shapes, and each point should be either filled or not filled (an unfilled point should still show its shape and colour).
My data looks basically like this:
blank.test <- read.table(header=T, text="Colour Shape Fill X13C X15N
1 B B A 16 10
2 D A A 16 12
3 E A B 17 14
4 C A A 14 18
5 A A B 13 18
6 C B B 18 13
7 E C B 10 12
8 E A B 11 10
9 A C B 14 13
10 B A A 11 14
11 C B A 11 10
12 E B A 11 19
13 A B A 10 18
14 A C B 17 16
15 E B A 16 13
16 A C A 16 14")
If I do this:
ggplot(blank.test, aes(x=X13C, y=X15N,size=5)) +
geom_point(aes(shape=Shape,fill=Fill,color=Colour))
I get no distinction between filled and unfilled data points.
I did a little research, and it looked like the problem was with the symbols themselves, which cannot take different settings for line and fill; it was recommended that I use shapes (pch) between 21 and 25.
But if I do this:
ggplot(blank.test, aes(x=X13C, y=X15N,color=(Colour), shape=(Shape),fill=(Fill),size=5)) +
geom_point() + scale_shape_manual(values=c(21,22,25))
I still don't get what I want.
I also tried playing around with scale_fill_manual, without any good result.

I don't think you can use fill for points in general: as far as I know, it only applies to the fillable shapes (pch 21 to 25). What I would do instead is create an interaction between fill and shape, and use this new factor to choose between filled and open symbols:
blank.test$inter <- with(blank.test, interaction(Shape, Fill))
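and then for the plot I would use something like this:
ggplot(blank.test, aes(x=X13C, y=X15N)) +
geom_point(aes(shape=inter,color=Colour)) +
# interaction() orders its levels A.A, B.A, C.A, A.B, B.B, C.B (Shape varies fastest),
# so the three open shapes come first, followed by the three filled ones
scale_shape_manual(name="shape", values=c(0, 1, 2, 15, 16, 17)) +
scale_color_manual(name="colour", values=c("red","blue","yellow", "green", "purple"))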
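With this approach it is worth checking the order of the levels that interaction() produces, because scale_shape_manual() assigns unnamed values in level order. A minimal sketch follows. It rebuilds only the Shape and Fill columns from the question's data, and the name shape_values is made up for illustration:

```r
# Rebuild only the two columns needed for the check (values copied from the question).
blank.test <- data.frame(
  Shape = c("B","A","A","A","A","B","C","A","C","A","B","B","B","C","B","C"),
  Fill  = c("A","A","B","A","B","B","B","B","B","A","A","A","A","B","A","A")
)

blank.test$inter <- with(blank.test, interaction(Shape, Fill))

# The first factor (Shape) varies fastest in the level order.
levels(blank.test$inter)

# A named vector ties each level to a shape explicitly, independent of level order.
shape_values <- c(A.A = 0, A.B = 15,   # squares: open, filled
                  B.A = 1, B.B = 16,   # circles: open, filled
                  C.A = 2, C.B = 17)   # triangles: open, filled
```

Passing the named vector as scale_shape_manual(values = shape_values) should then match shapes to levels by name rather than by position.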

I can get the plot to work just fine, but the legend seems to absolutely insist on being black for fill. I can't figure out why. Maybe someone else has the answer to that one.
The 5 appearing in the legend is caused by having size inside aes(), where only elements that change with your data belong.
Here is some example code:
ggplot(blank.test, aes(x = X13C, y = X15N, color = Colour, shape = Shape, fill = Fill)) +
geom_point(size = 5, stroke = 3) +
scale_shape_manual(values=c(21,22,25)) +
scale_color_brewer(palette = "Set2") +
scale_fill_brewer(palette = "Set1") +
theme_bw()
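On the black legend keys: one likely explanation is that the fill legend draws its keys with the default point shape, which has no fill, so the keys stay solid black. A common workaround is to override the key shape through guide_legend(override.aes = ...). The sketch below assumes the blank.test data from the question (copied in so the block runs on its own) and is not the only way to do this:

```r
library(ggplot2)

blank.test <- read.table(header = TRUE, text = "Colour Shape Fill X13C X15N
1 B B A 16 10
2 D A A 16 12
3 E A B 17 14
4 C A A 14 18
5 A A B 13 18
6 C B B 18 13
7 E C B 10 12
8 E A B 11 10
9 A C B 14 13
10 B A A 11 14
11 C B A 11 10
12 E B A 11 19
13 A B A 10 18
14 A C B 17 16
15 E B A 16 13
16 A C A 16 14")

p <- ggplot(blank.test,
            aes(x = X13C, y = X15N, color = Colour, shape = Shape, fill = Fill)) +
  geom_point(size = 5, stroke = 3) +
  scale_shape_manual(values = c(21, 22, 25)) +
  scale_color_brewer(palette = "Set2") +
  scale_fill_brewer(palette = "Set1") +
  # Draw the fill legend keys with a fillable shape (21) so the fill colours show.
  guides(fill = guide_legend(override.aes = list(shape = 21))) +
  theme_bw()
p
```

The same override.aes mechanism can be applied to the colour legend if its keys should also use an outlined shape.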

Related

Plot boxplots over time using multiple categories

I am sorry about the title; I was not sure how to phrase it.
I have a data frame that looks like this.
Sample=c("A","A", "A", "B","B","B","A","A", "A", "B","B","B","A","A", "A", "B","B","B","A","A", "A", "B","B","B")
Treatment=c("twiter","twiter","twiter","twiter","twiter","twiter","facebook","facebook","facebook","facebook","facebook","facebook",
"twiter","twiter","twiter","twiter","twiter","twiter","facebook","facebook","facebook","facebook","facebook","facebook")
replicate=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
time=c( 10,10,10,10,10,10,10,10,10,10,10,10,20,20,20,20,20,20,20,20,20,20,20,20)
points=c(20,40,80,20,60,120, 30,100,55, 28, 45,90, 80,20,100, 40,90,56,20,30,12,3,5,8)
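length(points)
# Combine the vectors into the data frame used in the plotting code below
data <- data.frame(Sample, Treatment, replicate, time, points)
data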
Sample Treatment replicate time points
1 A twiter 1 10 20
2 A twiter 2 10 40
3 A twiter 3 10 80
4 B twiter 1 10 20
5 B twiter 2 10 60
6 B twiter 3 10 120
7 A facebook 1 10 30
8 A facebook 2 10 100
9 A facebook 3 10 55
10 B facebook 1 10 28
11 B facebook 2 10 45
12 B facebook 3 10 90
13 A twiter 1 20 80
14 A twiter 2 20 20
15 A twiter 3 20 100
16 B twiter 1 20 40
17 B twiter 2 20 90
18 B twiter 3 20 56
19 A facebook 1 20 20
20 A facebook 2 20 30
21 A facebook 3 20 12
22 B facebook 1 20 3
23 B facebook 2 20 5
24 B facebook 3 20 8
I would like to plot my data using boxplots at each time point.
I would like one set of boxplots that shows Sample A with "twiter", Sample A with "facebook", Sample B with "twiter" and Sample B with "facebook" at time point 10, and the same at time point 20.
So far I can do something like this.
ggplot(data,aes(x=time, y=points,color=Sample, fill=Sample, group=interaction(Sample,Treatment)), alpha=0.1) +
geom_boxplot(alpha=0.1) +
geom_point(position = position_dodge(width=0.75), alpha=0.2)+
theme_bw()
But this is wrong. I would like to have samples A and B from the two different treatments next to each other at each time point, so I can look at the differences. I don't want to use facet_wrap. It is a challenge for me. Thank you for your time.
Turning my comment into an answer: your issue is that group=interaction(Sample,Treatment) overrides the grouping by the x-axis (time) that would normally be done. To include time in the grouping, add it to the interaction:
ggplot(data,
aes(
x = time,
y = points,
color = Sample,
fill = Sample,
group = interaction(Sample, Treatment, time)
),
alpha = 0.1) +
geom_boxplot(alpha = 0.1) +
geom_point(position = position_dodge(width = 0.75), alpha = 0.2) +
theme_bw()
Of course, the issue remains that there's no way to tell which box goes with which treatment, but I'll leave that to you to address.
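One possible way to address that last point, offered as a sketch and not the only option: map Treatment to a second aesthetic such as linetype, so that each box is identified by colour (Sample) and line style (Treatment). The data frame is rebuilt here under the name dat (an assumption, chosen to avoid masking base R's data()):

```r
library(ggplot2)

# Same values as the vectors in the question, written compactly with rep().
dat <- data.frame(
  Sample    = rep(rep(c("A", "B"), each = 3), times = 4),
  Treatment = rep(rep(c("twiter", "facebook"), each = 6), times = 2),
  replicate = rep(1:3, times = 8),
  time      = rep(c(10, 20), each = 12),
  points    = c(20, 40, 80, 20, 60, 120, 30, 100, 55, 28, 45, 90,
                80, 20, 100, 40, 90, 56, 20, 30, 12, 3, 5, 8)
)

p <- ggplot(dat,
            aes(x = time, y = points, color = Sample, fill = Sample,
                group = interaction(Sample, Treatment, time))) +
  # linetype distinguishes the two treatments within each Sample colour
  geom_boxplot(aes(linetype = Treatment), alpha = 0.1) +
  geom_point(position = position_dodge(width = 0.75), alpha = 0.2) +
  theme_bw()
p
```

Mapping Treatment to fill and keeping Sample on colour would be another option along the same lines.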
Try this:
library(dplyr)
library(ggplot2)
#Plot
data %>%
arrange(Sample) %>%
mutate(Var=paste(Sample,Treatment),
Var=factor(Var,levels = unique(Var),ordered = T)) %>%
ggplot(aes(x=time,
y=points,
color=Var, fill=Var,
group=Var), alpha=0.1) +
geom_boxplot(alpha=0.1)+
geom_point(position = position_dodge(width=0.75), alpha=0.2)+
theme_bw()+
scale_color_manual(values=c('tomato','tomato','cyan3','cyan3'))+
scale_fill_manual(values=c('tomato','tomato','cyan3','cyan3'))
If you don't mind making time a factor, you can do the following. Note that I turned your data into a data frame named 'dat'.
dat <- data.frame(Sample=c("A","A", "A", "B","B","B","A","A", "A", "B","B","B","A","A", "A", "B","B","B","A","A", "A", "B","B","B"),
Treatment=c("twiter","twiter","twiter","twiter","twiter","twiter","facebook","facebook","facebook","facebook","facebook","facebook",
"twiter","twiter","twiter","twiter","twiter","twiter","facebook","facebook","facebook","facebook","facebook","facebook"),
replicate=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3),
time=c( 10,10,10,10,10,10,10,10,10,10,10,10,20,20,20,20,20,20,20,20,20,20,20,20),
points=c(20,40,80,20,60,120, 30,100,55, 28, 45,90, 80,20,100, 40,90,56,20,30,12,3,5,8))
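# dplyr provides %>% and mutate(); ggplot2 provides the plotting functions
library(dplyr)
library(ggplot2)
dat %>%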
mutate(time = factor(time)) %>%
ggplot(aes(x=time, y=points, color=Sample, fill=Sample), alpha=0.1) +
geom_boxplot(alpha=0.1) +
geom_point(position = position_dodge(width=0.75), alpha=0.2)+
theme_bw()

How to order both positive and negative values in ggplot

How can I create an ordered bar plot in ggplot2 with both positive and negative values? Here is the data:
down -11
down -10
down -9
down -6
up 6
up 6
up 6
up 6
up 7
up 7
up 8
up 8
up 8
up 8
up 8
up 8
up 8
up 10
up 10
up 11
up 11
up 12
up 14
up 14
up 21
up 21
up 24
I have tried this code:
ggplot(GO, aes(x = d1, y = order(d2), fill = factor(d1))) +
geom_bar(stat = "identity", position = "identity", width = 0.6)
This is not working.
I would like to order the plot. Can anybody please suggest some code?
Please check out my answer for a similar question. You should set your vector up in the order you want and then use +scale_y_discrete(limits = yourOrderedData) and it should plot in your order.
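The linked answer is not reproduced here, so as a rough sketch of one way to get ordered bars: sort the data frame by value and use each row's position as a discrete x, so that every row gets its own bar in sorted order. The names GO, d1 and d2 come from the question's code. The column idx and the axis labels are introduced here for illustration only:

```r
library(ggplot2)

# Data from the question: a direction label (d1) and a value (d2).
GO <- data.frame(
  d1 = c(rep("down", 4), rep("up", 23)),
  d2 = c(-11, -10, -9, -6, 6, 6, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8,
         10, 10, 11, 11, 12, 14, 14, 21, 21, 24)
)

# Sort by value, then store the row position as a factor so ggplot keeps that order.
GO <- GO[order(GO$d2), ]
GO$idx <- factor(seq_len(nrow(GO)))

p <- ggplot(GO, aes(x = idx, y = d2, fill = d1)) +
  geom_col(width = 0.6) +
  labs(x = NULL, y = "value", fill = "direction")
p
```

Using order(-GO$d2) instead would give bars sorted from largest to smallest.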

use ggplot to plot a panel of bar plots

I have a data frame which reads as below:
factor bin ret
1 beta 1 -0.026840807
2 beta 2 -0.051610137
3 beta 3 -0.044658901
4 beta 4 -0.053322048
5 beta 5 -0.060173704
6 size 1 -0.047448288
7 size 2 -0.045603776
8 size 3 -0.051804757
9 size 4 -0.047044614
10 size 5 -0.045720971
11 liquidity 1 -0.057657070
12 liquidity 2 -0.053105474
13 liquidity 3 -0.045501401
14 liquidity 4 -0.048572585
15 liquidity 5 -0.032209038
16 nonlinear 1 -0.045752503
17 nonlinear 2 -0.047673201
18 nonlinear 3 -0.051107792
19 nonlinear 4 -0.045364070
20 nonlinear 5 -0.047722148
21 btop 1 -0.004399745
22 btop 2 -0.035082069
23 btop 3 -0.054526058
24 btop 4 -0.063497535
25 btop 5 -0.077123859
I would like to plot a panel of charts which looks similar to this:
The difference is that the chart I would like to create would have bin as the x-axis and ret as the y-axis, and the charts should be bar plots. Could anyone help me with this question?
FYI: The code for the sample plot I've included is:
print(ggplot(df, aes(date,value)) +ylab('return(bps)') + geom_line() + facet_wrap(~ series,ncol=input$numCol)+theme(strip.text.x = element_text(size = 20, colour = "red", angle = 0)))
I wonder if a minor change to the code could solve my problem.
From your description, I'll assume this is what you're after:
print(ggplot(df, aes(bin, ret)) +
ylab('return(bps)') +
geom_bar(stat="identity") +
facet_wrap(~ factor,ncol=2)+
theme(strip.text.x = element_text(size = 20, colour = "red", angle = 0)))

How to create a stacked bar chart from summarized data in ggplot2

I'm trying to create a stacked bar graph using ggplot2. My data, in its wide form, looks like this. The numbers in each cell are the frequency of responses.
activity yes no dontknow
Social events 27 3 3
Academic skills workshops 23 5 8
Summer research 22 7 7
Research fellowship 20 6 9
Travel grants 18 8 7
Resume preparation 17 4 12
RAs 14 11 8
Faculty preparation 13 8 11
Job interview skills 11 9 12
Preparation of manuscripts 10 8 14
Courses in other campuses 5 11 15
Teaching fellowships 4 14 16
TAs 3 15 15
Access to labs in other campuses 3 11 18
Interdisciplinary research 2 11 18
Interdepartamental projects 1 12 19
I melted this table using reshape2:
melted.data <- melt(wide.data,id.vars=c("activity"),measure.vars=c("yes","no","dontknow"),variable.name="haveused",value.name="responses")
That's as far as I can get. I want to create a stacked bar chart with activities on the x axis, frequency of responses on the y axis, and each bar showing the distribution of the yes, no and dontknow responses.
I've tried
ggplot(melted.data,aes(x=activity,y=responses))+geom_bar(aes(fill=haveused))
but I'm afraid that's not the right solution.
Any help is much appreciated.
You haven't said what it is that's not right about your solution. But some issues that could be construed as problems, and one possible solution for each, are:
The x axis tick mark labels run into each other. SOLUTION - rotate the tick mark labels;
The order in which the labels (and their corresponding bars) appear is not the same as the order in the original data frame. SOLUTION - reorder the levels of the factor 'activity';
The bar segments carry no value labels. SOLUTION - add text inside the bars, setting the vjust parameter in position_stack to 0.5 to centre it.
The following might be a start.
# Load required packages
library(ggplot2)
library(reshape2)
# Read in data
df = read.table(text = "
activity yes no dontknow
Social.events 27 3 3
Academic.skills.workshops 23 5 8
Summer.research 22 7 7
Research.fellowship 20 6 9
Travel.grants 18 8 7
Resume.preparation 17 4 12
RAs 14 11 8
Faculty.preparation 13 8 11
Job.interview.skills 11 9 12
Preparation.of.manuscripts 10 8 14
Courses.in.other.campuses 5 11 15
Teaching.fellowships 4 14 16
TAs 3 15 15
Access.to.labs.in.other.campuses 3 11 18
Interdisciplinary.research 2 11 18
Interdepartamental.projects 1 12 19", header = TRUE, sep = "")
# Melt the data frame
dfm = melt(df, id.vars=c("activity"), measure.vars=c("yes","no","dontknow"),
variable.name="haveused", value.name="responses")
# Reorder the levels of activity
dfm$activity = factor(dfm$activity, levels = df$activity)
# Draw the plot
ggplot(dfm, aes(x = activity, y = responses, group = haveused)) +
geom_col(aes(fill=haveused)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.25)) +
geom_text(aes(label = responses), position = position_stack(vjust = .5), size = 3) # labels inside the bar segments

How can I use stat_bin2d with pre-binned data?

I want to generate a stat_bin2d() plot, but for pre-binned data;
i.e. rather than raw points
x y
5 3
13 4
13 14
16 12
15 13
I instead have the data pre-binned, identified by the bin corner points, in this case:
x y freq
0 0 1
0 10 0
10 0 1
10 10 3
I believe it might have something to do with the data param of stat_bin2d, but I can't find any documentation on this.
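You can use geom_bin2d() (with an "identity" stat), or just directly draw rectangles. Depending on your ggplot2 version, the geom_bin2d() form may complain about missing x and y aesthetics; the geom_rect() form only needs the four corners.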
dat <- data.frame(x=c(0,0,10,10), y=c(0,10,0,10), freq=c(1,0,1,3))
ggplot(dat) +
geom_bin2d(aes(xmin=x, ymin=y, xmax=x+10, ymax=y+10, fill=freq), stat="identity")
ggplot(dat) +
geom_rect(aes(xmin=x, ymin=y, xmax=x+10, ymax=y+10, fill=freq))
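For completeness, here is a sketch of how the pre-binned table in the question relates to the raw points, assuming bins of width 10 anchored at 0. The bin width and the names raw, binw, bx, by and binned are assumptions for illustration:

```r
# Raw points from the question.
raw <- data.frame(x = c(5, 13, 13, 16, 15), y = c(3, 4, 14, 12, 13))
binw <- 10  # assumed bin width

# Lower-left corner of the bin each point falls into.
raw$bx <- floor(raw$x / binw) * binw
raw$by <- floor(raw$y / binw) * binw

# Count points per bin; table() keeps bin combinations with zero points.
binned <- as.data.frame(table(x = raw$bx, y = raw$by), responseName = "freq")

# table() returns the corners as factors, so convert them back to numbers.
binned$x <- as.numeric(as.character(binned$x))
binned$y <- as.numeric(as.character(binned$y))
binned
```

The resulting data frame has the same x, y and freq columns as dat above, so it can be passed to the geom_rect() call in the same way.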

Resources