Draw plot for comparing each row? [closed] - r

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I would like to draw a plot for the following table.
T6 T26 D6 D26
ENSMUSG00000026427 420 170 197 249
ENSMUSG00000026436 27 21 54 77
ENSMUSG00000018189 513 246 429 484
ENSMUSG00000026470 100 55 82 73
ENSMUSG00000026696 147 73 182 283
ENSMUSG00000026568 3620 1571 1264 1746
ENSMUSG00000026504 95 60 569 428
I want to compare the rows, with each column shown in a different colour.
X.lab= Gene name
y.Lab= Counts

I think that the appropriate plotting choice depends on the characteristics of your full dataset, and from what I can tell, on the number of possible unique values of IDs ("ENSMUSG*") and the possible number of variables ("T26", "D26", ...). What is clear however, is that the variables have different scales, so should not be combined on the same plot, and so I have chosen a faceted grid plot below.
Here is some code that makes an appropriate choice based on the sample of the data that you have chosen to show us:
library(ggplot2)
library(dplyr)
library(tidyr)
df_foo = read.table(textConnection(
"T6 T26 D6 D26
ENSMUSG00000026427 420 170 197 249
ENSMUSG00000026436 27 21 54 77
ENSMUSG00000018189 513 246 429 484
ENSMUSG00000026470 100 55 82 73
ENSMUSG00000026696 147 73 182 283
ENSMUSG00000026568 3620 1571 1264 1746
ENSMUSG00000026504 95 60 569 428"
))
# plot the data
df_foo %>%
  tibble::rownames_to_column(var = "ID") %>%  # replaces the deprecated add_rownames()
  gather(key = Variable, value = Value, -ID) %>%
  ggplot(aes(x = ID, y = Value, fill = Variable)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  facet_wrap(~ Variable, scales = "free_y") +
  theme(axis.text.x = element_text(angle = 50, hjust = 1))
# save the plot (the results/ directory must already exist)
ggsave("results/faceted_bar.png", dpi = 600)
Note that mapping fill to Variable above is not strictly required, given that we are faceting by Variable anyway. Here is what the above code produces:
It can be easily argued that this is not the appropriate chart for your data given more context and knowledge about your data. You should add more detail to the question as others have commented.
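If the goal is closer to what the question literally asks for (genes on the x axis, counts on the y axis, one colour per column), a dodged bar chart is another option. The following is only a sketch under that assumption; the names counts, counts_long and Sample are illustrative and not from the original post. With data like this, the one highly expressed gene will dominate the y axis.

```r
library(ggplot2)
library(tidyr)

# Same sample data as above, with the gene ID read as an ordinary column
counts <- read.table(header = TRUE, text = "
ID T6 T26 D6 D26
ENSMUSG00000026427 420 170 197 249
ENSMUSG00000026436 27 21 54 77
ENSMUSG00000018189 513 246 429 484
ENSMUSG00000026470 100 55 82 73
ENSMUSG00000026696 147 73 182 283
ENSMUSG00000026568 3620 1571 1264 1746
ENSMUSG00000026504 95 60 569 428
")

# One row per gene/sample pair
counts_long <- pivot_longer(counts, cols = -ID,
                            names_to = "Sample", values_to = "Count")

# Bars for the four columns sit side by side within each gene
p <- ggplot(counts_long, aes(x = ID, y = Count, fill = Sample)) +
  geom_col(position = "dodge") +
  labs(x = "Gene name", y = "Counts") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 50, hjust = 1))
p
```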

Related

Show columns as percentage in R ggplot

I need help with a graph I am trying to build in R.
This is the data:
Location  Total Number of Employees  Local Number  Remote Number
L1        150                        50            100
L2        355                        148           207
L3        477                        106           371
L4        234                        82            152
L5        987                        523           464
L6        4564                       2504          2060
L7        2342                       1425          917
L8        754                        415           339
And this is what I am aiming for
[1]: https://i.stack.imgur.com/eoVxL.jpg
So, basically I want to present the "Total Number of Employees" column in a 0-100% range and since L6 has the highest number of employees, 4564 should be 100%. The legend should show the local and remote number, where the "Local" column should be shown in the positive grid and the "Remote" column in the negative one. The locations should be ordered from min to max.
Something like this?
library(dplyr)
library(tidyr)
library(ggplot2)
df %>%
  mutate(across(Local.Number:Remote.Number, ~ .x / max(Total.Number.of.Employees)),
         Remote.Number = -Remote.Number) %>%
  pivot_longer(-c(Location, Total.Number.of.Employees)) %>%
  ggplot() +
  aes(x = Location, y = value, fill = name) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_bw()
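The question also asks for the locations to be ordered from min to max, which the code above does not do. One possible way, sketched here on the data from the question, is to turn Location into a factor whose levels follow the total head count. The data frame is typed in by hand, and plot_df is an illustrative name.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data from the question, typed in by hand
df <- data.frame(
  Location = paste0("L", 1:8),
  Total.Number.of.Employees = c(150, 355, 477, 234, 987, 4564, 2342, 754),
  Local.Number = c(50, 148, 106, 82, 523, 2504, 1425, 415),
  Remote.Number = c(100, 207, 371, 152, 464, 2060, 917, 339)
)

plot_df <- df %>%
  mutate(
    # Factor levels sorted by total employees, smallest first
    Location = factor(Location, levels = Location[order(Total.Number.of.Employees)]),
    # Scale both counts by the largest total, then flip Remote below zero
    across(c(Local.Number, Remote.Number), ~ .x / max(Total.Number.of.Employees)),
    Remote.Number = -Remote.Number
  ) %>%
  pivot_longer(c(Local.Number, Remote.Number))

p <- ggplot(plot_df, aes(x = Location, y = value, fill = name)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_bw()
p
```

ggplot2 draws a discrete axis in factor-level order, so the bars should appear from the smallest location to the largest.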

Plotting each value of columns for a specific row

I am struggling to plot a specific row from a data frame. Below is the data I am trying to plot. I have tried using ggplot and the normal plot function, but I cannot figure it out.
Wt2 Wt3 Wt4 Wt5 Lngth2 Lngth3 Lngth4 Lngth5
1 48 59 95 82 141 157 168 183
2 59 68 102 102 140 168 174 170
3 61 77 93 107 145 162 172 177
4 54 43 104 104 146 159 176 171
5 100 145 185 247 150 158 168 175
6 68 82 95 118 142 140 178 189
7 68 95 109 111 139 171 176 175
Above is the data frame I am trying to plot. Each row holds one bear's measurements, so row 1 is for bear 1. How would I plot only the Wt columns for bear 1 against an x-axis that goes from year 2 to year 5?
You can pivot your data frame into a longer format:
First add a column with the row number (bear number):
df = cbind("Bear"=as.factor(1:nrow(df)), df)
It needs to be factor so we can pass it as a group variable to ggplot. Now pivot:
df2 = tidyr::pivot_longer(df[,1:5], cols=2:5,
names_to="Year", values_to="Weight", names_prefix="Wt")
df2$Year = as.numeric(df2$Year)
We ignore the Length columns with df[,1:5]; say that we only want to pivot the weight columns with cols=2:5; then give the names of the columns we want to create with names_to and values_to; and lastly names_prefix="Wt" removes the "Wt" from the column names, leaving only the year number. That leaves a character column, so we need to make it numeric with as.numeric().
Then plot:
ggplot(df2, aes(x=Year, y=Weight, linetype=Bear)) + geom_line()
Output (PS: I created my own data, so the actual numbers are off):
Just an addition: if you don't want to specify the columns of your dataset explicitly, you can do:
df2 = df[, grep("Wt|Bear", colnames(df))]
df2 = tidyr::pivot_longer(df2, cols=grep("Wt", colnames(df2)),
names_to="Year", values_to="Weight", names_prefix="Wt")
Edit: one plot for each group
You can use facet_wrap:
ggplot(df2, aes(x=Year, y=Weight, linetype=Bear)) +
facet_wrap(~Bear, nrow=2, ncol=4) +
geom_line()
Output:
You can change nrow and ncol as you wish, and you can remove linetype from aes() since the facets already differentiate the bears, but that is not mandatory.
You can also change the levels of the categorical variable to make the label on each panel clearer: for example, do levels(df2$Bear) = paste("Bear", 1:7) (or set the levels when creating the column).
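Since the question asks for bear 1 only, the long table can also be filtered before plotting. This is a sketch on the weight columns from the question, typed in by hand; long and bear1 are illustrative names.

```r
library(tidyr)
library(ggplot2)

# Weight columns from the question, typed in by hand
df <- data.frame(
  Wt2 = c(48, 59, 61, 54, 100, 68, 68),
  Wt3 = c(59, 68, 77, 43, 145, 82, 95),
  Wt4 = c(95, 102, 93, 104, 185, 95, 109),
  Wt5 = c(82, 102, 107, 104, 247, 118, 111)
)
df$Bear <- factor(seq_len(nrow(df)))

# One row per bear/year pair; the "Wt" prefix is dropped from the year
long <- pivot_longer(df, cols = starts_with("Wt"),
                     names_to = "Year", values_to = "Weight",
                     names_prefix = "Wt")
long$Year <- as.numeric(long$Year)

# Keep only the first bear and plot weight against year
bear1 <- subset(long, Bear == "1")
p <- ggplot(bear1, aes(x = Year, y = Weight)) +
  geom_line() +
  geom_point()
p
```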
Try
ggplot(mapping = aes(x = seq.int(2, 5), y = c(48, 59, 95, 82))) +
geom_point(color = "blue") +
geom_line(color = "blue") +
xlab("Year") +
ylab("Weight")

Change Bar Colours in a Grouped Bar Plot

My data consist of numerical values between 100 - 2000 grouped into 3 different drug treatment groups, which are then subdivided into 3 groups (based on their anatomical location in an organism, termed "Inner", "Middle", "Outer"). The final plot should be 3 groups of 3 bars (each representing the mean values of cell survival in each of the 3 locations). So far I have managed to make individual barplots, but I want to combine them. Here is some code that I have, and below that is a small excerpt from the data set.
Treatment Inner Middle Outer
RAD 317 373 354
RAD 323 217 174
RAD 236 255 261
HUTS 1411 1844 1978
HUTS 1922 1756 1856
HUTS 1478 1711 1433
RGD 1433 1489 1633
RGD 1400 1500 1544
RGD 1222 1333 1444
With some help, I've been able to create a grouped bar plot using the code:
df %>%
gather(key = group, value = value, -Treatment) %>%
ggplot(aes(x = Treatment, y = value, fill = group)) +
stat_summary(fun.y = mean, geom = "col", position = position_dodge())
Now, however, I want to be able to choose the colours of the bars.
Any help would be really appreciated!
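One common way to choose the fill colours is scale_fill_manual() with a named vector. The sketch below makes a few assumptions: it uses the excerpt of the data shown above, the hex colours are arbitrary placeholders, and it uses pivot_longer() and the fun argument (the replacement for fun.y in ggplot2 3.3.0 and later) instead of the calls in the question.

```r
library(tidyr)
library(ggplot2)

# Excerpt of the data from the question
df <- read.table(header = TRUE, text = "
Treatment Inner Middle Outer
RAD 317 373 354
RAD 323 217 174
RAD 236 255 261
HUTS 1411 1844 1978
HUTS 1922 1756 1856
HUTS 1478 1711 1433
RGD 1433 1489 1633
RGD 1400 1500 1544
RGD 1222 1333 1444
")

long <- pivot_longer(df, cols = -Treatment,
                     names_to = "group", values_to = "value")

# Arbitrary example colours, one per location (names must match the group values)
bar_colours <- c(Inner = "#1b9e77", Middle = "#d95f02", Outer = "#7570b3")

p <- ggplot(long, aes(x = Treatment, y = value, fill = group)) +
  stat_summary(fun = mean, geom = "col", position = position_dodge()) +
  scale_fill_manual(values = bar_colours)
p
```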

ggplot facets: show annotated text in selected facets

I want to create a 2 by 2 faceted plot with a vertical line shared by the four facets. However, because the facets on top have the same date information as the facets at the bottom, I only want to have the vline annotated twice: in this case in the two facets at the bottom.
I looked, among other places, here, which does not work for me. (In addition, I doubt whether this is still valid code today.) I also looked here. I also looked up how to set the font size in geom_text: according to the help pages this is size. In the case below it doesn't work as expected.
This is my code:
library(ggplot2)
library(tidyr)
my_df <- read.table(header = TRUE, text =
"Date AM_PM First_Second Systolic Diastolic Pulse
01/12/2017 AM 1 134 83 68
01/12/2017 PM 1 129 84 76
02/12/2017 AM 1 144 88 56
02/12/2017 AM 2 148 93 65
02/12/2017 PM 1 131 85 59
02/12/2017 PM 2 129 83 58
03/12/2017 AM 1 153 90 62
03/12/2017 AM 2 143 92 59
03/12/2017 PM 1 139 89 56
03/12/2017 PM 2 141 86 56
04/12/2017 AM 1 140 87 58
04/12/2017 AM 2 135 85 55
04/12/2017 PM 1 140 89 67
04/12/2017 PM 2 128 88 69
05/12/2017 AM 1 134 99 67
05/12/2017 AM 2 128 90 63
05/12/2017 PM 1 136 88 63
05/12/2017 PM 2 123 83 61
")
# setting the classes right
my_df$Date <- as.Date(as.character(my_df$Date), format = "%d/%m/%Y")
my_df$First_Second <- as.factor(my_df$First_Second)
# to tidy format
my_df2 <- gather(data = my_df, key = Measure, value = Value,
-c(Date, AM_PM, First_Second), factor_key = TRUE)
# Measures in 1 facet, facets split over AM_PM and First_Second
## add annotations column for geom_text
my_df2$Annotations <- rep("", 54)
my_df2$Annotations[c(4,6)] <- "Start"
p2 <- ggplot(data = my_df2) +
ggtitle("Blood Pressure and Pulse as a function of AM/PM,\n Repetition, and date") +
geom_line(aes(x = Date, y = Value, col= Measure, group = Measure), size = 1.) +
geom_point(aes(x = Date, y = Value, col= Measure, group = Measure), size= 1.5) +
facet_grid(First_Second ~ AM_PM) +
geom_vline(aes(xintercept = as.Date("2017/12/02")), linetype = "dashed",
colour = "darkgray") +
theme(axis.text.x=element_text(angle = -90))
p2
yields this graph:
This is the basic plot from which I start. Now we try to annotate it.
p2 + annotate(geom="text", x = as.Date("2017/12/02"), y= 110, label="start", size= 3)
yielding this plot:
This plot has the problem that the annotation occurs 4 times, while we only want it in the bottom parts of the graph.
Now we use geom_text, which will use the "Annotations" column in our data frame, in line with this SO question. Be careful: the column added to the data frame must be present when you first create "p2" (that is why we added the column above).
p2 + geom_text(aes(x=as.Date("2017/12/02"), y=100, label = Annotations, size = .6))
yielding this plot:
Yes, we succeeded in getting the annotation only in the bottom two parts of the graph. But the font is too big ( ... and ugly) and when we try to correct it with size, two things are interesting: (1) the font size is not changed (although you would expect that from the help pages) and (2) a legend is added.
I have been clicking around a lot and have been unable to solve this after hours and hours. Any help would be appreciated.
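A likely cause of both symptoms is that size sits inside aes(): it is then treated as a mapped variable (hence the legend, and the scaled rather than literal size) instead of a fixed setting. The sketch below uses made-up values, not the blood pressure data, and the names bp and labels are illustrative. It sets size outside aes() and passes geom_text its own small data frame that only contains the facets to be annotated.

```r
library(ggplot2)

# Made-up data with the same facet structure as the question
bp <- expand.grid(
  Date = as.Date("2017-12-01") + 0:4,
  AM_PM = c("AM", "PM"),
  First_Second = factor(1:2)
)
bp$Value <- 120 + seq_len(nrow(bp))

# One label row per facet that should be annotated (bottom row only)
labels <- data.frame(
  Date = as.Date("2017-12-02"),
  Value = 118,
  AM_PM = c("AM", "PM"),
  First_Second = factor(2, levels = 1:2),
  label = "Start"
)

p <- ggplot(bp, aes(x = Date, y = Value)) +
  geom_line() +
  facet_grid(First_Second ~ AM_PM) +
  geom_vline(xintercept = as.numeric(as.Date("2017-12-02")),
             linetype = "dashed", colour = "darkgray") +
  # size is a constant here, so it is set outside aes()
  geom_text(data = labels, aes(label = label), size = 3)
p
```

Because the labels data frame only has rows with First_Second equal to 2, the text should be drawn only in the bottom two panels.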

How ask R not to combine the X axis values for a bar chart?

I am a beginner with R. My data looks like this:
id count date
1 210 2009.01
2 400 2009.02
3 463 2009.03
4 465 2009.04
5 509 2009.05
6 861 2009.06
7 872 2009.07
8 886 2009.08
9 725 2009.09
10 687 2009.10
11 762 2009.11
12 748 2009.12
13 678 2010.01
14 699 2010.02
15 860 2010.03
16 708 2010.04
17 709 2010.05
18 770 2010.06
19 784 2010.07
20 694 2010.08
21 669 2010.09
22 689 2010.10
23 568 2010.11
24 584 2010.12
25 592 2011.01
26 548 2011.02
27 683 2011.03
28 675 2011.04
29 824 2011.05
30 637 2011.06
31 700 2011.07
32 724 2011.08
33 629 2011.09
34 446 2011.10
35 458 2011.11
36 421 2011.12
37 459 2012.01
38 256 2012.02
39 341 2012.03
40 284 2012.04
41 321 2012.05
42 404 2012.06
43 418 2012.07
44 520 2012.08
45 546 2012.09
46 548 2012.10
47 781 2012.11
48 704 2012.12
49 765 2013.01
50 571 2013.02
51 371 2013.03
I would like to make a bar-graph-like plot that shows the count for each date (with dates formatted as Month-Year, Jan-2009 for instance). I have two issues:
1- I cannot find a good format for a bar-chart-like graph like that
2- I want all of my data points to be present on the X axis (date), while R aggregates them to each year only (so I only have four data points there). Below is the current command that I am using:
plot(df$date,df$count,col="red",type="h")
and my current plot is like this:
Ok, I see some issues in your original data. May I suggest the following:
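Add the day to your date column (format the number with two decimals first, so that 2009.10 is not shortened to 2009.1 and mistaken for January):
df$date=paste(sprintf('%.2f',as.numeric(df$date)),'.01',sep='')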
Convert the date column to be of date type:
df$date=as.Date(df$date,format='%Y.%m.%d')
Plot the data again:
plot(df$date,df$count,col="red",type="h")
Also, one more suggestion: have you used ggplot for plotting charts? I think you will find it much easier, and it results in better-looking charts. Your example could be visualized like this:
library(ggplot2) #if you don't have the package, run install.packages('ggplot2')
ggplot(df,aes(date, count))+geom_bar(stat='identity')+labs(x="Date", y="Count")
First, you should transform your date column into a real date:
library(plyr) # for mutate
d <- mutate(d, date_chr = sprintf("%.2f", as.numeric(as.character(date))), # keeps the trailing zero of 2009.10
            month = as.numeric(gsub("[0-9]*\\.([0-9]*)", "\\1", date_chr)),
            year = as.numeric(gsub("([0-9]*)\\.[0-9]*", "\\1", date_chr)),
            Date = ISOdate(year, month, 1))
Then, you could use ggplot to create a decent barchart:
library(ggplot2)
ggplot(d, aes(x = Date, y = count)) + geom_bar(fill = "red", stat = "identity")
You can also use base R to create a barchart, which is however less nice:
dd <- setNames(d$count, format(d$Date, "%m-%Y"))
barplot(dd)
The former plot shows you the "holes" in your data, i.e. months where there is no count, while for the latter it is even quite difficult to see which bar corresponds to which month (this could, however, be tweaked, I assume).
Hope that helps.
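To get axis labels in the Jan-2009 style that the question mentions, with one tick per month, scale_x_date() can be added to the ggplot version. This is a sketch on four rows of the data only. It assumes the date column was read as a number, which is why the value is first formatted with two decimals so that 2009.10 stays October.

```r
library(ggplot2)

# Four rows of the data from the question; date assumed to be numeric
d <- data.frame(
  date = c(2009.01, 2009.02, 2009.10, 2009.11),
  count = c(210, 400, 687, 762)
)

# "2009.10" + ".01" -> "2009.10.01" -> first day of October 2009
d$Date <- as.Date(paste0(sprintf("%.2f", d$date), ".01"), format = "%Y.%m.%d")

p <- ggplot(d, aes(x = Date, y = count)) +
  geom_col(fill = "red") +
  # one break per month, labelled like Jan-2009 (month names follow the locale)
  scale_x_date(date_breaks = "1 month", date_labels = "%b-%Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p
```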
