Density Plot in R for 5 variables - r

I am trying to plot 5 variables. However, I only see one colour. I am not sure how I can display a different colour for each variable.
My data looks like this:
> data
Lead_1 Lead_2 Lead_3 Lead_4 Lead_5
1 138 135 128 125 130
2 126 130 133 131 128
3 120 121 126 130 129
4 129 126 121 115 110
5 142 153 160 167 179
6 305 299 294 291 283
dim(data)
[1] 8517 5
data <- read.table("5leads.csv", header=TRUE, sep=",")
data
dat <- stack(data)
ggplot(dat, aes(x = values, fill = ind)) + geom_density(alpha = 0.25)

Try this approach using both color and fill, as mentioned in the comments by @Punintended:
library(ggplot2)
#Code
dat <- stack(data)
ggplot(dat, aes(x = values, fill = ind,color=ind)) + geom_density(alpha = 0.15)
Output:
Or this:
#Code 2
dat <- stack(data)
ggplot(dat, aes(x = values, fill = ind,color=ind)) + geom_density(alpha = 1.5)
Output:
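A minimal sketch of what stack() hands to ggplot, built only from the six rows printed in the question (the full 8517-row file is not available here, so the resulting curves are illustrative only). It shows that ind becomes a factor with one level per Lead column, which is what fill and color are mapped to. The requireNamespace() guard is an addition so the reshaping part runs even without ggplot2 installed.

```r
# The six rows shown in the question; the real file has 8517 rows.
data <- data.frame(
  Lead_1 = c(138, 126, 120, 129, 142, 305),
  Lead_2 = c(135, 130, 121, 126, 153, 299),
  Lead_3 = c(128, 133, 126, 121, 160, 294),
  Lead_4 = c(125, 131, 130, 115, 167, 291),
  Lead_5 = c(130, 128, 129, 110, 179, 283)
)

# stack() returns two columns: values (numeric) and ind (factor of column names).
dat <- stack(data)
str(dat)

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = values, fill = ind, color = ind)) +
    geom_density(alpha = 0.15)
}
```

If levels(dat$ind) lists all five Lead names, each variable gets its own colour; a single colour would suggest the plot was built from a different object than dat.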

I tried to use
ggplot(dat, aes(x = values, fill = ind,color=ind)) + geom_density(alpha = 0.15)
and this is what I am getting
enter image description here

Related

Plotting each value of columns for a specific row

I am struggling to plot a specific row from a dataframe. Below is the graph I am trying to plot. I have tried using ggplot and the normal plot function, but I cannot figure it out.
Wt2 Wt3 Wt4 Wt5 Lngth2 Lngth3 Lngth4 Lngth5
1 48 59 95 82 141 157 168 183
2 59 68 102 102 140 168 174 170
3 61 77 93 107 145 162 172 177
4 54 43 104 104 146 159 176 171
5 100 145 185 247 150 158 168 175
6 68 82 95 118 142 140 178 189
7 68 95 109 111 139 171 176 175
Above is the data frame I am trying to plot. Each row holds one bear's measurements, so row 1 is for bear 1. How would I plot only the Wt columns for bear 1 against an x-axis that goes from year 2 to year 5?
You can pivot your data frame into a longer format:
First add a column with the row number (bear number):
df = cbind("Bear"=as.factor(1:nrow(df)), df)
It needs to be factor so we can pass it as a group variable to ggplot. Now pivot:
df2 = tidyr::pivot_longer(df[,1:5], cols=2:5,
names_to="Year", values_to="Weight", names_prefix="Wt")
df2$Year = as.numeric(df2$Year)
We drop the Length columns with df[,1:5]; select the weight columns to pivot with cols=2:5; name the new columns with names_to and values_to; and lastly names_prefix="Wt" removes the "Wt" from the column names, leaving only the year number. That leaves a character column, so we make it numeric with as.numeric().
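The pivot described above can be checked on the data frame from the question. This is a sketch that assumes the tidyr package is installed; the name bears is only an illustrative stand-in for the question's df.

```r
library(tidyr)

# Data frame copied from the question (one row per bear).
bears <- read.table(header = TRUE, text = "
Wt2 Wt3 Wt4 Wt5 Lngth2 Lngth3 Lngth4 Lngth5
48 59 95 82 141 157 168 183
59 68 102 102 140 168 174 170
61 77 93 107 145 162 172 177
54 43 104 104 146 159 176 171
100 145 185 247 150 158 168 175
68 82 95 118 142 140 178 189
68 95 109 111 139 171 176 175
")

# Add the bear number as a factor, as in the answer.
df <- cbind(Bear = as.factor(seq_len(nrow(bears))), bears)

# Keep Bear plus the four weight columns, then pivot the weights to long form.
df2 <- pivot_longer(df[, 1:5], cols = 2:5,
                    names_to = "Year", values_to = "Weight",
                    names_prefix = "Wt")
df2$Year <- as.numeric(df2$Year)

head(df2, 4)  # the four yearly weights of bear 1
```

Each bear contributes four rows (years 2 to 5), so the long table has 7 x 4 = 28 rows.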
Then plot:
ggplot(df2, aes(x=Year, y=Weight, linetype=Bear)) + geom_line()
Output (Ps: i created my own data, so the actual numbers are off):
Just an addition: if you don't want to specify the columns of your dataset explicitly, you can do:
df2 = df[, grep("Wt|Bear", colnames(df))]
df2 = tidyr::pivot_longer(df2, cols=grep("Wt", colnames(df2)),
names_to="Year", values_to="Weight", names_prefix="Wt")
Edit: one plot for each group
You can use facet_wrap:
ggplot(df2, aes(x=Year, y=Weight, linetype=Bear)) +
facet_wrap(~Bear, nrow=2, ncol=4) +
geom_line()
Output:
You can change nrow and ncol as you wish, and you can remove linetype from aes() since the facets already differentiate the bears, but that is not mandatory.
You can also change the levels of the categorical data to make the labels on each graph better: do levels(df2$Bear) = paste("Bear", 1:7), for example (or do that when creating the column).
Try
ggplot(mapping = aes(x = seq.int(2, 5), y = c(48, 59, 95, 82))) +
geom_point(color = "blue") +
geom_line(color = "blue") +
xlab("Year") +
ylab("Weight")
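The hard-coded values above can also be pulled from the data frame, so the same code works for any bear. The following is a base-R sketch; bears, bear_id, and bear_wt are illustrative names, not from the original thread, and only the Wt columns of the question's data are reproduced.

```r
# Weight columns from the question (one row per bear).
bears <- data.frame(
  Wt2 = c(48, 59, 61, 54, 100, 68, 68),
  Wt3 = c(59, 68, 77, 43, 145, 82, 95),
  Wt4 = c(95, 102, 93, 104, 185, 95, 109),
  Wt5 = c(82, 102, 107, 104, 247, 118, 111)
)

bear_id <- 1                                         # row to plot
wt_cols <- grep("^Wt", names(bears), value = TRUE)   # "Wt2" ... "Wt5"

# One row per year for the chosen bear.
bear_wt <- data.frame(
  Year   = as.numeric(sub("^Wt", "", wt_cols)),
  Weight = unlist(bears[bear_id, wt_cols], use.names = FALSE)
)
bear_wt
```

bear_wt can then be passed to ggplot(bear_wt, aes(x = Year, y = Weight)) with geom_point() and geom_line(), as in the snippet above.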

Re-order group chart same as the input

I have some input data and I would like to create a grouped bar chart. When I create the chart, the order differs from the input: it is arranged alphabetically. I would also like to change the font style to italic, for the species names only.
> data <- read.table(
+ text = "Superfamily Drom Bactria Feru Paos
+ ERV 294 224 206 202
+ ERVL-MaLR 103 108 184 231
+ Gypsy 274 187 413 215
+ Pao 6 2 7 4
+ DIRS/Ngaro 15 14 45 25
+ Unknown 26 23 23 37
+ Undefined 76 77 80 95",
+ header = TRUE
+ )
> data
Superfamily Drom Bactria Feru Paos
1 ERV 294 224 206 202
2 ERVL-MaLR 103 108 184 231
3 Gypsy 274 187 413 215
4 Pao 6 2 7 4
5 DIRS/Ngaro 15 14 45 25
6 Unknown 26 23 23 37
7 Undefined 76 77 80 95
> data_long <- gather(data,
+ key = "Species",
+ value = "Distrubution",
+ -Superfamily)
> ggplot(data_long, aes(fill=Superfamily, y=Distrubution, x=Species)) + geom_bar(position="dodge2", stat="identity")
I would like the chart to follow the input order, and to use an italic font for the species names only (e.g. Drom, Bactria, ...).
I think this is what you're asking for
data_long$Species <- factor(data_long$Species, levels = unique(data_long$Species))
ggplot(data_long, aes(fill=Superfamily, y=Distrubution, x=Species)) + geom_bar(position="dodge2", stat="identity") + theme(axis.text.x = element_text(face = "italic"))
If ggplot receives a factor, it will use the level order as the axis order.
When it comes to the fonts, you change that with theme().
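A small base-R sketch of the level-order point: factor() sorts its levels alphabetically by default, while levels = unique(x) keeps the order of first appearance. The vector species is made up here to mimic the Species column of data_long.

```r
# Species names in input order, repeated as they would be in long format.
species <- c("Drom", "Bactria", "Feru", "Paos", "Drom", "Bactria")

# Default: levels are sorted alphabetically.
default_levels <- levels(factor(species))

# Explicit levels: order of first appearance in the input.
input_levels <- levels(factor(species, levels = unique(species)))

default_levels  # "Bactria" "Drom" "Feru" "Paos"
input_levels    # "Drom" "Bactria" "Feru" "Paos"
```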
--edit--
To get the superfamily in the same order as input, you would have to create a factor as we did with the species-name.
data_long$Superfamily<- factor(data_long$Superfamily, levels = data$Superfamily)
Forgoing the use of the readxl-package to read the excel sheet into R, this should work to change the species name:
colnames(data)[2:5] <- c("Alpha Drom", "Beta Bactria", "Gamma Feru", "Delta Paos")
Add this line before you create data_long.

Is there a way to add a legend to a multiple line graph including the density function? [duplicate]

I want to add a legend to a line diagram with multiple lines, where every line was created with geom_density. I can't find a way to do so.
# This is my code:
ggplot(Flugzeiten, aes(x = Falconidae)) +
scale_x_continuous (breaks=c(0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220)) +
geom_density(kernel="gaussian", size=1.2, color = "blue") +
geom_density(aes(x = Milvus), kernel="gaussian", color = "red", size=1.2) +
geom_density(aes(x = Buteo), kernel="gaussian", color = "green", size=1.2) +
geom_density(aes(x = Gesamt), kernel="gaussian", color = "black", size=1, linetype ="dotted") +
theme(axis.text.y=element_blank()) +
labs(x = "Fluglänge (s)", y = "Häufigkeitsverteilung", title = "Aufenthalte im Gefahrenbereich nach Flugzeit")
# This is my Data:
# A tibble: 39 x 4
Falconidae Buteo Milvus Gesamt
<dbl> <dbl> <dbl> <dbl>
1 59 63 117 117
2 112 197 97 97
3 1 75 156 156
4 32 67 142 142
5 68 115 52 52
6 22 115 41 41
7 28 26 155 155
8 NA 74 159 159
9 NA 4 111 111
10 NA 73 84 84
The problem you're having is because you're working with a wide table, instead of a long one. Wide tables are nice to input values in a spreadsheet, but aren't the best for the serious analysis you intend to do.
So, the first thing is to convert your wide data to long format. Since you didn't provide any data, I'll make some dummy:
# create dummy data
a <- data.frame(x = sample(1:100, 10), y = sample(1:100, 10), z = sample(1:100, 10))
# convert to data.table so it can be reshaped with melt:
library(data.table)
setDT(a)
# reshape the data:
newA <- melt(a) #ignore the warning
# plot it the right way:
library(ggplot2)
ggplot(newA, aes(x = value, color = variable))+geom_density()
From there on, you can start doing the cosmetics to axis, labels, etc.
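The same idea applied to the ten rows of Flugzeiten printed in the question might look like the sketch below. It uses base stack() instead of melt() to avoid the extra dependency, and scale_colour_manual() to keep the colours from the original code. The column names Flugzeit and Art are invented for illustration, and the requireNamespace() guard is an addition.

```r
# First ten rows of the tibble shown in the question (the full data has 39 rows).
Flugzeiten <- data.frame(
  Falconidae = c(59, 112, 1, 32, 68, 22, 28, NA, NA, NA),
  Buteo      = c(63, 197, 75, 67, 115, 115, 26, 74, 4, 73),
  Milvus     = c(117, 97, 156, 142, 52, 41, 155, 159, 111, 84),
  Gesamt     = c(117, 97, 156, 142, 52, 41, 155, 159, 111, 84)
)

# Wide to long: one row per observation, with the group in its own column.
long <- na.omit(stack(Flugzeiten))
names(long) <- c("Flugzeit", "Art")

# Mapping colour to the group column inside aes() is what creates the legend.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Flugzeit, colour = Art)) +
    geom_density(kernel = "gaussian") +
    scale_colour_manual(values = c(Falconidae = "blue", Milvus = "red",
                                   Buteo = "green", Gesamt = "black"))
}
```

The dotted line type used for Gesamt in the question could be handled the same way, by mapping linetype = Art and adding scale_linetype_manual().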
It produces:

ggplot facets: show annotated text in selected facets

I want to create a 2 by 2 faceted plot with a vertical line shared by the four facets. However, because the facets on top have the same date information as the facets at the bottom, I only want to have the vline annotated twice: in this case in the two facets at the bottom.
I looked, among other places, here, which does not work for me. (In addition, I doubt whether this is still valid code today.) I also looked here. I also looked up how to influence the font size in geom_text: according to the help pages this is size. In the case below it doesn't work out well.
This is my code:
library(ggplot2)
library(tidyr)
my_df <- read.table(header = TRUE, text =
"Date AM_PM First_Second Systolic Diastolic Pulse
01/12/2017 AM 1 134 83 68
01/12/2017 PM 1 129 84 76
02/12/2017 AM 1 144 88 56
02/12/2017 AM 2 148 93 65
02/12/2017 PM 1 131 85 59
02/12/2017 PM 2 129 83 58
03/12/2017 AM 1 153 90 62
03/12/2017 AM 2 143 92 59
03/12/2017 PM 1 139 89 56
03/12/2017 PM 2 141 86 56
04/12/2017 AM 1 140 87 58
04/12/2017 AM 2 135 85 55
04/12/2017 PM 1 140 89 67
04/12/2017 PM 2 128 88 69
05/12/2017 AM 1 134 99 67
05/12/2017 AM 2 128 90 63
05/12/2017 PM 1 136 88 63
05/12/2017 PM 2 123 83 61
")
# setting the classes right
my_df$Date <- as.Date(as.character(my_df$Date), format = "%d/%m/%Y")
my_df$First_Second <- as.factor(my_df$First_Second)
# to tidy format
my_df2 <- gather(data = my_df, key = Measure, value = Value,
-c(Date, AM_PM, First_Second), factor_key = TRUE)
# Measures in 1 facet, facets split over AM_PM and First_Second
## add annotations column for geom_text
my_df2$Annotations <- rep("", 54)
my_df2$Annotations[c(4,6)] <- "Start"
p2 <- ggplot(data = my_df2) +
ggtitle("Blood Pressure and Pulse as a function of AM/PM,\n Repetition, and date") +
geom_line(aes(x = Date, y = Value, col= Measure, group = Measure), size = 1.) +
geom_point(aes(x = Date, y = Value, col= Measure, group = Measure), size= 1.5) +
facet_grid(First_Second ~ AM_PM) +
geom_vline(aes(xintercept = as.Date("2017/12/02")), linetype = "dashed",
colour = "darkgray") +
theme(axis.text.x=element_text(angle = -90))
p2
yields this graph:
This is the basic plot from which I start. Now we try to annotate it.
p2 + annotate(geom="text", x = as.Date("2017/12/02"), y= 110, label="start", size= 3)
yielding this plot:
This plot has the problem that the annotation occurs 4 times, while we only want it in the bottom parts of the graph.
Now we use geom_text, which will use the "Annotations" column in our dataframe, in line with this SO question. Be careful: the column added to the dataframe must be present when you first create "p2" (that is why we added the column above).
p2 + geom_text(aes(x=as.Date("2017/12/02"), y=100, label = Annotations, size = .6))
yielding this plot:
Yes, we succeeded in getting the annotation only in the bottom two parts of the graph. But the font is too big ( ... and ugly) and when we try to correct it with size, two things are interesting: (1) the font size is not changed (although you would expect that from the help pages) and (2) a legend is added.
I have been clicking around a lot and have been unable to solve this after hours and hours. Any help would be appreciated.
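One possible way to address both symptoms, offered as a sketch and not as an answer from the original thread: inside aes(), size = .6 is treated as a data value and mapped through a size scale, which is why the font does not shrink and a legend appears. Setting size outside aes() makes it a constant. Passing geom_text() its own small data frame, with one row per facet that should carry the label, restricts the text to those facets. The name ann and the y position are assumptions.

```r
# One row per facet that should show the label: the bottom row (First_Second = 2).
ann <- data.frame(
  Date         = as.Date("2017-12-02"),
  Value        = 100,
  AM_PM        = c("AM", "PM"),
  First_Second = factor("2", levels = c("1", "2")),
  label        = "Start"
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # size sits outside aes(), so it is a fixed size and adds no legend.
  ann_layer <- geom_text(
    data = ann,
    aes(x = Date, y = Value, label = label),
    size = 3,
    inherit.aes = FALSE
  )
  # Usage with the plot from the question: p2 + ann_layer
}
```

Because ann contains the facetting variables AM_PM and First_Second, facet_grid() places each row only in the matching panel, so the Annotations column in my_df2 is no longer needed.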

ggplot changing colors of bar plot

I came across this R script that uses ggplot:
dat <- read.table(text = "A B C D E F G
1 480 780 431 295 670 360 190
2 720 350 377 255 340 615 345
3 460 480 179 560 60 735 1260
4 220 240 876 789 820 100 75", header = TRUE)
library(reshape2)
dat$row <- seq_len(nrow(dat))
dat2 <- melt(dat, id.vars = "row")
library(ggplot2)
ggplot(dat2, aes(x=variable, y=value, fill=row)) +
geom_bar(stat="identity") +
xlab("\nType") +
ylab("Time\n") +
guides(fill=FALSE) +
theme_bw()
That was what I've been looking for. However, I could not:
- change the default colours (for example, I tried to use the "RdYlGn" palette)
- convert the raw values to frequencies.
Any suggestions?
You could try this:
library(reshape2)
library(dplyr)
library(ggplot2)
dat%>%
melt(id.vars = "row",variable.name = "grp")%>%
group_by(grp)%>%
mutate(tot=sum(value), fq=value/tot)%>%
ggplot(aes(x=grp,y=fq,fill=row,label = sprintf("%.2f%%", fq*100)))+
geom_bar(stat = "identity")+
geom_text(size = 3, position = position_stack(vjust = 0.5))+
xlab("\nType") +
ylab("Time\n") +
guides(fill=FALSE) +
scale_fill_distiller(palette = "RdYlGn")+
theme_bw()
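The frequency step in the pipeline above (group_by plus mutate) divides each value by the total of its column. A base-R sketch of the same computation, using stack() and ave() in place of melt() and dplyr so that no packages are needed, makes it easy to check that the shares in each column sum to 1. The name long is illustrative.

```r
dat <- read.table(text = "A B C D E F G
1 480 780 431 295 670 360 190
2 720 350 377 255 340 615 345
3 460 480 179 560 60 735 1260
4 220 240 876 789 820 100 75", header = TRUE)

# Long format: values plus the column they came from (ind).
long <- stack(dat)
long$row <- rep(seq_len(nrow(dat)), times = ncol(dat))

# Share of each value within its column, like value / sum(value) per group.
long$fq <- ave(long$values, long$ind, FUN = function(v) v / sum(v))

tapply(long$fq, long$ind, sum)  # each column's shares add up to 1
```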
