Circular line graph with groups - r

I have four dataframes that look like below:
X score.i score.ii score.iii mm
1: 1 -0.3958555 -0.3750726 -0.3378881 10
2: 2 -0.3954955 -0.3799290 -0.3400876 15
3: 3 -0.3962514 -0.3776692 -0.3401180 20
4: 4 -0.4033265 -0.3764099 -0.3436115 25
5: 5 -0.4035860 -0.3753792 -0.3426287 30
---
186: 186 -0.4041035 -0.3767158 -0.3419871 80
187: 187 -0.4040643 -0.3767881 -0.3417620 85
188: 188 -0.4052228 -0.3766468 -0.3436883 90
189: 189 -0.4047009 -0.3767359 -0.3431591 95
190: 190 -0.4061497 -0.3766785 -0.3433624 100
How can I plot a circular line graph with aes(x=mm, y=score.i) for these four such that there is a gap between the lines for each dataframe?

library(ggplot2)
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(-c(X, mm), names_to = "Variable", values_to = "Score") %>%
ggplot(., aes(x = mm, y = Score, color = Variable)) +
geom_line() +
coord_polar()
Data:
read.table(text =
"X score.i score.ii score.iii mm
1 -0.3958555 -0.3750726 -0.3378881 10
2 -0.3954955 -0.3799290 -0.3400876 15
3 -0.3962514 -0.3776692 -0.3401180 20
4 -0.4033265 -0.3764099 -0.3436115 25
5 -0.4035860 -0.3753792 -0.3426287 30
186 -0.4041035 -0.3767158 -0.3419871 80
187 -0.4040643 -0.3767881 -0.3417620 85
188 -0.4052228 -0.3766468 -0.3436883 90
189 -0.4047009 -0.3767359 -0.3431591 95
190 -0.4061497 -0.3766785 -0.3433624 100",
header = T, stringsAsFactors = F) -> df1
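To put all four data frames on one circular plot with a gap between them, one possible approach is to stack them and shift each onto its own arc of the x axis. The sketch below is only an illustration under assumptions: the four data frames are simulated stand-ins (make_df is a hypothetical helper), and the gap width of 20 is an arbitrary choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the four data frames; only mm and score.i are used
set.seed(1)
make_df <- function() {
  data.frame(mm = seq(10, 100, by = 5),
             score.i = -0.40 + rnorm(19, sd = 0.005))
}
dfs <- list(df1 = make_df(), df2 = make_df(), df3 = make_df(), df4 = make_df())

gap  <- 20               # assumed gap width, in the same units as mm
span <- 100 - 10 + gap   # arc length reserved for each data frame

# Stack the data frames and shift each one onto its own arc
combined <- bind_rows(dfs, .id = "source") %>%
  mutate(idx = match(source, names(dfs)),
         x   = mm + (idx - 1) * span)

# group = source keeps the lines of different data frames from being joined
ggplot(combined, aes(x = x, y = score.i, colour = source, group = source)) +
  geom_line() +
  scale_x_continuous(limits = c(10, 10 + 4 * span)) +
  coord_polar()
```

Because the upper x limit extends one gap past the last data frame, the last and first arcs are also separated when the axis wraps around the circle.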

Related

Creating Boxplot in R

I have a table with the sales volumes of several products, and I want to build a boxplot for each product, with sales volume on the vertical axis and days on the horizontal axis. When I run my code, boxplots are not drawn for some values. What is the reason for this?
Here is table:
Day Cottage cheese..pcs. Kefir..pcs. Sour cream..pcs.
1 1 99 103 111
2 2 86 101 114
3 3 92 100 116
4 4 87 112 120
5 5 86 104 111
6 6 88 105 122
7 7 88 106 118
Here is my code:
head(out1)# out1-the table above
boxplot(Day~Cottage cheese..pcs., data = out1)
Here is the result:
Try below:
# example data
out1 <- read.table(text = " Day Cottage.cheese Kefir Sour.cream
1 1 99 103 111
2 2 86 101 114
3 3 92 100 116
4 4 87 112 120
5 5 86 104 111
6 6 88 105 122
7 7 88 106 118", header = TRUE)
# reshape wide-to-long
outlong <- stats::reshape(out1, idvar = "Day", v.names = "value",
timevar = "product", times = colnames(out1)[2:4],
varying = colnames(out1)[2:4], direction = "long")
# then plot
boxplot(value~product, outlong)
In addition to the provided answer, if you want sales volume on the vertical axis and days on the horizontal axis, you can do the following (using the out1 data provided by zx8754):
library(tidyr)
library(data.table)
library(ggplot2)
#data from wide to long
dt <- pivot_longer(out1, cols = c("Kefir", "Sour.cream", "Cottage.cheese"), names_to = "Product", values_to = "Value")
#set dt to data.table object
setDT(dt)
#convert day from integer to a factor
dt[, Day := as.factor(Day)]
#ggplot
ggplot(dt, aes(x = Day, y = Value)) + geom_bar(stat = "identity") + facet_wrap(~Product)
facet_wrap provides separate graphs for the three products.
I created a bar chart here since boxplots would be useless in this case (every product has only one value each day)
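The claim that every product has only one value per day can be checked directly. This is a minimal sketch that reuses the out1 example data from the answers above; the names long and counts are introduced here only for illustration.

```r
library(tidyr)

out1 <- read.table(text = "Day Cottage.cheese Kefir Sour.cream
1 99 103 111
2 86 101 114
3 92 100 116
4 87 112 120
5 86 104 111
6 88 105 122
7 88 106 118", header = TRUE)

# Wide to long: one row per (Day, Product) pair
long <- pivot_longer(out1, cols = -Day, names_to = "Product", values_to = "Value")

# Number of observations for each Day/Product combination
counts <- table(long$Day, long$Product)
counts
```

Every cell of the table is 1, so a boxplot drawn per day and product would collapse to a single line.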

ggplot showing a trend with more than 1 variables across y axis

I have a dataframe df where I need to see the comparison of the trend between weeks
df
Col Mon Tue Wed
1 47 164 163
2 110 168 5
3 31 146 109
4 72 140 170
5 129 185 37
6 41 77 96
7 85 26 41
8 123 15 188
9 14 23 163
10 152 116 82
11 118 101 5
Right now I can only plot two variables, as below, but I need to see Tuesday and Wednesday as well:
ggplot(data=df,aes(x=Col,y=Mon))+geom_line()
You can either add a
geom_line(aes(x = Col, y = Mon), col = 1)
for each day, or you would need to restructure your data frame using a function like gather so your new columns are col, day, value. Without reformatting the data, your result would be
ggplot(data=df)+geom_line(aes(x=Col,y=Mon), col = 1) + geom_line(aes(x=Col,y=Tue), col = 2) + geom_line(aes(x=Col,y=Wed), col = 3)
with a restructure it would be
ggplot(data=df)+geom_line(aes(x=Col,y=Val, col = Day))
The standard way would be to get the data in long format and then plot
library(tidyverse)
df %>%
gather(key, value, -Col) %>%
ggplot() + aes(factor(Col), value, col = key, group = key) + geom_line()
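In current versions of tidyr, gather() is superseded by pivot_longer(). The following sketch shows the same reshape with pivot_longer(), using only the first four rows of the question's data and the column names Day and Val from the first answer; df_long is a name introduced here for illustration.

```r
library(tidyr)
library(ggplot2)

# First four rows of the question's data
df <- data.frame(
  Col = 1:4,
  Mon = c(47, 110, 31, 72),
  Tue = c(164, 168, 146, 140),
  Wed = c(163, 5, 109, 170)
)

# Wide to long: one row per (Col, Day) pair
df_long <- pivot_longer(df, cols = -Col, names_to = "Day", values_to = "Val")

# One line per day, coloured by day
ggplot(df_long, aes(x = Col, y = Val, colour = Day)) +
  geom_line()
```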

Smoothing Lines in ggplot between all data point

I have a data.frame similar to this example
SqMt <- "Sex Sq..Meters PDXTotalFreqStpy
1 M 129 22
2 M 129 0
3 M 129 1
4 F 129 35
5 F 129 42
6 F 129 5
7 M 557 20
8 M 557 0
9 M 557 15
10 F 557 39
11 F 557 0
12 F 557 0
13 M 1208 33
14 M 1208 26
15 M 1208 3
16 F 1208 7
17 F 1208 0
18 F 1208 8
19 M 604 68
20 M 604 0
21 M 604 0
22 F 604 0
23 F 604 0
24 F 604 0"
Data <- read.table(text=SqMt, header = TRUE)
I want to show the average PDXTotalFreqStpy for each Sq..Meters organized by Sex. This is what I use:
library(ggplot2)
ggplot(Data, aes(x=Sq..Meters, y=PDXTotalFreqStpy)) + stat_summary(fun="mean", geom="line", aes(group=Sex,color=Sex))
How do I get these lines smoothed out so that they are not jagged but nice and curvy, while still going through all the data points? I have seen things on spline, but I have not gotten those to work.
See if this works for you:
library(dplyr)
# increase n if the result is not smooth enough
# (for this example, n = 50 looks sufficient to me)
n = 50
# manipulate data to calculate the mean for each sex at each x-value
# before passing the result to ggplot()
Data %>%
group_by(Sex, x = Sq..Meters) %>%
summarise(y = mean(PDXTotalFreqStpy)) %>%
ungroup() %>%
ggplot(aes(x, y, color = Sex)) +
# optional: show point locations for reference
geom_point() +
# optional: show original lines for reference
geom_line(linetype = "dashed", alpha = 0.5) +
# further data manipulation to calculate values for smoothed spline
geom_line(data = . %>%
group_by(Sex) %>%
summarise(x1 = list(spline(x, y, n)[["x"]]),
y1 = list(spline(x, y, n)[["y"]])) %>%
tidyr::unnest(cols = c(x1, y1)),
aes(x = x1, y = y1))
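To see what spline() contributes in the code above, here is a small base R sketch. The inputs are the per-size means for Sex F, computed from the question's data; s and at_knots are names introduced only for this illustration.

```r
# Mean PDXTotalFreqStpy for Sex F at each Sq..Meters value
x <- c(129, 557, 604, 1208)
y <- c(mean(c(35, 42, 5)), mean(c(39, 0, 0)), mean(c(0, 0, 0)), mean(c(7, 0, 8)))

# n evenly spaced points along an interpolating cubic spline
s <- spline(x, y, n = 50)
str(s)

# Evaluated at the original x values, the spline reproduces the original y values
at_knots <- spline(x, y, xout = x)$y
all.equal(at_knots, y)
```

The curve passes through every point, which is what the question asks for, but an interpolating spline can overshoot between points, so it is worth inspecting the result visually.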

R find number of rows in a group and plot

I have a table of Tennis matches. I want to group by winner_ids and plot them against height, basically to check if the taller players have won more matches.
The data looks like this.
m_id winner_id winner_height
1 21 166
2 21 166
3 22 167
4 21 166
5 23 170
6 24 163
7 22 167
8 25 164
Here m_id is the match_id. I want to plot number of matches a person has won against his height
example: 21 has won 3 matches and her height is 166 cm
How can I achieve this in ggplot?
My code below doesn't seem to be working:
matches %>% group_by(winner_id) %>%
ggplot(., aes(x = winner_ht, y = nrow((winner_id)))) + geom_point()
Can anyone help?
Do you mean something like this?
library(tidyverse)
df %>%
group_by(winner_id, winner_height) %>%
summarise(n = n()) %>%
ggplot(aes(winner_height, n, label = winner_id)) +
geom_point() +
geom_text(position = position_nudge(y = -0.1))
Explanation: We count the number of games n per winner_id and winner_height and pass the summarised data to ggplot where we plot winner_height vs. n. We can also add labels to indicate the winner_id.
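The counting step on its own can also be written with dplyr's count(), which is shorthand for group_by() followed by summarise(n = n()). A minimal sketch on the question's data; wins is a name introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  m_id = 1:8,
  winner_id = c(21, 21, 22, 21, 23, 24, 22, 25),
  winner_height = c(166, 166, 167, 166, 170, 163, 167, 164)
)

# One row per winner, with n = number of matches won
wins <- count(df, winner_id, winner_height)
wins
```

The result can be passed to ggplot() in the same way as the summarised data above.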
Sample data
df <- read.table(text =
"m_id winner_id winner_height
1 21 166
2 21 166
3 22 167
4 21 166
5 23 170
6 24 163
7 22 167
8 25 164", header = T)

Points in a scatterplot with individual ellipses using ggplot2 in R

My dataset is formed by 4 columns, as shown below:
The two columns on the left represent the XY coordinates of a geographical structure, and the two on the right represent the size of each geographical unit (North-South and East-West diameters).
I would like to graphically represent a scatterplot where to plot all the coordinates and draw over each point an ellipse including the diameters of each geographical unit.
Manually, and using only two points, the image should be like this one:
How can I do it using ggplot2?
You can download the data here
Use geom_ellipse() from ggforce:
library(ggplot2)
library(ggforce)
d <- data.frame(
x = c(10, 20),
y = c(10, 20),
ns = c(5, 8),
ew = c(4, 4)
)
ggplot(d, aes(x0 = x, y0 = y, a = ew/2, b = ns/2, angle = 0)) +
geom_ellipse() +
coord_fixed()
Created on 2019-06-01 by the reprex package (v0.2.1)
I'm not adding any new code to what Claus Wilke already posted above. All credit should go to Claus. I'm simply testing it with the actual data and showing OP how to post data.
Loading packages needed
# install.packages(c("tidyverse"), dependencies = TRUE)
library(tidyverse)
Reading data,
tbl <- read.table(
text = "
X Y Diameter_N_S Diameter_E_W
-4275 1145 77 96
-4855 1330 30 25
-4850 1612 45 90
-4990 1410 15 15
-5055 1230 60 50
-5065 1503 43 45
-5135 1305 40 50
-5505 1190 55 70
-5705 1430 90 40
-5645 1535 52 60
", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble()
showing data,
tbl
#> # A tibble: 10 x 4
#> X Y Diameter_N_S Diameter_E_W
#> <int> <int> <int> <int>
#> 1 -4275 1145 77 96
#> 2 -4855 1330 30 25
#> 3 -4850 1612 45 90
#> 4 -4990 1410 15 15
#> 5 -5055 1230 60 50
#> 6 -5065 1503 43 45
#> 7 -5135 1305 40 50
#> 8 -5505 1190 55 70
#> 9 -5705 1430 90 40
#> 10 -5645 1535 52 60
loading more packages needed
library(ggforce) # devtools::install_github("thomasp85/ggforce")
executing
# a and b are semi-axes, so halve the diameters (as in the answer above)
ggplot(tbl, aes(x0 = X, y0 = Y, a = Diameter_E_W / 2, b = Diameter_N_S / 2, angle = 0)) +
geom_ellipse() + geom_point(aes(X, Y), size = .5) + coord_fixed() + theme_bw()
