geom_bar(position="dodge") and errorbar not dodging properly - r

I have a small issue. Here's my code:
ggplot(GFAPdata_numb, aes(x=Level, y=Pos.Area, fill=Statut))+
geom_bar(stat="identity", color="black", position = "dodge")+
geom_errorbar(aes(ymin=lower, ymax=higher), width=.2, position=position_dodge(.9))
For some reason, my plot looks like this: [image: "weird dodge"]
I don't know why! The dodge seems to have worked, but it looks like a "ghost" of the data is still stacked and interfering with my error bars.
Do you have any idea what's causing that?
Edit: I was asked to add some data with dput, so here it is (this is my first time using this function, so I'm not sure I did it right):
> dput(head(GFAPdata_numb))
structure(list(Agneau = c(1L, 1L, 1L, 1L, 2L, 2L), Statut = c("terme",
"terme", "terme", "terme", "terme", "terme"), Area = c(6.53,
6.53, 6.53, 6.53, 4.93, 4.93), Level = c("Weak", "Pos", "Strong",
"Neg", "Weak", "Pos"), Values = c(6744015L, 5076648L, 787615L,
13099676L, 5356151L, 3978924L), Positivity = c(0.262331844844596,
0.197473824638087, 0.0306370160768142, 0.509557314440504, 0.275961978086681,
0.205003880155091), Pos.Area = c(0.0401733299915154, 0.0302410144928157,
0.00469173293672499, 0.0780332793936453, 0.0559760604638299,
0.041582937151134), moyenne = c(0.0382848392036753, 0.0382848392036753,
0.0382848392036753, 0.0382848392036753, 0.050709939148073, 0.050709939148073
), ecart.type = c(0.0304231534615388, 0.0304231534615388, 0.0304231534615388,
0.0304231534615388, 0.0391149666608345, 0.0391149666608345),
SEM = c(0.0152115767307694, 0.0152115767307694, 0.0152115767307694,
0.0152115767307694, 0.0195574833304173, 0.0195574833304173
), lower = c(0.00847014881136729, 0.00847014881136729, 0.00847014881136729,
0.00847014881136729, 0.0123772718204552, 0.0123772718204552
), higher = c(0.0680995295959834, 0.0680995295959834, 0.0680995295959834,
0.0680995295959834, 0.0890426064756909, 0.0890426064756909
)), row.names = c(NA, -6L), groups = structure(list(Agneau = 1:2,
.rows = structure(list(1:4, 5:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))

OK, I think I managed to fix the issue myself after a couple of days away from it to clear my head.
I was using Pos.Area as my y-value (that is, the value for each of my samples) instead of the mean of Pos.Area to create my plot. I guess that's why my error bars were so wild: I had error bars for each value of Pos.Area.
Once I changed that, the plot was much better.
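For anyone landing here with the same symptom, this is a minimal sketch of that fix, not the original poster's exact code. It assumes dplyr is available, uses made-up toy values in place of GFAPdata_numb, and invents a second Statut level ("premature") purely for illustration. The idea is to collapse the data to one row per bar (mean and SEM per Level and Statut) before plotting, and to give both layers the same dodge width.

```r
library(dplyr)
library(ggplot2)

# Toy data standing in for GFAPdata_numb (values and "premature" are made up)
toy <- data.frame(
  Agneau   = rep(1:4, each = 2),
  Statut   = rep(c("terme", "premature"), each = 4),
  Level    = rep(c("Weak", "Strong"), times = 4),
  Pos.Area = c(0.040, 0.005, 0.056, 0.008, 0.030, 0.012, 0.034, 0.010)
)

# One row per bar: mean and standard error for each Level x Statut cell
summary_df <- toy %>%
  group_by(Level, Statut) %>%
  summarise(
    mean_area = mean(Pos.Area),
    sem       = sd(Pos.Area) / sqrt(n()),
    .groups   = "drop"
  ) %>%
  mutate(lower = mean_area - sem, higher = mean_area + sem)

# Bars and error bars share the same dodge width so they line up
p <- ggplot(summary_df, aes(x = Level, y = mean_area, fill = Statut)) +
  geom_col(color = "black", position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = lower, ymax = higher),
                width = 0.2, position = position_dodge(0.9))
p
```

With one row per Level and Statut combination there is exactly one bar and one error bar per dodge position, so nothing is left to overlap.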

Related

Using ggalluvial with nodes holding different values

My data is a set of activities completed by persons. The sequence of activities a person takes varies. The data below show the activities for each step (Step1, Step2, etc). I'd like an alluvial plot that labels the activities at each step (each a different node 1, 2, 3...) What is the best approach? Here's what I have so far:
df<-structure(list(acts_activity_id = c("9928131", "445661", "686203", "687868", "688564"), Step1 = c("Unable to Reach", "Unable to Reach",
"Search Correspondence", "Unable to Reach", "Unable to Reach"), Step2 = c("Match Request", NA, "Connection Made", NA, "Match Request"
), Step3 = c("Support Group Request", NA, "Connection Contact Attempt", NA, "Support Group Request"),Step4 = c("Information Provided",
NA, "Not Available to Support", NA, "Information Provided"),
Step5 = c(NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_)), class = c("grouped_df", "tbl_df", "tbl",
"data.frame"),
row.names = c(NA, -5L),
groups = structure(list(acts_activity_id = c("9928131", "445661", "686203", "687868", "688564"), .rows = structure(list(1L, 2L, 3L, 4L, 5L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), .drop = TRUE))
df %>%
ggplot(
aes(
axis1=Step1, #each step has different values; individuals go thru different sequence of steps
axis2=Step2, axis3=Step3, axis4=Step4, axis5=Step5 ))+
geom_flow()+
geom_stratum()+
labs(title="Activity Sequence")
The first
If you have your data in this order (each column is a set of different activities), then use ggsankey:
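library(dplyr)    # for %>%
library(ggplot2)
library(ggsankey) # for geom_sankey(), geom_sankey_label(), theme_sankey()
df$acts_activity_id<-NULL
x<-df %>% ggsankey::make_long(Step1,Step2,Step3,Step4,Step5)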
ggplot(x, aes(x = x, next_x = next_x,
node = node, next_node = next_node,
fill = factor(node), label = node)) +
geom_sankey(flow.alpha = 0.6, node.color = "gray30") +
geom_sankey_label(size = 3, color = "white", fill = "gray40") +
scale_fill_viridis_d() +
theme_sankey(base_size = 18) +
labs(x = NULL) +
theme(legend.position = "none",
plot.title = element_text(hjust = .5))

Using segment labels in ggplot with ggrepel with smooth segments

This is my dataframe:
df<-structure(list(year = c(1984, 1984), team = c("Australia", "Brazil"
), continent = c("Oceania", "Americas"), medal = structure(c(3L,
3L), .Label = c("Bronze", "Silver", "Gold"), class = "factor"),
n = c(84L, 12L)), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
And this is my ggplot (my question concerns the annotation for the Brazil label):
ggplot(data = df)+
geom_point(aes(x = year, y = n)) +
geom_text_repel(aes(x = year, y = n, label = team),
size = 3, color = 'black',
seed = 10,
nudge_x = -.029,
nudge_y = 35,
segment.size = .65,
segment.curvature = -1,
segment.angle = 178.975,
segment.ncp = 1)+
coord_flip()
So, I have a segment made of two parts, and each part has small breaks in it. How can I avoid them?
I already tried using segment.ncp and changing nudge_x or nudge_y, but it's not working.
Any help?
Not really sure what is going on here. This is the best I could generate by experimenting with different values for the segment.* arguments.
There is some guidance at: https://ggrepel.slowkow.com/articles/examples.html which has an example with shorter leader lines, maybe that's an approach you could use.
df<-structure(list(year = c(1984, 1984), team = c("Australia", "Brazil"
), continent = c("Oceania", "Americas"), medal = structure(c(3L,
3L), .Label = c("Bronze", "Silver", "Gold"), class = "factor"),
n = c(84L, 12L)), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
library(ggplot2)
library(ggrepel)
ggplot(data = df)+
geom_point(aes(x = year, y = n)) +
geom_text_repel(aes(x = year, y = n, label = team),
size = 3, color = 'black',
seed = 1,
nudge_x = -0.029,
nudge_y = 35,
segment.size = 0.5,
segment.curvature = -0.0000002,
segment.angle = 1,
segment.ncp = 1000)+
coord_flip()
Created on 2021-08-26 by the reprex package (v2.0.0)

make subway graph include 102 topics in ggplot2 r

This is a follow-up to "subway-style graph for word frequency across three datasets in ggplot2".
I used the code from the answer to that question, but I am struggling with how best to adjust the graph so that it fits 100 unique dict entries without completely messing up the dict word labels in the margins.
I have tested out different amounts of words to feed into the subway graph, and found that it cannot contain more than 25 words.
I have data:
structure(list(dict = c("apple", "apple", "apple",
"mandarin", "mandarin", "mandarin", "orange", "orange", "orange", "pear"),
name = c("freq_ongov", "freq_onindiv", "freq_onmedia", "freq_ongov",
"freq_onindiv", "freq_onmedia", "freq_ongov", "freq_onindiv",
"freq_onmedia", "freq_ongov"), value = c(0, 87, 63, 0, 44,
20, 3, 27, 25, 0), rank = c(26, 85, 70, 26, 61, 42.5, 86,
47, 48, 26)), row.names = c(NA, -10L), groups = structure(list(
name = c("freq_ongov", "freq_onindiv", "freq_onmedia"), .rows = structure(list(
c(1L, 4L, 7L, 10L), c(2L, 5L, 8L), c(3L, 6L, 9L)), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, 3L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
But there are 100 rows within this data that I want to include in the following code:
leftlabels <- df$dict[df$name == "freq_ongov"]
leftlabels <- leftlabels[order(df$rank[df$name == "freq_ongov"])]
rightlabels <- df$dict[df$name == "freq_onmedia"]
rightlabels <- rightlabels[order(df$rank[df$name == "freq_onmedia"])]
ggplot(df, aes(name, rank, color = dict, group = dict)) +
geom_line(size = 4) +
geom_point(shape = 21, fill = "white", size = 4) +
scale_y_continuous(breaks = seq(max(df$rank)), labels = leftlabels,
sec.axis = sec_axis(~., breaks = seq(max(df$rank)),
labels = rightlabels)) +
scale_x_discrete(expand = c(0.01, 0)) +
guides(color = guide_none()) +
coord_cartesian(clip = "off") +
theme(axis.ticks.length.y = unit(0, "points"))
I tried changing the y.int and width of the y axis to fit in 100 words, but that only makes the y-axis longer, without changing the spacing between each word label on the y-axis, so all the words get squeezed together. Any suggestions?

R: Creating more space between bars in geom_col

I'm creating a bar chart in R for values of a certain case over time, using geom_col. The chart contains values per week for a period of about a year and a half.
My problem with my current plot is that the bars are pretty close together. Especially in a PDF format, this creates a problem, since zoomed out it looks more like a histogram. You really have to zoom in drastically to see that the plot consists of individual bars per week. See below.
So what I've tried to do is increase the space between the bars, using position = position_dodge(width=2). However, so far I see no changes. Why doesn't it take the position dodge? Because the x scale is based on dates?
Below is the head() of my df and a basic version of the code for the plot I'm trying to make.
structure(list(Land = c("India", "India", "India", "India", "India",
"India"), Date = structure(c(18498, 18491, 18484, 18477, 18470,
18463), class = "Date"), SUMU = c(88L, 142L, 96L, 101L, 112L,
128L), ChangeAVG = c("Other", "Other", "Other", "Other", "Other",
"Other")), row.names = c(NA, -6L), groups = structure(list(Land = "India",
.rows = structure(list(1:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1L, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
ggplot(India, aes(Date, SUMU, fill=ChangeAVG))+ theme_light() + geom_col(position = position_dodge(width=10))
Examples of plot view in PDF normally and with zoom at 200%
Thanks!
The problem is that you are using the width argument inside position_dodge. Move it outside that call:
ggplot(India, aes(Date, SUMU, fill=ChangeAVG)) +
theme_light() +
geom_col(width=1.5)
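As a rough illustration of why this works (a sketch, not part of the original answer): on a Date axis one unit is one day, so with weekly data each bar has 7 days of room and any width below 7 leaves a gap. The toy data frame below is rebuilt from the dates and values in the question's dput(); the names weekly, bar_width and gap_days are made up for the example.

```r
library(ggplot2)

# Toy weekly series rebuilt from the dates and values in the question's dput()
weekly <- data.frame(
  Date = as.Date(c(18463, 18470, 18477, 18484, 18491, 18498),
                 origin = "1970-01-01"),
  SUMU = c(128, 112, 101, 96, 142, 88)
)

# Bars are 7 days apart, so any width below 7 leaves a visible gap
bar_width <- 3
gap_days  <- as.numeric(diff(weekly$Date))[1] - bar_width

p <- ggplot(weekly, aes(Date, SUMU)) +
  theme_light() +
  geom_col(width = bar_width)
p
```

A smaller width gives thinner bars and wider gaps, which should read better in a zoomed-out PDF.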

igraph arrow.mode seems to have no effect

I have a network with some directed and some undirected edges. I'm trying to use igraph to plot it using the arrow.mode parameter, but the graph always shows arrows with the default parameters. Here's an example.
Here are some data:
spearRhoP_lagged4 <- structure(list(Var1 = c("ARISA_538.9", "ARISA_538.9", "ARISA_666.4",
"ARISA_686.9", "ARISA_538.9", "ARISA_594.1"), Var2 = c("ARISA_666.4",
"ARISA_686.9", "ARISA_686.9", "ARISA_666.4", "ARISA_561.8", "ARISA_561.8"
), rho = c(0.280885191364122, 0.415365287156247, 0.614493076574831,
0.312630564055403, 0.295296877306726, 0.381890811408216), p = c(0.00206314544835896,
2.9098006351119e-06, 1.35005674822095e-13, 0.000567475872663549,
0.00116911931220592, 1.98010880043619e-05), delay = c(0, 0, 0,
1, 0, 0), fdr = c(0.0135393920048557, 7.97032347878478e-05, 2.83511917126399e-11,
0.00503534929264839, 0.00898225813036257, 0.000366902513022),
arrow = c("-", "-", "-", ">", "-", "-")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
Var1 = c("ARISA_538.9", "ARISA_538.9", "ARISA_538.9", "ARISA_594.1",
"ARISA_666.4", "ARISA_686.9"), Var2 = c("ARISA_561.8", "ARISA_666.4",
"ARISA_686.9", "ARISA_561.8", "ARISA_686.9", "ARISA_666.4"
), .rows = list(5L, 1L, 2L, 6L, 3L, 4L)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))
Then I build the graph
LaggedSpearGraph <- graph_from_data_frame(spearRhoP_lagged4)
Lastly I plot the graph, telling it that I want the arrow direction to be specified by the parameter arrow
plot(LaggedSpearGraph,
vertex.size=2,
arrow.mode = E(LaggedSpearGraph)$arrow)
I get an output that looks like this.
But what I want is a network where there is only one edge with an arrow on it.
Any suggestions?
You need to add edge. as a prefix, so the argument becomes edge.arrow.mode:
LaggedSpearGraph <- graph_from_data_frame(spearRhoP_lagged4, directed=TRUE)
plot(LaggedSpearGraph,
vertex.size=10,
edge.arrow.mode = E(LaggedSpearGraph)$arrow)
See here:
https://github.com/igraph/igraph/issues/954
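Here is a minimal sketch of the same idea on a hypothetical three-node graph (the edge list and the arrow_codes name are made up for illustration). It assumes igraph's documented numeric codes for arrow mode, where 0 means no arrowhead and 2 means a forward arrowhead, and converts the symbols explicitly so the mapping is visible.

```r
library(igraph)

# Hypothetical toy edge list; the `arrow` column mimics the one in the question
edges <- data.frame(
  from  = c("A", "A", "B"),
  to    = c("B", "C", "C"),
  arrow = c("-", ">", "-")
)

# Extra columns of the data frame become edge attributes
g <- graph_from_data_frame(edges, directed = TRUE)

# Translate the symbols to numeric codes: 0 = no arrowhead, 2 = forward arrowhead
arrow_codes <- ifelse(E(g)$arrow == ">", 2, 0)

plot(g, vertex.size = 10, edge.arrow.mode = arrow_codes)
```

Only the A to C edge should be drawn with an arrowhead; the other two should appear as plain lines.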
