x-axis on joy plot shows incorrect values - r

I am new to making joy plots in R. Below is a plot I made with some simulated data. I'm confused, though, because my data variable foo contains no negative values, yet the x-axis of the resulting plot extends below zero:
library(ggjoy)
p <- ggplot(results, aes(foo, bar)) + geom_joy()
The data is:
results <- structure(list(foo = c(462.834004209936, 460.834004209936, 73.0340042099357,
106.134004209936, 165.634004209936, 200.134004209936, 490.434004209936,
157.334004209936, 460.834004209936, 131.434004209936, 269.934004209936,
457.534004209936, 459.634004209936, 475.534004209936, 180.034004209936,
142.134004209936, 294.734004209936, 419.534004209936, 279.834004209936,
280.734004209936, 448.034004209936, 206.334004209936, 283.134004209936,
243.034004209936, 530.334004209936, 396.934004209936, 49.8340042099357,
136.134004209936, 210.234004209936, 59.0340042099357, 269.834004209936,
123.034004209936, 385.434004209936, 78.7340042099357, 226.434004209936,
391.034004209936, 219.434004209936, 338.134004209936, 87.0340042099357,
434.234004209936, 123.034004209936, 75.7340042099357, 247.234004209936,
192.334004209936, 146.234004209936, 259.334004209936, 72.5340042099357,
110.934004209936, 287.134004209936, 122.634004209936, 197.834004209936,
379.334004209936), bar = structure(c(3L, 8L, 1L, 5L, 10L, 8L,
7L, 9L, 8L, 10L, 9L, 8L, 8L, 9L, 2L, 3L, 5L, 6L, 9L, 1L, 3L,
5L, 6L, 8L, 7L, 9L, 2L, 3L, 2L, 2L, 3L, 1L, 5L, 10L, 4L, 7L,
5L, 6L, 8L, 8L, 1L, 8L, 8L, 9L, 5L, 6L, 5L, 6L, 7L, 9L, 1L, 9L
), .Label = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"
), class = "factor")), .Names = c("foo", "bar"), row.names = c(NA,
-52L), class = "data.frame")
I think it may have to do with stat:
Stats
The default stat used with geom_joy is stat_joy. However, it may not do exactly what you want it to do, and there are other stats that can be used that may be better for your respective application.
First, stat_joy estimates the data range and bandwidth for the density estimation from the entire data at once, rather than from each individual group of data. This choice makes joyplots look more uniform, but the density estimates can in some cases look quite different from what you would get from geom_density or stat_density. This problem can be remedied by using stat_density with geom_joy. This works just fine; we just need to make sure that we map the calculated density onto the height aesthetic.

geom_joy() estimates a density function, and that estimate is not bounded by the minimum and maximum of your data. Because you've supplied only a few data points, the estimated densities spread well beyond the observed range. You can see it here:
ggplot(results, aes(foo, bar)) +
geom_point() +
geom_joy(alpha=.3)
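The same overhang can be reproduced with base R's density(), which by default (cut = 3) evaluates the estimate up to three bandwidths beyond the data range. The following is only a sketch: the vector x is made up for illustration and is not the foo column from the question, and it assumes the joy plot stat behaves like a plain kernel density estimate in this respect.

```r
# Hypothetical all-positive sample (not the data from the question)
x <- c(50, 73, 106, 165, 200, 270, 295, 420, 460, 530)

d <- density(x)   # default cut = 3: the grid extends 3 bandwidths past the data
range(x)          # observed range, strictly positive
range(d$x)        # evaluation grid, wider than the data and negative at the low end

# Restricting the grid to the observed range removes the overhang
d_clipped <- density(x, from = min(x), to = max(x))
range(d_clipped$x)
```

With more observations the bandwidth shrinks, so the overhang becomes less noticeable even without clipping.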


How to add edges between components of a graph in igraph R

I have a graph containing 4 components. Now, I want to add edges between the components based on the size of the membership.
For example, the following graph contains 4 components.
First, I connect every pair of components with a single edge whose endpoints are chosen at random. I can do it using this code
graph1 <- graph_from_data_frame(g, directed = FALSE)
E(graph1)$weight <- g$new_ssp
cl <- components(graph1)
graph2 <- with(
stack(membership(cl)),
add.edges(
graph1,
c(combn(sapply(split(ind, values), sample, size = 1), 2)),
weight = runif(choose(cl$no, 2))
)
)
Second, I want to add an edge between component-1 and component-2 only, while the remaining components stay in the new graph exactly as they were in the previous graph.
That is, after adding an edge between component-1 and component-2, the new graph will contain 3 components: the 1st (component-1 and component-2 merged into one component because we added an edge), the 2nd (component-3 from the main graph), and the 3rd (component-4 from the main graph). I can do it using this code
dg <- decompose.graph(graph1)
graph3 <- (dg[[1]] %u% dg[[2]])
component_subgraph_1 <- components(graph3)
graph2 <- with(
stack(membership(component_subgraph_1)),
add.edges(
graph1,
c(combn(sapply(split(ind, values), sample, size = 1), 2)),
weight = 0.01))
The same applies to every other combination: component-1 and component-3, component-1 and component-4, component-2 and component-3, component-2 and component-4, and component-3 and component-4.
But it is not feasible to write this code and change dg[[1]], dg[[2]], and so on by hand. Moreover, my actual dataset contains many components, so doing this manually is impossible.
Any idea how I can do this automatically?
I also have a scoring function (like the shortest path), so I want to check the score after joining all components, after joining only 2 components, after joining only 3 components, and so on, much like a greedy algorithm.
Reproducible Data:
g <- structure(list(query = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("ID_00104",
"ID_00136", "ID_00169", "ID_00178", "ID_00180"), class = "factor"),
target = structure(c(16L, 19L, 20L, 1L, 9L, 9L, 6L, 11L,
13L, 15L, 4L, 8L, 10L, 14L, 2L, 3L, 5L, 7L, 12L, 17L, 18L
), .Label = c("ID_00169", "ID_00288", "ID_00324", "ID_00394",
"ID_00663", "ID_00790", "ID_00846", "ID_00860", "ID_00910", "ID_00959",
"ID_01013", "ID_01047", "ID_01130", "ID_01222", "ID_01260", "ID_06663",
"ID_06781", "ID_06786", "ID_06791", "ID_09099"), class = "factor"),
new_ssp = c(0.654172560113154, 0.919096895578551, 0.925821596244131,
0.860406091370558, 0.746376811594203, 0.767195767195767,
0.830379746835443, 0.661577608142494, 0.707520891364902,
0.908193484698914, 0.657118786857624, 0.687664041994751,
0.68586387434555, 0.874513618677043, 0.836646499567848, 0.618361836183618,
0.684163701067616, 0.914728682170543, 0.876297577854671,
0.732707087959009, 0.773116438356164)), row.names = c(NA,
-21L), class = "data.frame")
Thanks in advance.
You are actually close to what you want already. Perhaps the code below could help you
out <- with(
stack(membership(cl)),
lapply(
combn(split(ind, values), 2, simplify = FALSE),
function(x) {
add.edges(
graph1,
c(combn(sapply(x, sample, size = 1), 2)),
weight = 0.01
)
}
)
)
and then you can run
sapply(out, plot)
to visualize all the combinations.
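To check that each graph in such a list really merges exactly one pair of components, the idea can be tried on a toy graph. This is a sketch under assumptions: the graph g0 and its vertex names are made up rather than taken from the question's data, and the new edge endpoints are sampled at random as in the code above.

```r
library(igraph)

# Hypothetical toy graph with 4 components (not the data from the question)
g0 <- graph_from_literal(A-B, C-D, E-F, G-H)

memb   <- components(g0)$membership     # component id of each vertex
groups <- split(V(g0)$name, memb)       # vertex names grouped by component

# Every unordered pair of components
pairs <- combn(seq_along(groups), 2, simplify = FALSE)

# For each pair, pick one random vertex per component and join them
joined <- lapply(pairs, function(p) {
  ends <- vapply(groups[p],
                 function(v) v[sample.int(length(v), 1)],
                 character(1))
  add_edges(g0, unname(ends), weight = 0.01)
})

# Each result should have one component fewer than the original graph
sapply(joined, count_components)
```

A scoring function can then be applied to each element of joined, for example with sapply(joined, my_score), where my_score is a placeholder for your own function.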

New error when producing boxplot

So I had this script working yesterday on a different data set, and it actually worked once on this data set, but when I tried to combine it with another figure using plot_grid, I got this error:
Error:
T_SHOW_BACKTRACE environmental variable.
Error in grid.Call(L_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
polygon edge not found
Now when I try to construct the boxplot itself, I get the same error...
Here is my data:
dput(SUICMass)
structure(list(ChillTime = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("2", "4", "6", "24",
"27", "29", "31"), class = "factor"), Mass = c(1.2687, 1.5417,
1.6898, 1.7655, 2.413, 2.0333, 2.0824, 1.2676, 1.4916, 2.1585,
2.2453, 1.3624, 1.2951, 2.4209, 2.0804, 1.9227, 1.9032, 2.1063,
1.7601, 1.9905, 1.9837, 1.6312, 1.8567, 1.4433, 1.9369, 2.1029,
2.0265, 1.3212, 1.2971, 1.5823, 1.4759, 1.2745, 0.714, 1.5693,
1.7906, 1.8607, 1.8851, 1.9192, 1.6307, 1.4269, 1.7011, 0.8249,
1.7198, 1.3939, 1.394, 2.1527, 1.288, 1.4724, 1.5264, 1.6562,
1.5796, 1.4982, 1.2794, 1.6021, 0.6345, 2.4041, 2.0246, 1.8398,
1.349, 2.0156, 1.1563, 2.0462)), .Names = c("ChillTime", "Mass"
), row.names = c(NA, -62L), class = "data.frame")
Here is my code:
library(ggplot2)
library(multcompView)
library(plyr)
library(gridExtra)
library(cowplot)
## Box plot for Susans WMA population
SUICMass <- read.csv('SUICMass_Test_June_28_2017.csv', header = TRUE)
SUICMass$ChillTime <- factor(SUICMass$ChillTime, levels=c("2", "4", "6", "24", "27", "29", "31"))
generate_label_df <- function(SUICMassTUKEY, variable){
# Extract labels and factor levels from Tukey post-hoc
Tukey.levels <- SUICMassTUKEY[[variable]][,4]
Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])
#I need to put the labels in the same order as in the boxplot :
Tukey.labels$treatment=rownames(Tukey.labels)
Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
return(Tukey.labels)
}
SUICMassmodel=lm(SUICMass$Mass~SUICMass$ChillTime )
SUICMassANOVA=aov(SUICMassmodel)
# Tukey test to study each pair of treatment :
SUICMassTUKEY <- TukeyHSD(x=SUICMassANOVA, 'SUICMass$ChillTime', conf.level=0.95)
labels<-generate_label_df(SUICMassTUKEY , "SUICMass$ChillTime")#generate labels using function
names(labels)<-c('Letters','ChillTime')#rename columns for merging
SUICMassyvalue<-aggregate(.~ChillTime, data=SUICMass, max)# obtain letter position for y axis using the per-group maximum
SUICMassfinal<-merge(labels,SUICMassyvalue) #merge dataframes
SUICMassPlot <- ggplot(SUICMass, aes(x = ChillTime, y = Mass)) +
stat_boxplot(geom ='errorbar', width=.2) +
geom_blank() +
theme_bw() +
theme(panel.border = element_rect(fill=NA, colour = "black", size=0.75)) +
theme(axis.text.x = element_text(face="bold")) +
theme(axis.text.y = element_text(face="bold")) +
labs(x = 'Time (weeks)', y = 'Mass (g)') +
ggtitle(expression(atop(bold("Fresh Mass"), atop(italic("(Sarah's - UIC Colony)"))))) +
theme(plot.title = element_text(hjust = 0.5, vjust = -0.6, face='bold')) +
geom_boxplot(fill = 'dodgerblue1', stat = "boxplot") +
geom_text(data = SUICMassfinal, aes(x = ChillTime, y = Mass, label = Letters),vjust=-2,hjust=.5) +
scale_y_continuous(limit = c(0, 3.5))
I can't figure out what the issue is here, because sometimes I can get the script to work and other times not.

Ordering of factor variables [duplicate]

I am calling the ggplot function
ggplot(data, aes(x, y, fill = category)) + geom_bar(stat = "identity")
The result is a barplot with bars filled by various colours corresponding to category. However the ordering of the colours is not consistent from bar to bar. Say there is pink, green and blue. Some bars go pink,green,blue from bottom to top and some go green,pink,blue. I don't see any obvious pattern.
How are these orderings chosen? How can I change it? At the very least, how can I make ggplot choose a consistent ordering?
The classes of x, y and category are integer, numeric and factor, respectively. If I make category an ordered factor, it does not change this behavior.
Anyone know how to fix this?
Reproducible example:
dput(data)
structure(list(mon = c(9L, 10L, 11L, 10L, 8L, 7L, 7L, 11L, 9L,
10L, 12L, 11L, 7L, 12L, 8L, 12L, 9L, 7L, 9L, 10L, 10L, 8L, 12L,
7L, 11L, 10L, 8L, 7L, 11L, 12L, 12L, 9L, 9L, 7L, 7L, 12L, 12L,
9L, 9L, 8L), gclass = structure(c(9L, 1L, 8L, 6L, 4L, 4L, 3L,
6L, 2L, 4L, 1L, 1L, 5L, 7L, 1L, 6L, 8L, 6L, 4L, 7L, 8L, 7L, 9L,
8L, 3L, 5L, 9L, 2L, 7L, 3L, 5L, 5L, 7L, 7L, 9L, 2L, 4L, 1L, 3L,
8L), .Label = c("Down-Down", "Down-Stable", "Down-Up", "Stable-Down",
"Stable-Stable", "Stable-Up", "Up-Down", "Up-Stable", "Up-Up"
), class = c("ordered", "factor")), NG = c(222614.67, 9998.17,
351162.2, 37357.95, 4140.48, 1878.57, 553.86, 40012.25, 766.52,
15733.36, 90676.2, 45000.29, 0, 375699.84, 2424.21, 93094.21,
120547.69, 291.33, 1536.38, 167352.21, 160347.01, 26851.47, 725689.06,
4500.55, 10644.54, 75132.98, 42676.41, 267.65, 392277.64, 33854.26,
384754.67, 7195.93, 88974.2, 20665.79, 7185.69, 45059.64, 60576.96,
3564.53, 1262.39, 9394.15)), .Names = c("mon", "gclass", "NG"
), row.names = c(NA, -40L), class = "data.frame")
ggplot(data,aes(mon,NG,fill=gclass))+geom_bar(stat="identity")
Starting in ggplot2_2.0.0, the order aesthetic is no longer available. To get a graph with the stacks ordered by fill color, you can simply order the dataset by the grouping variable you want to order by.
I often use arrange from dplyr for this. Here I'm ordering the dataset by the fill factor within the ggplot call rather than creating an ordered dataset but either will work fine.
library(dplyr)
ggplot(arrange(data, gclass), aes(mon, NG, fill = gclass)) +
geom_bar(stat = "identity")
This is easily done in base R, of course, using the classic order with the extract brackets:
ggplot(data[order(data$gclass), ], aes(mon, NG, fill = gclass)) +
geom_bar(stat = "identity")
With the resulting plot in both cases now in the desired order:
ggplot2_2.2.0 update
In ggplot2_2.2.0, fill order is based on the order of the factor levels. The default order will plot the first level at the top of the stack instead of the bottom.
If you want the first level at the bottom of the stack you can use reverse = TRUE in position_stack. Note you can also use geom_col as a shortcut for geom_bar(stat = "identity").
ggplot(data, aes(mon, NG, fill = gclass)) +
geom_col(position = position_stack(reverse = TRUE))
You need to specify the order aesthetic as well.
ggplot(data,aes(mon,NG,fill=gclass,order=gclass))+
geom_bar(stat="identity")
This may or may not be a bug.
To control the order, set the factor levels explicitly with the levels argument. Like this:
data$gclass
(data$gclass2 <- factor(data$gclass,levels=sample(levels(data$gclass)))) # Note the difference in the order of the factor levels
ggplot(data,aes(mon,NG,fill=gclass2))+geom_bar(stat="identity")
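The effect of the levels argument can be seen without any plotting. The following sketch uses a made-up factor category (not the gclass column above) to show that re-declaring the levels changes only their order, not the values themselves:

```r
# Hypothetical toy categories (not the gclass column from the question)
category <- factor(c("Up", "Down", "Stable", "Up", "Down"))
levels(category)    # alphabetical by default: "Down" "Stable" "Up"

# Re-declare the factor with the levels in the order you want
category2 <- factor(category, levels = c("Up", "Stable", "Down"))
levels(category2)   # "Up" "Stable" "Down"

# The underlying values are unchanged; only the level order differs
identical(as.character(category), as.character(category2))
```

Using sample() as in the answer above shuffles the levels at random, which is handy for demonstrating the effect; for a real plot you would normally spell out the order.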
You can change the colour using the scale_fill_ functions. For example:
ggplot(dd,aes(mon,NG,fill=gclass)) +
geom_bar(stat="identity") +
scale_fill_brewer(palette="Blues")
To get consistent ordering in the bars, you need to order the data frame:
dd = dd[with(dd, order(gclass, -NG)), ]
In order to change the ordering of legend, alter the gclass factor. So something like:
dd$gclass= factor(dd$gclass,levels=sort(levels(dd$gclass), TRUE))
Since this exchange shows up first for "factor fill order", I will add one more solution, which I believe is a bit more straightforward and doesn't require altering your underlying data.
ggplot(data, aes(x, y, fill = factor(category, levels = c("Down-Down", "Down-Stable", "Down-Up", "Stable-Down", "Stable-Stable", "Stable-Up", "Up-Down", "Up-Stable", "Up-Up")))) +
geom_col(position = position_stack(reverse = FALSE))
Or, as I prefer, I first create a vector of levels to simplify the code later and make it easier to edit:
v_factor_levels <- c("Down-Down", "Down-Stable", "Down-Up", "Stable-Down", "Stable-Stable", "Stable-Up", "Up-Down", "Up-Stable", "Up-Up")
ggplot(data, aes(x, y, fill = factor(category, levels = v_factor_levels))) +
geom_col(position = position_stack(reverse = FALSE))
You don't need the reverse argument of position_stack() inside geom_col(); I keep it as a reminder in case I want to reverse the order, but you could simplify further by dropping it.
Building on @aosmith's answer, another way to order the bars, which I found slightly more intuitive, is:
ggplot(data, aes(x=mon, y=reorder(NG,gclass), fill = gclass)) +
geom_bar(stat = "identity")
The beauty of the reorder function from the base stats package is that you can call it as reorder(x, by, FUN), where the levels of x are ordered by the result of applying a function such as sum or mean to by within each level.
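To make the behaviour of reorder concrete, here is a small base R sketch. The vectors x and v are made up for illustration; the point is only that the levels of the first argument are sorted by a per-level summary of the second.

```r
# Hypothetical toy data: a grouping factor and a numeric value per row
x <- factor(c("a", "b", "c", "a", "b", "c"))
v <- c(5, 1, 3, 7, 2, 4)

# Per-level sums: a = 12, b = 3, c = 7
tapply(v, x, sum)

# reorder() sorts the levels of x by those sums, in increasing order
r <- reorder(x, v, FUN = sum)
levels(r)   # "b" "c" "a"
```

Passing a function of the negated values, for example reorder(x, -v, FUN = sum), gives the decreasing order instead.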

How to add multiple data series to a scatterplot and how to format numbers to appear in standard form on y axis

My data set:
structure(list(Site = c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 5L, 5L, 6L, 6L, 6L), Average.worm.weight..g. = c(0.1934,
0.249, 0.263, 0.262, 0.4186, 0.204, 0.311, 0.481, 0.326, 0.657,
0.347, 0.311, 0.239, 0.4156, 0.31, 0.3136, 0.4033, 0.302, 0.277
), Average.total.immune.cell.count = structure(c(8L, 16L, 11L,
12L, 10L, 1L, 4L, 15L, 4L, 3L, 17L, 13L, 18L, 7L, 5L, 6L, 9L,
14L, 2L), .Label = c("0", "168750", "18650000", "200,000", "21,600,000",
"226666.6", "22683333.33", "2533333.33", "283333.333", "291666.6",
"335833.3", "435800", "474816666.7", "500000", "6450000", "729166.667",
"7433333.3", "9916667"), class = "factor"), Average.eleocyte.number = structure(c(2L,
5L, 14L, 10L, 1L, 1L, 6L, 1L, 6L, 7L, 1L, 9L, 15L, 8L, 12L, 3L,
11L, 13L, 4L), .Label = c("0", "1266666.67", "153333.3", "168740",
"17", "200,000", "2266666.667", "22683333.33", "23116666.67",
"264000", "283333.333", "442", "500000", "7.3", "9916667"), class = "factor")), .Names = c("Site",
"Average.worm.weight..g.", "Average.total.immune.cell.count",
"Average.eleocyte.number"), class = "data.frame", row.names = c(NA,
-19L))
This is my R script so far:
# Plotting multiple data series on a graph
y1<-dframe1$"Average.total.immune.cell.count"
y2<-dframe1$"Average.eleocyte.number"
x<-dframe1$"Average.worm.weight..g."
plot.default(y1~x,type="p" )
points(y2~x)
I am trying to add two y series to the same scatterplot and am struggling to do so. I want different symbols for the points so that the two data series can be told apart. I would also like the axes to meet at the bottom left-hand corner, and would appreciate being told how to do that. Finally, I would like the y axis to be in standard form, but do not know how to get R to do that.
Best regards.
K.
So this is an object lesson in getting your data in the correct format to begin with. Your numbers have commas, which R does not like. Hence the numbers get converted to character and imported as factors (which your structure(...) clearly shows). You need to fix that, or better yet get rid of the commas prior to exporting.
Something like this will work
colnames(dframe) <- c("Site","x","y1","y2")
dframe$y1 <- as.numeric(as.character(gsub(",","",dframe$y1,fixed=TRUE)))
dframe$y2 <- as.numeric(as.character(gsub(",","",dframe$y2,fixed=TRUE)))
plot(y1~x,dframe, col="red", pch=20)
points(y2~x,dframe, col="blue", pch=20)
But there are additional problems. One of the numbers (in row 12) is a factor of 10 larger than all the others, so the plot above is not very informative. It's hard to know if this is a data input error, or a genuine outlier in your data.
EDIT: Response to OP's comment
dframe <- dframe[-12,] # remove row 12
dframe <- dframe[order(dframe$x),] # order by increasing x
plot(y1~x,dframe, col="red", pch=20, type="b")
points(y2~x,dframe, col="blue", pch=20, type="b")
legend("topleft",legend=c("y1","y2"),col=c("red","blue"),pch=20)
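The question also asked about distinct symbols, axes that meet at the bottom left, and standard-form labels, which the code above does not cover. The following is a hedged sketch in base graphics with made-up values (raw, weight and second are hypothetical and are not the data from the question). It also shows why calling as.numeric directly on a factor is not enough.

```r
# Hypothetical counts as they might arrive from a CSV with thousands separators
raw <- factor(c("200,000", "21,600,000", "0", "435800"))

as.numeric(raw)   # not the numbers: these are the internal level codes
counts <- as.numeric(gsub(",", "", as.character(raw), fixed = TRUE))
counts            # 200000 21600000 0 435800

weight <- c(0.19, 0.42, 0.31, 0.26)   # hypothetical x values
second <- counts / 2                  # hypothetical second series

# pch sets the symbol; bty = "l" draws an L-shaped frame so the axes meet
# at the bottom left; yaxt = "n" suppresses the default y axis
plot(weight, counts, pch = 1, bty = "l", yaxt = "n",
     xlab = "weight", ylab = "count")
points(weight, second, pch = 17)

# Redraw the y axis with labels in standard (scientific) form
ticks <- axTicks(2)
axis(2, at = ticks, labels = formatC(ticks, format = "e", digits = 1))
legend("topleft", legend = c("counts", "second"), pch = c(1, 17))
```

formatC(..., format = "e") is one way to get standard form; the digits argument controls how many digits follow the decimal point.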

factor order when subsetting within ggplot

I have factors on x-axis and order those factor levels in a way that's intuitive to plot with ggplot. It works fine. However, when I use the subset command within ggplot, it re-orders my original sequence of factors.
Is it possible to do subsetting within ggplot and preserve the order of factor levels?
Here is the data and code:
library(ggplot2)
library(plyr)
dat <- structure(list(SubjectID = structure(c(12L, 4L, 6L, 7L, 12L,
7L, 5L, 8L, 14L, 1L, 15L, 1L, 7L, 1L, 7L, 5L, 4L, 2L, 9L, 6L,
7L, 13L, 12L, 2L, 15L, 3L, 5L, 13L, 13L, 10L, 7L, 8L, 10L, 10L,
1L, 10L, 12L, 7L, 6L, 10L), .Label = c("s001", "s002", "s003",
"s004", "s005", "s006", "s007", "s008", "s009", "s010", "s011",
"s012", "s013", "s014", "s015"), class = "factor"), Parameter = structure(c(7L,
3L, 5L, 3L, 6L, 4L, 6L, 7L, 7L, 4L, 7L, 12L, 8L, 11L, 1L, 4L,
3L, 4L, 6L, 4L, 6L, 6L, 12L, 5L, 12L, 1L, 7L, 13L, 11L, 1L, 4L,
1L, 6L, 13L, 10L, 10L, 10L, 13L, 5L, 8L), .Label = c("(Intercept)",
"c0.008", "c0.01", "c0.015", "c0.02", "c0.03", "PrevCorr1", "PrevFail1",
"c0.025", "c0.004", "c0.006", "c0.009", "c0.012", "c0.005"), class = "factor"),
Weight = c(0.0352725634087837, 1.45546697427904, 2.29457594510248,
0.479548914792514, 6.39680995359234, 1.48829600339586, 2.69253113220079,
-0.171219812386926, -0.453625394224277, 1.43732884325816,
0.742416863226952, 0.256935761466245, -0.29401087047524,
0.34653127811481, 0.33120592543102, 2.79213318878505, 2.47047299128637,
1.022450287681, 6.92891513416868, 0.648982326396105, 6.58336282626389,
6.40600461501379, 1.80062359655524, 3.86658202530889, 1.23833324887194,
-0.026560261876089, 0.121670468861011, 0.9290824087063, 0.349104382483186,
0.24722583823016, 1.82473621255801, -0.712668411699556, 6.51789901685784,
0.74682257127003, 0.0755807984938072, 0.131705709322157,
0.246465073382095, 0.876279316248929, 1.83442709571662, -0.579086982613267
)), .Names = c("SubjectID", "Parameter", "Weight"), row.names = c(2924L,
784L, 1537L, 1663L, 3138L, 1744L, 1266L, 1996L, 3548L, 86L, 3692L,
230L, 1613L, 213L, 1627L, 1024L, 832L, 384L, 2418L, 1568L, 1714L,
3362L, 3200L, 497L, 3632L, 683L, 1020L, 3281L, 3263L, 2779L,
1632L, 1995L, 2674L, 2753L, 312L, 2638L, 3198L, 1809L, 1569L,
2589L), class = "data.frame")
## Sort factors in the order that will make it intuitive to read the plot
## It goes "(Intercept)", "PrevCorr1", "PrevFail1", "c0.004", "c0.006", etc.
paramNames <- levels(dat$Parameter)
contrastNames <- sort(paramNames[grep("c0",paramNames)])
biasNames <- paramNames[!paramNames %in% contrastNames]
dat$Parameter <- factor(dat$Parameter, levels=c(biasNames, contrastNames))
## Add grouping parameter that will be used to plot different weights in different colors
dat$plotColor <-"Contrast"
dat$plotColor[dat$Parameter=="(Intercept)"] <- "Intercept"
dat$plotColor[grep("PrevCorr", dat$Parameter)] <- "PrevSuccess"
dat$plotColor[grep("PrevFail", dat$Parameter)] <- "PrevFail"
p <- ggplot(dat, aes(x=Parameter, y=Weight)) +
# The following command, which adds geom_line to data points of the graph, changes the order of levels
# If I uncomment the next line, the factor level order goes wrong.
#geom_line(subset=.(plotColor=="Contrast"), aes(group=1), stat="summary", fun.y="mean", color="grey50", size=1) +
geom_point(aes(group=Parameter, color=plotColor), size=5, stat="summary", fun.y="mean") +
geom_point(aes(group=Parameter), size=2.5, color="white", stat="summary", fun.y="mean") +
theme(axis.text.x = element_text(angle=45, vjust=1, hjust=1))
print(p)
Here is the plot when geom_line is commented out
And here is what happens when geom_line is uncommented
If you switch the order in which you plot the objects, the problem disappears:
p <- ggplot(dat, aes(x=Parameter, y=Weight)) +
# Draw the points from the full data first, so the x scale sees every factor level,
# and only then add the geom_line that uses the subset.
geom_point(aes(group=Parameter, color=plotColor), size=5, stat="summary", fun.y="mean") +
geom_line(subset = .(plotColor == "Contrast"), aes(group=1), stat="summary", fun.y="mean", color="grey50", size=1) +
geom_point(aes(group=Parameter), size=2.5, color="white", stat="summary", fun.y="mean") +
theme(axis.text.x = element_text(angle=45, vjust=1, hjust=1))
print(p)
I think the problem lies in plotting the subsetted data first: it ditches the levels of the original data, and when you add the points back in, it doesn't know where to put them. When you plot the original data first, it maintains the levels. I'm not sure, though; you might have to take my word on it.
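If you would rather not depend on layer order, another option is to fix the x scale explicitly. This is a sketch under assumptions: the data frame d is a made-up stand-in for dat, the subset is passed through the data argument of the layer (as far as I know, newer ggplot2 releases no longer accept subset = .()), and the means are assumed to be computed beforehand rather than through stat = "summary".

```r
library(ggplot2)

# Hypothetical stand-in for dat, with the levels already in the desired order
d <- data.frame(
  Parameter = factor(c("(Intercept)", "PrevCorr1", "c0.004", "c0.006"),
                     levels = c("(Intercept)", "PrevCorr1", "c0.004", "c0.006")),
  Weight = c(0.3, 0.1, 0.5, 1.2)
)

# Rows for the contrast parameters only, used by the line layer
contrast_only <- d[grepl("^c0", d$Parameter), ]

p <- ggplot(d, aes(x = Parameter, y = Weight)) +
  geom_line(data = contrast_only, aes(group = 1), colour = "grey50") +
  geom_point(size = 3) +
  # Pin the axis to the full set of levels, whichever layer is drawn first
  scale_x_discrete(limits = levels(d$Parameter))
print(p)
```

With limits set this way, the line layer can stay underneath the points without changing the order of the x axis.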
