I want to reproduce the following graph:
And my data is the following, where the blue line is complete_preds_means and the orange line is contrafact:
structure(list(dias = structure(c(19052, 19053, 19054, 19055,
19056, 19057, 19058, 19059, 19060, 19061, 19062, 19063, 19064,
19065, 19066, 19067, 19068, 19069, 19070, 19071), class = "Date"),
complete_preds_means = c(341.07434, 381.59167, 455.47815,
485.05597, 527.60876, 562.63965, 602.48975, 624.663, 626.5637,
527.2239, 420.71643, 389.30804, 378.74396, 366.61548, 361.36566,
363.37253, 319.31824, 314.39688, 303.60342, 294.8934), contrafact = c(364.5,
358.89, 466.64, 470.11, 464.25, 487.27, 591.2, 715.33, 628.02,
505.98, 402.9, 316.81, 323.35, 358.61, 354.26, 369.5, 317.01,
336.5, 285.33, 270.91), complete_preds_lower = c(320.6368042,
361.7870895, 432.4487762, 461.2275833, 503.2255051, 535.7108551,
576.3850006, 597.9762146, 601.4407013, 504.0448837, 398.7777023,
368.0046799, 356.3603165, 345.5847885, 339.9679932, 342.7514801,
298.3247482, 293.4419693, 282.5286865, 275.4635284), complete_preds_upper = c(359.9897186,
402.5708664, 477.4746765, 508.7775711, 550.3326447, 587.6521027,
628.5320251, 649.9691833, 649.4831665, 547.9886108, 442.046402,
410.8121475, 399.0208908, 389.8615128, 387.4929993, 386.2935928,
340.140834, 336.3622116, 324.793483, 315.4606934)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
So far I have tried this:
plot_fig4 <- ggplot()+
geom_line(data=fig4, aes(x=dias, y=complete_preds_means), colour="blue")+
geom_line(data=fig4, aes(x=dias, y=contrafact), colour="red") +
geom_ribbon(aes(ymin=fig4$complete_preds_lower, ymax=fig4$complete_preds_upper))+
labs(y="Clase ($)",
x="") +
scale_y_continuous(breaks=seq(from=100, to=800, by=100))+
scale_x_date(expand = c(0, 0), date_breaks="1 month", date_labels = "%b\n%Y")
But I only get this error: Error: geom_ribbon requires the following missing aesthetics: x or y, xmin and xmax
You haven't told geom_ribbon what variable should be on the x axis. You could just add x=fig4$dias inside the aes of geom_ribbon, but this isn't the best way to use ggplot. Better to use ggplot's inheritance of data and aesthetic mappings to avoid repeating yourself and making mistakes along the way. If you change your first line to ggplot(fig4, aes(x = dias)) you don't need to do data=fig4 and x=dias in every geom call.
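A couple of other issues: map the color aesthetic inside aes() to produce a legend, and give the ribbon a low alpha so that it is semi-transparent. The ordering of layers also matters, because ggplot draws layers in the order they are added; here the ribbon is added before the predicted line so that the line sits on top of it.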
Finally, I have added some theme tweaks to make the plot more like the desired output.
ggplot(fig4, aes(x = dias)) +
geom_line(aes(y = contrafact, color = "Contrafact"), linewidth = 1) +
geom_ribbon(aes(ymin = complete_preds_lower, ymax = complete_preds_upper),
fill = "deepskyblue4", alpha = 0.2) +
geom_line(aes(y = complete_preds_means, color = "Predicted"), linewidth = 1) +
geom_vline(xintercept = as.Date("2022-03-13"),
linetype = 4, colour = "green4") +
labs(y = "Clase ($)", x = NULL) +
scale_color_manual(NULL, values = c("orange", "deepskyblue4")) +
scale_y_continuous(breaks = 1:8 * 100) +
scale_x_date(expand = c(0, 0), date_breaks = "1 month",
date_labels = "%b\n%Y") +
theme_classic(base_size = 16)
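One optional refinement, offered as a sketch rather than as part of the answer above: scale_color_manual matches an unnamed values vector to the legend labels in sorted order, so naming the vector ties each colour to its label explicitly. The data frame toy and the vector line_cols below are made up for illustration.

```r
library(ggplot2)

# Hypothetical toy data standing in for fig4 (values are made up)
toy <- data.frame(
  dias       = as.Date("2022-03-01") + 0:4,
  pred       = c(341, 382, 455, 485, 528),
  contrafact = c(365, 359, 467, 470, 464)
)

# Named values: each legend label is tied to its colour explicitly
line_cols <- c(Predicted = "deepskyblue4", Contrafact = "orange")

p <- ggplot(toy, aes(x = dias)) +
  geom_line(aes(y = contrafact, color = "Contrafact")) +
  geom_line(aes(y = pred, color = "Predicted")) +
  scale_color_manual(NULL, values = line_cols)

# layer_data() returns the computed data for a layer, including the resolved colour
unique(layer_data(p, 1)$colour)
```

With a named vector, reordering the entries or renaming a label no longer silently swaps the colours.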
I have two datasets looking like the following:
#dataset 1
structure(list(dataset1 = 1:86, x = c(24.22055, 24.61821, 24.60858,
24.5963, 24.66904, 24.682, 24.74323, 24.84038, 25.02606, 25.00763,
24.99861, 25.00901, 24.99273, 24.98789, 24.99308, 24.97615, 24.9572,
24.95962, 24.93451, 25.08111, 24.97653, 24.92734, 24.96208, 25.03111,
25.00242, 24.95385, 24.99345, 25.03311, 24.93516, 24.95163, 24.94859,
25.07071, 25.15814, 25.22433, 25.3163, 25.22823, 25.34902, 25.4118,
25.40885, 25.35868, 25.34709, 25.24046, 25.31097, 25.32868, 25.41141,
24.92474, 24.90951, 24.9927, 25.0052, 24.94954, 25.15449, 25.10164,
25.03112, 24.97345, 25.03352, 25.11059, 25.05391, 25.05766, 25.06176,
25.17039, 25.17868, 25.1053, 25.0568, 25.08028, 25.137, 25.36559,
25.06363, 25.26306, 25.16708, 25.14826, 25.06046, 24.99418, 25.19738,
25.20072, 25.24073, 25.18705, 25.18142, 25.16747, 25.1235, 25.38767,
25.37099, 25.30558, 25.35074, 25.33528, 25.32482, 25.32328),
y = c(22.25462, 21.88752, 21.89172, 21.88356, 21.86319, 21.80782,
21.7451, 21.70914, 21.68861, 21.66829, 21.67942, 21.67475,
21.67994, 21.67462, 21.67405, 21.67494, 21.66842, 21.65091,
21.6657, 21.68427, 21.66878, 21.6711, 21.66772, 21.63123,
21.64916, 21.65174, 21.65686, 21.63292, 21.64039, 21.53591,
21.64633, 21.62177, 21.61304, 21.60609, 21.594, 21.60413,
21.59069, 21.58264, 21.58277, 21.57736, 21.57457, 21.57674,
21.56562, 21.49258, 21.48584, 21.74852, 21.73081, 21.75594,
21.66646, 21.70782, 21.67075, 21.66456, 21.64514, 21.65763,
21.66863, 21.64658, 21.63672, 21.62677, 21.65441, 21.61994,
21.61754, 21.65159, 21.62676, 21.61157, 21.60181, 21.65121,
21.61303, 21.61424, 21.61419, 21.6258, 21.59797, 21.61477,
21.5879, 21.58918, 21.61834, 21.56725, 21.61358, 21.61456,
21.57619, 21.592, 21.58095, 21.52847, 21.57284, 21.56755,
21.56847, 21.49455), z = c(53.52483, 53.49427, 53.49971,
53.52014, 53.46777, 53.51018, 53.51168, 53.45048, 53.28533,
53.32408, 53.32197, 53.31623, 53.32733, 53.33749, 53.33287,
53.34891, 53.37439, 53.38947, 53.39978, 53.23462, 53.35469,
53.40156, 53.3702, 53.33767, 53.34843, 53.39441, 53.34969,
53.33398, 53.42445, 53.51247, 53.40507, 53.30752, 53.22882,
53.16958, 53.0897, 53.16764, 53.06029, 53.00556, 53.00838,
53.06396, 53.07834, 53.1828, 53.12341, 53.17874, 53.10275,
53.32674, 53.35968, 53.25136, 53.32834, 53.34264, 53.17476,
53.2338, 53.32374, 53.36892, 53.29785, 53.24283, 53.30937,
53.31556, 53.28384, 53.20967, 53.20378, 53.24311, 53.31644,
53.30816, 53.26118, 52.9832, 53.32334, 53.1227, 53.21872,
53.22594, 53.34158, 53.39105, 53.21472, 53.2101, 53.14093,
53.2457, 53.205, 53.21797, 53.30031, 53.02033, 53.04806,
53.16595, 53.07643, 53.09717, 53.10672, 53.18217)), class = "data.frame", row.names = c(NA,
-86L))
#dataset2
structure(list(dataset2 = 1:16, x1 = c(24.702, 24.64061, 24.64624,
24.699, 24.68064, 24.65854, 24.75148, 24.58633, 24.73463, 24.59992,
24.65293, 24.60753, 24.62394, 25.3416, 24.71006, 24.67719), y1 = c(21.87799,
21.89606, 21.9034, 21.8859, 21.89083, 21.90291, 21.8491, 21.93269,
21.87262, 21.87465, 21.90029, 21.87801, 21.87661, 21.64635, 21.83719,
21.90565), z1 = c(53.42002, 53.46333, 53.45036, 53.4151, 53.42853,
53.43855, 53.39942, 53.48098, 53.39274, 53.52543, 53.44677, 53.51446,
53.49945, 53.01205, 53.45276, 53.41716)), class = "data.frame", row.names = c(NA,
-16L))
I have written code to plot kernel density contours using the ggtern package.
# density plot for dataset 1
plot1 <- ggtern(data = test,aes(x=x, y=y, z=z))
plot1+ stat_density_tern(geom="polygon",
aes(fill = ..level..,
alpha = ..level..)) +
theme_rgbw() +
labs(title = "Example Density/Contour Plot") +
scale_fill_gradient(low = "lightblue",high = "blue") +
guides(color = "none", fill = "none", alpha = "none")+
scale_T_continuous (limits = c(0.225,0.215))+
scale_L_continuous (limits= c(0.255,0.245))+
scale_R_continuous (limits = c(0.53,0.54))
# density plot for dataset 2
plot2 <- ggtern(data = test2,aes(x=x1, y=y1, z=z1))
plot2 + stat_density_tern(geom="polygon",
aes(fill = ..level..,
alpha = ..level..)) +
theme_rgbw() +
labs(title = "Example Density/Contour Plot") +
scale_fill_gradient(low = "lightgreen",high = "green") +
guides(color = "none", fill = "none", alpha = "none")+
scale_T_continuous (limits = c(0.225,0.215))+
scale_L_continuous (limits= c(0.255,0.245))+
scale_R_continuous (limits = c(0.53,0.54))
The next step I would like to do is to overlap plot1 with plot2. I was wondering if anyone knows how to achieve this. Thanks.
The easiest way to handle this is to add a column to both data frames identifying the source of the data and then to combine them into one large data frame.
Then set the group aesthetic in the mapping so that a separate density is estimated for each source.
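#Add column to identify the data source
#(test1 and test2 are the two data frames from the question)
test1$id <- "Test1"
test2$id <- "Test2"
#give test2 the same column names as test1; this must happen before test2$z and test2$y are used
names(test2)<-names(test1)
#offset test2 slightly; this appears to be a demo tweak to separate the two groups and can be dropped
test2$z <- test2$z+0.2
test2$y <- test2$y+0.2
#combine both datasets into 1
totalTest <- rbind(test1, test2)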
#plot and group by the new ID column
plot1 <- ggtern(data = totalTest, aes(x=x, y=y, z=z, group=id, fill=id))
plot1+ stat_density_tern(geom="polygon",
aes(fill = ..level..,
alpha = ..level..)) +
theme_rgbw() +
labs(title = "Example Density/Contour Plot") +
scale_fill_gradient(low = "lightblue",high = "blue") +
guides(color = "none", fill = "none", alpha = "none") +
scale_T_continuous (limits = c(0.225,0.215))+
scale_L_continuous (limits= c(0.255,0.245))+
scale_R_continuous (limits = c(0.53,0.54))
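The data-preparation step can be checked on its own, without ggtern. The following is a minimal sketch on two made-up data frames with the same layout as the ones in the question; the names toy1, toy2 and total are hypothetical.

```r
# Hypothetical stand-ins for the two datasets (values are made up)
toy1 <- data.frame(dataset1 = 1:3, x = c(24.2, 24.6, 24.7),
                   y = c(22.3, 21.9, 21.8), z = c(53.5, 53.5, 53.5))
toy2 <- data.frame(dataset2 = 1:2, x1 = c(24.7, 24.6),
                   y1 = c(21.9, 21.9), z1 = c(53.4, 53.5))

# Rename first, so both frames share the columns x, y, z
names(toy2) <- names(toy1)

# Tag each row with its source, then stack the rows
toy1$id <- "Test1"
toy2$id <- "Test2"
total <- rbind(toy1, toy2)

# Row counts per source, ready for aes(group = id)
table(total$id)
```

rbind() on data frames matches columns by name, which is why the rename has to come before the two frames are combined.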
I am making a plot in ggplot2 that contains a geom_pointrange and a geom_line. I see that when I change the order of the geoms, either the points are plotted on top of the line, or vice versa. The legend also changes which geom is plotted on top of the other based on the same ordering of the geoms. However, I would like for the line to plot first, then the pointrange on top, in the plot itself, with the opposite in the legend. Is this possible? I would greatly appreciate any input.
Here is the code I used to make the figure.
md.figd2 <- structure(list(date = c("2013-05-28", "2013-07-11", "2013-09-22",
"2013-05-28", "2013-07-11", "2013-09-22", "2013-05-28", "2013-07-11",
"2013-09-22"), trt = structure(c(3L, 3L, 3L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c("- Fescue", "- Random", "Control"), class = "factor"),
means = c(1, 0.921865257043089, 0.793438250521971, 1, 0.878305313846414,
0.85698797555687, 1, 0.840679145697309, 0.798547331410388
), mins = c(1, 0.87709562979756, 0.72278951032918, 1, 0.816185624483356,
0.763720265496049, 1, 0.780804129401513, 0.717089626439849
), maxes = c(1, 0.966634884288619, 0.864086990714762, 1,
0.940425003209472, 0.950255685617691, 1, 0.900554161993105,
0.880005036380927)), .Names = c("date", "trt", "means", "mins",
"maxes"), row.names = c(NA, 9L), class = "data.frame")
library(ggplot2)
dplot1.ysc <- scale_y_continuous(limits=c(0,1), breaks=seq(0,1,.2), name='Proportion mass lost')
dplot1.xsc <- scale_x_date(limits=as.Date(c('2013-05-23', '2013-10-03')), labels=c('May 28', 'July 11', 'Sep 22'), breaks=md.figdata$date, name='Date')
dplot1.csc <- scale_color_manual(values=c('grey20','grey50','grey80'))
dplot1.lsc <- scale_linetype_manual(values=c('solid','dotted','dashed'))
djitter <- rep(c(0,-1,1), each=3)
# This one produces the plot with the legend I want.
dplot1b <- ggplot(md.figd2, aes(x=date + djitter, y=means, group=trt)) + geom_pointrange(aes(ymin=mins, ymax=maxes, color=trt), size=2) + geom_line(aes(linetype=trt), size=1)
# This one produces the plot with the points on the main plot that I want.
dplot1b <- ggplot(md.figd2, aes(x=date + djitter, y=means, group=trt)) + geom_line(aes(linetype=trt), size=1) + geom_pointrange(aes(ymin=mins, ymax=maxes, color=trt), size=2)
dplot1b + dplot1.xsc + dplot1.ysc + dplot1.csc + dplot1.lsc
You can use gtable::gtable_filter to extract the legend from the plot whose legend you want, and then gridExtra::grid.arrange to combine it with the plot whose panel you want:
# the legend I want
dplot1a <- ggplot(md.figd2, aes(x=date , y=means, group=trt)) +
geom_pointrange(aes(ymin=mins, ymax=maxes, color=trt), size=2,
position = position_dodge(width=1)) +
geom_line(aes(linetype=trt), size=1)
# This one produces the plot with the points on the main plot that I want.
dplot1b <- ggplot(md.figd2, aes(x=date, y=means, group=trt)) +
geom_line(aes(linetype=trt), size=1) +
geom_pointrange(aes(ymin=mins, ymax=maxes, color=trt), size=2)
w <- dplot1b + dplot1.xsc + dplot1.ysc + dplot1.csc + dplot1.lsc
# legend
l <- dplot1a + dplot1.xsc + dplot1.ysc + dplot1.csc + dplot1.lsc
library(gtable)
library(gridExtra)
# extract legend ("guide-box" element)
leg <- gtable_filter(ggplot_gtable(ggplot_build(l)), 'guide-box')
# plot the two components, adjusting the widths as you see fit.
grid.arrange(w + theme(legend.position='none'),leg,ncol=2, widths = c(3,1))
An alternative is to simply replace the legend in the plot you want with the legend you want that you have extracted (using gtable_filter)
# create ggplotGrob of plot you want
wGrob <- ggplotGrob(w)
# replace the legend
wGrob$grobs[wGrob$layout$name == "guide-box"][[1]] <- leg
grid.draw(wGrob)
Quick and dirty. To get the correct plotting order in both figure and legend, add the layers like this: (1) geom_pointrange, (2) geom_line, and then (3) a second geom_pointrange without legend (show.legend = FALSE).
ggplot(md.figd2, aes(x = date, y = means, group = trt)) +
geom_pointrange(aes(ymin = mins, ymax = maxes, color = trt),
position = position_dodge(width = 5), size = 2) +
geom_line(aes(linetype = trt), size = 1) +
geom_pointrange(aes(ymin = mins, ymax = maxes, color = trt),
position = position_dodge(width = 5), size = 2,
show.legend = FALSE) +
scale_y_continuous(limits = c(0,1), breaks = seq(0,1, 0.2), name = 'Proportion mass lost') +
scale_x_date(limits = as.Date(c('2013-05-23', '2013-10-03')), name = 'Date') +
scale_color_manual(values = c('grey20', 'grey50', 'grey80')) +
scale_linetype_manual(values = c('solid', 'dotted', 'dashed'))
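The layering trick can be inspected directly, since a ggplot object stores its layers in the order they were added. This is a sketch on made-up data (toy is hypothetical). Note one assumption: the dput() in the question stores date as character, and scale_x_date() needs a Date column, so the sketch converts it with as.Date() first.

```r
library(ggplot2)

# Hypothetical toy data with the same columns as md.figd2 (values are made up)
toy <- data.frame(
  date  = rep(c("2013-05-28", "2013-07-11"), times = 2),
  trt   = rep(c("Control", "- Fescue"), each = 2),
  means = c(1, 0.92, 1, 0.88),
  mins  = c(1, 0.88, 1, 0.82),
  maxes = c(1, 0.97, 1, 0.94)
)
toy$date <- as.Date(toy$date)  # assumption: convert character dates to Date

p <- ggplot(toy, aes(x = date, y = means, group = trt)) +
  # layer 1: pointrange, shown in the legend
  geom_pointrange(aes(ymin = mins, ymax = maxes, color = trt)) +
  # layer 2: line, drawn over layer 1 in both panel and legend
  geom_line(aes(linetype = trt)) +
  # layer 3: pointrange again, drawn on top in the panel but hidden from the legend
  geom_pointrange(aes(ymin = mins, ymax = maxes, color = trt),
                  show.legend = FALSE) +
  scale_x_date(name = "Date")

# Geoms in drawing order (bottom to top)
vapply(p$layers, function(l) class(l$geom)[1], character(1))
```

The cost of this approach is that the pointranges are drawn twice, which is harmless for opaque symbols but would double the opacity of semi-transparent ones.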
I have data such as this:
yr X lower upper
1 2004 0.2852 0.3927 0.1888
2 2005 0.3710 0.2385 0.5093
3 2006 0.3297 0.2177 0.4557
4 2007 0.2230 0.1424 0.3138
5 2008 0.3028 0.1952 0.4237
6 2009 0.3906 0.2798 0.5226
7 2010 0.3382 0.2343 0.4467
Here is some reproducible data:
dt <- structure(list(yr = 2004:2010, X = c(0.2852, 0.371, 0.3297, 0.223, 0.3028, 0.3906, 0.3382), lower = c(0.3927, 0.2385, 0.2177, 0.1424, 0.1952, 0.2798, 0.2343), upper = c(0.1888, 0.5093, 0.4557, 0.3138, 0.4237, 0.5226, 0.4467)), .Names = c("yr", "X", "lower", "upper"), class = "data.frame", row.names = c(NA, -7L))
I would like to plot this, and the results will go in a presentation, so I would like to make it look as nice as possible - I'm sorry to use the subjective "nice" but I don't know how else to say it! I have tried this:
library(ggplot2)
ggplot(dt, aes(x=yr, y=X, group=1)) +
geom_line() +
geom_errorbar(width=.1, aes(ymin=lower, ymax=upper)) +
geom_point(shape=21, size=3, fill="blue") +
ylim(0,0.6)
But I don't like the results - it just seems too plain and boring:
You could use a ribbon instead of the errorbars
dt <- structure(list(yr = 2004:2010,
X = c(0.2852, 0.371, 0.3297, 0.223, 0.3028, 0.3906, 0.3382),
lower = c(0.3927, 0.2385, 0.2177, 0.1424, 0.1952, 0.2798, 0.2343),
upper = c(0.1888, 0.5093, 0.4557, 0.3138, 0.4237, 0.5226, 0.4467)),
.Names = c("yr", "X", "lower", "upper"), class = "data.frame",
row.names = c(NA, -7L))
library(ggplot2)
ggplot(dt, aes(x=yr, y=X, group=1, ymin = lower, ymax = upper)) +
geom_ribbon(alpha = 0.2) +
geom_line() +
geom_point(shape=21, size=3, fill="blue") +
ylim(0,0.6)
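One thing worth checking before drawing the ribbon: in the data as posted, the 2004 row has lower (0.3927) above upper (0.1888). The sketch below assumes the two values were simply swapped and reorders them row by row; if the row is wrong for some other reason, it should be fixed at the source instead.

```r
# Data as posted in the question
dt <- data.frame(
  yr    = 2004:2010,
  X     = c(0.2852, 0.371, 0.3297, 0.223, 0.3028, 0.3906, 0.3382),
  lower = c(0.3927, 0.2385, 0.2177, 0.1424, 0.1952, 0.2798, 0.2343),
  upper = c(0.1888, 0.5093, 0.4557, 0.3138, 0.4237, 0.5226, 0.4467)
)

# Years where the bounds are out of order
dt$yr[dt$lower > dt$upper]

# Assumption: the two values were swapped, so reorder them row by row
lo <- pmin(dt$lower, dt$upper)
hi <- pmax(dt$lower, dt$upper)
dt$lower <- lo
dt$upper <- hi
```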
I would like to plot the following dataset
structure(list(X = structure(c(3L, 12L, 11L, 7L, 13L, 2L, 1L,
10L, 5L, 4L, 8L, 14L, 9L, 6L), .Label = c("BUM", "DDR", "ETB",
"EXP", "HED", "HEDOS", "KON", "LEIT", "MAIN", "MAT", "PER", "PMA",
"TRA", "TRADITION"), class = "factor"), Geschaeft = c(0.0468431771894094,
0.0916666666666667, 0.0654761904761905, 0.0905432595573441, 0.0761904761904762,
0.0672097759674134, 0.0869565217391304, 0.0650887573964497, 0.0762250453720508,
0.0518234165067179, 0.0561330561330561, 0.060077519379845, 0.0865384615384615,
0.0628683693516699), Gaststaette = c(0.0855397148676171, 0.0604166666666667,
0.0555555555555556, 0.0764587525150905, 0.0895238095238095, 0.0712830957230143,
0.075098814229249, 0.0631163708086785, 0.0780399274047187, 0.0383877159309021,
0.0561330561330561, 0.0581395348837209, 0.0596153846153846, 0.0648330058939096
), Bank = c(0.065173116089613, 0.0854166666666667, 0.0972222222222222,
0.0824949698189135, 0.060952380952381, 0.0529531568228106, 0.0731225296442688,
0.0828402366863905, 0.0725952813067151, 0.0806142034548944, 0.0686070686070686,
0.0503875968992248, 0.0807692307692308, 0.0550098231827112),
Hausarzt = c(0.0712830957230143, 0.0833333333333333, 0.0912698412698413,
0.0704225352112676, 0.0628571428571429, 0.0672097759674134,
0.106719367588933, 0.0710059171597633, 0.108892921960073,
0.0940499040307102, 0.0852390852390852, 0.0794573643410853,
0.0826923076923077, 0.110019646365422), Einr..F..Aeltere = c(0.10183299389002,
0.104166666666667, 0.107142857142857, 0.100603621730382,
0.12, 0.116089613034623, 0.112648221343874, 0.112426035502959,
0.121597096188748, 0.0998080614203455, 0.118503118503119,
0.131782945736434, 0.121153846153846, 0.104125736738703),
Park = c(0.0855397148676171, 0.0666666666666667, 0.0912698412698413,
0.0804828973843058, 0.0704761904761905, 0.0672097759674134,
0.0731225296442688, 0.0670611439842209, 0.0834845735027223,
0.0806142034548944, 0.0686070686070686, 0.0658914728682171,
0.0884615384615385, 0.0609037328094303), Sportstaette = c(0.0855397148676171,
0.0791666666666667, 0.0952380952380952, 0.0824949698189135,
0.0933333333333333, 0.114052953156823, 0.0810276679841897,
0.0788954635108481, 0.0780399274047187, 0.0825335892514395,
0.0831600831600832, 0.0852713178294574, 0.0884615384615385,
0.1237721021611), OEPNV = c(0.0529531568228106, 0.05625,
0.0456349206349206, 0.0583501006036217, 0.0666666666666667,
0.0366598778004073, 0.0434782608695652, 0.0571992110453649,
0.0344827586206897, 0.0633397312859885, 0.0478170478170478,
0.062015503875969, 0.0519230769230769, 0.0235756385068762
), Mangel.an.Gruenflaechen = c(0.0692464358452139, 0.0645833333333333,
0.0694444444444444, 0.0422535211267606, 0.0666666666666667,
0.0692464358452139, 0.0711462450592885, 0.0749506903353057,
0.0598911070780399, 0.0959692898272553, 0.0623700623700624,
0.0717054263565891, 0.0653846153846154, 0.0746561886051081
), Kriminalitaet = c(0.0672097759674134, 0.0541666666666667,
0.0476190476190476, 0.0422535211267606, 0.0628571428571429,
0.0509164969450102, 0.0454545454545455, 0.0532544378698225,
0.058076225045372, 0.072936660268714, 0.0602910602910603,
0.063953488372093, 0.0461538461538462, 0.0648330058939096
), Auslaender = c(0.0244399185336049, 0.04375, 0.0416666666666667,
0.0663983903420523, 0.0228571428571429, 0.0509164969450102,
0.0237154150197628, 0.0236686390532544, 0.0217785843920145,
0.0441458733205374, 0.024948024948025, 0.0232558139534884,
0.0230769230769231, 0.0451866404715128), Umweltbelastung = c(0.0468431771894094,
0.0479166666666667, 0.0476190476190476, 0.0402414486921529,
0.0438095238095238, 0.0468431771894094, 0.0454545454545455,
0.0512820512820513, 0.0417422867513612, 0.0518234165067179,
0.0478170478170478, 0.0445736434108527, 0.0442307692307692,
0.0451866404715128), Einr..f..Kinder = c(0.0753564154786151,
0.075, 0.0555555555555556, 0.0724346076458753, 0.0533333333333333,
0.0794297352342159, 0.075098814229249, 0.0788954635108481,
0.0598911070780399, 0.0460652591170825, 0.0977130977130977,
0.0930232558139535, 0.0634615384615385, 0.0451866404715128
), Einr..f..Jugendliche = c(0.122199592668024, 0.0875, 0.0892857142857143,
0.0945674044265594, 0.11047619047619, 0.109979633401222,
0.0869565217391304, 0.120315581854043, 0.105263157894737,
0.0978886756238004, 0.122661122661123, 0.11046511627907,
0.0980769230769231, 0.119842829076621)), .Names = c("X",
"Geschaeft", "Gaststaette", "Bank", "Hausarzt", "Einr..F..Aeltere",
"Park", "Sportstaette", "OEPNV", "Mangel.an.Gruenflaechen", "Kriminalitaet",
"Auslaender", "Umweltbelastung", "Einr..f..Kinder", "Einr..f..Jugendliche"
), row.names = c(NA, -14L), class = "data.frame")
So that it looks like this picture (or better, with each line in a separate plot) that I created with Excel.
But I can't figure out how...
Thanks a lot for your help.
Dominik
UPDATE: Here is just a map of what the groups (BUM,DDR,ETB etc.) mean.
This is an extension to @Andrie's solution. It combines the faceting idea with that of overplotting (stolen liberally from the learnr blog), which I find results in a cool visualization. Here is the code and the resulting output. Comments are welcome.
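library(reshape2)
library(ggplot2)
mdf <- melt(df, id.vars="X")
# order the variables by their mean value and keep a copy of X as Y
mdf = transform(mdf, variable = reorder(variable, value, mean), Y = X)
ggplot(mdf, aes(x = variable, y = value)) +
# grey background lines: dropping X means this layer is repeated in every facet
geom_line(data = transform(mdf, X = NULL), aes(group = Y), colour = "grey80") +
geom_line(aes(group = X)) +
facet_wrap(~X) +
# opts() and theme_text() no longer exist in current ggplot2; use theme() and element_text()
theme(axis.text.x = element_text(angle=90, hjust=1))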
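The overplotting idea rests on one ggplot2 behaviour: a layer whose data lacks the faceting variable is drawn in every panel. A small sketch on made-up data (long and background are hypothetical names) shows this in isolation.

```r
library(ggplot2)

# Hypothetical long-format toy data: 3 groups measured on 3 variables
long <- data.frame(
  X        = rep(c("A", "B", "C"), each = 3),
  variable = rep(c("v1", "v2", "v3"), times = 3),
  value    = c(1, 2, 3, 2, 2, 1, 3, 1, 2)
)
long$Y <- long$X  # copy of the group label that survives dropping X

# Without X, this data cannot be split by facet_wrap(~X),
# so it is repeated in every panel
background <- transform(long, X = NULL)

p <- ggplot(long, aes(x = variable, y = value)) +
  geom_line(data = background, aes(group = Y), colour = "grey80") +  # all groups, grey
  geom_line(aes(group = X)) +                                        # the panel's own group
  facet_wrap(~X)

# The grey layer appears in each of the three panels
unique(layer_data(p, 1)$PANEL)
```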
EDIT: If you have groupings of milieus, then a better way to present might be the following
library(RColorBrewer)
mycols = c(brewer.pal(4, 'Oranges'), brewer.pal(4, 'Greens'),
brewer.pal(3, 'Blues'), brewer.pal(3, 'PuRd'))
mdf2 = read.table(textConnection("
V1, V2
ETB, LEIT
PMA, LEIT
PER, LEIT
LEIT, LEIT
KON, TRADITION
TRA, TRADITION
DDR, TRADITION
TRADITION, TRADITION
BUM, MAIN
MAT, MAIN
MAIN, MAIN
EXP, HEDOS
HED, HEDOS
HEDOS, HEDOS"), sep = ",", header = TRUE, stringsAsFactors = FALSE, strip.white = TRUE)
mdf2 = data.frame(mdf2, mycols = mycols)
mdf3 = merge(mdf, mdf2, by.x = 'X', by.y = "V1")
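p1 = ggplot(mdf3, aes(x = variable, y = value, group = X, colour = mycols)) +
# the subset = .() argument was removed from ggplot2; filter each layer's data with a function instead
geom_line(data = function(d) subset(d, nchar(as.character(X)) == 3)) +
geom_line(data = function(d) subset(d, nchar(as.character(X)) != 3), size = 1.5) +
facet_wrap(~ V2) +
# identity scales draw no legend by default, so request one explicitly
scale_color_identity(name = 'Milieus', breaks = mdf2$mycols, labels = mdf2$V1, guide = "legend") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1))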
The trick is to reshape your data into tall format before you pass it to ggplot. This is easy when using the melt function in package reshape2:
Assuming your data is a variable called df:
library(reshape2)
library(ggplot2)
mdf <- melt(df, id.vars="X")
str(mdf)
ggplot(mdf, aes(x=variable, y=value, colour=X, group=X)) + geom_line() +
theme(axis.text.x = element_text(angle=90, hjust=1))
Edit: As @Chase suggests, you can use faceting to make the plot more readable:
ggplot(mdf, aes(x=X, y=value)) + geom_point() +
theme(axis.text.x = element_text(angle=90, hjust=1)) + facet_wrap(~variable)
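For readers without reshape2, the same wide-to-tall reshape can be sketched in base R with stack(). This swaps in a different function from the melt() used above; the data frame wide and its values are made up for illustration.

```r
# Hypothetical wide data: one id column X and two measurement columns
wide <- data.frame(X = c("A", "B"), v1 = c(1, 2), v2 = c(3, 4))

# stack() puts all measurement columns into one 'values' column
# and records the source column name in 'ind'
stacked <- stack(wide[-1])

# Repeat the id column once per measurement column, in the same order
long <- data.frame(
  X        = rep(wide$X, times = ncol(wide) - 1),
  variable = stacked$ind,
  value    = stacked$values
)
long
```

The result has one row per combination of X and variable, which is the shape that aes(x = variable, y = value, group = X) expects.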
First, melt the data to put it in a long format.
melted_data <- melt(the_data, id.vars = "X")
Now draw the plot with a numeric x axis, and fix up the labels.
p <- ggplot(melted_data, aes(as.numeric(variable), value, colour = X)) +
geom_line() +
scale_x_continuous(
breaks = seq_len(nlevels(melted_data$variable)),
labels = levels(melted_data$variable)
) +
theme(axis.text.x = element_text(angle = 90))
p
Having answered this, I'm not sure what the plot tells you; it's just a jumble of lines to me. You might be better off greying out most of the lines, and highlighting one or two interesting ones.
Add a column that picks out, e.g., EXP.
melted_data$is_EXP <- with(melted_data, X == "EXP")
Ignore my previous answer; Andrie's is better. Use manual colour and size scales to highlight your new column.
p <- ggplot(melted_data, aes(variable, value, colour = is_EXP, size = is_EXP, group = X)) +
geom_line() +
scale_colour_manual(values = c("grey80", "black")) +
scale_size_manual(values = c(0.5, 1.5)) +
theme(axis.text.x = element_text(angle = 90, hjust=1))
p