Selecting only some facets to print in facet_wrap, ggplot2 - r
My question is very simple, but I have failed to solve it after many attempts. I just want to print some facets of a faceted plot (made with facet_wrap in ggplot2) and remove the ones I am not interested in.
I have a facet_wrap plot in ggplot2, as follows:
#anomalies linear trends
an.trends <- ggplot()+
geom_smooth(method="lm", data=tndvilong.anomalies, aes(x=year, y=NDVIan, colour=TenureZone,
group=TenureZone))+
scale_color_manual(values=miscol) +
ggtitle("anomalies' trends")
#anomalies linear trends by VEG
an.trendsVEG <- an.trends + facet_wrap(~VEG,ncol=2)
print(an.trendsVEG)
And I get the plot I expected (you can see it in the link below):
anomalies' trends by VEG
The question is: how do I print only the facets I am interested in?
I only want to print "CenKal_ShWoodl", "HlShl_ShDens", "NKal_ShWoodl", and "ThShl_ShDens"
Thanks
I suggest the easiest way to do that is to simply give ggplot() an appropriate subset. In this case:
facets <- c("CenKal_ShWoodl", "HlShl_ShDens", "NKal_ShWoodl", "ThShl_ShDens")
an.trends.sub <- ggplot(tndvilong.anomalies[tndvilong.anomalies$VEG %in% facets,])+
geom_smooth(method="lm", aes(x=year, y=NDVIan, colour=TenureZone,
group=TenureZone))+
scale_color_manual(values=miscol) +
ggtitle("anomalies' trends") +
facet_wrap(~VEG,ncol=2)
Obviously without your data I can't be sure this will give you what you want, but based on your description, it should work. I find that with ggplot, it is generally best to pass it the data you want plotted, rather than finding ways of changing the plot itself.
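Since the original data is not available, here is a minimal sketch of the same idea on the `mpg` data set that ships with ggplot2. The choice of `mpg`, the `class` column, and the names `keep` and `mpg_sub` are illustrative assumptions, not part of the question.

```r
library(ggplot2)

# Illustrative stand-in for the asker's data: ggplot2's built-in mpg data set.
keep <- c("compact", "suv")            # facets to keep (assumed example values)
mpg_sub <- mpg[mpg$class %in% keep, ]  # subset the rows BEFORE plotting

p <- ggplot(mpg_sub, aes(x = displ, y = hwy)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(~class, ncol = 2)         # only classes present in mpg_sub get a panel
print(p)
```

If the faceting column is a factor, the subset still carries the unused levels, but `facet_wrap()` drops them by default (`drop = TRUE`), so no empty panels should appear.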
Related
Creating a multi-panel plot of a data set grouped by two grouping variables in R
I'm trying to solve the following exercise: Make a scatter plot of the relationship between the variables 'K1' and 'K2' with "faceting" based on the parameters 'diam' and 'na' (subdivide the canvas by these two variables). Finally, assign different colors to the points depending on the 'thickness' of the ring (don't forget to convert it to a factor first). The graph should be similar to this one ("grosor" stands for "thickness"):
Now, the last code I tried is the following (the dataset is called "qerat"):
ggplot(qerat, aes(K1,K2, fill=factor(grosor))) + geom_point() + facet_wrap(vars(diam,na))
Could somebody give me a hand by pointing out where the mistake is? Many thanks in advance!
Maybe you are looking for a facet_grid() approach. Here is the code using data similar to yours:
library(ggplot2)
#Data
data("diamonds")
#Plot
ggplot(diamonds,aes(x=carat,y=price,color=factor(cut)))+
  geom_point()+
  facet_grid(color~clarity)
Output:
In the case of your code, as no data is present, I would suggest the following changes (note color instead of fill, since the default point shape ignores the fill aesthetic):
#Code
ggplot(qerat, aes(K1,K2, color=factor(grosor)))+
  geom_point() +
  facet_grid(diam~na)
Creating nice overlayed histogram in R with ggplot
I'm hoping to get some help making the following histogram look as clear and understandable as possible. I am plotting the salaries of immigrant versus US-born workers. I am wondering:
1. How would you modify colors, axis intervals, etc. to make the graph clearer and more appealing?
2. How could I add a key to indicate that purple is for US-born workers and pink is for foreign-born workers?
3. How can I add two different lines to indicate the median of each group, with a corresponding label for each?
My current code is set up like this:
ggplot(NHIS1, aes(x=adj_SALARY, y=..density..)) +
  geom_histogram(data=subset(NHIS1,IMMIGRANT=='0'), alpha=.5, binwidth=800, fill="purple", position="identity") +
  xlim(4430.4,50000) +
  geom_vline(xintercept=median(NHIS1$adj_SALARY), col="black", linetype="dashed") +
  geom_histogram(data=subset(NHIS1,IMMIGRANT=='1'), alpha=.5, binwidth=800, fill="red") +
  xlim(4430.4,50000) +
  geom_vline(xintercept=median(NHIS1$adj_SALARY), col="black", linetype="dashed")
And my final histogram at the moment looks like this:
If you have two variables, one for income and one for immigrant status, you do not need to plot two histograms; one will suffice if you specify the grouping. I'd also suggest adding density lines, which help smooth over the histogram's bumps. Assuming this is roughly like your data:
df <- data.frame(income = sample(1000:5000, 1000),
                 born = sample(c("US", "Foreign"), 1000, replace = TRUE))
Then a crude way to plot one histogram as well as density lines for the two groups would be this:
ggplot(df, aes(x=income, color=born, fill=born)) +
  geom_histogram(aes(y=..density..), alpha=0.5, binwidth=100, position="identity") +
  geom_density(alpha=.2)
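The question also asks for one median line per group, which the grouped histogram does not cover. One possible sketch, using the same kind of simulated data: compute the medians into a small data frame and hand it to geom_vline(). The name `meds` and the `set.seed()` call are assumptions added for illustration, and `after_stat()` needs ggplot2 3.3.0 or later (older versions would use `..density..`).

```r
library(ggplot2)

set.seed(1)  # only to make the simulated data reproducible
df <- data.frame(income = sample(1000:5000, 1000),
                 born   = sample(c("US", "Foreign"), 1000, replace = TRUE))

# One median per group, computed with base R (columns: born, income)
meds <- aggregate(income ~ born, data = df, FUN = median)

p <- ggplot(df, aes(x = income, colour = born, fill = born)) +
  geom_histogram(aes(y = after_stat(density)), alpha = 0.5,
                 binwidth = 100, position = "identity") +
  # one dashed line per row of meds, coloured like its group
  geom_vline(data = meds, aes(xintercept = income, colour = born),
             linetype = "dashed")
print(p)
```

Because the lines share the `colour` mapping with the histogram, the legend doubles as the key for both the bars and the median lines.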
This question has been asked before: overlaying-histograms-with-ggplot2-in-r discusses several options with many examples. You should definitely take a look at it. Another option to compare the distributions could be violin plots using geom_violin(). I see violin plots as the better option when you need to compare distributions because they give you more flexibility and are still clearer. But that may be just me. Refer to the examples in the manual.
R - Bar Plot with transparency based on values?
I have a dataset myData which contains x and y values for various Samples. I can create a line plot for a dataset which contains a few Samples with the following pseudocode, and it is a good way to represent this data: myData <- data.frame(x = 290:450, X52241 = c(..., ..., ...), X75123 = c(..., ..., ...)) myData <- myData %>% gather(Sample, y, -x) ggplot(myData, aes(x, y)) + geom_line(aes(color=Sample)) Which generates: This turns into a Spaghetti Plot when I have a lot more Samples added, which makes the information hard to understand, so I want to represent the "hills" of each sample in another way. Preferably, I would like to represent the data as a series of stacked bars, one for each myData$Sample, with transparency inversely related to what is in myData$y. I've tried to represent that data in photoshop (badly) here: Is there a way to do this? Creating faceted plots using facet_wrap() or facet_grid() doesn't give me what I want (far too many Samples). I would also be open to stacked ridgeline plots using ggridges, but I am not understanding how I would be able to convert absolute values to a stat(density) value needed to plot those. Any suggestions?
Thanks to u/Joris for the helpful suggestion! Since I did not find this question elsewhere, I'll go ahead and post the pretty simple solution here for others to find. Basically, I needed to apply the alpha aesthetic via aes(alpha=y, ...). In theory, I could apply this to any geom. I tried geom_col(), which worked, but the best solution was to use geom_segment(), since all my "bars" were going to be the same length. Also note that I had to "slice" up the segments in order to avoid overplotting problems similar to those found here, here, and here.
ggplot(myData, aes(x, Sample)) +
  geom_segment(aes(x=x, xend=x-1, y=Sample, yend=Sample, alpha=y), color='blue3', size=14)
That gives us the nice gradient:
Since the max y values are not the same for both lines, to "match" the intensity I normalized the data (myDataNorm) and could then make the same plot.
In my particular case, I kind of preferred bars that did not have a gradient, but which showed a hard edge for the maximum values of y. Here was one solution:
ggplot(myDataNorm, aes(x, Sample)) +
  geom_segment(aes(x=x, xend=x-1, y=Sample, yend=Sample, alpha=ifelse(y>0.9,1,0))) +
  theme(legend.position='none')
Better, but I did not like the faint-colored areas that were left. The final code is what gave me something that perfectly captured what I was looking for. I simply moved the ifelse() statement to apply to the x aesthetic, so the only parts of the segment drawn were those with high enough y values. Note that my data "starts" at x=290 here. There are probably more elegant ways to combine those x and xend terms, but whatever:
ggplot(myDataNorm, aes(x, Sample)) +
  geom_segment(aes(
    x=ifelse(y>0.9,x,290),
    xend=ifelse(y>0.9,x-1,290),
    y=Sample, yend=Sample), color='blue3', size=14) +
  xlim(290,400) # needed to show entire scale
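The answer does not show how myDataNorm was built. One plausible way, sketched here as an assumption, is to divide each Sample's y values by that Sample's maximum so that every Sample peaks at 1. The toy values below are made up purely for illustration.

```r
# Hypothetical stand-in for the long-format myData (columns x, Sample, y)
myData <- data.frame(
  x      = rep(290:294, times = 2),
  Sample = rep(c("X52241", "X75123"), each = 5),
  y      = c(1, 4, 8, 4, 1,   2, 5, 20, 5, 2)
)

# Scale y within each Sample so that every Sample peaks at 1
# (an assumed normalisation; the answer does not say which one was used)
myDataNorm <- myData
myDataNorm$y <- ave(myData$y, myData$Sample, FUN = function(v) v / max(v))

# Maximum per Sample after scaling
tapply(myDataNorm$y, myDataNorm$Sample, max)
```

With this scaling, a threshold such as `y > 0.9` means "within 10% of that Sample's own peak", which is what makes the hard-edged bars comparable across Samples.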
Combining output from smatr with ggplot2
I have a dataset of leaf trait measurements made at multiple sites at two contrasting seasons. I am interested to explore the association/line fit between a pair of traits and to differentiate the seasons at each site. Rather than a linear regression, I would prefer to use the Standardised Major Axis approach within the smatr package: e.g. sma.site1 <- sma(TraitA ~ TraitB * Visit, data=subset(myfile, Site=="Site1")) # testing the null hypothesis of common slopes for the two Visits (Seasons) at a given Site. I can produce a handy lattice plot in ggplot2 with a separate panel for each Site and the points differentiated by Visit: e.g. qplot(TraitB, TraitA, data=myfile, colour=Visit) + facet_wrap(~Site, ncol=2) However, if I add trend lines fitted with the additional argument in ggplot2: + geom_smooth(aes(group=Visit), method="lm", se=F) ……, those lines are not a good match for the sma coefficients. What I would like to do is fit the lines suggested by the sma test onto the ggplot lattice. Is there an easy, or efficient, way to do that? I know that I can subset the data, produce a plot for each site, add the relevant lines with + geom_abline() and then stitch the separate plots up together with grid.arrange(). But that feels very long-winded. I would be grateful for any pointers.
I don't know anything about the smatr package, but you should be able to tweak this to get the right values. Since you provided no data, I used the leaf data from the example in the pkg. The basic idea is to pull out the slope & intercept from the returned sma object and then facet the geom_abline. I may be misinterpreting the object, though.
library(smatr)
library(ggplot2)
data(leaflife)
do.call(rbind, lapply(unique(leaflife$site), function(x) {
  obj <- sma(longev~lma*rain, data=subset(leaflife, site=x))
  data.frame(site=x,
             intercept=obj$coef[[1]][1, 1],
             slope=obj$coef[[1]][2, 1])
})) -> fits
gg <- ggplot(leaflife)
gg <- gg + geom_point(aes(x=lma, y=longev, color=soilp))
gg <- gg + geom_abline(data=fits, aes(slope=slope, intercept=intercept))
gg <- gg + facet_wrap(~site, ncol=2)
gg
I just saw this question and am not sure if you are still interested in it. I ran the code by hrbrmstr and found that the only thing you actually need to change is the subset condition, which must use == rather than =:
obj <- sma(longev~lma*rain, data=subset(leaflife, site == x))
Then you get the plot with the lines fitted separately for each site.
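Why the `=` versus `==` difference matters can be seen with base R alone. In this small sketch (the data frame `sites` and its values are made up for illustration), `site = x` is treated as a named argument that subset() silently ignores, so no filtering happens at all.

```r
# Toy data, purely illustrative
sites <- data.frame(site = c("a", "a", "b"), value = 1:3)
x <- "a"

all_rows <- subset(sites, site = x)   # named argument, ignored: every row is kept
only_a   <- subset(sites, site == x)  # logical condition: keeps rows where site equals x

nrow(all_rows)
nrow(only_a)
```

In the smatr code this means every call to sma() was fitted on the full data set, so each site received the same coefficients.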
Create Lollipop-like plot with R
I have a .csv file that looks like this:
Pos,ReadsME_016,ReadsME_017,ReadsME_018,ReadsME_019,ReadsME_020,ReadsME_021
95952794,62.36,62.06,55.56,51,60.35,44.27
95952795,100,100,100,100,100,100
95952833,0,0,-,0,-,-
95952846,0,0,-,0,0,-
95952876,0,-,0,0,0,0
95952877,38.89,28.98,25.67,36.99,37.91,16.86
95952878,100,100,100,100,100,100
95952884,0,-,0,-,-,0
95952897,18.7,20.52,20.94,16.43,22.68,12.55
95952898,100,100,75,80,-,100
95952902,10.88,8.93,10.22,10.63,13.51,6.06
95952903,100,100,100,75,-,100
95952915,10.75,8.7,7.91,8.35,15.12,8.88
What I want is to create a plot that is similar to this one: http://www.scfbm.org/content/9/1/11/figure/F2 However, all my attempts have failed. Unfortunately, the tool is not yet available and I cannot read the source code. I've thought of ggplot and melt, but I have not come close to this graph. How can I get all read samples (ReadsME_016, ReadsME_017, ...) listed on the x-axis and the positions listed on the y-axis? I don't know how to deal with both the x- and y-axes being categorical while the plotted values should show percentages.
dataset <- melt(dataset, id.vars="Pos") ggplot(dataset, aes(x=value, y=Pos, colour=variable)) + geom_point() Here is the complete .csv file: Pos,ReadsME_016,ReadsME_017,ReadsME_018,ReadsME_019,ReadsME_020,ReadsME_021,ReadsME_022,ReadsME_023,ReadsME_024,ReadsME_025,ReadsME_026,ReadsME_027,ReadsME_028,ReadsME_030,ReadsME_031,ReadsME_032 95952794,62.36,62.06,55.56,51.0,60.35,44.27,53.73,61.69,57.04,64.16,61.48,59.42,66.93,49.71,55.23,66.67 95952795,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,-,100.0,100.0,100.0,100.0,- 95952833,0.0,0.0,-,0.0,-,-,100.0,-,-,-,-,0.0,-,-,0.0,- 95952846,0.0,0.0,-,0.0,0.0,-,0.0,0.0,-,-,-,0.0,-,-,-,- 95952876,0.0,-,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,- 95952877,38.89,28.98,25.67,36.99,37.91,16.86,29.65,35.38,35.43,36.87,34.04,33.91,35.04,19.09,38.35,0.0 95952878,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,100.0,-,100.0,100.0,100.0,100.0,- 95952884,0.0,-,0.0,-,-,0.0,-,-,100.0,-,-,0.0,-,-,-,- 95952897,18.7,20.52,20.94,16.43,22.68,12.55,18.3,22.28,21.05,22.55,24.81,20.63,22.05,13.06,22.8,0.0 95952898,100.0,100.0,75.0,80.0,-,100.0,80.0,100.0,100.0,-,-,-,100.0,-,100.0,- 95952902,10.88,8.93,10.22,10.63,13.51,6.06,9.62,15.73,14.08,18.65,13.28,16.44,15.02,8.92,11.11,100.0 95952903,100.0,100.0,100.0,75.0,-,100.0,100.0,100.0,100.0,-,-,100.0,100.0,100.0,100.0,- 95952915,10.75,8.7,7.91,8.35,15.12,8.88,7.32,9.76,11.45,8.99,10.57,14.07,10.36,6.35,10.04,0.0 95952916,100.0,100.0,100.0,100.0,-,100.0,100.0,100.0,100.0,-,-,100.0,100.0,-,100.0,- 95952925,10.39,8.33,8.59,10.51,14.19,10.99,6.98,11.56,13.93,15.0,14.29,16.26,9.76,5.86,12.96,0.0 95952926,100.0,100.0,100.0,100.0,-,100.0,100.0,100.0,100.0,-,-,-,100.0,-,100.0,- 95952937,19.53,14.97,11.97,14.43,19.26,17.18,19.48,12.31,21.17,21.57,23.08,26.24,16.38,13.47,21.82,0.0 95952938,100.0,100.0,100.0,100.0,-,100.0,100.0,-,-,-,-,-,-,-,100.0,- 95952825,-,0.0,-,-,-,-,-,-,-,-,0.0,-,-,0.0,0.0,- 95952975,-,0.0,-,-,-,-,-,-,0.0,-,-,-,-,-,-,- 
95952669,-,-,0.0,-,-,0.0,0.0,-,-,-,-,-,-,-,0.0,- 95952718,-,-,0.0,0.0,0.0,-,0.0,-,-,-,0.0,-,-,0.0,0.0,- 95952868,-,-,0.0,-,0.0,-,-,0.0,-,-,0.0,-,-,-,-,- 95952957,-,-,0.0,-,-,-,-,0.0,0.0,0.0,-,0.0,-,-,-,- 95952976,-,-,0.0,-,0.0,0.0,0.0,100.0,-,0.0,-,-,-,-,0.0,- 95952681,-,-,-,0.0,-,0.0,-,0.0,-,-,-,-,-,0.0,-,- 95952779,-,-,-,0.0,-,-,-,-,-,-,-,-,-,-,-,- 95952811,-,-,-,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-,-,-,0.0,- 95952821,-,-,-,0.0,-,-,-,-,-,-,-,-,-,-,-,- 95952823,-,-,-,0.0,-,-,-,-,-,-,-,-,-,-,-,- 95952859,-,-,-,0.0,0.0,-,-,0.0,0.0,-,0.0,-,-,0.0,0.0,- 95952882,-,-,-,0.0,-,-,-,-,-,-,0.0,-,-,-,-,- 95953023,-,-,-,0.0,-,0.0,-,-,-,-,-,-,-,-,-,- 95953058,-,-,-,0.0,-,0.0,-,-,-,-,-,-,-,-,-,- 95952664,-,-,-,-,-,0.0,0.0,-,-,0.0,-,-,-,-,0.0,- 95952801,-,-,-,-,-,0.0,-,-,-,-,-,-,-,-,-,- 95952968,-,-,-,-,-,-,0.0,-,-,0.0,-,-,-,-,-,- 95952797,-,-,-,-,-,-,-,-,0.0,-,-,-,-,-,-,- 95952851,-,-,-,-,-,-,-,-,-,-,0.0,-,-,-,-,- 95952894,-,-,-,-,-,-,-,-,-,-,0.0,-,-,-,-,- 95952807,-,-,-,-,-,-,-,-,-,-,-,-,-,0.0,-,- 95952712,-,-,-,-,-,-,-,-,-,-,-,-,-,-,0.0,-
First, you want to make sure you are reading in your data properly. You have non-numeric values (specifically "-") mixed in with numeric values. I'm assuming those are missing values. Make sure you let R know that with na.strings="-". Then, to get something more consistent with the example plot, I swapped your variables around so that Pos is on the x-axis, the sample is on the y-axis, and the binned value sets the colour:
library(reshape2) # for melt()
library(ggplot2) # for ggplot()
dataset <- read.table("file.txt", header=TRUE, sep=",", na.strings="-")
ggplot(melt(dataset, id.vars="Pos"),
       aes(x=Pos, y=variable, colour=cut(value, breaks=5))) +
  geom_point()
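To see what na.strings="-" changes, here is a small base R sketch that reads a four-row excerpt of the posted data from a string instead of a file. The excerpt, and the names `csv_text`, `with_na`, `without_na` and `bins`, are only for illustration.

```r
# Excerpt of the posted file: first three sample columns, four rows
csv_text <- "Pos,ReadsME_016,ReadsME_017,ReadsME_018
95952794,62.36,62.06,55.56
95952795,100,100,100
95952833,0,0,-
95952876,0,-,0"

with_na    <- read.table(text = csv_text, header = TRUE, sep = ",", na.strings = "-")
without_na <- read.table(text = csv_text, header = TRUE, sep = ",")

sapply(with_na, class)     # sample columns are numeric; "-" became NA
sapply(without_na, class)  # columns containing "-" are not numeric

# Bin the percentages into 5 equal-width intervals, as in the plot call above
bins <- cut(with_na$ReadsME_018, breaks = 5)
```

Without na.strings, any column containing "-" is read as text (character, or factor in R versions before 4.0), and cut() would then fail on it because it needs numeric input.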