I have used facet_wrap to build my plot, as in the picture below:
After some consideration, I used scale_y_log10 to transform my data (4 panels in the plot).
However, I realized that I only need the transformation for the first panel and not the others, since their values are very close to 0.
My idea is to separate out the values I would like to transform and combine them again later. I am wondering whether I can use scale_y_log10 for a single panel created by facet_wrap, so that I don't have to split my dataset.
Thanks
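One possible workaround, sketched here under assumptions: a y scale in ggplot2 applies to every panel, so instead of scale_y_log10 you can transform the values of the one panel in place and let each panel have its own y range with scales = "free_y". The data frame df, its columns, and the panel names are hypothetical, not taken from the question.

```r
library(ggplot2)

set.seed(1)

# Hypothetical data: four panels, only panel "a" spans several orders of magnitude.
df <- data.frame(
  panel = rep(c("a", "b", "c", "d"), each = 5),
  x     = rep(1:5, times = 4),
  value = c(1, 10, 100, 1000, 10000, runif(15, min = 0.001, max = 0.1))
)

# Transform only the rows of the chosen panel; the dataset stays in one piece.
is_log <- df$panel == "a"
df$plot_value  <- ifelse(is_log, log10(df$value), df$value)
df$panel_label <- ifelse(is_log, paste0(df$panel, " (log10)"), df$panel)

# Free y scales so the untransformed panels keep their own range.
p <- ggplot(df, aes(x = x, y = plot_value)) +
  geom_point() +
  facet_wrap(~panel_label, scales = "free_y") +
  ylab("value (log10 in panel a)")

# print(p) to draw the plot
```

Note that the tick labels of the transformed panel are then in log10 units rather than in the original units, which is why the panel label says so.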
Related
I have this problem where R will auto-adjust the size of the facets in ggplot. In the 2 attached images, the one scaled from 0-100 on the y-axis is clearly less stretched out than the one scaled from 6.6-7.2. These are plotted using the same ggplot commands from mapply, so I don't know where the difference would come from. Is there any way to prevent R from auto-adjusting, so that the formatting of each ggplot stays the same? My OCD and I thank you.
It looks like I made a copy-and-paste error: I used the wrong variable to set base_height in save_plot within mapply, so the scaling factor varied across iterations.
I have a sample dataset
d=data.frame(n=rep(c(1,1,1,1,1,1,2,2,2,3),2),group=rep(c("A","B"),each=20),stringsAsFactors = F)
And I want to draw two separate histograms based on the group variable.
I tried this method, suggested by @jenesaisquoi in a separate post here:
Generating Multiple Plots in ggplot by Factor
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)+facet_wrap(~group)
It did the trick, but if you look closely, the proportions are wrong. It didn't calculate the proportion within each group but rather a grand proportion. I want the proportion to be 0.6 for number 1 in each group, not 0.3.
Then I tried the dplyr package, and it didn't even create two graphs: it ignored the group_by command. The proportion is right this time, though.
d%>%group_by(group)%>%ggplot(data=.)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)
Finally I tried factoring with color
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..),color=group),binwidth = 1)
But the result is far from ideal. I would accept a single plot, but with the bins side by side, not on top of each other.
In conclusion, I want to draw two separate histograms with correct proportions calculated within each group. If there is no easy way to do this, I can live with one graph but having the bins side by side, and with correct proportions for each group. In this example, number 1 should have 0.6 as its proportion.
Changing ..count../sum(..count..) to ..density.. gives you the desired proportion:
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
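As a quick check of that claim, here is a small sketch that builds the faceted histogram and reads the computed bar heights back with layer_data(). It uses after_stat(density), the newer spelling of ..density.., and so assumes ggplot2 3.3.0 or later. The heights equal proportions here only because binwidth = 1; with another bin width, density is the proportion divided by the width.

```r
library(ggplot2)

# Sample data from the question: 20 rows per group.
d <- data.frame(n = rep(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3), 2),
                group = rep(c("A", "B"), each = 20),
                stringsAsFactors = FALSE)

p <- ggplot(d) +
  geom_histogram(aes(x = n, y = after_stat(density)), binwidth = 1) +
  facet_wrap(~group)

# layer_data() returns the computed bars, one row per bin and panel.
bars <- layer_data(p)

# Tallest bar in each panel (the bin containing n = 1).
max_density <- tapply(bars$density, bars$PANEL, max)
```

If this runs as intended, max_density holds the height of the tallest bar in each of the two panels, which is the value the question wanted for number 1.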
You actually have the separation of charts by variable correct! Especially with ggplot, you sometimes need to consider the scales of the graph separately from the shape. facet_wrap splits your data into panels, regardless of scale. It will behave the same no matter what your axes are. You could also try adding scale_y_log10() as a layer, and you'll notice that the overall shape and style of your graph stay the same; you've just changed the axes.
What you actually need is a fix to your scales. Understandable: frequency plots can be confusing. ..count../sum(..count..) divides each bin's count by the sum over all bins in the plot, not by the total of its own panel, which is why number 1 comes out as 0.3 instead of 0.6. See a good explanation of this here: Show % instead of counts in charts of categorical variables
What you want is ..density.., which is the count divided by the total count and by the bin width, computed separately for each panel. With binwidth = 1 that is simply the proportion. The difference is subtle in principle, but the important bit is that the scale of the x-axis matters. For an extreme case of this, see here: Normalizing y-axis in histograms in R ggplot to proportion, where tiny x-axis values produced huge densities.
Your original code will still work; just substitute the aesthetic I described above.
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
If you're still confused about density, so are lots of people. Hadley Wickham wrote a long piece about it, which you can find here: http://vita.had.co.nz/papers/density-estimation.pdf
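If you would rather not depend on the bin width at all, one alternative (a sketch, not the method from the answers above) is to compute the within-group proportions yourself and draw them as bars. The names counts and prop are made up for this example; the same table also gives the side-by-side version mentioned in the question.

```r
library(ggplot2)

# Sample data from the question.
d <- data.frame(n = rep(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3), 2),
                group = rep(c("A", "B"), each = 20),
                stringsAsFactors = FALSE)

# Count each value of n within each group.
counts <- as.data.frame(table(n = d$n, group = d$group),
                        stringsAsFactors = FALSE)

# Divide each count by the total of its own group.
counts$prop <- counts$Freq / ave(counts$Freq, counts$group, FUN = sum)

# Two separate panels, one per group.
p_facets <- ggplot(counts, aes(x = n, y = prop)) +
  geom_col() +
  facet_wrap(~group)

# One panel with the groups side by side instead of stacked.
p_dodged <- ggplot(counts, aes(x = n, y = prop, fill = group)) +
  geom_col(position = "dodge")
```

Because the proportions are computed before plotting, they do not change if the spacing of the x values changes.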
I am a novice at R and am experimenting with it as an alternative for data visualisation.
I am having trouble creating a stacked bar chart.
I have tried the reshape2 package with the melt function and have successfully produced one, but I had to explicitly create a dataset containing JUST the x-axis and variables that I want stacked.
It seems extremely counter-intuitive to me that we can't visualise data from a left to right sense (x-axis constant, y variables summed and overlapping).
Is there an alternate method, where I could simply perform a ggplot with the logic of:
ggplot(data=dataset, aes(x=Time, y1=var1, y2=var2, y3=var3.....)) +
geom_bar(stat="identity",position="stack")
where y1, y2, y3 are the variables I want stacked, but do not have corresponding flags for me to use a "fill=flag" type?
I basically want to work off one large master dataset and export multiple analyses without having to isolate and melt a separate dataset for each one.
In general, a stacked bar chart is used to distinguish between variations within a single category of data. For example, suppose you had a bar chart showing the populations of three species of migratory fowl that inhabit one specific marsh.
The bars might be mallard ducks, mute swans and Canada geese. Each would have a single whole bar.
The stacking comes in when you look at a trait or quality they share and that you are comparing, such as the number that migrate versus the number that overwinter locally. The population of each type of fowl would be split into two stacks within its bar: Canada geese that migrate, Canada geese that do not, and so on.
It is not really meant to bring together disparate traits into a stack.
So, if your data spreads categories of the same population across several columns, the right move is to reshape it: put the values you are counting in one column and the factor that distinguishes them in another single column.
If you need to keep the data in wide form for some reason, you can probably use something like y = x$var1 + x$var2 + x$var3 to create your stacks, but depending on the data that might fail miserably. The best thing to do is reshape so that the quality you are counting is in one column and you compare those members across some other column with stacks.
If you need to use the data in another format later, create a temporary table, plot it, and then remove() it and call gc() after graphing to get your memory back.
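A minimal sketch of that reshape step, done inside the plotting call so the master dataset itself is never modified. The helper to_long, the example dataset, and its column names are all hypothetical; reshape2::melt or a similar function would do the same job.

```r
library(ggplot2)

# Hypothetical master dataset in wide form.
dataset <- data.frame(
  Time  = 1:3,
  var1  = c(1, 2, 3),
  var2  = c(2, 2, 2),
  var3  = c(0.5, 1, 1.5),
  other = c("x", "y", "z")
)

# Hypothetical helper: stack the chosen columns into long form.
to_long <- function(data, id, vars) {
  data.frame(
    id_value = rep(data[[id]], times = length(vars)),
    variable = rep(vars, each = nrow(data)),
    value    = unlist(data[vars], use.names = FALSE)
  )
}

# Reshape on the fly; `dataset` stays wide for other analyses.
p <- ggplot(to_long(dataset, "Time", c("var1", "var2", "var3")),
            aes(x = id_value, y = value, fill = variable)) +
  geom_col(position = "stack") +
  xlab("Time")
```

Since the long table only exists as the argument to ggplot(), there is nothing left to remove() afterwards.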
I just ran into a problem while making several maps in R. I want to plot several maps with some geom_points on them. Each map has points with different values, so the legend and its scales (size and color) change between maps. All I want is to have exactly the same legend, representing the same values (for both color and size), on every map. I've tried using breaks etc., but my data is continuous, so I didn't find a way to fix it.
EDIT:Simple example
I will try to explain with a simple example. Imagine I have these two vectors to be plotted at different coordinates for 2 different days:
day1 <- c(1,2,3,2,1)
day2 <- c(1,9,2,1,2)
What I want is for the legend of the plot to always represent the range 1-9 for the geom_points, no matter the specific values on a given day, so that the legend is always the same and the scale does not change if I put the maps on slides.
Any ideas?
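One approach that may work, sketched under assumptions: give every plot the same limits (and breaks) on both the colour and the size scale, so the legend no longer depends on the values of a particular day. The coordinates, the helper plot_day, and the chosen breaks are invented for illustration; only the two value vectors come from the example above.

```r
library(ggplot2)

# Values from the example, with made-up coordinates.
day1 <- data.frame(lon = 1:5, lat = c(2, 4, 1, 3, 5), value = c(1, 2, 3, 2, 1))
day2 <- data.frame(lon = 1:5, lat = c(2, 4, 1, 3, 5), value = c(1, 9, 2, 1, 2))

# One range and one set of breaks shared by every plot.
shared_limits <- range(c(day1$value, day2$value))
shared_breaks <- c(1, 3, 5, 7, 9)

# Hypothetical helper: same scales regardless of the day's values.
plot_day <- function(day) {
  ggplot(day, aes(x = lon, y = lat, colour = value, size = value)) +
    geom_point() +
    scale_colour_gradient(limits = shared_limits, breaks = shared_breaks) +
    scale_size_continuous(limits = shared_limits, breaks = shared_breaks)
}

p1 <- plot_day(day1)
p2 <- plot_day(day2)
```

Both plots then show a legend covering 1 to 9, even though day1 never exceeds 3.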
I am attempting to plot lots of graphs on the fly and I chanced upon the facet_wrap functionality. It produced the desired results until I realised that it was not assigning individual axis titles. There was just a single X and Y axis title for the whole set of graphs. What I'm looking for is a way to assign individual axis titles to each graph.
Is this possible using the facet_wrap functionality at all?
Looking forward to any suggestions and advice.
EDIT:
(removed previous, incorrect, answer)
It is my understanding that if the axes of your plots are not the same (i.e. require different labels), the way to go would be with multiple separate plots (on the same page), and not with facet_wrap.
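A rough sketch of the separate-plots approach: build each plot with its own axis titles, collect them in a list, and arrange the list on one page. The data, the helper make_plot, and the labels are hypothetical. The arranging step here uses cowplot::plot_grid and is skipped if cowplot is not installed; other packages offer similar functions.

```r
library(ggplot2)

# Hypothetical data with variables that need different axis titles.
df <- data.frame(
  age_years = c(20, 30, 40, 50),
  height_cm = c(170, 175, 172, 168),
  weight_kg = c(65, 72, 80, 78)
)

# Hypothetical helper: one scatter plot with its own axis titles.
make_plot <- function(data, xvar, yvar, xlab, ylab) {
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point() +
    labs(x = xlab, y = ylab)
}

plots <- list(
  make_plot(df, "age_years", "height_cm", "Age (years)", "Height (cm)"),
  make_plot(df, "height_cm", "weight_kg", "Height (cm)", "Weight (kg)")
)

# Arrange on one page if cowplot is available.
if (requireNamespace("cowplot", quietly = TRUE)) {
  combined <- cowplot::plot_grid(plotlist = plots, ncol = 2)
}
```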