Is there a way to make data labels on a line graph position themselves automatically, above or below the line, depending on which series has the higher or lower value?
I have a sample dataset
d=data.frame(n=rep(c(1,1,1,1,1,1,2,2,2,3),2),group=rep(c("A","B"),each=20),stringsAsFactors = F)
And I want to draw two separate histograms based on group variable.
I tried the method suggested by @jenesaisquoi in a separate post here:
Generating Multiple Plots in ggplot by Factor
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)+facet_wrap(~group)
It did the trick but if you look closely, the proportions are wrong. It didn't calculate the proportion for each group but rather a grand proportion. I want the proportion to be 0.6 for number 1 for each group, not 0.3.
Then I tried the dplyr package, but it didn't even create two graphs: it ignored the group_by command. The proportion is right this time, though.
d%>%group_by(group)%>%ggplot(data=.)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)
Finally I tried factoring with color
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..),color=group),binwidth = 1)
But the result is far from ideal. I could accept a single plot, but with the bins side by side, not on top of each other.
In conclusion, I want to draw two separate histograms with correct proportions calculated within each group. If there is no easy way to do this, I can live with one graph but having the bins side by side, and with correct proportions for each group. In this example, number 1 should have 0.6 as its proportion.
Changing ..count../sum(..count..) to ..density.. gives you the desired proportion (with binwidth = 1, the density of a bin equals its proportion within the group):
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
You actually have the separation of charts by variable correct! Especially with ggplot, you sometimes need to consider the scales of the graph separately from its shape. facet_wrap splits your data into panels regardless of scale, and it behaves the same no matter what your axes are. You could also try adding scale_y_log10() as a layer: you'll notice that the overall shape and style of your graph stay the same, and only the axes change.
What you actually need is a fix to your scales. Understandable - frequency plots can be confusing. ..count../sum(..count..) treats each bin as an independent unit, regardless of its value. Here sum(..count..) is also taken over both panels together, which is why you get 0.3 instead of 0.6 for number 1. See a good explanation of this here: Show % instead of counts in charts of categorical variables
What you want is ..density.., which is basically the count divided by the total count and by the bin width, so with binwidth = 1 it equals the proportion. The difference is subtle in principle, but the important bit is that the value on the x-axis matters. For an extreme case of this, see here: Normalizing y-axis in histograms in R ggplot to proportion, where tiny x-axis values produced huge densities.
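As a quick sanity check of the density idea, here is a minimal base-R sketch (using hist(), not ggplot2 itself) that bins group A of the question's data with unit-width bins. The breaks at 0.5, 1.5, 2.5, 3.5 are my own assumption, chosen so that each bin is centred on an integer. With a bin width of 1, the density of each bin should match its proportion within the group.

```r
# Sample data from the question
d <- data.frame(n = rep(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3), 2),
                group = rep(c("A", "B"), each = 20),
                stringsAsFactors = FALSE)

# Values of n for group A only
a <- d$n[d$group == "A"]

# Unit-width bins centred on 1, 2, 3 (assumed breaks); no plot is drawn
h <- hist(a, breaks = seq(0.5, 3.5, by = 1), plot = FALSE)

dens_a <- h$density                  # counts / (number of values * bin width)
prop_a <- h$counts / sum(h$counts)   # plain within-group proportion

print(dens_a)
print(prop_a)
```

If the bin width were not 1, the two vectors would differ by a factor of the bin width, which is the caveat about x-axis values mentioned above.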
Your original code will still work, just substituting the aesthetics I described above.
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
If you're still confused about density, so are lots of people. Hadley Wickham wrote a long piece about it, which you can find here: http://vita.had.co.nz/papers/density-estimation.pdf
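For the fallback the question mentions (one graph with the bins side by side), a possible sketch is to compute the within-group proportions first and draw them as grouped bars. Note that this swaps in base R's table(), prop.table() and barplot() rather than geom_histogram, and it treats n as a discrete variable, which is an assumption that fits the sample data but may not fit other data.

```r
# Sample data from the question
d <- data.frame(n = rep(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3), 2),
                group = rep(c("A", "B"), each = 20),
                stringsAsFactors = FALSE)

# Rows are groups, columns are values of n;
# margin = 1 makes each row (group) sum to 1
props <- prop.table(table(d$group, d$n), margin = 1)
print(props)

# beside = TRUE draws the groups next to each other instead of stacked
barplot(props, beside = TRUE,
        legend.text = rownames(props),
        xlab = "n", ylab = "Proportion within group")
```

Because the proportions are computed before plotting, they do not depend on how the plotting function normalises counts.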
I am plotting a scatter plot in R, but I have many data points and they overlap. I want a plot where there are no overlaps while keeping the points at a reasonable size. This is the image of the scatter. The data are clustered at the top.
This is the code: plot(data2,col="red",pch=21,cex=0.7)
Could you post a reproducible example so we can check?
One thing you could try is increasing the spacing between the y-axis intervals.
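Without a reproducible example it is hard to be specific, but one other common option for dense scatter plots is to draw semi-transparent filled points, so that crowded regions appear darker. This is only a sketch: the data2 below is simulated and is a hypothetical stand-in for the asker's data, and the alpha value of 0.2 is an arbitrary choice.

```r
set.seed(1)

# Hypothetical stand-in for the asker's data2 (simulated, not the real data)
data2 <- data.frame(x = rnorm(5000), y = rexp(5000))

# Same red as in the question, but mostly transparent
pt_col <- adjustcolor("red", alpha.f = 0.2)

# pch = 16 is a filled circle, so the transparency applies to the whole point
plot(data2, col = pt_col, pch = 16, cex = 0.7)
```

This does not remove the overlaps, it only makes them readable. Whether the graphics device supports transparency depends on the device in use.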
I am trying to include two plots in a single window using mfrow in par().
Something like this:
x=1:10
y=seq(10,100,10)
z=seq(100,1000,100)
par(mfrow=c(2,1))
par(mar=c(0.2,4.1,4.1,2.1))
plot(x,y,xaxt="n",xlab=NA)
par(mar=c(4.1,4.1,0.2,2.1))
plot(x,z)
which gives
Now let's say I want to make the second plot half the height of the first plot. How can I do it? I tried many par() parameters, but I couldn't find a way.
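One possible approach, sketched here rather than taken from the thread, is to replace par(mfrow=...) with layout(), whose heights argument sets the relative heights of the rows. The 2:1 ratio below applies to the figure regions including their margins, so the plotting areas themselves will not be in an exact 2:1 ratio.

```r
x <- 1:10
y <- seq(10, 100, 10)
z <- seq(100, 1000, 100)

# Two rows, one column; the first row is twice as tall as the second.
# layout() returns the number of figures it has set up.
n_fig <- layout(matrix(c(1, 2), nrow = 2), heights = c(2, 1))

# Top plot: small bottom margin, no x axis
par(mar = c(0.2, 4.1, 4.1, 2.1))
plot(x, y, xaxt = "n", xlab = NA)

# Bottom plot: small top margin
par(mar = c(4.1, 4.1, 0.2, 2.1))
plot(x, z)
```

layout() should not be combined with par(mfrow=...) in the same device, since each call resets the arrangement set by the other.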
I have a plot where all the lines lie in the negative range. When I plot it, the y axis comes out reversed: values that are more negative go up instead of down, unlike in a normal plot. Currently I have the following plot:
But I want the y axis the other way round, so that more negative values go down, not up. I hope it is clear what I mean.
How can I achieve this in R?
My code with my specific data is just using the normal plot() function.
As Ben Bolker pointed out, the problem was this:
I set the ylim range the wrong way round. I set it like
ylim=c(-0.05,-1)
but
ylim=c(-1,-0.05)
should do what I want!
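To illustrate the effect of the order of the ylim values, here is a small sketch with made-up negative data (the x and y vectors are hypothetical, not the original data). After each plot, par("usr") holds the axis limits actually used, so the direction of the y axis can be checked.

```r
# Hypothetical negative-valued data
x <- 1:10
y <- -seq(0.1, 1, by = 0.1)

# ylim given as c(max, min): the axis is drawn reversed
plot(x, y, type = "l", ylim = c(-0.05, -1))
usr_reversed <- par("usr")   # c(x1, x2, y1, y2) of the plot region

# ylim given as c(min, max): more negative values are at the bottom
plot(x, y, type = "l", ylim = c(-1, -0.05))
usr_normal <- par("usr")

# TRUE when the bottom of the axis is larger than the top
print(usr_reversed[3] > usr_reversed[4])
print(usr_normal[3] > usr_normal[4])
```

Using ylim = range(y) is one way to avoid the mistake, since range() always returns the minimum first.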