Missing scale on ggplot2 - r

I am creating a graph using ggplot2. Here is the first output of the graph before any tidying is done.
And here is the code:
graph <- ggplot(data = village.times,
  aes(x=village.times$a6ncopo, y=(village.times$a5species=="funestus"))) +
  geom_bar(stat="identity", position = "stack", fill="#FF4444")
What I don't know is why there isn't a scale on the y axis and how to remove the True-False labels. Is there a way I can force ggplot to include a scale on the y axis or do I have to change the way I use my data?

Maybe subset your data frame before calling ggplot and just create a histogram? Otherwise I don't know what your expected result should be...
ggplot(subset(village.times, a5species=="funestus"),
aes(x=a6ncopo)) +
geom_bar()
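If you would rather keep all rows, a minimal sketch of an alternative is to turn the logical test into 0/1 so that y is continuous and gets a numeric scale. The toy data frame below is hypothetical (the real village.times is not shown in the question), and only its two column names are taken from the question:

```r
library(ggplot2)

# Hypothetical stand-in for village.times; values are made up for illustration
village.times <- data.frame(
  a6ncopo   = c("V1", "V1", "V2", "V2", "V2", "V3"),
  a5species = c("funestus", "gambiae", "funestus", "funestus", "gambiae", "gambiae")
)

# as.numeric() converts TRUE/FALSE to 1/0, so the stacked bars add up to
# the number of "funestus" rows per x value and the y axis is numeric
p <- ggplot(village.times,
            aes(x = a6ncopo, y = as.numeric(a5species == "funestus"))) +
  geom_col(fill = "#FF4444") +
  ylab("funestus count")
p
```

With a logical y, ggplot2 treats the axis as discrete, which is where the TRUE/FALSE labels come from.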

Related

Determine order of several boxplots in one plot in R ggplot

I tried to create a relatively simple boxplot in R's ggplot2: one value on the x axis and several variables on the y axis. I'm using code similar to this:
ggplot() +
# Boxplot 1
geom_boxplot(df[which(df$Xvalue=="Boxplot1"),],
mapping = aes(X, "Y")) +
# Boxplot 2
geom_boxplot(df[which(df$Xvalue=="Boxplot2"),],
mapping = aes(X, "Y")) +
# Boxplot 3
geom_boxplot(df[which(df$Xvalue=="Boxplot3"),],
mapping = aes(X, "Y"))
The boxplots in my real code are ordered alphabetically; however, I need them to be in a customized, categorical order.
I'm aware I could restructure my data frame so that I don't use a subset and a new geom_boxplot command for each boxplot, but I've structured the data that way for other reasons and that's not the solution I'm looking for right now.
Maybe there is an easy way using scale_Y_manual or something similar? Any help is appreciated!
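One possible approach, sketched here under assumptions rather than taken from the thread: when every layer puts its category on a discrete x axis, the limits argument of scale_x_discrete() sets the display order without restructuring the data. The data frame and its values are hypothetical; only the Xvalue labels come from the question.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: three categories, 20 values each
df <- data.frame(
  Xvalue = rep(c("Boxplot1", "Boxplot2", "Boxplot3"), each = 20),
  Y      = rnorm(60)
)

p <- ggplot() +
  geom_boxplot(data = df[df$Xvalue == "Boxplot1", ], mapping = aes(Xvalue, Y)) +
  geom_boxplot(data = df[df$Xvalue == "Boxplot2", ], mapping = aes(Xvalue, Y)) +
  geom_boxplot(data = df[df$Xvalue == "Boxplot3", ], mapping = aes(Xvalue, Y)) +
  # limits fixes the left-to-right order of the discrete axis
  scale_x_discrete(limits = c("Boxplot3", "Boxplot1", "Boxplot2"))
p
```

The order of the geom_boxplot() calls does not matter here; only the limits vector decides where each box is drawn.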

Geom_area plot doesn't fill the area between the lines

I want to make an area plot with ggplot(mpg, aes(x=year,y=hwy, fill=manufacturer)) + geom_area(), but I get this:
I'm really new to the R world. Can anyone explain why it does not fill the area between the lines? Thanks!
First of all, there's nothing wrong with your code. It's working as intended and you are correct in the syntax required to do what you are looking to do.
Why doesn't the area geom plot correctly, then? The simple answer is that mpg does not contain one line per manufacturer: year takes only two values (1999 and 2008), and each manufacturer has several hwy values at each of them, so geom_area has no single y value per x to fill under. Try the geom_point plot and you'll see what I mean:
ggplot(mpg, aes(x=year,y=hwy)) + geom_point(aes(color=manufacturer))
You need a different dataset. Here's a dummy one that is simply two lines with different slopes. It works as expected because each group (z) has exactly one y value at every x value:
# dummy dataset
df <- data.frame(
x=rep(1:10,2),
y=c(seq(1,10,length.out=10), seq(1,5,length.out=10)),
z=c(rep('A',10), rep('B', 10))
)
# plot
ggplot(df, aes(x,y)) + geom_area(aes(fill=z))
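If you would rather keep mpg, one option is to reduce it to a single value per manufacturer and year before plotting. The sketch below uses the mean of hwy, which is an assumption about what the plot should show, not something stated in the question:

```r
library(ggplot2)

# One row per manufacturer/year combination; mean() is an illustrative choice
mpg_mean <- aggregate(hwy ~ manufacturer + year, data = mpg, FUN = mean)

p <- ggplot(mpg_mean, aes(x = year, y = hwy, fill = manufacturer)) +
  geom_area()
p
```

Because each group now has at most one y value per year, the stacked areas fill the space between the lines.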

Wrong density values in a histogram with `fill` option in `ggplot2`

I was creating histograms with ggplot2 in R whose bins are separated with colors and noticed one thing. When the bins of a histogram are separated by colors with fill option, the density value of the histogram turns funny.
Here is the data.
set.seed(42)
x <- rnorm(10000,0,1)
df <- data.frame(x=x, b=x>1)
This is a histogram without fill.
ggplot(df, aes(x = x)) +
geom_histogram(aes(y=..density..))
This is a histogram with fill.
ggplot(df, aes(x = x, fill=b)) +
geom_histogram(aes(y=..density..))
You can see the latter is pretty crazy. The left side of the bins is sticking out. The density values of the bins of each color are obviously wrong.
I thought over this issue for a while. The data can't be wrong, because the first histogram looked normal. It must be something in ggplot2 or the geom_histogram function. I googled "geom_histogram density fill" and couldn't find much help.
I want the end product to look like:
Separated by colors as you see in the second histogram
Size and shape identical to the first histogram
The vertical axis being density
How would you deal with this issue?
I think what you may want is this:
ggplot(df, aes(x = x, fill=b)) +
geom_histogram()
Rather than the density. As mentioned above, the density is asking for extra calculations.
One thing that is important (in my opinion) is that histograms are graphs of one variable. As soon as you start adding data from other variables you start to change them more into bar charts or something else like that.
You will want to work on setting the axis manually if you want it to range from 0 to .4.
The solution is to hand-compute density like this (instead of using the built-in ggplot2 version):
library(ggplot2)
# Generate test data
set.seed(42)
x <- rnorm(10000,0,1)
df <- data.frame(x=x, b=x>1)
ggplot(df, aes(x = x, fill=b)) +
geom_histogram(mapping = aes(y = ..count.. / (sum(..count..) * ..width..)))
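The same hand-computed density can be written with after_stat(), which newer ggplot2 releases (3.3.0 and later) provide in place of the ..name.. notation. This is a sketch of the same idea, with bins = 30 and the axis label added as assumptions:

```r
library(ggplot2)

set.seed(42)
x <- rnorm(10000, 0, 1)
df <- data.frame(x = x, b = x > 1)

# count and width are per-bin statistics from stat_bin; sum(count) runs over
# both fill groups, so the stacked bars integrate to 1 over the whole plot
p <- ggplot(df, aes(x = x, fill = b)) +
  geom_histogram(aes(y = after_stat(count / (sum(count) * width))), bins = 30) +
  ylab("density")
p
```

The built-in density statistic normalizes within each fill group, which is why the TRUE group looked inflated; dividing by the total count keeps the overall shape of the unfilled histogram.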
When you provide a column name for the fill parameter in ggplot, it groups the observations by that column and plots each group in its own color. The density is then computed separately within each group, so each color integrates to 1 on its own.
If you want a single color for the plot, just specify the color you want:
ggplot(df, aes(x = x)) +
geom_histogram(aes(y=..density..),fill="Blue")

Drawing flipped Normal distribution in R without using coord_flip()

Good day
Without using coord_flip(), is there a way to draw a flipped normal distribution by exchanging the positions of x and y in aes()?
I've tried the following.
df3 <- data.frame(x=seq(-6,6,by=0.1),y=dnorm(seq(-6,6,by=0.1)))
ggplot(df3,aes(y,x))+ geom_line() # x,y position exchanged
I'm not sure what's wrong with coord_flip, but you can avoid it with geom_path. geom_path connects the points in the order they appear in the data, rather than in order of the magnitude of the x-value. So you just need to make sure the data are ordered by y-axis value (which they already are here).
ggplot(df3, aes(y,x)) +
geom_path() +
theme_classic()
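Another option, assuming ggplot2 3.3.0 or later, is the orientation argument of geom_line, which tells it to sort and connect the points along the y axis instead of the x axis. A minimal sketch with the same data:

```r
library(ggplot2)

x <- seq(-6, 6, by = 0.1)
df3 <- data.frame(x = x, y = dnorm(x))

# aes(y, x) puts the density on the horizontal axis;
# orientation = "y" makes geom_line order the points by the vertical axis
p <- ggplot(df3, aes(y, x)) +
  geom_line(orientation = "y") +
  theme_classic()
p
```

Unlike geom_path, this does not depend on the row order of the data frame.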

Tiny pie charts to represent each point in a scatterplot using ggplot2

I want to create a scatter plot, in which each point is a tiny pie chart. For instance consider following data:
foo <- data.frame(X=runif(30), Y=runif(30),A=runif(30),B=runif(30),C=runif(30))
The following code will make a scatter plot, representing X and Y values of each point:
library(reshape2)
library(ggplot2)
foo.m <- melt(foo, id.vars=c("X","Y"))
ggplot(foo.m, aes(X,Y))+geom_point()
And the following code will make a pie chart for each point:
p <- ggplot(foo.m, aes(variable,value,fill=variable)) + geom_bar(stat="identity")
p + coord_polar() + facet_wrap(~X+Y, ncol=6) + theme_bw()
But I am looking to merge them: creating a scatter plot in which each point is replaced by the pie chart. This way I will be able to show all 5 values (X, Y, A, B, C) of each record in the same chart.
Is there any way to do it?
This is the sort of thing you can do with package ggsubplot. Unfortunately, according to issue #10 here, this package is not working with R 3.1.1. I ran it successfully if I used an older version of R (3.0.3).
Using your long dataset, you could put bar plots at each X, Y point like this:
library(ggplot2)
library(ggsubplot)
ggplot(foo.m) +
geom_subplot2d(aes(x = X, y = Y,
subplot = geom_bar(aes(variable, value, fill = variable), stat = "identity")),
width = rel(.5), ref = NULL)
This gives the basic idea, although there are many other options (like controlling where the subplots move to when there is overlap in plot space).
This answer has more information on the status of ggsubplot with newer R versions.
There is a package, scatterpie, that does exactly what you want to do! Note that it takes the wide data frame (foo), since cols names the columns A, B and C:
library(ggplot2)
library(scatterpie)
ggplot() +
geom_scatterpie(aes(x=X, y=Y, r=0.1), data=foo, cols=c("A", "B", "C"))
In the aesthetics, r is the radius of the pie; you can adjust it as necessary. It depends on the scale of the graph: since your graph goes from 0.0 to 1.0, a radius of 1 would take up the entire graph (if centered at 0.5, 0.5).
Do note that while you will get a legend for the pie slice colors, it will not (to my knowledge) label the slices themselves on the pies.
