ggplot2 only plotting axes - not the points - r

I have a CSV file like:
date;foo
2016-07-01;0,54
2016-08-01;0,54
2016-09-01;0,50
2016-10-01;0,49
but when I read it into R and plot it:
foo2 <- read.csv2("here")
ggplot(foo2, aes(x=date, y=foo))
The output is empty. I.e. axes are present but no points are plotted.
A regular plot(foo2$foo) simply plots the points - what could be wrong here?

You need to add a geom to your plot. If you want a line plot add...
ggplot(foo2, aes(x=date, y=foo)) + geom_line()
If you want a scatter plot...
ggplot(foo2, aes(x=date, y=foo)) + geom_point()
You can find more geoms here.
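One further point worth checking with this particular file: read.csv2() reads the date column as text (or as a factor in older versions of R), and geom_line() cannot connect points along a discrete axis without a group aesthetic. The following is a minimal sketch, assuming the file content shown in the question; the inline csv_text string stands in for the real file, and converting with as.Date() is a suggestion rather than part of the original answer.

```r
library(ggplot2)

# Inline copy of the CSV from the question; read.csv2() assumes ';' as the
# separator and ',' as the decimal mark.
csv_text <- "date;foo
2016-07-01;0,54
2016-08-01;0,54
2016-09-01;0,50
2016-10-01;0,49"
foo2 <- read.csv2(text = csv_text)

# The date column arrives as text (or a factor in older R), so convert it
# to a real Date before plotting.
foo2$date <- as.Date(as.character(foo2$date))

p <- ggplot(foo2, aes(x = date, y = foo)) +
  geom_point() +
  geom_line()
print(p)
```

With a Date column on the x axis, both the points and the connecting line should be drawn, and the axis gets date labels.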

Related

Why does this ggplot only plot the grid without the values?

I am trying to plot a bar chart in ggplot but I keep getting only the grid. This is apparently a demonstration of how to draw nothing, but I would like to understand how to get the values visible in the simplest way.
library(ggplot2)
testData<-data.frame(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17))
bar <- ggplot(data=testData, aes(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17), fill = "#FFCC00"))
One way I can get the bars plotted is by adding geom_bar:
bar <- ggplot(data=testData, aes(x=c("a","b","c","d","e","f"), y=c(10,6,9,28,10,17), fill = "#FFCC00")) + geom_bar(stat="identity")
Why are the values not plotted on the first bar chart and how can I fix it in the simplest way? What is the idea behind this way of plotting with + and what is it called?
With the ggplot2 package, calling ggplot() only sets up the basic grid; it's like taking out a piece of graph paper before drawing a graph. Having the grid ready does not plot anything on it. That's why running the following command results in the empty grid from your first example:
ggplot(data=testData, aes(x=x, y=y, fill = "#FFCC00"))
It's not the same as using a function like plot() or hist(), which prep the grid and plot the data at the same time:
plot(testData$y) # base plot() has no data argument for plain vectors, so index the column
hist(testData$y)
The "+" in ggplot2 adds further components to the plot object on top of that first blank grid. That's why each geom added with a "+" is typically called a layer.
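The graph-paper idea can be checked directly, because a ggplot is an ordinary R object that keeps a list of its layers. This is a small sketch, assuming a numeric version of testData; the names base_plot and with_points are made up for illustration.

```r
library(ggplot2)

testData <- data.frame(x = 1:6, y = c(10, 6, 9, 28, 10, 17))

# ggplot() alone creates a plot object with no layers: just the "graph paper".
base_plot <- ggplot(testData, aes(x = x, y = y))
length(base_plot$layers)

# Adding a geom with "+" returns a new plot object with one more layer.
with_points <- base_plot + geom_point()
length(with_points$layers)
```

Nothing is drawn until the object is printed, which is why a plot stored in a variable (as in the question) shows up only when you type its name or call print() on it.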
So, if we want to make a simple scatterplot, we add points on top of a grid:
testData<-data.frame(x=c(1:6), y=c(10,6,9,28,10,17))
ggplot(data=testData,aes(x=x,y=y)) +
geom_point()
If we want to add lines to that scatterplot, we can just add one line of code:
ggplot(data=testData,aes(x=x,y=y)) +
geom_point() +
geom_line()
We can keep adding layers like this if we want. Just note that they are drawn in the order you type them (i.e. earlier layers end up underneath the layers added after them):
ggplot(data=testData,aes(x=x,y=y)) +
geom_bar(stat="identity",fill="#00BFC4") +
geom_point() +
geom_line()
Also, note that inside aes() it's recommended to refer to columns by their bare names rather than re-typing the vectors or indexing the data frame with $; doing otherwise can lead to errors.
Don't use:
ggplot(data=testData, aes(x=c("a","b","c","d","e","f"),
y=c(10,6,9,28,10,17), fill = "#FFCC00")) +
geom_bar(stat="identity")
#or
ggplot(data=testData, aes(x=testData$x, y=testData$y, fill = "#FFCC00")) +
geom_bar(stat="identity")
Instead use:
ggplot(data=testData, aes(x=x, y=y)) +
geom_bar(stat="identity", fill="#FFCC00") # a constant colour goes outside aes()
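A related detail: a colour string placed inside aes() is treated as data (a variable with a single level), so ggplot2 assigns it a default palette colour and adds a legend, whereas the same string passed to the geom outside aes() is used literally. The sketch below compares the two using layer_data(), which returns the values ggplot2 computed for a layer; the names inside and outside are made up for illustration.

```r
library(ggplot2)

testData <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                       y = c(10, 6, 9, 28, 10, 17))

# Constant inside aes(): mapped through the fill scale like any other variable.
inside <- ggplot(testData, aes(x = x, y = y, fill = "#FFCC00")) +
  geom_bar(stat = "identity")

# Constant outside aes(): applied to the bars as-is.
outside <- ggplot(testData, aes(x = x, y = y)) +
  geom_bar(stat = "identity", fill = "#FFCC00")

# Fill colours ggplot2 actually uses for the bars in each plot.
unique(layer_data(inside)$fill)
unique(layer_data(outside)$fill)
```

Only the second plot should report "#FFCC00" as its fill.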
If you want to plot data from a data frame that was not passed to the first ggplot() call, simply add a data argument to the layers that use that different data frame, like this:
ggplot(data=testData,aes(x=x,y=y)) +
geom_bar(stat="identity",fill="#00BFC4") +
geom_point(data=differentDf, aes(x=x,y=y)) +
geom_line(data=differentDf, aes(x=x,y=y))

Using color-filled circles for ggplot legends

I have the following code
TRP_C<-100/(100+650)
FPR_C<-200/(200+650)
C<-data.frame(TPR=TRP_C,FPR=FPR_C)
TRP_D<-120/(120+30)
FPR_D<-350/(350+500)
D<-data.frame(TPR=TRP_D,FPR=FPR_D)
ggplot(NULL, aes(x=FPR, y=TPR)) +
geom_point(data=C,shape=1,aes(fill="A"),size=4,color="red")+
geom_point(data=D,shape=1,aes(fill="B"),size=4,color="green")
The problem is that it gives me a plot in which the points are not clear at all.
I think that if I can make the points filled, they will be clearer in the diagram.
So, how can I make the points and the legend keys filled?
Use a shape from 21 to 25 inside geom_point() (these shapes have a fill) and scale_fill_manual() for the colors.
So your code looks like this
ggplot(NULL, aes(x=FPR, y=TPR)) +
geom_point(data=C,shape=21,aes(fill="A"),size=4) +
geom_point(data=D,shape=21,aes(fill="B"),size=4) +
scale_fill_manual(values=c("red", "green"))
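A variant of the code above names the entries of the values vector, so that each legend label is tied to its colour regardless of ordering. This is only a sketch: the legend title "Classifier" is an assumption made for illustration, and the data frames are rebuilt inline so the block runs on its own.

```r
library(ggplot2)

C <- data.frame(TPR = 100 / (100 + 650), FPR = 200 / (200 + 650))
D <- data.frame(TPR = 120 / (120 + 30),  FPR = 350 / (350 + 500))

p <- ggplot(NULL, aes(x = FPR, y = TPR)) +
  geom_point(data = C, shape = 21, aes(fill = "A"), size = 4) +
  geom_point(data = D, shape = 21, aes(fill = "B"), size = 4) +
  # Names tie each legend label to its colour, independent of ordering.
  # The legend title "Classifier" is a placeholder, not from the question.
  scale_fill_manual(name = "Classifier", values = c(A = "red", B = "green"))
print(p)
```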

How can I plot several series as lines and one of them as area using ggplot2?

I'm trying to accomplish something that I used to do in Excel, I have several timeseries for the same time interval and would like to plot them as lines (easy enough using ggplot geom_line), but one of them should be plotted as an area plot.
Basically something like this:
Please note that the series S_1 is plotted as area.
I have already tried adding geom_area() with aes values equal to the value of the area series:
ggplot(df.lines, aes(x=Index, y=Value, colour=Series)) + geom_line() + geom_area(aes(x=df.area$Index, y=df.area$S_1))
How could I accomplish something like this using ggplot2?
This is difficult to test without a dataset (could you provide one in the question? You can use dput()), but with geom_area the selection should be made in the data argument, for instance like this:
ggplot() +
geom_area(data = df.area[df.area$Series == "S_1", ], aes(x=Index, y=Value)) +
geom_line(data = df.lines, aes(x=Index, y=Value, colour=Series))
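Since the question has no dataset, here is a self-contained sketch of the same idea with made-up data. It assumes a single long-format data frame with the columns Index, Series and Value; the random values and the grey fill are illustrative choices only.

```r
library(ggplot2)

# Made-up long-format data: three series over ten time steps.
set.seed(1)
df.lines <- data.frame(
  Index  = rep(1:10, times = 3),
  Series = rep(c("S_1", "S_2", "S_3"), each = 10),
  Value  = c(runif(10, 2, 4), runif(10, 3, 6), runif(10, 1, 5))
)

p <- ggplot() +
  # Area layer first, so that it sits underneath the lines.
  geom_area(data = df.lines[df.lines$Series == "S_1", ],
            aes(x = Index, y = Value), fill = "grey80") +
  # Remaining series drawn as coloured lines with an automatic legend.
  geom_line(data = df.lines[df.lines$Series != "S_1", ],
            aes(x = Index, y = Value, colour = Series))
print(p)
```

Each layer gets its own subset through the data argument, so no $ indexing is needed inside aes().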

Wrong vertex order in geom_line plot

Getting a strange ordering of vertices in a geom_line plot. Left hand plot is base R; right is ggplot.
Here's the shapefile I'm working with. This will reproduce the plot:
require(ggplot2); require(maptools)
rail = readShapeLines('railnetworkLine.shp')
rail_dat = fortify(rail[1,])
ggplot(rail_dat) + geom_line(aes(long, lat, group=group)) + coord_equal()
Any idea what is causing this? The data order of fortify seems correct, as plotting separately lines() confirms.
Use geom_path instead of geom_line. geom_line orders the data from lowest to highest x-value (long in this case) before plotting, but geom_path plots the data in the current order of the data frame rows.
ggplot(rail_dat) +
geom_path(aes(long, lat)) + coord_equal()
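The difference is easy to see without the shapefile. The sketch below uses a small made-up track (the pts data frame is hypothetical) that doubles back on itself in the x direction, and inspects the order in which each geom connects the points.

```r
library(ggplot2)

# A made-up track that doubles back on itself in x.
pts <- data.frame(long = c(1, 3, 2, 4, 0), lat = c(1, 2, 3, 4, 5))

p_line <- ggplot(pts, aes(long, lat)) + geom_line()
p_path <- ggplot(pts, aes(long, lat)) + geom_path()

layer_data(p_line)$x  # order in which geom_line connects the points
layer_data(p_path)$x  # order in which geom_path connects the points
```

The geom_line values should come back sorted by x, while the geom_path values keep the row order of pts.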

How can I change the colors in a ggplot2 density plot?

Summary: I want to choose the colors for a ggplot2() density distribution plot without losing the automatically generated legend.
Details: I have a dataframe created with the following code (I realize it is not elegant but I am only learning R):
cands<-scan("human.i.cands.degnums")
non<-scan("human.i.non.degnums")
df<-data.frame(grp=factor(c(rep("1. Candidates", each=length(cands)),
rep("2. NonCands",each=length(non)))), val=c(cands,non))
I then plot their density distribution like so:
library(ggplot2)
ggplot(df, aes(x=val,color=grp)) + geom_density()
This produces the following output:
I would like to choose the colors the lines appear in and cannot for the life of me figure out how. I have read various other posts on the site but to no avail. The most relevant are:
Changing color of density plots in ggplot2
Overlapped density plots in ggplot2
After searching around for a while I have tried:
## This one gives an error
ggplot(df, aes(x=val,colour=c("red","blue"))) + geom_density()
Error: Aesthetics must either be length one, or the same length as the data
Problems: c("red", "blue")
## This one produces a single, black line
ggplot(df, aes(x=val),colour=c("red","green")) + geom_density()
The best I've come up with is this:
ggplot() + geom_density(aes(x=cands),colour="blue") + geom_density(aes(x=non),colour="red")
As you can see in the image above, that last command correctly changes the colors of the lines but it removes the legend. I like ggplot2's legend system. It is nice and simple, I don't want to have to fiddle about with recreating something that ggplot is clearly capable of doing. On top of which, the syntax is very very ugly. My actual data frame consists of 7 different groups of data. I cannot believe that writing + geom_density(aes(x=FOO),colour="BAR") 7 times is the most elegant way of coding this.
So, if all else fails I will accept an answer that tells me how to get the legend back on to the 2nd plot. However, if someone can tell me how to do it properly I will be very happy.
set.seed(45)
df <- data.frame(x=c(rnorm(100), rnorm(100, mean=2, sd=2)), grp=rep(1:2, each=100))
ggplot(data = df, aes(x=x, color=factor(grp))) + geom_density() +
scale_color_brewer(palette = "Set1")
ggplot(data = df, aes(x=x, color=factor(grp))) + geom_density() +
scale_color_brewer(palette = "Set3")
gives me the same plot with two different sets of colours.
Provide a vector of colours to the values argument to map the discrete values to manually chosen colours:
ggplot(df, aes(x=val,color=grp)) +
geom_density() +
scale_color_manual(values=c("red", "blue"))
To choose any colour you wish, enter the hex code for it instead:
ggplot(df, aes(x=val,color=grp)) +
geom_density() +
scale_color_manual(values=c("#f5d142", "#2bd63f")) # yellow/green
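For the seven-group case mentioned in the question, a named vector keeps each group tied to its colour while still using a single geom_density() call and the automatic legend. This sketch uses simulated data in place of the files read with scan(); group_colours is a made-up name, and it would simply get one entry per group.

```r
library(ggplot2)

# Simulated stand-in for the data frame built in the question.
set.seed(45)
df <- data.frame(
  grp = factor(rep(c("1. Candidates", "2. NonCands"), each = 100)),
  val = c(rnorm(100), rnorm(100, mean = 2, sd = 2))
)

# One named entry per level of grp; extend this vector for more groups.
group_colours <- c("1. Candidates" = "red", "2. NonCands" = "blue")

p <- ggplot(df, aes(x = val, colour = grp)) +
  geom_density() +
  scale_colour_manual(values = group_colours)
print(p)
```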
