add extra legend to plot - r

Hi, can I add an additional legend to a ggplot?
For example, to the plot produced by the following code:
library(reshape2)
library(ggplot2)
d <- melt(as.matrix(data.frame(y1=1/(1:10),y2=1/(10:1))))
ggplot(d, aes(x=Var1, y=value,fill=Var2)) + geom_bar(stat="identity",position='dodge')
This generates a nice legend containing the column names of my data frame.
But is it possible to add an extra legend that contains some additional information derived from the data?
In base R, I would add the additional legend like this:
d<-data.frame(y1=1/(1:10),y2=2*1/(10:1))
barplot(t(d),beside=T)
legend("top",paste("sums:",apply(d,2,sum)))
Thanks

This seems to work for me.
plot.new()
d <- melt(as.matrix(data.frame(y1=1/(1:10),y2=1/(10:1))))
ggplot(d, aes(x=Var1, y=value,fill=Var2)) +
geom_bar(stat="identity",position='dodge')
Then the exciting part:
legend('top',paste("sums:",tapply(d$value,d$Var2,sum)))
I changed the apply statement to work on the molten data.
I am not aware of a ggplot solution, but I would love to see one.
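For what it's worth, one ggplot-only option is to put the extra information into the labels of the existing fill legend rather than drawing a second legend. This is just a sketch, not necessarily what the asker had in mind; the legend title "Series" and the label format are my own choices.

```r
library(ggplot2)
library(reshape2)

# Same molten data as in the question
d <- melt(as.matrix(data.frame(y1 = 1/(1:10), y2 = 1/(10:1))))

# Per-series sums, in the same order as the levels of Var2
sums <- tapply(d$value, d$Var2, sum)

# Build one label per series, e.g. "<series> (sum: <value>)"
legend_labels <- paste0(names(sums), " (sum: ", round(sums, 2), ")")

p <- ggplot(d, aes(x = Var1, y = value, fill = Var2)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_discrete(name = "Series", labels = legend_labels)

print(p)
```

Because the labels are computed from the data, they stay in sync if the data change.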

Related

Plotting bar-plot/clustered column charts in R from csv file using ggplot2

I am planning to plot a bar plot (clustered column chart) of time vs. revenue, with a trend line connecting the tops of the bars, for the years 1981 to 1988.
I used this code to read the CSV: read.csv("file_location/Revenue.csv", header = T, sep=",", dec = ".")
for the plotting: pl <- ggplot(data,aes(x=ï..Year))
and then: pl + geom_bar(color='red',fill='blue').
Unfortunately, I end up with something like this, whereas I'd prefer something like this.
I used only the ggplot2 library in this case; should I use tidyr or dplyr additionally? Am I confusing continuous and discrete variables? Any advice on aesthetic modifications to beautify the plot, or other solutions, would be really appreciated, as I am still learning the basics of ggplot and data visualization.
I have added the file in case you want to check it: Revenue.csv
Check the documentation here for some information, but the big change you should make is to use geom_col in place of geom_bar. Your current call specifies an x= aesthetic (what should be on the x axis), but not the y= aesthetic (what should be on the y axis). By default, geom_bar shows the number of cases/observations at each x value, whereas geom_col displays a bar of length y at each x value... but you need a y aesthetic.
With all that being said, try this:
pl <- ggplot(data,aes(x=ï..Year, y=your.y.column.name)) +
geom_col(color='red',fill='blue')
As for aesthetics, I might change the color scheme a bit and also the theme, but that's kind of personal preference. My suggestion would be to at least change your color scheme for geom_bar/col. The color= specifies the outline on the bars, and the fill= is the color of the bars. Your code would give you bright blue bars with a red outline... not awesome. I would also make your bars a bit skinnier by adjusting the width= argument from the default of 0.9 to something smaller. Here is an example with a dummy dataset. Most people (me included) would not want to download someone else's data via a link, sorry.
df <- data.frame(x=1:10, y=1:10)
ggplot(df, aes(x=x, y=y)) +
geom_col(fill='steelblue', color='black', width=0.5) +
theme_bw()
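The question also asked for a trend line connecting the tops of the bars. A rough sketch of that, with a made-up data frame standing in for Revenue.csv (the column names Year and Revenue and all the values are invented for illustration):

```r
library(ggplot2)

# Hypothetical stand-in for Revenue.csv; the revenue values are invented
revenue <- data.frame(
  Year    = 1981:1988,
  Revenue = c(12, 15, 14, 18, 21, 19, 24, 27)
)

p <- ggplot(revenue, aes(x = factor(Year), y = Revenue)) +
  geom_col(fill = "steelblue", color = "black", width = 0.5) +
  # group = 1 tells geom_line to connect points across the discrete x axis
  geom_line(aes(group = 1), color = "firebrick") +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Revenue") +
  theme_bw()

print(p)
```

Treating Year as a factor gives one bar per year with a tick for each. As a side note, a column name like ï..Year usually comes from a byte-order mark at the start of the file; passing fileEncoding = "UTF-8-BOM" to read.csv typically gives you a plain Year column instead.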

Modify legend and labels of stacked-area plot in R/ggplot2

EDIT: Solved by Haboryme in comments; the problem was my use of xlab and ylab instead of x and y as the names of keyword arguments to labs() (explaining the graph labels), and a redundant use of colour= in the second call to aes() (explaining the persistence of the original legend).
I'd like to make a stacked-area chart from some CSV data with R and ggplot2. For example:
In file "test.csv":
Year,Column with long name 1,Column with long name 2
2000,1,1
2001,1,1.5
2002,1.5,2
I run this code (imitating the answer to this GIS.SE question):
library(ggplot2)
library(reshape)
df <- read.csv('test.csv')
df <- melt(df, id="Year")
png(filename="test.png")
gg <- ggplot(df,aes(x=as.numeric(Year),y=value)) +
# Add a new legend
scale_fill_discrete(name="Series", labels=c("Foo bar", "Baz quux")) +
geom_area(aes(colour=variable,fill=variable)) +
# Change the axis labels and add a title
labs(title="Test",xlab="Year",ylab="Values")
print(gg)
dev.off()
The result, in file "test.png":
Problems: my attempt to change the axis labels was ignored, and my new legend (with code borrowed from the R Cookbook's suggestions) was added to, not substituted for, the (strangely recolored) default one. (Other solutions offered by the R Cookbook, such as calling guides(fill=FALSE), do more or less the same thing.) I'd rather not use the workaround of editing my dataframe (e.g. stripping the periods that read.csv() substitutes for spaces in column headers) so that the default labels turn out correct. What should I do?
ggplot(df,aes(x=as.numeric(Year),y=value)) +
scale_fill_discrete(name="Series", labels=c("Foo bar", "Baz quux")) +
geom_area(aes(fill=variable)) +
labs(title="Test",x="Year",y="Values")
The argument colour in the aes() of geom_area() only colours the contour and hence doesn't add anything to the plot here.
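If the goal is also to keep the original column headers without editing the data frame afterwards, one possibility is read.csv(..., check.names = FALSE), which stops R from replacing the spaces with periods. A minimal sketch, assuming the test.csv contents from the question (inlined here as text so it runs on its own) and using reshape2 rather than reshape:

```r
library(ggplot2)
library(reshape2)

# Contents of test.csv from the question, inlined for a self-contained example
csv_text <- "Year,Column with long name 1,Column with long name 2
2000,1,1
2001,1,1.5
2002,1.5,2"

# check.names = FALSE keeps the headers exactly as written in the file
df <- read.csv(text = csv_text, check.names = FALSE)
long <- melt(df, id = "Year")

p <- ggplot(long, aes(x = Year, y = value, fill = variable)) +
  geom_area() +
  # labs() takes x, y and fill; the legend title is set via fill
  labs(title = "Test", x = "Year", y = "Values", fill = "Series")

print(p)
```

With this, the legend entries are the original headers, so no labels= override is needed unless you want shorter names.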

can i convert a base plot in r to a ggplot object?

I'm somewhat new to R and I love ggplot - that's all I use for plotting, so I don't know all the archaic syntax needed for base plots in R (and I'd rather not have to learn it). I'm running pROC::roc and I would like to plot the output in ggplot (so I can fine-tune how it looks). I can immediately get a plot as follows:
size <- 100
response <- sample(c(0,1), replace=TRUE, size=size)
predictor <- rnorm(100)
rocobject <- pROC::roc(response, predictor,smooth=T)
plot(rocobject)
To use ggplot instead, I can create a data frame from the output and then use ggplot (this is NOT my question). What I want to know is if I can somehow 'convert' the plot made in the code above into ggplot automatically so that I can then do what I want in ggplot? I've searched all over and I can't seem to find the answer to this 'basic' question. Thanks!!
Better late than never? I think the ggplotify package might do what you want. You basically plug in your plot generating code to the as.ggplot() function like so:
p6 <- as.ggplot(~plot(iris$Sepal.Length, iris$Sepal.Width, col=color, pch=15))
https://cran.r-project.org/web/packages/ggplotify/vignettes/ggplotify.html
No, I think unfortunately this is not possible.
Even though this does not answer your real question, building it with ggplot is actually not difficult.
Your original plot:
plot(rocobject)
In ggplot:
library(ggplot2)
df <- data.frame(y = rocobject$sensitivities, x = rocobject$specificities)  # access by name rather than by position
ggplot(df, aes(x, y)) + geom_line() + scale_x_reverse() + geom_abline(intercept=1, slope=1, linetype="dashed") + xlab("Specificity") + ylab("sensitivity")
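As a further option for this particular case, pROC itself provides a ggroc() function that returns a ggplot object directly, so you can keep adding layers and themes. A sketch, assuming a reasonably recent pROC; the seed, colour and theme are arbitrary choices of mine, and I use an unsmoothed curve here to keep it simple:

```r
library(ggplot2)
library(pROC)

set.seed(42)  # arbitrary seed, only to make the example reproducible
response  <- sample(c(0, 1), size = 100, replace = TRUE)
predictor <- rnorm(100)

rocobject <- roc(response, predictor)

# ggroc() builds the ROC curve as a regular ggplot object
p <- ggroc(rocobject, colour = "steelblue") +
  labs(x = "Specificity", y = "Sensitivity") +
  theme_bw()

print(p)
```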

Adding legend (ggplot) doesn't work

I feel like I am asking a totally silly question, but I can't force ggplot to show the legend for lines colours.
The thing is that I have two data frames with the same data, just the first data.frame represents new data (plus additional numbers) and the second represents the old data. I am trying to compare new and old data, thus to understand which is which I have to see the legend. I have tried to use scale_colour_manual, but it still doesn't appear.
I have read a number of answers to similar questions, and none of them worked or led to a better result. You can see a simple example of my problem below:
rm(list = ls())
library(ggplot2)
xnew<-3:10
y<-5:12
xold<-4:11
years<-2000:2007
xfact<-rep("x", times=8)
yfact<-rep("y", times=8)
Newdata<-data.frame(indicator=c(xfact,yfact),Years=c(years,years), data=c(xnew,y))
Olddata<-data.frame(indicator=xfact,Years=c(years), data=xold)
graph<-ggplot(mapping=aes(Years, data, group=1)) +
geom_line(,Newdata[Newdata=="x",], size=1.5, colour="lightblue")+
geom_line(,Olddata[Olddata=="x",], size=1.5, colour="orange")+
ggtitle("OLD vs NEW")+
scale_colour_manual(name="Legend", values=c("New"="lightblue", "Old"="orange"))
the result is without the legend.
Thanks for all the help I have already found on this website and thank you in advance for helping to solve this problem.
Legends are created in ggplot by mapping aesthetics to a single variable. Your mistake is that you're trying to set colors manually in each layer.
Newdata$type <- "New"
Olddata$type <- "Old"
all_data <- rbind(Newdata,Olddata)
ggplot(data = all_data[all_data$indicator == 'x',],aes(x = Years,y = data,colour = type)) +
geom_line() +
ggtitle("OLD vs NEW") +
scale_colour_manual(name="Legend", values=c("New"="lightblue", "Old"="orange"))
There are countless examples illustrating this basic technique in ggplot here.
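If you would rather keep the two data frames separate, another common pattern is to map colour to a constant string inside aes() in each layer; the strings then become the legend keys that scale_colour_manual matches on. A sketch under that assumption, with the data reduced to the "x" indicator (new_x and old_x are names I made up):

```r
library(ggplot2)

years <- 2000:2007
new_x <- data.frame(Years = years, data = 3:10)   # "new" series from the question
old_x <- data.frame(Years = years, data = 4:11)   # "old" series from the question

p <- ggplot(mapping = aes(x = Years, y = data)) +
  # colour is mapped (inside aes), not set, so a legend is generated
  geom_line(data = new_x, aes(colour = "New")) +
  geom_line(data = old_x, aes(colour = "Old")) +
  ggtitle("OLD vs NEW") +
  scale_colour_manual(name = "Legend",
                      values = c("New" = "lightblue", "Old" = "orange"))

print(p)
```

The key difference from the question's code is that colour="lightblue" outside aes() sets a fixed colour and produces no legend, while aes(colour = "New") creates a mapping.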

ggplot2: Overlay density plots R

I want to overlay a few density plots in R. I know there are a few ways to do that, but they don't work for me for one reason or another (the 'sm' library doesn't install, and I'm new enough to R that I don't understand most of the code). I also tried plot and par, but I would like to use qplot since it has more configuration options.
I have data saved in this form
library(ggplot2)
x <- read.csv("clipboard", sep="\t", header=FALSE)
x
  V1 V2 V3
1 34 23 24
2 32 12 32
and I would like to create 3 overlaid plots with the values from V1, V2 and V3, using either shades of grey for the fill, dotted lines, or something similar, together with a legend. Can you guys help me?
Thank you!
Generally, for ggplot and multiple variables, you need to convert the data from wide to long format. I think it can be done without converting, but that is the way the package is meant to work.
Here is the solution. I generated some data (3 normal distributions centered around different points). I also did some histograms and boxplots in case you want those. The alpha parameter controls the degree of transparency of the fill; if you use color instead of fill, you get only outlines.
x <- data.frame(v1=rnorm(100),v2=rnorm(100,1,1),v3=rnorm(100,0,2))
library(ggplot2);library(reshape2)
data<- melt(x)
ggplot(data,aes(x=value, fill=variable)) + geom_density(alpha=0.25)
ggplot(data,aes(x=value, fill=variable)) + geom_histogram(alpha=0.25)
ggplot(data,aes(x=variable, y=value, fill=variable)) + geom_boxplot()
For the sake of completeness, the most basic way to overlay plots based on a factor is:
ggplot(data, aes(x=value)) + geom_density(aes(group=factor))
Here factor stands for the name of your grouping column (variable in the molten data above). But as @user1617979 mentioned, aes(color=factor) and aes(fill=factor) are probably more useful in practice.
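Since the question asked specifically for shades of grey or dotted lines with a legend, here is a rough sketch combining both. It uses base R's stack() instead of melt() for the wide-to-long step (my substitution, which yields columns named values and ind), and the seed and grey range are arbitrary:

```r
library(ggplot2)

set.seed(1)  # arbitrary seed for reproducibility
wide <- data.frame(v1 = rnorm(100), v2 = rnorm(100, 1, 1), v3 = rnorm(100, 0, 2))

# stack() converts wide to long: 'values' holds the numbers, 'ind' the column name
long <- stack(wide)

p <- ggplot(long, aes(x = values, fill = ind, linetype = ind)) +
  geom_density(alpha = 0.4) +
  # giving both scales the same name merges them into a single legend
  scale_fill_grey(name = "Variable", start = 0.2, end = 0.8) +
  scale_linetype_discrete(name = "Variable") +
  theme_bw()

print(p)
```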
Some people have asked if you can do this when the distributions are of different lengths. The answer is yes, just use a list instead of a data frame.
library(ggplot2)
library(reshape2)
x <- list(v1=rnorm(100),v2=rnorm(50,1,1),v3=rnorm(75,0,2))
data<- melt(x)
ggplot(data,aes(x=value, fill=L1)) + geom_density(alpha=0.25)
ggplot(data,aes(x=value, fill=L1)) + geom_histogram(alpha=0.25)
ggplot(data,aes(x=L1, y=value, fill=L1)) + geom_boxplot()
