Weighted ggplot2 warning: Ignoring unknown aesthetics: weight - r

I am trying to plot a weighted density with ggplot2. The results seem to be fine, but I get the following warning: Warning: Ignoring unknown aesthetics: weight. Similar problems seem to appear in other ggplot2 applications, so I am wondering whether the warning can be ignored.
Reproducible example:
library(ggplot2)
set.seed(123)
# Some random data & weights
x <- rnorm(1000, 5)
w <- x^5
# Plot unweighted
ggplot() + stat_density(aes(x = x))
# Plot weighted - Warning: Ignoring unknown aesthetics: weight
ggplot() + stat_density(aes(x = x, weight = w / sum(w))) # Weighting seems to work fine
# Comparison of weighted density in base graphics - Same results as with ggplot2
plot(density(x, weights = w / sum(w)))
Can this warning message be ignored?

You can avoid the warning by using geom_density:
ggplot() +
  geom_density(aes(x = x, weight = w / sum(w)), color = "green") +
  geom_density(aes(x = x), color = "blue")
I would have expected the stat_ function to handle the same aesthetics as the geom and it appears to do so. The warning would then be a bug that should be reported to the maintainers.

Here is another solution:
ggplot(data=NULL, aes(x = x, weight=w/sum(w))) + stat_density()
And:
ggplot(data = NULL, aes(x = x, weight = w / sum(w))) +
  stat_density(fill = NA, color = "green") +
  stat_density(aes(x = x), fill = NA, color = "blue", inherit.aes = FALSE)

Whether or not you get a warning appears to depend on where you supply the weight aesthetic (ggplot2 version 2.2.1):
Following these answers:
Create weighted histogram,
Histogram with weights
Setup data:
w = seq(1,1000)
v = sort(runif(1000))
foo = data.frame(v,w)
The following command produces a warning:
ggplot(foo) + geom_histogram(aes(v, weight=w),bins = 30)
These commands do not produce a warning:
ggplot(foo, aes(v, weight=w)) + geom_histogram(bins = 30)
ggplot(foo, aes(weight=w)) + geom_histogram(aes(v),bins = 30)
But all three commands produce the same plot.
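Since the variants above differ only in where weight is mapped, one way to check that the weights are actually used is to inspect the values the stat computed. This is a minimal sketch, assuming the vectors are first placed in a data frame (the names df and computed are illustrative, not from the original posts):

```r
library(ggplot2)

set.seed(123)
# Same kind of data as in the question, stored in a data frame
df <- data.frame(x = rnorm(1000, 5))
df$w <- df$x^5 / sum(df$x^5)   # weights normalised to sum to 1

# Mapping weight at the top level, as in the warning-free variants above
p <- ggplot(df, aes(x = x, weight = w)) + stat_density()

# ggplot_build() returns the data each layer ends up drawing
computed <- ggplot_build(p)$data[[1]]
head(computed[, c("x", "density")])
```

Comparing this table for the weighted and unweighted plots shows whether the weight mapping changed the estimate, independent of any warning.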

Related

R ggnetwork: unable to change graph layout

I am trying to use ggnetwork and ggplot2 to plot a graph visualisation, but I am unable to change the graph layout parameter of the ggnetwork function. My reproducible code is as follows, and I am running it on R 4.0.3 on Ubuntu:
install.packages("WDI") # this is the data source I need for this example
library(WDI)
new_wdi_cache <- WDIcache()
library(igraph)
library(tidyverse)
library(ggnetwork)
education<-WDI(indicator=c("SE.PRM.ENRR","SE.SEC.ENRR",
"SE.TER.ENRR","SE.SEC.PROG.ZS","SE.PRM.CMPT.ZS"),
start=2014,
end=2014,
extra= TRUE,
cache=new_wdi_cache)
education<-education[education$region!="Aggregates",]
education<-na.omit(education)
education.features <- education[,4:8]
education.features_scaled <-scale(education.features)
education.distance_matrix <- as.matrix(dist(education.features_scaled))
education.adjacency_matrix <- education.distance_matrix < 1.5
g1<-graph_from_adjacency_matrix(education.adjacency_matrix, mode="undirected")
new.g2<-ggnetwork(g1, layout = "kamadakawai") # LINE A
ggplot(new.g2, aes(x=x, y=y, xend=xend, yend=yend))+
geom_edges(colour="grey")+geom_nodes(size=5,aes(colour=species ))+
theme_blank()+labs(caption='WDI School enrollment and progression datasets')
On line A, I get an error that I really cannot understand:
Error: $ operator is invalid for atomic vectors
What does that mean? And if I remove the 'layout=' parameter from ggnetwork, the code runs. However I really need to change the layout.
The layout parameter doesn't take a string; it takes the output of an igraph::layout_ function.
So you can do:
new_g2 <- ggnetwork(g1, layout = igraph::layout.kamada.kawai(g1))
ggplot(new_g2, aes(x, y, xend = xend, yend = yend)) +
  geom_edges(colour = "grey") +
  geom_nodes(size = 8, aes(colour = name)) +
  theme_blank() +
  labs(caption = 'WDI School enrollment and progression datasets') +
  theme(plot.caption = element_text(size = 16))
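A smaller, self-contained variant of the same idea may be easier to test than the WDI example. The sketch below uses a toy ring graph (an assumption, not the asker's data) and passes an igraph layout specification, igraph::with_kk(), instead of a precomputed matrix. This assumes the installed ggnetwork version accepts layout specifications for igraph objects; if it does not, the matrix form shown above is the fallback.

```r
library(igraph)
library(ggnetwork)
library(ggplot2)

# Toy stand-in graph: 10 vertices in a ring, with names for colouring
g <- make_ring(10)
V(g)$name <- letters[1:10]

# Kamada-Kawai layout given as an igraph layout specification
# (assumption: this ggnetwork version forwards it to igraph::layout_())
net <- ggnetwork(g, layout = with_kk())

ggplot(net, aes(x, y, xend = xend, yend = yend)) +
  geom_edges(colour = "grey") +
  geom_nodes(aes(colour = name), size = 5) +
  theme_blank()
```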

Error plotting in a second series in ggplot2

I want to plot a scatter plot with two different series:
Ej4 = read.xlsx("C:\\Users\\Ulia\\Downloads\\Parte 5.xlsx", sheet = 4)
fix(Ej4)
graf = ggplot(Ej4, aes(x1,x1)) + geom_point(alpha = 0.8)
graf
variable = data.frame(y = c(6,9,6,17,12), y2= c(6,9,6,17,12))
variable
grafica2 = graf + geom_point() +
geom_point(data = variable, colour ="blue")
grafica2
But R shows this error:
Error in FUN(X[[i]], ...) : object 'x' not found
There's no problem plotting graf, but I can't understand why R tells me there is an error with grafica2.
PS: Ej4 is a data frame with numerical variables, with exactly the same size as 'variable'.
There's another way:
ggplot() + geom_point(data = Ej4, aes(x1,y1)) + geom_point(data = variable, aes(y,y2))
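The error arises because a layer inherits the aesthetic mapping given in ggplot() unless it supplies its own, and 'variable' has no columns with the names used in that mapping. The sketch below reproduces the fix with made-up data; the contents of Ej4 and its column names x1 and y1 are assumptions, since the original spreadsheet is not available.

```r
library(ggplot2)

# Hypothetical stand-ins for the question's data frames
Ej4 <- data.frame(x1 = c(1, 2, 3, 4, 5), y1 = c(2, 4, 5, 8, 9))
variable <- data.frame(y = c(6, 9, 6, 17, 12), y2 = c(6, 9, 6, 17, 12))

# The global mapping refers to x1/y1; the second layer overrides x and y
# with columns that exist in its own data
grafica2 <- ggplot(Ej4, aes(x1, y1)) +
  geom_point(alpha = 0.8) +
  geom_point(data = variable, aes(x = y, y = y2), colour = "blue")

# Building the plot evaluates every layer's mapping, so a missing column
# would raise the "object not found" error at this point
built <- ggplot_build(grafica2)
```

Keeping the shared mapping in ggplot() and overriding it only in the layer that uses different data is equivalent to the empty ggplot() call above; which form reads better depends on how many layers share the first data frame.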

ggplot2: Violin Plot with Stat="Identity"

I'm trying to use ggplot to create a violin plot where instead of the widths of the violins being controlled by a density function, they directly represent the count of relevant elements.
I think this can be accomplished through setting geom_violin(stat="identity"), but R then complains
> ggplot(allData, aes(x = tool, y = length)) + geom_violin(stat="identity")
Warning: Ignoring unknown parameters: trim, scale
Error in eval(substitute(list(...)), `_data`, parent.frame()) :
object 'violinwidth' not found
Trying to add aes(violinwidth=0.2*count), as this answer suggests, gives
> ggplot(allData, aes(x = tool, y = length)) + geom_violin(stat="identity", aes(violinwidth=0.2*count))
Warning: Ignoring unknown parameters: trim, scale
Warning: Ignoring unknown aesthetics: violinwidth
Error in FUN(X[[i]], ...) : object 'count' not found
And while I can set violinwidth to just a constant, this makes the violins just rectangles. How can I fix this?
When I run this with some sample data, it generates the plots ok with and without the changes to stat and violinwidth. Is your count a column in allData?
library(ggplot2)
dt <- data.frame(category = rep(letters[1:2], each = 10),
response = runif(20),
count = rpois(20, 5))
ggplot(dt, aes(x = category, y = response)) + geom_violin()
ggplot(dt, aes(x = category, y = response)) +
  geom_violin(stat = "identity", aes(violinwidth = 0.1 * count))
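If count is not yet a column, it has to be computed before plotting, because stat = "identity" passes the data to the geom unchanged. The following is a rough sketch with simulated data; the column names tool and length follow the question, while raw, counts and the scaling of violinwidth are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated raw data: one row per element
raw <- data.frame(tool = rep(c("a", "b"), each = 200),
                  length = c(rpois(200, 5), rpois(200, 8)))

# One row per (tool, length) combination with the number of matching elements
counts <- as.data.frame(table(tool = raw$tool, length = raw$length),
                        responseName = "count")
counts$length <- as.numeric(as.character(counts$length))

# Scale the counts so the widest point of any violin has width 1
counts$violinwidth <- counts$count / max(counts$count)

p <- ggplot(counts, aes(x = tool, y = length)) +
  geom_violin(stat = "identity", aes(violinwidth = violinwidth))
```

Depending on the ggplot2 version, the "Ignoring unknown aesthetics: violinwidth" warning may still be printed when the layer is created.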

How to use sec_axis() for discrete data in ggplot2 R?

I have discrete data that looks like this:
height <- c(1,2,3,4,5,6,7,8)
weight <- c(100,200,300,400,500,600,700,800)
person <- c("Jack","Jim","Jill","Tess","Jack","Jim","Jill","Tess")
set <- c(1,1,1,1,2,2,2,2)
dat <- data.frame(set,person,height,weight)
I'm trying to plot a graph with the same x-axis (person) and 2 different y-axes (weight and height). All the examples I find either plot the secondary axis (sec_axis) for continuous data or plot discrete data using base plots.
Is there an easy way to use sec_axis for discrete data in ggplot2?
Edit: Someone in the comments suggested I try the suggested reply. However, I now run into an error.
Here is my current code:
p1 <- ggplot(data = dat, aes(x = person, y = weight)) +
geom_point(color = "red") + facet_wrap(~set, scales="free")
p2 <- p1 + scale_y_continuous("height",sec_axis(~.*1.2, name="height"))
p2
I get the error: Error in x < range[1] :
comparison (3) is possible only for atomic and list types
Alternately, now I have modified the example to match this example posted.
p <- ggplot(dat, aes(x = person))
p <- p + geom_line(aes(y = height, colour = "Height"))
# adding the relative weight data, transformed to match roughly the range of the height
p <- p + geom_line(aes(y = weight/100, colour = "Weight"))
# now adding the secondary axis, following the example in the help file ?scale_y_continuous
# and, very important, reverting the above transformation
p <- p + scale_y_continuous(sec.axis = sec_axis(~.*100, name = "Relative weight [%]"))
# modifying colours and theme options
p <- p + scale_colour_manual(values = c("blue", "red"))
p <- p + labs(y = "Height [inches]",
x = "Person",
colour = "Parameter")
p <- p + theme(legend.position = c(0.8, 0.9))+ facet_wrap(~set, scales="free")
p
I get an error that says
"geom_path: Each group consists of only one observation. Do you need to
adjust the group aesthetic?"
I get the template, but no points get plotted
R function arguments are matched by position when argument names are not given explicitly. As mentioned by @Z.Lin in the comments, you need sec.axis= before your sec_axis() call to indicate that you are passing it to the sec.axis argument of scale_y_continuous. If you don't, it is passed as the second argument of scale_y_continuous, which is breaks=. The error message therefore comes from supplying an unacceptable data type for the breaks argument:
p1 <- ggplot(data = dat, aes(x = person, y = weight)) +
  geom_point(color = "red") + facet_wrap(~set, scales = "free")
p2 <- p1 + scale_y_continuous("weight", sec.axis = sec_axis(~ . * 1.2, name = "height"))
p2
The first argument (name=) of scale_y_continuous is for the first y scale, whereas the sec.axis= argument is for the second y scale. I changed your first y scale name to correct that.
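The positional matching can be checked directly by looking at the order of the formal arguments. The sketch below also addresses the geom_path message from the second attempt in the question: with a discrete x, each person forms its own group, so geom_line() needs an explicit group to connect points. The group = 1 setting and the axis labels are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# `name` and `breaks` come first, so an unnamed second argument
# is matched to `breaks`, not to `sec.axis`
first_two <- names(formals(scale_y_continuous))[1:2]
first_two

dat <- data.frame(
  set    = c(1, 1, 1, 1, 2, 2, 2, 2),
  person = c("Jack", "Jim", "Jill", "Tess", "Jack", "Jim", "Jill", "Tess"),
  height = 1:8,
  weight = seq(100, 800, by = 100)
)

# group = 1 puts all points of a layer (within a facet) into one line
p <- ggplot(dat, aes(x = person)) +
  geom_line(aes(y = height, colour = "Height", group = 1)) +
  geom_line(aes(y = weight / 100, colour = "Weight", group = 1)) +
  scale_y_continuous("Height", sec.axis = sec_axis(~ . * 100, name = "Weight")) +
  facet_wrap(~set, scales = "free")

built <- ggplot_build(p)
```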

R ggplot2 exponential regression with R² and p

I am trying to do an exponential regression in ggplot2. So first my script:
g <- ggplot(data, aes(x=datax, y=datay), color="black") +
geom_point(shape=1) + stat_smooth(method = 'nls', formula = y~a*exp(b*x), aes(colour = 'Exponential'), se = FALSE)
g <- g + theme_classic()
g <- g + theme(panel.grid.major=element_blank())
g <- g + theme(panel.grid.minor=element_blank())
g <- g + theme(axis.line.x=element_line(color="black"),
axis.line.y=element_line(color="black"),
panel.border=element_blank(),
panel.background=element_blank())
g <- g + labs(x="\ndatax",y="datay\n")
g <- g + theme(axis.text.y=element_text(size=14))
g <- g + theme(axis.text.x=element_text(size=14))
g <- g + theme(axis.title.y=element_text(size=18,vjust=1))
g <- g + theme(axis.title.x=element_text(size=18,vjust=1))
g
This is the image that I got
As an R beginner I wrote the script by mixing scripts of my own with ones from the internet. I always get the following error:
"In (function (formula, data = parent.frame(), start, control = nls.control(), : No starting values specified for some parameters.
Initializing ‘a’, ‘b’ to '1.'.Consider specifying 'start' or using a selfStart model"
I have not found a better way to draw the exponential graph yet.
In addition, I would like to change the colour of the graph to black, delete the legend, and show the R² and p value in the graph (maybe the confidence intervals as well?).
It's not easy to answer without a reproducible example and with so many questions at once.
Are you sure the message you reported is an error and not a warning? On my own laptop, with the 'iris' dataset, I got a warning.
However, as you can read on the ?nls help page, you should supply initial values through the "start" argument to help the fit converge. If you don't, nls() falls back to default starting values (in your case, a and b are set to 1).
You could try something like this:
g <- ggplot(data, aes(x=datax, y=datay), color="black") +
  geom_point(shape=1) +
  stat_smooth(method = 'nls',
              method.args = list(start = c(a=1, b=1)),
              formula = y~a*exp(b*x), colour = 'black', se = FALSE)
In your code the colour was mapped to the string "Exponential" inside aes(), which creates a legend entry; setting colour = 'black' outside aes() draws the curve in black and drops the legend. I tried this with the built-in 'iris' dataset and it worked.
Note that I passed the start parameter as an element of the list given to 'method.args': this is a new feature in ggplot2 v2.0.0.
Hope this helps
Edit:
Just for completeness, here is the code I ran on my laptop with a built-in dataset (note that an exponential fit makes no sense for this dataset, but the code runs without a warning):
library(ggplot2)
data('iris')
g1 <- ggplot(data=iris, aes(x=Sepal.Length, y=Sepal.Width)) +
  geom_point(color='green') +
  geom_smooth(method = 'nls',
              method.args = list(start=c(a=1, b=1)), se = FALSE,
              formula = y~a*exp(b*x), colour='black')
g1
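The question also asked for R² and p values, which stat_smooth() does not report. One possible approach is to fit the same model with nls() directly and annotate the plot. The sketch below uses simulated data (datax and datay are made up here) and takes starting values from a linear fit on the log scale, which is a common heuristic rather than anything required by nls(). The R² computed this way is a pseudo R² (1 - RSS/TSS), not the classical R² of a linear model.

```r
library(ggplot2)

set.seed(42)
# Simulated data following y = a * exp(b * x) with multiplicative noise
d <- data.frame(datax = seq(1, 10, length.out = 40))
d$datay <- 2 * exp(0.3 * d$datax) * exp(rnorm(40, sd = 0.1))

# Starting values from a linear model on the log scale
lin <- lm(log(datay) ~ datax, data = d)
start <- list(a = exp(coef(lin)[[1]]), b = coef(lin)[[2]])

fit <- nls(datay ~ a * exp(b * datax), data = d, start = start)

# Pseudo R-squared and coefficient p-values from the nls summary
r2 <- 1 - sum(residuals(fit)^2) / sum((d$datay - mean(d$datay))^2)
pvals <- summary(fit)$coefficients[, "Pr(>|t|)"]

g <- ggplot(d, aes(datax, datay)) +
  geom_point(shape = 1) +
  stat_smooth(method = "nls", formula = y ~ a * exp(b * x),
              method.args = list(start = start),
              se = FALSE, colour = "black") +
  annotate("text", x = min(d$datax), y = max(d$datay), hjust = 0,
           label = sprintf("R^2 = %.3f", r2))
```

Confidence bands are not available from stat_smooth() for nls fits, which is why se = FALSE is set; they would have to be computed separately.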

Resources