error with ggplot2 pie chart - r

I'm using the following code to create a figure with ggplot2 that contains two pie charts next to one another: one for each value of "MorT". Each pie chart needs to show the proportions for each "Model". Here is my code:
library(ggplot2)
library(sqldf)
df <- data.frame("MorT" = c(1,2,1,2), "Model" = c(1,1,2,2),
"Values" = c(length(outOfTime1withIns[,1]),
length(outOfMem1withIns[,1]),
length(outOfTime1noIns[,1]),
length(outOfMem1noIns[,1])))
df=sqldf("select Values,
CASE WHEN MorT==1 THEN 'Insuficient Time'
WHEN MorT==2 THEN 'Insuficient Memory'
END MorT,
CASE WHEN Model==1 THEN '1) FSM1 with Insertion Dominance'
WHEN Model==2 THEN '2) FSM1 without Insertion Dominance'
END Model from df")
p = ggplot(data=df,
aes(x=factor(1),
y=Summary,
fill = factor(Model)
)
)
I get the following error when I run the df=sqldf("select ...") statement:
Error in sqliteExecStatement(con, statement, bind.data) :
RS-DBI driver: (error in statement: near "Values": syntax error)
And of course p is empty. I get
Error: No layers in plot
If I try to call it.
Any help will be very much appreciated! Thanks

'Values' is a keyword in SQL, so you can't use it as a bare column name. Rename the column in the data frame to 'value' or something else, and use that name in the select statement. That should sort out the SQL error.
It looks like you're following the example on http://www.r-chart.com/2010/07/pie-charts-in-ggplot2.html.
Firstly, you have y = Summary in your ggplot call. Your data frame has no Summary column, so that needs to be updated to 'value' for your code.
Next, there seemed to be an issue with the data you're using (I don't have outOfMem1noIns, so I made test data), but you should make sure the values for each MorT sum to 1.
Then the code as it is on the tutorial page should work (maybe with some warning messages...).

The SQL statement has a syntax error, as the error states. In addition, the ggplot2 error comes from the fact that you have not added a geometry, e.g. geom_point:
p = ggplot(data=df,
aes(x=factor(1),
y=Summary,
fill = factor(Model)
)
) + geom_point()
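Putting the two answers together, here is a minimal sketch of the two side-by-side pies. It is only an illustration under some assumptions: the counts are made up (the original outOfTime1withIns etc. objects are not available), the labels are written directly instead of going through sqldf, and the prop column is a helper name introduced here. It uses geom_bar with coord_polar rather than geom_point, since that is the usual way to draw a pie in ggplot2.

```r
library(ggplot2)

# Hypothetical counts standing in for length(outOfTime1withIns[, 1]) etc.
df <- data.frame(
  MorT  = c("Insufficient Time", "Insufficient Memory",
            "Insufficient Time", "Insufficient Memory"),
  Model = c("1) FSM1 with Insertion Dominance",
            "1) FSM1 with Insertion Dominance",
            "2) FSM1 without Insertion Dominance",
            "2) FSM1 without Insertion Dominance"),
  value = c(30, 10, 70, 90)  # named 'value' to avoid the SQL keyword 'Values'
)

# Convert counts to proportions within each MorT group so each pie sums to 1.
df$prop <- df$value / ave(df$value, df$MorT, FUN = sum)

p <- ggplot(df, aes(x = factor(1), y = prop, fill = Model)) +
  geom_bar(stat = "identity", width = 1) +  # the layer the original plot lacked
  coord_polar(theta = "y") +                # bend the stacked bars into pies
  facet_wrap(~ MorT) +                      # one pie per MorT value
  labs(x = NULL, y = NULL)
print(p)
```

If you keep the sqldf step, the same idea applies: select the renamed value column and then compute the proportions before plotting.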

Related

Histogram runs into object not found error

I am a student who is quite new to R and am having difficulty linking my dataset to the actual workspace. In particular, I am trying to create a histogram to show what life expectancy looks like across zipcodes of a state, but nothing is showing up.
This is what my code looks like:
install.packages("ggplot2")
library(ggplot2)
ggplot(data = df_mo, aes(x = life_expectancy)) + geom_histogram(color = "tomato")
Here is what my error message in the console states:
>ggplot(data = df_mo, aes(x = life_expectancy)) + geom_histogram(color = "tomato")
Error in FUN(X[[i]], ...) : object 'life_expectancy' not found
>
Here is what my dataset looks like:
This may be quite an elementary problem, I imagine, but I don't have a clue and have been at this for an hour. I've tried to look this problem up, but everything I've seen has some additional bells and whistles added to the code or is about a different error message.
Thank you in advance.
The problem is that the data frame doesn't have the column names that you want. As shown in the picture, the names are V1, V2, ..., and the real names are sitting in the first row. Something like:
colnames(df_mo) = df_mo[1,]
df_mo = df_mo[-1,]
should do the trick. You should also reconsider the way you are loading the data into R, so that it uses the first line as column names.
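As a rough illustration of loading the data so the first line becomes the header, here is a small sketch. The CSV text and the zipcode column are made up for the example; only life_expectancy comes from the question. One thing to be aware of: when the header row is read in as data, the columns usually end up as character, so promoting the first row to names afterwards still leaves life_expectancy needing a conversion with as.numeric. Reading with a header avoids that.

```r
# Hypothetical CSV content standing in for the real file.
csv_text <- "zipcode,life_expectancy
63101,74.2
63102,78.9
63103,76.5"

# header = TRUE uses the first line as column names and lets R infer
# numeric types for the remaining rows. (It is the default for read.csv,
# but not for read.table.)
df_mo <- read.csv(text = csv_text, header = TRUE)

names(df_mo)
is.numeric(df_mo$life_expectancy)
```

With a data frame loaded like this, the original ggplot(data = df_mo, aes(x = life_expectancy)) + geom_histogram() call can find the column.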

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
geom_point(mapping = aes(x=Species, y = cluster, colour = cluster),
position = "jitter") +
labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.

Using the QQ Plot functionality in ggplot

I'm brand new to R, and have a data frame with 8 columns that holds daily changes in interest rates. I can plot QQ plots for each of the 8 columns using the following code:
par(mfrow = c(2,4))
for(i in 1:length(column_names)){
qqnorm(deltaIR.df[,i],main = column_names[i], pch = 16, cex = .5)
qqline(deltaIR.df[,i],cex = .5)
}
I'd like now to use the stat_qq function in the ggplot2 package to do this more elegantly, but just can't get my arms around the syntax - I keep getting it wrong. Would someone kindly help me translate the above code to use ggplot and allow me to view my 8 QQ plots on one page with an appropriate header? Trying the obvious
ggplot(deltaIR.df) + stat_qq(sample = columns[i])
gets me only an error message
Warning: Ignoring unknown parameters: sample
Error: stat_qq requires the following missing aesthetics: sample
and adding in the aesthetics
ggplot(deltaIR.df, aes(column_names)) + stat_qq()
is no better. The error message just changes to
Error: Aesthetics must be either length 1 or the same as the data (5271)
In short, nothing I have done so far (even with Google's assistance) has got me closer to a solution. May I ask for guidance?
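One possible approach, offered as a sketch rather than a definitive answer: stat_qq expects sample to be an aesthetic mapped to a column inside aes(), so the usual route is to reshape the data to long form and facet by the original column name. The data below is simulated, and the rate_1 to rate_8 names and the title are placeholders, since the real deltaIR.df is not shown.

```r
library(ggplot2)

# Hypothetical stand-in for deltaIR.df: 8 numeric columns of daily changes.
set.seed(1)
deltaIR.df <- as.data.frame(matrix(rnorm(8 * 200), ncol = 8))
names(deltaIR.df) <- paste0("rate_", 1:8)

# stack() reshapes wide to long: 'values' holds the numbers and
# 'ind' holds the name of the column each number came from.
long <- stack(deltaIR.df)

p <- ggplot(long, aes(sample = values)) +
  stat_qq(size = 0.5) +                    # points, like qqnorm()
  stat_qq_line() +                         # reference line, like qqline()
  facet_wrap(~ ind, nrow = 2, scales = "free_y") +  # 2 x 4 grid of panels
  labs(title = "Normal QQ plots of daily rate changes")
print(p)
```

The facet strips play the role of the per-panel main titles in the base R loop, and labs(title = ...) gives the page a single header.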

Specifying the `stratum` and `alluvium` parameters without attaching ggalluvial

I use ggalluvial with ggplot2, but I'd like to be able to generate the same plot without attaching ggalluvial, calling its functions only through ggalluvial::. If the package is not attached, I get the following error: Error: Can't find stat called "stratum".
d <- data.frame(
status = rep(c("state1","state2","state3"), rep(90, times=3)),
cellIndex = rep(seq_len(90), times=3),
cellCategory = c(rep(letters[seq_len(3)], each=30),
rep(letters[c(2,3,1)], each=30),
rep(letters[c(3,1,2)], each=30))
)
ggplot2::ggplot(data=d, ggplot2::aes(x=status, stratum=cellCategory, alluvium=cellIndex,
fill=cellCategory, label=cellCategory)) +
ggalluvial::geom_flow(stat="alluvium", lode.guidance="rightleft", color="darkgray") +
ggalluvial::geom_stratum() +
ggplot2::geom_text(stat="stratum", size=3)
This was a tough one. Digging into the code for ggplot2, the stat argument takes the string you give, builds an object name from it (in this case "StatStratum"), and then looks for that object in the environment you're in. Because you don't want to attach the package, it won't be able to find it (and there's no way to change the argument itself).
Answer
So you need to save that object from the ggalluvial package like so:
StatStratum <- ggalluvial::StatStratum
Then leave the rest of your code as is.
The following worked for me.
ggplot2::geom_text(stat = ggalluvial::StatStratum)

Cannot save plots as pdf when ggplot function is called inside a function

I am going to plot a boxplot from a 4-column matrix pl1 using ggplot with dots on each box. The instruction for plotting is like this:
p1 <- ggplot(pl1, aes(x=factor(Edge_n), y=get(make.names(y_label)), ymax=max(get(make.names(y_label)))*1.05))+
geom_boxplot(aes(fill=method), outlier.shape= NA)+
theme(text = element_text(size=20), aspect.ratio=1)+
xlab("Number of edges")+
ylab(y_label)+
scale_fill_manual(values=color_box)+
geom_point(aes(x=factor(Edge_n), y=get(make.names(true_des)), ymax=max(get(make.names(true_des)))*1.05, color=method),
position = position_dodge(width=0.75))+
scale_color_manual(values=color_pnt)
Then, I use print(p1) to print it on an opened pdf. However, this does not work for me and I get the below error:
Error in make.names(true_des) : object 'true_des' not found
Can anyone help?
Your example is not very clear: you give a call but don't show the values of your variables, so it's hard to figure out what you're trying to do. For instance, is method the name of a column in the data frame pl1, or is it a variable? And if it's a variable, what is its type: a string or a name?
Nonetheless, here's an example that should help set you on the way to doing what you want:
pl1 <- data.frame(Edge_n = sample(5, 20, TRUE), foo = rnorm(20), bar = rnorm(20))
y_label <- 'foo'
ax <- do.call(aes, list(
x=quote(factor(Edge_n)),
y=as.name(y_label),
ymax = substitute(max(y)*1.05, list(y=as.name(y_label)))))
p1 <- ggplot(pl1) + geom_boxplot(ax)
print(p1)
This should get you started on figuring out the rest of what you're trying to do.
Alternatively (a different interpretation of your question), you may be running into a problem with the environment in which aes evaluates its arguments. See https://github.com/hadley/ggplot2/issues/743 for details. If this is the issue, then the answer might be to override the default value of the environment argument to aes, for instance: aes(x=factor(Edge_n), y=get(make.names(y_label)), ymax=max(get(make.names(y_label)))*1.05, environment=environment())
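For more recent ggplot2 versions (3.0.0 and later, which use tidy evaluation), a sketch that sidesteps the environment question is to look the column up by name with the .data pronoun. The make_boxplot helper and the simulated pl1 below are hypothetical, just to show the plot being built inside a function and printed to a PDF.

```r
library(ggplot2)

# Hypothetical helper: builds the boxplot for the column named in y_label.
make_boxplot <- function(data, y_label) {
  # .data[[y_label]] finds the column by its string name inside the data,
  # so it does not matter which environment aes() is evaluated in.
  ggplot(data, aes(x = factor(Edge_n), y = .data[[y_label]])) +
    geom_boxplot() +
    xlab("Number of edges") +
    ylab(y_label)
}

# Simulated data in the same shape as the earlier example.
set.seed(42)
pl1 <- data.frame(Edge_n = sample(5, 20, TRUE), foo = rnorm(20), bar = rnorm(20))
p1 <- make_boxplot(pl1, "foo")

# Inside functions and scripts the plot must be printed explicitly.
out_file <- file.path(tempdir(), "boxplot.pdf")
pdf(out_file)
print(p1)
dev.off()
```

The same pattern should extend to the geom_point layer in the question, e.g. y = .data[[true_des]], provided true_des is passed into the function as an argument.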
