Error while plotting with ggsave and other save functions - r

I have a problem with the function ggsave() and would be grateful for any help or suggestions. I am creating four plots and combining them into one big plot. Since I want to loop the whole function over all columns in my data frame, I want to save the created plots in a specified folder (preferably with an identifying name).
plotting_fun3 <- function(Q){
plot1 <- plot_likert(
t(Q),
title = "Total population",
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3,
)
plot2 <- plot_likert(
t(Q[survey$animal=="Dogs"]),
title = "Female",
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3,
)
plot3 <- plot_likert(
t(Q[survey$animal=="Cats"]),
title = "Male",
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3,
)
plot4 <- plot_likert(
t(Q[survey$animal=="Turtle"]),
title = "Others",
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3,
)
theplot <- ggarrange(plot1, plot2, plot3, plot4,
labels = NULL,
common.legend = TRUE,
legend = "bottom",
ncol = 1, nrow = 4)
#ggsave(filename=paste(Q,".png",sep=""), plot=theplot, device = "png")
#ggsave(filename=paste("animal_plot", ID, ".jpeg"), plot=plots[[x]])
#ggsave(path = "/myDirectory",
# device = "png", filename = "animal_plot", plot = theplot)
#save_plot(filename = "hello", plot = theplot,
# "/myDirectory",
# device = "png")
#ggsave(sprintf("%s.pdf", Q), device = "pdf")
return(theplot)
}
The commented lines show the ways I have tried to save the plot in my directory. I encounter two different problems:
First: most of the ggsave suggestions I found on Stack Overflow did not include the argument device = "png". If I leave it out, I always get something like this:
Error: `device` must be NULL, a string or a function.
Run `rlang::last_error()` to see where the error occurred.
If I follow that command I get:
<error/rlang_error>
`device` must be NULL, a string or a function.
Backtrace:
1. global::plotting_fun3(survey[, 9])
2. ggplot2::ggsave(sprintf("%s.pdf", Q))
3. ggplot2:::plot_dev(device, filename, dpi = dpi)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/rlang_error>
`device` must be NULL, a string or a function.
Backtrace:
█
1. └─global::plotting_fun3(survey[, 9])
2. └─ggplot2::ggsave(sprintf("%s.pdf", Q))
3. └─ggplot2:::plot_dev(device, filename, dpi = dpi)
So online I found people with the same or similar problem and the suggestion has always been to use device = "png" or similar.
Now if I do this I encounter a different problem:
The plots are saved in the right directory but the name is wrong. Usually the name is "3.png" or "3.pdf", depending on the device I use. If "3.png" already exists, the file gets another number.
I had this problem in an older project three months ago and couldn't solve it and now I have it again.
For what it's worth, I use macOS Mojave 10.14.6, my R version is Version 1.3.1093
Thank you in advance for any thoughts, suggestions or other comments.
[EDIT]
Here is some sample data:
> str(myDF[,c(2,9:10)])
'data.frame': 123 obs. of 3 variables:
$ animal: chr "Cats" "Cats" "Turtles" "Cats" ...
$ q8 : int 3 5 5 3 4 4 2 5 3 5 ...
$ q9.1 : int 4 5 5 4 3 4 2 4 2 4 ...
The values stay between 1 and 5 for all observations. They actually represent answers such as "strongly agree", "agree", "neither agree nor disagree"...etc.
Alternatively, if you prefer this to the other one:
> myDF[,c(2,9:10)]
animal q8 q9.1
1 Cats 3 4
2 Cats 5 5
3 Turtles 5 5
4 Cats 3 4
5 Turtles 4 3
6 Turtles 4 4
7 Turtles 2 2
8 Cats 5 4
9 Cats 3 2
10 Turtles 5 4
11 Turtles 4 3
12 Turtles 3 3
13 Dogs 3 3
14 Cats 3 3
15 Dogs 1 1
16 Dogs 1 3

The file name problem arises because you use Q, which holds the data itself (a data frame or a column of values), in the file name definition. paste() then builds the name from the data values rather than from a single label, and what happens next depends on how ggsave and your system handle the result.
# This call produces one long string per column of Q (or one string per value if Q is a vector).
# With 4 columns you get 4 long strings, and ggsave will return the error
# Error: `device` must be NULL, a string or a function.
ggsave(filename=paste(Q,".png",sep=""), plot=theplot, device = "png")
# Again, it is not clear what ID is here, but if it is a data frame you get
# the same error as with the previous call.
ggsave(filename=paste("animal_plot", ID, ".jpeg"), plot=plots[[x]])
# This one does not specify a usable file name (there is no extension), only a directory.
# ggsave will return an error:
# Error: Unknown graphics device ''
# If you specify device = "png", the error will be:
# Error in grid.newpage() : could not open file '/home/sinh'
ggsave(path = "/myDirectory",
device = "png", filename = "animal_plot", plot = theplot)
# Why is there a parameter "/myDirectory" here? You should also specify the extension
# in the file name. So the correct parameter is:
# filename = "/myDirectory/hello.png"
save_plot(filename = "hello", plot = theplot,
"/myDirectory",
device = "png")
Here is one that should work properly, but you need to supply the file name manually:
character_variable <- "my_character_variable_here_"
index_number <- 20
# paste0() is equivalent to paste(..., sep = "")
file_name <- paste0(character_variable, index_number)
# Build the name without a space before the extension, and match the device to it
ggsave(filename = paste0(file_name, ".jpeg"), plot = plots[[x]], device = "jpeg")
And here is my rewrite of your function. You may try it out and tweak it a bit.
# df is your survey data.frame
# q_column_name is the name of the questionnaire column that you want to graph.
# The final output file will use q_column_name as its file name.
plotting_fun3 <- function(df, q_column_name){
require(foreach)
require(dplyr)
require(tidyr)
graph_data <- df %>% select(one_of("animal", q_column_name))
plot1 <- plot_likert(
t(graph_data),
title = "Total population",
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3
)
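# The values must match the animal column exactly (the sample data has "Dogs", "Cats", "Turtles")
animal_plots <- foreach(current_animal = c("Dogs", "Cats", "Turtles")) %do% {
plot_likert(
t(graph_data %>% filter(animal == current_animal)),
title = current_animal,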
legend.labels = c("strongly disagree","disagree", "neither nor", "agree", "strongly agree"),
grid.range = c(1.6, 1.1),
expand.grid = FALSE,
axis.labels = c(" "),
values = "sum.outside",
show.prc.sign = TRUE,
catcount = 4,
cat.neutral = 3
)
}
theplot <- ggarrange(plot1, animal_plots[[1]],
animal_plots[[2]], animal_plots[[3]],
labels = NULL,
common.legend = TRUE,
legend = "bottom",
ncol = 1, nrow = 4)
ggsave(filename=paste(q_column_name, ".png",sep=""), plot=theplot, device = "png")
return(theplot)
}
Here is how to use the function
# Assume that your survey dataframe variable is myDF
my_new_plot <- plotting_fun3(df = myDF, q_column_name = "q8")
[Updated] - Added the function to solve the graph issue.
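To cover every question column, the same idea can be put in a loop over column names. The following is only a sketch under assumptions: the demo_df data frame, the out_dir folder and the save_one() helper are hypothetical, and save_one() uses base graphics with the pdf() device as a stand-in so the example runs without sjPlot or ggpubr. In practice its body would be replaced by a call to plotting_fun3(df, q_column_name).

```r
# Hypothetical demo data shaped like the sample in the question
demo_df <- data.frame(
  animal = c("Cats", "Cats", "Turtles", "Dogs"),
  q8     = c(3L, 5L, 5L, 1L),
  q9.1   = c(4L, 5L, 5L, 3L)
)

# Hypothetical output folder; a temporary directory keeps the example harmless
out_dir <- file.path(tempdir(), "animal_plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Stand-in for plotting_fun3(): draws one column and saves it under the column's name
save_one <- function(df, q_column_name, dir) {
  out_file <- file.path(dir, paste0(q_column_name, ".pdf"))
  pdf(out_file)
  on.exit(dev.off())
  barplot(table(df[[q_column_name]]), main = q_column_name)
  out_file
}

# Loop over the names of the question columns, never over the columns themselves
q_cols <- setdiff(names(demo_df), "animal")
out_files <- vapply(q_cols, save_one, character(1), df = demo_df, dir = out_dir)
```

The point of the design is that the file name is always built from a single column name, so each call receives exactly one string.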

For anyone encountering a similar problem in a less complicated setting: double check if you are passing a valid path/filename string to ggsave()!
In my case, I made a mistake with string processing using str_split() and path concatenation. Therefore, ggsave() was not given a valid single-string path, prompting the error `device` must be NULL, a string or a function. The error had nothing to do with my device; I was simply not passing a proper string. Once I fixed the path, the issue was solved.
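A quick way to check for this is to look at the length of the string passed as filename. The sketch below is a minimal illustration, not the asker's code: q_values is a hypothetical stand-in for the column passed as Q (the first q8 values from the sample data), and is_valid_filename() is a made-up helper. Pasting a vector of values yields one file name per value, and the first of them is "3.png", which would be consistent with the file name reported in the question.

```r
# Hypothetical stand-in for the column passed as Q (first q8 values from the sample data)
q_values <- c(3L, 5L, 5L, 3L)

# Pasting the values gives one file name per value, not one name for the plot
bad_name <- paste(q_values, ".png", sep = "")
length(bad_name)   # more than one element
bad_name[1]        # built from the first data value

# Pasting the column name gives a single file name
good_name <- paste0("q8", ".png")

# A cheap guard (hypothetical helper) to run before calling ggsave(filename = ...)
is_valid_filename <- function(x) is.character(x) && length(x) == 1L && !is.na(x)
is_valid_filename(bad_name)
is_valid_filename(good_name)
```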

Related

How do I create barplots with categories instead of numbers?

I'm just getting started in R and I'm trying to wrap my head around barplot for a university assignment. Specifically, I am using the General Social Survey 2018 dataset (for codebook: https://www.thearda.com/Archive/Files/Codebooks/GSS2018_CB.asp) and I am trying to figure out if religion has any effect on the way people seek out help for mental health. I want to use reliten (self-assessment of religiousness - from strong to no religion) as the IV and tlkclrgy, (asks if a person with mental health issues should reach out to a religious leader - yes or no) as the DV. For a better visualization of the data, I want to create a side-by-side barplot with reliten on the x-axis and see how many people answered yes and no on tlkclrgy. My problem is that on the barplot I get numbers instead of categories (from strong to no religion). This is what I tried, but I keep getting NA on the x-axis:
GSS$reliten <- factor(as.character(GSS$reliten),
levels = c("No religion", "Somewhat
strong", "Not very strong",
"Strong"))
GSS <- GSS18[!GSS18$tlkclrgy %in% c(0, 8, 9),]
GSS$reliten <- as_factor(GSS$reliten)
GSS$tlkclrgy <- as_factor(GSS$tlkclrgy)
ggplot(data=GSS,mapping=aes(x=reliten,fill=tlkclrgy))+
geom_bar(position="dodge")
Does anybody have any tips?
Here is complete code to download the codebook and data, table the two columns of interest and plot the frequencies.
1. Read the data
Data will be downloaded to a temporary directory, to keep my disk tidy. Use of these first two instructions is optional.
od <- getwd()
setwd("~/Temp")
These are the links to the two files that need to be read and the filenames.
cols_url <- "https://osf.io/ydxu4/download"
cols_file <- "General Social Survey, 2018.col"
data_url <- "https://osf.io/e76rv/download"
data_file <- "General Social Survey, 2018.dat"
download.file(cols_url, cols_file, mode = "wb")
download.file(data_url, data_file, mode = "wb")
Now read in the codebook and process it, extracting the column widths and column names.
cols <- readLines(cols_file)
cols <- strsplit(cols, ": ")
widths_char <- sapply(cols, '[', 2)
i_widths <- grepl("-", widths_char)
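# Width of a "start-end" range: "1-4" evaluates to -3, so negate and add 1 to get 4
f <- function(x) -eval(parse(text = x)) + 1L
widths <- rep(1L, length(widths_char))
# Apply f to each range string separately; eval(parse()) on a whole vector returns only the last value
widths[i_widths] <- sapply(widths_char[i_widths], f)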
col_names <- sapply(cols, '[', 1)
col_names <- trimws(sub("^.[^ ]* ", "", col_names))
col_names <- tolower(col_names)
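As a small illustration of the width calculation, the sketch below parses a few range strings without eval(parse()). The strings in ranges_demo are made-up examples of the "start-end" format, not lines from the actual codebook file, and width_from_range() is a hypothetical helper.

```r
# Hypothetical helper: number of characters covered by "start-end" (or 1 for a single position)
width_from_range <- function(x) {
  bounds <- as.integer(strsplit(x, "-", fixed = TRUE)[[1]])
  if (length(bounds) == 1L) 1L else bounds[2] - bounds[1] + 1L
}

# Made-up range strings in the same format as the codebook's column positions
ranges_demo <- c("1-4", "5", "6-7")
widths_demo <- vapply(ranges_demo, width_from_range, integer(1), USE.NAMES = FALSE)
widths_demo
```

Splitting on the hyphen avoids evaluating text from a downloaded file as R code, which is a reasonable precaution even when the source is trusted.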
Finally, read the fixed width text file.
df1 <- read.fwf(data_file, widths = widths, header = FALSE, na.strings = "-", col.names = col_names)
2. Table the data
Find out where are the two columns we want with grep.
i_cols <- c(
grep("reliten", col_names, ignore.case = TRUE),
grep("tlkclrgy", col_names, ignore.case = TRUE)
)
head(df1[i_cols])
Table those columns and coerce to data.frame. Then coerce the columns to factor.
There is a problem here: the published codebook has no answer 3 for tlkclrgy, but the data file contains answers coded 3. So I have created an extra factor level.
GSS <- as.data.frame(table(df1[i_cols]))
labels_reliten <- c(
"Not applicable",
"Strong",
"Not very strong",
"Somewhat Strong",
"No religion",
"Don't know",
"No answer"
)
levels_reliten <- c(0, 1, 2, 3, 4, 8, 9)
labels_tlkclrgy <- c(
"Not applicable",
"Yes",
"No",
"Not in codebook",
"Don't know",
"No answer"
)
levels_tlkclrgy <- c(0, 1, 2, 3, 8, 9)
GSS$reliten <- factor(
GSS$reliten,
labels = labels_reliten,
levels = levels_reliten
)
GSS$tlkclrgy <- factor(
GSS$tlkclrgy,
labels = labels_tlkclrgy,
levels = levels_tlkclrgy
)
3. Plot the frequencies table
library(ggplot2)
ggplot(data = GSS, mapping = aes(x = reliten, y = Freq, fill = tlkclrgy)) +
geom_col(position = "dodge")
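The NA values on the x-axis in the question likely come from passing the text labels as levels while the column holds numeric codes. The sketch below is a minimal illustration with a made-up codes vector; the code-to-label mapping is the one used in labels_reliten above.

```r
# Made-up numeric codes as they appear in the raw data
codes <- c(1, 2, 3, 4, 1)

# levels = must list the values present in the data; text labels match nothing, so every entry is NA
wrong <- factor(as.character(codes),
                levels = c("No religion", "Somewhat strong", "Not very strong", "Strong"))

# levels = the codes, labels = the text to display for each code
right <- factor(codes,
                levels = c(1, 2, 3, 4),
                labels = c("Strong", "Not very strong", "Somewhat strong", "No religion"))

table(wrong, useNA = "ifany")
table(right)
```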

labelling specific data points in graph R without ggplot

I am trying to label the data points which are shaded in the plot.
Here is my sample data :
genes logFC PValue
1 Arhgap8 -5.492152 2.479473e-99
2 Asns -2.519970 2.731718e-93
3 Bmp4 -1.663583 4.767201e-72
4 Casp1 -1.650139 2.212689e-25
5 Ctgf -1.272772 1.000103e-61
6 Eya4 -2.328052 2.077364e-68
my plot code till now :
plot(sample$logFC,-log10(as.numeric(sample$PValue)),pch = 20,xlab = 'Log2 FoldChange',ylab = '-Log10 p-value',col = 'blue',xlim = c(-10,8),ylim = c(0,300),cex.lab=1.5, cex.axis=1.5)
points(sample$logFC,-log10(as.numeric(sample$PValue)),col = "dark green")
with(subset(sample,genes=='Arhgap8'),points(logFC,-log10(as.numeric(PValue)),pch = 20, col="orange"))
I have tried using the command below, which includes text(), but it doesn't show the label.
with(subset(sample,genes=='Arhgap8'),points(logFC,-log10(as.numeric(PValue)),pch = 20, col="violet"),text(sample,labels = sample$genes,cex = 0.9,pos = 4))
The correct way to use with() to run two commands would be
with(subset(sample, genes=='Arhgap8'), {
points(logFC, -log10(as.numeric(PValue)), pch = 20, col="violet")
text(logFC, -log10(as.numeric(PValue)), labels = genes, cex = 0.9, pos = 4)
})
Any extra arguments passed to with() after the expression are silently ignored. For example, the stop() call below never runs:
with(iris, mean(Sepal.Length), stop("not run"))
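The same pattern extends to several labelled points at once. The sketch below is self-contained and uses the sample rows from the question; genes_df, the neg_log_p column and the to_label vector are names chosen here for illustration, and pdf(NULL) is used only so the code runs without opening a window.

```r
# Sample rows from the question
genes_df <- data.frame(
  genes  = c("Arhgap8", "Asns", "Bmp4", "Casp1", "Ctgf", "Eya4"),
  logFC  = c(-5.492152, -2.519970, -1.663583, -1.650139, -1.272772, -2.328052),
  PValue = c(2.479473e-99, 2.731718e-93, 4.767201e-72, 2.212689e-25, 1.000103e-61, 2.077364e-68)
)
genes_df$neg_log_p <- -log10(genes_df$PValue)

# Hypothetical choice of genes to highlight and label
to_label <- c("Arhgap8", "Asns")
highlighted <- subset(genes_df, genes %in% to_label)

pdf(NULL)  # null device: nothing is written to disk
plot(genes_df$logFC, genes_df$neg_log_p, pch = 20, col = "blue",
     xlab = "Log2 FoldChange", ylab = "-Log10 p-value")
# Both calls sit inside one braced expression, so both are evaluated
with(highlighted, {
  points(logFC, neg_log_p, pch = 20, col = "orange")
  text(logFC, neg_log_p, labels = genes, cex = 0.9, pos = 4)
})
invisible(dev.off())
```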

Display a fraction in a dynamically changing plot

I'm wondering how I can actually display a conditional calculation (see below) I'm using to obtain what I call GG in my R codes (see codes below)?
Details:
To be precise, I want to display in my plot that GG = G1 / G2 (displayed IF type = 1) and ELSE GG = (G1 / G2 ) * 2 (displayed IF type == 2).
For example, if G1 = 1, G2 = 2, what I want to be shown depending on "type" is one of the following:
Note: In the picture above, I used "X 2" out of hurry, but I need an actual x-like mathematical sign for the multiplication sign.
Below is my R code:
G1 = 1 ## Can be any other number, comes from a function
G2 = 2 ## Can be any other number, comes from a function
type = 1 ## Can be either 1 or 2
plot(1:10,ty="n",ann=F, bty="n")
text( 6, 6, expression(paste("GG = ", ifelse(type==1, frac(G1, G2), frac(G1, G2)*2 ), " = ",
G1/G2, sep = "")), col = "red4", cex = 1.6 )
With bquote, use .() to insert dynamic values into the expression:
type = 1
plot(1:10, ty = "n", ann = F, bty = "n")
text(6, 6,
bquote(paste("GG = ", frac(.(G1), .(G2)),
.(ifelse(type == 2, " × 2", "")),
" = ",
.(ifelse(type == 2, G1/G2 * 2, G1/G2)), sep = "")),
col = "red4", cex = 1.6)
Note: the multiplication sign (×) is not an ASCII character; on Windows it can be typed with the Alt code Alt+0215.
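To see why the question's expression() call shows the symbols rather than their values, it may help to compare the unevaluated calls directly. This is a minimal sketch with the example values G1 = 1 and G2 = 2 from the question; quoted and filled are names chosen here.

```r
G1 <- 1
G2 <- 2

# quote() (like expression()) keeps the symbols G1 and G2 as they are
quoted <- quote(frac(G1, G2))

# bquote() evaluates whatever is wrapped in .() and splices the value in
filled <- bquote(frac(.(G1), .(G2)))

deparse(quoted)
deparse(filled)
```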
Awesome #zx8754, very nice.
I misunderstood your question and thought you wanted to print the formula and result so I produced this.
plot(1:10,ty="n",ann=F, bty="n")
text( 6, 6, if (type==1) substitute(paste("GG = ", frac(G1, G2)) == list(x), list(x=G1/G2)) else
substitute(paste("GG = ", frac(G1, G2), " x 2") == list(x), list(x=G1/G2 * 2)), col = "red4", cex = 1.6 )

Mapping nearest neighbours of a long-lat data set using ggmap, geom_point and a loop

My ultimate goal is to connect all nearest neighbours of a set of buildings (based on Euclidean distance) on a ggmap using geom_path from the ggplot2 package. I need help with a loop that will allow me to plot all neighbours as easily as possible
I have created a distance matrix (called 'kmnew') in kilometres between 3 types of building in Beijing: B (x2), D (x2) and L (x1):
B B D D L
B NA 6.599014 5.758531 6.285787 3.770175
B NA NA 7.141096 3.873296 5.092667
D NA NA NA 3.690725 2.563017
D NA NA NA NA 2.832083
L NA NA NA NA NA
I try to discern the nearest neighbours of each building by row by declaring a matrix and using a loop to ascertain the nearest neighbour building:
nn <- matrix(NA,nrow=5,ncol=1)
for (i in 1:nrow(kmnew)){
nn[i,] <- which.min(kmnew[i,])
}
This returns the following error (not sure why):
Error in nn[i, ] <- which.min(kmnew[i, ]) : replacement has length zero
but seems to return the correct answer to nn:
[,1]
[1,] 5
[2,] 4
[3,] 5
[4,] 5
[5,] NA
I append this to an original dataframe called newbjdata:
colbj <- cbind(newbjdata,nn)
that returns
Name Store sqft long lat nn
1 B 1 1200 116.4579 39.93921 5
2 B 2 750 116.3811 39.93312 4
3 D 1 550 116.4417 39.88882 5
4 D 2 600 116.4022 39.90222 5
5 L 1 1000 116.4333 39.91100 NA
I then retrieve my map via ggmap:
bjgmap <- get_map(location = c(lon = 116.407395,lat = 39.904211),
zoom = 13, scale = "auto",
maptype = "roadmap",
messaging = FALSE, urlonly = FALSE,
filename = "ggmaptemp", crop = TRUE,
color = "bw",
source = "google", api_key)
My ultimate goal is to map the nearest neighbours together in a plot using geom_path from the ggplot package.
For example, the nn of the 1st building of type B (row 1) is the 1 building of type L (row 5). Obviously I can draw this line by subsetting the said 2 rows of the dataframe thus:
ggmap(bjgmap) +
geom_point(data = colbj, aes(x = long,y = lat, fill = factor(Name)),
size =10, pch = 21, col = "white") +
geom_path(data = subset(colbj[c(1,5),]), aes(x = long,y = lat),col = "black")
However, I need a solution that works like a loop, and I can't figure out how one might achieve this, as I need to reference the nn column and refer that back to the long lat data n times. I can well believe that I am not using the most efficient method, so am open to alternatives. Any help much appreciated.
Here is my attempt. I used gcIntermediate() from the geosphere package to set up the lines. First, I needed to rearrange your data: gcIntermediate() needs departure and arrival long/lat, that is, four columns. To arrange your data this way, I used the dplyr package. mutate_each(colbj, funs(.[nn]), vars = long:lat) picks up the desired arrival long/lat: . stands for 'long' and 'lat' in turn, and [nn] indexes each of them by the nearest-neighbour row number. Then I employed gcIntermediate(), which creates a SpatialLines object. You need to make this object a SpatialLinesDataFrame and then convert it to a "normal" data.frame so that ggplot can read your data; fortify() does that job.
library(ggmap)
library(geosphere)
library(dplyr)
library(ggplot2)
### Arrange the data: set up departure and arrival long/lat
mutate_each(colbj, funs(.[nn]), vars = long:lat) %>%
rename(arr_long = vars1, arr_lat = vars2) %>%
filter(complete.cases(nn)) -> mydf
### Get line information
rts <- gcIntermediate(mydf[,c("long", "lat")],
mydf[,c("arr_long", "arr_lat")],
50,
breakAtDateLine = FALSE,
addStartEnd = TRUE,
sp = TRUE)
### Convert the routes to a data frame for ggplot use
rts <- as(rts, "SpatialLinesDataFrame")
rts.df <- fortify(rts)
### Get a map (borrowing the OP's code)
bjgmap <- get_map(location = c(lon = 116.407395,lat = 39.904211),
zoom = 13, scale = "auto",
maptype = "roadmap",
messaging = FALSE, urlonly = FALSE,
filename = "ggmaptemp", crop = TRUE,
color = "bw",
source = "google", api_key)
# Draw the map
ggmap(bjgmap) +
geom_point(data = colbj,aes(x = long, y = lat, fill = factor(Name)),
size = 10,pch = 21, col = "white") +
geom_path(data = rts.df, aes(x = long, y = lat, group = group),
col = "black")
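mutate_each() and funs() are deprecated in current dplyr releases, so the data-arrangement step may warn or fail depending on the installed version. The sketch below shows the same lookup in base R, assuming only the long, lat and nn columns from the DATA section; colbj_demo, mydf_demo and the arr_* column names are chosen here for illustration.

```r
# The relevant columns of colbj, copied from the DATA section
colbj_demo <- data.frame(
  Name = c("B", "B", "D", "D", "L"),
  long = c(116.4579, 116.3811, 116.4417, 116.4022, 116.4333),
  lat  = c(39.93921, 39.93312, 39.88882, 39.90222, 39.911),
  nn   = c(5L, 4L, 5L, 5L, NA)
)

# Index each coordinate column by the nearest-neighbour row number
colbj_demo$arr_long <- colbj_demo$long[colbj_demo$nn]
colbj_demo$arr_lat  <- colbj_demo$lat[colbj_demo$nn]

# Drop buildings without a neighbour, as filter(complete.cases(nn)) does
mydf_demo <- colbj_demo[!is.na(colbj_demo$nn), ]
mydf_demo
```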
EDIT
If you want to do all data manipulation in one sequence, the following is one way to go. foo is identical to rts.df above.
mutate_each(colbj, funs(.[nn]), vars = long:lat) %>%
rename(arr_long = vars1, arr_lat = vars2) %>%
filter(complete.cases(nn)) %>%
do(fortify(as(gcIntermediate(.[,c("long", "lat")],
.[,c("arr_long", "arr_lat")],
50,
breakAtDateLine = FALSE,
addStartEnd = TRUE,
sp = TRUE), "SpatialLinesDataFrame"))) -> foo
identical(rts.df, foo)
#[1] TRUE
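On the "replacement has length zero" error mentioned in the question: which.min() returns a zero-length result when every value in the row is NA, which is the case for the last row of the upper-triangular distance matrix. The sketch below rebuilds that matrix from the values shown in the question (kmnew_demo and nn_demo are names chosen here) and returns NA for such rows instead of failing.

```r
# Upper-triangular distance matrix from the question
kmnew_demo <- matrix(NA_real_, nrow = 5, ncol = 5)
kmnew_demo[1, 2:5] <- c(6.599014, 5.758531, 6.285787, 3.770175)
kmnew_demo[2, 3:5] <- c(7.141096, 3.873296, 5.092667)
kmnew_demo[3, 4:5] <- c(3.690725, 2.563017)
kmnew_demo[4, 5]   <- 2.832083

# which.min() on an all-NA row has length zero, so substitute NA explicitly
nn_demo <- vapply(seq_len(nrow(kmnew_demo)), function(i) {
  j <- which.min(kmnew_demo[i, ])
  if (length(j) == 0L) NA_integer_ else j
}, integer(1))
nn_demo
```

Because only the upper triangle is filled, each row is compared only with the buildings that come after it; filling in the lower triangle as well would be needed for a true nearest neighbour per building.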
DATA
colbj <- structure(list(Name = structure(c(1L, 1L, 2L, 2L, 3L), .Label = c("B",
"D", "L"), class = "factor"), Store = c(1L, 2L, 1L, 2L, 1L),
sqft = c(1200L, 750L, 550L, 600L, 1000L), long = c(116.4579,
116.3811, 116.4417, 116.4022, 116.4333), lat = c(39.93921,
39.93312, 39.88882, 39.90222, 39.911), nn = c(5L, 4L, 5L,
5L, NA)), .Names = c("Name", "Store", "sqft", "long", "lat",
"nn"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5"))

Plot concentric pie charts in r

I'm trying to do a concentric pie chart. The internal pie represent three classes of subjects and each class has to be splitted in 3 sub-classes (of course the slices for the sub-classes have to be in line with the corresponding internal slice).
this is what I tried:
layout(matrix(c(1,1,1,1,2,1,1,1,1), nrow=3)); pie(x=c(14,22,15,3,15,33,0,6,45),labels="",col=c("#f21c39","#dba814","#7309de")); pie(x=c(51,51,51),labels=c("O","VG","V"),col=c("#c64719","#0600f5","#089c1f"))
This worked, but the internal pie is too small. I tried to play with the radius option, but then the external slices are not correspondent to the internal ones. how can I adjust them?
Use par(new=TRUE) to overplot the pies rather than layout() in this case
pie(x=c(14,22,15,3,15,33,0,6,45),labels="",
col=c("#f21c39","#dba814","#7309de"))
par(new=TRUE)
pie(x=c(51,51,51),labels=c("O","VG","V"),radius=.5,
col=c("#c64719","#0600f5","#089c1f"))
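The slices line up because each inner slice equals the sum of its three outer slices and both pies start at the same angle. The sketch below makes that explicit by computing the inner sizes from the outer ones; outer_class and inner_sizes are names chosen here, and pdf(NULL) is used only so the code runs without opening a window.

```r
# Outer slices from the question, three sub-classes per class
outer_sizes <- c(14, 22, 15, 3, 15, 33, 0, 6, 45)
outer_class <- factor(rep(c("O", "VG", "V"), each = 3), levels = c("O", "VG", "V"))

# Each inner slice is the total of its sub-classes, in the same order
inner_sizes <- as.vector(tapply(outer_sizes, outer_class, sum))

pdf(NULL)  # null device: nothing is written to disk
pie(outer_sizes, labels = "",
    col = rep(c("#f21c39", "#dba814", "#7309de"), times = 3))
par(new = TRUE)  # draw the next pie on top of the current plot
pie(inner_sizes, labels = levels(outer_class), radius = 0.5,
    col = c("#c64719", "#0600f5", "#089c1f"))
invisible(dev.off())
```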
Three years later: this can be achieved using the sunburstR package. http://timelyportfolio.github.io/sunburstR/example_baseball.html
Example:
library(dplyr)      # for %>%, select(), group_by(), summarise()
library(sunburstR)  # for sunburst()
DF <- data.frame(LOGRECNO = c(60, 61, 62, 63, 64, 65),
STATE = c(1, 1, 1, 1, 1, 1),
COUNTY = c(1, 1, 1, 1, 1, 1),
TRACT = c(21100, 21100, 21100, 21100, 21100, 21100),
BLOCK = c(1053, 1054, 1055, 1056, 1057, 1058))
DF$BLOCKID <-
paste(DF$LOGRECNO, DF$STATE, DF$COUNTY,
DF$TRACT, DF$BLOCK, sep = "-")
DF %>%
select(BLOCKID) %>%
group_by(BLOCKID) %>%
summarise(Tots=n())->dftest
sunburst(dftest)
I'm sure you are able to adapt this to suit your needs!
You could also use the ggsunburst package:
# install ggsunburst
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("rPython")) install.packages("rPython")
install.packages("http://genome.crg.es/~didac/ggsunburst/ggsunburst_0.0.9.tar.gz", repos=NULL, type="source")
library(ggsunburst)
df <- read.table(header=T, text = "
parent node size
O 1 14
O 2 22
O 3 15
V 1 3
V 2 15
V 3 33
VG 1 1
VG 2 6
VG 3 45")
write.table(df, file = 'df.txt', sep = ',', row.names = F)
sb <- sunburst_data('df.txt', type = "node_parent", sep = ",")
p <- sunburst(sb, node_labels = T, leaf_labels = F, rects.fill.aes = "name")
cols <- c("O" = "#c64719", "V" = "#0600f5", "VG" = "#089c1f", "1" = "#f21c39", "2" = "#dba814", "3" = "#7309de")
p + scale_fill_manual(values = cols)

Resources