Irrelevant legend information in ggplot2 - r

When running this code (go ahead, try it):
library(ggplot2)
(myDat <- data.frame(cbind(VarX=10:1, VarY=runif(10)),
Descrip=sample(LETTERS[1:3], 10, replace=TRUE)))
ggplot(myDat,aes(VarX,VarY,shape=Descrip,size=3)) + geom_point()
... the "size=3" statement does correctly set the point size. However it causes the legend to give birth to a little legend beneath it, entitled "3" and containing nothing but a big dot and the number 3.
This does the same:
ggplot(myDat,aes(VarX,VarY,shape=Descrip)) + geom_point(aes(size=3))
Yes, it is funny. It would have driven me insane a couple hours ago if it weren't so funny. But now let's make it stop.

That's because size=3 inside aes() is interpreted as an aesthetic mapping rather than a constant. Setting it outside aes() works, I think:
ggplot(myDat,aes(VarX,VarY,shape=Descrip)) + geom_point(size=3)
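To see the difference without drawing anything, here is a minimal sketch that checks where the size ends up in each version. The names p_mapped and p_const are made up for this illustration, and peeking into the layers of a plot object is only meant as a quick check, not as a recommended workflow.

```r
library(ggplot2)

set.seed(1)  # make the random example data reproducible
myDat <- data.frame(
  VarX    = 10:1,
  VarY    = runif(10),
  Descrip = sample(LETTERS[1:3], 10, replace = TRUE)
)

# size inside aes(): treated as a mapping, so a scale and a legend are created
p_mapped <- ggplot(myDat, aes(VarX, VarY, shape = Descrip)) +
  geom_point(aes(size = 3))

# size outside aes(): a fixed parameter of the layer, so no extra legend
p_const <- ggplot(myDat, aes(VarX, VarY, shape = Descrip)) +
  geom_point(size = 3)

# Is "size" part of the layer's aesthetic mapping?
mapped_has_size <- "size" %in% names(p_mapped$layers[[1]]$mapping)
const_has_size  <- "size" %in% names(p_const$layers[[1]]$mapping)

# In the constant version the value is stored as a layer parameter instead
const_size <- p_const$layers[[1]]$aes_params$size
```

Only the version that carries size in its mapping gets the extra legend.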

Related

Remembering steps in R [closed]

I am a beginner in R and this might appear irrelevant, but can anyone tell me how to remember syntax, such as the arguments of ggplot, tidyverse, or any other package?
There are a few ways to do that. You can start writing the function and press TAB, and the arguments will appear in a pop-up. You can also check the cheatsheets; here are some examples: https://www.rstudio.com/resources/cheatsheets/
Or you can check the help topic by writing the function with a ? at its start, for example: ?ggplot
OP, your question does not relate to coding per se - no problem to solve via issues with code - so it's not really supposed to be on SO. With that being said, it is a viable question and very daunting to approach using ggplot2 to create plots when you really don't have the background for doing so. Consequently, I think you still deserve a good answer, so here are some principles to help out a new user.
Know where to get information
The biggest help to offer is to practice. You will become more familiar with usage, but even "the pros" forget the argument syntax and what stuff does. In this case, the following is helpful:
Use RStudio. The base R terminal is fully capable; however, RStudio brings a ton of conveniences that make programming in R so much easier. Tooltips are an important part of how I create and use functions in R. If you start typing out a function, you'll be presented with a short list of arguments.
What's more, you can start typing an argument and you'll get a description from the help directly within RStudio.
Check the help for functions. This one should be obvious, but I am constantly checking the help for functions on CRAN. This is easily done in RStudio by typing ? before the function. So, if I need to know the arguments and syntax for geom_point(), I'll type ?geom_point into the console and get the documentation directly within RStudio.
Online Resources. A quick search online can give you a lot of information (maybe even this answer). There are a lot of other resources: here too. Including here, here, here, and here.
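As a small sketch of the "check the help" advice, these base R calls list a function's arguments from the console, with or without RStudio. The variable name point_args is made up for this illustration.

```r
library(ggplot2)

# Print the argument list of a function without opening the help page
args(geom_point)

# The same information as a character vector of argument names
point_args <- names(formals(geom_point))
point_args

# Open the full documentation (only useful in an interactive session)
if (interactive()) help("geom_point", package = "ggplot2")
```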
Become familiar with the Principles of plotting in ggplot2
Knowing where to get information is helpful, but sometimes you feel so lost that you don't even know what information you actually are looking to get. This is the crux of many of the questions here on SO related to ggplot2, which is: "how can I change my axes?", "How do I change colors in the plot?", or "How can I get a legend to show x, y, or z?". Sometimes you can google, but often it's not even clear what you are looking to find.
This is where a fundamental understanding of how to create a plot in ggplot2 becomes useful. I'll go through how I always approach plotting in ggplot2 and hopefully this will help you out a bit.
Step 1 - prepare data
Making your data prepared to plot is exceptionally useful, and sometimes difficult to do. It's a bit beyond what I intend to communicate here, but a mandatory piece of reading would be regarding Tidy Data Principles.
Step 2 - Think about Mapping
Mapping is often overlooked in the process, but in short, this is how the columns of your dataset relate to the plot. It's easy to say "this column will be my x axis" and "this column will be my y axis", but you should also be clear on if the values of other columns will relate to color, fill, size, shape, etc etc... Thinking this way, it will soon be quite obvious why you would want to get Step 1 correct above, because only Tidy data will be able to be used directly in mapping without issue.
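To make the link between tidy data and mapping concrete, here is a minimal sketch. The data frame wide and its columns are hypothetical, and the reshaping uses base R's stack() only to avoid extra packages; tidyr offers more convenient tools for the same job.

```r
library(ggplot2)

# Hypothetical "wide" data: one column per measured series
wide <- data.frame(x = 1:3, a = c(2, 4, 6), b = c(1, 3, 5))

# Reshape into "long" form: one row per observation
stacked <- stack(wide[c("a", "b")])   # columns: values, ind
long <- data.frame(
  x      = rep(wide$x, times = 2),
  value  = stacked$values,
  series = stacked$ind
)

# Each aesthetic now maps to exactly one column
p_long <- ggplot(long, aes(x = x, y = value, color = series)) +
  geom_line()
```

In the wide form there is no single column to hand to color, which is why the long form is the one that maps without issue.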
Step 3 - The Fundamental ggplot() call
The first step in plotting will be your first call to ggplot(). Here you need to assign data, for example via df %>% ggplot(...) or ggplot(data=df, ...). This is also typically where you would set up at least your x and y axes via mapping. You can just stop here (x and y axes), or you can specify the other aesthetics in the mapping here too. Ultimately, this call alone "sets up" the plot. If we just plot the result of that, you get the following:
p <- ggplot(mtcars, aes(disp, mpg))
p
Step 4 - Add your geoms
A "geom" (short for "geometry") describes the shapes and "things" on your plot that will be positioned on the x and y axes. You can add any number, but in this example, we'll add points. If all you want to do is plot the observations at the x and y axes, you just need to add geom_point() and that should be enough:
p + geom_point()
Step 5 - Adding Legends
Note we don't have a legend yet. This is because there are no aesthetics mapped other than x and y. ggplot2 creates legends automatically when you specify in the mapping (via aes()) a characteristic way of differentiating the way we draw a geom. In other words, we can describe color= within aes() and that will initiate the creation of a legend. You can do the same with other aesthetics too.
p + geom_point(aes(color=cyl))
This creates a legend type depending on the type of data mapped. So, a colorbar legend is created here because the column mtcars$cyl is numeric. If we use a non-numeric column, you get a discrete legend:
p + geom_point(aes(color=rownames(mtcars)))
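A more common way to get a discrete legend from a numeric column is to convert it to a factor inside aes(). This is a minimal sketch of that idea; the names p_continuous, p_discrete and n_keys are made up here.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(disp, mpg))

# cyl is numeric, so ggplot2 uses a continuous colour scale (colourbar)
p_continuous <- p + geom_point(aes(color = cyl))

# Converting to a factor asks for one legend key per distinct value
p_discrete <- p + geom_point(aes(color = factor(cyl)))

# Number of legend keys to expect in the discrete version
n_keys <- nlevels(factor(mtcars$cyl))
```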
There's advanced stuff too... but not covered here.
Step 6 - Adjusting the Scales
All you do when you specify a mapping (i.e. aes(color = ...)) is say how the data is mapped to that aesthetic. This does not specify the actual color to be used. If you don't specify, the default coloring or sizing is used, but sometimes you want to change that. You can do that via the scale_*_*() functions... of which there are many depending on your application. For information on color scales, you can see this answer here... but suffice it to say this is quite a complicated part of plotting that depends greatly on what you want to do. Many of the scale_*_*() functions are structured similarly, so you can probably get an idea of what you can do from that answer. Here's an example of how we can adjust the color with one of these functions:
p + geom_point(aes(color=cyl)) +
scale_color_gradient(low="red", high="green")
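For a discrete mapping the same idea applies, just with a different scale function. Here is a minimal sketch using scale_color_manual(); the colour choices in cyl_colors are arbitrary and only for illustration.

```r
library(ggplot2)

# Hypothetical colour choices, one per level of factor(cyl)
cyl_colors <- c("4" = "steelblue", "6" = "orange", "8" = "firebrick")

p_manual <- ggplot(mtcars, aes(disp, mpg)) +
  geom_point(aes(color = factor(cyl))) +
  scale_color_manual(values = cyl_colors)

# Colours actually assigned to the points after the scale is applied
used_colors <- unique(layer_data(p_manual)$colour)
```

The mapping in aes() stays the same; only the scale decides which colours the three groups get.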
Step 7 - Adjusting Labels
Here I usually add the plot labels and axis labels. You can conveniently use ylab(), xlab() or ggtitle() to assign axis labels and the title, or just define them all together with labs(y = ..., x = ..., title = ...). You can also use this time to format and arrange things associated with legends and scales (tick marks and whatnot) via guides(...) (for legends) or the scale_x_*() and scale_y_*() functions (for tick marks on axes).
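A minimal sketch of the two labelling styles side by side; the label texts are made up, and p_base, p_labs and p_split are names used only here.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(disp, mpg)) + geom_point()

# One call that sets everything at once
p_labs <- p_base +
  labs(x = "Displacement", y = "Miles per gallon", title = "Fuel economy")

# The same labels set piece by piece
p_split <- p_base +
  xlab("Displacement") +
  ylab("Miles per gallon") +
  ggtitle("Fuel economy")
```

Both plots should end up with identical labels, so which style to use is mostly a matter of taste.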
Step 8 - Theme Elements
Finally, you can change the overall look with various ggplot themes. An account of default themes is given here, but you can extend that with the ggthemes package to get more. You might want to just change a few specific elements of size, color, linetype, etc. on the plot. You can address these specific elements via theme(). A helpful list of theme elements is given here.
So, putting it all together you have:
# initial call
ggplot(mtcars, aes(disp, mpg)) +
  # geoms
  geom_point(aes(color=cyl), size=3) +
  # define the color scale
  scale_color_viridis_c() +
  # define labels and ticks and stuff
  # axis
  scale_x_continuous(breaks = seq(0, 600, by=50)) +
  # legend ticks
  guides(color=guide_colorbar(ticks.colour = "black", ticks.linewidth = 3)) +
  # Labels
  labs(x="Disp", y="Miles per gallon (mpg)", color = "# of \ncylinders", title="Ugly Plot 1.0") +
  # theme and theme elements
  theme_bw() +
  theme(
    panel.background = element_rect(fill="gray90"),
    panel.grid.major = element_line(color="gray20", linetype=2, size=0.2),
    panel.grid.minor = element_line(color="gray70", linetype=2, size=0.1),
    axis.text = element_text(size=12, face = "bold"),
    axis.text.x = element_text(angle=30, hjust=1)
  )
It's a lot of steps, but I break it down like that basically every time. When plot code gets large, I break up the chunks much in that manner above to help clear my mind on how to create the plot.

Map cut on the "edges" of x-axis with Aitoff projection

I am trying to plot with ggplot2 (v3.3.2) data points on a map using a specific projection (called aitoff), which is useful especially for sky plots.
When doing so, the plot is "cropped" on the x-axis, i.e. the edges of the axis are located just outside the plot. I tried a few things (adjusting the margin, for example), but without success. Could you please help to make these parts of the plot visible?
Here is the code to reproduce the issue, i.e. the point located at (0,0) is not visible.
skydata <- data.frame(RA=c(0,180,360), Dec=c(0,10,20))
ggplot(skydata) +
geom_point(aes(RA,Dec)) +
coord_map(projection="aitoff",orientation=c(90,180,0)) +
scale_y_continuous(breaks=(-2:2)*30,limits=c(-90,90)) +
scale_x_continuous(breaks=(0:8)*45,limits=c(0,360), labels=c("","","","","","","","","")) +
labs(x="R.A.(°)", y="Decl. (°)",title="Map of the sky")
I hope I was clear enough...
Thanks a lot!
I think the fact that the clipping happens in the first place is a known issue. In the GitHub issue, Hadley says: "I think this is a long standing problem that I'm unlikely to solve in the near future"
I think there are two ways you can more or less solve the problem for yourself, though both have downsides. One of them was already mentioned by @Arnaud:
Add clip = "off" in the coord_map part
Add expand = c(1.1,0) in the scale_x_continuous part
I added some example plots where you can see the results and problems.
1. Initial version (if I run your code):
Problem: The point at (0,0) can't be seen properly.
2. Version with expand:
skydata <- data.frame(RA=c(0,180,360), Dec=c(0,10,20))
ggplot(skydata) +
geom_point(aes(RA,Dec)) +
coord_map(projection="aitoff", orientation=c(90,180,0)) +
scale_y_continuous(breaks=(-2:2)*30,limits=c(-90,90)) +
scale_x_continuous(breaks=(0:8)*45,limits=c(0,360), expand = c(1.1, 0), labels=c("","","","","","","","","")) +
labs(x="R.A.(°)", y="Decl. (°)",title="Map of the sky")
Looks quite nice now. The x-axis got expanded to both sides, and the point at (0,0) is now clearly visible.
But be careful: this seems to work only with whole numbers (like expand = c(5,0)). With the 1.1 I chose in my example, the plot looks somewhat different and the y-axis seems distorted.
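For reference, the two numbers in expand are a multiplicative and an additive amount, applied to the range of the scale on each side, so 1.1 adds 110% of the range per side rather than 10%. The sketch below works through that arithmetic for the limits used above. It uses expansion() (available since ggplot2 3.3.0); expanded_range is a helper made up here that only mimics the documented arithmetic, not ggplot2's internals, and how coord_map then treats the widened range is not covered by it.

```r
library(ggplot2)

# expand = c(1.1, 0) is shorthand for mult = 1.1, add = 0 on both sides
exp_vec <- expansion(mult = 1.1, add = 0)

# Hypothetical helper: widen `limits` by mult * range + add on each side
expanded_range <- function(limits, mult, add) {
  width <- diff(limits)
  c(limits[1] - mult * width - add, limits[2] + mult * width + add)
}

# 1.1 * 360 = 396 extra degrees on each side of the 0..360 range
range_big <- expanded_range(c(0, 360), mult = 1.1, add = 0)

# For comparison: 5% of the range (18 degrees) on each side
range_small <- expanded_range(c(0, 360), mult = 0.05, add = 0)
```

Such a large widening may be part of why the plot looks different with 1.1, though that is only a guess.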
3. Version with clip = "off":
skydata <- data.frame(RA=c(0,180,360), Dec=c(0,10,20))
ggplot(skydata) +
geom_point(aes(RA,Dec)) +
coord_map(projection="aitoff", clip = "off", orientation=c(90,180,0)) +
scale_y_continuous(breaks=(-2:2)*30,limits=c(-90,90)) +
scale_x_continuous(breaks=(0:8)*45,limits=c(0,360), labels=c("","","","","","","","","")) +
labs(x="R.A.(°)", y="Decl. (°)",title="Map of the sky")
This version does not expand the x-axis, but it makes sure the point at (0,0) is not clipped off. There is definitely no distortion of the y-axis, but it does not look as good as the solution with expand.

Line graph is blurry, not clear & axes positioning difficulties in R

I am trying to plot (with ggplot2) a simple time-velocity graph in R, but my data looks messy. I am using Markdown Notebook.
I am using this base code for this:
ggplot(data, aes(x = time, y = velocity)) + geom_line() +
scale_x_continuous(name="Time (s)",
limits=c(min(data$time),max(data$time)),breaks=seq(0,3000,500)) +
scale_y_continuous(name="", limits=c(-1,max(data$velocity))) +
theme(axis.text = element_text(color="black"),axis.title = element_text(
color="black"))
This is what it looks like with the default settings:
After that, I tried to extend the figure horizontally, but then the labels became really small, and the data is still kind of blurry:
For this, I added the following (at the beginning of my Chunk):
{r fig1, fig.width=95, fig.asp = 0.15}
If I make the font sizes bigger the labels look okay, but the velocity graph stays the same (naturally). Does someone know a way to fix that? I thought that maybe this is because of my monitor, which has 4K resolution, but I'm not sure. I also wonder if anyone knows how to move the y-axis so it would start at the zero point of the x-axis (now there's a space, and it looks weird).
I am also open to suggestions on how to improve the visualization. :D Thank you in advance!
With the changed DPI (see comment) it looks better in my opinion, but the figure is really small (I included the output window for reference):

How to make the bg of a single legend transparent in a merged plot?

I don't have to mention I am new, and my problem has too many solutions. I tried about 12 different versions and couldn't solve it:
The example given is close to the plot I want to generate.
I took over a script from 2013, so I do not entirely understand how to change it in the way I would like:
Plot 1's legend should be in the bottom right corner without a title, with a transparent background, and with the labels "urban" and "non-urban" instead of 1 and 2. I am aware that legend.position="none" deletes all legends, but I was not able to find a solution that came even close to this. The legend is still not on the plot, not transparent, and has a title.
Unfortunately somehow the dots changed into squares in this process and I have no clue why. I didn't change geom_point.
Another flaw I want to fix is the top line over the central plot, which I would like to remove. But how?
And last but not least, I am not sure if the function geom_smooth(method=lm) does reflect the regression line + confidence interval, because the description says it adds a conditional mean, which is, AFAIK, not the same. Is my concern unnecessary?
Edit: shorter version, plot 1 out of the 3 merged plots:
library(ggplot2)
library(gridExtra)
set.seed(42)
DF <- data.frame(x=rnorm(100,mean=c(1,5)),y=rlnorm(100,meanlog=c(8,6)),group=1:2)
p1 <- ggplot(DF,aes(x=x,y=y,colour=factor(group))) + geom_point(shape=16) +
scale_x_continuous(expand=c(0.02,0)) +
scale_y_continuous(expand=c(0.02,0)) +
geom_rug() +
geom_smooth(method=lm,alpha=0.3) +
theme_bw()+
theme(legend.position=c(0.9,0.09),
legend.title=element_blank(),
plot.margin=unit(c(0,0,0,0),
"points"))
Thanks for any advice. I have been researching this topic for 2 weeks, and even though I thought I was studying psychology, I learned a lot ... but not enough in the short time to succeed. :/
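One possible sketch for the legend requests, not a tested answer to the whole merged-plot problem. It assumes that group 1 means "urban" and group 2 means "non-urban", and it uses legend.background and legend.key with fill = NA for the transparency. Newer ggplot2 versions (3.5.0 and later) prefer legend.position = "inside" together with legend.position.inside over the numeric vector used here, which may produce a deprecation warning. Regarding geom_smooth(method = lm): it fits a linear regression, and with the defaults se = TRUE and level = 0.95 the band is the 95% confidence interval around that fit.

```r
library(ggplot2)

set.seed(42)
DF <- data.frame(x = rnorm(100, mean = c(1, 5)),
                 y = rlnorm(100, meanlog = c(8, 6)),
                 group = 1:2)

p1_sketch <- ggplot(DF, aes(x = x, y = y, colour = factor(group))) +
  geom_point(shape = 16) +
  geom_rug() +
  # method = lm fits a linear regression; the band is its confidence interval
  geom_smooth(method = lm, formula = y ~ x, alpha = 0.3) +
  # relabel the legend keys (assumed: 1 = urban, 2 = non-urban)
  scale_colour_discrete(labels = c("urban", "non-urban")) +
  theme_bw() +
  theme(legend.position = c(0.9, 0.09),
        legend.title = element_blank(),
        # no fill and no border behind the legend and its keys
        legend.background = element_rect(fill = NA, colour = NA),
        legend.key = element_rect(fill = NA, colour = NA))
```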

Unwanted bold-face while putting multiple ggplot charts in the same file

I don't know if you have seen an unwanted bold-face font like in the picture below:
As you see, the third line is bold-faced, while the others are not. This happens to me when I try to use ggplot() with lapply() or especially mclapply() to make the same chart template based on different data and put all the results as different charts in a single PDF file.
One solution is to avoid using lapply(x, f) when f() is a function that returns a ggplot() plot, but I have to do so for combining charts (i.e. as input for grid.arrange()) in some situations.
Sorry, I am not able to provide a reproducible example. I tried really hard but was not successful, because the code and data are too big, with several nested functions, and when I reduced the complexity to make a reproducible example, the problem did not happen.
I asked the question because I guessed that maybe someone has faced the same experience and knows how to solve it.
My intuition is that it's not actually being printed in bold, but rather double-printed for some reason, which then looks bold. This would explain why it doesn't come up with a simpler example. Especially given your mention of nested functions and probably other complicated structures where it's easy to get an off-by-one or similar error, I would try doing something where you can see exactly what's being plotted -- perhaps by examining the length() of the return value from lapply().
Changing the order of elements of the vector, so that the order of the elements in the key is different, may also help. If you consistently get the bold-face on the last element, that also tells you a little bit more about where something is going wrong.
As @Dinre also mentioned, it could also be related to your plotting device. You can try out changing your plotting device. I have my doubts about this though, seeing as it's not a consistent problem. You could also try changing the position of the key, which depending on your plotting device and settings, may move you in or out of a compression block, thus changing which artifacts crop up.
A reproducible example and a solution may be as follows:
library(ggplot2)
d <- data.frame(x=1:10, y=1:10)
ggplot(data = d, aes(x=x, y=y)) +
geom_point() +
geom_text(aes(3,7,label = 'some text 10 times')) +
geom_text(data = data.frame(x=1,y=1),
aes(7,3, label = 'some text one time'))
When we add a label with geom_text() and insert x and y manually, the data are not shortened, so the same label is printed as many times as the data has rows. The data length can be forced to 1 by replacing the data within geom_text().
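As an alternative to supplying a one-row data frame, annotate() draws a label from fixed values and does not inherit the plot data. Here is a minimal sketch that counts how many times each label is drawn; p_labels and the two counters are names made up for this illustration.

```r
library(ggplot2)

d <- data.frame(x = 1:10, y = 1:10)

p_labels <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  # inherits the 10-row data, so the label is drawn 10 times
  geom_text(aes(3, 7, label = "some text 10 times")) +
  # annotate() takes fixed values and draws the label once
  annotate("text", x = 7, y = 3, label = "some text one time")

# Number of rows (i.e. drawn labels) in each text layer
n_drawn_geom_text <- nrow(layer_data(p_labels, 2))
n_drawn_annotate  <- nrow(layer_data(p_labels, 3))
```

Checking layer_data() like this is also a quick way to confirm the double-printing guess on a real plot.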
