R + ggplot: coordinate transforms and geom_fill - r

The following code works well:
p <- ggplot(df,aes(x=x,y=y))
p <- p + geom_tile(aes(fill=z))
p
It plots a nice heatmap. Here df contains x and y, created using expand.grid(), and z which contains the value at each (x,y) co-ordinate.
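For anyone trying to reproduce this, a data frame of the shape described could be built as in the sketch below. The grid extent and the formula for z are assumptions made purely for illustration; the original df is not shown in the question.

```r
library(ggplot2)

# Regular grid of (x, y) positions; the -10..10 range is an assumed example
df <- expand.grid(x = seq(-10, 10, by = 1), y = seq(-10, 10, by = 1))

# Hypothetical value at each grid point (any numeric function of x and y would do)
df$z <- with(df, sin(x / 3) * cos(y / 3))

# Same heatmap call as in the working example
p <- ggplot(df, aes(x = x, y = y)) + geom_tile(aes(fill = z))
```

expand.grid() returns one row per (x, y) combination, which is the layout geom_tile() expects.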
This code, on the other hand
p <- ggplot(df,aes(x=x,y=y))
p <- p + coord_map(projection="lagrange")
p <- p + geom_tile(aes(fill=z))
p
doesn't plot anything much at all (and doesn't plot anything with all the coordinate transforms I've tried). My understanding is that the coord_map works on the x and y data, and the fill should be drawn on top of the transformed co-ordinates. However, this must be wrong as nothing's being plotted once the co-ordinates have been mapped to a new frame.
So my question is: how should I go about this so that it works properly? Could it be something to do with my data.frame df?

I think the problem lies in how the tile geom calculates the area it should cover, and how that interacts with the new coordinate system. I tried recreating your problem with this code.
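library(ggplot2)
library(reshape2) # melt() names the matrix dimensions Var1 and Var2
vol.m <- melt(volcano)
# points & cartesian
p <- ggplot(vol.m, aes(Var1, Var2, color = value)) + geom_point()
# points & map (coord_map() needs the mapproj package)
p + coord_map()
# hex & cartesian
p <- ggplot(vol.m, aes(Var1, Var2, fill = value)) + geom_hex(stat = "identity")
# hex & map
p + coord_map()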
I sat there for a while trying the same thing with geom_tile(), but it wouldn't plot, so I'll take your word for it that it doesn't work. Notice how the hexagons get all screwed up on the map coordinates. I guess when trying to plot tiles which have been calculated to fill the area, it can't tolerate plotting them on a different coordinate system.
Edit:
That is, changing the coordinate system merely changes the "graph paper" the plot is drawn on. So, if you computed a linear regression statistic, then changed the coordinates to be log-transformed, the plotted regression line would be bent. I'm guessing that ggplot2 can't "bend" a tile to fit into a non-cartesian coordinate system.
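The "graph paper" point can be sketched with a regression line. This is only an illustration on made-up data (the data frame d and its columns are assumptions, not from the question): a scale transformation is applied before the statistic is computed, while a coordinate transformation is applied afterwards, so the already-fitted straight line is drawn bent.

```r
library(ggplot2)

# Made-up, strictly positive data with a linear trend
set.seed(42)
d <- data.frame(x = 1:50)
d$y <- 10 + 2 * d$x + rnorm(50)

base <- ggplot(d, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Scale transform: y is logged first, then the line is fitted to log10(y)
p_scale <- base + scale_y_log10()

# Coordinate transform: the line is fitted to raw y, then the axis is warped,
# so the straight fit appears curved
p_coord <- base + coord_trans(y = "log10")
```

Printing p_scale and p_coord side by side shows a straight line in the first and a curved one in the second.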

Related

How to draw a radar chart with a lot of rows (or dimensions) using R?

Let's say I have such data:
a <- tibble(id=c(1,1.1,1.2,1.7,2,2.1,2.6,4,4.6,4.68),
x=c(0.3,0.5,0.2,0.7,0.1,0.5,0.43,0.6,0.3,0.65),
y=c(0.2,0.1,0.22,0.1,0.5,0.2,0.3,0.2,0.14,0.3))
This is just a sample; my real data is much larger, and x+y+... = 1. I want to draw two lines: one for x and one for x+y:
ggplot(a) +
geom_line(aes(x=id,y=x),color='red') +
geom_line(aes(x=id,y=x+y),color='blue')
But what I really want is something like a radar chart:
You can see there is a circle with radius 1. x and x+y (and maybe more in my data) are the red and blue circles, respectively. So x+y must be larger than x, but always inside the circle because x+y+...=1. My data has a lot of ids, so it is not the traditional radar with few dimensions.
You can create radar charts with coord_polar() - e.g.
library(tidyverse)
ggplot(a) +
geom_smooth(aes(x=id,y=x),color='red', se = FALSE) +
geom_smooth(aes(x=id,y=x+y),color='blue', se = FALSE) +
geom_line(aes(x = id, y = 1)) +
coord_polar()
Note that I used geom_smooth to get closer to your intended result.
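Since the question mentions there may be more columns than x and y, one possible way to generalise is to reshape to long format and take a running sum per id, so each ring is x, x+y, x+y+... in turn. This is a sketch under the assumption that every column other than id is one of the parts; the names rings, part and cumulative are made up for the example, and it uses plain geom_line rather than geom_smooth.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Sample data from the question
a <- tibble::tibble(
  id = c(1, 1.1, 1.2, 1.7, 2, 2.1, 2.6, 4, 4.6, 4.68),
  x  = c(0.3, 0.5, 0.2, 0.7, 0.1, 0.5, 0.43, 0.6, 0.3, 0.65),
  y  = c(0.2, 0.1, 0.22, 0.1, 0.5, 0.2, 0.3, 0.2, 0.14, 0.3)
)

# One row per (id, part); cumulative holds x, then x + y, ... within each id
rings <- a %>%
  pivot_longer(-id, names_to = "part", values_to = "value") %>%
  group_by(id) %>%
  mutate(cumulative = cumsum(value)) %>%
  ungroup()

p <- ggplot(rings, aes(id, cumulative, colour = part)) +
  geom_line() +
  geom_hline(yintercept = 1) +  # outer circle of radius 1
  ylim(0, 1) +
  coord_polar()
```

In the legend, the entry for a given part then stands for the sum of all parts up to and including it.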

R geom_line not plotting as expected

I am using the following code to plot a stacked area graph and I get the expected plot.
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) + #ggplot initial parameters
geom_ribbon(position='fill', aes(ymin=0, ymax=1))
but then when I add lines which are reading the same data source I get misaligned results towards the right side of the graph
P + geom_line(position='fill', aes(group=model, ymax=1))
Does anyone know why this may be? Both plots are reading the same data source, so I can't figure out what the problem is.
Actually, if all you wanted to do was draw an outline around the areas, then you could do the same using the colour aesthetic.
ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(position='fill', aes(ymin=0, ymax=1), colour = "black")
I have an answer that I hope works for you. It looks good, but very different from your original graph:
library(ggplot2)
DATA2 <- read.csv("C:/Users/corcoranbarriosd/Downloads/porsche model volumes.csv", header = TRUE, stringsAsFactors = FALSE)
In my experience you want X to be a numeric variable, and you have it as a string. If that is not the case I can change this, but the following will transform your bucket into a numeric vector:
# split each bucket label on runs of non-digits; the number is the second piece
bucket.list <- strsplit(DATA2$bucket, "[^0-9]+")
x <- character(length(bucket.list))
for (i in seq_along(bucket.list)) {
  x[i] <- bucket.list[[i]][2]
}
DATA2$bucket <- as.numeric(x)
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(aes(ymin=0, ymax=volume)) + geom_line(aes(group=model))
It gives me the area and the line tracking each other. Hope that's what you needed.
If you switch to using geom_path in place of geom_line, it all seems to work as expected. I don't think the ordering of geom_line is behaving the same as geom_ribbon (and suspect that geom_line -- like geom_area -- assumes a zero base y value)
ggplot(DATA2, aes(x=bucket, y=volume, ymin=0, ymax=1,
group=model, fill=model, label=volume)) +
geom_ribbon(position='fill') +
geom_path(position='fill')
This should give you the expected plot, with the lines following the ribbon edges.
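For reference, the ordering difference between the two geoms can be seen on a tiny made-up data frame (d here is hypothetical, not the DATA2 from the question): geom_line connects points in order of x, while geom_path connects them in the order the rows appear. Whether this is the whole story for the misaligned plot above is not certain.

```r
library(ggplot2)

# Three points deliberately stored out of x order
d <- data.frame(x = c(1, 3, 2), y = c(1, 1, 2))

# geom_line sorts by x before drawing: (1,1) -> (2,2) -> (3,1)
p_line <- ggplot(d, aes(x, y)) + geom_line()

# geom_path keeps row order: (1,1) -> (3,1) -> (2,2)
p_path <- ggplot(d, aes(x, y)) + geom_path()

# layer_data() shows the data each layer will actually draw
layer_data(p_line)$x
layer_data(p_path)$x
```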

Tiny pie charts to represent each point in a scatterplot using ggplot2

I want to create a scatter plot in which each point is a tiny pie chart. For instance, consider the following data:
foo <- data.frame(X=runif(30), Y=runif(30),A=runif(30),B=runif(30),C=runif(30))
The following code will make a scatter plot, representing X and Y values of each point:
library(reshape2)
library(ggplot2)
foo.m <- melt(foo, id.vars=c("X","Y"))
ggplot(foo.m, aes(X,Y))+geom_point()
And the following code will make a pie chart for each point:
p <- ggplot(foo.m, aes(variable,value,fill=variable)) + geom_bar(stat="identity")
p + coord_polar() + facet_wrap(~X+Y, ncol=6) + theme_bw()
But I am looking to merge them: creating a scatter plot in which each point is replaced by the pie chart. This way I will be able to show all 5 values (X, Y, A, B, C) of each record in the same chart.
Is there any way to do it?
This is the sort of thing you can do with package ggsubplot. Unfortunately, according to issue #10 here, this package is not working with R 3.1.1. I ran it successfully if I used an older version of R (3.0.3).
Using your long dataset, you could put bar plots at each X, Y point like this:
library(ggplot2)
library(ggsubplot)
ggplot(foo.m) +
geom_subplot2d(aes(x = X, y = Y,
subplot = geom_bar(aes(variable, value, fill = variable), stat = "identity")),
width = rel(.5), ref = NULL)
This gives the basic idea, although there are many other options (like controlling where the subplots move to when there is overlap in plot space).
This answer has more information on the status of ggsubplot with newer R versions.
There is a package, scatterpie, that does exactly what you want to do!
library(ggplot2)
library(scatterpie)
ggplot() +
geom_scatterpie(aes(x=X, y=Y, r=0.1), data=foo, cols=c("A", "B", "C")) # wide data: A, B and C must be columns
In the aesthetics, r is the radius of the pie, you can adjust as necessary. It is dependent on the scale of the graph - since your graph goes from 0.0 to 1.0, a radius of 1 would take up the entire graph (if centered at 0.5, 0.5).
Do note that while you will get a legend for the pie slice colors, it will not (to my knowledge) label the slices themselves on the pies.

Scatterplot with ugly margins when using log scale

I have a somewhat "weird" two-dimensional distribution (not normal with some uniform values, but it kinda looks like this.. this is just a minimal reproducible example), and want to log-transform the values and plot them.
library("ggplot2")
library("scales")
df <- data.frame(x = c(rep(0,200),rnorm(800, 4.8)), y = c(rnorm(800, 3.2),rep(0,200)))
Without the log transformation, the scatterplot (incl. rug plot which I need) works (quite) well, apart from a marginally narrower rug plot on the x axis:
p <- ggplot(df, aes(x, y)) + geom_point() + geom_rug(alpha = I(0.5)) + theme_minimal()
p
When plotting the same with a log10-transform though, the points at the margin (at x = 0 and y = 0, respectively) are plotted outside the rug plot or just on the axis (with other data, and only one half side of a point is visible).
p + scale_x_log10() + scale_y_log10()
How can I "rescale" the axes so that all the points are contained fully within the grid and the rug plots are unaffected, as in the first example?
Maybe you want
p + scale_x_log10(oob=squish_infinite) + scale_y_log10(oob=squish_infinite)
I don't really know what you expect to happen for those values that can be negative or infinite, but one general piece of advice when transformations don't do what you want is to perform them outside of ggplot2. Something like this might be useful:
library(plyr)
df2 <- colwise(log10)(df) # log transform columns
df2 <- colwise(squish_infinite)(df2) # do something with infinites
p %+% df2 # plot the transformed data
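To see what squish_infinite does to the zeros in this kind of data, here is a small sketch on a made-up vector (x and the variable names are illustrative only). log10(0) is -Inf, and squish_infinite replaces infinite values with the nearest end of the range it is given; its default range is c(0, 1). When it is passed as oob to a scale, that range should be the scale's limits.

```r
library(scales)

x <- c(0, 1, 10, 100)
logged <- log10(x)  # -Inf, 0, 1, 2

# Use the range of the finite values as the target range
finite_range <- range(logged[is.finite(logged)])

# -Inf is pulled up to the lower end of the range; finite values are untouched
squished <- squish_infinite(logged, range = finite_range)
```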

Problems making a graphic in ggplot

I am working with ggplot and want to design a graphic with it. The graphic uses two continuous variables, and I would like it to look like this:
Where x and y are the continuous variables. My problem is that I can't get it to show circles on the line of the plot. I would like the plot to have a circle for each pair of observations from the continuous variables. For example, the attached graphic has a circle for the pairs (1,1), (2,2) and (3,3). Is it possible to do this? (The colour of the line doesn't matter.)
# dummy data
dat <- data.frame(x = 1:5, y = 1:5)
ggplot(dat, aes(x,y,color=x)) +
geom_line(size=3) +
geom_point(size=10) +
scale_colour_continuous(low="blue",high="red")
Playing with low/high will change the colours.
In general, to remove the legend, use + theme(legend.position="none")
