I am plotting some payment distribution information and I aggregated the data after scaling it to log-normal (base-e). The histograms turn out great but I want to modify the x-axis to display the non-log equivalents.
My current axis displays [0:2.5:10] values
Alternatively, I would like to see values for exp(2.5), exp(5), etc.
Any suggestions on how to accomplish this? Anything I can add to my plotting statement to scale the x-axis values? Maybe there's a better approach - thoughts?
Current code:
ggplot(plotData, aes_string(pay, fill = pt)) + geom_histogram(bins = 50) + facet_wrap(~M_P)
Answered...Final plot:
Not sure if this is exactly what you are after but you can change the text of the x axis labels to whatever you want using scale_x_continuous.
Here's without:
ggplot(data = cars) + geom_histogram(aes(x = speed), binwidth = 1)
Here's with:
ggplot(data = cars) + geom_histogram(aes(x = speed), binwidth = 1) +
scale_x_continuous(breaks=c(5,10,15,20,25), labels=c(exp(5), exp(10), exp(15), exp(20), exp(25)))
Related
I am trying to create a histogram in ggplot2 where the x-axis transitions from linear scaling to log2 scaling after a pre-defined point. In other words, I want the x-axis to be of a linear scale up to some threshold, and then after that threshold, use the log2 scale.
So, before the threshold, the x-axis should look like what you would get from simply doing:
ggplot(data,aes(x=value)) + geom_histogram()
and after the threshold, the x-axis should look like what you would get from doing:
ggplot(data,aes(x=value)) + geom_histogram() + scale_x_continuous(trans='log2')
The problem is that while I can make those histograms individually (one where everything is on a linear scale, and one where everything is on a log2 scale), I don't know how to get it to transition and have both in one histogram.
I agree with the commenters that this would be problematic as a single figure. However, it could be informative, if you have one figure showing all data, and then an inset/subplot to show a subset. Here I used cowplot::plot_grid to combine two figures, but there are other packages out there for arranging (like gridExtra). Do be extremely cautious about how you label the figures.
library(ggplot2)
x <- rexp(1000, .05) + rep(c(0, 5), each = 500)
cowplot::plot_grid(
ggplot(data.frame(x = x[x<5]), aes(x)) +
geom_histogram() +
labs(title = "Subset, x<5, linear-scale"),
ggplot(data.frame(x), aes(x)) +
geom_vline(xintercept = 5, color = "red", size = 2) +
geom_histogram() +
scale_x_log10() +
labs(title = "All data, log-scale")
)
This question is motivated by a previous post illustrating various ways to change how axes scales are plotted in a ggplot figure, from the default exponential notation to the full integer value (when ones axes values are very large). While I am able to convert the axes scales from exponential notation to full values, I am unclear how one would achieve the same goal for the values appearing in the legend.
While I understand that one can manually change the length of the legend scale with "scale_color..." or "scale_fill..." followed by the "limits" argument, this does not appear to be a solution to getting my legend values to show "6000000000" rather than "6e+09" (or "0" rather than "0e+00" for that matter).
The following example should suffice. My hope is someone can point out how to implement the 'scales' package to apply for legend scales rather than axes scales.
Thanks very much.
library(ggplot2)
library(scales)
Data <- data.frame(
pi = c(2,71,828,1828,45904,523536,2874713,52662497,757247093,6999595749),
e = c(3,14,159,2653,58979,311599,7963468,54418516,1590576171, 99),
face = 1:10)
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000))
myplot
Use the Comma formatter in scale_color_gradientn by setting labels = comma e.g.:
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000), labels = comma)
myplot
I'm currently working on a very simple data.frame, containing three columns:
x contains x-coordinates of a set of points,
y contains y-coordinates of the set of points, and
weight contains a value associated to each point;
Now, working in ggplot2 I seem to be able to plot contour levels for these data, but i can't manage to find a way to fill the plot according to the variable weight. Here's the code that I used:
ggplot(df, aes(x,y, fill=weight)) +
geom_density_2d() +
coord_fixed(ratio = 1)
You can see that there's no filling whatsoever, sadly.
I've been trying for three days now, and I'm starting to get depressed.
Specifying fill=weight and/or color = weight in the general ggplot call, resulted in nothing. I've tried to use different geoms (tile, raster, polygon...), still nothing. Tried to specify the aes directly into the geom layer, also didn't work.
Tried to convert the object as a ppp but ggplot can't handle them, and also using base-R plotting didn't work. I have honestly no idea of what's wrong!
I'm attaching the first 10 points' data, which is spaced on an irregular grid:
x = c(-0.13397460,-0.31698730,-0.13397460,0.13397460,-0.28867513,-0.13397460,-0.31698730,-0.13397460,-0.28867513,-0.26794919)
y = c(-0.5000000,-0.6830127,-0.5000000,-0.2320508,-0.6547005,-0.5000000,-0.6830127,-0.5000000,-0.6547005,0.0000000)
weight = c(4.799250e-01,5.500250e-01,4.799250e-01,-2.130287e+12,5.798250e-01,4.799250e-01,5.500250e-01,4.799250e-01,5.798250e-01,6.618956e-01)
any advise? The desired output would be something along these lines:
click
Thank you in advance.
From your description geom_density doesn't sound right.
You could try geom_raster:
ggplot(df, aes(x,y, fill = weight)) +
geom_raster() +
coord_fixed(ratio = 1) +
scale_fill_gradientn(colours = rev(rainbow(7)) # colourmap
Here is a second-best using fill=..level... There is a good explanation on ..level.. here.
# load libraries
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
# build your data.frame
df <- data.frame(x=x, y=y, weight=weight)
# build color Palette
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")), space="Lab")
# Plot
ggplot(df, aes(x,y, fill=..level..) ) +
stat_density_2d( bins=11, geom = "polygon") +
scale_fill_gradientn(colours = myPalette(11)) +
theme_minimal() +
coord_fixed(ratio = 1)
Say I'm measuring 10 personality traits and I know the population baseline. I would like to create a chart for individual test-takers to show them their individual percentile ranking on each trait. Thus, the numbers go from 1 (percentile) to 99 (percentile). Given that a 50 is perfectly average, I'd like the graph to show bars going to the left or right from 50 as the origin line. In bar graphs in ggplot, it seems that the origin line defaults to 0. Is there a way to change the origin line to be at 50?
Here's some fake data and default graphing:
df <- data.frame(
names = LETTERS[1:10],
factor = round(rnorm(10, mean = 50, sd = 20), 1)
)
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor)) +
geom_bar(stat="identity") +
coord_flip()
Picking up on #nongkrong's comment, here's some code that will do what I think you want while relabeling the ticks to match the original range and relabeling the axis to avoid showing the math:
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(breaks=seq(-50,50,10), labels=seq(0,100,10)) + ylab("Percentile") +
coord_flip()
This post was really helpful for me - thanks #ulfelder and #nongkrong. However, I wanted to re-use the code on different data without having to manually adjust the tick labels to fit the new data. To do this in a way that retained ggplot's tick placement, I defined a tiny function and called this function in the label argument:
fix.labels <- function(x){
x + 50
}
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(labels = fix.labels) + ylab("Percentile") +
coord_flip()
I am plotting a forest plot in ggplot2 and am having issues with the ordering of the labels in the legend matching the order of the labels in the data set. Here is my code below.
data code
d<-data.frame(x=c("Co-K(W) N=720", "IH-K(W) N=67", "IF-K(W) N=198", "CO-K(B)N=78", "IH-K(B) N=13", "CO=A(W) N=874","D-Sco Ad(W) N=346","DR-Ad (W) N=892","CE_A(W) N=274","CO-Ad(B) N=66","D-So Ad(B) N=215","DR-Ad(B) N=123","CE-Ad(B) N=79"),
y = rnorm(13, 0, 0.1))
d <- transform(d, ylo = y-1/13, yhi=y+1/13)
d$x <- factor(d$x, levels=rev(d$x)) # reverse ordering
forest plot code
credplot.gg <- function(d){
# d is a data frame with 4 columns
# d$x gives variable names
# d$y gives center point
# d$ylo gives lower limits
# d$yhi gives upper limits
require(ggplot2)
p <- ggplot(d, aes(x=x, y=y, ymin=ylo, ymax=yhi,group=x,colour=x,)) +
geom_pointrange(size=1) +
theme_bw() +
scale_color_discrete(name="Sample") +
coord_flip() +
theme(legend.key=element_rect(fill='cornsilk2')) +
guides(colour = guide_legend(override.aes = list(size=0.5))) +
geom_hline(aes(x=0), colour = 'red', lty=2) +
xlab('Cohort') + ylab('CI') + ggtitle('Forest Plot')
return(p)
}
credplot.gg(d)
This is what I get. As you can see the labels on the y axis matches the labels in the order that it is in the data. However, it is not the same order in the legend. I'm not sure how to correct this. This is my first time creating a plot in ggplot2. Any feedback is well appreciated.Thanks in advanced
Nice plot, especially for a first ggplot! I've not tested, but I think all you need is to add reverse=TRUE inside your colour's guide_legend(found this in the Cookbook for R).
If I were to make one more comment, I'd say that ordering your vertical factor by numeric value often makes comparisons easier when alphabetical order isn't particularly meaningful. (Though maybe your alpha order is meaningful.)