I want to make simple heat map using ggplot, but the heatmap I get is weird. the heatmap plot spreads unequally on the background and I don't know how to fix this.
I have already tried this following code which gave me the heat map plot, but with weird result.
ggplot(data = vmd, aes(x = x, y = y)) +
geom_tile(aes(fill = val)) +
scale_fill_gradientn(colours = mycol)
Someone already said that you can use +coord_equal() to fix the ratio of the X and Y axis.
If you talk about the NA, what I usually do in this case is to clean the data before plotting it.
vmd_clean = vmd[complete.cases(vmd),]
You can also dig into the subset(...) function.
There might be a solution built in ggplot2, but I don't know about it. If someone knows about it, I'd love to learn about.
Related
I found this on the Tidyverse Github:
https://github.com/tidyverse/ggplot2/issues/3716
but I can't find the resolution of yutannihilation's question.
For exploratory data analysis, I would like for the outline stroke to reach the x-axis as it does with base R, including facets with scales="free".
Is there a way to do this programmatically? The user may have multiple facets of data, on the same or different scales. Can I ensure the x-axis is wide enough to take the density to zero?
I have tried outline.type = "full" and "both" but neither seem to work.
The MRE shows the issue. The use case is within a Shiny app and can be facet_wrap-ed as well.
Thanks!
#R base
plot(density(diamonds$carat, adjust = 5))
#ggplot
library(ggplot2)
ggplot(diamonds, aes(carat)) +
geom_density(adjust = 5)
A straightforward solution would be to calculate the density yourself and plot that:
library(ggplot2)
ggplot(as.data.frame(density(diamonds$carat, adjust = 5)[1:2]), aes(x, y)) +
geom_line()
I'm kinda stuck in a matter.
Following the advice of another fellow user in this platform, I was able to plot a raster image and have the pixel assume the values I had assigned I a column, named "colors" in my df.
That was great, but know I need the legends assum the categories I have created in another column. So far, my code and my out put look like this
ggplot(data = Xinjiang_df) + geom_raster(aes(x = x, y = y, fill = colors))+ scale_fill_identity(guide = "legend")
Output
I appreciate any suggestion. Thanks.
I have a dataset myData which contains x and y values for various Samples. I can create a line plot for a dataset which contains a few Samples with the following pseudocode, and it is a good way to represent this data:
myData <- data.frame(x = 290:450, X52241 = c(..., ..., ...), X75123 = c(..., ..., ...))
myData <- myData %>% gather(Sample, y, -x)
ggplot(myData, aes(x, y)) + geom_line(aes(color=Sample))
Which generates:
This turns into a Spaghetti Plot when I have a lot more Samples added, which makes the information hard to understand, so I want to represent the "hills" of each sample in another way. Preferably, I would like to represent the data as a series of stacked bars, one for each myData$Sample, with transparency inversely related to what is in myData$y. I've tried to represent that data in photoshop (badly) here:
Is there a way to do this? Creating faceted plots using facet_wrap() or facet_grid() doesn't give me what I want (far too many Samples). I would also be open to stacked ridgeline plots using ggridges, but I am not understanding how I would be able to convert absolute values to a stat(density) value needed to plot those.
Any suggestions?
Thanks to u/Joris for the helpful suggestion! Since, I did not find this question elsewhere, I'll go ahead and post the pretty simple solution to my question here for others to find.
Basically, I needed to apply the alpha aesthetic via aes(alpha=y, ...). In theory, I could apply this over any geom. I tried geom_col(), which worked, but the best solution was to use geom_segment(), since all my "bars" were going to be the same length. Also note that I had to "slice" up the segments in order to avoid the problem of overplotting similar to those found here, here, and here.
ggplot(myData, aes(x, Sample)) +
geom_segment(aes(x=x, xend=x-1, y=Sample, yend=Sample, alpha=y), color='blue3', size=14)
That gives us the nice gradient:
Since the max y values are not the same for both lines, if I wanted to "match" the intensity I normalized the data (myDataNorm) and could make the same plot. In my particular case, I kind of preferred bars that did not have a gradient, but which showed a hard edge for the maximum values of y. Here was one solution:
ggplot(myDataNorm, aes(x, Sample)) +
geom_segment(aes(x=x, xend=x-1, y=Sample, y=end=Sample, alpha=ifelse(y>0.9,1,0)) +
theme(legend.position='none')
Better, but I did not like the faint-colored areas that were left. The final code is what gave me something that perfectly captured what I was looking for. I simply moved the ifelse() statement to apply to the x aesthetic, so the parts of the segment drawn were only those with high enough y values. Note my data "starts" at x=290 here. Probably more elegant ways to combine those x and xend terms, but whatever:
ggplot(myDataNorm, aes(x, Sample)) +
geom_segment(aes(
x=ifelse(y>0.9,x,290), xend=ifelse(y>0.9,x-1,290),
y=Sample, yend=Sample), color='blue3', size=14) +
xlim(290,400) # needed to show entire scale
I am trying to plot two Cumulative Frequency curves in ggplot, and shade the cross over at a certain cut off. I haven't been using ggplot for long, so I was hoping someone might be able to help me with this one.
The plot without filled regions, looks like this...
Which I have created using the following code...
library(ggplot2) # required
north <- rnorm(3060, mean=277,sd=3.01) # to create synthetic data
south <- rnorm(3060, mean=278, sd=3.26) # in place of my real data.
#placing in dataframe
df_temp <- data.frame(temp=c(north,south),
region=c(rep("north",length=3060),rep("south",length=3060)))
#manipulating into cdf, as I've seen in other examples
temp.regions <- ddply(df_temp, .(region), summarize,
temp = unique(temp),
ecdf = ecdf(temp)(unique(temp)))
# feeding into ggplot.
ggplot(temp.regions,aes(x=temp, y=ecdf, color = region)) +
geom_line(aes(x=temp,color=region))+
scale_colour_manual(values = c("blue","red"))
What I would then like, would be to shade both curves for temperatures below 0.2 on the y axis. Ideally I'd like to see the blue one shaded in blue, and the red one shaded in red. Then, where they cross over in purple.
However, the closest I have managed is as follows... ]
Which I have achieved using the following additions to my code.
# creating a dataframe with just the temperatures for below 0.2
# to try and aid control when plotting
temp.below <- temp.regions[which(temp.regions$ecdf<0.2),]
# plotting routine again.
ggplot(temp.regions, aes(x=temp, y=ecdf, color = region)) +
geom_line(aes(x=temp,color=region))+
scale_colour_manual(values = c("blue","red"))+
# with additional line for shading.
geom_ribbon(data=temp.below,
aes(x=temp,ymin=0,ymax=0.2), alpha=0.5)
I've seen a few examples of people shading for a normal distribution density plot, which is where I have adapted my code from. But for some reason my boxes don't seem to want anything to do with the temperature curve.
Please help! I'm sure it's quite simple, I'm just really lost and have tried a few, producing less convincing results than these.
Thank you so much for taking a look.
PROBLEM SOLVED THANKS TO HELP BELOW...
running suggested code from below
geom_ribbon(aes(ymin=0,ymax=ecdf, fill=region), alpha=0.5)
gives...
which is so very almost the solution I'm after, but with one final addition... like so
#geom_ribbon(aes(ymin=0,ymax=ecdf, fill=region), alpha=0.5)
geom_ribbon(data=temp.below, aes(ymin=0,ymax=ecdf, fill=region), alpha=0.5)
I get what I'm after...
The reason I set the data again is so that it only fills the lowest 20% of the two regions.
Thank you so much for the help :-)
Looks like you're thinking about it in the right way.
With geom_ribbon i dont think you need to set data to anything else. Just set aes(ymin = 0, ymax = ecdf, fill = region). I think that should do it.
From what I can find on stackoverflow, (such as this answer to using two scale colour gradients on one ggplot) this may not (yet) be possible with ggplot2.
I want to create a bubbleplot with two size aesthetics, one always larger than the other. The idea is to show the proportion as well as the absolute values. Now I could colour the points by the proportion but I prefer multi-bubbles. In Excel this is relatively simple. (http://i.stack.imgur.com/v5LsF.png) Is there a way to replicate this in ggplot2 (or base)?
Here's an option. Mapping size in two geom_point layers should work. It's a bit of a pain getting the sizes right for bubblecharts in ggplot though.
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point(aes(size = disp), shape = 1) +
geom_point(aes(size = hp/(2*disp))) + scale_size_continuous(range = c(15,30))
To get it looking most like your exapmle, add theme_bw():
P <- p + theme_bw()
The scale_size_continuous() is where you have to just fiddle around till you're happy - at least in my experience. If someone has a better idea there I'd love to hear it.