ggplot conditionally change shape or shape fill using variable

I am trying to create a line chart that shows open symbols for data that is not detected and closed (filled) symbols for detected data. Here is some code to work with:
date <- c("1991-04-25","1991-04-26","1991-04-27","1991-04-28","1991-04-29","1991-04-25","1991-04-26","1991-04-27","1991-04-28","1991-04-29","1991-04-25","1991-04-26","1991-04-27","1991-04-28","1991-04-29")
Parameter <- c("TEA","TEA","TEA","TEA","TEA","COFFEE","COFFEE","COFFEE","COFFEE","COFFEE","WATER","WATER","WATER","WATER","WATER")
data <- c(5,4,7,3,6,4,6,8,6,3,7,8,7,6,7)
DetectYN <- c("Y","N","Y","Y","Y","N","Y","Y","Y","N","N","N","Y","Y","N")
df <- data.frame(date, Parameter,data, DetectYN)
df$date <- as.Date(df$date, "%Y-%m-%d" )
df$DetectYN <-as.character(df$DetectYN)
ggplot(df, aes(x=date, y=data)) +
geom_point(size=4, aes(shape = Parameter , colour= Parameter)) +
geom_line(aes(x=date, y=data,color = Parameter)) +
scale_shape_manual(values=ifelse(DetectYN == "Y",c(15,16,17),c(0,1,2)) , guide = "none")
This creates the following chart - nearly correct, except that my ifelse is not having the desired effect. I would like the DetectYN = "N" to be hollow (no fill) and I would like the DetectYN = "Y" to be filled. The existing symbols need to remain. Could anyone help me with this please?

This is a deceptively difficult problem!
This solution directly answers your question, and is hopefully of some use. However, I fear that it may become messy with large, complicated datasets.
I added a column combining the two variables that you wish to control shape, and then defined shape by this new column, ordering the shape numbers to achieve the desired result.
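# Combine the two variables that control shape into one column
df$shape <- paste(df$Parameter, df$DetectYN)
ggplot(df, aes(x=date, y=data, colour= Parameter)) +
geom_point(size=4, aes(shape=shape))+
geom_line() +
# values follow the sorted levels: COFFEE N, COFFEE Y, TEA N, TEA Y, WATER N, WATER Y
scale_shape_manual(values=c(0,15,1,16,2,17) , guide = "none")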
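The ordering of the shape numbers is the fragile part of this approach, because unnamed values are matched to the sorted levels of the combined column. A minimal sketch of one way around that, assuming the same data as in the question, is to build a named shape vector so each combination is tied to its symbol explicitly. The names filled, hollow and shape_values are only illustrative.

```r
library(ggplot2)

# Same data as in the question, rebuilt so the example is self-contained
df <- data.frame(
  date = as.Date(rep(c("1991-04-25", "1991-04-26", "1991-04-27",
                       "1991-04-28", "1991-04-29"), times = 3)),
  Parameter = rep(c("TEA", "COFFEE", "WATER"), each = 5),
  data = c(5, 4, 7, 3, 6, 4, 6, 8, 6, 3, 7, 8, 7, 6, 7),
  DetectYN = c("Y", "N", "Y", "Y", "Y", "N", "Y", "Y", "Y", "N",
               "N", "N", "Y", "Y", "N")
)
df$shape <- paste(df$Parameter, df$DetectYN)

# Filled and hollow versions of each parameter's symbol (illustrative names)
filled <- c(COFFEE = 15, TEA = 16, WATER = 17)
hollow <- c(COFFEE = 0, TEA = 1, WATER = 2)

# Named vector: "COFFEE Y" = 15, ..., "WATER N" = 2
shape_values <- c(
  setNames(filled, paste(names(filled), "Y")),
  setNames(hollow, paste(names(hollow), "N"))
)

p <- ggplot(df, aes(x = date, y = data, colour = Parameter)) +
  geom_line() +
  geom_point(aes(shape = shape), size = 4) +
  scale_shape_manual(values = shape_values, guide = "none")
```

With named values, adding a new parameter means adding one entry to filled and one to hollow, rather than recounting positions in a long vector.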

Related

Bug in ggplot2?

I'm currently working on plotting simple plots using ggplot2.
The graph looks good, but there is one tiny detail I can't fix.
When you look at the legend, it says "Low n" twice. One of them should be "High n".
Here is my code:
half_plot <- ggplot() +
ggtitle(plot_title) +
geom_line(data = plot_dataframe_SD1, mapping = aes(x = XValues, y = YValues_SD1, color = "blue")) +
geom_line(data = plot_dataframe_SD2, mapping = aes(x = XValues, y = YValues_SD2, color = "green")) +
xlim(1, 2) +
ylim(1, 7) +
xlab("Standard Deviation") +
ylab(AV_column_name) +
scale_fill_identity(name = 'the fill', guide = 'legend',labels = c('m1')) +
scale_colour_manual(name = 'Legend',
values =c('blue'='blue','green'='green'),
labels = c(paste("Low ", Mod_column_name), paste("High ", Mod_column_name)))
Here is the graph I get in my output:
So do you know how to fix this?
And there is one more thing that makes me curious: I can't remember changing anything in this code, but I know that the legend worked just fine a few days ago. I saved pictures I made with this code and they look all right.
Also if you have any further suggestions how to upgrade the graph, these suggestions are very welcome too.

When asking questions, it will help us if you provide a reproducible example including the data. With some sample data, there are a couple ways to fix it.
Sample data
library(dplyr)
plot_dataframe_SD1 = data.frame(XValues=seq(1,2,by=.2)) %>%
mutate(YValues_SD1=XValues*2)
plot_dataframe_SD2 = data.frame(XValues=seq(1,2,by=.2)) %>%
mutate(YValues_SD2=XValues*5)
The simplest way to modify your code is to supply the desired color label in the aesthetic.
Mod_column_name = 'n'
# Named vector: each legend label is tied to its colour, so nothing depends on alphabetical order
line_colours <- setNames(c('blue','green'), c(paste('Low',Mod_column_name),paste('High',Mod_column_name)))
half_plot <- ggplot() +
# put the desired label name in the aesthetic
# link describing the bang bang operator (!!) https://www.r-bloggers.com/2019/07/bang-bang-how-to-program-with-dplyr/
geom_line(data=plot_dataframe_SD1,mapping=aes(x=XValues,y=YValues_SD1,color=!!paste('Low',Mod_column_name))) +
geom_line(data=plot_dataframe_SD2,mapping=aes(x=XValues,y=YValues_SD2,color=!!paste('High',Mod_column_name))) +
scale_color_manual(values=line_colours, breaks=names(line_colours))
A more general approach is to join the dataframes and pivot the joined df to have a column with the SD values and another to specify how to separate the colors. This makes it easier to plot without having to make multiple calls to geom_line.
# Join the dfs, pivot the SD columns longer, and make a new column with your desired labels
joined_df = plot_dataframe_SD1 %>% full_join(plot_dataframe_SD2,by='XValues') %>%
tidyr::pivot_longer(cols=contains('YValues'),names_to='df_num',values_to='SD') %>%
mutate(label_name=if_else(df_num == 'YValues_SD1',paste('Low',Mod_column_name),paste('High',Mod_column_name)))
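# Simplified plot; named values tie each label to its colour regardless of level order
label_colours <- setNames(c('blue','green'), c(paste('Low',Mod_column_name),paste('High',Mod_column_name)))
ggplot(data=joined_df,aes(x=XValues,y=SD,color=label_name)) +
geom_line() +
scale_color_manual(values=label_colours, breaks=names(label_colours))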
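One thing worth checking with scale_color_manual is how labels and colours get paired. As far as I can tell, a character variable mapped to colour has its levels sorted alphabetically, and unnamed values or labels are then assigned to those sorted levels in order. The sketch below only demonstrates the sorting step in base R. The label strings assume Mod_column_name is 'n', as in the sample code.

```r
# Labels in the order they appear in the plotting code
labels_in_code_order <- c("Low n", "High n")

# The order a discrete scale would typically use for a character variable
levels_in_plot_order <- sort(labels_in_code_order)

# Unnamed colours are paired with the sorted levels, so "blue" lands on "High n"
unnamed_pairing <- setNames(c("blue", "green"), levels_in_plot_order)

# A named vector states the intended pairing and does not depend on sorting
named_pairing <- c("Low n" = "blue", "High n" = "green")

print(unnamed_pairing)
print(named_pairing)
```

This is why a named values vector is the safer choice when the legend text matters.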

Arranging data for two facet R line plot

I am trying to make a two facet line plot as this example. My problem is to arrange data to show desired variable on x-axis. Here is small data set I wanna use.
Study,Cat,Dim1,Dim2,Dim3,Dim4
Study1,PK,-3.00,0.99,-0.86,0.46
Study1,US,-4.67,0.76,1.01,0.45
Study2,FL,-2.856,4.15,1.554,0.765
Study2,FL,-8.668,5.907,3.795,4.754
I tried to use the following code to draw line graph from this data frame.
plot1 <- ggplot(data = dims, aes(x = Cat, y = Dim1, group = Study)) +
geom_line() +
geom_point() +
facet_wrap(~Study)
As is clear, I can only use one value column to draw lines. I want to put Dim1, Dim2, Dim3, Dim4 on x axis which I cannot do in this arrangement of data. [tried c(Dim1, Dim2, Dim3, Dim4) with no luck]
Probably the solution is to transpose the table, but then I cannot reproduce the categorization for facet (Study in the above table) and colour (Cat in the above table). Any ideas how to solve this issue?

You can try this:
library(tidyr)
library(dplyr)
gather(dims, variable, value, -Study, -Cat) %>%
ggplot(aes(x=variable, y=value, group=Cat, col=Cat)) +
geom_point() + geom_line() + facet_wrap(~Study)

The solution was quite easy. Just had to think a bit and the re-arranged data looks like this.
Study,Cat,Dim,Value
Study1,PK,Dim1,-3
Study1,PK,Dim2,0.99
Study1,PK,Dim3,-0.86
Study1,PK,Dim4,0.46
Study1,US,Dim1,-4.67
Study1,US,Dim2,0.76
Study1,US,Dim3,1.01
Study1,US,Dim4,0.45
Study2,FL,Dim1,-2.856
Study2,FL,Dim2,4.15
Study2,FL,Dim3,1.554
Study2,FL,Dim4,0.765
Study2,FL,Dim1,-8.668
Study2,FL,Dim2,5.907
Study2,FL,Dim3,3.795
Study2,FL,Dim4,4.754
After that, R produced the desired result with this code.
plot1 <- ggplot(data=dims, aes(x=Dim, y=Value, colour=Cat, group=Cat)) + geom_line()+ geom_point() + facet_wrap(~Study)
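The rearranged table does not have to be typed by hand. A minimal sketch, assuming the original wide data is in a data frame called dims, uses tidyr::pivot_longer (the newer counterpart of gather) to produce the same Study, Cat, Dim, Value layout. The name dims_long is only illustrative.

```r
library(tidyr)
library(ggplot2)

# Wide data as given in the question
dims <- data.frame(
  Study = c("Study1", "Study1", "Study2", "Study2"),
  Cat   = c("PK", "US", "FL", "FL"),
  Dim1  = c(-3.00, -4.67, -2.856, -8.668),
  Dim2  = c(0.99, 0.76, 4.15, 5.907),
  Dim3  = c(-0.86, 1.01, 1.554, 3.795),
  Dim4  = c(0.46, 0.45, 0.765, 4.754)
)

# One row per Study/Cat/Dim combination
dims_long <- pivot_longer(dims, cols = Dim1:Dim4,
                          names_to = "Dim", values_to = "Value")

plot1 <- ggplot(dims_long, aes(x = Dim, y = Value, colour = Cat, group = Cat)) +
  geom_line() +
  geom_point() +
  facet_wrap(~Study)
```

Note that Study2 contains two rows with Cat equal to FL, so grouping by Cat alone joins both series into a single line. Adding a row identifier before pivoting and grouping on it would keep them apart.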

How to format the scatterplots of data series in R

I have been struggling in creating a decent looking scatterplot in R. I wouldn't think it was so difficult.
After some research, it seemed to me that ggplot would have been a choice allowing plenty of formatting. However, I'm struggling in understanding how it works.
I'd like to create a scatterplot of two data series, displaying the points with two different colours, and perhaps different shapes, and a legend with series names.
Here is my attempt, based on this:
year1 <- mpg[which(mpg$year==1999),]
year2 <- mpg[which(mpg$year==2008),]
ggplot() +
geom_point(data = year1, aes(x=cty,y=hwy,color="yellow")) +
geom_point(data = year2, aes(x=cty,y=hwy,color="green")) +
xlab('cty') +
ylab('hwy')
Now, this looks almost OK, but with non-matching colors (unless I suddenly became color-blind). Why is that?
Also, how can I add series names and change symbol shapes?

Don't build 2 different dataframes:
df <- mpg[which(mpg$year%in%c(1999,2008)),]
df$year<-as.factor(df$year)
ggplot() +
geom_point(data = df, aes(x=cty,y=hwy,color=year,shape=year)) +
xlab('cty') +
ylab('hwy')+
scale_color_manual(values=c("green","yellow"))+
scale_shape_manual(values=c(2,8))+
guides(colour = guide_legend("Year"),
shape = guide_legend("Year"))

This will work with the way you currently have it set-up:
ggplot() +
geom_point(data = year1, aes(x=cty,y=hwy), col = "yellow", shape=1) +
geom_point(data = year2, aes(x=cty,y=hwy), col="green", shape=2) +
xlab('cty') +
ylab('hwy')

You want:
library(ggplot2)
ggplot(mpg, aes(cty, hwy, color=as.factor(year)))+geom_point()
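On the question of why the colours did not match: a string inside aes() is treated as data to be mapped, not as a colour name, so "yellow" and "green" become two category labels that ggplot then paints with its default palette. A minimal sketch that keeps the two separate data frames from the question but makes the mapping explicit could look like this. The labels "1999" and "2008" are chosen here for illustration.

```r
library(ggplot2)

year1 <- mpg[mpg$year == 1999, ]
year2 <- mpg[mpg$year == 2008, ]

p <- ggplot() +
  # Inside aes(): a label that will appear in the legend
  geom_point(data = year1, aes(x = cty, y = hwy, colour = "1999", shape = "1999")) +
  geom_point(data = year2, aes(x = cty, y = hwy, colour = "2008", shape = "2008")) +
  # The scales decide which colour and symbol each label gets
  scale_colour_manual(name = "Year", values = c("1999" = "yellow", "2008" = "green")) +
  scale_shape_manual(name = "Year", values = c("1999" = 2, "2008" = 8)) +
  xlab("cty") +
  ylab("hwy")
```

Giving both scales the same name and the same labels lets ggplot merge them into one legend.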

3-variables plotting heatmap ggplot2

I'm currently working on a very simple data.frame, containing three columns:
x contains x-coordinates of a set of points,
y contains y-coordinates of the set of points, and
weight contains a value associated to each point;
Now, working in ggplot2 I seem to be able to plot contour levels for these data, but i can't manage to find a way to fill the plot according to the variable weight. Here's the code that I used:
ggplot(df, aes(x,y, fill=weight)) +
geom_density_2d() +
coord_fixed(ratio = 1)
You can see that there's no filling whatsoever, sadly.
I've been trying for three days now, and I'm starting to get depressed.
Specifying fill=weight and/or color = weight in the general ggplot call, resulted in nothing. I've tried to use different geoms (tile, raster, polygon...), still nothing. Tried to specify the aes directly into the geom layer, also didn't work.
Tried to convert the object as a ppp but ggplot can't handle them, and also using base-R plotting didn't work. I have honestly no idea of what's wrong!
I'm attaching the first 10 points' data, which is spaced on an irregular grid:
x = c(-0.13397460,-0.31698730,-0.13397460,0.13397460,-0.28867513,-0.13397460,-0.31698730,-0.13397460,-0.28867513,-0.26794919)
y = c(-0.5000000,-0.6830127,-0.5000000,-0.2320508,-0.6547005,-0.5000000,-0.6830127,-0.5000000,-0.6547005,0.0000000)
weight = c(4.799250e-01,5.500250e-01,4.799250e-01,-2.130287e+12,5.798250e-01,4.799250e-01,5.500250e-01,4.799250e-01,5.798250e-01,6.618956e-01)
Any advice? The desired output would be something along these lines:
Thank you in advance.

From your description geom_density doesn't sound right.
You could try geom_raster:
ggplot(df, aes(x,y, fill = weight)) +
geom_raster() +
coord_fixed(ratio = 1) +
scale_fill_gradientn(colours = rev(rainbow(7))) # colourmap

Here is a second-best option using fill=..level.. in the aesthetic. There is a good explanation of ..level.. here.
# load libraries
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
# build your data.frame
df <- data.frame(x=x, y=y, weight=weight)
# build color Palette
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")), space="Lab")
# Plot
ggplot(df, aes(x,y, fill=..level..) ) +
stat_density_2d( bins=11, geom = "polygon") +
scale_fill_gradientn(colours = myPalette(11)) +
theme_minimal() +
coord_fixed(ratio = 1)
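Note that a 2d density only reflects how crowded the points are, so the weight column never enters the fill. If the goal is to colour regions by weight on an irregular grid, one possible sketch is stat_summary_2d, which bins the x/y plane and fills each bin with a summary of z. The choice of mean and of 10 bins is an assumption made for illustration.

```r
library(ggplot2)

# First ten points from the question
x <- c(-0.13397460, -0.31698730, -0.13397460, 0.13397460, -0.28867513,
       -0.13397460, -0.31698730, -0.13397460, -0.28867513, -0.26794919)
y <- c(-0.5000000, -0.6830127, -0.5000000, -0.2320508, -0.6547005,
       -0.5000000, -0.6830127, -0.5000000, -0.6547005, 0.0000000)
weight <- c(4.799250e-01, 5.500250e-01, 4.799250e-01, -2.130287e+12, 5.798250e-01,
            4.799250e-01, 5.500250e-01, 4.799250e-01, 5.798250e-01, 6.618956e-01)
df <- data.frame(x = x, y = y, weight = weight)

# Fill each bin with the mean weight of the points that fall inside it
p <- ggplot(df, aes(x, y, z = weight)) +
  stat_summary_2d(fun = mean, bins = 10) +
  coord_fixed(ratio = 1) +
  scale_fill_gradientn(colours = rev(rainbow(7)))
```

The fourth weight in the sample (about -2.13e12) is many orders of magnitude away from the rest and will dominate any continuous colour scale, so it may be worth checking whether it is a genuine value.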

Omitting a Missing x-axis Value in ggplot2 (Convert range to categorical variable)

I am using ggplot to generate a chart that summarises a race made up of several laps. There are 24 participants in the race, numbered 1-12 and 14-25. I am plotting a summary measure for each participant, but ggplot assumes I want the number range 1-25 rather than the categories 1-12, 14-25.
What's the fix for this? Here's the code I am using (the data is sourced from a Google spreadsheet).
sskey='0AmbQbL4Lrd61dHlibmxYa2JyT05Na2pGVUxLWVJYRWc'
library("ggplot2")
require(RCurl)
gsqAPI = function(key,query,gid){ return( read.csv( paste( sep="", 'http://spreadsheets.google.com/tq?', 'tqx=out:csv', '&tq=', curlEscape(query), '&key=', key, '&gid=', curlEscape(gid) ) ) ) }
sin2011racestatsX=gsqAPI(sskey,'select A,B,G',gid='13')
sin2011proximity=gsqAPI(sskey,'select A,B,C',gid='12')
h=sin2011proximity
k=sin2011racestatsX
l=subset(h,lap==1)
ggplot() +
geom_step(aes(x=h$car, y=h$pos, group=h$car)) +
scale_x_discrete(limits =c('VET','WEB','HAM','BUT','ALO','MAS','SCH','ROS','SEN','PET','BAR','MAL','','SUT','RES','KOB','PER','BUE','ALG','KOV','TRU','RIC','LIU','GLO','AMB'))+
xlab(NULL) + opts(title="F1 2011 Korea \nRace Summary Chart", axis.text.x=theme_text(angle=-90, hjust=0)) +
geom_point(aes(x=l$car, y=l$pos, pch=3, size=2)) +
geom_point(aes(x=k$driverNum, y=k$classification,size=2), label='Final') +
geom_point(aes(x=k$driverNum, y=k$grid, col='red')) +
ylab("Position")+
scale_y_discrete(breaks=1:24,limits=1:24)+ opts(legend.position = "none")

Expanding on my cryptic comment, try this:
#Convert these to factors with the appropriate labels
# Note that I removed the ''
h$car <- factor(h$car,labels = c('VET','WEB','HAM','BUT','ALO','MAS','SCH','ROS','SEN','PET','BAR','MAL',
'SUT','RES','KOB','PER','BUE','ALG','KOV','TRU','RIC','LIU','GLO','AMB'))
k$driverNum <- factor(k$driverNum,labels = c('VET','WEB','HAM','BUT','ALO','MAS','SCH','ROS','SEN','PET','BAR','MAL',
'SUT','RES','KOB','PER','BUE','ALG','KOV','TRU','RIC','LIU','GLO','AMB'))
l=subset(h,lap==1)
ggplot() +
geom_step(aes(x=h$car, y=h$pos, group=h$car)) +
geom_point(aes(x=l$car, y=l$pos, pch=3, size=2)) +
geom_point(aes(x=k$driverNum, y=k$classification,size=2), label='Final') +
geom_point(aes(x=k$driverNum, y=k$grid, col='red')) +
ylab("Position") +
scale_y_discrete(breaks=1:24,limits=1:24) + opts(legend.position = "none") +
opts(title="F1 2011 Korea \nRace Summary Chart", axis.text.x=theme_text(angle=-90, hjust=0)) + xlab(NULL)
Calling scale_x_discrete is no longer necessary. And stylistically, I prefer putting opts and xlab stuff at the end.
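The reason the factor conversion removes the gap can be seen without the spreadsheet. In this sketch, car_numbers is a hypothetical stand-in for the car column: factor() creates one level per distinct value, so the unused number 13 never becomes a category.

```r
# Hypothetical stand-in for the car numbers: 1-12 and 14-25, no 13
car_numbers <- c(1:12, 14:25)

drivers <- c('VET','WEB','HAM','BUT','ALO','MAS','SCH','ROS','SEN','PET','BAR','MAL',
             'SUT','RES','KOB','PER','BUE','ALG','KOV','TRU','RIC','LIU','GLO','AMB')

# One level per distinct value, labelled in sorted order of the numbers
car <- factor(car_numbers, labels = drivers)

nlevels(car)                         # number of categories on the axis
as.character(car[car_numbers == 14]) # label attached to car number 14
```

Because the labels are assigned in the sorted order of the distinct values, the labels vector must be listed in car-number order.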
Edit
A few notes in response to your comment. Many of your difficulties can be eased by a more streamlined use of ggplot. Your data is in an awkward format:
#ddply comes from plyr and melt from reshape2
library(plyr)
library(reshape2)
#Summarise so we can use geom_linerange rather than geom_step
d1 <- ddply(h,.(car),summarise,ymin = min(pos),ymax = max(pos))
#R has a special value for missing data; use it!
k$classification[k$classification == 'null'] <- NA
k$classification <- as.integer(k$classification)
#The other two data sets should be merged and converted to long format
d2 <- merge(l,k,by.x = "car",by.y = "driverNum")
colnames(d2)[3:5] <- c('End of Lap 1','Final Position','Grid Position')
d2 <- melt(d2,id.vars = 1:2)
#Now the plotting call is much shorter
ggplot() +
geom_linerange(data = d1,aes(x= car, ymin = ymin,ymax = ymax)) +
geom_point(data = d2,aes(x= car, y= value,shape = variable),size = 2) +
opts(title="F1 2011 Korea \nRace Summary Chart", axis.text.x=theme_text(angle=-90, hjust=0)) +
labs(x = NULL, y = "Position", shape = "")
A few notes. You were setting aesthetics to fixed values (size = 2) which should be done outside of aes(). aes() is for mapping variables (i.e. columns) to aesthetics (color, shape, size, etc.). This allows ggplot to intelligently create the legend for you.
Merging the second two data sets and then melting it creates a grouping variable for ggplot to use in the legend. I used the shape aesthetic since a few values overlap; using color may make that hard to spot. In general, ggplot will resist mixing aesthetics into a single legend. If you want to use shape, color and size you'll get three legends.
I prefer setting labels using labs, since you can do them all in one spot. Note that setting the aesthetic label to "" removes the legend title.
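The code above was written for an old ggplot2 release. In later versions opts() and theme_text() were replaced by theme() and element_text(), and titles can go into labs(). A minimal sketch of the same styling in the newer syntax follows. The small data frame d2 is made up here purely so the example runs without the spreadsheet.

```r
library(ggplot2)

# Made-up stand-in for the melted data set d2
d2 <- data.frame(
  car      = factor(c("VET", "WEB", "HAM"), levels = c("VET", "WEB", "HAM")),
  variable = c("Final Position", "Final Position", "Final Position"),
  value    = c(1, 3, 2)
)

p <- ggplot(d2, aes(x = car, y = value, shape = variable)) +
  # size is a fixed value, so it is set outside aes()
  geom_point(size = 2) +
  labs(title = "F1 2011 Korea \nRace Summary Chart",
       x = NULL, y = "Position", shape = "") +
  # theme() and element_text() take the place of opts() and theme_text()
  theme(axis.text.x = element_text(angle = -90, hjust = 0))
```

Hiding the legend, which the older code did with opts(legend.position = "none"), becomes theme(legend.position = "none").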
