Problems with ordering for geom_segment chart

I would appreciate any advice on my plot; I am a ggplot novice!
I am trying to create a Cleveland dot plot faceted by cluster, which has 3 levels. I am struggling with 3 issues:
1. Within each cluster, I want the dots to be ordered by my continuous x-variable. The code below isn't ordering them correctly.
2. Is it possible to change the dot type based on whether the y-variable ends in 0 (does not have the characteristic) or 1 (does have the characteristic)?
3. I have a variable in my data set (Population) which shows the population % with a characteristic. I would like to see whether a cluster characteristic is over- or under-represented compared with the population, so I would like to add a second dot for it on the same line as each y-variable.
Here is my code:
ggplot(cl1, aes(x = Cluster_prop, y = reorder(Var, Cluster_prop))) +
  geom_segment(aes(yend = Var), xend = 0, colour = "grey50") +
  geom_point(size = 3, aes(colour = Cluster)) +
  facet_grid(Cluster ~ ., scales = "free_y", space = "free_y") +
  ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature")
Here is my data:
> dput(cl1)
structure(list(Var = structure(c(2L, 3L, 5L, 7L, 14L, 16L, 18L,
19L, 20L, 22L, 15L, 9L, 7L, 6L, 21L, 13L, 17L, 12L, 4L, 11L,
15L, 17L, 21L, 1L, 13L, 4L, 10L, 12L, 6L, 8L), .Label = c("asthdoc_1",
"AttacksOnExer_1_0", "AttacksTTT_1_0", "AttacksTTT_1_1", "Breath0rmal_1_0",
"Breath0rmal_1_1", "CAsthmaMed_1_0", "CAsthmaMed_1_1", "CCurrentAsthma_1_0",
"CCurrentAsthma_1_1", "CongColds_1_1", "CoughNight_1_1",
"CoughWithColds_1_1",
"EverWheeze_1_0", "EverWheeze_1_1", "Wheeze6M_1_0", "Wheeze6M_1_1",
"WheezeMostDays_1_0", "WheezeOcc_1_0", "WheezeWithColds_1_0",
"WheezeWithColds_1_1", "WheezeWithShort_1_0"), class = "factor"),
Cluster_prop = c(100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 99.4219653, 98.8439306, 95.3757225, 94.7976879,
83.2369942, 79.1907514, 53.7572254, 50.867052, 50.867052,
100, 100, 100, 93.103448, 89.655172, 86.206897, 86.206897,
82.758621, 79.310345, 79.310345), Population = c(96.131528,
78.143133, 63.636364, 95.16441, 60.928433, 67.891683, 97.485493,
89.555126, 62.669246, 90.32882, 39.071567, 94.584139, 95.16441,
36.363636, 37.330754, 68.665377, 32.108317, 43.520309, 21.856867,
42.166344, 39.071567, 32.108317, 37.330754, 9.864603, 68.665377,
21.856867, 5.415861, 43.520309, 36.363636, 4.83559), Cluster =
structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1",
"2", "3"), class = "factor")), .Names = c("Var", "Cluster_prop",
"Population", "Cluster"), row.names = c(NA, -30L), vars = "Cluster", drop =
TRUE, indices = list(
0:9, 10:19, 20:29), group_sizes = c(10L, 10L, 10L), biggest_group_size =
10L, labels = structure(list(
Cluster = 1:3), row.names = c(NA, -3L), class = "data.frame", vars =
"Cluster", drop = TRUE, .Names = "Cluster"), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Many thanks for any advice!

For your second (and, following an edit, third) issue:
library(tidyverse)
library(stringr)
# str_sub(string, start = -1, end = -1) extracts the last character of a string
cl2 <- cl1 %>% mutate(Shape = str_sub(Var, start = -1, end = -1))
ggplot(cl2, aes(x = Cluster_prop, y = reorder(Var, Cluster_prop))) +
  geom_segment(aes(yend = Var), xend = 0, colour = "grey50") +
  # Shape encodes whether Var ends in 0 or 1
  geom_point(size = 3, aes(colour = Cluster, shape = Shape)) +
  # Second dot on the same line: the population percentage
  geom_point(aes(x = Population), size = 2, color = "black") +
  facet_grid(Cluster ~ ., scales = "free_y", space = "free_y") +
  ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature")
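The code above does not address the first issue (ordering within each facet). One possible approach, sketched here on made-up data, is to build a plotting key that combines Var with Cluster so that each facet gets its own levels, order that key by Cluster_prop, and strip the suffix again when labelling the axis. The data frame toy, the column key and the "___" separator are illustrative assumptions, not part of the original question.

```r
library(ggplot2)

# Toy data (hypothetical values): the same Var appears in two clusters
toy <- data.frame(
  Var = c("a_1", "b_0", "c_1", "a_1", "b_0", "c_1"),
  Cluster = factor(c(1, 1, 1, 2, 2, 2)),
  Cluster_prop = c(90, 50, 70, 20, 80, 60)
)

# Combine Var and Cluster so every facet gets its own set of levels,
# then order those levels by Cluster_prop
toy$key <- paste(toy$Var, toy$Cluster, sep = "___")
toy$key <- reorder(toy$key, toy$Cluster_prop)

p <- ggplot(toy, aes(x = Cluster_prop, y = key)) +
  geom_segment(aes(yend = key), xend = 0, colour = "grey50") +
  geom_point(size = 3, aes(colour = Cluster)) +
  # Strip the "___<cluster>" suffix so the axis shows the original names
  scale_y_discrete(labels = function(x) sub("___.*$", "", x)) +
  facet_grid(Cluster ~ ., scales = "free_y", space = "free_y")
```

With scales = "free_y", each facet only shows its own keys, so the dots should appear in increasing order of Cluster_prop within every cluster.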

Related

Plot Y values against the time grouped by an ID

I want to make a time series plot grouped by ID. My dataset has 42 different IDs, each with 7 different timeframes. The timeframe varies per ID and ranges from 09/2016 to 08/2018. For example, ID1 can start in 10/2016 and end in 07/2017 (with 7 rows, each containing a different date), while ID40 can start in 11/2016 and end in 06/2018 (also with 7 rows, each containing a different date). I try to plot this with the following code:
p <- ggplot(data = df6, aes(x = START, y = AI, col = ID, group = ID))
p + geom_point(size = 1.2, alpha = .8) +
  stat_smooth(aes(group = 1)) +
  stat_summary(aes(group = 1), geom = "point", fun.y = mean, shape = 17, size = 3) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
This gives me the following graph:
As one can see, the x-axis is not chronological. It should start at 09/2016 and end at 08/2018, with each Y value plotted at the date for its ID. I have the following dataset:
structure(list(ID = c("ID1", "ID1", "ID1", "ID1", "ID1", "ID1",
"ID1", "ID10", "ID10", "ID10", "ID10", "ID10", "ID10", "ID10",
"ID11", "ID11", "ID11", "ID11", "ID11", "ID12"), Time = c("1",
"2", "3", "4", "5", "6", "7", "1", "2", "3", "4", "5", "6", "7",
"1", "2", "3", "4", "5", "1"), AI = c(0.393672183448241, 0.4876954603533,
0.411717908455957, 0.309769862660288, 0.149826889496538, 0.2448558592586,
0.123606753324621, 0.296109333767922, 0.309960002123076, 0.445886231347992,
0.370013553008003, 0.393414429902431, 0.318940511323733, 0.131112361225666,
0.31961673567578, 0.227268892979164, 0.433471105477564, 0.207184572401005,
0.144257239122978, 0.520204263001733), AI_VAR = c(0.154977788020905,
0.237846862049217, 0.169511636143347, 0.0959573678125739, 0.0224480968162077,
0.0599543918132674, 0.0152786294674538, 0.0876807375444826, 0.0960752029161373,
0.198814531305715, 0.136910029409606, 0.154774913655455, 0.101723049763444,
0.0171904512661696, 0.102154857724042, 0.0516511497159746, 0.187897199283942,
0.0429254470409874, 0.020810151039384, 0.270612475245176), activity = c(0,
0.303472222222222, 0.232638888888889, 0.228472222222222, 0.348611111111111,
0.215972222222222, 0.123611111111111, 0.357638888888889, 0.235416666666667,
0.233333333333333, 0.2875, 0.353472222222222, 0.356944444444444,
0.149305555555556, 0.448611111111111, 0.213888888888889, 0.248611111111111,
0.288888888888889, 0.25625, 0.238888888888889), ZIM_SD = c(0,
0.148002025121106, 0.095781596758851, 0.0707738088994687, 0.0522313184217097,
0.0528820640482116, 0.0152791681192935, 0.105900213118389, 0.0729697504998075,
0.104040120647865, 0.106378896489801, 0.139061072791901, 0.113844043625277,
0.0195758039329988, 0.143383618921218, 0.0486102909983211, 0.107765733167339,
0.059853320915846, 0.036965917525263, 0.124271018383747), ZIM_VAR = c(0,
0.0721799157746582, 0.039434998686126, 0.0219235930627339, 0.00782565597342798,
0.0129484832318932, 0.00188860836472692, 0.0313580415523671,
0.0226177040198407, 0.0463900573046668, 0.0393616334552618, 0.0547086326740462,
0.0363094774850072, 0.00256662987654616, 0.0458278042289798,
0.0110476070225835, 0.0467133314886466, 0.0124006847007297, 0.00533260120384214,
0.0646463135307921), CHECK = c(10L, 13L, 11L, 7L, 7L, 5L, 4L,
36L, 36L, 34L, 34L, 32L, 29L, 21L, 28L, 27L, 26L, 25L, 21L, 36L
), BULBAR = c(2L, 4L, 4L, 4L, 4L, 2L, 2L, 9L, 9L, 9L, 9L, 9L,
7L, 6L, 12L, 12L, 11L, 11L, 11L, 11L), FINE = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 9L, 9L, 8L, 8L, 7L, 6L, 4L, 2L, 1L, 1L, 1L, 0L, 7L
), GROSS = c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 9L, 9L, 9L, 9L, 8L,
8L, 6L, 3L, 3L, 3L, 3L, 2L, 6L), RESPI = c(6L, 7L, 5L, 1L, 1L,
1L, 1L, 9L, 9L, 8L, 8L, 8L, 8L, 5L, 11L, 11L, 11L, 10L, 8L, 12L
), GROSS_RENEWD = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 6L, 6L, 6L, 6L,
5L, 5L, 4L, 3L, 3L, 3L, 3L, 2L, 3L), ACTIVE = c(2L, 2L, 2L, 2L,
2L, 2L, 1L, 18L, 18L, 17L, 17L, 15L, 14L, 10L, 5L, 4L, 4L, 4L,
2L, 13L), NON.ACTIVE = c(8L, 11L, 9L, 5L, 5L, 3L, 3L, 18L, 18L,
17L, 17L, 17L, 15L, 11L, 23L, 23L, 22L, 21L, 19L, 23L), START = c("09/2016",
"11/2016", "01/2017", "04/2017", "06/2017", "10/2017", "02/2018",
"10/2016", "12/2016", "02/2017", "04/2017", "07/2017", "11/2017",
"04/2018", "10/2016", "12/2016", "02/2017", "04/2017", "07/2017",
"10/2016"), STOP = c("10/2016", "11/2016", "01/2017", "04/2017",
"06/2017", "10/2017", "03/2018", "10/2016", "12/2016", "02/2017",
"04/2017", "07/2017", "11/2017", "04/2018", "10/2016", "12/2016",
"02/2017", "04/2017", "07/2017", "10/2016")), row.names = c(NA,
20L), class = "data.frame")
In general, I want the START axis to begin at the earliest date and end at the latest date when it is plotted.

You should convert your "START" column to a date format. You could use the package zoo with the function as.yearmon for that. To start the axis with your start date and end it with the end date, you could create a vector of date breaks using the min (start) date and max (end) date. Here is a reproducible example:
library(ggplot2)
library(zoo)
library(dplyr)
df6 <- df6 %>%
mutate(START = as.Date(as.yearmon(START, format = '%m/%Y')))
breaks.vec <- c(min(df6$START),
seq(from=min(df6$START), to=max(df6$START), by = 'month'))
ggplot(data = df6, aes(x = START, y = AI, col = ID, group = ID)) +
geom_point(size = 1.2, alpha = .8) +
stat_smooth(aes(group = 1)) +
stat_summary(aes(group = 1), geom = "point", fun.y = mean, shape = 17, size = 3) +
scale_x_date(breaks = breaks.vec, date_labels = "%m/%Y") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
#> Warning: `fun.y` is deprecated. Use `fun` instead.
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Created on 2022-10-17 with reprex v2.0.2
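If adding zoo is not desirable, the same conversion can probably be done in base R. The sketch below assumes the strings always have the form "MM/YYYY" and maps each month to its first day; the vector names start and start_date are illustrative.

```r
# "MM/YYYY" strings as in the START column (values copied from the question)
start <- c("09/2016", "11/2016", "01/2017", "02/2018")

# Prepend a day so base R can parse a full date; each month maps to its first day
start_date <- as.Date(paste0("01/", start), format = "%d/%m/%Y")

# The strings sort by month number first, the dates sort chronologically
sort(start)
sort(start_date)
```

This also shows why the original axis was out of order: as character strings, "01/2017" sorts before "09/2016".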

create scatter plot matrix with openair and hexbin

I've worked with the openair and hexbin packages to create two scatter plots using the scatterPlot function:
scatterPlot(mydata, x ="Observed" , y = "Model1",xlab=10, ylab=10,method = "hexbin",mod.line=T,auto.text=F, col = "jet", xbin = 30)
scatterPlot(mydata, x ="Observed" , y = "Model2",xlab=10, ylab=10,method = "hexbin",mod.line=T,auto.text=F, col = "jet", xbin = 30)
I've got the two scatter plots, but I want to combine them into one plot with a single colour scale for the counts, similar to the image linked below. How should I proceed?
Please refer to this link to view the image: https://ibb.co/rF148kp

You could reorganize your data frame so that it has three columns - "Observed", "Modeled", and "Model Type". Example -
structure(list(observed = c(2L, 2L, 4L, 4L, 6L, 6L, 8L, 8L, 10L,
10L, 12L, 12L, 14L, 14L, 16L, 16L, 18L, 18L, 20L, 20L), modelled = c(1L,
5L, 7L, 2L, 5L, 9L, 13L, 15L, 16L, 14L, 18L, 17L, 10L, 21L, 26L,
24L, 22L, 28L, 27L, 30L), model_type = structure(c(1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L), .Label = c("Model 1", "Model 2"), class = "factor")), class = "data.frame",
row.names = c(NA,
-20L))
This way, you can then use the following code to create a plot containing the two scatter plots:
scatterPlot(mydata, x = "observed", y = "modelled", type = c("model_type"),
method = "hexbin",mod.line=T,auto.text=F, col = "jet", xbin = 5,
linear = TRUE, layout = c(2, 1))
Note that the above code sets xbin to 5 purely because I used a small data set for testing. Also, excuse the spelling in the y-axis and code ("modelled" should be "modeled")!
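The reorganisation step itself is not shown above. A minimal base R sketch, assuming the original data frame has the columns Observed, Model1 and Model2 used in the question (the values here are made up), could look like this:

```r
# Hypothetical wide data with the column names used in the question
mydata <- data.frame(
  Observed = c(2, 4, 6),
  Model1 = c(1, 7, 5),
  Model2 = c(5, 2, 9)
)

# Stack the two model columns into one long data frame:
# the observed values are repeated once per model
long <- data.frame(
  observed = rep(mydata$Observed, times = 2),
  modelled = c(mydata$Model1, mydata$Model2),
  model_type = factor(rep(c("Model 1", "Model 2"), each = nrow(mydata)))
)
```

The resulting long data frame has one row per observation and model, which is the shape the type = "model_type" argument expects.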

Why isn't my barplot rearranging properly when faceting with ggplot?

So I have made this barplot with this code, bars organised in descending order, great!
na.omit(insect_tally_native_ranges)%>%
group_by(native_ranges)%>%
dplyr::summarise(freq=sum(n))%>%
ggplot(aes(x=reorder(native_ranges,freq),y=freq))+
geom_col(color="#CD4F39",fill="#CD4F39",alpha=0.8)+
coord_flip()+
labs(x="Native ranges",
y="Number of invasive insect arrivals",
title="Species by native ranges")+
theme_minimal()
And now I wanted to do the same but faceting by a variable called Period, here's the code:
ggplot(native_freq_period,
aes(y=reorder(native_ranges,freq),x=freq))+
geom_barh(stat= "identity",
color="#CD4F39",
fill="#CD4F39",
alpha=0.8)+
labs(x="Native ranges",
y="Number of invasive insect arrivals",
title="Species by native ranges")+
theme_minimal()+
facet_wrap(~Period)
But the plot came out like this:
Which is pretty annoying, because it is essentially the same code as above, so the levels of native_ranges should be ordered the same way. Instead it gives me this lumpy order that isn't even alphabetical. So reorder is reordering, but not by freq! I don't understand.
Here is the data:
structure(list(native_ranges = structure(c(6L, 10L, 11L, 7L,
3L, 5L, 1L, 1L, 8L, 6L, 3L, 5L, 2L, 4L, 5L, 7L, 7L, 7L, 8L, 9L,
11L), .Label = c("Afrotropic", "Afrotropic/Neotropic", "Australasia",
"Australasia/Neotropic", "Indomalaya", "Nearctic", "Neotropic",
"Neotropic/Nearctic", "Neotropic/Nearctic/Australasia", "Palearctic",
"Palearctic/Indomalaya"), class = "factor"), Period = structure(c(4L,
4L, 4L, 4L, 4L, 4L, 3L, 4L, 4L, 3L, 3L, 3L, 4L, 4L, 2L, 1L, 2L,
3L, 2L, 4L, 3L), .Label = c("1896-1925", "1926-1955", "1956-1985",
"1986-2018"), class = "factor"), freq = c(21L, 13L, 12L, 11L,
10L, 10L, 4L, 4L, 4L, 3L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L)), row.names = c(NA, -21L), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), vars = "native_ranges", drop = TRUE, indices = list(
6:7, 12L, c(4L, 10L), 13L, c(5L, 11L, 14L), c(0L, 9L), c(3L,
15L, 16L, 17L), c(8L, 18L), 19L, 1L, c(2L, 20L)), group_sizes = c(2L,
1L, 2L, 1L, 3L, 2L, 4L, 2L, 1L, 1L, 2L), biggest_group_size = 4L, labels = structure(list(
native_ranges = structure(1:11, .Label = c("Afrotropic",
"Afrotropic/Neotropic", "Australasia", "Australasia/Neotropic",
"Indomalaya", "Nearctic", "Neotropic", "Neotropic/Nearctic",
"Neotropic/Nearctic/Australasia", "Palearctic", "Palearctic/Indomalaya"
), class = "factor")), row.names = c(NA, -11L), class = "data.frame", vars = "native_ranges", drop = TRUE))

You have to arrange the order of the variable first before plotting. Since you didn't provide any reproducible data I am using the following data
drugs <- data.frame(drug = c("a", "b", "c"), effect = c(4.2, 9.7, 6.1))
ggplot(drugs, aes(drug, effect)) +
geom_col()
Now to change the order of the variable use factor
drugs$drug <- factor(drugs$drug,levels = c("b","a","c")) #This is the order I want
ggplot(drugs, aes(drug, effect)) +
geom_col()
Here I provided the levels to factor manually. You can either do that, or sort the variable separately first and pass the sorted values as the levels, as below:
drugs$drug <- factor(drugs$drug,levels = drugs[order(drugs$effect),]$drug)
ggplot(drugs, aes(drug, effect)) +
geom_col()
This should work with facet_wrap as well.

OK, finally figured it out with help from the other answer. You need to create another column that summarizes the total frequency so you can then reorder by that column. There may be a more efficient way to do it, but I create a new summary data.frame and then join it back to the original and then reorder based on the new column.
summary_data <- data %>%
ungroup() %>%
group_by(native_ranges) %>%
summarize(total = sum(freq))
data <- data %>%
left_join(summary_data, by = "native_ranges")
ggplot(data, aes(y = reorder(native_ranges, total),x = freq)) +
geom_barh(stat= "identity",
color="#CD4F39",
fill="#CD4F39",
alpha=0.8) +
labs(x="Native ranges",
y="Number of invasive insect arrivals",
title="Species by native ranges") +
theme_minimal()+
facet_wrap(~Period)
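A possible shortcut, offered as a sketch rather than a tested replacement for the join above: base R's reorder() summarises the ordering variable per level with FUN, which defaults to mean. When a level appears in several facets, the mean of its freq values can rank it differently from its total, which would explain the lumpy order. Passing FUN = sum orders by the total directly. The vectors below are made-up toy values.

```r
# Toy data (hypothetical): some ranges occur in several periods
ranges <- factor(c("Nearctic", "Palearctic", "Neotropic",
                   "Neotropic", "Neotropic", "Nearctic"))
freq <- c(21, 13, 11, 1, 2, 3)

# Default: levels ordered by the mean of freq within each range
by_mean <- reorder(ranges, freq)

# Ordered by the total of freq within each range instead
by_sum <- reorder(ranges, freq, FUN = sum)

levels(by_mean)
levels(by_sum)
```

In the plot, this would correspond to aes(y = reorder(native_ranges, freq, FUN = sum), x = freq), with no extra summary column needed.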

Why does ggtern distort data

I can't work out why the data points in my ternary diagram appear distorted. It is particularly visible on the FeO2 scale, where none of the values approaching 50% seem to be plotted correctly. Does ggtern require some data transformation, or am I missing something?
The dataset:
KiDaSm<-structure(list(Site = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Dakawa", "Fukuchani",
"Kilwa", "Mkokotoni", "Tumbe Chwaka", "Unguja Ukuu"), class = "factor"),
Sample = structure(c(7L, 8L, 9L, 10L, 11L, 14L, 15L, 16L,
17L, 19L, 20L, 21L, 23L, 24L, 25L, 26L), .Label = c("EB005",
"EB008", "EB009", "EB017", "EB018", "EB023", "EB028", "EB030",
"EB033", "EB034", "EB035", "EB036", "EB037", "EB038", "EB040",
"EBDAK002", "EBDAK006", "EBDAK007", "EBDAK009", "EBDAK012",
"EBDAK014", "EBDAK015", "EBDAK017", "EBDAK020", "EBDAK021",
"EBDAK022", "FKCH002", "FKCH003", "FKCH005", "FKCH006", "FKCH008",
"FKCH009", "FKCH010", "FKCH012", "FKCH014", "FKCH015", "FKCH016",
"FKCH017", "FKCH018", "FKCH019", "FKCH023", "MKK002", "MKK003",
"MKK007", "MKK009", "MKK011", "MKK013", "MKK014", "MKK017",
"MKK018", "MKK020", "MKK06", "TBCH001", "TBCH002", "TBCH003",
"TBCH005", "TBCH007", "TBCH008", "TBCH009", "TBCH010", "TBCH011",
"TBCH014", "TBCH017", "TBCH018", "TBCH021", "TBCH022", "UU001",
"UU003", "UU004", "UU005", "UU007", "UU008", "UU010", "UU011",
"UU012", "UU014", "UU018", "UU020", "UU022", "UU023", "UU026",
"UU031", "UU033"), class = "factor"), ID = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("ND", "Smelting", "Smithing"), class = "factor"),
Iron = c(52.2866002788889, 57.437955161, 55.880450631, 50.213473286,
53.068958017, 55.776340727, 56.764639409, 61.37738424, 75.741474131,
75.459980082, 69.785922113, 76.298245515, 75.860464737, 77.221978734,
76.602317775, 67.582636787), Aluminium = c(8.07348620588889,
6.9369729006, 6.4314347298, 7.7061493869, 7.3254949831, 7.2108549156,
7.2113019865, 8.2022565362, 4.570137602, 4.3668232665, 5.8538177888,
4.5660791632, 4.2671637947, 4.727287541, 4.7084385736, 6.0287010895
), Silicon = c(24.6786504477778, 22.516695383, 24.261662172,
26.81463386, 25.558654883, 23.062108874, 23.144722305, 26.480492462,
17.138349267, 16.917779397, 19.620246624, 16.265818105, 17.628059944,
15.696017597, 15.786928218, 22.04500569)), .Names = c("Site",
"Sample", "ID", "Iron", "Aluminium", "Silicon"), row.names = c(NA,
-16L), class = "data.frame")
My code:
library(ggtern)
ggtern(KiDaSm, aes(Iron,Silicon, Aluminium, color=Site, shape=Site )) + geom_point() +
labs(x = expression(FeO[2]), y=expression(SiO[2]), z=expression(Al[2]*O[3])) +
scale_color_manual(values = c("#FFC300", "#FF5733")) +
theme_bw()
Ternary diagram:
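No answer is recorded here, so the following is only a likely explanation. A ternary diagram places each point by the relative shares of the three plotted components, so each row is effectively rescaled to sum to 100%. In this data set Iron, Aluminium and Silicon add up to less than 100 per row, so the plotted positions will not match the raw percentages. The base R sketch below uses the first three rows of the data, rounded, to show the rescaling; the names comp and closed are illustrative.

```r
# First three rows of the question's data, rounded for brevity
comp <- data.frame(
  Iron = c(52.29, 57.44, 55.88),
  Aluminium = c(8.07, 6.94, 6.43),
  Silicon = c(24.68, 22.52, 24.26)
)

# The raw values do not add up to 100
rowSums(comp)

# Rescale each row so the three components sum to 100
closed <- comp / rowSums(comp) * 100
closed
```

After rescaling, the Iron share of every row is larger than its raw value, which is consistent with points near 50% raw iron being drawn further along the axis.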

Suppressing data from a graph in R

I have a dataset, d, that contains personally identifiable data. All values that are suppressed are replaced with an X:
column1  column2  column3
*        FSM      X
*        Male     2.5
*        Female   X
A        FSM      6
A        Male     10.3
A        Female   11.7
B        FSM      14.8
B        Male     21.5
B        Female   25.3
I want to plot this with an X above the bars in a bar plot, where data has been suppressed, such as:
My code is:
p <- ggplot(d, aes(x=column1, y=column3, fill=column2)) +
geom_bar(position=position_dodge(), stat="identity", colour="black") +
geom_text(aes(label=column2),position= position_dodge(width=0.9), vjust=-.5) +
scale_y_continuous("Percentage",breaks=seq(0, max(d$column3), 2))
But of course, it can't plot 'X' on the graph and says:
Error: Discrete value supplied to continuous scale
How can I get the bar plotting to ignore the 'X' and still add the label if it's present?
Data dump:
structure(list(column1 = structure(c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L), .Label = c("*",
"A", "B", "C", "D", "E", "U"), class = "factor"), column2 = structure(c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L), .Label = c("FSM", "Male", "Female"), class = "factor"),
column3 = structure(c(21L, 1L, 2L, 18L, 3L, 4L, 7L, 12L,
14L, 16L, 15L, 13L, 10L, 9L, 8L, 11L, 6L, 5L, 20L, 19L, 17L
), .Label = c("1.93889541715629", "1.97444831591173", "10.1057579318449",
"11.7305458768873", "12.7758420441347", "14.4535840188014",
"14.8471615720524", "18.5830429732869", "19.9764982373678",
"20.0873362445415", "20.9606986899563", "21.5628672150411",
"24.1579558652729", "25.3193960511034", "25.7931844888367",
"29.2576419213974", "5.45876887340302", "6.11353711790393",
"6.16921269095182", "6.98689956331878", "X"), class = "factor")), .Names = c("column1",
"column2", "column3"), row.names = c(NA, -21L), class = "data.frame")
I'm happy to print 0 where there are 0 instances, but in the case of data suppression I want to make it clear that the data has been suppressed by printing an 'X', even though the bar will also show 0 instances.

First convert the height to numeric which gives NA for censored values. Then create a label column based on that. Then you need a column of zeroes for the y coordinate of the labels.
> d$column3=as.numeric(as.character(d$column3))
Warning message:
NAs introduced by coercion
> d$column4 = ifelse(is.na(d$column3),"X","")
> d$y=0
Then:
> p <- ggplot(d, aes(x=column1, y=column3, fill=column2))
> p + geom_bar(position=position_dodge(), stat="identity",
colour="black") +
geom_text(aes(label=column4,x=column1,y=y),
position=position_dodge(width=1), vjust=-0.5)
Giving:
It's a variant on labelling a geom_bar with the value of the bar. Almost a duplicate.
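The as.character() step in the conversion matters because column3 is a factor. A small base R sketch on made-up values shows the difference; the names x, vals and lab are illustrative.

```r
# A factor column like column3, with "X" marking suppressed values
x <- factor(c("2.5", "X", "10.3"))

# as.numeric() on a factor returns the internal level codes, not the values
as.numeric(x)

# Going through character first recovers the numbers; "X" becomes NA
# (the coercion warning is silenced here on purpose)
vals <- suppressWarnings(as.numeric(as.character(x)))

# Label only the suppressed entries
lab <- ifelse(is.na(vals), "X", "")
```

Here vals holds the real bar heights with NA for suppressed rows, and lab holds the text to draw above them.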
