coord_cartesian doesn't allow one to set per-facet coordinates, and using other range-limiting tends to produce a straight-line on the specific extreme. Since we have widelay-varying y-ranges, we can't set the limits on all facets identically; limiting the data before plot is not as friendly with geom_line/geom_path (https://stackoverflow.com/a/27319786/3358272), as it takes a lot more effort to interpolate data to get to the edge and then insert NAs in order to break up the line. (Ultimately, the only way to get the desired result is to do exactly this, which can be a bit onerous with other data.)
One workaround is suggested in https://gist.github.com/burchill/d780d3e8663ad15bcbda7869394a348a, where it starts with
test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2)
and in previous versions of ggplot2, that gist defined coord_panel_ranges and was able to control coordinates per-facet. The two right facets should narrow down to a 1-6(ish) y-axis so that the exploding confidence interval goes off-screen and allows the facet to focus primarily on the "normal range" of data. (Note: the test_data and this vis is not mine, it's taken from the gist. While my needs are somewhat similar, I thought it better to stay within the confines of the gist's data and code.)
Unfortunately, this now fails for me with ggplot2-3.3.0. Initial errors related to the recent loss of ggplot2::scale_range, which I tried to mitigate with this adaptation of burchill's code (that uses other ggplot2::: internal functions):
UniquePanelCoords <- ggplot2::ggproto(
"UniquePanelCoords", ggplot2::CoordCartesian,
num_of_panels = 1,
panel_counter = 1,
panel_ranges = NULL,
setup_layout = function(self, layout, params) {
self$num_of_panels <- length(unique(layout$PANEL))
self$panel_counter <- 1
layout
},
setup_panel_params = function(self, scale_x, scale_y, params = list()) {
if (!is.null(self$panel_ranges) & length(self$panel_ranges) != self$num_of_panels)
stop("Number of panel ranges does not equal the number supplied")
train_cartesian <- function(scale, limits, name, given_range = NULL) {
if (is.null(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion,
coord_limits = self$limits[[name]])
} else {
range <- given_range
}
out <- scale$break_info(range)
out$arrange <- scale$axis_order()
names(out) <- paste(name, names(out), sep = ".")
out
}
cur_panel_ranges <- self$panel_ranges[[self$panel_counter]]
if (self$panel_counter < self$num_of_panels)
self$panel_counter <- self$panel_counter + 1
else
self$panel_counter <- 1
c(train_cartesian(scale_x, self$limits$x, "x", cur_panel_ranges$x),
train_cartesian(scale_y, self$limits$y, "y", cur_panel_ranges$y))
}
)
coord_panel_ranges <- function(panel_ranges, expand = TRUE, default = FALSE, clip = "on") {
ggplot2::ggproto(NULL, UniquePanelCoords, panel_ranges = panel_ranges,
expand = expand, default = default, clip = clip)
}
but this is still failing with
test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2) +
coord_panel_ranges(panel_ranges = list(
list(x=c(8,64), y=c(1,4)), # Panel 1
list(x=c(8,64), y=c(1,6)), # Panel 2
list(NULL), # Panel 3, an empty list falls back on the default values
list(x=c(8,64), y=c(1,7)) # Panel 4
))
# Error in panel_params$x$break_positions_minor() :
# attempt to apply non-function
I'm not very familiar with extending ggplot2, and I suspect there is something I'm missing from the ggproto. Here's what the return value from the proto looks like:
str(c(train_cartesian(scale_x, self$limits$x, "x", cur_panel_ranges$x),
train_cartesian(scale_y, self$limits$y, "y", cur_panel_ranges$y)))
# List of 14
# $ x.range : num [1:2] 8 64
# $ x.labels : chr [1:3] "20" "40" "60"
# $ x.major : num [1:3] 0.214 0.571 0.929
# $ x.minor : num [1:6] 0.0357 0.2143 0.3929 0.5714 0.75 ...
# $ x.major_source: num [1:3] 20 40 60
# $ x.minor_source: num [1:6] 10 20 30 40 50 60
# $ x.arrange : chr [1:2] "secondary" "primary"
# $ y.range : num [1:2] 1 4
# $ y.labels : chr [1:4] "1" "2" "3" "4"
# $ y.major : num [1:4] 0 0.333 0.667 1
# $ y.minor : num [1:7] 0 0.167 0.333 0.5 0.667 ...
# $ y.major_source: num [1:4] 1 2 3 4
# $ y.minor_source: num [1:7] 1 1.5 2 2.5 3 3.5 4
# $ y.arrange : chr [1:2] "primary" "secondary"
Do I need to have an x element that's a list with at least a break_positions_minor function, or is there something else that needs to be inherited in order to ensure panel_params$x$break_positions_minor exists or a reasonable default is used?
Data:
test_data <- structure(list(DataType = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
ExpType = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X", "Y"), class = "factor"),
EffectSize = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("15", "35"
), class = "factor"), Nsubjects = c(8, 16, 32, 64, 8, 16,
32, 64, 8, 16, 32, 64, 8, 16, 32, 64, 8, 16, 32, 64, 8, 16,
32, 64, 8, 16, 32, 64, 8, 16, 32, 64), Odds = c(1.06248116259846,
1.09482076720863, 1.23086993413208, 1.76749340505612, 1.06641831731573,
1.12616954196688, 1.48351814320987, 3.50755080416964, 1.11601399761081,
1.18352602009495, 1.45705466646283, 2.53384744810515, 1.13847061762186,
1.24983742407086, 1.97075900741022, 6.01497152563726, 1.02798821372378,
1.06297006279249, 1.19432835697453, 1.7320754674107, 1.02813271730924,
1.09355953747203, 1.44830680332583, 3.4732692664923, 1.06295915758305,
1.12008443626365, 1.3887632112682, 2.46321037334, 1.06722652223114,
1.1874936754725, 1.89870184372054, 5.943747409114), Upper = c(1.72895843644471,
2.09878774769559, 2.59771794965346, 5.08513435549015, 1.72999898901071,
1.8702196882561, 3.85385388850167, 5.92564404180303, 1.99113042576373,
2.61074135841984, 3.45852331828636, 4.83900142207583, 1.57897154221764,
1.8957409107653, 10, 75, 2.3763918424135, 2.50181951057562,
3.45037180395673, 3.99515276392065, 2.04584535265976, 2.39317394040066,
2.832526733659, 5.38414183471915, 1.40569501856836, 2.6778044191832,
2.98023068052396, 4.75934650422069, 1.54116883311054, 2.50647989271592,
3.48517589981551, 100), Lower = c(0.396003888752214, 0.0908537867216577,
-0.135978081389309, -1.55014754537791, 0.40283764562075,
0.382119395677663, -0.88681760208193, 1.08945756653624, 0.240897569457892,
-0.243689318229938, -0.544413985360706, 0.228693474134466,
0.69796969302609, 0.603933937376415, 0.183548809738402, 3.57236968943798,
-0.320415414965949, -0.375879384990643, -1.06171509000767,
-0.531001829099242, 0.010420081958713, -0.206054865456611,
0.0640868729926525, 1.56239669826544, 0.720223296597732,
-0.437635546655903, -0.202704257987574, 0.167074242459314,
0.593284211351745, -0.131492541770921, 0.312227787625573,
3.76692741957876)), .Names = c("DataType", "ExpType", "EffectSize",
"Nsubjects", "Odds", "Upper", "Lower"), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -32L))
I modified the function train_cartesian to match the output format of view_scales_from_scale (defined here), which seems to work:
train_cartesian <- function(scale, limits, name, given_range = NULL) {
if (is.null(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion,
coord_limits = self$limits[[name]])
} else {
range <- given_range
}
out <- list(
ggplot2:::view_scale_primary(scale, limits, range),
sec = ggplot2:::view_scale_secondary(scale, limits, range),
arrange = scale$axis_order(),
range = range
)
names(out) <- c(name, paste0(name, ".", names(out)[-1]))
out
}
p <- test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2)
p +
coord_panel_ranges(panel_ranges = list(
list(x=c(8,64), y=c(1,4)), # Panel 1
list(x=c(8,64), y=c(1,6)), # Panel 2
list(NULL), # Panel 3, an empty list falls back on the default values
list(x=c(8,64), y=c(1,7)) # Panel 4
))
Original answer
I've cheated my way out of a similar problem before.
# alternate version of plot with data truncated to desired range for each facet
p.alt <- p %+% {test_data %>%
mutate(facet = as.integer(interaction(DataType, ExpType, lex.order = TRUE))) %>%
left_join(data.frame(facet = 1:4,
ymin = c(1, 1, -Inf, 1), # change values here to enforce
ymax = c(4, 6, Inf, 7)), # different axis limits
by = "facet") %>%
mutate_at(vars(Odds, Upper, Lower), list(~ ifelse(. < ymin, ymin, .))) %>%
mutate_at(vars(Odds, Upper, Lower), list(~ ifelse(. > ymax, ymax, .))) }
# copy alternate version's panel parameters to original plot & plot the result
p1 <- ggplot_build(p)
p1.alt <- ggplot_build(p.alt)
p1$layout$panel_params <- p1.alt$layout$panel_params
p2 <- ggplot_gtable(p1)
grid::grid.draw(p2)
Many thanks go to Z.Lin for starting the fix to my question, and that answer certainly helped me get past the errors and learn a more appropriate way of working with ggproto objects.
This answer is posted as more of a flexible method of fixing the underlying problem of per-panel limits within a faceted plot. The major issue I had with my first batch of code was that it relies on the ordering of the facets, which in some of my other (private) use-cases is not always known (well, not controlled) a priori. Because of this, I wanted an unambiguous determination of per-panel limits.
I've changed the function name (and the args) to represent two points: (1) this appears to be mimic/replace coord_cartesian, and (2) I don't know that it will translate to other coord_* functions without adjustment. Comments/patches welcome at my gist.
Up front, a perfect duplication of Z.Lin's results can be had with:
p <- test_data %>%
ggplot(aes(x = Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales = "free") +
geom_line(size = 2) +
geom_ribbon(aes(ymax = Upper, ymin = Lower, fill = EffectSize, color = NULL), alpha = 0.2)
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~DataType, ~ExpType, ~ymin, ~ymax
, "A" , "X" , 1, 4
, "A" , "Y" , 1, 6
, "B" , "Y" , 1, 7
)
)
and gone is the ambiguity (that the original code introduced) of which panel is which argument in the list. Since it uses a data.frame to match (usually merge) with the layout of the plot, the order of rows does not matter.
Notes:
the panel_limits fields referenced are: xmin, xmax, ymin, and ymax, on top of whichever faceting variables are desired;
an NA in a particular field (or a missing field) means to use the previously-defined limit;
when all faceting-variables match (between panel_limits and the layout defined by facet_*), the limits are set on individual panels; this one-to-one mapping is the going-in assumption about this function;
when some (but not all) variables match, the limits are set on a subset of panels (e.g., on one axis of the panels, depending on the faceting method);
when no variables match and panel_limits is a single row, then set the limits for all panels indiscriminately; and
faceting rows in panel_limits that match nothing in layout are silently ignored.
Errors:
any faceting variables in panel_limits that do not exist in the layout (i.e., not specified within facet_*); or
more than one row in panel_limits matches a particular panel.
As an extension, this also handles a subset of the faceting variables, so if we want to limit all facets by ExpType only, then
# set the limits on panels based on one faceting variable only
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~ExpType, ~ymin, ~ymax
, "X" , NA, 4
, "Y" , 1, 5
)
) + labs(title = "panel_limits, one variable")
# set the limits on all panels
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~ymin, ~ymax
, NA, 5
)
) + labs(title = "panel_limits, no variables")
(The last example seems silly, but if the facets/plots are being built programmatically and it is not guaranteed a priori that there are individual facets, then this will result in a reasonable default behavior, assuming that everything is otherwise unambiguous.)
A further extension might allow for an NA in a facet variable to match all, such as
# does not work
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~DataType, ~ExpType, ~ymin, ~ymax
, "A" , NA , 1, 4
, NA , "Y" , 1, 6
)
)
This would require that merge understand that NA means "all/any", not a literal NA. I'm not going to extend merge at the moment to handle that, so I'm not going to complicate this function to attempt to do that. If there is a reasonable merge replacement that does this kind of calculus, let me know :-)
Many Thanks to ...
burchill for the original effort and gist; and
Z.Lin, for helping to bring the function up to ggplot2-3.3.0.
UniquePanelCoords <- ggplot2::ggproto(
"UniquePanelCoords", ggplot2::CoordCartesian,
num_of_panels = 1,
panel_counter = 1,
layout = NULL,
setup_layout = function(self, layout, params) {
self$num_of_panels <- length(unique(layout$PANEL))
self$panel_counter <- 1
self$layout <- layout # store for later
layout
},
setup_panel_params = function(self, scale_x, scale_y, params = list()) {
train_cartesian <- function(scale, limits, name, given_range = c(NA, NA)) {
if (anyNA(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion, coord_limits = limits)
isna <- is.na(given_range)
given_range[isna] <- range[isna]
}
out <- list(
ggplot2:::view_scale_primary(scale, limits, given_range),
sec = ggplot2:::view_scale_secondary(scale, limits, given_range),
arrange = scale$axis_order(),
range = given_range
)
names(out) <- c(name, paste0(name, ".", names(out)[-1]))
out
}
this_layout <- self$layout[ self$panel_counter,, drop = FALSE ]
self$panel_counter <-
if (self$panel_counter < self$num_of_panels) {
self$panel_counter + 1
} else 1
# determine merge column names by removing all "standard" names
layout_names <- setdiff(names(this_layout),
c("PANEL", "ROW", "COL", "SCALE_X", "SCALE_Y"))
limits_names <- setdiff(names(self$panel_limits),
c("xmin", "xmax", "ymin", "ymax"))
limit_extras <- setdiff(limits_names, layout_names)
if (length(limit_extras) > 0) {
stop("facet names in 'panel_limits' not found in 'layout': ",
paste(sQuote(limit_extras), collapse = ","))
} else if (length(limits_names) == 0 && NROW(self$panel_limits) == 1) {
# no panels in 'panel_limits'
this_panel_limits <- cbind(this_layout, self$panel_limits)
} else {
this_panel_limits <- merge(this_layout, self$panel_limits, all.x = TRUE, by = limits_names)
}
if (isTRUE(NROW(this_panel_limits) > 1)) {
stop("multiple matches for current panel in 'panel_limits'")
}
# add missing min/max columns, default to "no override" (NA)
this_panel_limits[, setdiff(c("xmin", "xmax", "ymin", "ymax"),
names(this_panel_limits)) ] <- NA
c(train_cartesian(scale_x, self$limits$x, "x",
unlist(this_panel_limits[, c("xmin", "xmax"), drop = TRUE])),
train_cartesian(scale_y, self$limits$y, "y",
unlist(this_panel_limits[, c("ymin", "ymax"), drop = TRUE])))
}
)
coord_cartesian_panels <- function(panel_limits, expand = TRUE, default = FALSE, clip = "on") {
ggplot2::ggproto(NULL, UniquePanelCoords,
panel_limits = panel_limits,
expand = expand, default = default, clip = clip)
}
At some point I had a similar problem to this. The result was a slightly more verbose but also more flexible option that can customize many aspects of position scales on a per-facet basis. Due to some technicality it uses the equivalent of scales::oob_keep() as oob arguments on the scales, thereby acting as if the coordinates determined the limits.
library(ggh4x)
library(tidyverse)
p <- test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2) +
facetted_pos_scales(
x = list(
scale_x_continuous(limits = c(8, 64)),
scale_x_continuous(limits = c(64, 8), trans = "reverse"),
NULL,
scale_x_continuous(limits = c(8, 64), labels = scales::dollar_format())
),
y = list(
scale_y_continuous(limits = c(1, 4), guide = "none"),
scale_y_continuous(limits = c(1, 6), breaks = 1:3),
NULL,
scale_y_continuous(limits = c(1, 7), position = "right")
)
)
Related
I'm trying to make a hourly dispatch curve with generation and energy consumpsion data, which have the characteristic that when we do a power balance (generation minus consumpsion) we get values nearly to zero.
Into the generation data there are also net interchange values, that be negative when de power system are exporting energy and positive when the system are importing energy to complete the consumption.
Thus, to the plot created with geom_area and geom_line be ok, the black line (consumption) needs be adjusted with the generation area, so that there's no gap between the area and the black line. But, in my attempts I couldn't do it. How you can see, same the energy balence resulting in zero, there are a gap beetwen 19 and 20 hours. I don't know what is wrong. Someone have idea how to do that?
Thanks in advance.
Data to the plot:
generation <-
data.frame('dayHour' = c('18/11/2018 18:00','18/11/2018 19:00','18/11/2018 20:00','18/11/2018 21:00','18/11/2018 18:00','18/11/2018 19:00','18/11/2018 20:00','18/11/2018 21:00','18/11/2018 18:00','18/11/2018 19:00','18/11/2018 20:00','18/11/2018 21:00','18/11/2018 18:00','18/11/2018 19:00','18/11/2018 20:00','18/11/2018 21:00'),
'power' = c(-1364.290, -433.110, 1132.39, 749.48, 463.75, 467.8, 469.35, 436.51, 2025.5, 2133.07, 2306.85, 2304.91, 211.52, 213.16, 214.33, 214.59),
'label' = c('net interchange', 'net interchange', 'net interchange', 'net interchange', 'gas', 'gas', 'gas', 'gas', 'hydro', 'hydro', 'hydro', 'hydro', 'biomass', 'biomass', 'biomass', 'biomass'))
generation$label <- factor(generation$label, levels = c('net interchange', 'gas', 'hydro', 'biomass'))
net.load <-
data.frame('dayHour' = c('18/11/2018 18:00', '18/11/2018 19:00', '18/11/2018 20:00', '18/11/2018 21:00'), 'power' = c(1336.48, 2380.91, 4122.91, 3705.49), 'label' = c('net load', 'net load', 'net load', 'net load'))
generation$dayHour <-
as.POSIXct(strptime(generation$dayHour,format='%d/%m/%Y %H:%M'))
net.load$dayHour <-
as.POSIXct(strptime(net.load$dayHour,format='%d/%m/%Y %H:%M'))
Power balance
pb <-
filter(generation, label == "biomass")$power +
filter(generation, label == "hydro")$power +
filter(generation, label == "gas")$power +
filter(generation, label == "net interchange")$power -
net.load$power
summary(pb)
Dispatch curve
ggplot() +
geom_area(data = generation,
aes(y = power,
x = dayHour,
fill = label)) +
geom_line(data = net.load,
aes(y = power,
x = dayHour,
colour = label),
size = 1.2,
colour = "black") +
labs(fill = "generation",
colour = 'net load')
It looks like position_stack is getting confused when the interpolation crosses the x-axis.
To fix it, you can interpolate manually before plotting (e.g. with approx):
library(tidyverse)
generation <- data.frame(
dayHour = structure(c(1542585600, 1542589200, 1542592800, 1542596400, 1542585600, 1542589200, 1542592800, 1542596400, 1542585600, 1542589200, 1542592800, 1542596400, 1542585600, 1542589200, 1542592800, 1542596400), class = c("POSIXct", "POSIXt"), tzone = ""),
power = c(-1364.29, -433.11, 1132.39, 749.48, 463.75, 467.8, 469.35, 436.51, 2025.5, 2133.07, 2306.85, 2304.91, 211.52, 213.16, 214.33, 214.59),
label = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("net interchange", "gas", "hydro", "biomass"), class = "factor")
)
generation_interpolated <- generation %>%
group_by(label) %>%
summarise(data = list(as_tibble(approx(dayHour, power, n = 501)))) %>%
unnest() %>%
mutate(x = as.POSIXct(x, origin = '1970-01-01', tz = 'UTC'))
net_power_interpolated <- generation_interpolated %>%
group_by(x) %>%
summarise(y = sum(y))
ggplot(generation_interpolated, aes(x, y)) +
geom_area(aes(fill = label)) +
geom_line(data = net_power_interpolated)
To see how approx works, a simpler, ungrouped example:
df <- data.frame(x = c(0, 5, 10), y = c(0, 20, 10))
interpolated <- approx(df$x, df$y, n = 11)
str(interpolated)
#> List of 2
#> $ x: int [1:11] 0 1 2 3 4 5 6 7 8 9 ...
#> $ y: num [1:11] 0 4 8 12 16 20 18 16 14 12 ...
ggplot(as.data.frame(interpolated), aes(x, y)) +
geom_line() +
geom_point() +
geom_point(data = df, color = 'dodgerblue', size = 4)
I have two map plots of 'Total Population' and 'Population Density' created using a shape file. Now, I'm trying to build a shiny app so that I can change from Total Population to Population Density and the plot should change accordingly. When I ran the code, i got following error code:
Warning: Error in : ggplot2 doesn't know how to deal with data of class matrix
Here's the code that i've been trying to use:
library(shiny)
library(ggplot2) #Loading necessary libraries
ui <- fluidPage(
selectInput("mr",
label="Type of Plot",
choices=c("Total Population", "Density"),
selected="Total Population"),
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
output$curv <- renderPlot({
ggplot() +
geom_polygon(data = final.plot==input$mr,
aes(x = long, y = lat, group = group, fill = Population),
color = "black", size = 0.20) +
coord_map()+
scale_fill_distiller(name="Population", palette = "YlGn")+
labs(title="Population in Australia")
}) # Output with the data file and input string to change when input changes.
}
shinyApp(ui = ui, server = server)
Any help is greatly appreciated.
UPDATE:
My dataset looks like this:
id long lat order hole piece
1 Ashmore and Cartier Islands 123.1169 -12.25333 1 FALSE 1
2 Ashmore and Cartier Islands 123.1206 -12.25611 2 FALSE 1
3 Ashmore and Cartier Islands 123.1222 -12.25861 3 FALSE 1
4 Ashmore and Cartier Islands 123.1239 -12.25528 4 FALSE 1
5 Ashmore and Cartier Islands 123.1258 -12.25333 5 FALSE 1
6 Ashmore and Cartier Islands 123.1275 -12.25619 6 FALSE 1
group Population Density
1 Ashmore and Cartier Islands.1 NA NA
2 Ashmore and Cartier Islands.1 NA NA
3 Ashmore and Cartier Islands.1 NA NA
4 Ashmore and Cartier Islands.1 NA NA
5 Ashmore and Cartier Islands.1 NA NA
6 Ashmore and Cartier Islands.1 NA NA
This is stored in the DataFrame called "final.plot". There's values of Population and Density for other states. I was able to create a static visualisation of Population and it looks like this:
There's a similar one for Density and I'm trying to create Shiny app where i can switch between these two so that the plot changes accordingly. Right now I've tried the following code:
library(shiny)
library(ggplot2) #Loading necessary libraries
ui <- fluidPage(
selectInput("pop",
label="Type of Plot",
choices=c("Population", "Density"),
selected="Total Population"),
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
output$curv <- renderPlot({
ggplot() +
geom_polygon(data = final.plot,
aes(x = long, y = lat, group = group, fill = input$pop),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")
})
}
shinyApp(ui = ui, server = server)
But I'm getting an error saying "Discrete value supplied to continuous scale".
UPDATE 2:
Here's the link for the dataset i'm using:
Dataset
I've had a quick look at your code and have a couple of suggestions.
1) When providing your data set you can use the function dput() - this writes a text representation of your data.frame which people answering your question can simply paste into R. For example:
dput(final.plot)
This will produce text output that I can assign to a dataframe by prefixing final.plot <- to the output. I have recreated your dataframe and used dput() to output it as text below. Now other users can quickly cut & paste your data:
Note this dataset is faulty - see below
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(NA, NA, NA, NA, NA, NA),
Density = c(NA, NA, NA, NA, NA, NA)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
The error "Discrete value supplied to continuous scale" is caused by two issues.
i) You are passing NA in both your Population and Density columns. The dataframe below adds some (unrealistic) numbers to these columns and the error is removed when I run the plotting code in isolation.
Corrected Toy Dataset
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(1, 2, 3, 4, 5, 6),
Density = c(7, 3, 9, 1, 3, 6)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
ii) When run interactively the error is caused because you are not passing appropriate data to fill in fill = input$pop. You should be passing the values from final.plot$Population or final.plot$Density depending on what was selected. You are instead passing the output of the dropdown box - "Population" or "Density". This can be corrected using a switch statement within renderPlot:
# User input assigns appropriate data to selectedData which can be passed to other functions:
selectedData <- switch(input$pop,
"Population" = final.plot$Population,
"Density" = final.plot$Density)
2) It would be helpful if you could provide the code which produced the static map you show in your Update above. When debugging Shiny code I find it easiest to get the function working non-interactively first and then to incorporate it into Shiny. I tried to extract your plotting code below but it is not producing the expected results:
library(ggplot2) #Loading necessary libraries
library(mapproj)
library(maps)
ggplot() +
geom_polygon(data = final.plot,
[![aes(x = long, y = lat, group = group, fill = Population),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")`
3) I am not familiar with plotting data onto maps in R but I believe your app will need to load in library(mapproj) and library(maps) to get the results you need. Here is all the above put together:
library(shiny)
library(ggplot2) #Loading necessary libraries
#I added the two lines below:
library(mapproj)
library(map)
ui <- fluidPage(
selectInput("pop",
label="Type of Plot",
choices=list("Population", "Density"),
selected="Population"), #NOTE: Total Population changed to Population so that it selects correct default value
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
#You will probably want to simply import your dataframe final.plot using read.table etc:
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(1, 2, 3, 4, 5, 6),
Density = c(7, 3, 9, 1, 3, 6)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
output$curv <- renderPlot({
#Assign value of selectedData based upon user input:
selectedData <- switch(input$pop,
"Population" = final.plot$Population,
"Density" = final.plot$Density)
ggplot() +
geom_polygon(data = final.plot,
aes(x = long, y = lat, group = group, fill = selectedData),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")
})
}
shinyApp(ui = ui, server = server)
Now all you need to do is substitute your code which produced the static map shown in your update for the faulty code in renderPlot in your shiny app.
I have an usual problem when using geom_errorbar in ggplot2.
The error bars are not within range but that is of no concern here.
My problem is that geom_errorbar is plotting the confidence intervals for the same data differently depending on what other data is plotted with it.
The code below filters the data only passing data where Audio1 is equal to "300SW" OR "3500MFL" in the uncommented SE and AggBar.
SE<-c(0.0861829641865964, 0.0296894376485468, 0.0323219002250762,
0.0937013798013447)
AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
"two"), class = "factor"), Audio = c("300SW", "300SW", "300SW",
"300SW", "3500MFL3500CL", "3500MFL3500CL", "3500MFL3500CL",
"3500MFL3500CL"), Prob = c(0.938828282828283, 0.0611717171717172,
0.754141414141414, 0.245858585858586, 0.534484848484848,
0.465515151515151, 0.0830909090909091, 0.916909090909091)), .Names = c("Report",
"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
#SE<-c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
#AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
#2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
#Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
#"two"), class = "factor"), Audio = c("300MFL300CL", "300MFL300CL",
#"300MFL300CL", "300MFL300CL", "300SW", "300SW", "300SW",
#"300SW"), Prob = c(0.562242424242424, 0.437757575757576,
#0.0921010101010101, 0.90789898989899, 0.938828282828283,
#0.0611717171717172, 0.754141414141414, 0.245858585858586)), .Names = c("Report",
#"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
prob.bar = ggplot(AggBar, aes(x = Report, y = Prob, fill = Report)) + theme_bw() #+ facet_grid(Audio~Visual)
prob.bar + #This changes all panels' colour
geom_bar(position=position_dodge(.9), stat="identity", colour="black", width=0.8)+
theme(legend.position = "none") + labs(x="Report", y="Probability of Report", title = expression("Visual Condition")) + scale_fill_grey() +
scale_fill_grey(start=.4) +
scale_y_continuous(limits = c(0, 1), breaks = (seq(0,1,by = .25)))+
facet_grid(Audio ~ Visual)+
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
This results in the following output:
The Audio1 variables are seen on the rightmost vertical labels.
However if I filter where it only passes where Audio1 is equal to "300SW" OR "300MFL" (the commented SE and AggBar) the error bars for "300SW change":
The Audio1 variables are seen on the rightmost vertical labels with "300SW" on the bottom this time.
This change is the incorrect one because when I plot just the Audio1 "300SW" the error bars match the original plot.
I have tried plotting the Audio1 "300SW" with other variables not presented here and it is only when presenting with "300MFL" that this change occurs.
If you look at the SE variable contents you will see that there is no change in the values therein for "300SW" in both versions of the code. Yet the outputs differ.
I cannot fathom what is happening here. Any ideas or suggestions are welcome.
Thanks very much for your time.
#Antonios K below has highlighted that when "300SW" is on top of the grid the error bars are correctly drawn. I'm guessing that the error bars are being incorrectly matched to the bars although I don't know why this is the case.
The problem is that SE is not stored inside the data frame: it's just floating around in the global environment. When the data is facetted (which involves rearranging the order), it no longer lines up with the correct records. Fix the problem by storing SE in the data frame:
AggBar$SE <- c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
ggplot(AggBar, aes(Report, Prob, Report)) +
geom_bar(stat = "identity", fill = "grey50") +
geom_errorbar(aes(ymin = Prob - SE, ymax = Prob + SE), width = 0.4) +
facet_grid(Audio ~ Visual)
The bit of code that plots the error bars is :
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
So, I guess it's something there.
As you said the SE variable is the same in both cases, but what you plot there is Prob-SE and Prob+SE. And if you do AggBar$Prob-SE and AggBar$Prob+SE you'll get different values for 300SW for each case.
Might have to do with the order of your Audio1 values. The other cases that worked did they have 300SW on the top part of the plots as well maybe?
Try
sort(unique(DataRearrange$Audio1) )
[1] "300MFL" "300SW" "3500MFL"
Combining first two will give you 300SW on the bottom part of the plots.
Combining last two will give you 300SW on the top part.
So, to check this assumption, in your second case when you combine 300MFL and 300SW try to replace 300SW with 1_300SW (so that 300SW will be plotted on top) and see what happens. Just do :
DataRearrange$Audio1[DataRearrange$Audio1=="300SW"] = "1_300SW"
# Below is the alternative coupling..
ErrorBarsDF <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
DataRearrange <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
I've checked quite extensively through the forum and on the web but I couldn't find anyone that already presented my case, so here you are the question:
my goal: how can I extend the example presented here in case I have more than one conditioning factor?
I've tried several ways to modify the which.panel variable of strip.default function, but I couldn't come out of my problem.
This is the code I'm using at the moment (with comments):
if (!require("plyr","lattice")) install.packages("plyr","lattice")
require("plyr")
require("lattice")
# dataframe structure (8 obs. of 6 variables)
data2 <- structure(list(
COD = structure(c(1L, 1L, 1L, 1L, 2L, 2L,2L, 2L),
.Label = c("A", "B"), class = "factor"),
SPEC = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
.Label = c("15/25-(15/06)", "15/26-(22/06)"), class = "factor"),
DATE = structure(c(16589, 16590, 16589, 16590, 16589, 16590, 16589, 16590), class = "Date"),
PM.BDG = c(1111.25, 1111.25, 1141.29, 1141.29, 671.26, 671.26, 707.99, 707.99),
PM = c(1033.14, 1038.4, 1181.48, 1181.48, 616.39, 616.39, 641.55, 641.55),
DELTA.PM = c(-78.12, -72.85, 40.19, 40.19, -54.87, -54.87, -66.44, -66.44)),
.Names = c("COD", "SPEC", "DATE", "PM.BDG", "PM", "DELTA.PM"),
row.names = c(NA, 8L), class = "data.frame")
# create a dataframe with a vector of colors
# based on the value of DELTA.PM for the last
# date available for each combination of COD and SPEC.
# Each color will be used for a specific panel, and it will
# forestgreen if DELTA.PM is higher than zero, red otherwise.
listaPM <- ddply(data2, .(COD,SPEC), summarize, ifelse(DELTA.PM[DATE=="2015-06-04"]<0, "red", "forestgreen"))
names(listaPM) <- c("COD","SPEC","COLOR")
# set a personalized strip, with bg color based on listaPM$COLOR
# and text based on listaPM$COD and listaPM$SPEC
myStripStylePM <- function(which.panel, factor.levels, ...) {
panel.rect(0, 0, 1, 1,
col = listaPM[which.panel,3],
border = 1)
panel.text(x = 0.5, y = 0.5,
font=2,
lab = paste(listaPM[which.panel,1],listaPM[which.panel,2], sep=" - "),
col = "white")}
# prepare a xyplot function to plot that will be used later with dlply.
# Here I want to plot the values of PM.BDG and PM over time (DATE),
# conditioning them on the SPEC (week) and COD (code) factors.
graficoPM <- function(df) {
xyplot (PM.BDG + PM ~ DATE | SPEC + COD,
data=df,
type=c("l","g"),
col=c("black", "red"),
abline=c(h=0,v=0),
strip = myStripStylePM
)}
# create a trellis object that has a list of plots,
# based on different COD (codes)
grafico.PM <- dlply(data2, .(data2$COD), graficoPM)
# graphic output, 1st row should be COD "A",
# 2nd row should be COD "B", each panel is a different SPEC (week)
par(mfrow=c(2,1))
print(grafico.PM[[1]], position=c(0,0.5,1,1), more=TRUE)
print(grafico.PM[[2]], position=c(0,0,1,0.5))
As you can see, the first row of plots is correct: text of the first strip is "A" (1st COD), the weeks (SPEC) are shown and the color represents if PM is above or below PM.BDG on the last date of the plot
On the contrary, the 2nd row of plots just repeats the same scheme of the first row (as it can be seen by the fact that COD is Always "A" and 2nd strip's bg color in the 2nd row is green, when the line of PM in red is clearly well below the PM.BDG line in black).
Although I'd like to keep my code, I'm pretty sure my goal could be achieved with a different strategy. If you can find a better way to use my dataframe, I'll be happy to study the code and see if it works with my data.
The problem is match up the current panel data to the listaPM data. Because you are doing different sub-setting in each of the calls, it's difficult to use which.panel() to match up the data sets.
There is an undocumented feature which allows you to get the conditioning variable names to make the matching more robust. Here's how you would use it in your case.
myStripStylePM <- function(which.panel, factor.levels, ...) {
cp <- dimnames(trellis.last.object())
ci <- arrayInd(packet.number(), .dim=sapply(cp, length))
cv <- mapply(function(a,b) a[b], cp, as.vector(ci))
idx<-which(apply(mapply(function(n, v) listaPM[, n] == v, names(cv), cv),1,all))
stopifnot(length(idx)==1)
panel.rect(0, 0, 1, 1,
col = listaPM[idx,3],
border = 1)
panel.text(x = 0.5, y = 0.5,
font=2,
lab = paste(listaPM[idx,1],listaPM[idx,2], sep=" - "),
col = "white")
}
When run with the rest of your code, it produces this plot
A pairwise scatterplot showing relationship between genes (columns in data frame) across multiple samples (rows in data frame) is created. The samples belong to two distinct groups: group "A" and "B". Since one dot in plot represent one sample, I need to color the data points (dots) according to groups with two different colors, say group A with "green" and group B with "red". Is it possible to do that?
Any kind of help will be appreciated.
plot(DF[1:6], pch = 21) #command used for plotting, DF is data frame
Sample Data Frame Example:
CBX3 PSPH ATP2C1 SNX10 MMD ATP13A3
B 10.589844 6.842970 8.084550 8.475023 9.202490 10.403811
A 10.174385 5.517944 7.736994 9.094834 9.253766 10.133408
B 10.202084 5.669137 7.392141 7.522270 7.830969 9.123178
B 10.893231 6.630709 7.601690 7.894177 8.979142 9.791841
B 10.071038 5.091222 7.032585 8.305581 7.903737 8.994821
A 10.005002 4.708631 7.927246 7.292527 8.257853 10.054630
B 10.028055 5.080944 6.421961 7.616856 8.287496 9.642294
A 10.144115 6.626483 7.686203 7.970934 7.919615 9.475175
A 10.675386 6.874047 7.900560 7.605519 8.585158 8.858613
A 9.855063 5.164399 6.847923 8.072608 8.221344 9.077744
A 10.994228 6.545318 8.606128 8.426329 8.787876 9.857079
A 10.501266 6.677360 7.787168 8.444976 8.928174 9.542558
GGally has a good function for this as well.
library(GGally)
ggpairs(dd, color = 'CLASS',columns = 2:ncol(dd) )
It might not be that easy to do with base graphics. You could easily do this with lattice. With this sample data.frame
dd<-structure(list(CLASS = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 1L), .Label = c("A", "B"), class = "factor"),
CBX3 = c(10.589844, 10.174385, 10.202084, 10.893231, 10.071038,
10.005002, 10.028055, 10.144115, 10.675386, 9.855063, 10.994228,
10.501266), PSPH = c(6.84297, 5.517944, 5.669137, 6.630709,
5.091222, 4.708631, 5.080944, 6.626483, 6.874047, 5.164399,
6.545318, 6.67736), ATP2C1 = c(8.08455, 7.736994, 7.392141,
7.60169, 7.032585, 7.927246, 6.421961, 7.686203, 7.90056,
6.847923, 8.606128, 7.787168), SNX10 = c(8.475023, 9.094834,
7.52227, 7.894177, 8.305581, 7.292527, 7.616856, 7.970934,
7.605519, 8.072608, 8.426329, 8.444976), MMD = c(9.20249,
9.253766, 7.830969, 8.979142, 7.903737, 8.257853, 8.287496,
7.919615, 8.585158, 8.221344, 8.787876, 8.928174), ATP13A3 = c(10.403811,
10.133408, 9.123178, 9.791841, 8.994821, 10.05463, 9.642294,
9.475175, 8.858613, 9.077744, 9.857079, 9.542558)), .Names = c("CLASS",
"CBX3", "PSPH", "ATP2C1", "SNX10", "MMD", "ATP13A3"), class = "data.frame", row.names = c(NA, -12L))
you can do
library(lattice)
splom(~dd[,-1], groups=dd$CLASS)
to get
You can add color to the points by specifying the argument col
to plot
DF <- read.delim(textConnection(
"category CBX3 PSPH ATP2C1 SNX10 MMD ATP13A3
B 10.589844 6.842970 8.084550 8.475023 9.202490 10.403811
A 10.174385 5.517944 7.736994 9.094834 9.253766 10.133408
B 10.202084 5.669137 7.392141 7.522270 7.830969 9.123178
B 10.893231 6.630709 7.601690 7.894177 8.979142 9.791841
B 10.071038 5.091222 7.032585 8.305581 7.903737 8.994821
A 10.005002 4.708631 7.927246 7.292527 8.257853 10.054630
B 10.028055 5.080944 6.421961 7.616856 8.287496 9.642294
A 10.144115 6.626483 7.686203 7.970934 7.919615 9.475175
A 10.675386 6.874047 7.900560 7.605519 8.585158 8.858613
A 9.855063 5.164399 6.847923 8.072608 8.221344 9.077744
A 10.994228 6.545318 8.606128 8.426329 8.787876 9.857079
A 10.501266 6.677360 7.787168 8.444976 8.928174 9.542558"))
plot(DF[2:7],col = ifelse(DF$category == 'A','red','green'))
A list of valid color values can be obtained by calling colors(). Vectors with a gradient of colors can be created via rainbow(), and just for fun, I use this little function for choosing pretty colors when making a figure.
(Edited per suggestions from #MrFlick)
#! #param n The number of colors to be selected
colorchoose <- function (n = 1, alpha, term = F)
{
cols <- colors()
mod <- ceiling(sqrt(length(cols)))
plot(xlab = "", ylab = "", main = "click for color name",
c(0, mod), c(0, mod), type = "n", axes = F)
s<-seq_along(cols)
dev.hold()
points(s%%mod, s%/%mod, col = cols, pch = 15, cex = 2.4)
dev.flush()
p <- locator(n)
return(cols[round(p$y) * mod + round(p$x)])
}