I have two map plots of 'Total Population' and 'Population Density' created using a shape file. Now I'm trying to build a Shiny app so that I can switch from Total Population to Population Density and the plot changes accordingly. When I ran the code, I got the following error:
Warning: Error in : ggplot2 doesn't know how to deal with data of class matrix
Here's the code that I've been trying to use:
library(shiny)
library(ggplot2) #Loading necessary libraries
ui <- fluidPage(
selectInput("mr",
label="Type of Plot",
choices=c("Total Population", "Density"),
selected="Total Population"),
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
output$curv <- renderPlot({
ggplot() +
geom_polygon(data = final.plot==input$mr,
aes(x = long, y = lat, group = group, fill = Population),
color = "black", size = 0.20) +
coord_map()+
scale_fill_distiller(name="Population", palette = "YlGn")+
labs(title="Population in Australia")
}) # Output with the data file and input string to change when input changes.
}
shinyApp(ui = ui, server = server)
Any help is greatly appreciated.
UPDATE:
My dataset looks like this:
id long lat order hole piece
1 Ashmore and Cartier Islands 123.1169 -12.25333 1 FALSE 1
2 Ashmore and Cartier Islands 123.1206 -12.25611 2 FALSE 1
3 Ashmore and Cartier Islands 123.1222 -12.25861 3 FALSE 1
4 Ashmore and Cartier Islands 123.1239 -12.25528 4 FALSE 1
5 Ashmore and Cartier Islands 123.1258 -12.25333 5 FALSE 1
6 Ashmore and Cartier Islands 123.1275 -12.25619 6 FALSE 1
group Population Density
1 Ashmore and Cartier Islands.1 NA NA
2 Ashmore and Cartier Islands.1 NA NA
3 Ashmore and Cartier Islands.1 NA NA
4 Ashmore and Cartier Islands.1 NA NA
5 Ashmore and Cartier Islands.1 NA NA
6 Ashmore and Cartier Islands.1 NA NA
This is stored in the data frame called "final.plot". There are values of Population and Density for the other states. I was able to create a static visualisation of Population and it looks like this:
There's a similar one for Density, and I'm trying to create a Shiny app where I can switch between these two so that the plot changes accordingly. Right now I've tried the following code:
library(shiny)
library(ggplot2) #Loading necessary libraries
ui <- fluidPage(
selectInput("pop",
label="Type of Plot",
choices=c("Population", "Density"),
selected="Total Population"),
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
output$curv <- renderPlot({
ggplot() +
geom_polygon(data = final.plot,
aes(x = long, y = lat, group = group, fill = input$pop),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")
})
}
shinyApp(ui = ui, server = server)
But I'm getting an error saying "Discrete value supplied to continuous scale".
UPDATE 2:
Here's the link for the dataset I'm using:
Dataset
I've had a quick look at your code and have a couple of suggestions.
1) When providing your data set you can use the function dput() - this writes a text representation of your data.frame which people answering your question can simply paste into R. For example:
dput(final.plot)
This will produce text output that I can assign to a dataframe by prefixing final.plot <- to the output. I have recreated your dataframe and used dput() to output it as text below. Now other users can quickly cut & paste your data:
Note this dataset is faulty - see below
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(NA, NA, NA, NA, NA, NA),
Density = c(NA, NA, NA, NA, NA, NA)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
The error "Discrete value supplied to continuous scale" is caused by two issues.
i) You are passing NA in both your Population and Density columns. The dataframe below adds some (unrealistic) numbers to these columns and the error is removed when I run the plotting code in isolation.
Corrected Toy Dataset
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(1, 2, 3, 4, 5, 6),
Density = c(7, 3, 9, 1, 3, 6)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
ii) When run interactively the error is caused because you are not passing appropriate data to fill in fill = input$pop. You should be passing the values from final.plot$Population or final.plot$Density depending on what was selected. You are instead passing the output of the dropdown box - "Population" or "Density". This can be corrected using a switch statement within renderPlot:
# User input assigns appropriate data to selectedData which can be passed to other functions:
selectedData <- switch(input$pop,
"Population" = final.plot$Population,
"Density" = final.plot$Density)
2) It would be helpful if you could provide the code which produced the static map you show in your Update above. When debugging Shiny code I find it easiest to get the function working non-interactively first and then to incorporate it into Shiny. I tried to extract your plotting code below but it is not producing the expected results:
library(ggplot2) #Loading necessary libraries
library(mapproj)
library(maps)
ggplot() +
geom_polygon(data = final.plot,
aes(x = long, y = lat, group = group, fill = Population),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")
3) I am not familiar with plotting data onto maps in R but I believe your app will need to load in library(mapproj) and library(maps) to get the results you need. Here is all the above put together:
library(shiny)
library(ggplot2) #Loading necessary libraries
#I added the two lines below:
library(mapproj)
library(maps)
ui <- fluidPage(
selectInput("pop",
label="Type of Plot",
choices=list("Population", "Density"),
selected="Population"), #NOTE: Total Population changed to Population so that it selects correct default value
plotOutput("curv") #Giving an input name and listing out types to choose in the Shiny app
)
server <- function(input, output){
#You will probably want to simply import your dataframe final.plot using read.table etc:
final.plot <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ashmore and Cartier Islands", class = "factor"),
long = c(123.1169, 123.1206, 123.1222, 123.1239, 123.1258, 123.1275),
lat = c(-12.25333, -12.25611, -12.25861, -12.25528, -12.25333, -12.25619),
order = 1:6, hole = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
piece = c(1L, 1L, 1L, 1L, 1L, 1L),
group = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = "Ashmore and Cartier Islands.1", class = "factor"),
Population = c(1, 2, 3, 4, 5, 6),
Density = c(7, 3, 9, 1, 3, 6)),
.Names = c("id", "long", "lat", "order", "hole", "piece", "group", "Population", "Density"),
class = "data.frame",
row.names = c(NA, -6L))
output$curv <- renderPlot({
#Assign value of selectedData based upon user input:
selectedData <- switch(input$pop,
"Population" = final.plot$Population,
"Density" = final.plot$Density)
ggplot() +
geom_polygon(data = final.plot,
aes(x = long, y = lat, group = group, fill = selectedData),
color = "black", size = 0.25) +
coord_map()+
scale_fill_distiller(name="Density", palette = "Spectral")+
labs(title="Population in Australia")
})
}
shinyApp(ui = ui, server = server)
Now all you need to do is substitute your code which produced the static map shown in your update for the faulty code in renderPlot in your shiny app.
coord_cartesian doesn't allow one to set per-facet coordinates, and other range-limiting approaches tend to produce a straight line at the clipped extreme. Since we have widely varying y-ranges, we can't set the limits on all facets identically; limiting the data before plotting is not as friendly with geom_line/geom_path (https://stackoverflow.com/a/27319786/3358272), as it takes a lot more effort to interpolate the data to the edge and then insert NAs in order to break up the line. (Ultimately, the only way to get the desired result is to do exactly this, which can be a bit onerous with other data.)
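For concreteness, the "insert NAs" part of that idea reduces to something like the sketch below (my own minimal illustration, skipping the interpolation step and using a single hypothetical cap; it relies on the test_data posted at the bottom of this question):
library(dplyr)
# Minimal sketch (mine, not the gist's): turn out-of-range values into NA so
# geom_line() breaks instead of drawing a segment along the clamped edge.
# A single hypothetical cap of 6 is used; a real fix would interpolate to the
# edge and pick per-panel caps.
test_data_broken <- test_data %>%
  mutate(Odds = ifelse(Odds > 6, NA_real_, Odds))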
One workaround is suggested in https://gist.github.com/burchill/d780d3e8663ad15bcbda7869394a348a, where it starts with
test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2)
and in previous versions of ggplot2, that gist defined coord_panel_ranges and was able to control coordinates per facet. The two right facets should narrow down to a 1-6(ish) y-axis so that the exploding confidence interval goes off-screen and the facet can focus primarily on the "normal range" of data. (Note: the test_data and this vis are not mine; they're taken from the gist. While my needs are somewhat similar, I thought it better to stay within the confines of the gist's data and code.)
Unfortunately, this now fails for me with ggplot2-3.3.0. Initial errors related to the recent loss of ggplot2::scale_range, which I tried to mitigate with this adaptation of burchill's code (that uses other ggplot2::: internal functions):
UniquePanelCoords <- ggplot2::ggproto(
"UniquePanelCoords", ggplot2::CoordCartesian,
num_of_panels = 1,
panel_counter = 1,
panel_ranges = NULL,
setup_layout = function(self, layout, params) {
self$num_of_panels <- length(unique(layout$PANEL))
self$panel_counter <- 1
layout
},
setup_panel_params = function(self, scale_x, scale_y, params = list()) {
if (!is.null(self$panel_ranges) & length(self$panel_ranges) != self$num_of_panels)
stop("Number of panel ranges does not equal the number supplied")
train_cartesian <- function(scale, limits, name, given_range = NULL) {
if (is.null(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion,
coord_limits = self$limits[[name]])
} else {
range <- given_range
}
out <- scale$break_info(range)
out$arrange <- scale$axis_order()
names(out) <- paste(name, names(out), sep = ".")
out
}
cur_panel_ranges <- self$panel_ranges[[self$panel_counter]]
if (self$panel_counter < self$num_of_panels)
self$panel_counter <- self$panel_counter + 1
else
self$panel_counter <- 1
c(train_cartesian(scale_x, self$limits$x, "x", cur_panel_ranges$x),
train_cartesian(scale_y, self$limits$y, "y", cur_panel_ranges$y))
}
)
coord_panel_ranges <- function(panel_ranges, expand = TRUE, default = FALSE, clip = "on") {
ggplot2::ggproto(NULL, UniquePanelCoords, panel_ranges = panel_ranges,
expand = expand, default = default, clip = clip)
}
but this is still failing with
test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2) +
coord_panel_ranges(panel_ranges = list(
list(x=c(8,64), y=c(1,4)), # Panel 1
list(x=c(8,64), y=c(1,6)), # Panel 2
list(NULL), # Panel 3, an empty list falls back on the default values
list(x=c(8,64), y=c(1,7)) # Panel 4
))
# Error in panel_params$x$break_positions_minor() :
# attempt to apply non-function
I'm not very familiar with extending ggplot2, and I suspect there is something I'm missing from the ggproto. Here's what the return value from the proto looks like:
str(c(train_cartesian(scale_x, self$limits$x, "x", cur_panel_ranges$x),
train_cartesian(scale_y, self$limits$y, "y", cur_panel_ranges$y)))
# List of 14
# $ x.range : num [1:2] 8 64
# $ x.labels : chr [1:3] "20" "40" "60"
# $ x.major : num [1:3] 0.214 0.571 0.929
# $ x.minor : num [1:6] 0.0357 0.2143 0.3929 0.5714 0.75 ...
# $ x.major_source: num [1:3] 20 40 60
# $ x.minor_source: num [1:6] 10 20 30 40 50 60
# $ x.arrange : chr [1:2] "secondary" "primary"
# $ y.range : num [1:2] 1 4
# $ y.labels : chr [1:4] "1" "2" "3" "4"
# $ y.major : num [1:4] 0 0.333 0.667 1
# $ y.minor : num [1:7] 0 0.167 0.333 0.5 0.667 ...
# $ y.major_source: num [1:4] 1 2 3 4
# $ y.minor_source: num [1:7] 1 1.5 2 2.5 3 3.5 4
# $ y.arrange : chr [1:2] "primary" "secondary"
Do I need to have an x element that's a list with at least a break_positions_minor function, or is there something else that needs to be inherited in order to ensure panel_params$x$break_positions_minor exists or a reasonable default is used?
Data:
test_data <- structure(list(DataType = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
ExpType = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X", "Y"), class = "factor"),
EffectSize = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("15", "35"
), class = "factor"), Nsubjects = c(8, 16, 32, 64, 8, 16,
32, 64, 8, 16, 32, 64, 8, 16, 32, 64, 8, 16, 32, 64, 8, 16,
32, 64, 8, 16, 32, 64, 8, 16, 32, 64), Odds = c(1.06248116259846,
1.09482076720863, 1.23086993413208, 1.76749340505612, 1.06641831731573,
1.12616954196688, 1.48351814320987, 3.50755080416964, 1.11601399761081,
1.18352602009495, 1.45705466646283, 2.53384744810515, 1.13847061762186,
1.24983742407086, 1.97075900741022, 6.01497152563726, 1.02798821372378,
1.06297006279249, 1.19432835697453, 1.7320754674107, 1.02813271730924,
1.09355953747203, 1.44830680332583, 3.4732692664923, 1.06295915758305,
1.12008443626365, 1.3887632112682, 2.46321037334, 1.06722652223114,
1.1874936754725, 1.89870184372054, 5.943747409114), Upper = c(1.72895843644471,
2.09878774769559, 2.59771794965346, 5.08513435549015, 1.72999898901071,
1.8702196882561, 3.85385388850167, 5.92564404180303, 1.99113042576373,
2.61074135841984, 3.45852331828636, 4.83900142207583, 1.57897154221764,
1.8957409107653, 10, 75, 2.3763918424135, 2.50181951057562,
3.45037180395673, 3.99515276392065, 2.04584535265976, 2.39317394040066,
2.832526733659, 5.38414183471915, 1.40569501856836, 2.6778044191832,
2.98023068052396, 4.75934650422069, 1.54116883311054, 2.50647989271592,
3.48517589981551, 100), Lower = c(0.396003888752214, 0.0908537867216577,
-0.135978081389309, -1.55014754537791, 0.40283764562075,
0.382119395677663, -0.88681760208193, 1.08945756653624, 0.240897569457892,
-0.243689318229938, -0.544413985360706, 0.228693474134466,
0.69796969302609, 0.603933937376415, 0.183548809738402, 3.57236968943798,
-0.320415414965949, -0.375879384990643, -1.06171509000767,
-0.531001829099242, 0.010420081958713, -0.206054865456611,
0.0640868729926525, 1.56239669826544, 0.720223296597732,
-0.437635546655903, -0.202704257987574, 0.167074242459314,
0.593284211351745, -0.131492541770921, 0.312227787625573,
3.76692741957876)), .Names = c("DataType", "ExpType", "EffectSize",
"Nsubjects", "Odds", "Upper", "Lower"), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -32L))
I modified the function train_cartesian to match the output format of view_scales_from_scale (defined here), which seems to work:
train_cartesian <- function(scale, limits, name, given_range = NULL) {
if (is.null(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion,
coord_limits = self$limits[[name]])
} else {
range <- given_range
}
out <- list(
ggplot2:::view_scale_primary(scale, limits, range),
sec = ggplot2:::view_scale_secondary(scale, limits, range),
arrange = scale$axis_order(),
range = range
)
names(out) <- c(name, paste0(name, ".", names(out)[-1]))
out
}
p <- test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2)
p +
coord_panel_ranges(panel_ranges = list(
list(x=c(8,64), y=c(1,4)), # Panel 1
list(x=c(8,64), y=c(1,6)), # Panel 2
list(NULL), # Panel 3, an empty list falls back on the default values
list(x=c(8,64), y=c(1,7)) # Panel 4
))
Original answer
I've cheated my way out of a similar problem before.
# alternate version of plot with data truncated to desired range for each facet
p.alt <- p %+% {test_data %>%
mutate(facet = as.integer(interaction(DataType, ExpType, lex.order = TRUE))) %>%
left_join(data.frame(facet = 1:4,
ymin = c(1, 1, -Inf, 1), # change values here to enforce
ymax = c(4, 6, Inf, 7)), # different axis limits
by = "facet") %>%
mutate_at(vars(Odds, Upper, Lower), list(~ ifelse(. < ymin, ymin, .))) %>%
mutate_at(vars(Odds, Upper, Lower), list(~ ifelse(. > ymax, ymax, .))) }
# copy alternate version's panel parameters to original plot & plot the result
p1 <- ggplot_build(p)
p1.alt <- ggplot_build(p.alt)
p1$layout$panel_params <- p1.alt$layout$panel_params
p2 <- ggplot_gtable(p1)
grid::grid.draw(p2)
Many thanks go to Z.Lin for starting the fix to my question, and that answer certainly helped me get past the errors and learn a more appropriate way of working with ggproto objects.
This answer is posted as more of a flexible method of fixing the underlying problem of per-panel limits within a faceted plot. The major issue I had with my first batch of code was that it relies on the ordering of the facets, which in some of my other (private) use-cases is not always known (well, not controlled) a priori. Because of this, I wanted an unambiguous determination of per-panel limits.
I've changed the function name (and the args) to represent two points: (1) this appears to mimic/replace coord_cartesian, and (2) I don't know that it will translate to other coord_* functions without adjustment. Comments/patches welcome at my gist.
Up front, a perfect duplication of Z.Lin's results can be had with:
p <- test_data %>%
ggplot(aes(x = Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales = "free") +
geom_line(size = 2) +
geom_ribbon(aes(ymax = Upper, ymin = Lower, fill = EffectSize, color = NULL), alpha = 0.2)
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~DataType, ~ExpType, ~ymin, ~ymax
, "A" , "X" , 1, 4
, "A" , "Y" , 1, 6
, "B" , "Y" , 1, 7
)
)
and gone is the ambiguity (that the original code introduced) of which panel is which argument in the list. Since it uses a data.frame to match (usually merge) with the layout of the plot, the order of rows does not matter.
Notes:
the panel_limits fields referenced are: xmin, xmax, ymin, and ymax, on top of whichever faceting variables are desired;
an NA in a particular field (or a missing field) means to use the previously-defined limit;
when all faceting-variables match (between panel_limits and the layout defined by facet_*), the limits are set on individual panels; this one-to-one mapping is the going-in assumption about this function;
when some (but not all) variables match, the limits are set on a subset of panels (e.g., on one axis of the panels, depending on the faceting method);
when no variables match and panel_limits is a single row, then set the limits for all panels indiscriminately; and
faceting rows in panel_limits that match nothing in layout are silently ignored.
Errors are raised when:
any faceting variables in panel_limits do not exist in the layout (i.e., are not specified within facet_*); or
more than one row in panel_limits matches a particular panel.
As an extension, this also handles a subset of the faceting variables, so if we want to limit all facets by ExpType only, then
# set the limits on panels based on one faceting variable only
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~ExpType, ~ymin, ~ymax
, "X" , NA, 4
, "Y" , 1, 5
)
) + labs(title = "panel_limits, one variable")
# set the limits on all panels
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~ymin, ~ymax
, NA, 5
)
) + labs(title = "panel_limits, no variables")
(The last example seems silly, but if the facets/plots are being built programmatically and it is not guaranteed a priori that there are individual facets, then this will result in a reasonable default behavior, assuming that everything is otherwise unambiguous.)
A further extension might allow for an NA in a facet variable to match all, such as
# does not work
p + coord_cartesian_panels(
panel_limits = tibble::tribble(
~DataType, ~ExpType, ~ymin, ~ymax
, "A" , NA , 1, 4
, NA , "Y" , 1, 6
)
)
This would require that merge understand that NA means "all/any", not a literal NA. I'm not going to extend merge at the moment to handle that, so I'm not going to complicate this function to attempt to do that. If there is a reasonable merge replacement that does this kind of calculus, let me know :-)
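For reference, here is a rough sketch of the kind of NA-as-wildcard matching I have in mind (an untested illustration of my own; it is not used by the working code below):
# Untested sketch: keep the rows of 'panel_limits' where every shared facet
# variable is either NA (wildcard) or equal to the value in the current
# layout row; 'facet_vars' are the faceting variable names common to both.
match_wildcard <- function(layout_row, panel_limits, facet_vars) {
  keep <- vapply(seq_len(nrow(panel_limits)), function(i) {
    all(vapply(facet_vars, function(v) {
      is.na(panel_limits[[v]][i]) || panel_limits[[v]][i] == layout_row[[v]]
    }, logical(1)))
  }, logical(1))
  panel_limits[keep, , drop = FALSE]
}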
Many Thanks to ...
burchill for the original effort and gist; and
Z.Lin, for helping to bring the function up to ggplot2-3.3.0.
UniquePanelCoords <- ggplot2::ggproto(
"UniquePanelCoords", ggplot2::CoordCartesian,
num_of_panels = 1,
panel_counter = 1,
layout = NULL,
setup_layout = function(self, layout, params) {
self$num_of_panels <- length(unique(layout$PANEL))
self$panel_counter <- 1
self$layout <- layout # store for later
layout
},
setup_panel_params = function(self, scale_x, scale_y, params = list()) {
train_cartesian <- function(scale, limits, name, given_range = c(NA, NA)) {
if (anyNA(given_range)) {
expansion <- ggplot2:::default_expansion(scale, expand = self$expand)
range <- ggplot2:::expand_limits_scale(scale, expansion, coord_limits = limits)
isna <- is.na(given_range)
given_range[isna] <- range[isna]
}
out <- list(
ggplot2:::view_scale_primary(scale, limits, given_range),
sec = ggplot2:::view_scale_secondary(scale, limits, given_range),
arrange = scale$axis_order(),
range = given_range
)
names(out) <- c(name, paste0(name, ".", names(out)[-1]))
out
}
this_layout <- self$layout[ self$panel_counter,, drop = FALSE ]
self$panel_counter <-
if (self$panel_counter < self$num_of_panels) {
self$panel_counter + 1
} else 1
# determine merge column names by removing all "standard" names
layout_names <- setdiff(names(this_layout),
c("PANEL", "ROW", "COL", "SCALE_X", "SCALE_Y"))
limits_names <- setdiff(names(self$panel_limits),
c("xmin", "xmax", "ymin", "ymax"))
limit_extras <- setdiff(limits_names, layout_names)
if (length(limit_extras) > 0) {
stop("facet names in 'panel_limits' not found in 'layout': ",
paste(sQuote(limit_extras), collapse = ","))
} else if (length(limits_names) == 0 && NROW(self$panel_limits) == 1) {
# no panels in 'panel_limits'
this_panel_limits <- cbind(this_layout, self$panel_limits)
} else {
this_panel_limits <- merge(this_layout, self$panel_limits, all.x = TRUE, by = limits_names)
}
if (isTRUE(NROW(this_panel_limits) > 1)) {
stop("multiple matches for current panel in 'panel_limits'")
}
# add missing min/max columns, default to "no override" (NA)
this_panel_limits[, setdiff(c("xmin", "xmax", "ymin", "ymax"),
names(this_panel_limits)) ] <- NA
c(train_cartesian(scale_x, self$limits$x, "x",
unlist(this_panel_limits[, c("xmin", "xmax"), drop = TRUE])),
train_cartesian(scale_y, self$limits$y, "y",
unlist(this_panel_limits[, c("ymin", "ymax"), drop = TRUE])))
}
)
coord_cartesian_panels <- function(panel_limits, expand = TRUE, default = FALSE, clip = "on") {
ggplot2::ggproto(NULL, UniquePanelCoords,
panel_limits = panel_limits,
expand = expand, default = default, clip = clip)
}
At some point I had a similar problem to this. The result was a slightly more verbose but also more flexible option that can customize many aspects of position scales on a per-facet basis. Due to some technicality it uses the equivalent of scales::oob_keep() as oob arguments on the scales, thereby acting as if the coordinates determined the limits.
library(ggh4x)
library(tidyverse)
p <- test_data %>%
ggplot(aes(x=Nsubjects, y = Odds, color=EffectSize)) +
facet_wrap(DataType ~ ExpType, labeller = label_both, scales="free") +
geom_line(size=2) +
geom_ribbon(aes(ymax=Upper, ymin=Lower, fill=EffectSize, color=NULL), alpha=0.2) +
facetted_pos_scales(
x = list(
scale_x_continuous(limits = c(8, 64)),
scale_x_continuous(limits = c(64, 8), trans = "reverse"),
NULL,
scale_x_continuous(limits = c(8, 64), labels = scales::dollar_format())
),
y = list(
scale_y_continuous(limits = c(1, 4), guide = "none"),
scale_y_continuous(limits = c(1, 6), breaks = 1:3),
NULL,
scale_y_continuous(limits = c(1, 7), position = "right")
)
)
I'm trying to reorder the x axis by the values in the y axis. The x axis is a name, the y axis is an integer. Both are reactive, user-defined inputs. I have created a datatable that renders in the correct order, but ggplot does not use that order; instead it orders the bars alphabetically.
My current code is:
Packages
library(shiny)
library(readxl) # to load the data into R
library(tidyverse)
library(stringr)
library(DT)
library(tools)
library(magrittr)
Data
lpop <-read.csv("londonpopchange.csv", header=TRUE)
UI
# Define UI for application that plots features of movies
ui <- fluidPage(
# Sidebar layout with a input and output definitions
sidebarLayout(
# Inputs
sidebarPanel(
# Select variable for y-axis
selectInput(inputId = "y",
label = "Y-axis:",
choices = c("Mid Year 2016" = "MYE2016",
"Births" = "Births",
"Deaths" = "Deaths",
"Births minus Deaths" = "BirthsminusDeaths",
"Internal Migration Inflow" = "InternalMigrationInflow",
"Internal Migration Outflow" = "InternalMigrationOutflow",
"Internal Migration Net" = "InternalMigrationNet",
"International Migration Inflow" = "InternationalMigrationInflow",
"International Migration Outflow" = "InternationalMigrationOutflow",
"International Migration Net" = "InternationalMigrationNet"),
selected = "MYE2016"),
# Select variable for x-axis
selectInput(inputId = "x",
label = "X-axis:",
choices = c("Borough" = "Name"),
selected = "Name")
),
# Output
mainPanel(
h1(textOutput("MainTitle")),
br(),
plotOutput(outputId = "geom_bar"),
DT::dataTableOutput("mytable")
)
)
)
Server
# Define server function required to create the scatterplot
server <- function(input, output) {
#this creates the title
output$MainTitle <- renderText({
paste(input$y, "for London Boroughs")
})
#creates a data table that reacts to the user variable input and arranges
#by the y variable
df <- reactive({
lpop %>%
select(input$x, input$y, "WF") %>%
arrange_(.dots = input$y) #%>%
# setNames(1:2, c("x", "y"))
})
#outputs the user defined data frame
output$mytable = ({DT::renderDataTable({df()})})
# Create the bar plot object the plotOutput function is expecting
output$geom_bar <- renderPlot({
ggplot(data = df(), aes_string(x = input$x, y = input$y, fill = "WF")) +
geom_bar(stat = "identity") +
scale_fill_manual(values=c("#000000", "#00D253")) +
theme(axis.text.x = element_text(angle = 90)) +
xlab(input$x)
})
}
# Create a Shiny app object
shinyApp(ui = ui, server = server)
It renders as so: https://jwest.shinyapps.io/ShinyPopulation/
If I use the reorder function in ggplot, it amalgamates all "Names" into one bar, see below.
# Create the bar plot object the plotOutput function is expecting
output$geom_bar <- renderPlot({
ggplot(data = df(), aes_string(x = reorder(input$x, input$y), y = input$y, fill = "WF")) +
geom_bar(stat = "identity") +
scale_fill_manual(values=c("#000000", "#00D253")) +
theme(axis.text.x = element_text(angle = 90)) +
xlab(input$x)
})
}
How can I render it ordered by the y-axis values? Is it something to do with scale_x_discrete(limits = ...)? If it is, I am confused as to how I'm meant to reference the first column of the reactive df.
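For what it's worth, the direction I've been considering (an untested sketch of my own, assuming the reactive df() above keeps the arranged order) is to feed the ordered names into scale_x_discrete(limits = ...):
# Untested sketch: df() is already arranged by the chosen y variable, so its
# name column can define the x-axis order via scale_x_discrete(limits = ...).
output$geom_bar <- renderPlot({
  d <- df()
  ggplot(data = d, aes_string(x = input$x, y = input$y, fill = "WF")) +
    geom_bar(stat = "identity") +
    scale_x_discrete(limits = as.character(d[[input$x]])) +
    scale_fill_manual(values = c("#000000", "#00D253")) +
    theme(axis.text.x = element_text(angle = 90)) +
    xlab(input$x)
})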
The csv can be downloaded here: https://drive.google.com/file/d/1QLT8CX9XFSx3WU_tADyWgyddHYd3-VSp/view?usp=sharing
DPUT
structure(list(Code = structure(c(7L, 1L, 12L, 13L, 14L), .Label = c("E09000001",
"E09000002", "E09000003", "E09000004", "E09000005", "E09000006",
"E09000007", "E09000008", "E09000009", "E09000010", "E09000011",
"E09000012", "E09000013", "E09000014", "E09000015", "E09000016",
"E09000017", "E09000018", "E09000019", "E09000020", "E09000021",
"E09000022", "E09000023", "E09000024", "E09000025", "E09000026",
"E09000027", "E09000028", "E09000029", "E09000030", "E09000031",
"E09000032", "E09000033"), class = "factor"), Name = structure(c(6L,
7L, 12L, 13L, 14L), .Label = c("Barking and Dagenham", "Barnet",
"Bexley", "Brent", "Bromley", "Camden", "City of London", "Croydon",
"Ealing", "Enfield", "Greenwich", "Hackney", "Hammersmith and Fulham",
"Haringey", "Harrow", "Havering", "Hillingdon", "Hounslow", "Islington",
"Kensington and Chelsea", "Kingston upon Thames", "Lambeth",
"Lewisham", "Merton", "Newham", "Redbridge", "Richmond upon Thames",
"Southwark", "Sutton", "Tower Hamlets", "Waltham Forest", "Wandsworth",
"Westminster"), class = "factor"), Geography = structure(c(1L,
1L, 1L, 1L, 1L), .Label = "London Borough", class = "factor"),
MYE2016 = c(249162L, 7246L, 273239L, 181783L, 272078L), Births = c(2671L,
68L, 4405L, 2446L, 3913L), Deaths = c(1180L, 38L, 1168L,
895L, 1140L), BirthsminusDeaths = c(1491L, 30L, 3237L, 1551L,
2773L), InternalMigrationInflow = c(22189L, 856L, 21271L,
19109L, 22469L), InternalMigrationOutflow = c(25132L, 792L,
23324L, 20488L, 29113L), InternalMigrationNet = c(-2943L,
64L, -2053L, -1379L, -6644L), InternationalMigrationInflow = c(11815L,
756L, 5054L, 5333L, 7480L), InternationalMigrationOutflow = c(6140L,
441L, 3534L, 4336L, 4460L), InternationalMigrationNet = c(5675L,
315L, 1520L, 997L, 3020L), Other = c(-24L, -1L, -14L, 46L,
-3L), Estimated.Population..mid.2017 = c(253361L, 7654L,
275929L, 182998L, 271224L), WF = structure(c(1L, 1L, 1L,
1L, 1L), .Label = c("London Borough", "Waltham Forest"), class = "factor")), .Names = c("Code",
"Name", "Geography", "MYE2016", "Births", "Deaths", "BirthsminusDeaths",
"InternalMigrationInflow", "InternalMigrationOutflow", "InternalMigrationNet",
"InternationalMigrationInflow", "InternationalMigrationOutflow",
"InternationalMigrationNet", "Other", "Estimated.Population..mid.2017",
"WF"), row.names = c(NA, 5L), class = "data.frame")
I am creating a Flexdashboard in R. I want the dashboard to contain both a table and a series of visualizations that would be filtered through inputs.
As I need to deliver a dashboard locally (without a server running in the background), I am unable to use Shiny, hence I rely on crosstalk.
I know that the crosstalk package provides limited functionality in the front-end. For instance, the documentation says that you can't aggregate the SharedData object.
Nonetheless, I am not clear if I can use the same inputs to filter two different dataframes.
For example, let's say I have:
Dataframe One: Contains original data
df1 <- structure(list(owner = structure(c(1L, 2L, 2L, 2L, 2L), .Label = c("John",
"Mark"), class = "factor"), hp = c(250, 120, 250, 100, 110),
car = structure(c(2L, 2L, 2L, 1L, 1L), .Label = c("benz",
"bmw"), class = "factor"), id = structure(1:5, .Label = c("car1",
"car2", "car3", "car4", "car5"), class = "factor")), .Names = c("owner",
"hp", "car", "id"), row.names = c(NA, -5L), class = "data.frame")
Dataframe Two: Contains aggregated data
df2 <- structure(list(car = structure(c(1L, 2L, 1L, 2L), .Label = c("benz",
"bmw"), class = "factor"), owner = structure(c(1L, 1L, 2L, 2L
), .Label = c("John", "Mark"), class = "factor"), freq = c(0L,
1L, 2L, 2L)), .Names = c("car", "owner", "freq"), row.names = c(NA,
-4L), class = "data.frame")
These two dataframes contain columns with identical values - car and owner - as well as additional columns.
I could create two different objects:
library(crosstalk)
shared_df1 <- SharedData$new(df1)
shared_df2 <- SharedData$new(df2)
and then:
filter_select("owner", "Car owner:", shared_df1, ~ owner)
filter_select("owner", "Car owner:", shared_df2, ~ owner)
However, that would mean the user has to fill in essentially identical inputs twice. Also, if the table is large, this would double the memory needed to use the dashboard.
Is it possible to work around this problem in crosstalk?
Ah, I recently ran into this too - there is another argument, SharedData$new(..., group = )! The group argument seems to do the trick. I found out by accident when I had two dataframes and used the same group =.
If you make a SharedData object, it will include:
a dataframe
a key to select rows by - preferably unique, but not necessarily.
a group name
What I think happens is that crosstalk filters the sharedData by the key - for all sharedData objects in the same group! So as long as two dataframes use the same key, you should be able to filter them together in one group.
This should work for your example.
---
title: "blabla"
output:
flexdashboard::flex_dashboard:
orientation: rows
social: menu
source_code: embed
theme: cerulean
---
```{r}
library(plotly)
library(crosstalk)
library(tidyverse)
```
```{r Make dataset}
df1 <- structure(list(owner = structure(c(1L, 2L, 2L, 2L, 2L), .Label = c("John", "Mark"), class = "factor"), hp = c(250, 120, 250, 100, 110), car = structure(c(2L, 2L, 2L, 1L, 1L), .Label = c("benz", "bmw"), class = "factor"), id = structure(1:5, .Label = c("car1", "car2", "car3", "car4", "car5"), class = "factor")), .Names = c("owner", "hp", "car", "id"), row.names = c(NA, -5L), class = "data.frame")
df2 <- structure(list(car = structure(c(1L, 2L, 1L, 2L), .Label = c("benz",
"bmw"), class = "factor"), owner = structure(c(1L, 1L, 2L, 2L
), .Label = c("John", "Mark"), class = "factor"), freq = c(0L,
1L, 2L, 2L)), .Names = c("car", "owner", "freq"), row.names = c(NA,
-4L), class = "data.frame")
```
#
##
### Filters
```{r}
library(crosstalk)
# Notice the 'group = ' argument - this does the trick!
shared_df1 <- SharedData$new(df1, ~owner, group = "Choose owner")
shared_df2 <- SharedData$new(df2, ~owner, group = "Choose owner")
filter_select("owner", "Car owner:", shared_df1, ~owner)
# You don't need this second filter now
# filter_select("owner", "Car owner:", shared_df2, ~ owner)
```
### Plot1 with plotly
```{r}
plot_ly(shared_df1, x = ~id, y = ~hp, color = ~owner) %>% add_markers() %>% highlight("plotly_click")
```
### Plots with plotly
```{r}
plot_ly(shared_df2, x = ~owner, y = ~freq, color = ~car) %>% group_by(owner) %>% add_bars()
```
##
### Dataframe 1
```{r}
DT::datatable(shared_df1)
```
### Dataframe 2
```{r}
DT::datatable(shared_df2)
```
I spent some time on this trying to extract data from plot_ly() using plotly_data(), without luck, until I figured out the answer. That's why there are some very simple plots with plotly.
Recently, I've also wanted to use one filter to filter 2 visualizations.
Brief description of my situation
I've wanted to use one filter to filter a boxplot and a table.
The source data has been a data frame. I've wanted to use some of the variables for the boxplot and also calculate some statistics (like mean, standard deviation, mode, number of records).
Functions I've needed to use to display results: plotly::plot_ly(), DT::datatable(), crosstalk::bscols().
I've found out that there are 3 key pieces of information needed to solve this situation:
Key 1) It's necessary to correctly create shared data.
In my case, I've had to use crosstalk::SharedData$new() twice.
Correct shared data, to be used as the source for the visualizations, can only be created once keys 2 and 3 are fulfilled.
Key 2) When creating shared data, use the same group argument as "Lodewic Van Twillert" explained on 16 Mar 2018.
Key 3) Ensure that all SharedData instances refer conceptually to the same data points, and share the same keys.
Start by ensuring that the data frame has row names, even if the row names are just a character vector of numbers (like "1", "2", ...).
Used literature for this key 3: https://rstudio.github.io/crosstalk/using.html. (I suggest mainly reading the subtitle "Grouping".)
Summary of steps I've used to fulfill key information from above
Key 3) This one could be tricky in order to fulfill the relevant conditions of key 3 above.
The approach I've chosen creates one table containing all the data, and this table (data frame) will be used to create both shared data objects.
I've applied data manipulations to original data frame (risk_scores_df) so now this data has a new column.
I've created a new data frame with statistics.
I've joined both data frames using
risk_scores_df <- dplyr::left_join... so now the original data frame contains all prepared data.
I've run print(rownames(risk_scores_df)) to ensure that my updated data frame has row names.
Now I have one data frame containing all the data (needed for both visualizations) that fulfills the conditions of key 3 above.
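To make that concrete, here is a minimal sketch (my own illustration) of the row-name step on the combined risk_scores_df:
# Sketch: give the combined data frame explicit row names so that SharedData
# instances built from subsets of it share the same default keys.
rownames(risk_scores_df) <- as.character(seq_len(nrow(risk_scores_df)))
print(rownames(risk_scores_df))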
Key 2) I've simply added group = "sd1" in both crosstalk::SharedData$new()
Key 1) This one could also be tricky if a wrong approach is chosen.
Here, the key to creating correct shared data instances is to use that one table with all the data and choose only the rows and columns needed for each shared data object.
Example - in my case, I've run the code in Option 1 to create two shared data instances, but Option 2 is also possible.
Option 1 (selecting only the needed rows and columns directly inside crosstalk::SharedData$new())
rs_df_sd1 <- crosstalk::SharedData$new(
risk_scores_df[, c(1, 2, 5)],
group = "sd1"
)
rs_df_sd1a <- crosstalk::SharedData$new(
risk_scores_df[risk_scores_df$NumRecords > 0 &
is.na(risk_scores_df$NumRecords) == F,
c(1, 6:11)],
group = "sd1"
)
Option 2 (selecting only the needed rows and columns into additional variables first)
sd1 <- risk_scores_df[, c(1, 2, 5)]
sd1a <- risk_scores_df[risk_scores_df$NumRecords > 0 &
is.na(risk_scores_df$NumRecords) == F,
c(1, 6:11)]
rs_df_sd1 <- crosstalk::SharedData$new(sd1, group = "sd1")
rs_df_sd1a <- crosstalk::SharedData$new(sd1a, group = "sd1")
Completing the solution
At this point I've created shared data instances rs_df_sd1 and rs_df_sd1a that can be used as main sources for visualizations that will be filtered using crosstalk::bscols().
Brief example:
box_n_jitter_chart1 <- plotly::plot_ly(rs_df_sd1) %>% add_trace(...
DT_table1 <- DT::datatable(rs_df_sd1a)
crosstalk::bscols(
widths = c(6, 12, NA),
crosstalk::filter_select(
id = "idAvgRisk",
label = "Account",
sharedData = rs_df_sd1,
group = ~Account,
multiple = F
),
box_n_jitter_chart1,
DT_table1
)
Note: DT::datatable() can also use rs_df_sd1a$data(), and cells = list(values = base::rbind(... can be used (see more about using cells e.g. at https://plotly.com/r/reference/table/), but because the data() method is used (see more e.g. at https://rdrr.io/cran/crosstalk/man/SharedData.html#method-data), it will not work with crosstalk::bscols().
I have an unusual problem when using geom_errorbar in ggplot2.
The error bars are not within range but that is of no concern here.
My problem is that geom_errorbar is plotting the confidence intervals for the same data differently depending on what other data is plotted with it.
The code below filters the data, only passing rows where Audio1 is equal to "300SW" OR "3500MFL", in the uncommented SE and AggBar.
SE<-c(0.0861829641865964, 0.0296894376485468, 0.0323219002250762,
0.0937013798013447)
AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
"two"), class = "factor"), Audio = c("300SW", "300SW", "300SW",
"300SW", "3500MFL3500CL", "3500MFL3500CL", "3500MFL3500CL",
"3500MFL3500CL"), Prob = c(0.938828282828283, 0.0611717171717172,
0.754141414141414, 0.245858585858586, 0.534484848484848,
0.465515151515151, 0.0830909090909091, 0.916909090909091)), .Names = c("Report",
"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
#SE<-c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
#AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
#2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
#Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
#"two"), class = "factor"), Audio = c("300MFL300CL", "300MFL300CL",
#"300MFL300CL", "300MFL300CL", "300SW", "300SW", "300SW",
#"300SW"), Prob = c(0.562242424242424, 0.437757575757576,
#0.0921010101010101, 0.90789898989899, 0.938828282828283,
#0.0611717171717172, 0.754141414141414, 0.245858585858586)), .Names = c("Report",
#"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
prob.bar = ggplot(AggBar, aes(x = Report, y = Prob, fill = Report)) + theme_bw() #+ facet_grid(Audio~Visual)
prob.bar + #This changes all panels' colour
geom_bar(position=position_dodge(.9), stat="identity", colour="black", width=0.8)+
theme(legend.position = "none") + labs(x="Report", y="Probability of Report", title = expression("Visual Condition")) + scale_fill_grey() +
scale_fill_grey(start=.4) +
scale_y_continuous(limits = c(0, 1), breaks = (seq(0,1,by = .25)))+
facet_grid(Audio ~ Visual)+
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
This results in the following output:
The Audio1 variables are seen on the rightmost vertical labels.
However, if I filter so that it only passes rows where Audio1 is equal to "300SW" OR "300MFL" (the commented SE and AggBar), the error bars for "300SW" change:
The Audio1 variables are seen on the rightmost vertical labels with "300SW" on the bottom this time.
This change is the incorrect one because when I plot just the Audio1 "300SW" the error bars match the original plot.
I have tried plotting the Audio1 "300SW" with other variables not presented here and it is only when presenting with "300MFL" that this change occurs.
If you look at the SE variable contents you will see that there is no change in the values therein for "300SW" in both versions of the code. Yet the outputs differ.
I cannot fathom what is happening here. Any ideas or suggestions are welcome.
Thanks very much for your time.
#Antonios K below has highlighted that when "300SW" is on top of the grid the error bars are correctly drawn. I'm guessing that the error bars are being incorrectly matched to the bars although I don't know why this is the case.
The problem is that SE is not stored inside the data frame: it's just floating around in the global environment. When the data is facetted (which involves rearranging the order), it no longer lines up with the correct records. Fix the problem by storing SE in the data frame:
AggBar$SE <- c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
ggplot(AggBar, aes(Report, Prob)) +
geom_bar(stat = "identity", fill = "grey50") +
geom_errorbar(aes(ymin = Prob - SE, ymax = Prob + SE), width = 0.4) +
facet_grid(Audio ~ Visual)
The bit of code that plots the error bars is :
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
So, I guess it's something there.
As you said the SE variable is the same in both cases, but what you plot there is Prob-SE and Prob+SE. And if you do AggBar$Prob-SE and AggBar$Prob+SE you'll get different values for 300SW for each case.
It might have to do with the order of your Audio1 values. The other cases that worked - did they have 300SW in the top part of the plots as well, maybe?
Try
sort(unique(DataRearrange$Audio1) )
[1] "300MFL" "300SW" "3500MFL"
Combining the first two will give you 300SW in the bottom part of the plots.
Combining the last two will give you 300SW in the top part.
So, to check this assumption, in your second case when you combine 300MFL and 300SW, try replacing 300SW with 1_300SW (so that 300SW will be plotted on top) and see what happens. Just do:
DataRearrange$Audio1[DataRearrange$Audio1=="300SW"] = "1_300SW"
# Below is the alternative coupling..
ErrorBarsDF <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
DataRearrange <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
A pairwise scatterplot showing the relationship between genes (columns in the data frame) across multiple samples (rows in the data frame) is created. The samples belong to two distinct groups: group "A" and "B". Since one dot in the plot represents one sample, I need to color the data points (dots) according to group with two different colors, say group A in "green" and group B in "red". Is it possible to do that?
Any kind of help will be appreciated.
plot(DF[1:6], pch = 21) #command used for plotting, DF is data frame
Sample Data Frame Example:
CBX3 PSPH ATP2C1 SNX10 MMD ATP13A3
B 10.589844 6.842970 8.084550 8.475023 9.202490 10.403811
A 10.174385 5.517944 7.736994 9.094834 9.253766 10.133408
B 10.202084 5.669137 7.392141 7.522270 7.830969 9.123178
B 10.893231 6.630709 7.601690 7.894177 8.979142 9.791841
B 10.071038 5.091222 7.032585 8.305581 7.903737 8.994821
A 10.005002 4.708631 7.927246 7.292527 8.257853 10.054630
B 10.028055 5.080944 6.421961 7.616856 8.287496 9.642294
A 10.144115 6.626483 7.686203 7.970934 7.919615 9.475175
A 10.675386 6.874047 7.900560 7.605519 8.585158 8.858613
A 9.855063 5.164399 6.847923 8.072608 8.221344 9.077744
A 10.994228 6.545318 8.606128 8.426329 8.787876 9.857079
A 10.501266 6.677360 7.787168 8.444976 8.928174 9.542558
GGally has a good function for this as well.
library(GGally)
ggpairs(dd, color = 'CLASS', columns = 2:ncol(dd))
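If that color argument is not picked up by a newer GGally release, the equivalent aes() mapping form should behave the same (a sketch, using the dd data frame defined below):
library(GGally)
library(ggplot2)
# Same plot, but with the grouping passed as an explicit aesthetic mapping
ggpairs(dd, mapping = aes(colour = CLASS), columns = 2:ncol(dd))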
It might not be that easy to do with base graphics. You could easily do this with lattice. With this sample data.frame
dd<-structure(list(CLASS = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 1L), .Label = c("A", "B"), class = "factor"),
CBX3 = c(10.589844, 10.174385, 10.202084, 10.893231, 10.071038,
10.005002, 10.028055, 10.144115, 10.675386, 9.855063, 10.994228,
10.501266), PSPH = c(6.84297, 5.517944, 5.669137, 6.630709,
5.091222, 4.708631, 5.080944, 6.626483, 6.874047, 5.164399,
6.545318, 6.67736), ATP2C1 = c(8.08455, 7.736994, 7.392141,
7.60169, 7.032585, 7.927246, 6.421961, 7.686203, 7.90056,
6.847923, 8.606128, 7.787168), SNX10 = c(8.475023, 9.094834,
7.52227, 7.894177, 8.305581, 7.292527, 7.616856, 7.970934,
7.605519, 8.072608, 8.426329, 8.444976), MMD = c(9.20249,
9.253766, 7.830969, 8.979142, 7.903737, 8.257853, 8.287496,
7.919615, 8.585158, 8.221344, 8.787876, 8.928174), ATP13A3 = c(10.403811,
10.133408, 9.123178, 9.791841, 8.994821, 10.05463, 9.642294,
9.475175, 8.858613, 9.077744, 9.857079, 9.542558)), .Names = c("CLASS",
"CBX3", "PSPH", "ATP2C1", "SNX10", "MMD", "ATP13A3"), class = "data.frame", row.names = c(NA, -12L))
you can do
library(lattice)
splom(~dd[,-1], groups=dd$CLASS)
to get
You can add color to the points by specifying the argument col to plot:
DF <- read.delim(textConnection(
"category CBX3 PSPH ATP2C1 SNX10 MMD ATP13A3
B 10.589844 6.842970 8.084550 8.475023 9.202490 10.403811
A 10.174385 5.517944 7.736994 9.094834 9.253766 10.133408
B 10.202084 5.669137 7.392141 7.522270 7.830969 9.123178
B 10.893231 6.630709 7.601690 7.894177 8.979142 9.791841
B 10.071038 5.091222 7.032585 8.305581 7.903737 8.994821
A 10.005002 4.708631 7.927246 7.292527 8.257853 10.054630
B 10.028055 5.080944 6.421961 7.616856 8.287496 9.642294
A 10.144115 6.626483 7.686203 7.970934 7.919615 9.475175
A 10.675386 6.874047 7.900560 7.605519 8.585158 8.858613
A 9.855063 5.164399 6.847923 8.072608 8.221344 9.077744
A 10.994228 6.545318 8.606128 8.426329 8.787876 9.857079
A 10.501266 6.677360 7.787168 8.444976 8.928174 9.542558"))
plot(DF[2:7],col = ifelse(DF$category == 'A','red','green'))
A list of valid color values can be obtained by calling colors(). Vectors with a gradient of colors can be created via rainbow(), and just for fun, I use this little function for choosing pretty colors when making a figure.
(Edited per suggestions from #MrFlick)
#! #param n The number of colors to be selected
colorchoose <- function (n = 1, alpha, term = F)
{
cols <- colors()
mod <- ceiling(sqrt(length(cols)))
plot(xlab = "", ylab = "", main = "click for color name",
c(0, mod), c(0, mod), type = "n", axes = F)
s<-seq_along(cols)
dev.hold()
points(s%%mod, s%/%mod, col = cols, pch = 15, cex = 2.4)
dev.flush()
p <- locator(n)
return(cols[round(p$y) * mod + round(p$x)])
}
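A quick usage sketch (colorchoose() relies on locator(), so it needs an interactive graphics device):
head(colors())   # a few of the built-in colour names
rainbow(4)       # four evenly spaced hues as hex codes
# picked <- colorchoose(2)   # click twice on the palette plot to pick colours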