How to specify tm_fill() if I want it to be a variable from a new object? - r

I am trying to create an R function that would run a GWR on variables that the user specifies from a Spatial Polygons Data Frame. The end result of running the function is two maps: one of the independent variable's values and one of the coefficient values from the GWR model. I'm having trouble with the second map.
I have managed to create the GWR model and a 'results' object for the coefficients that I would be visualizing.
gwr.model <- gwr(SpatialPolygonsDataFrame@data[, y] ~ SpatialPolygonsDataFrame@data[, x],
data = SpatialPolygonsDataFrame,
adapt = GWRbandwidth,
hatmatrix = TRUE,
se.fit = TRUE)
results <- as.data.frame(gwr.model$SDF)
gwr.map <- SpatialPolygonsDataFrame
gwr.map@data <- cbind(SpatialPolygonsDataFrame@data, as.matrix(results))
To create the visualization of the GWR coefficients, I have to specify my tm_fill() to be a column from the 'results' object, but I do not know how to do it so that the function may be used with any Spatial Polygons Data Frame. So far, I have tried using the paste0() function, like so:
map2 <- tm_shape(gwr.map) + tm_fill(paste0("SpatialPolygonsDataFrame.", x), n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
But I got an error saying that the fill argument is neither colors nor a valid variable name.
I'll be grateful for any tips that could help me resolve the issue.

Switching to the package sf - leaving sp behind - probably will solve your problem here.
In the absence of a reproducible example, let me try to suggest the following here:
Convert your object with gwr.map.sf <- sf::st_as_sf(gwr.map). Then add the results of your GWR simply as a new column: gwr.map.sf$results <- results (my understanding is that the dimensions should fit).
Finally you should be able to plot like this:
map2 <- tm_shape(gwr.map.sf) + tm_fill("results", n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
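One way to make the fill column predictable, whatever layer is passed in, is to build the formula from the strings x and y and to give the coefficient columns an explicit prefix before binding them. The following is a minimal sketch on toy data frames rather than a real GWR fit: `polygon_data`, `results`, the column names and the `coef_` prefix are all hypothetical stand-ins, and it assumes the coefficient column in gwr.model$SDF is named after the model term.

```r
# Toy stand-ins: `polygon_data` plays the role of the layer's attribute table,
# `results` the role of as.data.frame(gwr.model$SDF) (values are made up).
x <- "income"
y <- "price"
polygon_data <- data.frame(price = c(10, 12, 15), income = c(1, 2, 3))
results <- data.frame(income = c(0.5, 0.7, 0.9))

# Build the formula from the strings, so the model term is the plain column name
gwr_formula <- reformulate(x, response = y)

# Prefix the coefficient columns so they cannot clash with the original variables
names(results) <- paste0("coef_", names(results))
combined <- cbind(polygon_data, results)

# The name to hand to tm_fill() can now be derived from x alone
fill_col <- paste0("coef_", x)
stopifnot(fill_col %in% names(combined))
```

With a real model, `gwr_formula` would go to gwr() in place of the indexed formula, and `fill_col` would be the first argument of tm_fill().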

Related

Change name of groups in bal.plot

I am trying to visualize results from MatchIt procedure with bal.plot() from cobalt package.
It works just fine, except I would like to change the labels for the groups, which by default are "Unadjusted sample" and "Adjusted sample".
bal.plot(AHEAD_nomiss, var.name = "KCH_TKS", which = "both",
type = "histogram", mirror = F,
weights = AHEAD_nomiss$att.weights, treat = AHEAD_nomiss$group)
Author of cobalt package here. Thank you for using my package!
Edit. Original post at the bottom.
I just added some functionality to bal.plot for this in the development version of cobalt, which can be installed with devtools::install_github("ngreifer/cobalt"). Use the sample.names argument to supply a vector of names to give bal.plot and they'll appear in the facet labels. The vector should be as long as the number of samples (in your case, 2). Your new code should look like this:
bal.plot(AHEAD_nomiss, var.name = "KCH_TKS", which = "both",
type = "histogram", mirror = F,
weights = AHEAD_nomiss$att.weights, treat = AHEAD_nomiss$group,
sample.names = c("UNWEIGHTED", "WEIGHTED"))
Of course you can change the names. If you don't want to install the development version of cobalt (it's not guaranteed to be stable), you can use my solutions below.
I didn't intend bal.plot to be used for publication so I didn't make it super flexible, unlike love.plot. One thing you can do is manually program the histograms using ggplot2. Of course, this requires you learning how to use ggplot2, which can be a challenge, and looking at the source code of bal.plot probably won't help because of all the checks and transformations that occur. Here's some code that might work for you:
unweighted <- data.frame(KCH_TKS = AHEAD_nomiss$KCH_TKS,
treat = factor(AHEAD_nomiss$group),
weights = 1,
adj = "UNWEIGHTED",
stringsAsFactors = FALSE)
weighted <- data.frame(KCH_TKS = AHEAD_nomiss$KCH_TKS,
treat = factor(AHEAD_nomiss$group),
weights = AHEAD_nomiss$att.weights,
adj = "WEIGHTED",
stringsAsFactors = FALSE)
data <- rbind(unweighted, weighted)
ggplot(data, aes(x = KCH_TKS, fill = treat)) +
geom_histogram(aes(weight = weights), bins = 10, alpha = .4, color = "black") +
facet_grid(~adj)
One way you can hack bal.plot is to provide a set of weights that are all equal to 1 as well as your desired weights and leave which at its default. If you give the weights names, those names will appear on the facet labels. So, for your example, try
bal.plot(AHEAD_nomiss, var.name = "KCH_TKS",
type = "histogram", mirror = F,
weights = list(UNWEIGHTED = rep(1, nrow(AHEAD_nomiss)),
WEIGHTED = AHEAD_nomiss$att.weights),
treat = AHEAD_nomiss$group)
You should see that "UNWEIGHTED" and "WEIGHTED" are the new facet label names. You can of course change them to be whatever you want.

How to fix "Error in FUN(X[[i]], ...) : only defined on a data frame with all numeric variables"

I want to draw a Q-Q plot of the data, but I get an error telling me that the qqnorm function only works on numerical data.
As the factors include A, B, C, D and their two-, three- and four-way interactions, I have no idea how to convert them into numerical form.
The data is as follows:
Effects,Value
A,76.95
B,-67.52
C,-7.84
D,-18.73
AB,-51.32
AC,11.69
AD,9.78
BC,20.78
BD,14.74
CD,1.27
ABC,-2.82
ABD,-6.5
ACD,10.2
BCD,-7.98
ABCD,-6.25
My code is as follows:
library(readr)
data621 <- read_csv("Desktop/data621.csv")
data621_qq<-qqnorm(data621,xlab = "effects",datax = T)
qqline(data621,probs=c(0.3,0.7),datax = T)
text(data621_qq$x,data621_qq$y,names(data621),pos=4)
Your code would work if using the proper columns instead of the entire data frame. For example,
data621_qq <- qqnorm(data621$Value, xlab = "Effects", datax = TRUE)
qqline(data621$Value, probs = c(0.3, 0.7), datax = TRUE)
text(data621_qq$x, data621_qq$y, data621$Effects, pos=4)
By the way, names(data621) would give you the column names, instead of the effect names (which are stored as values in a column).
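To see why the column has to be selected, it can help to check which columns are numeric before plotting. This is a small self-contained sketch that reads the data from the question inline instead of from the CSV file; the names `numeric_cols` and `qq` are only illustrative.

```r
# The data from the question, read inline instead of from a CSV file
data621 <- read.csv(text = "Effects,Value
A,76.95
B,-67.52
C,-7.84
D,-18.73
AB,-51.32
AC,11.69
AD,9.78
BC,20.78
BD,14.74
CD,1.27
ABC,-2.82
ABD,-6.5
ACD,10.2
BCD,-7.98
ABCD,-6.25", stringsAsFactors = FALSE)

# Only Value is numeric; Effects holds the labels, so the whole
# data frame cannot be passed to qqnorm()
numeric_cols <- sapply(data621, is.numeric)
print(numeric_cols)

# Plot the numeric column and label each point with its effect name
qq <- qqnorm(data621$Value, xlab = "Effects", datax = TRUE)
qqline(data621$Value, probs = c(0.3, 0.7), datax = TRUE)
text(qq$x, qq$y, labels = data621$Effects, pos = 4)
```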

How to plot an nmds with coloured/symbol points based on SIMPROF

Hi, I am trying to plot an NMDS of assemblage data, based on a Bray-Curtis dissimilarity matrix, in R. I have been able to apply ordiellipse() and ordihull(), and even change the colours based on group factors created by cutree() on an hclust() result,
e.g. using the dune data from the vegan package:
library(vegan)
data(dune)
Dune.dis <- vegdist(dune, method = "bray")
Dune.mds <- metaMDS(dune, distance = "bray", k = 2)
#hierarchical cluster
clua <- hclust(Dune.dis, "average")
plot(clua, hang = -1)
# set groupings
rect.hclust(clua, 4)
grp <- cutree(clua, 4)
#plot mds
plot(Dune.mds, display = "sites", type = "text", cex = 1.5)
#show groupings
ordiellipse(Dune.mds, groups = grp, border = 1, col = "red", lwd = 3)
or even colour the points just by the cutree
colvec <- c("red2", "cyan", "deeppink3", "green3")
colvec[grp]
plot(Dune.mds, display = "sites", type = "text", cex = 1.5) #or use type = "points"
points(Dune.mds, col = colvec[grp], bg = colvec[grp], pch = 21)
However, what I really want to do is use the simprof function from the package "clustsig" and then colour the points based on the significant groupings. This is more of a technical coding question: I could build the vector of factors by hand, but I am sure there is a more efficient way to do it.
Here's my code so far:
simp <- simprof(Dune.dis, num.expected = 1000, num.simulated = 999, method.cluster = "average", method.distance = "braycurtis", alpha = 0.05, sample.orientation = "row")
#plot dendrogram
simprof.plot(simp, plot = TRUE)
Now I am just not sure how to do the next step, plotting the NMDS using the groupings defined by SIMPROF. How do I turn the SIMPROF results into a factor without literally typing it out myself?
Thanks in advance.
You wrote that you know how to get colours from an hclust object with cutree. Then read the documentation of clustsig::simprof. It says that simprof returns an hclust object within its result object. It also returns numgroups, which is the suggested number of clusters. Now you have all the information you need to use the cutree of hclust that you already know. If your simprof result is called simp, use cutree(simp$hclust, simp$numgroups) to extract the integer vector corresponding to the clustsig::simprof result, and use this to assign the colours.
I have never used simprof or clustsig, but I gathered all this information from its documentation.
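If the groups reported by simprof are wanted directly rather than through cutree, a list of sample-name vectors can be turned into a grouping vector in a couple of lines. This sketch assumes the result carries such a list (the clustsig documentation describes an element called significantclusters); the `clusters` list and `site_names` vector below are made-up stand-ins, not output from the dune data.

```r
# Stand-in for simp$significantclusters: one character vector of site names per group
clusters <- list(c("1", "2", "10"), c("3", "4"), c("5"))
# Stand-in for the site order used in the ordination, e.g. rownames(dune)
site_names <- c("1", "2", "3", "4", "5", "10")

# Repeat each group index once per member, then name the entries by site
grp <- rep(seq_along(clusters), times = lengths(clusters))
names(grp) <- unlist(clusters)

# Reorder by name so the groups line up with the rows of the ordination
grp <- grp[site_names]

# One colour per group, looked up by group index
colvec <- c("red2", "cyan", "deeppink3")
site_cols <- colvec[grp]
print(data.frame(site = site_names, group = unname(grp), colour = site_cols))
```

`site_cols` can then be passed as col and bg to points() on the ordination, in the same way as colvec[grp] in the question.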

Plotting quantile regression by variables in a single page

I am running quantile regressions for several independent variables separately (same dependent). I want to plot only the slope estimates over several quantiles of each variable in a single plot.
Here's a toy data:
set.seed(1988)
y <- rnorm(50, 5, 3)
x1 <- rnorm(50, 3, 1)
x2 <- rnorm(50, 1, 0.5)
# Running Quantile Regression
require(quantreg)
fit1 <- summary(rq(y~x1, tau=1:9/10), se="boot")
fit2 <- summary(rq(y~x2, tau=1:9/10), se="boot")
I want to plot only the slope estimates over quantiles. Hence, I am giving parm=2 in plot.
plot(fit1, parm=2)
plot(fit2, parm=2)
Now, I want to combine both these plots in a single page.
What I have tried so far:
I tried setting par(mfrow=c(2,2)) and plotting them. But it's producing a blank page.
I have tried using gridExtra and gridGraphics without success. Tried to convert base graphs into Grob objects as stated here
Tried using the layout function as in this document
I am trying to look into the source code of plot.rqs. But I am unable to understand how it's plotting confidence bands (I'm able to plot only the coefficients over quantiles) or to change mfrow parameter there.
Can anybody point out where I am going wrong? Should I look into the source code of plot.rqs and change any parameters there?
While quantreg::plot.summary.rqs has an mfrow parameter, it uses it to override par('mfrow') so as to facet over parm values, which is not what you want to do.
One alternative is to parse the objects and plot manually. You can pull the tau values and coefficient matrix out of fit1 and fit2, which are just lists of values for each tau, so in tidyverse grammar,
library(tidyverse)
c(fit1, fit2) %>% # concatenate lists, flattening to one level
# iterate over list and rbind to data.frame
map_dfr(~cbind(tau = .x[['tau']], # from each list element, cbind the tau...
coef(.x) %>% # ...and the coefficient matrix,
data.frame(check.names = TRUE) %>% # cleaned a little
rownames_to_column('term'))) %>%
filter(term != '(Intercept)') %>% # drop intercept rows
# initialize plot and map variables to aesthetics (positions)
ggplot(aes(x = tau, y = Value,
ymin = Value - Std..Error,
ymax = Value + Std..Error)) +
geom_ribbon(alpha = 0.5) +
geom_line(color = 'blue') +
facet_wrap(~term, nrow = 2) # make a plot for each value of `term`
Pull more out of the objects if you like, add the horizontal lines of the original, and otherwise go wild.
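The same manual extraction can also be done with base graphics, where par(mfrow) behaves as expected because no quantreg plot method is involved. This is a rough sketch on the toy data from the question; `slope_table` and `plot_slopes` are hypothetical helper names, and the dashed bands are plus or minus one bootstrap standard error, not the confidence bands drawn by the quantreg plot method.

```r
library(quantreg)

# Toy data and fits from the question
set.seed(1988)
y  <- rnorm(50, 5, 3)
x1 <- rnorm(50, 3, 1)
x2 <- rnorm(50, 1, 0.5)
fit1 <- summary(rq(y ~ x1, tau = 1:9/10), se = "boot")
fit2 <- summary(rq(y ~ x2, tau = 1:9/10), se = "boot")

# Collect tau, slope estimate and standard error for each quantile
# (row 2 of the coefficient matrix is the slope)
slope_table <- function(fit) {
  data.frame(
    tau = sapply(fit, function(s) s$tau),
    est = sapply(fit, function(s) coef(s)[2, "Value"]),
    se  = sapply(fit, function(s) coef(s)[2, "Std. Error"])
  )
}

# Draw the slope across quantiles with +/- 1 SE as dashed lines
plot_slopes <- function(tab, label) {
  plot(tab$tau, tab$est, type = "b", xlab = "tau", ylab = "slope", main = label,
       ylim = range(tab$est - tab$se, tab$est + tab$se))
  lines(tab$tau, tab$est - tab$se, lty = 2)
  lines(tab$tau, tab$est + tab$se, lty = 2)
}

tab1 <- slope_table(fit1)
tab2 <- slope_table(fit2)

# Two panels on one page, then restore the previous layout
old_par <- par(mfrow = c(2, 1))
plot_slopes(tab1, "x1")
plot_slopes(tab2, "x2")
par(old_par)
```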
Another option is to use magick to capture the original images (or save them with any device and reread them) and manually combine them:
library(magick)
plots <- image_graph(height = 300) # graphics device to capture plots in image stack
plot(fit1, parm = 2)
plot(fit2, parm = 2)
dev.off()
im1 <- image_append(plots, stack = TRUE) # attach images in stack top to bottom
image_write(im1, 'rq.png')
The plot function used by the quantreg package has its own mfrow parameter. If you do not specify it, it enforces a layout that it chooses on its own (and thus overrides your par(mfrow = c(2,2))).
Using the mfrow parameter within plot.rqs:
# make one plot, change the layout
plot(fit1, parm = 2, mfrow = c(2,1))
# add a new plot
par(new = TRUE)
# create a second plot
plot(fit2, parm = 2, mfrow = c(2,1))

Error plotting Kohonen maps in R?

I was reading through this blog post on R-bloggers and I'm confused by the last section of the code and can't figure it out.
http://www.r-bloggers.com/self-organising-maps-for-customer-segmentation-using-r/
I've attempted to recreate this with my own data. I have 5 variables that follow an exponential distribution with 2755 points.
I am fine with and can plot the map that it generates:
plot(som_model, type="codes")
The section of the code I don't understand is the:
var <- 1
var_unscaled <- aggregate(as.numeric(training[,var]),by=list(som_model$unit.classif),FUN = mean, simplify=TRUE)[,2]
plot(som_model, type = "property", property=var_unscaled, main = names(training)[var], palette.name=coolBlueHotRed)
As I understand it, this section of the code is supposed to plot one of the variables over the map to see what it looks like, but this is where I run into problems. When I run this section of the code I get the warning:
Warning message:
In bgcolors[!is.na(showcolors)] <- bgcol[showcolors[!is.na(showcolors)]] :
number of items to replace is not a multiple of replacement length
and it produces the plot:
Which somehow just doesn't look right...
Now what I think it has come down to is the way the aggregate function has re-ordered the data. The length of var_unscaled is 789 and the length of som_model$data, training[,var] and unit.classif are all of length 2755. I tried plotting the aggregated data, the result was no warning but an unintelligible graph (as expected).
Now I think it has done this because unit.classif has a lot of repeated numbers inside it and that's why it has reduced in size.
The question is, do I need to worry about the warning? Is it producing an accurate graph? What exactly is the "property" argument looking for in the plot command? Is there a different way I could aggregate the data?
I think that you have to define the colour palette first. If you define the function
coolBlueHotRed <- function(n, alpha = 1) {rainbow(n, end=4/6, alpha=alpha)[n:1]}
and then try to draw a plot, for example
plot(som_model, type = "count", palette.name = coolBlueHotRed)
it works.
This link can help you: http://rgm3.lab.nig.ac.jp/RGM/R_rdfile?f=kohonen/man/plot.kohonen.Rd&d=R_CC
I think that not all of the cells on your map have points inside.
You have a 30 by 30 map and about 2700 points. On average that's about 3 points per cell. With high probability some cells have more than 3 points and some cells are empty.
The code in the post on R-bloggers works well when all of the cells have points inside.
To make it work on your data, try changing this part:
var <- 1
var_unscaled <- aggregate(as.numeric(training[, var]), by = list(som_model$unit.classif), FUN = mean, simplify = TRUE)[, 2]
plot(som_model, type = "property", property = var_unscaled, main = names(training)[var], palette.name = coolBlueHotRed)
with this one:
var <- 1
var_unscaled <- aggregate(as.numeric(training[, var]),
by = list(som_model$unit.classif),
FUN = mean,
simplify = TRUE)
v_u <- rep(0, max(var_unscaled$Group.1))
v_u[var_unscaled$Group.1] <- var_unscaled$x
plot(som_model,
type = "property",
property = v_u,
main = names(training)[var],
palette.name = coolBlueHotRed)
Hope it helps.
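The idea of padding the aggregated values out to one entry per map unit can be checked on a small example without fitting a SOM. In this sketch `unit_classif`, `values` and `n_units` are made-up stand-ins for som_model$unit.classif, training[, var] and the number of units in the grid (assumed here to be available as nrow(som_model$grid$pts) on a real model); empty units are filled with NA rather than 0 so they are not mistaken for a real mean.

```r
# Stand-ins: the winning unit for each observation and the variable to summarise
unit_classif <- c(1, 1, 3, 6, 6, 6)
values       <- c(2, 4, 10, 1, 2, 3)
n_units      <- 9  # a 3 x 3 map; units 2, 4, 5, 7, 8 and 9 receive no points

# Mean of the variable per occupied unit (one row per unit that has points)
agg <- aggregate(values, by = list(unit = unit_classif), FUN = mean)

# Expand to one value per unit, leaving empty units as NA
property <- rep(NA_real_, n_units)
property[agg$unit] <- agg$x
print(property)
```

The resulting vector has the same length as the number of units, which is the length the property argument needs to match to avoid the recycling warning.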
Just add these functions to your script:
coolBlueHotRed <- function(n, alpha = 1) {rainbow(n, end=4/6, alpha=alpha)[n:1]}
pretty_palette <- c("#1f77b4","#ff7f0e","#2ca02c", "#d62728","#9467bd","#8c564b","#e377c2")
