Nodes sliding off path diagram in R - r

This is a sample dput since the dataset is huge:
> dput(head(dat, n=20))
structure(list(q01 = c(2, 1, 2, 3, 2, 2, 2, 2, 3, 2, 2, 2, 3,
2, 2, 3, 1, 2, 2, 2), q02 = c(1, 1, 3, 1, 1, 1, 3, 2, 3, 4, 1,
1, 1, 2, 2, 1, 2, 2, 3, 1), q03 = c(4, 4, 2, 1, 3, 3, 3, 3, 1,
4, 5, 3, 3, 1, 3, 2, 5, 3, 4, 1), q04 = c(2, 3, 2, 4, 2, 2, 2,
2, 4, 3, 2, 3, 4, 2, 4, 2, 2, 3, 2, 2), q05 = c(2, 2, 4, 3, 2,
4, 2, 2, 5, 2, 2, 4, 3, 2, 2, 2, 1, 3, 3, 3), q06 = c(2, 2, 1,
3, 3, 4, 2, 2, 3, 1, 1, 3, 2, 2, 2, 2, 1, 4, 1, 4), q07 = c(3,
2, 2, 4, 3, 4, 2, 2, 5, 2, 2, 3, 3, 3, 3, 2, 1, 3, 1, 4), q08 = c(1,
2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 1, 3, 2, 2, 2, 1, 2, 1, 1), q09 = c(1,
5, 2, 2, 4, 4, 3, 4, 3, 3, 5, 3, 2, 2, 2, 2, 4, 5, 5, 5), q10 = c(2,
2, 2, 4, 2, 3, 2, 2, 3, 2, 2, 2, 3, 3, 3, 3, 1, 2, 2, 1), q11 = c(1,
2, 3, 2, 2, 2, 2, 2, 5, 2, 1, 2, 3, 2, 2, 2, 1, 3, 1, 2), q12 = c(2,
3, 3, 2, 3, 4, 2, 3, 5, 3, 3, 3, 4, 4, 3, 3, 2, 3, 3, 5), q13 = c(2,
1, 2, 2, 3, 3, 2, 2, 5, 2, 1, 2, 4, 2, 2, 2, 1, 3, 1, 2), q14 = c(2,
3, 4, 3, 2, 3, 2, 2, 5, 1, 2, 2, 4, 4, 3, 3, 1, 3, 2, 5), q15 = c(2,
4, 2, 3, 2, 5, 2, 3, 5, 2, 1, 3, 4, 4, 3, 2, 1, 4, 2, 5), q16 = c(3,
3, 3, 3, 2, 2, 2, 2, 5, 3, 2, 3, 4, 4, 4, 3, 2, 3, 3, 5), q17 = c(1,
2, 2, 2, 2, 3, 2, 2, 5, 2, 2, 2, 3, 2, 2, 2, 2, 2, 1, 2), q18 = c(2,
2, 3, 4, 3, 5, 2, 2, 5, 2, 2, 2, 3, 4, 3, 3, 1, 2, 1, 5), q19 = c(3,
3, 1, 2, 3, 1, 3, 4, 2, 3, 5, 3, 2, 1, 3, 2, 4, 2, 4, 1), q20 = c(2,
4, 4, 4, 4, 5, 2, 3, 5, 3, 3, 4, 4, 5, 4, 3, 2, 3, 2, 5), q21 = c(2,
4, 3, 4, 2, 3, 2, 2, 5, 2, 2, 3, 4, 5, 4, 2, 1, 3, 2, 5), q22 = c(2,
4, 2, 4, 4, 1, 4, 4, 3, 4, 5, 4, 3, 3, 4, 3, 4, 3, 4, 5), q23 = c(5,
2, 2, 3, 4, 4, 4, 4, 3, 4, 5, 4, 4, 1, 4, 4, 4, 4, 4, 5)), variable.labels = c(q01 = "Statistics makes me cry",
q02 = "My friends will think I'm stupid for not being able to cope with SPSS",
q03 = "Standard deviations excite me", q04 = "I dream that Pearson is attacking me with correlation coefficients",
q05 = "I don't understand statistics", q06 = "I have little experience of computers",
q07 = "All computers hate me", q08 = "I have never been good at mathematics",
q09 = "My friends are better at statistics than me", q10 = "Computers are useful only for playing games ",
q11 = "I did badly at mathematics at school", q12 = "People try to tell you that SPSS makes statistics easier to understand but it doesn't",
q13 = "I worry that I will cause irreparable damage because of my incompetenece with computers",
q14 = "Computers have minds of their own and deliberately go wrong whenever I use them",
q15 = "Computers are out to get me", q16 = "I weep openly at the mention of central tendency",
q17 = "I slip into a coma whenever I see an equation", q18 = "SPSS always crashes when I try to use it",
q19 = "Everybody looks at me when I use SPSS", q20 = "I can't sleep for thoughts of eigen vectors",
q21 = "I wake up under my duvet thinking that I am trapped under a normal distribtion",
q22 = "My friends are better at SPSS than I am", q23 = "If I'm good at statistics my friends will think I'm a nerd"
), codepage = 65001L, row.names = c(NA, 20L), class = "data.frame")
I mostly copied another semPath model but edited it to fit the dataset I was using. First the nodes:
nodeNames <- c(
"Statistics makes me cry.",
"My friends think I'm stupid for not being able to cope with SPSS.",
"Standard deviations excite me.",
"I dream that Pearson is attacking me with correlation coefficients.",
"I don't understand statistics.",
"I have little experience with computers.",
"All computers hate me.",
"I've never been good at mathematics.",
"SPSS Anxiety"
)
Then the actual semPath:
semPaths(onefac8items_a,
what = "std", # this argument controls what the color of edges represent. In this case, standardized parameters
whatLabels = "est",
style = "lisrel",
residScale = 8,
theme = "colorblind",
manifests = paste0("q",1:8),
nCharNodes = 0,
reorder = FALSE,
nodeNames = nodeNames,
legend.cex = 0.5,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE,
sizeMan = 4,
sizeLat = 10,
mar = c(2,5,2,5.5),
filetype = "pdf", width = 8, height = 6, filename = "SPSS Anxiety"
)
So I really only have one question here. When I try to run my path diagram, the nodes look like they are sliding off to the right of the page. How do I fix this? Below is a picture of what I'm referring to:

Since you didn't share your model, I reproduced the problem with a dummy model. It seems semPaths doesn't give us a way to adjust how nodeNames are laid out. You could save the graph as an object and redraw it with plot() so that you can rescale it, since semPaths has a lot of options.
semPaths(fit,
what = "std",
style = "lisrel",
residScale = 8,
theme = "colorblind",
nCharNodes = 4,
reorder = FALSE,
nodeNames = nodeNames,
legend.cex = 0.35,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE)
Or we could change the GLratio in the plotOptions:
a<-semPaths(onefac8items_a,
what = "std",
whatLabels = "est",
style = "lisrel",
residScale = 8,
theme = "colorblind",
nCharNodes = 0,
reorder = FALSE,
nodeNames = nodeNames,
legend.cex = 0.5,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE,
sizeMan = 4,
sizeLat = 10,
mar = c(2,5,2,5.5)
)
a$plotOptions$GLratio<-1 # you may need to play with this number
plot(a)

I ended up just shortening my questions down in the nodes and it fixed the problem. I guess there's a limit to how much text you can put into your legend:
nodeNames <- c(
"Statistics makes me cry.",
"Friends think I'm stupid because I cant do SPSS.",
"Standard deviations excite me.",
"I dream that Pearson is attacking me with correlations.",
"I don't understand statistics.",
"I have little experience with computers.",
"All computers hate me.",
"I've never been good at mathematics.",
"SPSS Anxiety"
)

Your page isn't big enough.
There are two graphics systems in R, base and grid. The one semPaths uses is base graphics, which sort of mimics how you draw on paper: first you set up the page size, then you draw things; you can't go back. The other, grid, is used by lattice and ggplot2, which hold off on plotting until you call for it. grid plots typically do not run off the page the way base graphics can; they are usually scaled to fit within the plotting region.
Here is basically your problem using an example from lavaan::cfa
library('lavaan')
library('semPlot')
nodeNames <- c(
"Statistics makes me cry.",
"My friends think I'm stupid for not being able to cope with SPSS.",
"Standard deviations excite me.",
"I dream that Pearson is attacking me with correlation coefficients.",
"I don't understand statistics.",
"I have little experience with computers.",
"All computers hate me.",
"I've never been good at mathematics.",
"SPSS Anxiety"
)
?semPlot::semPaths
example(cfa)
semPaths(
fit,
what = "std", # this argument controls what the color of edges represent. In this case, standardized parameters
whatLabels = "est",
style = "lisrel",
residScale = 8,
theme = "colorblind",
# manifests = paste0("q",1:8),
nCharNodes = 0,
reorder = FALSE,
nodeNames = nodeNames,
legend.cex = 0.5,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE,
sizeMan = 4,
sizeLat = 10,
mar = c(2,5,2,5.5),
filetype = "pdf", width = 8, height = 6, filename = "SPSS-Anxiety"
)
I'm not sure what semPaths is doing here with the size because it is definitely not coming out 8x6
$ identify -verbose SPSS-Anxiety.pdf | grep "Print size"
8: Print size: 11.1944x6
I'm guessing it compensates for the extra features to fit everything on, but it is not doing a very good job.
The typical way to save base plots to a file is
pdf() ## or png(), jpeg(), etc.
## ... plotting code ...
dev.off() ## or graphics.off() to close everything, not just the current device
To do this, you need to remove the filetype, width, height, and filename arguments from your semPaths call:
pdf('SPSS-Anxiety-2.pdf', width = 8, height = 6)
par(oma = c(0, 2, 0, 25), xpd = NA)
semPaths(
fit,
what = "std", # this argument controls what the color of edges represent. In this case, standardized parameters
whatLabels = "est",
style = "lisrel",
residScale = 8,
theme = "colorblind",
# manifests = paste0("q",1:8),
nCharNodes = 0,
reorder = FALSE,
nodeNames = nodeNames,
legend.cex = 0.5,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE,
sizeMan = 4,
sizeLat = 10,
mar = c(2,5,2,5.5)
)
dev.off()
Now I am getting something 8x6
$ identify -verbose SPSS-Anxiety-2.pdf | grep "Print size"
8: Print size: 8x6
I increased the size of the outer margins (oma, see ?par), which gives me 2 extra lines of space on the left and 25 on the right. Also note xpd = NA, which turns off clipping, i.e., anything drawn outside of the plotting area will still be shown. This also comes up a lot in base plots.
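To see the effect of oma and xpd in isolation, here is a minimal sketch that uses only base graphics, with an ordinary scatter plot standing in for the path diagram. The file name and the label text are made up for illustration.

```r
# Hypothetical output file in the session's temp directory
out <- file.path(tempdir(), "oma-demo.pdf")

pdf(out, width = 8, height = 6)

# 2 lines of outer margin on the left, 25 on the right; xpd = NA disables clipping
par(oma = c(0, 2, 0, 25), xpd = NA)

plot(1:10, main = "oma / xpd demo")

# x = 11 lies to the right of the plot region; the text is only drawn
# in full because clipping is turned off
text(x = 11, y = 5, labels = "A long label drawn in the outer margin", adj = 0)

invisible(dev.off())
```

Setting xpd back to FALSE in the same sketch should clip the label at the edge of the plot region, which is a quick way to check that clipping is what is hiding the text.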
But this is a lot of wasted space for some text. I would either scale it down or split the text into multiple lines. You can use strwrap to split each label at white space into <= some maximum width:
par(oma = c(0, 0, 3, 0))
semPaths(
fit,
what = "std", # this argument controls what the color of edges represent. In this case, standardized parameters
whatLabels = "est",
style = "lisrel",
residScale = 8,
theme = "colorblind",
# manifests = paste0("q",1:8),
nCharNodes = 0,
reorder = FALSE,
nodeNames = sapply(nodeNames, function(x)
paste(strwrap(x, 30), collapse = '\n ')),
legend.cex = 0.5,
rotation = 2,
layout = "tree2",
cardinal = "lat cov",
curvePivot = TRUE,
sizeMan = 4,
sizeLat = 10,
mar = c(2,5,2,5.5)
)
title('Anxiety and Depression SEM Path Diagram', outer = TRUE)
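To check what the wrapping does to a single label before handing it to semPaths, a small base R sketch like the following may help. It uses one of the labels from the question and the same width of 30; the variable names are only for illustration.

```r
label <- "I dream that Pearson is attacking me with correlation coefficients."

# Break the label at white space into pieces that fit the target width
pieces <- strwrap(label, width = 30)

# Join the pieces with newlines, as in the nodeNames argument above
wrapped <- paste(pieces, collapse = "\n")

cat(wrapped, "\n")
nchar(pieces)  # length of each wrapped line
```

A smaller width gives more, shorter lines, so the legend gets taller but narrower.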

Related

How can I change size of y-axis text labels on a likert() object in R?

I'm working with the likert() library to generate nice looking diverging stacked bar charts in R. Most of the formatting has come together, but I can't seem to find a way to shrink the text for the y-axis labels (e.g. "You and your family in the UK", "People in your local area..." etc.) which are too large for the plot. Any ideas here? I'm starting to wonder if I need to revert to ggplot, which will require more code, but have more customisability...
# Ingest data to make reproducible example:
climate_experience_data <- structure(list(Q25_self_and_family = c(4, 2, 3, 5, 3, 3, 4, 2,
4, 2, 4, 4, 3, 3, 2, 5, 3, 4, 1, 3, 3, 2, 4, 2, 2, 2, 4, 3, 3,
3, 2, 5, 5, 4, 2, 2, 2, 3, 1, 3, 2, 1, 2, 4, 2), Q25_local_area = c(3,
3, 3, 5, 3, 2, 4, 2, 4, 2, 4, 3, 2, 3, 2, 5, 4, 5, 1, 4, 3, 3,
4, 2, 3, 2, 3, 3, 2, 3, 2, 5, 5, 2, 2, 2, 2, 3, 1, 1, 2, 1, 2,
4, 3), Q25_uk = c(4, 3, 3, 5, 2, 3, 5, 2, 4, 2, 4, 3, 3, 3, 3,
5, 4, 5, 2, 3, 3, 2, 4, 2, 4, 3, 4, 3, 2, 4, 4, 5, 5, 4, 3, 3,
2, 4, 2, 5, 2, 2, 2, 3, 3), Q25_outside_uk = c(4, 4, 3, 5, 4,
4, 5, 2, 4, 3, 3, 3, 3, 4, 3, 5, 4, 5, 4, 3, 3, 2, 4, 2, 5, 3,
3, 2, 2, 3, 4, 4, 5, 4, 4, 3, 2, 4, 4, 5, 2, 3, 2, 2, 2)), row.names = c(NA,
-45L), class = c("tbl_df", "tbl", "data.frame"))
# load libraries:
require(tidyverse)
require(likert)
# Q25 - generate diverging stacked bar chart using likert()
q25_data <- select(climate_experience_data, Q25_self_and_family:Q25_outside_uk)
names(q25_data) <- c("You and your family in the UK", "People in your local area or city", "The UK as a whole", "Your family and/or friends living outside the UK")
# Set up levels text for question responses
q25_levels <- paste(c("not at all", "somewhat", "moderately", "very", "extremely"),
"serious")
q25_likert_table <- q25_data %>%
mutate(across(everything(),
factor, ordered = TRUE, levels = 1:5, labels=q25_levels)) %>%
as.data.frame %>%
likert()
# make plot:
plot(q25_likert_table, wrap=20, text.size=3, ordered=FALSE, low.color='#B18839', high.color='#590048') +
ggtitle(title) +
labs(title = "How serious a threat do you think \nclimate change poses to the following?", y="") +
guides(fill = guide_legend(title = NULL)) +
theme_ipsum_rc() +
theme()
Here's a sample of output:
As your plot is still a ggplot object you could adjust the size of the y axis labels via theme(axis.text.y = ...):
library(tidyverse)
library(likert)
library(hrbrthemes)
q25_likert_table <- q25_data %>%
mutate(across(everything(),
factor,
ordered = TRUE, levels = 1:5, labels = q25_levels
)) %>%
as.data.frame() %>%
likert()
plot(q25_likert_table, wrap = 20, text.size = 3, ordered = FALSE, low.color = "#B18839", high.color = "#590048") +
ggtitle(title) +
labs(title = "How serious a threat do you think \nclimate change poses to the following?", y = "") +
guides(fill = guide_legend(title = NULL)) +
theme_ipsum_rc() +
theme(axis.text.y = element_text(size = 4))

how to make a summary by category (dimension) with likert library in R

I am trying to do a question analysis but using dimensions, for example:
Dimension 1
question 1(p1):
question 2(p2):
question 3(p3):
Dimension 2
question 4(p4):
question 5(p5):
question 6(p6):
question 7(p7):
require(tibble)
tb = tibble("p1" = factor(c(5, 1, 4, 3, 2, 4, 4, 4, 2, 5),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p2" = factor(c(1, 3, 2, 1, 1, 5, 3, 5, 5, 4),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p3" = factor(c(3, 5, 3, 4, 3, 2, 1, 1, 1, 2),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p4" = factor(c(1, 3, 5, 4, 1, 4, 2, 4, 5, 2),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p5" = factor(c(5, 4, 2, 4, 2, 3, 2, 1, 3, 5),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p6" = factor(c(2, 5, 1, 3, 4, 1, 3, 2, 2, 1),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")),
"p7" = factor(c(2, 4, 5, 2, 5, 5, 3, 3, 1, 2),
labels = c("totally disagree","disagreement","neutral","agree","totally agree")))
Now I want to make a summary by dimension. Is there a function in the likert library for this?
for example
> likert(summary = tb$results)
Dimensions totally disagree disagree neutral agree totally agree
1 Dimension 1 17.69912 30.08850 44.24779 6.194690 1.769912
2 Dimension 2 23.89381 36.28319 30.08850 7.964602 1.769912
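One way to get a table of this shape without relying on any particular likert feature is to pool the answers of all questions in a dimension and compute row percentages. This is only a sketch in base R: the question-to-dimension mapping is taken from the outline above, and the data are rebuilt as a plain data frame so the block runs on its own.

```r
lv <- c("totally disagree", "disagreement", "neutral", "agree", "totally agree")

# Same answers as in the question, as plain numeric codes
tb <- data.frame(
  p1 = c(5, 1, 4, 3, 2, 4, 4, 4, 2, 5),
  p2 = c(1, 3, 2, 1, 1, 5, 3, 5, 5, 4),
  p3 = c(3, 5, 3, 4, 3, 2, 1, 1, 1, 2),
  p4 = c(1, 3, 5, 4, 1, 4, 2, 4, 5, 2),
  p5 = c(5, 4, 2, 4, 2, 3, 2, 1, 3, 5),
  p6 = c(2, 5, 1, 3, 4, 1, 3, 2, 2, 1),
  p7 = c(2, 4, 5, 2, 5, 5, 3, 3, 1, 2)
)

# Assumed mapping of questions to dimensions
dims <- c(p1 = "Dimension 1", p2 = "Dimension 1", p3 = "Dimension 1",
          p4 = "Dimension 2", p5 = "Dimension 2", p6 = "Dimension 2",
          p7 = "Dimension 2")

# Stack all answers into one long data frame, keeping the source question
long <- data.frame(
  question = rep(names(tb), each = nrow(tb)),
  answer   = factor(unlist(tb, use.names = FALSE), levels = 1:5, labels = lv)
)
long$dimension <- unname(dims[long$question])

# Counts per dimension and answer level, then row percentages
counts <- table(long$dimension, long$answer)
pct <- round(100 * prop.table(counts, margin = 1), 1)
pct
```

Pooling treats every answer within a dimension equally, so a dimension with more questions simply contributes more answers to its row.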

How to plot a rating scale in R

What is the best way to represent the following trait rating scale? I'd like to label the traits (8 traits) and the degree of each emotion (1 being low feelings, 5 being strong feelings) across the Democratic and Republican parties. Do I need to aggregate the items? I'm new to R and not sure how to tackle this.
Survey question and scale:
"Below is a list of feelings or moods that could be caused by an object. Please use the list below to describe how the U.S. FEDERAL parties (and its elected officials) make you feel. If the word definitely describes how a party makes you feel, then choose the number 5. If you decide that the word does not at all describe how the party makes you feel, then choose the number 1. Use the intermediate numbers between 1 and 5 to indicate responses between these two extremes."
Survey sample:
dput(df[sample(1:nrow(df), 30),])
structure(list(TRAITDEM1 = c(3, 4, 3, 3, 3, 3, 3, 1, 2, 2, 2,
3, 3, 2, 2, 1, 1, 3, 1, 5, 1, 1, 3, 1, 4, 4, 3, 1, 2, 4), TRAITDEM2 = c(3,
1, 1, 2, 2, 2, 3, 5, 4, 2, 2, 2, 3, 3, 3, 4, 1, 2, 3, 1, 4, 5,
2, 3, 1, 1, 1, 4, 1, 2), TRAITDEM3 = c(3, 4, 4, 2, 3, 3, 3, 1,
1, 2, 2, 3, 3, 2, 2, 1, 1, 3, 1, 5, 1, 1, 3, 1, 4, 5, 4, 1, 3,
5), TRAITDEM4 = c(3, 2, 1, 2, 2, 2, 4, 5, 4, 5, 2, 3, 2, 3, 3,
4, 3, 4, 3, 1, 5, 4, 1, 4, 3, 4, 2, 4, 2, 1), TRAITDEM5 = c(3,
4, 3, 4, 4, 3, 2, 1, 1, 2, 2, 3, 4, 2, 2, 1, 1, 3, 1, 5, 1, 1,
2, 1, 4, 4, 4, 1, 3, 4), TRAITDEM6 = c(3, 1, 1, 1, 1, 1, 1, 2,
1, 1, 1, 2, 2, 2, 2, 4, 3, 1, 1, 1, 4, 5, 1, 3, 1, 1, 1, 1, 1,
1), TRAITDEM7 = c(3, 1, 3, 3, 2, 2, 1, 1, 1, 2, 3, 4, 3, 2, 2,
1, 1, 2, 2, 5, 1, 1, 1, 3, 3, 4, 2, 1, 5, 5), TRAITDEM8 = c(3,
1, 1, 1, 2, 1, 3, 5, 2, 4, 1, 1, 2, 2, 3, 1, 3, 1, 2, 1, 5, 5,
2, 2, 1, 2, 1, 2, 1, 1), TRAITREP1 = c(1, 1, 1, 1, 1, 1, 1, 1,
1, 4, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1,
1), TRAITREP2 = c(1, 5, 5, 5, 5, 5, 5, 2, 5, 2, 5, 5, 5, 5, 4,
5, 1, 5, 5, 5, 5, 1, 5, 4, 5, 5, 5, 3, 5, 5), TRAITREP3 = c(1,
1, 1, 1, 2, 1, 1, 2, 1, 4, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 3,
1, 1, 1, 1, 1, 1, 1, 2), TRAITREP4 = c(1, 5, 5, 1, 5, 5, 5, 3,
5, 2, 5, 4, 5, 5, 5, 5, 3, 5, 5, 5, 5, 1, 5, 3, 5, 5, 5, 4, 5,
1), TRAITREP5 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 2,
1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1), TRAITREP6 = c(1,
5, 5, 5, 3, 3, 3, 1, 1, 1, 3, 3, 5, 3, 4, 5, 3, 4, 5, 4, 5, 1,
5, 3, 4, 4, 5, 1, 1, 3), TRAITREP7 = c(1, 1, 1, 1, 2, 2, 1, 1,
1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1,
2), TRAITREP8 = c(1, 5, 5, 5, 4, 5, 5, 2, 5, 2, 5, 4, 5, 5, 4,
1, 3, 5, 5, 5, 5, 3, 4, 4, 5, 5, 5, 3, 5, 5), PARTYID_Strength = c(5,
1, 2, 1, 2, 1, 8, 7, 6, 3, 1, 6, 6, 1, 7, 8, 7, 1, 1, 1, 2, 4,
1, 6, 1, 1, 1, 7, 6, 8)), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
"PartyID_Strength" represents 8 measures of political parties:
1 - Strong Democrat
2 - Not very strong Democrat
3 - Strong Republican
4 - Not very strong Republican
5 - Independent
6 - Independent - Democrat
7 - Independent - Republican
8 - Other
I tried it this way (graph below) but it's still not plotting the remaining four traits:
Cleaning the data
To solve your problem, we first have to transform your data into tidy format.
Observation
There are a few particular problems with your original dataset:
Data are in a wide format, i.e. most of the columns in your data frame can be represented by 3 variables;
Names of the variables are not self-explanatory. They are in upper case, which by itself carries no useful information, and they are hard to read and awkward to type;
There is additional information we can extract from the variable names: Party and Feelings toward the Party. The first is an abbreviation ('dem' or 'rep'); the second is the numerically encoded feeling towards the political party. However, the order of the numbers encoding the feelings does not reflect the natural order of emotions from disgust up to joy;
The variable PARTYID_Strength is a numerically encoded Political Party [self-]Identification. It also does not reflect the natural order from strongest Democrats through Independents to strongest Republicans;
Plan
Convert data from wide into long format using all variables starting with TRAIT, and leaving PARTYID_Strength variable unchanged;
Extract useful information from the TRAIT... variables (Political Party, Feelings Toward the Party);
Convert all numerically encoded variables into the factors with reasonably ordered levels;
Give all variables meaningful names;
Summarize the data;
Transformations
We need to create several lookup tables, which will simplify the workflow.
Affiliation lookup table:
aff_lookup <- c(
'Strong Democrat',
'Not very strong Democrat',
'Strong Republican',
'Not very strong Republican',
'Independent',
'Independent-Democrat',
'Independent-Republican',
'Other'
)
We can further order aff_lookup by this vector:
aff_order <- c(1, 2, 6, 5, 7, 4, 3, 8)
Emotions/Feelings lookup table:
emo_lookup <- c(
'Delighted',
'Angry',
'Happy',
'Annoyed',
'Joy',
'Hateful',
'Relaxed',
'Disgusted'
)
And we can order emo_lookup by this vector:
emo_order <- c(8, 6, 2, 4, 7, 3, 1, 5)
Political party lookup table:
party_lookup <- c(
dem = 'National Democratic Party',
rep = 'National Republican Party'
)
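To make the role of the lookup and order vectors concrete, here is a small base R sketch of how a numeric code becomes a factor whose levels follow the reordered lookup table. The codes vector is a made-up example, not taken from the survey data.

```r
aff_lookup <- c(
  "Strong Democrat",
  "Not very strong Democrat",
  "Strong Republican",
  "Not very strong Republican",
  "Independent",
  "Independent-Democrat",
  "Independent-Republican",
  "Other"
)
aff_order <- c(1, 2, 6, 5, 7, 4, 3, 8)

# Hypothetical PARTYID_Strength codes for four respondents
codes <- c(5, 1, 3, 8)

# Indexing the lookup vector decodes the numbers; the levels argument
# fixes the display order from strongest Democrat to strongest Republican
affiliation <- factor(aff_lookup[codes], levels = aff_lookup[aff_order])

levels(affiliation)
as.integer(affiliation)  # position of each respondent on the reordered scale
```

The same pattern applies to emo_lookup and emo_order.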
Finally, with all helper variables, we can transform our data into desirable form.
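library(tidyverse)
# `dat` is assumed to hold the survey data from the question;
# plain assignment is used because `%<>%` would also need library(magrittr)
dat <- dat %>%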
rename_all(tolower) %>%
pivot_longer(
cols = starts_with('trait'),
names_to = c('party', 'emotion'),
names_pattern = 'trait(dem|rep)(\\d)',
values_to = 'score'
) %>%
mutate(
party = factor(party_lookup[party]),
affiliation = factor(
aff_lookup[partyid_strength],
levels = aff_lookup[aff_order]
),
emotion = factor(
emo_lookup[as.numeric(emotion)],
levels = emo_lookup[emo_order]
)
) %>%
group_by(party, emotion, affiliation) %>%
summarise(score = median(score)) %>%
ungroup()
head(dat)
## A tibble: 6 x 4
# party emotion affiliation score
# <fct> <fct> <fct> <dbl>
#1 National Democratic Party Disgusted Strong Democrat 1
#2 National Democratic Party Disgusted Not very strong Democrat 2
#3 National Democratic Party Disgusted Independent-Democrat 2
#4 National Democratic Party Disgusted Independent 3
#5 National Democratic Party Disgusted Independent-Republican 3
#6 National Democratic Party Disgusted Not very strong Republican 5
Plot the data
Plan
Now we can plot the data, as two separate plots for Democrats and Republicans with Affiliation (Political Party Identification) on X-axis and Emotions (Feelings) on Y-axis.
Each Emotion/Affiliation point is going to be represented as a bar, with the height of the bar representing the Score.
We can also add color encoding to our plot. From my point of view, encoding Emotions/Feelings with a color gradient from red (Disgust) to green (Joy) could help us see the internal structure of our data.
Plot
dat %>%
ggplot(
aes(
x = affiliation,
y = as.numeric(emotion) + (score / max(score) * .95) / 2,
height = (score / max(score) * .95),
width = .95,
fill = emotion,
label = score
)
) +
geom_tile(show.legend = FALSE) +
geom_text(size = 3.5, color = 'gray25', alpha = .75) +
facet_wrap(~ party, scales = 'free') +
scale_fill_brewer(palette = 'RdYlGn') +
scale_y_continuous(breaks = sort(emo_order), labels = emo_lookup[emo_order]) +
labs(x = 'Affiliations', y = 'Emotions') +
ggthemes::theme_tufte() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
axis.ticks.x = element_blank(),
axis.text.y = element_text(hjust = 0, vjust = -0.025),
axis.ticks.y = element_blank()
)
Which gives the following figure:
Explanation
There is a trick with this plot: it looks like a series of barplots, but they are not real barplots (they only look like them; functionally they are something else).
What I do:
The core of this solution is the use of geom_tile() for each data point. A tile is just a rectangle (square by default) whose geometrical center is determined by the given coordinates (Affiliation, Emotion).
Both Affiliation and Emotion are factors, not numerics. That is fine for Affiliation, because we only want to position our tile according to the Affiliation it represents.
It is more complicated with Emotion, because we want to position each tile according to the Emotion it represents, but we also want to encode Score by the height of the tile.
To define the height of the tile we use the height parameter within aes(). We want our tile height to be less than or equal to one (with a 0.05 offset) so that the tiles for, let's say, Angry and Annoyed do not overlap. That's why we use (score / max(score) * .95) for the height parameter.
We also need to give a different y-coordinate to each tile, so that the center of the tile is placed not on the imaginary line representing each emotion, but half a height above it. So when the tile is drawn, its center (on the y-axis) is placed half a height up from the "base line" and the tile extends half a height up and down, creating a fake barplot. That's what the following expression does: as.numeric(emotion) + (score / max(score) * .95) / 2.
We also give the tile a fixed width of .95 via width = .95, fill the tile with a Red-Yellow-Green gradient, and label each tile with the relevant Score.
The rest is just decoration. However, note how we relabel the Y-axis. As defined in aes(), it is a continuous scale, but we want to make it look like a discrete axis, so we use this line:
scale_y_continuous(breaks = sort(emo_order), labels = emo_lookup[emo_order])
Here we just use our emo_order to say that we want breaks at the integers from 1 to 8, and then we label these breaks with the feelings from the ordered emo_lookup table.
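The arithmetic behind the fake bars can be checked with a few numbers. The following base R sketch uses made-up scores for a single emotion at position 2 on the y-axis and shows that every tile starts on the emotion's base line and stays below the next one.

```r
score <- c(1, 3, 5)   # hypothetical scores
emotion_idx <- 2      # hypothetical position of the emotion on the y-axis

# Tile height: scaled so that the largest score is 0.95 units tall
height <- score / max(score) * 0.95

# Tile center: half a height above the emotion's base line
y_center <- emotion_idx + height / 2

# geom_tile() extends half a height below and above the center
bottom <- y_center - height / 2
top    <- y_center + height / 2

data.frame(score, height, y_center, bottom, top)
```

Because bottom always equals emotion_idx and top never exceeds emotion_idx + 0.95, the tiles read as bars growing upward from each emotion's line.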

How do I get multiple subplot in rcharts?

I have the two lines defined in the code below and need to lay them out in two rows. How can I do this in rCharts?
h1 <- Highcharts$new()
h1$chart(type = "line")
h1$series(data = c(1, 3, 2, 4, 5, 4, 6, 2, 3, 5, NA), dashStyle = "longdash")
h1$series(data = c(NA, 4, 1, 3, 4, 2, 9, 1, 2, 3, 4), dashStyle = "shortdot")
h1$legend(symbolWidth = 80)
h1

use R to remove header (6 lines) from .asc file (ESRI ascii grid) and export

I have over 800 .asc files (ESRI ascii grids) that each have a header consisting of 6 lines, then the raster data separated by spaces. Here is a small file as an example. I read it in using read.asciigrid (sp package).
new("SpatialGridDataFrame"
, data = structure(list(mydata.asc = c(4, 4, 4, 4, 3, 4, 4, 4, 1, 1, 1, 1, 1, 4, 4, 4, 4, 3, 4, 4, 4, 1, 1, 1, 1, 1, 4, 4, 4, 4, 3, 4,
4, 4, 1, 1, 1, 1, 1, 4, 4, 4, 4, 3, 4, 4, 4, 6, 1, 1, 1, 1, 4, 4, 4,
4, 3, 4, 4, 4, 6, 1, 1, 1, 1, 4, 4, 4, 4, 3, 4, 4, 4, 6, 1, 1, 1, 1,
4, 4, 4, 4, 4, 4, 4, 4, 6, 6, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 6, 6,
1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 6)), .Names =
"mydata.asc", row.names = c(NA,
-143L), class = "data.frame")
, grid = new("GridTopology"
, cellcentre.offset = c(394984.42630274, 2671265.4912109)
, cellsize = c(25, 25)
, cells.dim = c(13L, 11L) )
, bbox = structure(c(394971.92630274, 2671252.9912109, 395296.92630274,
2671527.9912109), .Dim = c(2L, 2L), .Dimnames = list(NULL, c("min", "max")))
, proj4string = new("CRS"
, projargs = NA_character_ ) )
Here is what the file looks like if you view it with a text editor.
Here are the steps I would like to do
1) read in file
2) remove first 6 lines (header)
3) save file back out as .asc file with the same filename but in a different location
Of course, I'd like to do this to 800 files, but if I can figure out how to do this for one file, I should be able to write a function to loop through all files.
Thanks for any help.
-al
UPDATE:
This is the final code that worked for me, thanks to @Luca Braglia.
# Set working directory
setwd("c:/temp/hdr/ascii")
newdir <- "c:/temp/hdr/ascii_no_hdr/"
# pattern is a regular expression, so match a literal ".asc" at the end of the name
files <- dir(pattern="\\.asc$")
for (my.file in files){
i <- read.table(my.file,skip=6,sep="")
# sep=" " keeps the values separated by spaces in the output file
write.table(i,file=paste(newdir,my.file,sep=""),sep=" ",row.names=FALSE,col.names=FALSE)
}
I didn't want the col and row names. A very simple and effective piece of code.
You can list all the files and read each one inside a for loop, using the skip option of read.table to drop the header:
## you are in the directory with your asc files
files <- dir(pattern="\\.asc$")
# loop
for (my.file in files) {
i <- read.table(my.file, skip = 6, sep = " ")
# change names here if you don't want V1, V2 ...
write.table(i, file = paste("new_dir", my.file, sep = "/"),
sep = " ", row.names = FALSE)
}
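If the only goal is to drop the first six lines, a different approach is to treat each file as plain text with readLines() and writeLines(), which leaves the data rows exactly as they were. This is a sketch under assumptions: the directory names are hypothetical, and a tiny stand-in file is created first so the block runs on its own.

```r
# Hypothetical input and output directories inside the temp directory
in_dir  <- file.path(tempdir(), "ascii")
out_dir <- file.path(tempdir(), "ascii_no_hdr")
dir.create(in_dir,  showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)

# Stand-in .asc file: 6 header lines followed by 2 data rows
writeLines(
  c("ncols 3", "nrows 2", "xllcorner 0", "yllcorner 0",
    "cellsize 25", "NODATA_value -9999",
    "4 4 1", "4 6 1"),
  file.path(in_dir, "demo.asc")
)

files <- list.files(in_dir, pattern = "\\.asc$")

for (f in files) {
  lines <- readLines(file.path(in_dir, f))
  # Drop the 6 header lines and write the rest under the same file name
  writeLines(lines[-seq_len(6)], file.path(out_dir, f))
}

readLines(file.path(out_dir, "demo.asc"))
```

Unlike the read.table route, nothing is parsed here, so number formatting and spacing in the data rows are carried over unchanged.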

Resources