Using Likert Package in R for analyzing real survey data - r

I conducted a survey with 138 questions, of which only a few are Likert-type questions, some with different scales.
I have been trying to use the likert package in R to analyze and graphically portray the data; however, I am seriously struggling to make sense of any of it.
I have gone through the "demos", which are only useful if you already know what is going on with the package. They don't explain any of the steps you have to take before being able to apply the likert package, what can actually be passed to the package, how you rename the variables, etc. All you get is a bunch of code and a rabbit hole to crawl down trying to figure it all out.
I have scoured google for a step by step guide to using the likert package but found nothing.
Can anyone please direct me to a guide or at least perhaps provide the steps I have to take with my dataframe before I can try to use the likert package?
I am hoping to plot a few of my columns (containing the Likert responses) as stacked barplots using this package.
Once I figure out what exactly the likert package will accept in terms of a cleaned-up data frame, I should be able to follow the demo... maybe.
This is what I have done so far, based on my limited knowledge of R and trying to figure things out on my own.
library(likert)
library(dplyr)
fdaff_likert <- select(f2f, RESPID, daff_rate)
fdaff_likert <- data.frame(fdaff_likert)
fdaff_likert <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
The summary of my likert object is:
summary(fdaff_likert)
Item low neutral high mean sd
1 daff_rate 9.977827 37.91574 52.10643 3.802661 1.302508
The plot, however, is all over the place (unordered):
plot (fdaff_likert)
The Likert scale is out of order and not properly centered. In addition, how do I rename the y-axis label to the question text?
For later analysis, how can I break it up by group (based on another column specifying a region in the original data frame)?

library(likert)
set.seed(1)
n <- 138
# An illustrative example
fdaff_likert <- data.frame(
  RESPID = sample(1:5, n, replace = TRUE),
  # Labels are deliberately in alphabetical order to mimic the unordered plot
  daff_rate = factor(sample(1:5, n, replace = TRUE), labels = c("Good","Neither","Poor","Very Good","Very Poor"))
)
fdaff_likert1 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with unordered categories
plot(fdaff_likert1)
# Reorder levels of daff_rate factor
fdaff_likert$daff_rate <- factor(fdaff_likert$daff_rate,
levels=levels(fdaff_likert$daff_rate)[c(5,3,2,1,4)])
fdaff_likert2 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with ordered categories
plot(fdaff_likert2)
Here is an illustrative example for creating a plot with grouped items.
set.seed(1)
fdaff_likert <- data.frame(
  country = factor(sample(1:3, n, replace = TRUE), labels = c("US","Mexico","Canada")),
  item1 = factor(sample(1:5, n, replace = TRUE), labels = c("Very Poor","Poor","Neither","Good","Very Good")),
  item2 = factor(sample(1:5, n, replace = TRUE), labels = c("Very Poor","Poor","Neither","Good","Very Good")),
  item3 = factor(sample(1:5, n, replace = TRUE), labels = c("Very Poor","Poor","Neither","Good","Very Good"))
)
names(fdaff_likert) <- c("Country",
"1. I read only if I have to",
"2. Reading is one of my favorite hobbies",
"3. I find it hard to finish books")
fdaff_likert3 <- likert(items=fdaff_likert[,2:4], grouping=fdaff_likert[,1])
plot(fdaff_likert3)
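For the "what do I have to do to my data frame first" part of the question, here is a minimal sketch of the preparation step in base R. It assumes the raw survey column holds integer codes 1 to 5; the data, the `region` column and the question wording are made up for illustration. The idea is that each item becomes a factor with the full set of levels in the intended order, and the column name doubles as the item label on the plot.

```r
# Hypothetical raw data: integer-coded responses plus a grouping column
set.seed(1)
raw <- data.frame(
  region    = factor(sample(c("North", "South"), 50, replace = TRUE)),
  daff_rate = sample(1:5, 50, replace = TRUE)
)

# Levels in the order they should appear on the plot (lowest to highest)
scale_labels <- c("Very Poor", "Poor", "Neither", "Good", "Very Good")

# Convert codes to a factor; naming all levels keeps categories nobody chose
items <- data.frame(
  daff_rate = factor(raw$daff_rate, levels = 1:5, labels = scale_labels)
)

# The column name is used as the item label, so rename it to the question text
names(items) <- "Placeholder question text"

# Inspect the result before handing it to likert()
str(items)
```

A data frame prepared like this can then be passed on in the same way as in the examples above, e.g. `likert(items = items)`, with `raw$region` as a candidate for the `grouping` argument.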

Related

Is there a way to add species to an ISOMAP plot in R?

I am using the isomap function from the vegan package in R to analyse community data of epiphytic mosses and lichens. I started analysing the data using NMDS, but due to the structure of the data I ran into problems, which is why I switched to ISOMAP, which works perfectly well and returns very nice results. So far so good... However, the output of the function does not support plotting species within the ISOMAP plot, as species scores are not available. Still, I would really like to add species information to enhance the interpretability of the output.
Does anyone have a solution or hint for this problem? Is there a way to add species post hoc to the plot, as can be done with environmental data?
I would greatly appreciate any help on this topic!
Thank you and best regards,
Inga
No, there is no function to add species scores to an isomap result. If there were one, it could look like this:
`sppscores<-.isomap` <-
function(object, value)
{
value <- scale(value, center = TRUE, scale = FALSE)
v <- crossprod(value, object$points)
attr(v, "data") <- deparse(substitute(value))
object$species <- v
object
}
Or alternatively:
`sppscores<-.isomap` <-
function(object, value)
{
wa <- vegan::wascores(object$points, value, expand = TRUE)
attr(wa, "data") <- deparse(substitute(value))
object$species <- wa
object
}
If ord is your isomap result and comm is your community data, you can use either of these as:
sppscores(ord) <- comm # either alternative
I have no idea (yet) which of these alternatives is more correct. The first adds species scores as vectors of their linear increase, the second as their weighted averages in ordination space, but expanded so that some species are allowed to be more extreme than the site units where they occur.
Either one adds a new element, species, to the result object ord. Using it throughout vegan would need more coding, but you can extract the species scores with vegan::scores. Their scaling is based on the original scale of the community data and may be badly scaled relative to the points of the site units; fixing this would require more work. However, you can plot them separately, or multiply them by a constant that gives a scaling similar to the site unit scores.
sp <- scores(ord, display="species", choices=1:2)
plot(sp, type = "n", asp = 1) # does not allow plotting text
text(sp, labels = rownames(sp)) # so we must add text
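The "multiply with a constant" step can be sketched in base R. The two matrices below are random stand-ins for the site scores (ord$points) and the species scores extracted above, and matching the largest absolute coordinate is only one possible choice of constant, not something vegan prescribes.

```r
# Stand-in data (assumption): 10 site units and 5 species in two dimensions
set.seed(42)
sites <- matrix(rnorm(20, sd = 0.5), ncol = 2,
                dimnames = list(paste0("site", 1:10), c("Dim1", "Dim2")))
sp <- matrix(rnorm(10, sd = 40), ncol = 2,
             dimnames = list(paste0("sp", 1:5), c("Dim1", "Dim2")))

# One constant for both axes, so the directions of the species are preserved
const <- max(abs(sites)) / max(abs(sp))
sp_scaled <- sp * const

# Sites as points, rescaled species as labels in the same plot
plot(sites, asp = 1, pch = 16, xlab = "Dim1", ylab = "Dim2")
text(sp_scaled, labels = rownames(sp_scaled), col = "red")
```

Using a single multiplier rather than one per axis keeps the relative positions of the species unchanged; only their overall spread is matched to that of the site units.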

Understanding the output from a factor analysis using the FAMD function

I have some survey data where people were asked questions and given a yes or no option (1=yes, 0=no). I would like to be able to pick out some patterns in this data.
The questions are:
Do you enjoy XX work?
Do you do XX work alone?
Has your workload increased?
Do you have a backlog of work?
I would like to know whether people who work alone are more likely to have an increased workload, a backlog of work and not enjoy their job. To answer this, I think factor analysis is the way to go but I'm struggling to interpret the output.
Here is an example of my data:
enjoy <- c(1,1,0,1,0,1,0,0,0,1)
alone <- c(0,0,1,1,1,0,0,1,1,0)
workload <- c(0,0,1,1,0,1,0,0,0,1)
backlog <- c(0,0,1,1,0,1,0,0,0,0)
data <- data.frame(enjoy, alone, workload, backlog)
library(dplyr) # provides %>% and mutate_if
data <- data %>% mutate_if(is.numeric, as.character) ## convert from numeric to categorical
I'm using the FAMD function in FactoMineR as this can use categorical data.
library(FactoMineR)
data_famd <- FAMD(data, graph = FALSE)
Then using factoextra, I can see which variables contribute to each axis
library(factoextra)
# Contribution to the first dimension
fviz_contrib(data_famd, "var", axes = 1) ## backlog & workload
# Contribution to the second dimension
fviz_contrib(data_famd, "var", axes = 2) ## enjoy and alone
Then I can make this plot:
fviz_mfa_ind(data_famd,
habillage = "alone", # color by groups
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
addEllipses = TRUE, ellipse.type = "confidence",
repel = TRUE) # Avoid text overlapping
It looks like people who work alone and people who don't answer the questions differently. But I don't understand what answers people who work alone (yellow) are giving compared with people who don't. The groups are clearly distinct, so they are answering differently.
My main question is: What do the axes mean? I've done PCAs using continuous data before, and using the loadings I can figure out what the axes mean and therefore interpret these graphs. How do you do this for a factor analysis? Is there a different package?
Thanks for any help.

R - Drawing multiple boxplots of different variables with the same scale / index in the same plot

Let's say I have 2 variables with 20 data points each (0, 1, 2 or 3) and I want to plot boxplots of each of the variables so that they share the y-axis in one diagram. How do I do this easily?
Writing boxplot(var1,var2,data = mydata) didn't work...
You should provide reproducible data and you should tell us what error message(s) you received with your code. When a function does not perform as expected, it is good to read the manual page (?boxplot). Here are some made-up data similar to yours:
set.seed(42)
mydata <- data.frame(var1=sample(0:3, 20, replace=TRUE), var2=sample(0:3, 20, replace=TRUE))
Then the box plot is just
boxplot(mydata) # Or boxplot(mydata[, c("var1", "var2")]) if you are excluding other columns
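The same plot can also be produced with the formula interface of boxplot, which is the form that accepts a data argument. A small sketch, assuming the same made-up data as above:

```r
set.seed(42)
mydata <- data.frame(var1 = sample(0:3, 20, replace = TRUE),
                     var2 = sample(0:3, 20, replace = TRUE))

# stack() turns the wide data frame into two columns: values and ind
long <- stack(mydata)

# One box per level of ind, all drawn against the same y-axis
bp <- boxplot(values ~ ind, data = long, xlab = "Variable", ylab = "Score")
```

The long format is handy when more variables are added later, since the call stays the same.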

R: Cleaning GGally Plots

I am using the R programming language and I am new to the GGally library. I followed some basic tutorials online and ran the following code:
#load libraries
library(GGally)
library(survival)
library(plotly)
I changed some of the data types:
#manipulate the data
data(lung)
data = lung
data$sex = as.factor(data$sex)
data$status = as.factor(data$status)
data$ph.ecog = as.factor(data$ph.ecog)
Now I visualize:
#make the plots
#I dont know why, but this comes out messy
ggparcoord(data, groupColumn = "sex")
#Cleaner
ggparcoord(data)
Both ggparcoord() calls ran successfully; however, the first one came out pretty messy (the axis labels seem to have been corrupted). Is there a way to fix the labels?
In the second graph, it is difficult to tell how the factor variables are labelled on their respective axes (e.g. for the "sex" column, is "male" the bottom point or is "female" the bottom point?). Does anyone know if there is a way to fix this?
Finally, is there a way to use the "ggplotly()" function for "ggally" objects?
e.g.
a = ggparcoord(data)
ggplotly(a)
Thanks
Looks like your data columns get converted to a factor when adding the groupColumn. To prevent that, you could exclude the groupColumn from the columns to be plotted, as in the code below.
BTW: I'm not sure about the general case, but at least for ggparcoord, ggplotly works.
library(GGally)
library(survival)
data(lung)
data = lung
data$sex = as.factor(data$sex)
data$status = as.factor(data$status)
data$ph.ecog = as.factor(data$ph.ecog)
# Plot every column except the grouping column "sex"
ggparcoord(data, seq(ncol(data))[!names(data) %in% "sex"], groupColumn = "sex")

Likert grouping with different levels in R

I would like to use the likert package, group by a variable and plot the result. The problem is that the variables I want to visualise have different levels. Is there a way around this?
A simple example to illustrate my problem:
library(reshape)
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[1:4] <- lapply(foo[1:4], as.factor)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
error message:
Error in likert(foo[, c(2:4)], grouping = foo$car) :
All items (columns) must have the same number of levels
Same as first answer, but now as a function of group.
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
Well, I can't add a comment until I get more reputation points, so I'm breaking the "responding to other answers" guidance, but I wouldn't want other R newbies like me to waste the time I just did figuring out that this line in the original question:
library(reshape)
breaks the answer provided by Ruthger.
So the code you need to generate Ruthger's plot is just the following (I tested this with R 3.3.1, having followed the likert installation instructions at the bottom of https://github.com/jbryer/likert):
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
Your underlying levels are in reality the same; you just have to tell your data frame that they exist (the range has to cover the largest value in the data, which is 10 here, otherwise that value becomes NA):
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
Then you can plot. (But how the grouping argument works remains a mystery; it's not clear from the help of that package.)
likt <- likert(foo[,c(2:4)])
plot(likt)
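To see why the error appears and what the fix changes, it can help to count the levels before and after. This is a base R sketch using the data from the question; the names `before`, `after` and `na_count` are just illustrative helpers.

```r
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
                  satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
                  quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
                  loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8))

# as.factor() keeps only the values that occur, so the level counts differ
before <- sapply(lapply(foo[2:4], as.factor), nlevels)

# Declaring the full scale gives every item the same set of levels
foo[2:4] <- lapply(foo[2:4], factor, levels = 1:10)
after <- sapply(foo[2:4], nlevels)

# Values outside the declared levels would silently become NA, so check
na_count <- colSums(is.na(foo[2:4]))

print(before)
print(after)
print(na_count)
```

If `na_count` is non-zero for any item, the declared range of levels is too narrow for the data.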
