I would like to use the likert package, group by a variable, and plot the result. The problem is that the variables I want to visualise have different numbers of levels. Is there a way around this?
A simple example to illustrate my problem:
library(reshape)
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[1:4] <- lapply(foo[1:4], as.factor)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
error message:
Error in likert(foo[, c(2:4)], grouping = foo$car) :
All items (columns) must have the same number of levels
Same as first answer, but now as a function of group.
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
I can't add a comment until I have more reputation points, so I'm breaking the "responding to other answers" guidance, but I wouldn't want other R newbies like me to waste the time I just spent figuring out that this line in the original question:
library(reshape)
breaks the answer provided by Ruthger.
So the code you need to generate Ruthger's plot is just the following (I tested this with R 3.3.1, having followed the likert installation instructions at the bottom of https://github.com/jbryer/likert):
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
Your underlying levels are in reality the same, you just have to tell your data frame that they exist:
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
Then you can plot. (How the grouping argument works remains a mystery, though; it's not clear from the package's help.)
likt <- likert(foo[,c(2:4)])
plot(likt)
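To see why declaring the levels helps, here is a minimal base R sketch using only the three item columns from the question. It counts the levels per column before and after, and shows what would happen if the shared level set stopped short of the largest value (loyalty contains a 10). The variable names are mine, for illustration only.

```r
# Item columns copied from the question (grouping column left out)
items <- data.frame(
  satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
  quality      = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
  loyalty      = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8)
)

# as.factor() keeps only the values that occur, so the level counts differ
items[] <- lapply(items, as.factor)
levels_before <- sapply(items, nlevels)
print(levels_before)

# Declaring one shared level set gives every column the same levels
items_fixed <- items
items_fixed[] <- lapply(items_fixed, factor, levels = 1:10)
levels_after <- sapply(items_fixed, nlevels)
print(levels_after)

# A level set that is too short silently turns the missing value into NA
lost_with_1_to_9 <- sum(is.na(factor(items$loyalty, levels = 1:9)))
print(lost_with_1_to_9)
```

Checking `sapply(df, nlevels)` before calling likert() is a quick way to spot the column that triggers the "same number of levels" error.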
I have some survey data where people were asked questions and given a yes or no option (1=yes, 0=no). I would like to be able to pick out some patterns in this data.
The questions are:
Do you enjoy XX work?
Do you do XX work alone?
Has your workload increased?
Do you have a backlog of work?
I would like to know whether people who work alone are more likely to have an increased workload, a backlog of work and not enjoy their job. To answer this, I think factor analysis is the way to go but I'm struggling to interpret the output.
Here is an example of my data:
enjoy <- c(1,1,0,1,0,1,0,0,0,1)
alone <- c(0,0,1,1,1,0,0,1,1,0)
workload <- c(0,0,1,1,0,1,0,0,0,1)
backlog <- c(0,0,1,1,0,1,0,0,0,0)
data <- data.frame(enjoy, alone, workload, backlog)
library(dplyr)
data <- data %>% mutate_if(is.numeric, as.character) ## convert from numeric to categorical
I'm using the FAMD function in FactoMineR, as it can handle categorical data.
library(FactoMineR)
data_famd <- FAMD(data, graph = FALSE)
Then using factoextra, I can see which variables contribute to each axis
library(factoextra)
# Contribution to the first dimension
fviz_contrib(data_famd, "var", axes = 1) ## backlog & workload
# Contribution to the second dimension
fviz_contrib(data_famd, "var", axes = 2) ## enjoy and alone
Then I can make this plot:
fviz_mfa_ind(data_famd,
habillage = "alone", # color by groups
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
addEllipses = TRUE, ellipse.type = "confidence",
repel = TRUE) # Avoid text overlapping
It looks like people who work alone answer the questions differently from people who don't. But I don't understand which answers people who work alone (yellow) are giving compared with people who don't work alone. The two groups are clearly distinct, so they must be answering differently.
My main question is: what do the axes mean? I've done PCAs using continuous data before, and using the loadings I can figure out what the axes mean and therefore interpret these graphs. How do you do this for a factor analysis? Is there a different package?
Thanks for any help.
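Separately from the factor analysis, one simple way to see which answers each group gives is to compare the share of "yes" answers between people who work alone and people who don't. This is only a descriptive sketch in base R on the ten example rows above, not an interpretation of the FAMD axes; the names `survey` and `yes_share` are mine.

```r
# Example data copied from the question (1 = yes, 0 = no)
survey <- data.frame(
  enjoy    = c(1,1,0,1,0,1,0,0,0,1),
  alone    = c(0,0,1,1,1,0,0,1,1,0),
  workload = c(0,0,1,1,0,1,0,0,0,1),
  backlog  = c(0,0,1,1,0,1,0,0,0,0)
)

# The mean of a 0/1 column is the share of "yes" answers, computed per group
yes_share <- aggregate(cbind(enjoy, workload, backlog) ~ alone,
                       data = survey, FUN = mean)
print(yes_share)

# Cross-tabulation of one pair of questions, as row proportions
print(prop.table(table(alone = survey$alone, enjoy = survey$enjoy), margin = 1))
```

With so few rows these shares are only indicative, but the same two calls work unchanged on the full survey.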
I am trying to change the order in which my Likert items are being plotted with the Likert package and so far I haven't been very successful. Let's consider the following minimal code to reproduce my error. I would like to see plotting with a specific custom (or at least Question 1 to Question 4, top to bottom) ordering. I have tried several things (based on some questions and answers on here), and all failed.
First the data:
question1<- c(1,5,3,4,1,1,1,3,4,5)
question2<- rev(c(1,5,3,4,1,1,1,3,4,5))
question3<- c(1,1,1,2,2,2,3,3,4,5)
question4<- c(5,5,5,4,4,4,3,3,2,1)
testData<-data.frame(question1,question2,question3,question4)
testData <- lapply(testData, factor, levels= c(1:5), ordered = TRUE)
testData <- as.data.frame(testData)
Then plotting attempt #1:
p <- (likert(testData))
plot(p)
Gives me the following:
Plotting attempt number 2 (close enough but order is reversed and this does not give me a solution for any random ordering):
p <- (likert(testData))
plot(p, ordered=FALSE)
Gives me this:
Plotting attempt #3:
p <- (likert(testData))
p$Item <- factor(p$Item, levels = rev(c("question1", "question2", "question3", "question4")))
plot(p)
Also does not work. Would anyone have any idea how to solve this?
Thanks in advance.
Ok turns out I found the answer, doing the following works:
p <- (likert(testData))
plot(p, group.order = c("question1", "question2", "question3", "question4"))
I am using the R programming language and I am new to the GGally library. I followed some basic tutorials online and ran the following code:
#load libraries
library(GGally)
library(survival)
library(plotly)
I changed some of the data types:
#manipulate the data
data(lung)
data = lung
data$sex = as.factor(data$sex)
data$status = as.factor(data$status)
data$ph.ecog = as.factor(data$ph.ecog)
Now I visualize:
#make the plots
#I dont know why, but this comes out messy
ggparcoord(data, groupColumn = "sex")
#Cleaner
ggparcoord(data)
Both ggparcoord() code segments successfully ran, however the first one came out pretty messy (the axis labels seem to have been corrupted). Is there a way to fix the labels?
In the second graph, it is difficult to tell how the factor variables are labelled on their respective axes (e.g. for the "sex" column, is "male" the bottom point or is "female"?). Does anyone know if there is a way to fix this?
Finally, is there a way to use the "ggplotly()" function for "ggally" objects?
e.g.
a = ggparcoord(data)
ggplotly(a)
Thanks
Looks like your data columns get converted to factors when you add the groupColumn. BTW: I'm not sure about the general case, but at least for ggparcoord, ggplotly works.
To prevent the conversion, you could exclude the groupColumn from the columns to be plotted:
library(GGally)
library(survival)
data(lung)
data = lung
data$sex = as.factor(data$sex)
data$status = as.factor(data$status)
data$ph.ecog = as.factor(data$ph.ecog)
# Plot every column except the grouping column
ggparcoord(data, seq(ncol(data))[!names(data) %in% "sex"], groupColumn = "sex")
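The column selection can also be written with which(), which may be easier to read. The sketch below uses the built-in iris data as a stand-in for lung (an assumption, so that it runs without the survival package), and only calls ggparcoord if GGally is installed.

```r
# iris stands in for the lung data here (assumption for illustration)
group_col <- "Species"

# Indices of every column except the grouping column
plot_cols <- which(names(iris) != group_col)
print(plot_cols)

# Build the plot only if GGally is available
if (requireNamespace("GGally", quietly = TRUE)) {
  p <- GGally::ggparcoord(iris, columns = plot_cols, groupColumn = group_col)
}
```

For the lung data the same pattern would be `which(names(data) != "sex")` passed as `columns`.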
I am a beginner at this and am really lost about it.
I would like to create a horizon chart that shows the percentage change in sales for the different towns using ggplot2 and R. Would anyone guide me in the approach I can take to create the chart?
The data that I have looks like this.
This is the type of chart I would like to do.
(source: https://harmoniccode.blogspot.com/2017/11/friday-fun-li-horizon-charts.html)
Thanks in advance for any help given!
Edit: here's a sample code of the data:
x <- data.frame(
"town" = c('sad','sad','sad','sad','happy','happy','happy','happy'),
"month" = c("2017-01","2017-02","2017-03","2017-04","2017-01","2017-02","2017-03","2017-04"),
"median_sales" = c(336500,355000,375000,395000,359000,361500,360000,375000),
"percentage_change" = c(NA,5.4977712,5.6338028,5.3333333,NA,0.6963788,-0.4149378,4.1666667)
)
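library(dplyr)     # %>% and mutate()
library(lubridate) # floor_date() and as_date()
library(zoo)       # as.yearmon()
x <- x %>%
mutate(month = floor_date(as_date(as.yearmon(month)), "month"))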
It would be helpful to give an example that will result in a reasonable plot, and to provide your example data as data rather than an image.
If you google 'horizon plot' the first answer should give you what you need.
Here is a simple example based on the data you gave:
library(latticeExtra)
sales.ts <- ts(matrix(sales$median_sales, ncol=2), names = c("sad", "happy"),
start = c(2017, 1), frequency = 365)
horizonplot(sales.ts)
I think this is correctly presenting your results, but again hard to tell as you haven't given a realistic dataset.
UPDATE: based on the data provided, this is the answer. Again, as you've only provided one time point a horizonplot is probably not what you want. They are designed to plot time series.
x.ts <- ts(matrix(x$median_sales, ncol=2), names = c("sad", "happy"),
start = c(2017, 1), frequency = 12)
horizonplot(x.ts)
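The reshaping step above can be checked in base R before plotting. This sketch rebuilds the two-column monthly series from the sample values and recomputes the month-over-month percentage change. The seventh sales value is written as 360000, which is an assumption on my part: it is the value that the reported -0.41% change implies. The names `sales_mat`, `sales_ts` and `pct_change` are mine.

```r
# Values copied from the question; the seventh is taken as 360000 (assumption)
median_sales <- c(336500, 355000, 375000, 395000,   # town "sad"
                  359000, 361500, 360000, 375000)   # town "happy"

# matrix() fills column by column, so each town ends up in its own column
sales_mat <- matrix(median_sales, ncol = 2)

# Monthly series starting in January 2017
sales_ts <- ts(sales_mat, names = c("sad", "happy"),
               start = c(2017, 1), frequency = 12)
print(sales_ts)

# Month-over-month percentage change per town
pct_change <- 100 * diff(sales_mat) / sales_mat[-nrow(sales_mat), ]
print(round(pct_change, 2))
```

Note that this relies on the rows being sorted by town and then by month; otherwise the matrix columns would mix towns.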
I conducted a survey with 138 questions on it, of which only a few are likert type questions with some having different scales.
I have been trying to use the Likert package in R to analyze and graphically portray the data, however, I am seriously struggling to make sense of any of it.
I have gone through the "demos" which are only useful if you already know what is going on with the package. It doesn't explain any of the steps you have to take before being able to apply the likert package, what can actually be applied to the package, how you rename the variables etc.. All you get is a bunch of code and a rabbit hole to crawl down trying to figure it all out.
I have scoured google for a step by step guide to using the likert package but found nothing.
Can anyone please direct me to a guide or at least perhaps provide the steps I have to take with my dataframe before I can try to use the likert package?
I am hoping to fit a few of my columns(containing the likert responses) to stacked barplots using this package.
Once I figure out what exactly the Likert package will accept in terms of a cleaned up data frame, I should be able to follow the demo... maybe..
This is what I have done so far, based on my limited knowledge of R and trying to figure things out on my own.
library(likert)
library(dplyr)
fdaff_likert <- select(f2f, RESPID, daff_rate)
fdaff_likert <- data.frame(fdaff_likert)
fdaff_likert <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
the output of my likert is:
summary(fdaff_likert)
       Item      low  neutral     high     mean       sd
1 daff_rate 9.977827 37.91574 52.10643 3.802661 1.302508
The plot, however, is all over the place.. (unordered)
plot (fdaff_likert)
The likert scale is out of order and not properly centered. In addition, how do I rename the y-axis to the question?
For later analysis, how can I break it up by group (based on another column in the original data frame that specifies a region)?
library(likert)
set.seed(1)
n <- 138
# An illustrative example
fdaff_likert <- data.frame(
RESPID=sample(1:5, n, replace=TRUE),
daff_rate=factor(sample(1:5, n, replace=TRUE), labels=c("Good","Neither","Poor","Very Good","Very Poor"))
)
fdaff_likert1 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with unordered categories
plot(fdaff_likert1)
# Reorder levels of daff_rate factor
fdaff_likert$daff_rate <- factor(fdaff_likert$daff_rate,
levels=levels(fdaff_likert$daff_rate)[c(5,3,2,1,4)])
fdaff_likert2 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with ordered categories
plot(fdaff_likert2)
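Reordering by index works, but it depends on knowing the current (alphabetical) order of the levels. A sketch of an alternative, in base R only, is to spell out the scale order by name; the short `responses` vector here is made up for illustration.

```r
# A few made-up responses, stored as text
responses <- c("Good", "Poor", "Very Good", "Neither", "Very Poor", "Good")

# By default factor() sorts the levels alphabetically
alphabetical <- factor(responses)
print(levels(alphabetical))

# Naming the levels explicitly puts them in scale order
scale_order <- c("Very Poor", "Poor", "Neither", "Good", "Very Good")
ordered_responses <- factor(responses, levels = scale_order)
print(levels(ordered_responses))

# The underlying codes now follow the scale (1 = Very Poor, 5 = Very Good)
print(as.integer(ordered_responses))
```

Any response not listed in `scale_order` would become NA, so it is worth checking `sum(is.na(...))` after the conversion.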
Here is an illustrative example for creating a plot with grouped items.
set.seed(1)
fdaff_likert <- data.frame(
country=factor(sample(1:3, n, replace=TRUE), labels=c("US","Mexico","Canada")),
item1=factor(sample(1:5, n, replace=TRUE), labels=c("Very Poor","Poor","Neither","Good","Very Good")),
item2=factor(sample(1:5, n, replace=TRUE), labels=c("Very Poor","Poor","Neither","Good","Very Good")),
item3=factor(sample(1:5, n, replace=TRUE), labels=c("Very Poor","Poor","Neither","Good","Very Good"))
)
names(fdaff_likert) <- c("Country",
"1. I read only if I have to",
"2. Reading is one of my favorite hobbies",
"3. I find it hard to finish books")
fdaff_likert3 <- likert(items=fdaff_likert[,2:4], grouping=fdaff_likert[,1])
plot(fdaff_likert3)