Change order of plotting in Likert Scale - r

I am trying to change the order in which my Likert items are being plotted with the Likert package and so far I haven't been very successful. Let's consider the following minimal code to reproduce my error. I would like to see plotting with a specific custom (or at least Question 1 to Question 4, top to bottom) ordering. I have tried several things (based on some questions and answers on here), and all failed.
First the data:
question1<- c(1,5,3,4,1,1,1,3,4,5)
question2<- rev(c(1,5,3,4,1,1,1,3,4,5))
question3<- c(1,1,1,2,2,2,3,3,4,5)
question4<- c(5,5,5,4,4,4,3,3,2,1)
testData<-data.frame(question1,question2,question3,question4)
testData <- lapply(testData, factor, levels= c(1:5), ordered = TRUE)
testData <- as.data.frame(testData)
Then plotting attempt #1:
library(likert)
p <- likert(testData)
plot(p)
Gives me the following:
Plotting attempt #2 (close enough, but the order is reversed, and this does not give me a way to set an arbitrary ordering):
p <- (likert(testData))
plot(p, ordered=FALSE)
Gives me this:
Plotting attempt #3:
p <- (likert(testData))
p$Item <- factor(p$Item, levels = rev(c("question1", "question2", "question3", "question4")))
plot(p)
Also does not work. Would anyone have any idea how to solve this?
Thanks in advance.

OK, it turns out I found the answer. Doing the following works:
p <- (likert(testData))
plot(p, group.order = c("question1", "question2", "question3", "question4"))

Related

Plot cut dendrogram with class labels

In the following example:
hc <- hclust(dist(mtcars))
hcd <- as.dendrogram((hc))
hcut4 <- cutree(hc,h=200)
class(hcut4)
plot(hcd,ylim=c(190,450))
I'd like to add the labels of the classes.
I can do:
hcd4 <- cut(hcd,h=200)$upper
plot(hcd4)
Besides the fact that the labels are oddly shifted, does the numbering of the branches from cut() always correspond to the classes in hcut4?
In this case, they do:
hcd4cut <- cutree(hcd4, h=200)
hcd4cut
But is this the general case?
The example using dendextend (Label and color leaf dendrogram in r) is nice
library(dendextend)
colorCodes <- c("red","green","blue","cyan")
labels_colors(hcd) <- colorCodes[hcut4][order.dendrogram(hcd)]
plot(hcd)
Unfortunately, I always have many individuals, so plotting individuals is rarely a useful option for me.
I can do:
hcd <- as.dendrogram((hc))
hcd4 <- cut(hcd,h=200)$upper
and I can add colors
hcd4cut <- cutree(hcd4, h=200)
labels_colors(hcd4) <- colorCodes[hcd4cut][order.dendrogram(hcd4)]
plot(hcd4)
but the following does not work:
plot(hcd4,labels=hcd4cut)
Is there a better way to plot the cut dendrogram, labelling branches according to the classes (consistent with the result of cutree())?
This is an example of what I would need (class labels edited on the picture), but note that the problem is that I do not know if the labels are actually at the right branch:
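One way to check the correspondence rather than assume it is sketched below in base R. It relies on cut() labelling the leaves of the upper tree "Branch 1", "Branch 2", ... in the same order as the elements of $lower, which is how current versions of stats appear to behave but is treated here as an assumption. The names classes, pieces, branch_class and upper are only illustrative.

```r
hc  <- hclust(dist(mtcars))
hcd <- as.dendrogram(hc)

classes <- cutree(hc, h = 200)   # named vector: car name -> class number
pieces  <- cut(hcd, h = 200)     # list with $upper and $lower

# For every lower branch, look up the cutree() class of its members.
# Both objects are cut at the same height, so each branch is expected
# to map to exactly one class; stopifnot() guards that expectation.
branch_class <- vapply(
  pieces$lower,
  function(b) {
    cl <- unique(classes[labels(b)])
    stopifnot(length(cl) == 1)
    unname(cl)
  },
  integer(1)
)

# Replace the "Branch i" leaf labels of the upper tree by the class numbers.
upper <- dendrapply(pieces$upper, function(n) {
  if (is.leaf(n)) {
    i <- as.integer(sub("Branch ", "", attr(n, "label")))
    attr(n, "label") <- paste("class", branch_class[i])
  }
  n
})
plot(upper)
```

If branch_class is not simply 1, 2, 3, ... in order, the branch numbers from cut() and the class numbers from cutree() differ. As far as I understand, this can happen because cutree() numbers classes by the order of the observations in the data, while cut() numbers branches from left to right in the tree.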

How to cut a dendrogram in r

Okay so I'm sure this has been asked before but I can't find a nice answer anywhere after many hours of searching.
I have some data, I run a classification then I make a dendrogram.
The problem has to do with aesthetics, specifically: (1) how to cut according to the number of groups (in this example I want 3), (2) how to align the group labels with the branches of the tree, and (3) how to re-scale so that there aren't any huge gaps between the groups.
More on (3): I have a dataset which is very species-rich, and there would be ~1000 groups without cutting. If I cut at, say, 3, the tree has some branches on the right and one 'miles' off to the right, which I would want to re-scale so that it's closer. All of this is possible via external programs, but I want to do it all in R!
Bonus points if you can put an average silhouette width plot nested into the top right of this plot.
Here is an example using the iris data:
library(ggplot2)
library(vegan) # provides vegdist()
data(iris)
df = data.frame(iris)
df$Species = NULL
ED10 = vegdist(df,method="euclidean")
EucWard_10 = hclust(ED10,method="ward.D2")
hcd_ward10 = as.dendrogram(EucWard_10)
plot(hcd_ward10)
plot(cut(hcd_ward10, h = 10)$upper, main = "Upper tree of cut at h=10")
I suspect what you want to look at is the dendextend R package (it also has a paper in the journal Bioinformatics).
I am not fully sure about your question (3), since I am not sure I understand what rescaling means. What I can tell you is that you can do quite a lot with dendextend. Here is a quick example of coloring the branches and labels for 3 groups.
library(ggplot2)
library(vegan)
data(iris)
df = data.frame(iris)
df$Species = NULL
ED10 = vegdist(df,method="euclidean")
EucWard_10 = hclust(ED10,method="ward.D2")
hcd_ward10 = as.dendrogram(EucWard_10)
plot(hcd_ward10)
# install.packages("dendextend") # run once if the package is not installed
library(dendextend)
dend <- hcd_ward10
dend <- color_branches(dend, k = 3)
dend <- color_labels(dend, k = 3)
plot(dend)
You can also get an interactive dendrogram by using plotly (ggplot method is available through dendextend):
library(plotly)
library(ggplot2)
p <- ggplot(dend)
ggplotly(p)
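For point (1) of the question, cutting by the number of groups rather than by height, a minimal base R sketch is shown here. It assumes that dist() is an acceptable stand-in for vegdist() with Euclidean distance; the names hc and groups are illustrative.

```r
# Same kind of clustering as above, using only base R.
hc <- hclust(dist(iris[, 1:4], method = "euclidean"), method = "ward.D2")

groups <- cutree(hc, k = 3)   # group membership for every row of iris
table(groups)                 # group sizes

# Draw the tree without the 150 leaf labels and box the three groups.
plot(hc, labels = FALSE, hang = -1, main = "Ward clustering, k = 3")
rect.hclust(hc, k = 3, border = c("red", "green", "blue"))
```

cutree() returns the group of each observation, so the same vector can also be used for summaries or for colouring other plots.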

Likert grouping with different levels in R

I would like to use the likert package, group by a variable, and plot the result. The problem is that the variables I want to visualise have different levels. Is there a way around this?
A simple example to illustrate my problem:
library(reshape)
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[1:4] <- lapply(foo[1:4], as.factor)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
error message:
Error in likert(foo[, c(2:4)], grouping = foo$car) :
All items (columns) must have the same number of levels
Same as first answer, but now as a function of group.
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
Well I can't add a comment until I get more reputation points so I'm breaking the "responding to other answers" guidance - but I wouldn't want other R newbies like me to waste the time I just have figuring out that the line in the original question:
library(reshape)
breaks the answer provided by Ruthger.
So the code you need to generate Ruthger's plot is just the following (I tested this with R 3.3.1, having followed the likert installation instructions at the bottom of https://github.com/jbryer/likert):
library(likert)
foo <- data.frame(car = rep(c("Toyota", "BMW", "Ford"), times = 10),
satisfaction = c(1,3,4,7,7,6,2,3,5,5,5,2,4,1,7),
quality = c(1,1,3,5,4,3,6,4,3,6,6,1,7,2,7),
loyalty = c(1,1,3,5,4,3,9,4,3,10,6,1,7,2,8) )
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
likt <- likert(foo[,c(2:4)], grouping = foo$car)
plot(likt)
Your underlying levels are in reality the same; you just have to tell your data frame that they exist. The scale has to cover every value that occurs (loyalty goes up to 10), otherwise the missing values become NA:
foo[2:4] <- lapply(foo[2:4], factor, levels=1:10)
Then you can plot. (How the grouping argument works remains a mystery to me; it's not clear from the help of that package.)
likt <- likert(foo[,c(2:4)])
plot(likt)
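To see why setting the levels explicitly resolves the error, here is a small base R sketch using two of the columns from the question. The names f_quality and f_loyalty are illustrative.

```r
quality <- c(1, 1, 3, 5, 4, 3, 6, 4, 3, 6, 6, 1, 7, 2, 7)
loyalty <- c(1, 1, 3, 5, 4, 3, 9, 4, 3, 10, 6, 1, 7, 2, 8)

# as.factor() keeps only the values that actually occur,
# so the two items end up with different numbers of levels.
nlevels(as.factor(quality))
nlevels(as.factor(loyalty))

# An explicit common scale gives every item the same levels.
f_quality <- factor(quality, levels = 1:10)
f_loyalty <- factor(loyalty, levels = 1:10)
identical(levels(f_quality), levels(f_loyalty))

# A scale narrower than the data silently turns the
# out-of-range responses into NA.
sum(is.na(factor(loyalty, levels = 1:9)))
```

The last line is worth running on real data before plotting, since responses dropped this way do not raise an error.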

function lines() is not working

I have a problem with the function lines.
This is what I have written so far:
model.ew<-lm(Empl~Wage)
summary(model.ew)
plot(Empl,Wage)
mean<-1:500
lw<-1:500
up<-1:500
for(i in 1:500){
mean[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[1]
lw[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[2]
up[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[3]
}
plot(Wage,Empl)
lines(mean,type="l",col="red")
lines(up,type="l",col="blue")
lines(lw,type="l",col="blue")
My problem is that no line appears on my plot and I cannot figure out why.
Can somebody help me?
You really need to read some introductory manuals for R. Go to this page, and select one that illustrates using R for linear regression: http://cran.r-project.org/other-docs.html
First we need to make some data:
set.seed(42)
Wage <- rnorm(100, 50)
Empl <- Wage + rnorm(100, 0)
Now we run your regression and plot the lines:
model.ew <- lm(Empl~Wage)
summary(model.ew)
plot(Empl~Wage) # Note. You had the axes flipped here
Your first problem was that you flipped the axes. The dependent variable (Empl) goes on the vertical axis. That is the main reason you didn't get any lines on the plot. Getting the prediction lines requires no loops at all, only a single call to matlines():
xval <- seq(min(Wage), max(Wage), length.out=101)
conf <- predict(model.ew, data.frame(Wage=xval),
interval="confidence", level=.90)
matlines(xval, conf, col=c("red", "blue", "blue"))
That's all there is to it.
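A further detail that may explain the missing lines in the question: lines() given a single vector plots it against its index (1, 2, 3, ...), not against the predictor values passed to predict(). In the question the predictions were made at Wage = 100, 200, ..., but the curves would be drawn at x = 1 to 500. A sketch on the simulated data above, passing the x values explicitly:

```r
set.seed(42)
Wage <- rnorm(100, 50)
Empl <- Wage + rnorm(100, 0)
model.ew <- lm(Empl ~ Wage)

# Predict on a grid of Wage values and keep that grid for plotting.
xval <- seq(min(Wage), max(Wage), length.out = 101)
conf <- predict(model.ew, data.frame(Wage = xval),
                interval = "confidence", level = 0.90)

plot(Empl ~ Wage)
# lines(conf[, "fit"]) alone would use x = 1, 2, ..., 101
# rather than the Wage values in xval.
lines(xval, conf[, "fit"], col = "red")
lines(xval, conf[, "lwr"], col = "blue")
lines(xval, conf[, "upr"], col = "blue")
```

This draws the same three curves as the matlines() call, one at a time.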

How to plot a violin scatter boxplot (in R)?

I just came by the following plot:
And wondered how it can be done in R (or other software)?
Update 10.03.11: Thank you everyone who participated in answering this question - you gave wonderful solutions! I've compiled all the solutions presented here (as well as some others I've come by online) in a post on my blog.
Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.
Make.Funny.Plot <- function(x){
unique.vals <- length(unique(x))
N <- length(x)
N.val <- min(N/20,unique.vals)
if(unique.vals>N.val){
x <- ave(x,cut(x,N.val),FUN=min)
x <- signif(x,4)
}
# construct the outline of the plot
outline <- as.vector(table(x))
outline <- outline/max(outline)
# determine some correction to make the V shape,
# based on the range
y.corr <- diff(range(x))*0.05
# Get the unique values
yval <- sort(unique(x))
plot(c(-1,1),c(min(yval),max(yval)),
type="n",xaxt="n",xlab="")
for(i in 1:length(yval)){
n <- sum(x==yval[i])
x.plot <- seq(-outline[i],outline[i],length=n)
y.plot <- yval[i]+abs(x.plot)*y.corr
points(x.plot,y.plot,pch=19,cex=0.5)
}
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)
EDIT : corrected so it always works.
I recently came upon the beeswarm package, that bears some similarity.
The bee swarm plot is a one-dimensional scatter plot like "stripchart", but with closely-packed, non-overlapping points.
Here's an example:
library(beeswarm)
beeswarm(time_survival ~ event_survival, data = breast,
method = 'smile',
pch = 16, pwcol = as.numeric(ER),
xlab = '', ylab = 'Follow-up time (months)',
labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER),
title = 'ER', pch = 16, col = 1:2)
(source: eklund at www.cbs.dtu.dk)
I have come up with code similar to Joris's; still, I think this is more than a stem plot. Here I mean that the y value in each series is the absolute value of the distance to the in-bin mean, and the x value is more about whether the value is lower or higher than the mean.
Example code (sometimes throws warnings but works):
px<-function(x,N=40,...){
x<-sort(x);
#Cutting in bins
cut(x,N)->p;
#Calculate the means over bins
sapply(levels(p),function(i) mean(x[p==i]))->meansl;
means<-meansl[p];
#Calculate the mins over bins
sapply(levels(p),function(i) min(x[p==i]))->minl;
mins<-minl[p];
#Each dot is one value.
#X is an order of a value inside bin, moved so that the values lower than bin mean go below 0
X<-rep(0,length(x));
for(e in levels(p)) X[p==e]<-(1:sum(p==e))-1-sum((x-means)[p==e]<0);
#Y is a bin minimum + absolute value of a difference between value and its bin mean
plot(X,mins+abs(x-means),pch=19,cex=0.5,...);
}
Try the vioplot package:
library(vioplot)
vioplot(rnorm(100))
(with awful default color ;-)
There is also wvioplot() in the wvioplot package, for weighted violin plot, and beanplot, which combines violin and rug plots. They are also available through the lattice package, see ?panel.violin.
Since this hasn't been mentioned yet: there is also ggbeeswarm, a relatively new R package based on ggplot2.
It adds another geom to ggplot, to be used instead of geom_jitter or the like.
In particular, geom_quasirandom (see the second example below) produces really good results, and I have in fact adopted it as my default plot.
Also noteworthy is the package vipor (VIolin POints in R), which produces plots using the standard R graphics and is in fact also used by ggbeeswarm behind the scenes.
set.seed(12345)
# install.packages('ggbeeswarm') # run once if the package is not installed
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()
