TraMineR: Mean time barplot with number of observations - r

Because I am still new to TraMineR, my problem may seem trivial to most of you. I'm working on mean time plots with my data and would like to display, on the bars, the mean time spent in the different states. Is there a command for this in TraMineR?

The option to add bar labels to the mean time plot was implemented in TraMineR version 2.2-3. It is available through the arguments bar.labels, cex.barlab, and offset.barlab of the plot method for the outcome of seqmeant. These arguments can also be passed as ... arguments to seqmtplot. In that case, when groups are specified, bar.labels should be a matrix with the labels for each group in columns.
Using the actcal data, I show how to display the mean times over the bars. The group here is sex, but it can of course be your clusters.
library(TraMineR)
data(actcal)
## We use only a sample of 300 cases
set.seed(1)
actcal <- actcal[sample(nrow(actcal),300),]
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal,13:24,labels=actcal.lab)
group <- factor(actcal$sex)
## Build the label matrix: one column of mean times per group
blab <- NULL
for (i in seq_along(levels(group))) {
  blab <- cbind(blab, seqmeant(actcal.seq[group == levels(group)[i], ]))
}
seqmtplot(actcal.seq, group = group,
          bar.labels = round(blab, digits = 2), cex.barlab = 1.2)

Related

R - Drawing multiple boxplots of different variables with the same scale / index in the same plot

Let's say I have 2 variables with 20 data points each (each value is 0, 1, 2, or 3) and I want to draw a boxplot of each variable so that they share the y-axis in one diagram. How do I do this easily?
Writing boxplot(var1,var2,data = mydata) didn't work...
You should provide reproducible data and you should tell us what error message(s) you received with your code. When a function does not perform as expected, it is good to read the manual page (?boxplot). Here are some made-up data similar to yours:
set.seed(42)
mydata <- data.frame(var1=sample(0:3, 20, replace=TRUE), var2=sample(0:3, 20, replace=TRUE))
Then the box plot is just
boxplot(mydata) # Or boxplot(mydata[, c("var1", "var2")]) if you are excluding other columns
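If the two variables are held in separate vectors rather than in one data frame, they can be passed to boxplot() directly, and both boxes are then drawn against a single y-axis. A minimal sketch with made-up data (the vector names, the axis label, and the ylim range are illustrative assumptions):

```r
set.seed(42)
var1 <- sample(0:3, 20, replace = TRUE)
var2 <- sample(0:3, 20, replace = TRUE)

# Separate vectors are drawn side by side on one common y-axis
bp <- boxplot(var1, var2, names = c("var1", "var2"),
              ylim = c(0, 3), ylab = "Score")
```

boxplot() invisibly returns the statistics it plotted, so bp$stats holds one column of five-number summaries per variable.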

grouping without additional packages

I'm using R to plot my data, but am unable to install packages for the moment as my workplace has put up a lot of firewalls (currently trying to get IT to get them down).
In the meantime, I was wondering if by using the plot() function I was able to plot my data in groups.
I have three variables in my data: IDName, Value, and Setpoints.
I wanted to aggregate my values for each setpoint, so I used the aggregate() function. However, this aggregates all data for each setpoint, whereas I want the aggregation done separately for each IDName. All forms of grouping seem to require a package, so I was wondering if anyone knew any workarounds.
I've supplied the code below (note that the R script is within PowerBI, but for the purposes of my question only R expertise is needed). It would also be great if you know how to colour these points according to each IDName.
# dataset <- data.frame(IDName, Value, Setpoints)
# dataset <- unique(dataset)
# Paste or type your script code here:
dat <- aggregate(Value ~ Setpoints, dataset, mean)
x <- dat$Value
y <- dat$Setpoints
z <- dataset$IDName
plot(x,y, main ="Turbidity Frequency Distribution",xlab="% Time < Turbidity level", ylab="Turbidity (NTU)")
lines(spline(x,y))
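One possible base-R approach, sketched with made-up data (the values are invented; only the column names IDName, Value, and Setpoints come from the question): adding IDName to the right-hand side of the aggregate() formula gives one mean per Setpoints/IDName combination, and indexing a colour vector by the factor colours the points per IDName.

```r
# Made-up stand-in for the PowerBI `dataset` (assumption, not real data)
set.seed(1)
dataset <- data.frame(
  IDName    = rep(c("A", "B", "C"), each = 20),
  Setpoints = rep(rep(c(1, 5, 10, 20), each = 5), times = 3),
  Value     = runif(60, min = 0, max = 100)
)

# Mean Value for every Setpoints x IDName combination (base R only)
dat <- aggregate(Value ~ Setpoints + IDName, data = dataset, FUN = mean)

ids  <- factor(dat$IDName)
cols <- seq_along(levels(ids))   # one colour index per IDName

plot(dat$Value, dat$Setpoints, col = cols[ids], pch = 19,
     xlab = "% Time < Turbidity level", ylab = "Turbidity (NTU)")

# One line per IDName, drawn in the matching colour
for (i in seq_along(levels(ids))) {
  grp <- dat[ids == levels(ids)[i], ]
  grp <- grp[order(grp$Value), ]
  lines(grp$Value, grp$Setpoints, col = cols[i])
}
legend("topleft", legend = levels(ids), col = cols, pch = 19)
```

Everything used here (aggregate, factor, plot, lines, legend) ships with base R, so no package installation is needed.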

Scatter plot in R doesn't use the x values in the variable indicated in the plot statement

I am trying to make a scatter plot in R between two numeric variables, and it uses the observation number as the x variable. This is the problem I'm trying to fix: I would like to have a scatter plot that uses the values of the x variable I indicated in the plot statement.
Yes, both the X variable and the Y variable are numeric.
I've attached a screenshot showing the data setup (Galton height data), the fact that the father and son variables are both numeric, and the resulting plot.
Here's the code that sets up the data and runs the scatter plot:
#install.packages("dplyr")
library('dplyr')
#tidyverse is name of package used for class
library(tidyverse)
remove.packages('HistData')
install.packages('HistData')
library(HistData)
data("GaltonFamilies")
childNum <- galton_heights[,6]
gender <- galton_heights[,8]
#Different code to get son height
#If we wanted to follow the lesson exactly, we would
#use the following
son_data <- GaltonFamilies[GaltonFamilies$gender == "male" & GaltonFamilies$childNum == 1,]
son <- son_data$childHeight
#Now we can compare the oldest child's height (if they happen to be male) with that of the father:
GaltonFamilies %>% summarize(mean(father), sd(father), mean(son), sd(son))
GaltonFamilies$father2 <- as.numeric(GaltonFamilies$father)
#galton_heights$father <- as.numeric(levels(galton_heights$father))[galton_heights$father]
plot(GaltonFamilies$father,GaltonFamilies$son)
plot(GaltonFamilies$father2, GaltonFamilies$son, main="Scatterplot Example",
xlab="Father ", ylab="Son ")
Edit: the filter statement creating son_data wasn't working when I ran the above code fresh. I don't know why. I've replaced it with a way to get son_data without the filter.
son_data <- GaltonFamilies[GaltonFamilies$gender == "male" & GaltonFamilies$childNum == 1,]
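There is no GaltonFamilies$son, so that expression evaluates to NULL. plot(x, NULL) then plots x against its index, which is why the observation number ends up on the x-axis. Use a column that actually exists, such as childHeight, and take it from the same rows as father. See also: Random data added when using `plot` in R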
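The behaviour can be reproduced without HistData. In the sketch below the data frame is made up (only the column names mirror GaltonFamilies, and the heights are invented). It shows that `$` on a missing column returns NULL, and then pairs father and son heights taken from the same filtered rows.

```r
# Made-up data with GaltonFamilies-style column names (assumption)
fam <- data.frame(
  father      = c(78.5, 75.5, 75.0, 75.0, 74.0, 73.0),
  gender      = c("male", "female", "male", "male", "female", "male"),
  childNum    = c(1, 1, 1, 2, 1, 1),
  childHeight = c(73.2, 69.0, 73.5, 72.5, 68.0, 71.0)
)

is.null(fam$son)   # TRUE: there is no such column

# plot(fam$father, fam$son) would therefore draw father against 1:nrow(fam)

# Take father and son from the same filtered rows so they line up
son_data <- fam[fam$gender == "male" & fam$childNum == 1, ]
plot(son_data$father, son_data$childHeight,
     xlab = "Father", ylab = "Son")
```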

Using Likert Package in R for analyzing real survey data

I conducted a survey with 138 questions on it, of which only a few are likert type questions with some having different scales.
I have been trying to use the Likert package in R to analyze and graphically portray the data, however, I am seriously struggling to make sense of any of it.
I have gone through the "demos", which are only useful if you already know what is going on with the package. They don't explain the steps you have to take before being able to apply the likert package, what can actually be passed to the package, how you rename the variables, etc. All you get is a bunch of code and a rabbit hole to crawl down trying to figure it all out.
I have scoured google for a step by step guide to using the likert package but found nothing.
Can anyone please direct me to a guide or at least perhaps provide the steps I have to take with my dataframe before I can try to use the likert package?
I am hoping to fit a few of my columns(containing the likert responses) to stacked barplots using this package.
Once I figure out what exactly the Likert package will accept in terms of a cleaned up data frame, I should be able to follow the demo... maybe..
This is what I have done so far, based on my limited knowledge of R and trying to figure things out on my own.
library(likert)
library(dplyr)
fdaff_likert <- select(f2f, RESPID, daff_rate)
fdaff_likert <- data.frame(fdaff_likert)
fdaff_likert <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
the output of my likert is:
summary(fdaff_likert)
       Item      low  neutral     high     mean       sd
1 daff_rate 9.977827 37.91574 52.10643 3.802661 1.302508
The plot, however, is all over the place.. (unordered)
plot (fdaff_likert)
The Likert scale is out of order and not properly centered. In addition, how do I rename the y-axis to the question?
For later analysis, how can I break it up into group levels (based on another column in the original data frame that specifies a region)?
library(likert)
set.seed(1)
n <- 138
# An illustrative example
fdaff_likert <- data.frame(
RESPID=sample(1:5,n, replace=T),
daff_rate=factor(sample(1:5,n, replace=T), labels=c("Good","Neither","Poor","Very Good","Very Poor"))
)
fdaff_likert1 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with unordered categories
plot(fdaff_likert1)
# Reorder levels of daff_rate factor
fdaff_likert$daff_rate <- factor(fdaff_likert$daff_rate,
levels=levels(fdaff_likert$daff_rate)[c(5,3,2,1,4)])
fdaff_likert2 <- likert(items=fdaff_likert[,2, drop = FALSE], nlevels = 5)
# Plot with ordered categories
plot(fdaff_likert2)
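The reordering step does not depend on likert itself. As a base-R sketch with made-up responses: factor() sorts character levels alphabetically unless levels is given, which is one way a scale can end up scrambled, and spelling out the levels from worst to best restores the intended order.

```r
# Made-up responses (assumption, for illustration only)
responses <- c("Good", "Very Poor", "Neither", "Poor", "Very Good", "Good")

# Default: levels are sorted alphabetically
levels(factor(responses))

# Explicit levels give the intended scale order
scale_levels <- c("Very Poor", "Poor", "Neither", "Good", "Very Good")
ordered_resp <- factor(responses, levels = scale_levels)
levels(ordered_resp)
table(ordered_resp)
```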
Here is an illustrative example for creating a plot with grouped items.
set.seed(1)
fdaff_likert <- data.frame(
country=factor(sample(1:3, n, replace=T), labels=c("US","Mexico","Canada")),
item1=factor(sample(1:5,n, replace=T), labels=c("Very Poor","Poor","Neither","Good","Very Good")),
item2=factor(sample(1:5,n, replace=T), labels=c("Very Poor","Poor","Neither","Good","Very Good")),
item3=factor(sample(1:5,n, replace=T), labels=c("Very Poor","Poor","Neither","Good","Very Good"))
)
names(fdaff_likert) <- c("Country",
"1. I read only if I have to",
"2. Reading is one of my favorite hobbies",
"3. I find it hard to finish books")
fdaff_likert3 <- likert(items=fdaff_likert[,2:4], grouping=fdaff_likert[,1])
plot(fdaff_likert3)

Combining dotplot R

I'm trying to combine two plots into the same plot in R.
My code looks like this:
#----------------------------------------------------------------------------------------#
# RING data: Mikkel
#----------------------------------------------------------------------------------------#
# Set working directory
setwd("/Users/mikkelastrup/Dropbox/Master/RING R")
#### Read data & Converting factors ####
dat <- read.table("R SUM kopi.txt", header=TRUE)
str(dat)
dat$Vial <- as.factor(dat$Vial)
dat$Line <- as.factor(dat$Line)
dat$rep <- as.factor(dat$rep)
dat$fly <- as.factor(dat$fly)
str(dat)
mtdata <- droplevels(dat[dat$Line=="20",])
mt1data <- droplevels(mtdata[mtdata$rep=="1",])
tdata <- melt(mt1data, id=c("rep","Conc","Sex","Line","Vial", "fly"))
tdata$variable <- as.factor(tdata$variable)
tfdata <- droplevels(tdata[tdata$Sex=="f",])
tmdata <- droplevels(tdata[tdata$Sex=="m",])
####Plotting####
d1 <- dotplot(tfdata$value~tdata$variable|tdata$Conc,
main="Y Position over time Line 20 Female",
xlab="Time", ylab="mm above bottom")
d2 <- dotplot(tmdata$value~tdata$variable|tdata$Conc,
main="Y Position over time Line 20 Male",
xlab="Time", ylab="mm above bottom")
grid.arrange(d1,d2,ncol=2)
And that looks like this:
I'm trying to combine them into one plot, with two different colors for male and female. I have tried to write both into one dotplot call, separated by a comma or wrapped in (), but that doesn't work, and when I don't split the data and use tdata instead of tfdata and tmdata, I get all the dots in the same color. I'm open to suggestions, including another package or another way of plotting the data, as long as the result still looks somewhat like this, since I'm new to R.
All you need to do is use the groups argument.
dotplot(value~variable|Conc, groups=Sex, data=tdata,
main="Y Position over time Line 20 All",
xlab="Time", ylab="mm above bottom")
Also, don't use the $ notation in these functions; notice that you're using value from tfdata but variable and Conc from tdata. This is a problem because there are twice as many rows in tdata! Instead, use the data argument to specify which data frame to get the variables from.
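To make the row mismatch concrete, here is a small sketch with made-up data. The data frame toy is a hypothetical stand-in for tdata, and its values are random. It assumes the lattice package is available, which is normally the case because lattice ships with standard R installations.

```r
library(lattice)

set.seed(1)
# Hypothetical long-format data standing in for tdata
toy <- expand.grid(variable = factor(paste0("t", 1:3)),
                   Sex      = factor(c("f", "m")),
                   Conc     = factor(c("low", "high")),
                   fly      = 1:4)
toy$value <- runif(nrow(toy), min = 0, max = 50)

# The female subset has half the rows of the full data, so mixing
# columns from both data frames in one formula misaligns the values
nrow(toy[toy$Sex == "f", ])
nrow(toy)

# One panel per Conc, with colours distinguishing Sex; auto.key adds a legend
p <- dotplot(value ~ variable | Conc, groups = Sex, data = toy,
             auto.key = TRUE, xlab = "Time", ylab = "mm above bottom")
print(p)
```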

Resources