table1() Output Labeling all Data as "Missing" - r

I am trying to make a descriptive statistics table in R and my code functions properly (producing a table) but despite the fact that I have no missing values in my dataset, the table outputs all of my values as missing. I am still a novice in R, so I do not have a broad enough knowledge base to troubleshoot.
My code:
library(readxl)  # read_excel()
library(table1)  # table1() and label()
data <- read_excel("Data.xlsx")
data$stage <-
factor(data$stage, levels=c(1,2,3,4,5,6,7),
labels =c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
data$primary_language <-factor(data$primary_language, levels=c(1,2), labels = c("Spanish", "English"))
data$status_zipcode <- factor(data$status_zipcode, levels = (1:3), labels = c("Minority", "Majority", "Diverse"))
data$status_censusblock <- factor(data$status_censusblock, levels = c(0:2), labels = c("Minority", "Majority", "Diverse"))
data$self_identity <- factor(data$self_identity, levels = c(0:1), labels = c("Hispanic/Latina","White/Caucasian"))
data$subjective_identity <- factor(data$subjective_identity, levels = c(0,1,2,4), labels = c("Hispanic/Latina", "White/Caucasian", "Multiracial", "Asian"))
label (data$stage)<- "Stage at Diagnosis"
label(data$age) <- "Age"
label(data$primary_language) <- "Primary language"
label(data$status_zipcode)<- "Demographic Status in Zipcode Area"
label(data$status_censusblock)<- "Demographic Status in Census Block Group"
label(data$self_identity) <- "Self-Identified Racial/Ethnic Group"
label(data$subjective_identity)<- "Racial/Ethnic Group as Identified by Others"
table1(~ stage +age + primary_language + status_zipcode + status_censusblock + self_identity + subjective_identity| primary_language, data=data)
Table output: [screenshot not available]
Data set: [screenshot not available]

When I run the data set the values are there. It actually worked for me when I re-did the spacing:
data$stage <- factor(data$stage,
levels = c(1,2,3,4,5,6,7),
labels = c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
When I did it exactly as you typed it came up with NA's, too. Try the first and see if it works for you that way. Then check the spacing for the others. That may be all it is.
I do end up with one NA on the stage column because 0 is not defined in your levels.
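Edit: I ran the rest, so here are some other points.
You end up with an NA in stage because one of your values is 0, but 0 is not among the levels, so it has no label.
You end up with NAs in primary_language because the data contain 0 and 1 but you defined the levels as 1 and 2, so you need to change the levels to match the values. You end up with NAs in the other columns wherever the levels you list do not match the coded values in the data (for example, status_zipcode uses 1:3 although the values start at 0). Note that c(0:2) and c(0,1,2) give the same levels, so the colon itself is not the problem.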
Change your code to this and you should have the values you need except that initial 0 in "stage":
data$stage <- factor(data$stage,
                     levels = c(1,2,3,4,5,6,7),
                     labels = c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
data$primary_language <- factor(data$primary_language,
                                levels = c(0,1),
                                labels = c("Spanish", "English"))
data$status_zipcode <- factor(data$status_zipcode,
                              levels = c(0,1,2),
                              labels = c("Minority", "Majority", "Diverse"))
data$status_censusblock <- factor(data$status_censusblock,
                                  levels = c(0,1,2),
                                  labels = c("Minority", "Majority", "Diverse"))
data$self_identity <- factor(data$self_identity,
                             levels = c(0,1),
                             labels = c("Hispanic/Latina","White/Caucasian"))
data$subjective_identity <- factor(data$subjective_identity,
                                   levels = c(0,1,2,4),
                                   labels = c("Hispanic/Latina", "White/Caucasian", "Multiracial", "Asian"))
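To see why mismatched levels show up as "Missing", here is a minimal sketch in base R. The vector codes is made up for illustration and stands in for one coded column of the spreadsheet. Any value that is not listed in levels silently becomes NA, and setdiff() is one way to spot such values before building the table.

```r
# Hypothetical codes standing in for one column of the spreadsheet
codes <- c(0, 1, 1, 0, 1)

# Levels 1 and 2 do not cover the value 0, so those entries become NA
wrong <- factor(codes, levels = c(1, 2), labels = c("Spanish", "English"))

# Levels 0 and 1 match the data, so nothing is lost
right <- factor(codes, levels = c(0, 1), labels = c("Spanish", "English"))

# Values that occur in the data but are missing from the level list
unmatched <- setdiff(unique(codes), c(1, 2))

print(table(wrong, useNA = "ifany"))
print(table(right, useNA = "ifany"))
print(unmatched)
```

Running the same setdiff() check on each real column against the levels you intend to use should show which definitions need adjusting.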

Related

Meta analysis: transform forest plot output to percentage

I am a new R user. I am trying to transform proportions to percentages on a forest plot I have generated using metaprop.
I have looked at the post "Quick question about transforming proportions to percentages - forest function in R" and at the link that post refers to.
mytransf = function(x)
(x) * 100
studies <- c("Study 1", "Study 2", "Study 3")
obs <- c(104, 101,79670)
denom <- c(1146, 2613, 147766)
m1 <- metaprop(obs, denom, studies, comb.random=FALSE,
byseparator=": ")
forest(m1, print.tau2 = FALSE, col.by="black", text.fixed = "Total number of events",
text.fixed.w = "Subtotal", rightcols = c("effect","ci"),
leftlabs=c("Study","Events","Total"),
xlim=c(0,0.7),
transf=mytransf)
The output remains as proportions, not percentages. I tried "atransf" as well. Can anyone please help me with this? This is what I can currently generate: [picture of output]
You can use the pscale option of metaprop:
library(meta)
studies <- c("Study 1", "Study 2", "Study 3")
obs <- c(104, 101,79670)
denom <- c(1146, 2613, 147766)
m1 <- metaprop(obs, denom, studies, comb.random=FALSE,
byseparator=": ",
pscale=100)
forest(m1, print.tau2 = FALSE, col.by="black",
text.fixed = "Total number of events",
text.fixed.w = "Subtotal",
rightlabs = c("Prop. (%)","[95% CI]"),
leftlabs=c("Study","Events","Total"),
xlim=c(0,70))
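As a quick sanity check on the scale, the raw study-level percentages can be computed in base R. This is only a sketch of the arithmetic behind the rescaled axis (hence xlim going from 0.7 to 70); it does not reproduce the pooled estimate or the confidence intervals that metaprop computes.

```r
obs   <- c(104, 101, 79670)
denom <- c(1146, 2613, 147766)

# Raw proportion of events per study, rescaled to percent
pct <- 100 * obs / denom
names(pct) <- c("Study 1", "Study 2", "Study 3")

print(round(pct, 2))
```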

Function for label variable before plotting in R

I hope this question is not too easy for this forum (actually, I'm almost a bit embarrassed to ask it here, but I've been struggling with this small issue the whole day...)
I have data frames that look like the following:
df <- data.frame(runif(4),
c("po", "pr", "po", "pr"),
c("Control 1","Control 1", "Treatment 1", "Treatment 1"))
names(df) <- list("values", "test_type", "group")
Now, I want to easily re-label the variables "test_type" and "group" for the plot afterwards (it's nicer to read "pretest" instead of "pr" in a presentation :-) ).
I could do it manually with:
df$test_type <- factor(df$test_type,
levels = c("pr", "po"),
labels = c("pretest", "posttest"))
df$group <- factor(df$group,
levels = c("Control 1", "Treatment 1"),
labels = c("control", "EST"))
In this case, I would have to repeat this for a lot more data frames, which led me to write a function:
var_label <- function(df, test, groups){
# Create labels
df$test_type <- factor(df$test,
levels = c("pr", "po"),
labels = c("pretest", "posttest"))
df$group <- factor(df$groups,
levels = c("Control 1", "Treatment 1"),
labels = c("control", "EST"))
return(list(df$test_type, df$group))
}
Unfortunately, this doesn't work. I tried a lot of slightly different versions and also different commands from the Hmisc package, but none of them worked. I know I can solve this problem another way, but I am trying to write shorter, more efficient code and would really like to know what I have to change to make this function work. Or, even better, do you have a suggestion for a more efficient approach?
Thank you a lot in advance!!
As I mentioned above, I think forcats::fct_relabel() is what you want here, along with dplyr::mutate_at(). Assuming that your relabeling needs are no more complex than what has been outlined in your question, the following should get you what you appear to be looking for.
####BEGIN YOUR DATAFRAME CREATION####
df <- data.frame(runif(4),
c("po", "pr", "po", "pr"),
c("Control 1","Control 1", "Treatment 1", "Treatment 1"))
names(df) <- list("values", "test_type", "group")
#####END YOUR DATAFRAME CREATION#####
# Load dplyr and forcats
library(dplyr)
library(forcats)
# create a map of labels and levels based on your implied logic
# the setup is label = level
label_map <- c("pretest" = "pr"
,"posttest" = "po"
,"control" = "Control 1"
,"EST" = "Treatment 1")
# create a function to exploit the label map
fct_label_select <- function(x, map) {
names(which(map == x))
}
# create a function which is responsive to a character vector
# as required by fct_relabel
fct_relabeler <- function(x, map) {
unlist(lapply(x, fct_label_select, map = map))
}
fct_relabeler(levels(df$test_type), map = label_map)
# function to meet your apparent needs
var_label <- function(df, cols, map){
df %>%
mutate_at(.vars = cols
,.fun = fct_relabeler
,map = map)
}
var_label(df = df, cols = c("test_type", "group"), map = label_map)
# values test_type group
# 1 0.05159681 posttest control
# 2 0.89050323 pretest control
# 3 0.42988881 posttest EST
# 4 0.32012811 pretest EST
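For comparison, here is a minimal base R sketch that stays close to the function in the question. One likely reason the original version fails is that `$` takes the name after it literally, so df$groups looks for a column called "groups" rather than the column whose name was passed in. Using [[ ]] with a character string avoids that. The function name relabel_vars and its default arguments are made up for this illustration.

```r
# Hypothetical helper: column names are passed as strings and looked up with [[ ]]
relabel_vars <- function(df, test_col = "test_type", group_col = "group") {
  df[[test_col]] <- factor(df[[test_col]],
                           levels = c("pr", "po"),
                           labels = c("pretest", "posttest"))
  df[[group_col]] <- factor(df[[group_col]],
                            levels = c("Control 1", "Treatment 1"),
                            labels = c("control", "EST"))
  df  # return the whole data frame so it can be plotted directly
}

df <- data.frame(values = runif(4),
                 test_type = c("po", "pr", "po", "pr"),
                 group = c("Control 1", "Control 1", "Treatment 1", "Treatment 1"))

out <- relabel_vars(df)
print(out)
```

Returning the data frame rather than a list of two factors means the result can be reused for every data frame with the same layout, for example via lapply() over a list of data frames.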

Create empty data frame

I created a completely empty matrix. I would like to split an observation into 2 indices (like in Excel).
Indices <- matrix(NA, 8, 2)
rownames(Indices) <- rownames(Indices, do.NULL = FALSE, prefix = "Plot") # do I need this?
rownames(Indices) <- c("Plot 1", "Plot 2", "Plot 3", "Plot 8", "Plot 9", "Plot 10",
"Plot 12", "Plot 13")
colnames(Indices) <- c("Density", "Trees per ha")
I would like to split Density into two columns: Density (only oaks) and Density (total). I have no idea what this is called. Is it even possible in R?
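One possible reading of the request, sketched under assumptions: a base R matrix has no merged header cells as in Excel, but Density can be stored as two separately named columns. The column names and the single filled-in value below are made up for illustration.

```r
plots <- c("Plot 1", "Plot 2", "Plot 3", "Plot 8", "Plot 9", "Plot 10",
           "Plot 12", "Plot 13")

# Three columns instead of two: Density is split into an oak-only and a total column
Indices <- matrix(NA_real_, nrow = length(plots), ncol = 3,
                  dimnames = list(plots,
                                  c("Density (oaks only)", "Density (total)", "Trees per ha")))

# Cells can then be filled by row and column name (0.4 is a placeholder value)
Indices["Plot 1", "Density (oaks only)"] <- 0.4

print(Indices)
```

Setting the row names through dimnames also makes the earlier rownames(..., do.NULL = FALSE, prefix = "Plot") call unnecessary in this sketch.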

ggplot2 geom_text - 'dynamically' place label over barchart

I have what I know is going to be an impossibly easy question. I am showing an average number of days by month using a bar chart, using the following example:
dat <- structure(list(Days = c("217.00", "120.00", "180.00", "183.00",
"187.00", "192.00"), Amt = c("1,786.84", "1,996.53",
"1,943.23", "321.30", "2,957.03", "1,124.32"), Month = c(201309L,
201309L, 201309L, 201310L, 201309L, 201309L), Vendor = c("Comp A",
"Comp A", "Comp A", "Comp A", "Comp A",
"Comp A"), Type = c("Full", "Full",
"Self", "Self", "Self", "Self"
), ProjectName = c("Rpt 8",
"Rpt 8", "Rpt 8",
"Rpt 8", "Rpt 8",
"Rpt 8")), .Names = c("Days",
"Amt", "Month", "Vendor", "Type", "ProjectName"
), row.names = c("558", "561", "860", "1157", "1179", "1221"), class =
"data.frame")
ggplot(dat, aes(x=as.character(Month),y=as.numeric(Days),fill=Type))+
stat_summary(fun.y='mean', geom = 'bar')+
ggtitle('Rpt 8')+
xlab('Month')+
ylab('Average Days')+
geom_text(stat='bin',aes(y=100, label=paste('Avg:\n',..count..)))
Right now my labels are showing counts and showing up wherever I designate y.
I want to:
place labels at the top of the bars.
display the average, not the count.
I've pretty thoroughly - and unsuccessfully - tried most of the other solutions on SO & elsewhere.
Just got it:
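library(plyr)     # provides ddply()
library(ggplot2)
# Note: in ggplot2 3.3.0 and later, fun.y is deprecated in favour of fun
means <- ddply(dat, .(Vendor, Type, Month), summarise, avg = mean(as.numeric(Days)))
ggplot(dat, aes(x = as.character(Month), y = as.numeric(Days), fill = Type)) +
  stat_summary(fun.y = 'mean', geom = 'bar') +
  geom_text(data = means, stat = 'identity',
            aes(y = avg + 7, label = round(avg, 0), group = Type))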
I realize there is code nearly identical to this sitting elsewhere. My error came from placing round's 0 outside the correct closing parenthesis, thus moving all my labels to 0 on the x axis... DUH!
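If plyr is not installed, the same per-group means can be computed with base R's aggregate(). This is a sketch on a trimmed copy of the example data: only the Days, Month and Type columns are kept, and Vendor is dropped from the grouping because it is constant here.

```r
# Trimmed copy of the example data (Days is stored as text, as in the question)
dat <- data.frame(
  Days  = c("217.00", "120.00", "180.00", "183.00", "187.00", "192.00"),
  Month = c(201309L, 201309L, 201309L, 201310L, 201309L, 201309L),
  Type  = c("Full", "Full", "Self", "Self", "Self", "Self"),
  stringsAsFactors = FALSE
)

# Convert once so the mean is taken over numbers, not strings
dat$Days <- as.numeric(dat$Days)

# One row per Type/Month combination, holding the mean number of days
means <- aggregate(Days ~ Type + Month, data = dat, FUN = mean)
names(means)[names(means) == "Days"] <- "avg"

print(means)
```

The resulting means data frame has the avg column that the geom_text() layer expects.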

Change direction of axis marks in a barplot

I need the following barplot with special axis marks. I have tried for a while but am having difficulty getting it to work. In particular, my axis labels need to change direction. I know that I have to use the axTicks, axis and barplot commands. Does anyone have an idea?
How it should look: [image not available]
Here is my data:
bpsamplevalues<-structure(c(21.3389252731795, 18.9930828477016, 19.4378755546201,
22.1009743407998, 23.8099463895258, 18.9706355343085, 19.4619810121121,
19.3433394825869, 26.8760997862876, 19.0948710373689), .Names = c("Div 1",
"Div 2", "Div 3", "Div 4", "Div 5", "Div 6", "Div 7", "Div 8",
"Div 9", "Div 10"))
I started with this code but cannot find a way to get further:
barplot(bpsamplevalues, col="#87DEE1", axes=F, names.arg=F)
You may try this. It is the las argument which sets the orientation of axes labels. See ?par for more information.
barplot(bpsamplevalues, col = "#87DEE1", axes = FALSE, las = 2)
axis(side = 2, tick = FALSE, las = 1)
grid(nx = NA, ny = NULL, col = "white", lty = "solid")
