Shiny : R: wilcox.test : change "value" -Output - r

I am working on a tiny shiny app, that allows the user to upload some data and run some statistics on it. I now have two things I would like to change in the output.
However, as you can see, the data part is quite saying nothing, I rather have the output: data: Group1 and Group2
I also would like to have n as the number of subjects(here: 109,114,115).
This is the code to produce the current output:
statsPaired <- function(boolean_dt1, boolean_dt2){
for (i in 4: (length(boolean_dt1))){
print(wilcox.test( (unlist(boolean_dt1[,i, with = FALSE])), unlist(boolean_dt2[,i, with = FALSE]) , paired = T, exact= FALSE))
cat("\n")
}
}

Related

How does R Markdown automatically format print effects into dataframes? Or how can I access special print methods?

I'm working with the WRS2 package and there are cases where it'll output its analysis (bwtrim) into a list with a special class of the analysis type class = "bwtrim". I can't as.data.frame() it, but I found that there is a custom print method called print.bwtrim associated with it.
As an example let's say this is the output: bwtrim.out <- bwtrim(...). When I run the analysis output in an Rmarkdown chunk, it seems to "steal" part of the text output and make it into a dataframe.
So here's my question, how can I either access print.bwtrim or how does R markdown automatically format certain outputs into dataframes? Because I'd like to take this outputted dataframe and use it for other purposes.
Update: Here is a minimally working example -- put the following in a chunk in Rmd file."
```{r}
library(WRS2)
df <-
data.frame(
subject = rep(c(1:100), each = 2),
group = rep(c("treatment", "control"), each = 2),
timepoint = rep(c("pre", "post"), times = 2),
dv = rnorm(200, mean = 2)
)
analysis <- WRS2::bwtrim(dv ~ group * timepoint,
id = subject,
data = df,
tr = .2)
analysis
```
With this, a data.frame automatically shows up in the chunk afterwards and it shows all the values very nicely. My main question is how can I get this data.frame for my own uses. Because if you do str(analysis), you see that it's a list. If you do class(analysis) you get "bwtrim". if you do methods(class = "bwtrim"), you get the print method. And methods(print) will have a line that says print.bwtrim*. But I can't seem to figure out how to call print.bwtrim myself.
Regarding what Rmarkdown is doing, compare the following
If you run this in a chunk, it actually steals the data.frame part and puts it into a separate figure.
```{r}
capture.output(analysis)
```
However, if you run the same line in the console, the entire output comes out properly. What's also interesting is that if you try to assign it to another object, the output will be stolen before it can be assigned.
Compare x when you run the following in either a chunk or the console.
```{r}
x<-capture.output(analysis)
```
This is what I get from the chunk approach when I call x
[1] "Call:"
[2] "WRS2::bwtrim(formula = dv ~ group * timepoint, id = subject, "
[3] " data = df, tr = 0.2)"
[4] ""
[5] ""
This is what I get when I do it all in the console
[1] "Call:"
[2] "WRS2::bwtrim(formula = dv ~ group * timepoint, id = subject, "
[3] " data = df, tr = 0.2)"
[4] ""
[5] " value df1 df2 p.value"
[6] "group 1.0397 1 56.2774 0.3123"
[7] "timepoint 0.0001 1 57.8269 0.9904"
[8] "group:timepoint 0.5316 1 57.8269 0.4689"
[9] ""
My question is what can I call whatever Rstudio/Rmarkdown is doing to make data.frames, so that I can have an easy data.frame myself?
Update 2: This is probably not a bug, as discussed here https://github.com/rstudio/rmarkdown/issues/1150.
Update 3: You can access the method by using WRS2:::bwtrim(analysis), though I'm still interested in what Rmarkdown is doing.
Update 4: It might not be the case that Rmarkdown is stealing the output and automatically making dataframes from it, as you can see when you call x after you've already captured the output. Looking at WRS2:::print.bwtrim, it prints a dataframe that it creates, which I'm guessing Rmarkdown recognizes then formats it out.
See below for the print.bwtrim.
function (x, ...)
{
cat("Call:\n")
print(x$call)
cat("\n")
dfx <- data.frame(value = c(x$Qa, x$Qb, x$Qab), df1 = c(x$A.df[1],
x$B.df[1], x$AB.df[1]), df2 = c(x$A.df[2], x$B.df[2],
x$AB.df[2]), p.value = c(x$A.p.value, x$B.p.value, x$AB.p.value))
rownames(dfx) <- c(x$varnames[2], x$varnames[3], paste0(x$varnames[2],
":", x$varnames[3]))
dfx <- round(dfx, 4)
print(dfx)
cat("\n")
}
<bytecode: 0x000001f587dc6078>
<environment: namespace:WRS2>
In R Markdown documents, automatic printing is done by knitr::knit_print rather than print. I don't think there's a knit_print.bwtrim method defined, so it will use the default method, which is defined as
function (x, ..., inline = FALSE)
{
if (inline)
x
else normal_print(x)
}
and normal_print will call print().
You are asking why the output is different. I don't see that when I knit the document to html_document, but I do see it with html_notebook. I don't know the details of what is being done, but if you look at https://rmarkdown.rstudio.com/r_notebook_format.html you can see a discussion of "output source functions", which manipulate chunks to produce different output.
The fancy output you're seeing looks a lot like what knitr::knit_print does for a dataframe, so maybe html_notebook is substituting that in place of print.

Loop in R through variable names with values as endings and create new variables from the result

I have 24 variables called empl_1 -empl_24 (e.g. empl_2; empl_3..)
I would like to write a loop in R that takes this values 1-24 and puts them in the respective places so the corresponding variables are either called or created with i = 1-24. The sample below shows what I would like to have within the loop (e.g. ye1- ye24; ipw_atet_1 - ipw_atet_14 and so on.
ye1_ipw <- empl$empl_1[insample==1]
ipw_atet_1 <- treatweight(y=ye1_ipw, d=treat_ipw, x=x1_ipw, ATET =TRUE, trim=0.05, boot = 2)
ipw_atet_1
ipw_atet_1$se
ye2_ipw <- empl$empl_2[insample==1]
ipw_atet_2 <- treatweight(y=ye2_ipw, d=treat_ipw, x=x1_ipw, ATET =TRUE, trim=0.05, boot = 2)
ipw_atet_2
ipw_atet_2$se
ye3_ipw <- empl$empl_3[insample==1]
ipw_atet_3 <- treatweight(y=ye3_ipw, d=treat_ipw, x=x1_ipw, ATET =TRUE, trim=0.05, boot = 2)
ipw_atet_3
ipw_atet_3$se
coming from a Stata environment I tried
for (i in seq_anlong(empl_list)){
ye[i]_ipw <- empl$empl_[i][insample==1]
ipw_atet_[i]<-treatweight(y=ye[i]_ipw, d=treat_ipw, x=x1_ipw, ATET=TRUE, trim=0.05, boot =2
}
However this does not work at all. Do you have any idea how to approach this problem by writing a nice loop? Thank you so much for your help =)
You can try with lapply :
result <- lapply(empl[paste0('empl_', 1:24)], function(x)
treatweight(y = x[insample==1], d = treat_ipw,
x = x1_ipw, ATET = TRUE, trim = 0.05, boot = 2))
result would be a list output storing the data of all the 24 variables in same object which is easier to manage and process instead of having different vectors.

How to continuously send data from LabVIEW to R? (code help)

I am trying to bring real time data from LabVIEW (vibration of a bearing and temperature) into an app written in R to create a control chart. It works for a while but eventually crashes with the following error message:
Error in aggregate.data.frame(B, list(rep(1:(nrow(B)%/%n + 1), each = n, :
no rows to aggregate
The process works as LabVIEW takes the data and projects it onto two Excel files. Those files are read in the R code and used to project a control chart in R. The process succeeds for some time, and the failure moment is not always the same amount of time. Sometimes the control chart will run for 6-7 min, other times is will crash in 2 min.
My suspicion is that the Excel files are not being updated fast enough, so the R code tries to read that Excel file when it is empty.
Any suggestions would be great! thank you!
I have tried to lower the sample size taken per second. That did not work.
getwd()
setwd("C:/Users/johnd/Desktop/R Data")
while(1) {
A = fread("C:/Users/johnd/Desktop/R Data/a1.csv" , skip = 4 , header = FALSE , col.names = c("t1","B2","t2","AM","t3","M","t4","B1"))
t1 = A$t1
B2 = A$B2
t2 = A$t2
AM = A$AM
t3 = A$t3
M = A$M
t4 = A$t4
B1 = A$B1
B = fread("C:/Users/johnd/Desktop/R Data/b1.csv" , skip = 4 , header = FALSE , col.names = c("T1","small","T2","big"))
T1 = B$T1
small = B$small
T2 = B$T2
big = B$big
DJ1 = A[seq(1,nrow(A),1),c('t1','B2','AM','M','B1')]
DJ1
n = 16
DJ2 = aggregate(B,list(rep(1:(nrow(B)%/%n+1),each=n,len=nrow(B))),mean)[-1]
DJ2
#------------------------------------------------------------------------
DJ6 = cbind(DJ1[,'B1'],DJ2[,c('small','big')]) # creates matrix for these three indicators
DJ6
#--------------T2 Hand made---------------------------------------------------------------------
new_B1 = DJ6[,'B1']
new_small = DJ6[,'small'] ### decompose the DJ6 matrix into vectors for each indicator(temperature, big & small accelerometers)
new_big = DJ6[,'big']
new_B1
new_small
new_big
mean_B1 = as.numeric(colMeans(DJ6[,'B1']))
mean_small = as.numeric(colMeans(DJ6[,'small'])) ##decomposes into vectors of type numeric
mean_big = as.numeric(colMeans(DJ6[,'big']))
cov_inv = data.matrix(solve(cov(DJ6))) # obtain inverse covariance matrix
cov_inv
p = ncol(DJ6) #changed to pull number of parameters by taking the number of coumns in OG matrix #p=3 # #ofQuality Characteristics
m=64 # #of samples (10 seconds of data)
a_alpha = 0.99
f= qf(a_alpha , df1 = p,df2 = (m-p)) ### calculates the F-Statistic for our data
f
UCL = (p*(m+1)*(m-1)*(f))/(m*(m-p)) ###produces upper control limit
UCL
diff_B1 = new_B1-mean_B1
diff_small = new_small-mean_small
diff_big = new_big-mean_big
DJ7 = cbind(diff_B1, diff_small , diff_big) #produces matrix of difference between average and observations (x-(x-bar))
DJ7
# DJ8 = data.matrix(DJ7[1,])
# DJ8
DJ9 = data.matrix(DJ7) ### turns matrix into appropriate numeric form
DJ9
# T2.1.1 = DJ8 %*% cov_inv %*% t(DJ8)
# T2.1.1
# T2.1 = t(as.matrix(DJ9[1,])) %*% cov_inv %*% as.matrix(DJ9[1,])
# T2.1
#T2 <- NULL
for(i in 1:64){ #### creates vector of T^2 statistic
T2<- t(as.matrix(DJ9[i,])) %*% cov_inv %*% as.matrix(DJ9[i,]) # calculation of T^2 test statistic ## there is no calculation of x-double bar
write.table(T2,"C:/Users/johnd/Desktop/R Data/c1.csv",append=T,sep="," , col.names = FALSE)#
#
DJ12 <-fread("C:/Users/johnd/Desktop/R Data/c1.csv" , header = FALSE ) #
}
# DJ12
DJ12$V1 = 1:nrow(DJ12)
# plot(DJ12 , type='l')
p1 = nrow(DJ12)-m
p2 = nrow(DJ12)
plot(DJ12[p1:p2,], type ='o', ylim =c(0,15), ylab="T2 Chart" , xlab="Data points") ### plots last 640 points
# plot(DJ12[p1:p2,], type ='o' , ylim =c(0,15) , ylab="T2 Chart" , xlab="Data points")
abline(h=UCL , col="red") ## displays upper control limit
Sys.sleep(1)
}
The process succeeds for some time, and the failure moment is not always the same amount of time. Sometimes the control chart will run for 6-7 min, other times is will crash in 2 min.
My suspicion is that the Excel files are not being updated fast enough, so the R code tries to read that Excel file when it is empty.
Your suspicion is correct.
With your current design, your R application can crash depending on how fast it runs relative to your LabVIEW application. This is called a race condition; you must eliminate race conditions from your code.
A quick and dirty solution
One simple solution to avoid the crash is to call NROW to check if any data exists. If there's no data available, don't call aggregate. This is described here: error message in r: no rows to aggregate
A more robust solution
A better solution is to use a communications protocol like TCP to stream data from LabVIEW to R, instead of using CSV files to transfer real-time data. For example, your R program could listen for data on a TCP socket. Make it wait for data to be sent from LabVIEW before running your data processing code.
Here is an example on using socketConnection in R: http://blog.corynissen.com/2013/05/using-r-to-communicate-via-socket.html
Here is an example on sending/receiving data over TCP in LabVIEW: http://www.ni.com/product-documentation/2710/en/

How to coerce stslist.freq to dataframe

I am doing some describtive sequence analysis using the "TraMineR" library. I want to report my findings via R-Markdown in html format. For formating tables I use "kable" and "kableExtra".
To get the frequency and propotions of the most common sequences I use seqtab(). The result is an stslist.freq object. When I try to coerce it to a dataframe, the dataframe is not containing any frequencies and proportions.
I tried to print the results of seqtab() and store this result again. This gives me the dataframe I desire. However there are two "problems" with that: (1) I don't understand what is happening here and it seems like a "dirty" trick, (2) as a result I also get the output of the print command in my final html document if I don't split the code in multiple chunks and disable the ouput in the specific chunk.
Here is some code to replicate the problem:
library("TraMineR")
#Data creation
data.long <- data.frame(
id=rep(1:50, each=4),
time = c(0,1,2,3),
status = sample(letters[1:2], 200, replace = TRUE),
weight=rep(runif(50, 0, 1), each=4)
)
#reshape
data.wide <- reshape(data.long, v.names = "status", idvar="id", direction="wide", timevar="time")
#sequence
sequence <- seqdef(data.wide,
var=c("status.0", "status.1", "status.2", "status.3"),
weights=data.wide$weight)
#frequencies of sequences
##doesn't work:
seqtab.df1 <- as.data.frame(seqtab(sequence))
##works:
seqtab.df2 <- print(seqtab(sequence))
I expect the dataframe to be the same as the one saved in seqtab.df2, however either without using the print command or with "silently" (no output printed) using the print command.
Thank you very much for your help and let me know if I forgot something to make answering the question possible!
If you look at the class() of the object returned by seqtab, it has the type
class(seqtab(sequence))
# [1] "stslist.freq" "stslist" "data.frame"
so if we look at exactly, what's happening in the print statement for such an object we can get a clue what's going on
TraMineR:::print.stslist.freq
# function (x, digits = 2, width = 1, ...)
# {
# table <- attr(x, "freq")
# print(table, digits = digits, width = width, ...)
# }
# <bytecode: 0x0000000003e831f8>
# <environment: namespace:TraMineR>
We see that what it's really giving you is the "freq" attribute. You can extract this directly and skip the print()
attr(seqtab(sequence), "freq")
# Freq Percent
# a/3-b/1 4.283261 20.130845
# b/1-a/1-b/2 2.773341 13.034390
# a/2-b/1-a/1 2.141982 10.067073
# a/1-b/1-a/1-b/1 1.880359 8.837476
# a/1-b/2-a/1 1.723489 8.100203
# b/1-a/2-b/1 1.418302 6.665861
# b/2-a/1-b/1 1.365099 6.415813
# a/1-b/3 1.241644 5.835586
# a/1-b/1-a/2 1.164434 5.472710
# a/2-b/2 1.092656 5.135360

CrossTable and Loop do not like each other

I am trying to get a set of cross tables with 70 variables. But no matter what I did, R kept generating the "function" back to me. I tried to move substitute after CrossTable but R seemed to have trouble using list(i=as.name(x)).
library(gmodel)
Independent_List <- colnames(Comorbidity)[1:70]
Comorbidity_Table <- lapply(Independent_List, function(x) {
substitute(CrossTable(i ,
Comorbidity$sleep,
prop.c = TRUE,
prop.r = FALSE,
prop.t = FALSE,
prop.chisq = FALSE,
data =Comorbidity),
list(i=as.name(x)))
})
lapply(Comorbidity_Table, summary)
[[1]]
Length Class Mode
8 call call
[[2]]
Length Class Mode
8 call call
[[3]]
Length Class Mode
8 call call
The goal is to try to make a table with specific cell numbers and column percentage and merge with my looped glm results.
I ended up using a much simpler method to solve this problem:
Tables <- lapply(Table_Data[, 1:11], function(x){table(x, Table_Data$TSD,exclude = NA)})
Prop_Tabs <- lapply(Tables[1:11], function(x){prop.table(x,2)})

Resources