Zelen Exact Test - Trying to use a k 2x2 in the function zelen.test()


I am trying to use the zelen.test function from the NSM3 package, but I am having difficulty getting my data into the form the function expects.
You can recreate my data using:
data <- c(4, 2, 3, 3, 8, 3, 4, 7, 0, 7, 1, 1, 12, 13,
74, 74, 77, 85, 31, 37, 11, 7, 18, 18, 96, 97, 48, 40)
events <- matrix(data, ncol = 2)
The documentation on CRAN gives the usage as zelen.test(z, example = F, r = 3), where z is an array of k 2 x 2 matrices, example is set to FALSE because otherwise it returns a p-value for an example I cannot access, and r is the number of decimal places the user wants returned in the p-value.
I've tried:
zelen.test(events, r = 4)
I thought it may want the study number and the trial data, so I tried this:
studies <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7)
data <- c(4, 2, 3, 3, 8, 3, 4, 7, 0, 7, 1, 1, 12, 13,
74, 74, 77, 85, 31, 37, 11, 7, 18, 18, 96, 97, 48, 40)
events <- matrix(cbind(studies, events), ncol = 3)
zelen.test(events, r = 4)
but it continues to return an error stating
"Error in z[1, 1, ] : incorrect number of dimensions" in both of the cases I tried above.
Any help would be greatly appreciated!

If we check the source code by typing zelen.test at the console, we can see that when example = TRUE it constructs a 3D array:
...
if (example)
z <- array(c(2, 1, 2, 5, 1, 5, 4, 1), dim = c(2, 2, 2))
...
The expected form of the input z is also specified in the documentation, ?zelen.test:
z - data as an array of k 2x2 matrices. Small data sets only!
So we need to construct a three-dimensional array:
library(NSM3)
z1 <- array(c(4, 2, 3, 3, 8, 3, 4, 7), c(2, 2, 2))
zelen.test(z1, r = 4)
# Zelen's test:
# P = 1
Or with a third dimension of length 3:
z1 <- array(c(4, 2, 3, 3, 8, 3, 4, 7, 0, 7, 1, 1), c(2, 2, 3))
zelen.test(z1, r = 4)
# Zelen's test:
# P = 0.1238
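To get from the 14 x 2 matrix in the question to such an array, one option is to slice it into one 2 x 2 table per study. The following is only a sketch. It assumes that each consecutive pair of rows in events belongs to one study and that the two columns are the two cells of each row of that study's 2 x 2 table (if the second column holds totals rather than non-events, it would need converting first). The names k and z7 are made up for illustration.

```r
data <- c(4, 2, 3, 3, 8, 3, 4, 7, 0, 7, 1, 1, 12, 13,
          74, 74, 77, 85, 31, 37, 11, 7, 18, 18, 96, 97, 48, 40)
events <- matrix(data, ncol = 2)

# Assumption: rows (1, 2) are study 1, rows (3, 4) are study 2, and so on.
k <- nrow(events) / 2

# Take the two rows of each study as a 2 x 2 matrix and stack the
# k matrices along a third dimension.
z7 <- sapply(seq_len(k),
             function(i) events[c(2 * i - 1, 2 * i), ],
             simplify = "array")

dim(z7)     # three dimensions: 2, 2, k
z7[, , 1]   # the 2 x 2 table for the first study

# The test itself would then be called as below. It is left commented out
# because the documentation warns "Small data sets only!".
# library(NSM3)
# zelen.test(z7, r = 4)
```

Whether the exact test finishes in reasonable time on all seven tables is not something this sketch checks.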

Related

Logarithmic scaling with ggplot2 in R

I am trying to create a diagram using ggplot2. There are several very small values to be displayed and a few larger ones. I'd like to display all of them in an appropriate way using logarithmic scaling. This is what I do:
plotPointsPre <- ggplot(data = solverEntries, aes(x = val, y = instance,
color = solver, group = solver))
...
finalPlot <- plotPointsPre + coord_trans(x = 'log10') + geom_point() +
xlab("costs") + ylab("instance")
The result (plot not shown) is just the same as without coord_trans(x = 'log10').
However, if I apply it to the y-axis instead, it works (plot not shown).
How do I achieve the logarithmic scaling on the x-axis? The problem is not tied to the x-axis itself: if I swap the values of x and y, then it works on the x-axis and no longer on the y-axis. So there seems to be some problem with the displayed values. Does anybody have an idea how to fix this?
Edit - Here's the used data contained in solverEntries:
solverEntries <- data.frame(instance = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19, 20, 20, 20, 20),
solver = c(4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1),
time = c(1, 24, 13, 6, 1, 41, 15, 5, 1, 26, 16, 5, 1, 39, 7, 4, 1, 28, 11, 3, 1, 31, 12, 3, 1, 38, 20, 3, 1, 37, 10, 4, 1, 25, 11, 3, 1, 32, 18, 4, 1, 27, 21, 3, 1, 23, 22, 3, 1, 30, 17, 2, 1, 36, 8, 3, 1, 37, 19, 4, 1, 40, 21, 3, 1, 29, 11, 4, 1, 33, 10, 3, 1, 34, 9, 3, 1, 35, 14, 3),
val = c(6553.48, 6565.6, 6565.6, 6577.72, 6568.04, 7117.14, 6578.98, 6609.28, 6559.54, 6561.98, 6561.98, 6592.28, 6547.42, 7537.64, 6549.86, 6555.92, 6546.24, 6557.18, 6557.18, 6589.92, 6586.22, 6588.66, 6588.66, 6631.08, 6547.42, 7172.86, 6569.3, 6582.6, 6547.42, 6583.78, 6547.42, 6575.28, 6555.92, 6565.68, 6565.68, 6575.36, 6551.04, 6551.04, 6551.04, 6563.16, 6549.86, 6549.86, 6549.86, 6555.92, 6544.98, 6549.86, 6549.86, 6561.98, 6558.36, 6563.24, 6563.24, 6578.98, 6566.86, 7080.78, 6570.48, 6572.92, 6565.6, 7073.46, 6580.16, 6612.9, 6557.18, 7351.04, 6562.06, 6593.54, 6547.42, 6552.3, 6552.3, 6558.36, 6553.48, 6576.54, 6576.54, 6612.9, 6555.92, 6560.8, 6560.8, 6570.48, 6566.86, 6617.78, 6572.92, 6578.98))

Your data in its current form is not log distributed: most val are around 6500 and some are about 10% higher. If you want to stretch the data, you could use a custom transformation built with scales::trans_new(), or here's a simpler version that just subtracts a baseline value to make a log transform useful. After subtracting 6500, the small values are mapped to around 50 and the large values to around 1000, which is a more appropriate range for a log scale. Then we apply the same transformation to the breaks so that the labels appear in the right spots (i.e. the label 6550 is placed at the position 6550 - 6500 = 50).
This method helps if you want to make the underlying values more distinguishable, but at the cost of distorting the underlying proportions between values. You might be able to mitigate this by picking useful breaks and labeling them with scaling stats, e.g.
7000
+7% over min
my_breaks <- c(6550, 6600, 6750, 7000, 7500)
baseline <- 6500
library(ggplot2)
ggplot(data = solverEntries,
aes(x = val - baseline, y = instance,
color = solver, group = solver)) +
geom_point() +
scale_x_log10(breaks = my_breaks - baseline,
labels = my_breaks, name = "val")
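The labelling idea could be sketched like this, using base R only. It assumes the smallest val in the data above (6544.98) as the reference point; min_val, pct_over_min and pct_labels are names made up for illustration.

```r
my_breaks <- c(6550, 6600, 6750, 7000, 7500)
min_val <- 6544.98  # smallest val in solverEntries (assumed reference point)

# Percentage by which each break exceeds the minimum, rounded to whole percent
pct_over_min <- round(100 * (my_breaks / min_val - 1))

# Two-line tick labels: the raw value, then its relative distance from the minimum
pct_labels <- paste0(my_breaks, "\n+", pct_over_min, "% over min")
pct_labels[4]
```

Passing labels = pct_labels instead of labels = my_breaks to scale_x_log10() in the plot above should then show both pieces of information on each tick.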

Is this what you're looking for?
library(ggplot2)
x_data <- seq(from = 1, to = 50)
y_data <- 2 * x_data + rnorm(n = 50, mean = 0, sd = 5)
# linear (non-log) y scale
ggplot() +
aes(x = x_data, y = y_data) +
geom_point()
# log y scale
ggplot() +
aes(x = x_data, y = y_data) +
geom_point() +
scale_y_log10()
# log x scale
ggplot() +
aes(x = x_data, y = y_data) +
geom_point() +
scale_x_log10()

Quade test in R

I would like to perform a Quade test with more than one covariate in R. I know the command quade.test and I have seen the example below:
## Conover (1999, p. 375f):
## Numbers of five brands of a new hand lotion sold in seven stores
## during one week.
y <- matrix(c( 5, 4, 7, 10, 12,
1, 3, 1, 0, 2,
16, 12, 22, 22, 35,
5, 4, 3, 5, 4,
10, 9, 7, 13, 10,
19, 18, 28, 37, 58,
10, 7, 6, 8, 7),
nrow = 7, byrow = TRUE,
dimnames =
list(Store = as.character(1:7),
Brand = LETTERS[1:5]))
y
quade.test(y)
My question is as follows: how could I introduce more than one covariate? In this example the covariate is the Store variable.

Subset and remove rows from a dataset

I want to exclude some rows from my dataset while I also subset it, something like what I wrote below.
a<-c(2, 4, 6, 6, 8, 10, 12, 13, 14)
c<-c(2, 2, 2, 2, 2, 2, 4, 4,4)
d<-c(10, 10, 10, 30, 30, 30, 50, 50, 50)
ID<-rep(c("no","bo", "fo"), each=3)
mydata<-data.frame(ID, a, c, d)
library(reshape2)  # provides melt()
gg.df <- melt(mydata, id="ID", variable.name="variable")
gg.df[gg.df$variable=="a"& gg.df$ID==-"fo", ]
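Assuming the intent of the `==-"fo"` in the attempt above is "every ID except fo", the usual operator is `!=` (or `!(ID %in% ...)` when several values should be excluded). A minimal base R sketch follows. It uses stack() as a stand-in for melt() so that it runs without extra packages, and the names long and res are made up for illustration.

```r
mydata <- data.frame(
  ID = rep(c("no", "bo", "fo"), each = 3),
  a  = c(2, 4, 6, 6, 8, 10, 12, 13, 14),
  c  = c(2, 2, 2, 2, 2, 2, 4, 4, 4),
  d  = c(10, 10, 10, 30, 30, 30, 50, 50, 50)
)

# Long format: stack() puts the columns one after another, so ID is repeated
# once per stacked column.
stacked <- stack(mydata[c("a", "c", "d")])
long <- data.frame(ID       = rep(mydata$ID, times = 3),
                   variable = as.character(stacked$ind),
                   value    = stacked$values)

# Keep variable "a" and drop every row whose ID is "fo"
res <- long[long$variable == "a" & long$ID != "fo", ]
res
```

The same condition should carry over to the melted gg.df, i.e. gg.df$variable == "a" & gg.df$ID != "fo".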

dplyr: Create a new variable as a function of all existing variables without defining their names

In the following data frame I want to create a new variable as the following function of all existing ones:
as.numeric(paste0(df[i,],collapse=""))
However, I don't want to define the column names explicitly because their number and names may be different each time. How can I do that using dplyr?
The equivalent in base R would be something like this:
apply(df,1,function(x) as.numeric(paste0(x,collapse="")))
df <- structure(list(X1 = c(50, 2, 2, 50, 5, 5, 2, 50, 5, 5, 50, 2,
5, 5, 50, 2, 2, 50, 9, 9, 9, 9, 9, 9), X2 = c(2, 50, 5, 5, 50,
2, 5, 5, 50, 2, 2, 50, 9, 9, 9, 9, 9, 9, 50, 2, 2, 50, 5, 5),
X3 = c(5, 5, 50, 2, 2, 50, 9, 9, 9, 9, 9, 9, 50, 2, 2, 50,
5, 5, 2, 50, 5, 5, 50, 2), X4 = c(9, 9, 9, 9, 9, 9, 50, 2,
2, 50, 5, 5, 2, 50, 5, 5, 50, 2, 5, 5, 50, 2, 2, 50)), class = "data.frame", .Names = c("X1",
"X2", "X3", "X4"), row.names = c(NA, -24L))

You can try:
df %>% mutate(newcol=as.numeric(do.call(paste0,df)))
Or (as you suggested, maybe more dplyr style):
df %>% mutate(newcol=as.numeric(do.call(paste0,.)))
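This works because a data frame is a list of columns, so do.call(paste0, df) amounts to a single call paste0(df$X1, df$X2, ...), which pastes element-wise across each row. A small base R check on a hypothetical two-row data frame (small_df is not part of the question) suggests that it agrees with the apply() version:

```r
# Hypothetical two-row example with the same column layout as the question
small_df <- data.frame(X1 = c(50, 2), X2 = c(2, 50), X3 = c(5, 5), X4 = c(9, 9))

# Row-wise version from the question
by_row <- unname(apply(small_df, 1,
                       function(x) as.numeric(paste0(x, collapse = ""))))

# Column-wise version from the answer: one vectorised paste0() call
by_col <- as.numeric(do.call(paste0, small_df))

by_row
by_col
```

Neither version needs the column names, so the number and names of the columns can vary freely.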

How to get same results of Wilcoxon sign rank test in R and SAS

R code:
x <- c(9, 5, 9 ,10, 13, 8, 8, 13, 18, 30)
y <- c(10, 6, 9, 8, 11, 4, 1, 3, 3, 10)
library(exactRankTests)
wilcox.exact(y,x, paired = TRUE, alternative = "two.sided")
The results: V = 3, p-value = 0.01562
SAS code:
data aaa;
set aaa;
diff=x-y;
run;
proc univariate;
var diff;
run;
The results: S=19.5 Pr >= |S| 0.0156
How can I get the statistic S in R?
If n <= 20, the exact p-value is the same in SAS and R, but if n > 20 the results differ.
x <- c(9, 5, 9 ,10, 13, 8, 8, 13, 18, 30,9, 5, 9 ,10, 13, 8, 8, 13, 18, 30,9,11,12,10)
y <- c(10, 6, 9, 8, 11, 4, 1, 3, 3, 10,10, 6, 9, 8, 11, 4, 1, 3, 3, 10,10,12,11,12)
wilcox.exact(y,x,paired=TRUE, alternative = "two.sided",exact = FALSE)
The results: V = 34, p-value = 0.002534
The SAS results:S=92.5 Pr >= |S| 0.0009
How can I get the same statistic S and p-value in SAS and R? Thank you!
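For the statistic itself, here is a sketch in base R. It assumes that SAS's S is the centered signed rank statistic T+ - n(n+1)/4, where T+ is the sum of the ranks of the positive differences x - y and n is the number of nonzero differences. That assumption reproduces both S values quoted above, but the sketch says nothing about how the p-value is computed for n > 20. The function name signed_rank_S is made up for illustration.

```r
# Centered signed rank statistic (assumed definition of SAS's S):
# S = T+ - n(n+1)/4, computed on the nonzero differences only
signed_rank_S <- function(x, y) {
  d <- x - y
  d <- d[d != 0]                       # drop zero differences
  n <- length(d)
  t_plus <- sum(rank(abs(d))[d > 0])   # ties get average ranks
  t_plus - n * (n + 1) / 4
}

x1 <- c(9, 5, 9, 10, 13, 8, 8, 13, 18, 30)
y1 <- c(10, 6, 9, 8, 11, 4, 1, 3, 3, 10)
signed_rank_S(x1, y1)

x2 <- c(x1, x1, 9, 11, 12, 10)
y2 <- c(y1, y1, 10, 12, 11, 12)
signed_rank_S(x2, y2)
```

Under the same assumption, S can also be recovered from the V reported by wilcox.exact(y, x, paired = TRUE) as n(n+1)/4 - V, for example 22.5 - 3 = 19.5 for the first data set.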
