T.test R program [duplicate] - r

This question already has an answer here:
T.test in R program for multiple data sets
(1 answer)
Closed 9 years ago.
I'm trying to run a t.test on many data sets, and I want the results to be contained in a single output.
So far I'm running each t.test separately, like this:
test1=t.test(dat$velocity,x[[1]][[2]])
test2=t.test(dat$velocity,x[[2]][[2]])
test3=t.test(dat$velocity,x[[3]][[2]])

Something like this should work:
tests <- lapply(seq_along(x), function(i) t.test(dat$velocity, x[[i]][[2]]))
tests is a list of length length(x). You can access each t-test result by index, for example tests[[1]].
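To get the "single output" the question asks for, the list of results can be flattened into one data frame. This is only a sketch: dat and x are simulated here because the asker's data is not shown, and the column names of summary_table are made up for illustration.

```r
# Simulated stand-ins for the asker's objects (assumptions, not from the question)
set.seed(1)
dat <- data.frame(velocity = rnorm(20, mean = 5))
x <- lapply(1:3, function(i) list(id = i, values = rnorm(20, mean = 5 + i / 2)))

# One t-test per data set; the second element of each x[[i]] holds the values
tests <- lapply(seq_along(x), function(i) t.test(dat$velocity, x[[i]][[2]]))

# Pull the key numbers out of each "htest" object and stack them row by row
summary_table <- do.call(rbind, lapply(seq_along(tests), function(i) {
  data.frame(
    comparison = i,
    statistic  = unname(tests[[i]]$statistic),
    df         = unname(tests[[i]]$parameter),
    p_value    = tests[[i]]$p.value
  )
}))
print(summary_table)
```

Each row of summary_table corresponds to one element of tests, so the full result is still available through tests[[i]] if more detail is needed.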

Related

Two sample T test in R [duplicate]

This question already has answers here:
How to perform a paired t-test in R when all the values are in one column?
(1 answer)
R - fast two sample t test
(2 answers)
Closed 1 year ago.
How do I run a t-test comparing groups I and B by their accuracy?
The command you are looking for is t.test(). In your case, it should look like:
t.test(accuracy ~ group, data = DATA_NAME)
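As a rough illustration, here is the same call on made-up data. The values are simulated, since the original data was only shown as an image; only the column names accuracy and group and the group labels come from the question.

```r
# Simulated data: two groups labelled "I" and "B" (values are invented)
set.seed(42)
DATA_NAME <- data.frame(
  group    = rep(c("I", "B"), each = 10),
  accuracy = c(rnorm(10, mean = 0.80, sd = 0.05),
               rnorm(10, mean = 0.70, sd = 0.05))
)

# The formula interface splits accuracy by the two levels of group
result <- t.test(accuracy ~ group, data = DATA_NAME)
print(result)

# Individual pieces of the result can be extracted by name
result$p.value
result$estimate
```

By default t.test() runs Welch's test, which does not assume equal variances; passing var.equal = TRUE gives the classic Student's t-test instead.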

Select only numeric variables of a data frame in R [duplicate]

This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know that the question is very easy, but I have a more specific one:
I have a data frame, with 50 variables (numeric and non-numeric) and 5000 observations.
Now what I want to do is create another data frame containing only the numeric variables of the original one.
On this website I found a solution to my problem, which is:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does it not work if I try it like this instead? What's wrong?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
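apply() first converts the data frame to a matrix. A matrix can hold only one type, so with any non-numeric column everything becomes character and is.numeric returns FALSE for every column. Use sapply (or lapply), which works column by column. Try this:
names_num <- names(which(sapply(original_data, is.numeric)))
df_num <- original_data[, names_num, drop = FALSE]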
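The difference between the two approaches can be checked on a small made-up data frame. The column names and values below are invented for illustration.

```r
# A tiny mixed-type data frame (hypothetical columns)
original_data <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# apply() goes through as.matrix(), which turns every column into character
via_apply <- apply(original_data, 2, is.numeric)

# sapply() visits each column as it is stored in the data frame
via_sapply <- sapply(original_data, is.numeric)

print(via_apply)   # FALSE for every column
print(via_sapply)  # TRUE for id and score, FALSE for name

# Logical indexing with the sapply() result keeps only the numeric columns
numeric_only <- original_data[via_sapply]
str(numeric_only)
```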

how to use a quoted string in a t_test function [duplicate]

This question already has answers here:
Dynamically select data frame columns using $ and a character value
(10 answers)
How to write a loop to run the t-test of a data frame?
(5 answers)
Closed 3 years ago.
I have a test data file and I would like to run a t-test on it.
This works, as I already know:
t.test(df_test$clin_value ~ df_test$trt_variable)
But I would like to do this:
trt_var = "trt_variable"
noquote(trt_var) # which gives me trt_variable
Why can't I run this?
t.test(df_test$clin_value ~ df_test$(noquote(trt_var)))
How can I make this work?
I have to do it this way because I want to change trt_var frequently.
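The $ operator only accepts a literal column name, so it cannot take the result of noquote(). Two common workarounds are indexing with [[ and a character value, or building the formula from strings with reformulate(). The sketch below uses simulated data, since the asker's file is not shown; only the names df_test, clin_value, trt_variable and trt_var come from the question.

```r
# Simulated stand-in for the asker's data (values are invented)
set.seed(7)
df_test <- data.frame(
  clin_value   = rnorm(20),
  trt_variable = rep(c("A", "B"), each = 10)
)

trt_var <- "trt_variable"

# Option 1: [[ ]] accepts a character value where $ does not
res1 <- t.test(df_test$clin_value ~ df_test[[trt_var]])

# Option 2: build the formula clin_value ~ trt_variable from strings
fml  <- reformulate(trt_var, response = "clin_value")
res2 <- t.test(fml, data = df_test)

print(res2)
```

Both calls run the same test, so changing trt_var to another column name is enough to switch the grouping variable.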

Difference between two lists to create a dataset [duplicate]

This question already has answers here:
Find complement of a data frame (anti - join)
(7 answers)
Closed 5 years ago.
I have a dataset, loaded with mushrooms <- read.csv("mushrooms.csv"), and I already have a mushrooms.training_set which is 1/3 of the whole dataset. For both variables, typeof() returns list.
Now I want to select the rows in the original dataset mushrooms that are not in mushrooms.training_set. How would I do this? I have tried the following:
mushrooms[c(!mushrooms.training_set),] but this returns something in the order of 64K rows.
mushrooms[!mushrooms.training_set,]
mushrooms[!duplicated(mushrooms.training_set)]
Can anyone help me out?
From where you are in the question, you can use dplyr::setdiff:
library(dplyr)
mushrooms.test = setdiff(mushrooms, mushrooms.training_set)
But most of the time it's easier to create the test set at the same time as the training set. There are lots of examples at How to split data into training and test sets?
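A minimal base R sketch of creating both sets at once, by sampling row indices. The mushrooms data frame here is a simulated stand-in for the CSV, and train_idx and mushrooms.test_set are names chosen for illustration.

```r
# Simulated stand-in for read.csv("mushrooms.csv") (columns are invented)
set.seed(123)
mushrooms <- data.frame(
  id    = 1:30,
  class = sample(c("e", "p"), 30, replace = TRUE)
)

# Draw one third of the row numbers for the training set
train_idx <- sample(nrow(mushrooms), size = floor(nrow(mushrooms) / 3))

# Positive indices select the training rows, negative indices select the rest
mushrooms.training_set <- mushrooms[train_idx, ]
mushrooms.test_set     <- mushrooms[-train_idx, ]

nrow(mushrooms.training_set)
nrow(mushrooms.test_set)
```

Because both sets come from the same index vector, they cannot overlap and together cover every row of the original data.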

R, Add binary columns based on values in existing column [duplicate]

This question already has answers here:
Generate a dummy-variable
(17 answers)
Closed 5 years ago.
Beginner in R and looking to avoid unnecessary copy+pasting...
I have a data frame with a numeric column. I would like to create binary columns based on the values in the numeric column.
I know the tedious approach would be to copy+paste the following and manually add the different values:
DataFrame$NewCol1 <- as.numeric(DataFrame$ExistingCol == 1)
DataFrame$NewCol2 <- as.numeric(DataFrame$ExistingCol == 2)
Would a "for" loop be able to accomplish this task?
How about something like this?
model.matrix(~factor(DataFrame$ExistingCol))[,-1]
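Note that dropping the first column with [,-1] removes the intercept, which leaves no indicator column for the first value. If a column is wanted for every value, one possible variant removes the intercept in the formula instead. The sketch below uses a made-up DataFrame; the NewCol naming follows the question.

```r
# Made-up example data with three distinct values
DataFrame <- data.frame(ExistingCol = c(1, 2, 3, 2, 1))

# "- 1" drops the intercept, so every factor level gets its own 0/1 column
dummies <- model.matrix(~ factor(ExistingCol) - 1, data = DataFrame)

# Rename the columns to NewCol1, NewCol2, ... based on the observed values
colnames(dummies) <- paste0("NewCol", levels(factor(DataFrame$ExistingCol)))

# Attach the indicator columns to the original data frame
DataFrame <- cbind(DataFrame, dummies)
print(DataFrame)
```

A for loop over unique(DataFrame$ExistingCol) that assigns as.numeric(DataFrame$ExistingCol == v) to each new column would give the same result; model.matrix simply does it in one call.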
