T test failed in R

I am using R to run a large number of tests on input from a delimited table made up of 40000 rows and 4 columns. I am trying to compute the t-test p-value for each row, but I get the error "data are essentially constant". I tried both a for loop and apply, and had the same issue in both cases. The code is:
NormData3= NormData1[1:40000,1:5]
for(i in 1:nrow(NormData3)) {
g1=NormData3[i,2:3]
g2=NormData3[i,4:5]
p[i]=t.test(g1,g2,var.equal=TRUE)$p.value
}
I don't know what the problem is.

It's nice that the software recognizes situations in which a sensible
answer can't be computed. At that point, there are two possible actions:
(1) stop with an informative error, and (2) silently return NA.
If you are running this in an iterative loop, you want the second behaviour. Here is a small function for that:
my.t.test.p.value <- function(...) {
obj<-try(t.test(...), silent=TRUE)
if (inherits(obj, "try-error")) return(NA) else return(obj$p.value)
}
Use this function instead of t.test in your code. This will not disturb your loop and allows it to continue.
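As a minimal sketch of how this fits into the loop from the question: the small data frame below is made up for illustration (its second row is constant, which triggers the "data are essentially constant" error), and the result vector p is preallocated, which the original loop did not show.

```r
# Hypothetical data: row 2 is constant, so t.test() fails on it
NormData3 <- data.frame(id = 1:3,
                        a1 = c(1.0, 5, 2.1), a2 = c(1.4, 5, 2.5),
                        b1 = c(2.2, 5, 3.9), b2 = c(2.9, 5, 3.1))

# Wrapper from the answer: NA instead of an error
my.t.test.p.value <- function(...) {
  obj <- try(t.test(...), silent = TRUE)
  if (inherits(obj, "try-error")) NA_real_ else obj$p.value
}

# Preallocate the result vector, then fill it row by row
p <- rep(NA_real_, nrow(NormData3))
for (i in seq_len(nrow(NormData3))) {
  g1 <- as.numeric(NormData3[i, 2:3])
  g2 <- as.numeric(NormData3[i, 4:5])
  p[i] <- my.t.test.p.value(g1, g2, var.equal = TRUE)
}
```

Rows where the test cannot be computed end up as NA in p, so they are easy to find afterwards with which(is.na(p)).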

Related

Problems with breaking for loop in R

I am currently having a problem with R not following my break command.
Here are my two inputs:
Elements should not be bigger than 16, but it is returning two elements bigger than 16:
for (ndx in calc:length(b)) {
print(calc)
if(calc >16) {break}
}
For this one, I should not be getting elements in my loop that are >50 and <6, but am getting them anyways:
for(ndx in a){
print (a^2)
if (a>50 && a<6) {next}}
Could anyone tell me what I am doing wrong?
For the first one, here is the loop with the variables replaced by more generic names (I'm assuming that calc is an integer, otherwise the : operator shouldn't work anyway):
for (i in j:k) {
print(j)
if (j>16) break
}
Note that the value of j is not changing during the loop (the index variable i is never used in the loop, and no variables get modified in the loop, so nothing changes due to the loop body except for the index variable). So if j>16 it will be printed exactly once (provided length(j:k) is at least 1). Otherwise the loop will never break and j will be printed length(j:k) times.
Maybe
for (i in j:k) {
print(i)
if (i>16) break
}
is what you had in mind??
You do an unconditional print first, then you test the condition. Surely you should test the condition first, before printing?
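Putting both points together, here is a minimal sketch. It assumes a is a numeric vector (the values below are made up). The second loop in the question also tests the whole vector a rather than the loop variable, and no value can be both >50 and <6, so the sketch assumes the intent was "greater than 50 or less than 6".

```r
# Hypothetical input vector (not from the original question)
a <- c(3, 7, 12, 60, 18, 4)

# First loop: test the loop variable, and test before printing
shown <- c()
for (x in a) {
  if (x > 16) break          # stop at the first element above 16
  shown <- c(shown, x)
  print(x)
}

# Second loop: skip elements above 50 or below 6, square the rest
squares <- c()
for (x in a) {
  if (x > 50 || x < 6) next  # '||' because both conditions cannot hold at once
  squares <- c(squares, x^2)
}
```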

R Prompt User for Number Input

I'm trying to write an R script that asks the user to input a number that will be stored for later use. I'm struggling with storing the value though. Below is what I have written.
numberOfStudents <- function()
{
s <- readline("How many incoming students are there? ")
return(as.integer(s))
}
print(numberOfStudents())
print(s)
If I store 500 as the value after running print(numberOfStudents()), print(s) returns "Error in print(s) : object 's' not found".
Any suggestions?
You are almost there.
The problem is that the input is saved in a variable inside the function and then printed, but the variable s is never made available outside the function.
The function numberOfStudents lives in your current environment, while s exists only inside the function. You could consider a few options, but the best idea is to return a value to the current environment and assign it there.
numberOfStudents <- function()
{
s <- readline("How many incoming students are there? ")
return(as.integer(s))
}
print(numberOfStudents())
Now you can call your function and assign its result to a variable, which will persist outside of the function:
newStudentList<-numberOfStudents() # the function output is now saved in this variable
Remember that you always need to assign the result of a function to something in the current environment to capture it. If you run sd() without assigning its result to a variable, it just prints the standard deviation and you lose access to that value.
x<-sd(samples) # the returned value is saved to x
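A minimal sketch of the scoping point, assuming a default argument stands in for readline() so the example runs without user input (the input parameter is hypothetical, not part of the original function):

```r
# 'input' is a hypothetical stand-in for the value readline() would return
numberOfStudents <- function(input = "500") {
  s <- input        # s is local: it disappears when the function returns
  as.integer(s)     # the value of the last expression is returned
}

n <- numberOfStudents()  # capture the returned value in the calling environment
print(n)
```

Here n holds the number for later use, while s is never visible outside the function.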

R - Assigning "NA" to objects 'not found' inside a function; is it possible?

I am running a data set (in the example, "dataobject") through several different functions in R and concatenating the numeric results at the end. See:
a<-median((function1(x=1,dataobject,reps=500)),na.rm=TRUE)
b<-median((function2(x=1,dataobject,reps=500)),na.rm=TRUE)
c<-median((function3(x=1,dataobject,reps=500)),na.rm=TRUE)
d<-median((function4(x=1,dataobject,reps=500)),na.rm=TRUE)
e<-median((function5(x=1,dataobject,reps=500)),na.rm=TRUE)
f<-median((function6(x=1,dataobject,reps=500)),na.rm=TRUE)
c(a,b,c,d,e,f)
However, some of the functions cannot be run with the data set I am using, and so they return an error; e.g. "function3" can't be run so when it gets to the concatenation step it gives "Error: object 'e' not found" and does not return anything. Is there any way to tell R at the concatenation step to assign a value of "NA" to an object that is not found and continue to run the rest of the code instead of stopping? So that the return would be
[1] 99.233 75.435 77.782 92.013 NA 97.558
A simple question, but I could not find any other instances of it being asked. I originally tried to set up a function to run everything and output the concatenated results, but ran into the same problem (when a function can't be run, the entire wrapper function stops as well and I don't know how to tell R to skip something it can't compute).
Any thoughts are greatly appreciated! Thanks!
A couple of solutions I can think of:
One option is to initialize all the variables you plan to use, so they have the default value that you want.
a <- b <- c <- d <- e <- f <- NA
Then run your code. If an error occurs, the affected variable will still hold NA.
Another option is to use tryCatch. If you are not familiar with it, I recommend reading up on it. It lets you handle errors.
Here is an example based on your code:
tryCatch({
a<-median((function1(x=1,dataobject,reps=500)),na.rm=TRUE)
},
error = function(err){
print("Error in evaluating a. Initializing it to NA")
a <<- NA
})
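The same idea can be wrapped in a small helper so that each call yields NA on failure without repeating the tryCatch block. This is only a sketch: safe_median, f_ok and f_bad are hypothetical names standing in for the function1 ... function6 calls in the question.

```r
# Hypothetical stand-ins for the functions in the question
f_ok  <- function(x) c(1, 5, 9) * x
f_bad <- function(x) stop("cannot be computed for this data set")

# Evaluate the expression; return NA if anything inside it fails.
# 'expr' is evaluated lazily, so the error is raised inside tryCatch.
safe_median <- function(expr) {
  tryCatch(median(expr, na.rm = TRUE), error = function(err) NA_real_)
}

res <- c(a = safe_median(f_ok(1)),
         b = safe_median(f_bad(1)),
         c = safe_median(f_ok(2)))
```

Because the helper returns a value instead of assigning with <<-, the results can be concatenated directly.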

FOR loops giving no result or error in R

I am running the following code:
disc<-for (i in 1:33) {
m=n[i]
xbar<-sum(data[i,],na.rm=TRUE)/m
Sx <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
Sx
i=i+1}
Running it:
>disc
NULL
Why is it giving me NULL?
This is from the documentation for for, accessible via ?`for`:
‘for’, ‘while’ and ‘repeat’ return ‘NULL’ invisibly.
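A quick way to see this, as a minimal sketch with made-up variable names:

```r
# Assigning the "value" of a for loop always yields NULL
out <- for (i in 1:3) i * 2
is_null <- is.null(out)

# The loop variable and anything assigned inside the loop do persist
last <- NULL
for (i in 1:3) last <- i * 2
```

After running this, is_null is TRUE, while last holds the value from the final iteration only.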
Perhaps you are looking for something along the following lines:
library(plyr)
disc <- llply(1:33, function(i) {
m=n[i]
xbar<-sum(data[i,],na.rm=TRUE)/m
Sx <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
Sx
})
Other variants exist -- the ll in llply stands for "list in, list out". Perhaps your intended final result is a data frame or an array -- appropriate functions exist.
The code above is a plain transformation of your example. We might be able to do better by splitting data right away and forgetting the otherwise useless count variable i (untested, as you have provided no data):
disc <- daply(cbind(data, n=n), .(), function(data.i) {
m=data.i$n
xbar<-sum(data.i,na.rm=TRUE)/m
sqrt(sum((data.i-xbar)^2,na.rm=TRUE)/(m-1))
})
See also the plyr website for more information.
Related (if not a duplicate): R - How to turn a loop to a function in R
krlmlr's answer shows you how to fix your code, but to explain your original problem in more abstract terms: a for loop allows you to run the same piece of code multiple times, but it doesn't store the results of running that code for you; you have to do that yourself.
Your current code only really assigns a single value, Sx, for each run of the for loop. On the next run, a new value is put into the Sx variable, so you lose all the previous values. At the end, you'll just end up with whatever the value of Sx was on the last run through the loop.
To save the results of a for loop, you generally need to add them to a vector as you go through, e.g.
# Create the empty results vector outside the loop
results = numeric(0)
for (i in 1:10) {
current_result = 3 + i
results = c(results, current_result)
}
In R, for can't return a value. One way to get a value back is to wrap the loop in a function and return the result from that function. For example:
getSx <- function(){
Sx <- numeric(33) # one slot per row, so earlier results are not overwritten
for (i in 1:33) {
m=n[i]
xbar <- sum(data[i,],na.rm=TRUE)/m
Sx[i] <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
}
Sx
}
Then you call it:
getSx()
Of course, you can avoid the side effects of using a for loop by using lapply or by writing a vectorized solution. But this is another problem: you should probably give a reproducible example and explain a little what you are trying to compute.
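For illustration, here is a minimal sketch of the sapply variant in base R. The matrix below is made up, and it assumes that n holds the number of non-missing values in each row, which the question does not state.

```r
# Hypothetical data: 3 rows with some missing values
data <- matrix(c(1, 2, 3, NA,
                 4, 4, 4, 4,
                 2, 4, NA, NA), nrow = 3, byrow = TRUE)
n <- rowSums(!is.na(data))   # assumed meaning of n: observed values per row

# sapply collects the value of each function call into a vector
disc <- sapply(seq_len(nrow(data)), function(i) {
  m <- n[i]
  xbar <- sum(data[i, ], na.rm = TRUE) / m
  sqrt(sum((data[i, ] - xbar)^2, na.rm = TRUE) / (m - 1))
})
```

Under that assumption about n, the result matches apply(data, 1, sd, na.rm = TRUE), which may be all that is needed.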

R Script - How to Continue Code Execution on Error

I have written an R script which includes a loop that retrieves external (web) data. The format of the data is the same most of the time, but sometimes it changes in an unpredictable way and my loop crashes (stops running).
Is there a way to continue code execution regardless of the error? I am looking for something similar to "On Error Resume Next" from VBA.
Thank you in advance.
Use try or tryCatch.
for(i in something)
{
res <- try(expression_to_get_data)
if(inherits(res, "try-error"))
{
#error handling code, maybe just skip this iteration using
next
}
#rest of iteration for case of no error
}
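A runnable version of this pattern, as a minimal sketch: get_data here is a hypothetical function that fails on one input, standing in for the web retrieval in the question.

```r
# Hypothetical data source that fails on one input
get_data <- function(i) {
  if (i == 2) stop("unexpected format")
  i * 10
}

results <- rep(NA_real_, 3)              # NA marks iterations that failed
for (i in 1:3) {
  res <- try(get_data(i), silent = TRUE)
  if (inherits(res, "try-error")) next   # skip this iteration, keep the NA
  results[i] <- res
}
```

The loop finishes all three iterations, and the failed one is left as NA in results.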
The modern way to do this uses purrr::possibly.
First, write a function that gets your data, get_data().
Then modify the function to return a default value in the case of an error.
library(purrr)
get_data2 <- possibly(get_data, otherwise = NA)
Now call the modified function in the loop.
for(i in something) {
res <- get_data2(i)
}
You can use try:
# a has not been defined
for(i in 1:3)
{
if(i==2) try(print(a),silent=TRUE)
else print(i)
}
How about the solutions from this related question:
Is there a way to `source()` and continue after an error?
Either parse(file = "script.R") followed by a loop'd try(eval()) on each expression in the result.
Or the evaluate package.
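The first option can be sketched as follows. The script contents are made up and passed via parse(text = ...) so the example is self-contained; with a real file you would use parse(file = "script.R") instead.

```r
# Hypothetical script contents; the second expression fails
exprs <- parse(text = "x <- 1\nstop('boom')\ny <- x + 1")

# Evaluate each top-level expression separately, so one failure
# does not prevent the following expressions from running
env <- new.env()
for (e in exprs) {
  try(eval(e, envir = env), silent = TRUE)
}
```

After the loop, both x and y exist in env even though the middle expression raised an error.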
If all you need to do is a small piece of clean up, then on.exit() may be the simplest option. It will execute the expression "when the current function exits (either naturally or as the result of an error)" (see ?on.exit).
For example, the following will delete my_large_dataframe regardless of whether output_to_save gets created.
on.exit(rm("my_large_dataframe"))
my_large_dataframe = function_that_does_not_error()
output_to_save = function_that_does_error(my_large_dataframe)
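Since on.exit() is tied to the exit of a function, it is easiest to see inside one. A minimal sketch, where process and the cleaned_up flag are hypothetical names used only to make the effect observable:

```r
cleaned_up <- FALSE

process <- function() {
  # Registered clean-up runs when process() exits, even after an error
  on.exit(cleaned_up <<- TRUE)
  stop("something went wrong")
}

# try() keeps the error from stopping the script
res <- try(process(), silent = TRUE)
```

Here res is a "try-error" object, and cleaned_up has been set to TRUE by the clean-up expression.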
