Why is the for loop returning NA vectors in some positions (in R)?

Following a YouTube tutorial, I have created a vector x [-3,6,2,5,9].
Then I create an empty variable 'Storage2' of length 5 with the function 'numeric(5)'.
I want to store the squares of my vector x in 'Storage2' with a for loop.
When I do the for loop and update my variable, it returns a very strange thing:
[1] 9 4 0 9 25 36 NA NA 81
I can see all numbers in x have been squared, but the order is so random, and there's more than 5.
Also, why are there NAs?? If it's because the last number of x is 9 (and so this number defines the length??), and there's no 7 and 8 position, I would understand, but then I'm also missing positions 1, 3 and 4, so there should be more NAs...
I'm just starting with R, so please keep it simple, and correct me if I'm wrong during my thought process! Thank you!!
x <- c(-3,6,2,5,9)
Storage2 <- numeric(5)
for(i in x){
Storage2[i] <- i^2
}
Storage2
# [1] 9 4 0 9 25 36 NA NA 81

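You're looping over the elements of x, not over its positions as you probably intended. With i taking the values -3, 6, 2, 5 and 9, Storage2[i] writes to exactly those index positions. You need to change your loop like so: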
for(i in 1:length(x)) {
Storage2[i] <- x[i]^2
}
Storage2
# [1] 9 36 4 25 81
(Note: 1:length(x) can also be written as seq_along(x), as pointed out by @NelsonGon in the comments; it also behaves correctly when x is empty and might be faster.)
However, R is a vectorized language, so you can simply do this:
Storage2 <- x^2
Storage2
# [1] 9 36 4 25 81
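To see where the strange output in the question comes from, it can help to replay the original loop by hand. The sketch below uses a scratch vector s (a name chosen here for illustration) and applies the same five assignments one at a time; it relies on two standard indexing rules: a negative index means "everything except that position", and assigning past the end of a vector grows it, padding the gap with NA.

```r
s <- numeric(5)      # 0 0 0 0 0

s[-3] <- (-3)^2      # negative index: every position except the 3rd gets 9
# now: 9 9 0 9 9
s[6] <- 6^2          # position 6 does not exist yet, so the vector grows
# now: 9 9 0 9 9 36
s[2] <- 2^2          # overwrites position 2
s[5] <- 5^2          # overwrites position 5
# now: 9 4 0 9 25 36
s[9] <- 9^2          # grows again; positions 7 and 8 are padded with NA
s
# [1]  9  4  0  9 25 36 NA NA 81
```

So the result is not random: each value of x was used as a position, not as a counter.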

Related

Remove missing values with user defined function

I have a dataset as data with missing values.
a <- sample(1:100,15)
b <- sample(1:20,15)
data <- data.frame(a,b)
data[c(3,6,8,12),2] <- NA
data
Now I want to delete the rows with missing values by one variable at a time. (Don't want to use na.omit() ). I have written the following function, but it's not working.
rmv_missing <- function(y,z){
z <- z[is.na(z$y) == TRUE,]
return(z)
}
rmv_missing("b",data)
Also tried this one...
library(dplyr)
na_values <- function(x,y,z){
z <- (filter(z,!is.na(y)))
return(z)
}
rmv_missing("b",data)
Neither of these functions works. Could someone help me understand where I made the mistake and how to fix the code? Thanks in advance.
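First, you don't need the "== TRUE", since is.na() already returns a logical vector. The other problem is that z$y does not work inside the function: $ looks for a column literally named "y" instead of using the column name stored in y. Index with z[, y] instead, et voilà. (Note that, like your original, this returns the rows where the column is NA; negate the condition with ! to drop them instead.)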
rmv_missing <- function(y,z)
{
print(z$y) # Does not work
print(z[, y]) # Works
z[is.na(z[, y]),]
}
rmv_missing("b",data)
# NULL
# [1] 9 16 NA 5 13 NA 8 NA 11 17 20 NA 10 12 1
# a b
# 3 33 NA
# 6 59 NA
# 8 81 NA
# 12 26 NA
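Since the question asks to delete the rows with missing values, a minimal sketch of the negated version might look like this. The function name drop_na_rows and the small example data frame are made up for illustration; the column name is passed as a string, as in the question.

```r
# drop_na_rows is a hypothetical name, not part of base R or dplyr.
# df:  a data frame
# col: the name of one column, given as a string
drop_na_rows <- function(df, col) {
  # df[[col]] extracts the column by the name stored in col;
  # !is.na(...) keeps only the rows where that column is present
  df[!is.na(df[[col]]), , drop = FALSE]
}

# small made-up example
data <- data.frame(a = 1:6, b = c(10, NA, 30, NA, 50, 60))
cleaned <- drop_na_rows(data, "b")
cleaned
```

Only column b is checked here, so rows with NA in other columns are kept, which matches the "one variable at a time" requirement.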

Is there a way to create a permutation of a vector without using the sample() function in R?

I hope you are having a nice day. I would like to know if there is a way to create a permutation (rearrangement) of the values in a vector in R?
My professor provided us with an assignment in which we are supposed to create functions for a randomization test, one using sample() to create a permutation and one not using the sample() function. So far all of my efforts have been fruitless, as any answer that I can find always resorts to the sample() function. I have tried several other methods, such as indexing with runif() and writing my own functions, but to no avail. Alas, I have accepted defeat and come here for salvation.
While using the sample() function, the code looks like:
#create the groups
a <- c(2,5,5,6,6,7,8,9)
b <- c(1,1,2,3,3,4,5,7,7,8)
#create a permutation of the combined vector without replacement using the sample function()
permsample <-sample(c(a,b),replace=FALSE)
permsample
[1] 2 5 6 1 7 7 3 8 6 3 5 9 2 7 4 8 1 5
And, for reference, the entire code of my function looks like:
PermutationTtest <- function(a, b, P){
sample.t.value <- t.test(a, b)$statistic
perm.t.values<-matrix(rep(0,P),P,1)
N <-length(a)
M <-length(b)
for (i in 1:P)
{
permsample <-sample(c(a,b),replace=FALSE)
pgroup1 <- permsample[1:N]
pgroup2 <- permsample[(N+1) : (N+M)]
perm.t.values[i]<- t.test(pgroup1, pgroup2)$statistic
}
return(mean(perm.t.values))
}
How would I achieve the same thing, but without using the sample() function and within the confines of base R? The only hint my professor gave was "use indices." Thank you very much for your help and have a nice day.
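You can use runif() to generate a value between 1 and one more than the number of elements still remaining. The floor() function returns the integer part of that number, which gives an index from 1 to the number of remaining elements. At each iteration, I append the element at the rn-th position of the pool to the new vector, remove it from the pool, and decrease the range of the random number by one.
a <- c(2,5,5,6,6,7,8,9)
b <- c(1,1,2,3,3,4,5,7,7,8)
pool <- c(a,b) # not named c, so that the c() function is not shadowed
index <- length(pool)
perm <- c()
for(i in 1:length(pool)){
# max is index+1 so that floor() can return index itself;
# with max=index the last remaining element could never be drawn
rn <- floor(runif(1, min=1, max=index+1))
perm <- append(perm, pool[rn])
pool <- pool[-rn]
index <- index-1
}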
It is easier to see what is going on if we use consecutive numbers:
a <- 1:8
b <- 9:17
ab <- c(a, b)
ab
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Now draw 17 (length(ab)) random numbers and use them to order ab:
rnd <- runif(length(ab))
ab[order(rnd)]
# [1] 5 13 11 12 6 1 17 3 10 2 8 16 7 4 9 15 14
rnd <- runif(length(ab))
ab[order(rnd)]
# [1] 14 11 5 15 10 7 13 9 17 8 2 6 1 4 16 12 3
For each permutation just draw another 17 random numbers.
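Applied to the groups from the question, the order(runif()) idea could be wrapped in a small helper and used in place of the sample() call inside the loop. This is only a sketch; permute is a hypothetical name, not a base R function.

```r
# permute is a hypothetical helper: it shuffles v without calling sample()
# by drawing one random number per element and sorting by those numbers
permute <- function(v) v[order(runif(length(v)))]

a <- c(2, 5, 5, 6, 6, 7, 8, 9)
b <- c(1, 1, 2, 3, 3, 4, 5, 7, 7, 8)

perm <- permute(c(a, b))

# split the shuffled values back into two groups of the original sizes
pgroup1 <- perm[seq_along(a)]
pgroup2 <- perm[length(a) + seq_along(b)]
```

Every call to permute() draws fresh random numbers, so each iteration of the permutation loop gets a new arrangement.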

R triangular numbers function

While working on a small program for calculating the right triangular number that fulfils an equation, I stumbled over a page that holds documentation on the function Triangular()
Triangular function
When I tried to use this, RStudio said it couldn't find it, and I can't seem to find any other information about which library it could be in.
Does this function even exist and/or are there other ways to fill a vector with triangular numbers?
Here is a base R solution to define your custom triangular number generator, i.e.,
myTriangular <- function(n) choose(seq(n),2)
or
myTriangular <- function(n) cumsum(seq(n)-1)
such that
> myTriangular(10)
[1] 0 1 3 6 10 15 21 28 36 45
If you would like to use Triangular() from package Zseq, then please try
Zseq::Triangular(10)
such that
> Zseq::Triangular(10)
Big Integer ('bigz') object of length 10:
[1] 0 1 3 6 10 15 21 28 36 45
It's pretty easy to do it yourself:
triangular <- function(n) sapply(1:n, function(x) sum(1:x))
So you can do:
triangular(10)
# [1] 1 3 6 10 15 21 28 36 45 55
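Note that the two answers start at different points: the first sequence begins at 0 and the second at 1. If the sequence should start at 1, the closed form k(k+1)/2 gives the same numbers without summing. A minimal sketch, with triangular_closed as a made-up name:

```r
# triangular_closed is a hypothetical name; it returns the first n
# triangular numbers, starting at 1, using the formula k * (k + 1) / 2
triangular_closed <- function(n) {
  k <- seq_len(n)
  k * (k + 1) / 2
}

triangular_closed(10)
# [1]  1  3  6 10 15 21 28 36 45 55
```

Because the arithmetic is vectorized over k, no loop or sapply() call is needed.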

mysterious values as output in vector R using if and else

I have a vector of values (numbers only). I want to split up this vector into two vectors. One vector will contain values less than the average of the original vector, the other will contain values more than the average of the original vector. I have the following as a test R script:
v <- c(1,1,4,6,3,67,10,194,847)
#Initialize
v.in<- c(rep(0),length(v))
v.out<- c(rep(0),length(v))
for (i in 1:length(v))
{
if (v < 0.68 * mean(v))
{
v.in[i] <- v[i]
}
else
{
v.out[i] <- v[i]
}
}
v.in
v.out
## <https://gist.github.com/8a6747ea9b7421161c43>
I get the following result:
9: In if (v < 0.68 * mean(v)) { :
the condition has length > 1 and only the first element will be used
> v.in
[1] 1 1 4 6 3 67 10 194 847
> v.out
[1] 0 9
> v
[1] 1 1 4 6 3 67 10 194 847
>
Clearly, 0 and 9 are not values of any of the elements in v.
Any suggestions what is going on and how to fix this?
Thanks,
Ed
@BenBolker pointed out in the comments why your code doesn't work: you need to select a single element from v when using if, that is, test v[i] rather than the whole vector v. However, you might find split a better function for a task like this:
split(v,v<0.68*mean(v))
$`FALSE`
[1] 194 847
$`TRUE`
[1] 1 1 4 6 3 67 10
The answer to the mystery of v.out is that its branch never gets selected, so it never gets changed. It therefore retains its initial value, which c(rep(0),length(v)) sets (presumably by mistake) to a single 0 followed by the length of the vector (9), rather than the nine zeros that rep(0, length(v)) would give, as I suspect you intended.
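Putting both fixes together, a corrected version of the original loop might look like the sketch below. The variable cutoff is introduced here only for readability. As far as I know, recent versions of R (4.2.0 and later) turn the "condition has length > 1" warning into an error, so the unindexed test would stop the script there.

```r
v <- c(1, 1, 4, 6, 3, 67, 10, 194, 847)
cutoff <- 0.68 * mean(v)     # cutoff is a name chosen for this sketch

# nine zeros each, instead of c(rep(0), length(v)) which gives 0 9
v.in  <- rep(0, length(v))
v.out <- rep(0, length(v))

for (i in seq_along(v)) {
  if (v[i] < cutoff) {       # test one element at a time
    v.in[i] <- v[i]
  } else {
    v.out[i] <- v[i]
  }
}
v.in
# [1]  1  1  4  6  3 67 10  0  0
v.out
# [1]   0   0   0   0   0   0   0 194 847

# the same split without a loop and without the zero padding
below <- v[v < cutoff]
above <- v[v >= cutoff]
```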

returning a list in R and functional programming behavior

I have a basic question regarding functional programming in R.
Given a function that returns a list, such as:
myF <- function(x){
return (list(a=11,b=x))
}
Why is it that, when calling the function with a range or vector, the returned list always has the same length for 'a'?
Ex:
myF(1:10)
returns:
$a
[1] 11
$b
[1] 1 2 3 4 5 6 7 8 9 10
How can one change the behavior so that 'a' has the same length as 'b'?
I am actually working with a bunch of S4 objects that I cannot easily convert to a list (using as.list), so _apply is not my first choice.
Thanks for any insight or help!
EDIT (Added further explanations)
I am not necessarily looking to just pad 'a' to makes its length equal to b's. However using the solution
as.list(data.frame(a=myA,b=x)) pads the 'a' with the same value computed first.
myF <- function(x){
myA = ceiling(runif(1, max=100))
return (as.list(data.frame(a=myA
,b=x)))
}
myF(1:10)
$a
[1] 79 79 79 79 79 79 79 79 79 79
$b
[1] 1 2 3 4 5 6 7 8 9 10
I still am not sure why that happens!
Thanks
Are you just looking to have 11 repeated so that a is the same length as b? If so:
> myF <- function(x){
+ return (list(a=rep(11,length(x)),b=x))
+ }
> myF(1:10)
$a
[1] 11 11 11 11 11 11 11 11 11 11
$b
[1] 1 2 3 4 5 6 7 8 9 10
EDIT based on OP's clarification/comments. If you want 'a' to instead be a random vector with length equal to 'b':
> myF <- function(x){
+ return (list(a=ceiling(runif(length(x),max=100)),b=x))
+ }
> myF(1:10)
$a
[1] 4 31 8 45 25 74 36 95 64 32
$b
[1] 1 2 3 4 5 6 7 8 9 10
I don't quite understand what you mean by not being able to use as.list. You should be able to get a version of your function satisfying the requirement that all components of the list be equally long by doing:
myF <- function(x){
return(as.list(data.frame(a=11,b=x)))
}
EDIT:
The reason list does not work the way you expect is that list applied to a number of lists, vectors, etc. is just that: a list of those lists, vectors, etc. It does not "inspect" their structure.
What I think you want is the additional semantics that the vectors contained in the list should match up and produce a set of "rows", each with one corresponding element from each of your vectors. This is exactly what a data frame is supposed to be (and indeed, I think, how a data frame is represented in R). The final as.list call does little but change what type it's tagged as.
EDIT2:
Note that if I'm wrong above (and that's not the general behaviour you want), then Mac's solution is more appropriate: it gives you two vectors of the same length without implying that they should "line up".
In that case, using a data.frame would be confusing to anyone reading the code (since a data.frame implies that you think of your vectors as matching up), and it would also force any additional elements you add to the list to be converted into vectors of the appropriate length (which may or may not be what you want).
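The difference between the two constructors can be checked directly. In this small sketch (plain_list and df_list are names made up here), list() stores the length-one value as it is, while data.frame() recycles it to match the longer column. That recycling is also why a single random draw gets repeated in the question's edit.

```r
x <- 1:3

# list() keeps each argument exactly as given
plain_list <- list(a = 11, b = x)
length(plain_list$a)   # 1

# data.frame() recycles the length-one column to the number of rows
df_list <- as.list(data.frame(a = 11, b = x))
length(df_list$a)      # 3
```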
In case I did not understand you correctly last time, here is another possibility:
If you want to generate a second vector, given some function/expression, of the same length as your argument you could do something like:
myF <- function(x){
return (list(a=replicate(length(x),f),b=x))
}
in your example f could be runif(1, max=100), though in the specific case of runif you could explicitly tell it to generate a vector of appropriate length by calling runif(length(x), max=100) inside the function.
replicate simply re-evaluates f the number of times you request, and gives you the vector of all the results.
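As a concrete instance, here is a sketch with the placeholder f filled in by the runif expression from the question. The variable res is introduced only to hold the result.

```r
myF <- function(x) {
  # the expression is re-evaluated once per element of x,
  # so each entry of a is a separate random draw
  list(a = replicate(length(x), ceiling(runif(1, max = 100))), b = x)
}

res <- myF(1:10)
lengths(res)   # both components have length 10
```

The values in a differ from run to run, but its length always follows the length of x.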
It appears that your function is "hard coding" a. So no matter what you specify it will always give 11.
If for example you changed the function to:
myF <- function(x){ return (list(a=x,b=x)) }
myF(1:10)
$a
[1] 1 2 3 4 5 6 7 8 9 10
$b
[1] 1 2 3 4 5 6 7 8 9 10
a is allowed to change like b.
or
myF <- function(x,y){ return (list(a=y,b=x)) }
myF(10:1,1:10)
$a
[1] 1 2 3 4 5 6 7 8 9 10
$b
[1] 10 9 8 7 6 5 4 3 2 1
Now a is allowed to change independent of b.
