Function in R does not return vector - r

I am sure this is a really dumb question with a simple answer, but I have been banging my head against the desk for an hour now. The goal is to write a simple function that returns a vector of n length consisting of integers spaced as evenly as possible, from 1 to k. So:
place_in_groups <- function(n, k){
  rate = (n - 1) / (k - 1)
  vect <- round(seq(from = 1, to = n, by = rate), 0)
  return(vect)
}
When I run the lines inside the function on the outside of the function, it does what I want it to do: creates a vector with the appropriate values. But when I run it inside the function, I get the actual values, not the vector:
place_in_groups(4,5)
[1] 1 2 2 3 4
As I said, I'm sure it is something obvious I'm doing wrong, but it is also something I'm obviously in need of learning.

I'm not sure I understand the question correctly. Try str() on your results. What you are getting is a vector.
vect <- place_in_groups(4,5)
str(vect)
num [1:5] 1 2 2 3 4
What do you want to do with the vector, or what is your challenge?
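A small check, as a sketch, confirms that the function already returns a vector; the [1] in the printed output is just the index of the first element shown on that line, not a sign that something else was returned. The variant place_in_groups2 is a hypothetical alternative (not from the question) that uses the length.out argument of seq() instead of computing the step by hand.

```r
place_in_groups <- function(n, k) {
  rate <- (n - 1) / (k - 1)
  round(seq(from = 1, to = n, by = rate), 0)
}

vect <- place_in_groups(4, 5)
is.vector(vect)   # TRUE
length(vect)      # 5
vect[3]           # individual elements can be indexed as usual

# Hypothetical alternative: let seq() work out the spacing itself
place_in_groups2 <- function(n, k) {
  round(seq(from = 1, to = n, length.out = k))
}
place_in_groups2(4, 5)
```

Anything you can do with a vector (indexing, length(), passing it to another function) should work on the return value directly.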

Related

How do I create a function for finding the Least Common Multiple of a vector of integers using Greatest Common Factor in R?

Suppose I have a function gcf(x,y) that returns the greatest common factor of x and y. So, for example,
gcf(75,85) = 5
Now, I'm trying to create a function lcm(v) that takes a vector of integers and returns the least common multiple.
I know mathematically it will be
lcm(a,b) = a*b/gcf(a,b)
But I have a vector as the argument. How do I start writing the code?
Also, how do I make sure that the vector has at least two integers and no more than 100?
No need to reinvent the wheel. You can try the pracma package and its vectorized functions gcd and Lcm, like
a1 = c(75, 30)
b1 = c(85, 10)
pracma::gcd(a1,b1)
[1] 5 10
pracma::Lcm(a1,b1)
[1] 1275 30
Check View(pracma::gcd) or hit F2 in RStudio to learn from the source code. The second question is a typical if statement:
foo <- function(x, y){
  # validate the lengths; stop() signals an error instead of returning a string
  if(length(x) < 2 || length(x) > 100) stop("Length of x must be at least 2 and not more than 100")
  if(length(y) < 2 || length(y) > 100) stop("Length of y must be at least 2 and not more than 100")
  list(gcd=pracma::gcd(x,y),
       lcm=pracma::Lcm(x,y))
}
foo(a1, b1)
$gcd
[1] 5 10
$lcm
[1] 1275 30
Edit
As you commented, you want to handle three or more numbers. You can start by playing with the following lines of code, e.g. for a vector of length six. The idea is to compute the gcd for all combinations of length 2 and finally reduce the results from left to right.
library(purrr)   # map(); also re-exports the %>% pipe
library(pracma)  # gcd()
a <- c(21, 45, 3, 90, 72, 99)
combn(a, 2, simplify = FALSE) %>%
  map(~gcd(.[1], .[2])) %>%
  Reduce(function(x, y) gcd(x, y), .)
But without any guarantee for correctness of the result.
Package numbers contains number-theoretic functions, and function mLCM() therein does what you want (I think):
> numbers::mLCM(c(20,50,75))
[1] 300
If you want to write your own function, you may still want to take a look at this function; the logic behind it is simple.
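If you would rather write it yourself, a minimal base R sketch could fold the pairwise rule lcm(a,b) = a*b/gcf(a,b) over the vector with Reduce(). The gcf() below is only a stand-in for the function the question already has (written here with Euclid's algorithm), and lcm_vec is a made-up name; the length check covers the "at least two and no more than 100" requirement.

```r
# gcf(): stand-in for the asker's existing function (Euclid's algorithm)
gcf <- function(x, y) {
  while (y != 0) {
    r <- x %% y
    x <- y
    y <- r
  }
  x
}

# lcm_vec(): hypothetical name; folds the pairwise rule over the vector
lcm_vec <- function(v) {
  if (length(v) < 2 || length(v) > 100) {
    stop("v must contain between 2 and 100 integers")
  }
  # divide before multiplying to keep intermediate values small
  Reduce(function(a, b) a / gcf(a, b) * b, v)
}

lcm_vec(c(20, 50, 75))   # 300, matching numbers::mLCM above
```

This assumes positive integers; zeros or non-integer input would need extra checks.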

Multiplying Elements of a Vector one more each time

I am trying to create a vector from another vector where I multiply the numbers in the vector one more each time.
For example if I had (1,2,3) the new vector would be (1, 1 x 2, 1 x 2 x 3)=(1,2,6)
I tried to create a loop for this as seen below. It seems to work for whole numbers but not decimals. I am not sure why.
x <- c(0.99,0.98,0.97,0.96,0.95)
for(i in 1:5){x[i]=prod(x[1:i])}
The result given is 0.9900000 0.9702000 0.9316831 0.8590845 0.7303385
which is incorrect, as prod(x) = 0.8582777, which is not the same as the last element of the vector.
Does anyone know why this is the case? Or have a suggestion for improvement in my code to get the correct answer.
test<-c(1,2,3)
cumprod(test)
[1] 1 2 6
As @akrun suggests, one can achieve the same with:
Reduce("*", test, accumulate = TRUE)
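As for why the loop in the question gives the wrong numbers: it overwrites x while it is still reading from it, so from the third step on prod(x[1:i]) multiplies values that are already running products. A minimal sketch of a fix, assuming you want to keep the loop, writes into a separate vector (out is a name made up here):

```r
x <- c(0.99, 0.98, 0.97, 0.96, 0.95)

out <- numeric(length(x))        # separate result vector; x stays untouched
for (i in seq_along(x)) {
  out[i] <- prod(x[1:i])
}

all.equal(out, cumprod(x))       # TRUE
all.equal(out[5], prod(x))       # TRUE
```

With whole numbers like c(1, 2, 3) the bug does not show, because the overwritten entries up to that point happen to give the same products.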

ifelse with for loop

I would like to traverse through rows of a matrix and perform some operations on data entries based on a condition.
Below is my code
m = matrix(c(1,2,NA,NA,5,NA,NA,1,NA,NA,NA,NA,4,5,NA,NA,NA,NA,NA,NA), nrow = 5, ncol = 4)
if (m[,colSums(!is.na(m)) > 1, drop = FALSE]){
  for(i in 1:4){
    a = which(m[i,] != "NA") - mean(which(!is.na(m[i,])))
    for(j in 2:5){
      b = which(m[j,] != "NA") - mean(which(!is.na(m[j,])))
      prod(a,b)
    }
  }
}
I get a warning message as below in my "if" condition
Warning message:
In if (m[, colSums(!is.na(m)) > 1, drop = FALSE]) { :
the condition has length > 1 and only the first element will be used
I know it returns a vector and I should be using ifelse block. How to incorporate for loops inside ifelse block? It seems to be a basic question, I am new to R.
Based on your description, you want to count the non-NA values in each column of the matrix and then do something depending on the result (that's why you need an "if"/"ifelse" statement). You can implement this as below, and put the inner loops in a separate function.
yourFunc <- function(x, data) {
  # do what you want here, e.g. your loops on "data"
  # as a sample, this just checks the count
  if(x > 1) 1
  else 0
}
m = matrix(c(1,2,NA,NA,5,NA,NA,1,NA,NA,NA,NA,4,5,NA,NA,NA,NA,NA,NA), nrow = 5, ncol = 4)
# use "apply" series function in here
sapply(colSums(!is.na(m)), yourFunc, data=m)
#[1] 1 0 1 0
Actually, I think you need to re-organize your problem and optimize the code; the "ifelse with for loop" may be totally unnecessary.
As you are new to R, I assume that some of the terminology is maybe a bit confusing. So here is a little explanation regarding the if statement.
Let's look at the if condition:
m[,colSums(!is.na(m)) > 1, drop = FALSE]
[,1] [,2]
[1,] 1 NA
[2,] 2 NA
[3,] NA 4
[4,] NA 5
[5,] 5 NA
This is not something if can work with, since an if condition has to be a single boolean value (evaluate to TRUE or FALSE). So why this result? Well, the result of
colSums(!is.na(m))
[1] 3 1 2 0
is a vector of counts of entries that are not NA (= number of TRUEs in each column). Be careful, as this is not the same as
colSums(m, na.rm = TRUE)
[1] 8 1 9 0
which returns a vector of sums over all five rows for each column, excluding NA's. My guess is that the latter is what you are looking for. In any case: be aware of the difference!
By asking which of those sums is greater than 1 you do get a boolean vector
colSums(!is.na(m)) > 1
[1] TRUE FALSE TRUE FALSE
However, using that boolean vector as a criterion for selecting columns, you correctly get a matrix, which is obviously not boolean:
m[,colSums(!is.na(m)) > 1]
Note: drop = FALSE is unnecessary here, as more than one column is selected and so no dimension can be dropped. See ?"[" or ?drop. You can verify this using identical:
identical(m[,colSums(!is.na(m)) > 1, drop = FALSE],
m[,colSums(!is.na(m)) > 1])
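If the intent of the if was to ask whether any column has more than one non-NA entry, one way to get a single TRUE/FALSE, as a sketch under that assumption, is to wrap the logical vector in any() (or all()). The names keep and m_sub are made up for the example. Note that in recent R versions (4.2.0 and later, as far as I know) a condition of length greater than one is an error rather than a warning.

```r
m <- matrix(c(1,2,NA,NA,5,NA,NA,1,NA,NA,NA,NA,4,5,NA,NA,NA,NA,NA,NA),
            nrow = 5, ncol = 4)

keep <- colSums(!is.na(m)) > 1   # one logical per column
if (any(keep)) {                 # any() collapses it to a single TRUE/FALSE
  m_sub <- m[, keep, drop = FALSE]
  print(dim(m_sub))              # 5 rows, 2 columns
}
```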
Now to the loop. You find tons of discussions on avoiding for loops and using the apply family of functions. I suspect you have to take some time to go through all that. Note however, that using apply - contrary to common belief - is not necessarily superior to a for loop in terms of speed, as it is actually just a fancy wrapper around a for loop (check the source code!). It is, however, clearly superior in terms of code clarity as it is compact and clear about what it is doing. So do try to use apply functions if possible!
In order to rewrite your loop it would be helpful if you could describe in words what you actually want to do, since I assume that what the loop is doing right now is probably not what you want. As which() returns the index/position of an element in a vector or matrix, what you are basically doing is:
indices of the i'th row that are not NA (for a given column) - mean over these indices
While this is theoretically possible, it usually doesn't make much sense. So with all my notes at hand: clearly state your problem so we can think of a fix.

i not showing up as number in loop

so I have a loop that finds the position in the matrix where there is the largest difference in consecutive elements. For example, if thematrix[8] and thematrix[9] have the largest difference between any two consecutive elements, the number given should be 8.
I made the loop in a way that it will ignore comparisons where one of the elements is NaN (because I have some of those in my data). The loop I made looks like this.
thenumber = 0 #will store the difference
for (i in 1:nrow(thematrix) - 1) {
  if (!is.na(thematrix[i]) & !is.na(thematrix[i + 1])) {
    if (abs(thematrix[i] - thematrix[i + 1]) > thenumber) {
      thenumber = i
    }
  }
}
This looks like it should work, but whenever I run it I get:
Error in if (!is.na(thematrix[i]) & !is.na(thematrix[i + 1])) { :
argument is of length zero
I tried this thing but with a random number in the brackets instead of i and it works. For some reason it only doesn't work when I use the i specified in the beginning of the for-loop. It doesn't recognize that i represents a number. Why doesn't R recognize i?
Also, if there's a better way to do this task I'd appreciate it greatly if you could explain it to me.
You are pretty close, but when you write i in 1:nrow(thematrix) - 1, R evaluates this so that the first value is i = 0, which is what causes the issue. I would suggest using either i in 1:nrow(thematrix) or i in 2:nrow(thematrix) - 1 so that the loop starts at i = 1. Your approach is generally pretty intuitive, but one suggestion would be to use print() frequently to see how i changes over the course of the loop.
The issue is that the : operator has higher precedence than -; you just need to use parentheses around (nrow(thematrix)-1). For example,
thematrix <- matrix(1:10, nrow = 5)
##
wrong <- 1:nrow(thematrix) - 1
right <- 1:(nrow(thematrix) - 1)
##
R> wrong
#[1] 0 1 2 3 4
R> right
#[1] 1 2 3 4
The error message comes from trying to access the zero-th element of thematrix:
R> thematrix[0]
integer(0)
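Putting the parenthesis fix together with the loop from the question gives roughly the following. This is only a sketch: it assumes thematrix is a single column, and it tracks the largest difference and its position in two separate variables (best_diff and best_pos are made-up names), because the original loop stores the index in thenumber and then compares differences against that index.

```r
thematrix <- matrix(c(3, 4, NaN, 7, 2, 8), ncol = 1)   # toy data with a NaN

best_diff <- -Inf          # largest difference seen so far
best_pos  <- NA_integer_   # position of its first element
for (i in 1:(nrow(thematrix) - 1)) {
  d <- abs(thematrix[i] - thematrix[i + 1])
  if (!is.na(d) && d > best_diff) {   # is.na() is also TRUE for NaN
    best_diff <- d
    best_pos  <- i
  }
}
best_pos    # 5
best_diff   # 6
```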
The other two answers address your question directly, but I must say this is about the worst possible way to solve this problem in R.
set.seed(1) # for reproducible example
x <- sample(1:10,10) # numbers 1:10 in random order
x
# [1] 3 4 5 7 2 8 9 6 10 1
which.max(abs(diff(x)))
# [1] 9
The diff(...) function calculates sequential differences, and which.max(...) identifies the element number of the maximum value in a vector.
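This also covers the NaN values mentioned in the question without extra code, since which.max() discards missing and NaN values. A quick check on made-up data:

```r
x <- c(3, 4, NaN, 7, 2, 8)   # toy data, not from the question
d <- abs(diff(x))            # 1 NaN NaN 5 6
which.max(d)                 # 5
```

Any difference that involves a NaN is itself NaN, so those pairs simply drop out of the comparison.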

Similar function to R "rep" in jags to create array?

Is there a similar function in jags as the R function rep? I want to create an array using similar code as the following:
n ~ dmulti(pi, N) # pi is a 3 dimensional probability vector, N is fixed
# the dimension of n is hard coded in this line:
a <- c(rep(0, n[1]), rep(1, n[2]), rep(2, n[3]))
I read through the manual and wasn't able to find a way to achieve this. I understand that Stan would probably allow this but I couldn't use Stan because I need to do inference on discrete parameters. I really appreciate your help!
This question is also posted on the JAGS help forum.
I have added a rep function to the development version (future JAGS 4.0.0). As Matt and John have alluded to, this requires the second argument to be fixed so that the length of the resulting vector can be determined at compile time.
The short answer is no, I'm afraid not. One of the stipulations of the JAGS/BUGS language is that variables must have fixed dimensions (with every element defined exactly once) - in your example a will change dimension size depending on the vector n. There may be other ways to get the result you are looking for, but not using this approach.
Incidentally, you use n twice in that bit of code (LHS and RHS of the multinomial distribution) which is not allowed - although that may just be a typo :)
Matt
You could populate your vector with some loops:
library(R2jags)
M <- function() {
  for (i in 1:n[1]) {
    a[i] <- 0
  }
  for (i in 1:n[2]) {
    a[i + n[1]] <- 1
  }
  for (i in 1:n[3]) {
    a[i + sum(n[1:2])] <- 2
  }
}
j <- jags(list(n=3:5), NULL, 'a', M, DIC=FALSE)
j$BUGSoutput$mean$a
## [1] 0 0 0 1 1 1 1 2 2 2 2 2
However, as @MattDenwood alluded to, if the sum of the elements of n is variable, this will throw an error - a must be of constant length throughout the simulation.
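For reference, this is the vector the loops reproduce, written in plain R rather than JAGS. With n fixed at 3:5 the length is sum(n) = 12, which is why a fixed n is needed:

```r
n <- 3:5
a <- rep(0:2, times = n)   # same as c(rep(0, n[1]), rep(1, n[2]), rep(2, n[3]))
a
length(a) == sum(n)        # TRUE
```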
