Trying to understand how some function works - r

I was given a task to write a function, which I named my_mode_k.
The input consists of two variables:
(x, k)
where x is a vector of natural numbers of length n. The greatest element of x can be k, given that k < n.
The output of my_mode_k is the most frequent element of x. If more than one element occurs in x the same maximal number of times, the function outputs the smallest of them.
for example:
my_mode_k(x = c(1, 1, 2, 3, 3) , k =3)
1
This is code I wrote:
my_mode_k <- function(x, k){
n <- length(x)
x_lemma <- rep(0, k)
for(i in 1:n){
x_lemma[i] < x_lemma[i] +1
}
x_lem2 <- 1
for( j in 2:k){
if(x_lemma[x_lem2] < x_lemma[j]){
x_lem2 <- j
}
}
x_lem2
}
It isn't working properly.
For example:
my_mode_k(x = c(2,3,4,3,2,2,5,5,5,5,5,5,5,5), k=5)
[1] 1
whereas the function is supposed to return 5.
I don't understand why it fails, nor what intuition I should have to tell whether a function is working properly (it took me some time to realize that it isn't performing the required task), so that I could fix the mistake in it.

Here are a few steps on how you can achieve this.
k <- 5
input <- c(2,3,4,3,3,3,3,3,3,3,2,2,5,5,5,5,5,5,5,5)
# Calculate frequencies of elements.
tbl <- table(input[input <= k])
# Find which is max. Notice that it returns the minimum if there is a tie.
tbl.max <- which.max(tbl)
# Find which value is your result.
names(tbl.max)
input <- c(2,2,3,3,3,5,5,5)
names(which.max(table(input[input <= k])))
# 3
input <- c(2,2,5,5,5,3,3,3)
names(which.max(table(input[input <= k])))
# 3
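The table() approach above sidesteps the original loop, so it may help to see why the loop version always returned 1. The line x_lemma[i] < x_lemma[i] +1 is a comparison (<) rather than an assignment (<-), and it indexes by the position i instead of by the value x[i], so the counts never leave zero. The following is a minimal sketch of one possible fix, not the only one; the names counts and best are illustrative replacements for x_lemma and x_lem2.

```r
my_mode_k <- function(x, k) {
  counts <- rep(0, k)                      # counts[v] = how often value v occurs
  for (i in seq_along(x)) {
    counts[x[i]] <- counts[x[i]] + 1       # index by the value x[i], assign with <-
  }
  best <- 1
  for (j in seq_len(k)) {
    if (counts[j] > counts[best]) {        # strict > keeps the smallest value on ties
      best <- j
    }
  }
  best
}

my_mode_k(x = c(2,3,4,3,2,2,5,5,5,5,5,5,5,5), k = 5)  # 5
my_mode_k(x = c(1, 1, 2, 3, 3), k = 3)                # 1 (tie between 1 and 3)

# Cross-check against a base R one-liner on the same input
which.max(tabulate(c(1, 1, 2, 3, 3), nbins = 3))      # 1
```

Checking a function on a few small inputs whose answer you can work out by hand, including a tie, is a simple way to notice this kind of silent failure early.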

Related

Understanding Breakpoint function: how for loops work inside functions

I have the following exercise to be solved in R. Under the exercise, there is a hint towards the solution.
Exercise: If there are no ties in the data set, the function above will produce breakpoints with h observations in the interval between two consecutive breakpoints (except the last two perhaps). If there are ties, the function will by construction return unique breakpoints, but there may be more than h observations in some intervals.
Hint:
my_breaks <-function(x, h = 5) {
x <-sort(x)
breaks <- xb <- x[1]
k <- 1
for(i in seq_along(x)[-1])
{if(k<h)
{k <- k+1}
else{
if(xb<x[i-1]&&x[i-1]<x[i])
{xb <- x[i-1]
breaks <-c(breaks, xb)
k <- 1
}
}
}
However, I am having a hard time understanding the above function, particularly the following lines:
for(i in seq_along(x)[-1])
{if(k<h)
{k <- k+1}
Question:
How is the for loop supposed to act in k if k is previously defined as 1 and i is different than k? How are the breakpoints chosen according to the h=5 gap if the for loop is not acting on x? Can someone explain to me how this function works?
Thanks in advance!
First, note that your example is incomplete. The return value and the final brace are missing there. Here is the correct version.
my_breaks <-function(x, h = 5) {
x <- sort(x)
breaks <- xb <- x[1]
k <- 1
for(i in seq_along(x)[-1]){
if(k<h) {
k <- k+1
} else {
if(xb<x[i-1]&&x[i-1]<x[i]){
xb <- x[i-1]
breaks <-c(breaks, xb)
k <- 1
}
}
}
breaks
}
Let's check if it works.
my_breaks(c(1,1,1:5,8:10), 2)
#[1] 1 2 4 8
my_breaks(c(1,1,1:5,8:10), 5)
#[1] 1 3
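As you can see, everything is fine. And what is seq_along(x)[-1]? For a vector with at least two elements, this expression is equivalent to 2:length(x). So the for loop goes through the indices of the sorted vector x in sequence, skipping the first element.
What is the k variable for? It counts how many observations have been seen since the last breakpoint. While k < h the loop only increments k. Once k has reached h, the loop adds x[i-1] as a new breakpoint and resets k to 1, but only if x[i-1] is greater than the previous breakpoint xb and smaller than x[i]; otherwise it keeps moving forward until that condition holds, which is why ties can produce intervals with more than h observations.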
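A small illustration of the loop index, using made-up vectors x and y: seq_along(x)[-1] and 2:length(x) agree when the vector has two or more elements, but differ for a length-one vector, which is probably why the function uses the former.

```r
x <- c(10, 20, 30)
seq_along(x)[-1]   # 2 3
2:length(x)        # 2 3

y <- 5             # a single observation
seq_along(y)[-1]   # integer(0): the loop body never runs
2:length(y)        # 2 1: the loop would run twice, with invalid indices
```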

What argument to give to the function to return results for more than one value?

I need to simulate the probabilities that are computed using the function transitionProbability1D from the isingLenzMC package. I want to simulate them for 10 values of bF at once and receive a vector of results, but I still receive only one number and I don't know why. Here is my code:
N <- 100
conf0 <- genConfig1D(N)
conf1 <- flipConfig1D(conf0)
# transition probability at J=H=1/kBT=1.0, 1= p-ty metropolis 2=glauber
bF <- 1:10
J <- h <- rep(1,10)
# HERE IT DOESNT WORK EVEN THOUGHT bF IS A VECTOR
transitionProbability1D(bF, conf0, conf1, J, h, 1)
>> 0.298615
You might want to look at how to vectorize a function.
On your example, the following would probably give you what you expect:
library(isingLenzMC)
N <- 100
conf0 <- genConfig1D(N)
conf1 <- flipConfig1D(conf0)
# transition probability at J=H=1/kBT=1.0, 1= p-ty metropolis 2=glauber
bF <- 1:10
# Here I changed these inputs to single values
J <- h <- 1
# This call still returns a single number even though bF is a vector
transitionProbability1D(bF, conf0, conf1, J, h, 1)
# Vectorize on the first argument
transitionProbability1D_vectorized <- Vectorize(transitionProbability1D, vectorize.args = "ikBT")
# Now there are as many results as input values
transitionProbability1D_vectorized(ikBT = bF, x = conf0, xflip = conf1, J = J, H = h, probSel = 1)
You could also use a (for) loop!
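To show the three options side by side without depending on the package, here is a sketch with a hypothetical stand-in function, toy_prob, which is not part of isingLenzMC. Like the call in the question, it collapses a vector input to one number, so it has to be applied one value at a time.

```r
# Hypothetical scalar-style function: min() collapses everything to one number
toy_prob <- function(beta, energy) {
  min(1, exp(-beta * energy))
}

bF <- 1:3
toy_prob(bF, energy = 2)            # a single number, not three

# Option 1: Vectorize() over the first argument
toy_prob_v <- Vectorize(toy_prob, vectorize.args = "beta")
res_vec <- toy_prob_v(beta = bF, energy = 2)

# Option 2: sapply() with an anonymous function
res_sapply <- sapply(bF, function(b) toy_prob(b, energy = 2))

# Option 3: an explicit for loop with a preallocated result vector
res_loop <- numeric(length(bF))
for (i in seq_along(bF)) {
  res_loop[i] <- toy_prob(bF[i], energy = 2)
}
```

All three give one result per element of bF; which one to use is mostly a matter of taste.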

How to create condition for warnings in R

I am trying to create a "for" loop where each of 100 trials has a set of parameters, each randomly chosen from probability distributions. From there, a model will take in these parameters and spit out an output. The input and output will be stored in a matrix, with each row representing a successful run through. Eventually, this matrix will be converted into a dataframe. I am displaying a sample run through for one case of the for loop below:
#matrix M will have 100 rows for each trial, and 4 columns
#columns will be a val, b val, c val and output
M <- matrix(0, nrow=100, ncol=4)
for (i in 1:100){
#random values for a,b,c for 1st trial
a =runif(1)
b=runif(1)
c=runif (1)
v <- c(a,b,c)
#some model
output[i]=v[1]*v[2]/v[3]
M[i,4]=output[i]
#don't know how to populate first 3 columns with all diff values of a,b,c
}
I know this code will not work, but that's my first question. How do I get the a,b, and c values to regenerate from trial to trial so I can have new outputs for each trial. From there, I am pretty sure I know how to store them in the matrix.
My last question is about warning messages. If I have a warning message because my output did not generate for some trial (no problems with this one, but if I had to divide by 0 or something)... how could I just tell the program to skip that trial and keep going until we get to 100 working trials?
Please comment if I should edit or clarify something above. Thanks in advance.
To answer your first question, you can first generate parameter vectors and then apply your function to each parameter set.
ntrials <- 100
M <- matrix(0, nrow=ntrials, ncol=4)
## Generate parameter vectors
M[,1] <- runif(ntrials)
M[,2] <- runif(ntrials)
M[,3] <- runif(ntrials)
## Example model function
run_mod <- function(a, b, c) {
return(a+b+c)
}
## Create output
M[, 4] <- run_mod(a = M[, 1], b = M[, 2], c = M[, 3])
To address your second question, you could use a while statement to continue generating parameter sets and trying to obtain valid model results until you have enough valid results. Your model function will need a way to handle errors or warnings that could occur, such as tryCatch().
## Example model function with error handling.
## tryCatch() returns the value of whichever handler fires, so a failed
## run yields a character string instead of a number.
run_mod <- function(a, b, c) {
tryCatch(
a+b+c,
error = function(e) "Error",
warning = function(w) "Warning"
)
}
## Start at row 1 (R indexes from 1; assigning to row 0 stores nothing)
i <- 1
while(i <= ntrials) {
## Generate a single set of parameters
a <- runif(1)
b <- runif(1)
c <- runif(1)
## Example error
if(floor(100*a) %% 2 == 0) {
a <- "Bad parameter"
}
## Try running your model
output <- run_mod(a,b,c)
## If successful, save output and move on to the next row
if(!is.character(output)) {
M[i, 1] <- a
M[i, 2] <- b
M[i, 3] <- c
M[i, 4] <- output
i <- i + 1
}
}
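The same idea works with a genuine warning rather than an artificial bad parameter. The sketch below assumes a hypothetical model, log(a) * b / c, chosen only because log() of a negative number emits a warning; note that plain division by zero in R returns Inf without any warning, so that case would need an explicit check such as is.finite(). The name safe_model is illustrative.

```r
# Hypothetical model wrapped so that any warning or error becomes NA
safe_model <- function(a, b, c) {
  tryCatch(
    log(a) * b / c,                  # log() of a negative value warns
    warning = function(w) NA_real_,
    error   = function(e) NA_real_
  )
}

set.seed(1)
ntrials <- 5
M <- matrix(NA_real_, nrow = ntrials, ncol = 4)
i <- 1
while (i <= ntrials) {
  p <- runif(3, min = -1, max = 1)   # a, b, c; roughly half the draws have a < 0
  out <- safe_model(p[1], p[2], p[3])
  if (is.na(out)) next               # skip the failed trial, draw again
  M[i, ] <- c(p, out)                # store a, b, c and the output in one row
  i <- i + 1
}
```

After the loop, M holds exactly ntrials successful runs, and it can be converted with as.data.frame(M).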

Using two for loops to fill one empty matrix by rows

I'm trying to do a forecast analysis with some error measures. My question relates more to a technical problem.
Here is a short example:
I'm studying error measures with respect to the forecast length (h) and the k-step forecast. I want to compare forecast lengths of 12, 18 and 24 months.
h<-c(12,18,24)
And for that lengths I'm comparing the 1-12 step ahead forecasts.
k <- c(1:12)
I've written two functions:
The first one (foo) runs the whole code and the second one (forecast_analysis) does my forecast analysis.
foo <- function(series, k, h){
Outfinal <- matrix(nrow = length(h)*length(k), ncol = 5)
for(i in 1:length(h)){
for(j in 1:length(k)){
Outfinal[j,] <- forecast_analysis(series,k[j],h[i])
}
}
return(Outfinal)
}
My problem is that I couldn't find a way to fill the matrix by rows like this:
h k measure 1 measure 2 measure3 measure 4 measure 5
12 1
12 2
12 3
. .
. .
. .
24 10
24 11
24 12
So, first I want to fill the matrix for a fixed value of h for all values of k, and then repeat this for all values of h. I hope you understand my problem.
I know that apply functions would be more efficient here, but I'm not yet able to use them.
You can build a table of all h x k combinations plus a result using expand.grid.
This code should get you started
dummy_forecast <- function(h, k) 42
h<-c(12,18,24)
k <- 1:12 # no need for the c function here
combinations <- expand.grid(h = h, k = k, forecast = NA)
for (row in seq_along(combinations$h)) {
combinations[row, "forecast"] <-
with(combinations[row,], dummy_forecast(h, k))
}
If you return more than one value from your function, you need to assign to more than one column in combinations[row,...], but otherwise it should work.
Update
To handle a function that returns more than one value, do something like this:
dummy_forecast <- function(h, k) rep(42, 5)
result <- matrix(nrow = length(h) * length(k), ncol = 7)
combinations <- expand.grid(h = h, k = k)
for (row in seq_along(combinations$h)) {
result[row,] <- with(combinations[row,], c(h, k, dummy_forecast(h, k)))
}
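If you would rather keep the two nested loops from the question, the missing piece is a row index that depends on both i and j; Outfinal[j,] overwrites the same 12 rows for every h. Below is a minimal sketch under the assumption that forecast_analysis returns five numbers; the stand-in defined here is a dummy, not the real analysis. It also puts h in the outer loop, so the rows come out in the order requested (all k for h = 12, then h = 18, and so on), whereas expand.grid(h = h, k = k) varies h fastest.

```r
# Dummy stand-in for the asker's forecast_analysis(); returns five measures
forecast_analysis <- function(series, k, h) rep(h * 100 + k, 5)

foo <- function(series, k, h) {
  # two extra columns hold h and k next to the five measures
  Outfinal <- matrix(NA_real_, nrow = length(h) * length(k), ncol = 7)
  for (i in seq_along(h)) {
    for (j in seq_along(k)) {
      row <- (i - 1) * length(k) + j   # rows 1-12 for h[1], 13-24 for h[2], ...
      Outfinal[row, ] <- c(h[i], k[j], forecast_analysis(series, k[j], h[i]))
    }
  }
  Outfinal
}

res <- foo(series = NULL, k = 1:12, h = c(12, 18, 24))
res[c(1, 12, 13, 36), 1:2]   # (12, 1), (12, 12), (18, 1), (24, 12)
```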

Interpreting [R] Greatest Common Divisor (GCD) (and LCM) Function in {numbers} package

I don't have a background in programming (except from wrestling with R to get things done), and I'm trying to verbalize what the function for the greatest common divisor in the R {numbers} package is doing at each step. I need help with understanding the flow of steps within the function:
function (n, m)
{
stopifnot(is.numeric(n), is.numeric(m))
if (length(n) != 1 || floor(n) != ceiling(n) || length(m) !=
1 || floor(m) != ceiling(m))
stop("Arguments 'n', 'm' must be integer scalars.")
if (n == 0 && m == 0)
return(0)
n <- abs(n)
m <- abs(m)
if (m > n) {
t <- n
n <- m
m <- t
}
while (m > 0) {
t <- n
n <- m
m <- t%%m
}
return(n)
}
<environment: namespace:numbers>
For instance, in the if (m > n) {} part, n becomes t and ultimately it becomes m? I'm afraid to ask, because it may be painfully obvious, but I don't know what is going on. The same applies, I guess, to the while part of the function, with %% being perhaps modulo.
What it says is:
Stop if either m or n are not numeric, more than one number, or have decimals, and return the message, "Arguments 'n', 'm' must be integer scalars."
If they both are zero, return zero.
Using absolute values from now on.
Make sure that n >= m because of the algorithm we'll end up applying in the next step. If this is not the case, swap them: first place n in a temporary variable "t", then assign m to n, so that the larger number is now at the beginning of the (n, m) pair. At this point both n and m contain the original value of m. Finish up by retrieving the value in the temporary variable and assigning it to m.
Now they apply the modified Euclidean algorithm to find the GCD - a more efficient version of the algorithm that shortcuts the multiple subtractions, instead replacing the larger of the two numbers by its remainder when divided by the smaller of the two.
The smaller number at the beginning of the algorithm will end up being the larger after the first iteration, therefore we'll assign it to n to get ready for the second iteration. To do so, though, we need to get the current n out of the way by assigning it to the temporary variable t. After that we get the modulo resulting from dividing the original larger number (n), which now is stored in t, by the smaller number m. The result will replace the number stored in m.
As long as there is a remainder (modulo) the process will go on, this time with the initial smaller number, m playing the role of the big guy. When there is no remainder, the smaller of the numbers in that particular iteration is returned.
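To make the loop concrete, here is a stripped-down sketch of the same while loop with a printout of each step. The name gcd_trace is made up for this illustration, and the input checks and the initial swap are left out for brevity; as the second call shows, when m > n the first pass through the loop ends up swapping the two values anyway.

```r
gcd_trace <- function(n, m) {
  n <- abs(n)
  m <- abs(m)
  while (m > 0) {
    t <- n          # remember the current larger number
    n <- m          # the smaller number moves into n
    m <- t %% m     # m becomes the remainder of larger / smaller
    cat("n =", n, " m =", m, "\n")
  }
  n
}

gcd_trace(48, 18)
# n = 18  m = 12
# n = 12  m = 6
# n = 6   m = 0
# returns 6

gcd_trace(18, 48)   # first step gives n = 48, m = 18, then proceeds as above
```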
ADDENDUM:
Now that I know how to read this function, I see that it is limited to two numbers in the input to the function. So I entertained myself putting together a function that can work with three integers in the input:
require(numbers)
GCF <- function(x,y,z){
tab.x <- tabulate(primeFactors(x))
tab.y <- tabulate(primeFactors(y))
tab.z <- tabulate(primeFactors(z))
max.len <- max(length(tab.x), length(tab.y), length(tab.z))
tab_x = c(tab.x, rep(0, max.len - length(tab.x)))
tab_y = c(tab.y, rep(0, max.len - length(tab.y)))
tab_z = c(tab.z, rep(0, max.len - length(tab.z)))
GCD_elem <- numeric()
for(i in 1:max.len){
# each prime i contributes i raised to its smallest exponent (not exponent * i)
GCD_elem[i] <- i^min(tab_x[i], tab_y[i], tab_z[i])
}
GCD_elem <- GCD_elem[!GCD_elem==0]
GrCD <- prod(GCD_elem)
print(GrCD)
}
Also for the LCM:
LCM <- function(x,y,z){
tab.x <- tabulate(primeFactors(x))
tab.y <- tabulate(primeFactors(y))
tab.z <- tabulate(primeFactors(z))
max.len <- max(length(tab.x), length(tab.y), length(tab.z))
tab_x = c(tab.x, rep(0, max.len - length(tab.x)))
tab_y = c(tab.y, rep(0, max.len - length(tab.y)))
tab_z = c(tab.z, rep(0, max.len - length(tab.z)))
LCM_elem <- numeric()
for(i in 1:max.len){
LCM_elem[i] <- i^(max(tab_x[i], tab_y[i], tab_z[i]))
}
LCM_elem <- LCM_elem[!LCM_elem==0]
LCM <- prod(LCM_elem)
print(LCM)
}
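An alternative that avoids prime factorization and works for any number of integers is to fold the two-argument Euclidean algorithm over a vector with Reduce(). This is only a sketch: gcd2, lcm2, gcd_all and lcm_all are illustrative names defined here in base R, not functions from the {numbers} package, and the inputs are assumed to be nonzero integers.

```r
# Two-argument GCD via the Euclidean algorithm (same loop as in the question)
gcd2 <- function(n, m) {
  n <- abs(n)
  m <- abs(m)
  while (m > 0) {
    t <- n
    n <- m
    m <- t %% m
  }
  n
}

# Two-argument LCM from the identity lcm(a, b) * gcd(a, b) = |a * b|
lcm2 <- function(a, b) abs(a * b) / gcd2(a, b)

# Fold the pairwise functions over a whole vector
gcd_all <- function(v) Reduce(gcd2, v)
lcm_all <- function(v) Reduce(lcm2, v)

gcd_all(c(12, 18, 24))   # 6
gcd_all(c(8, 16, 32))    # 8
lcm_all(c(4, 6, 10))     # 60
```

This relies on gcd(a, b, c) = gcd(gcd(a, b), c), and likewise for the LCM.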

Resources