mapply error: 'length.out' must be a non-negative number - r

I've created the following simple function:
fillampenv <- function(samples, samprate, rise, fall){
  # Create output vector
  v <- vector("numeric", samples)
  # Fill output vector
  v <- c(seq(0, 1, length = rise * samprate),
         seq(1, 1, length = (((samples/samprate) -
                                (rise + fall)) * samprate) -1),
         seq(1, 0, length = fall * samprate))
  return(v)
}
Which I'd like to call on each row of the dataframe:
df <- structure(list(samples = c(17640, 17640, 17640, 17640, 17640), samprate = c(44100, 44100, 44100, 44100, 44100), rise = c(0.75, 0.75, 0.75, 0.75, 0.75), fall = c(0.3, 0.3, 0.3, 0.3, 0.3)), class = "data.frame", row.names = c(NA, -5L))
However, when I tried to do this using the following (which works for me with other functions I've made):
ampenvs <- mapply(fillampenv,
samples = df$samples,
samprate = df$samprate,
rise = df$rise,
fall = df$fall)
I get the error:
Error in seq.default(1, 1, length = (((samples/samprate) - (rise + fall)) * :
'length.out' must be a non-negative number
Any ideas why? I'm struggling to work out why it doesn't work with this function in particular (while others work just fine).

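Your function passes a negative length.out to seq(). With these inputs, rise + fall (1.05 s) is longer than the total duration samples/samprate (0.4 s), so the length of the middle segment comes out negative. For example,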
> df %>% mutate(x = (((samples/samprate) -
+ (rise + fall)) * samprate) -1)
samples samprate rise fall x
1 17640 44100 0.75 0.3 -28666
2 17640 44100 0.75 0.3 -28666
3 17640 44100 0.75 0.3 -28666
4 17640 44100 0.75 0.3 -28666
5 17640 44100 0.75 0.3 -28666
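One way to guard against this is to compute the three segment lengths up front and fail with a readable message when the sustain segment would be negative. The following is only a sketch under assumptions: the name fillampenv_safe and the error text are hypothetical, and it sizes the sustain segment so that the envelope has exactly `samples` elements, which differs slightly from the original arithmetic.

```r
# Hypothetical guarded variant of fillampenv(); not the asker's original function.
fillampenv_safe <- function(samples, samprate, rise, fall) {
  n_rise <- round(rise * samprate)          # samples in the attack ramp
  n_fall <- round(fall * samprate)          # samples in the release ramp
  n_sustain <- samples - n_rise - n_fall    # whatever is left stays at 1
  if (n_sustain < 0) {
    stop("rise + fall (", rise + fall, " s) exceeds the total duration (",
         samples / samprate, " s)")
  }
  c(seq(0, 1, length.out = n_rise),
    rep(1, n_sustain),
    seq(1, 0, length.out = n_fall))
}

# A 1.5 s envelope leaves room for 1.05 s of rise and fall
env <- fillampenv_safe(samples = 66150, samprate = 44100, rise = 0.75, fall = 0.3)
length(env)
```

With the rows from df (0.4 s total duration) this version stops with the message above instead of reaching seq() with a negative length.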

Wherever you run into this issue, you have to convert your number to the type you want, such as a float or an integer.

Related

Check which range the value belongs to and replace it with the midpoint of another vector in R

I have three vectors
The first vector contains my values as follows
my_val <- c(0.61, 0.254, 0.5545, 0.47897)
The second vector start from zero and ends with 1
vec1 <- seq(0, 1, by = 0.25)
The third vector is the midpoint of vec1 as
vec2 <- zoo::rollmean(vec1, 2)
For each value in my_val, I want to find which interval of vec1 it falls in and then replace it with the midpoint of that interval, i.e. the corresponding element of vec2.
Here, I do it manually as follows:
ifelse(0.61 >=0 & 0.61 <= 0.25, 0.125, ifelse(0.61 >=0.25 & 0.61 <= 0.5, 0.375, ifelse(0.61 >=0.5 & 0.61 <= 0.75, 0.625, 0.875)))
and keep doing the same for other values of my_val
Note that the actual length of my_val is 8000 and the step in vec1 is 0.01667
Any help please
Using cut you could do:
my_val <- c(0.61, 0.254, 0.5545, 0.47897)
vec1 <- seq(0, 1, by = 0.25)
vec2 <- zoo::rollmean(vec1, 2)
vec2[cut(my_val, breaks = vec1, labels = FALSE)]
#> [1] 0.625 0.375 0.625 0.375
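As a possible base R alternative (a sketch, not part of the answer above), findInterval() returns the interval index directly and the midpoints can be computed without zoo. The names mids, idx and res are illustrative. Note that findInterval() uses left-closed intervals [a, b), whereas cut() defaults to right-closed (a, b], so values lying exactly on a break are assigned differently.

```r
my_val <- c(0.61, 0.254, 0.5545, 0.47897)
vec1 <- seq(0, 1, by = 0.25)

# Midpoint of each consecutive pair of breaks
mids <- (head(vec1, -1) + tail(vec1, -1)) / 2

# Index of the interval each value falls in; rightmost.closed = TRUE
# keeps a value equal to the last break (1) inside the last interval
idx <- findInterval(my_val, vec1, rightmost.closed = TRUE)
res <- mids[idx]
res
#> [1] 0.625 0.375 0.625 0.375
```

The same code should work unchanged for a finer grid such as a step of 0.01667, as long as vec1 covers the full range of my_val.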

How to create a dataframe from the return of the function?

model <- function(alpha,n,m){
  ybar <- numeric()
  for(i in 1:m){
    y <- arima.sim(model=list(ar=alpha),n)
    ybar[i] <- mean(y)
  }
  CI <- mean(ybar) + c(1,-1)*qnorm(0.025)*sqrt(1/n)*(1/(1-alpha))
  width <- abs(abs(CI[1])-abs(CI[2]))
  list("Confidence Interval"=CI, Width=width)
}
model(-0.8,1000,1000)
model(-0.4,1000,1000)
model(-0.3,1000,1000)
model(0.2,1000,1000)
model(0.8,1000,1000)
I want to create a dataframe such that the first column is the list of alpha values (e.g. -0.8, -0.4, ..., 0.8), the second column is the confidence interval, and the third column is the width of the CI. Each column should have its own name (alpha, confidence interval, width).
How can I do that?
Not sure if this is what we need (in base R)
do.call(rbind, lapply(c(-0.8, -0.4, -0.3),
function(x) data.frame(alpha = x, model(x, 1000, 1000))))
# alpha Confidence.Interval Width
#1 -0.8 -0.03474170 0.0006172874
#2 -0.8 0.03412441 0.0006172874
#3 -0.4 -0.04439685 0.0002515509
#4 -0.4 0.04414530 0.0002515509
#5 -0.3 -0.04777081 0.0001885317
#6 -0.3 0.04758228 0.0001885317
If we need the upper and lower bound as columns
do.call(rbind, lapply(c(-0.8, -0.4, -0.3), function(x) {
out <- model(x, 1000, 100)
data.frame(alpha = x, lower_bound = out$`Confidence Interval`[1],
upper_bound = out$`Confidence Interval`[2], Width = out$Width)}))
# alpha lower_bound upper_bound Width
#1 -0.8 -0.03163379 0.03723232 0.005598532
#2 -0.4 -0.04186212 0.04668002 0.004817898
#3 -0.3 -0.04833423 0.04701885 0.001315380
Or with tidyverse
library(dplyr)
library(purrr)
library(tidyr) # provides unnest_wider() and unnest_longer()
tibble(alpha = c(-0.8, -0.4, -0.3),
out = map(alpha, model, n = 1000, m = 1000)) %>%
unnest_wider(c(out)) %>%
unnest_longer(c(`Confidence Interval`))
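A side note on the Width column, offered as a sketch rather than a correction of the question's model(): the half-width of the interval depends only on n and alpha, so it can be tabulated without any simulation. The expression abs(abs(CI[1]) - abs(CI[2])) used in model() is a different quantity; when the interval straddles zero it equals twice the absolute value of mean(ybar), which would explain why the Width values in the outputs above are so small. The names half_width and full_width are illustrative.

```r
alpha <- c(-0.8, -0.4, -0.3, 0.2, 0.8)
n <- 1000

# Same factor as in model(): qnorm(0.975) * sqrt(1/n) * 1/(1 - alpha)
half_width <- qnorm(0.975) * sqrt(1 / n) / (1 - alpha)

widths <- data.frame(alpha = alpha,
                     half_width = half_width,
                     full_width = 2 * half_width)
widths
```

For alpha = -0.8 the half-width is about 0.0344, which agrees with the bounds of roughly -0.0347 and 0.0341 shown in the base R output.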

How to fill a matrix by proportion?

I'm trying to create a 20x20 matrix filled with numbers from -1:2. However, I don't want the values to be uniformly random; I want them to follow proportions that I decide.
For example, I would want 0.10 of the cells to be -1, 0.60 to be 0, 0.20 to be 1, 0.10 to be 2.
This code was able to get me a matrix with all of the values I want, but I don't know how to edit it to specify the proportion of each value I want.
r <- 20
c <- 20
mat <- matrix(sample(-1:2,r*c, replace=TRUE),r,c)
We can use the prob argument from sample
matrix(sample(-1:2, r*c, replace=TRUE, prob = c(0.1, 0.6, 0.2, 0.1)), r, c)
r <- 20
c <- 20
ncell = r * c
val = c(-1, 0, 1, 2)
p = c(0.1, 0.6, 0.2, 0.1)
fill = rep(val, ceiling(p * ncell))[1:ncell]
mat <- matrix(data = sample(fill), nrow = r, ncol = c)
prop.table(table(mat))
#> mat
#>  -1   0   1   2
#> 0.1 0.6 0.2 0.1
Created on 2019-09-20 by the reprex package (v0.3.0)
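The two answers behave differently: sample() with prob only matches the proportions on average, while shuffling a vector built with rep() matches them exactly. A minimal sketch comparing the two, with illustrative variable names (n_rows, n_cols, counts) chosen so that the base function c() is not masked:

```r
set.seed(42)
vals <- -1:2
p <- c(0.1, 0.6, 0.2, 0.1)
n_rows <- 20
n_cols <- 20
n_cells <- n_rows * n_cols

# Approximate: each cell is drawn independently with probabilities p
approx_mat <- matrix(sample(vals, n_cells, replace = TRUE, prob = p),
                     n_rows, n_cols)

# Exact: build the right number of copies of each value, then shuffle
counts <- round(p * n_cells)
exact_mat <- matrix(sample(rep(vals, times = counts)), n_rows, n_cols)

prop.table(table(approx_mat))
prop.table(table(exact_mat))
```

The exact version assumes that the rounded counts add up to the number of cells, which holds here (40 + 240 + 80 + 40 = 400) but may need adjusting for other proportions.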

Create samples with different range and weights

I want to create a total sample of 3000 entries with some rules :
Category-1(low) 0.1 - 0.3
Category-2(Medium) 0.4 - 0.7
Category-3(High) 0.7 - 0.9
I want to create the sample in such a way that each category has weights for example :
Category-1(low) 20% of the dataset
Category-2(Medium) 30% of the dataset
Category-3(High) 50% of the dataset
I am unable to find pointers to do that. Can anyone help me out with the same. Thanks a lot in advance.
We can use Map to build a sequence of values for each range shown in the OP's post and sample from it, passing the lower bounds, the upper bounds, and the per-category sample sizes (proportion * 3000) as arguments to Map
lst1 <- Map(function(x, y, z) sample(seq(x, y, by = 0.1), z,
replace = TRUE), c(0.1, 0.4, 0.7), c(0.3, 0.7, 0.9), c(0.2, 0.3, 0.5) * 3000)
names(lst1) <- c("low", "medium", "high")
lengths(lst1)
# low medium high
# 600 900 1500
out <- unlist(lst1)
length(out)
#[1] 3000
If we need as a two column data.frame
dat <- stack(lst1)[2:1]
I like to use the simstudy package for data generation. In this case I back-filled your values that conform to category rules. Simstudy gives a data.table object, but I'm more familiar with Tidyverse syntax:
library(simstudy)
library(dplyr)
set.seed(1724)
# define data
def <- defData(varname = "category", formula = "0.2;0.3;0.5", dist = "categorical", id = "id")
def <- defData(def, varname = "value", dist = "nonrandom", formula = NA)
# generate data
df <- genData(3000, def) %>% as_tibble()
# add in values that conform to category rules
df[df$category == 1,]$value <- runif(nrow(df[df$category == 1,]), min = 0.1, max = 0.3)
df[df$category == 2,]$value <- runif(nrow(df[df$category == 2,]), min = 0.4, max = 0.7)
df[df$category == 3,]$value <- runif(nrow(df[df$category == 3,]), min = 0.7, max = 0.9)
# A tibble: 3,000 x 3
id category value
<int> <int> <dbl>
1 1 3 0.769
2 2 2 0.691
3 3 3 0.827
4 4 3 0.729
5 5 2 0.474
6 6 3 0.818
7 7 2 0.635
8 8 2 0.552
9 9 3 0.794
10 10 3 0.792
# ... with 2,990 more rows
A rather simple approach:
1. This is not that random, but depending on the application this may suffice
out <- c(runif(600, 0.1, 0.3), runif(900, 0.4, 0.7), runif(1500, 0.7, 0.9))
2. Here, you'd draw the numbers coming from each category as well: so more random...
sam <- sample(1:3, size = 3000, prob = c(0.2, 0.3, 0.5), replace = TRUE)
x1 <- sum(sam == 1)
x2 <- sum(sam == 2)
x3 <- sum(sam == 3)
out <- c(runif(x1, 0.1, 0.3), runif(x2, 0.4, 0.7), runif(x3, 0.7, 0.9))
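The second approach can also be written without counting each category, because runif() accepts vectors for min and max. This is a sketch with illustrative names (lo, hi, w, dat); it keeps the category label next to each value.

```r
set.seed(1)
lo <- c(low = 0.1, medium = 0.4, high = 0.7)   # lower bound per category
hi <- c(low = 0.3, medium = 0.7, high = 0.9)   # upper bound per category
w  <- c(0.2, 0.3, 0.5)                         # category weights

# Draw the category of each entry, then a value inside that category's range
category <- sample(names(lo), size = 3000, replace = TRUE, prob = w)
value <- runif(3000, min = lo[category], max = hi[category])

dat <- data.frame(category = category, value = value,
                  stringsAsFactors = FALSE)
prop.table(table(dat$category))
```

As with approach 2, the category shares are only approximately 20/30/50 in any single draw.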

What is the difference between matrixpower() and markov() when it comes to computing P^n?

Consider a Markov chain with state space S = {1, 2, 3, 4} and transition matrix
P = 0.1 0.2 0.4 0.3
0.4 0.0 0.4 0.2
0.3 0.3 0.0 0.4
0.2 0.1 0.4 0.3
And, take a look at the following source code:
# markov function
markov <- function(init,mat,n,labels)
{
  if (missing(labels))
  {
    labels <- 1:length(init)
  }
  simlist <- numeric(n+1)
  states <- 1:length(init)
  simlist[1] <- sample(states,1,prob=init)
  for (i in 2:(n+1))
  {
    simlist[i] <- sample(states, 1, prob = mat[simlist[i-1],])
  }
  labels[simlist]
}
# matrixpower function
matrixpower <- function(mat,k)
{
  if (k == 0) return (diag(dim(mat)[1]))
  if (k == 1) return(mat)
  if (k > 1) return( mat %*% matrixpower(mat, k-1))
}
tmat = matrix(c(0.1, 0.2, 0.4, 0.3,
0.4, 0.0, 0.4, 0.2,
0.3, 0.3, 0.0, 0.4,
0.2, 0.1, 0.4, 0.3), nrow=4, ncol=4, byrow=TRUE)
p10 = matrixpower(mat = tmat, k=10)
rowMeans(p10)
nn <- 10
alpha <- c(0.25, 0.25, 0.25, 0.25)
set.seed(1)
steps <- markov(init=alpha, mat=tmat, n=nn)
table(steps)/(nn + 1)
Output
> rowMeans(p10)
[1] 0.25 0.25 0.25 0.25
>
.
.
.
> table(steps)/(nn + 1)
steps
1 2 3 4
0.09090909 0.18181818 0.18181818 0.54545455
> ?rowMeans
Why are results so different?
What is the difference between using matrixpower() and markov() when it comes to computing P^n?
Currently you are comparing completely different things. First, I'll focus not on computing P^n itself, but rather on A*P^n, where A is the initial distribution. In that case matrixpower does the job:
p10 <- matrixpower(mat = tmat, k = 10)
alpha <- c(0.25, 0.25, 0.25, 0.25)
alpha %*% p10
# [,1] [,2] [,3] [,4]
# [1,] 0.2376945 0.1644685 0.2857105 0.3121265
Those are the true probabilities of being in states 1, 2, 3, 4, respectively, after 10 steps (after the initial draw made using A).
Meanwhile, markov(init = alpha, mat = tmat, n = nn) is only a single realization of length nn + 1, and only the last state of that realization is relevant for A*P^n. So, to get numbers close to the theoretical ones, we need many realizations with nn <- 10, as in
table(replicate(markov(init = alpha, mat = tmat, n = nn)[nn + 1], n = 10000)) / 10000
#
# 1 2 3 4
# 0.2346 0.1663 0.2814 0.3177
where I simulate 10000 realizations and take only the last state of each realization.
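It may also help to see why rowMeans(p10) returned exactly 0.25 for every state. Each row of a transition matrix, and of any power of it, sums to 1, so the row mean of a 4x4 stochastic matrix is always 1/4 and carries no information about the chain. The sketch below checks this; it uses Reduce() in place of the recursive matrixpower() only to stay self-contained.

```r
tmat <- matrix(c(0.1, 0.2, 0.4, 0.3,
                 0.4, 0.0, 0.4, 0.2,
                 0.3, 0.3, 0.0, 0.4,
                 0.2, 0.1, 0.4, 0.3), nrow = 4, ncol = 4, byrow = TRUE)

# P^10 as a product of ten copies of P
p10 <- Reduce(`%*%`, replicate(10, tmat, simplify = FALSE))

rowSums(p10)     # every row sums to 1
rowMeans(tmat)   # already 0.25 for P itself, before taking any power

# The distribution after 10 steps is the row vector alpha %*% P^10
alpha <- c(0.25, 0.25, 0.25, 0.25)
dist10 <- as.vector(alpha %*% p10)
dist10
```

The entries of dist10 sum to 1 and are the quantities that the 10000 simulated end states approximate.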
