Multiple curves from a function in one plot in R

I have this code
N <- 1000
beta1 = runif(N, -1,1);
beta2 = runif(N, -1,1);
x1 = seq(-500, 500, 0.01);
and for each i from 1 to N, I want to plot this function:
z = beta1[i] + beta2[i]*x1;
pr = 1/(1+exp(-z));
plot (x1,pr);
At the end I would expect 1000 curves of pr vs x1.
For that I've tried this:
for (i in 1:N){
  z[i] = res[i,1] + res[i,2]*x1
  pr[i] = 1/(1+exp(-z[i]));
  plot(x1, pr[i])
}
But it gave a list of 50 warnings and it didn't work out.
Any help?

This is a great time for some matrix multiplication to simplify and speed up the calculation. Your biggest problem was that plot opens a new plot every time it's called. I assume you want all the lines plotted on the same graph.
N <- 1000
beta1 = runif(N, -1, 1)
beta2 = runif(N, -1, 1)
# I changed this to by = 1
# for plotting purposes you really don't need 100k points per line
x1 = seq(-500, 500, 1)
z = cbind(1, x1) %*% rbind(beta1, beta2)
pr = 1 / (1 + exp(-z))
# this is the key step you were missing
# initialize an empty plot with sufficient range
plot(range(x1), range(pr), type = "n")
# then just add to it in the for loop
for (i in 1:N) {
  lines(x1, pr[, i])
}
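As a side note, if you want to avoid the explicit loop entirely, base R's matplot draws one line per column of a matrix in a single call. A minimal sketch, assuming the same pr matrix computed above:
# matplot plots each column of pr as its own line; semi-transparent black keeps 1000 overlapping curves readable
matplot(x1, pr, type = "l", lty = 1, col = rgb(0, 0, 0, 0.1), xlab = "x1", ylab = "pr")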

Related

How to use a for loop over negative (and positive) values in R?

I am trying to use a for-loop over a range of positive and negative values and then plot the results. However, I'm having trouble getting R to plot the correct values, since the negative values seem to throw off the index.
More precisely, the code I am running is:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
  for (r in R) {
    # Generate 1 observation from N(x,1)
    y = rnorm(1, x, 1)
    # Take the average of this observation
    avg_y = mean(y)
    # Test this observation using the test we found in part a
    if (avg_y >= 1 + pnorm(.95))
      {rejection[r] = 1}
    if (y < 1 + pnorm(.95))
      {rejection[r] = 0}
  }
  # Calculate the average rejection frequency across the 20 samples
  avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
The resulting graph should look something like this
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
...but there's clearly something wrong with my code that is shifting the positive values on the graph over to the left.
Any help on how to fix this would be much appreciated!
Yep, as you suspected, the negative indices are causing problems. R doesn't know how to store something as the "negative first" element of a vector, so it just drops those values. Instead, try using seq_along to produce a vector of all-positive indices and loop over those:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in seq_along(t)) {
  for (r in R) {
    # Generate 1 observation from N(t[x], 1)
    # Now we use the value of t at index x rather than x itself
    y = rnorm(1, t[x], 1)
    # Take the average of this observation
    avg_y = mean(y)
    # Test this observation using the test we found in part a
    if (avg_y >= 1 + pnorm(.95))
      {rejection[r] = 1}
    if (y < 1 + pnorm(.95))
      {rejection[r] = 0}
  }
  # Calculate the average rejection frequency across the 20 samples
  avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
which produces the following plot:
Not sure why you want to simulate the vectorized function pnorm() with for loops, but still, here is your code with the mistakes corrected (check the comments):
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
  for (r in R) {
    # Generate 1 observation from N(x,1)
    y = rnorm(1, x, 1)
    # No need to take the average since you have a single observation
    # Test this observation using the test we found in part a
    rejection[r] = ifelse(y >= 1 + pnorm(.95), 1, 0)
  }
  # Calculate the average rejection frequency across the 20 samples
  # R vectors are indexed from 1, so shift the x values so that negative values map to positive indices
  avg_rej_freq[x - min(t) + 1] = mean(rejection)
}
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')
Not sure why the for loops are needed; what you are doing can be collapsed into one line. The rest of the code is taken from @Sandipan Dey:
R <- 20
t <- seq(from = -10, to = 10, by = 1)
#All the for-loops collapsed into this one line:
avg_rej_freq <- rowMeans(matrix(rnorm(R * length(t), t), 21) >= 1 + pnorm(.95))
rej_prob <- function(x) 1 - pnorm(1 - x + qnorm(0.95))
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')
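In case the one-liner looks dense, here is the same computation written out step by step (a sketch with intermediate names added only for clarity):
sims <- matrix(rnorm(R * length(t), mean = t), nrow = length(t))  # row i holds R draws from N(t[i], 1)
rejected <- sims >= 1 + pnorm(.95)                                # TRUE wherever the test rejects
avg_rej_freq <- rowMeans(rejected)                                # rejection frequency for each value of t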

Running rJAGS when the likelihood is a custom density

I am trying to figure out how to sample from a custom density in rJAGS but am running into issues. Having searched the site, I saw that there is a zeros (or ones) trick that can be employed based on BUGS code, but I am having a hard time with its implementation in rJAGS. I think I am doing it correctly but keep getting the following error:
Error in jags.model(model1.spec, data = list(x = x, N = N), n.chains = 4, :
Error in node dpois(lambda)
Length mismatch in Node::setValue
Here is my rJAGS code for reproducibility:
library(rjags)
set.seed(4)
N = 100
x = rexp(N, 3)
L = quantile(x, prob = 1) # Censoring point
censor = ifelse(x <= L, 1, 0) # Censoring indicator
x[censor == 1] <- L
model1.string <- "
model {
  for (i in 1:N){
    x[i] ~ dpois(lambda)
    lambda <- -N*log(1-exp(-(1/mu)))
  }
  mu ~ dlnorm(mup, taup)
  mup <- log(.0001)
  taup <- 1/49
  R <- 1 - exp(-(1/mu) * .0001)
}
"
model1.spec <- textConnection(model1.string)
jags <- jags.model(model1.spec,
                   data = list('x' = x,
                               'N' = N),
                   n.chains = 4,
                   n.adapt = 100)
Here, my negative log likelihood of the density I am interested in is -N*log(1-exp(-(1/mu))). Is there an obvious mistake in the code?
Using the zeros trick, the variable on the left-hand side of the dpois() relationship has to be an N-length vector of zeros. The variable x should show up in the likelihood somewhere. Here is an example using the normal distribution.
set.seed(519)
N <- 100
x <- rnorm(100, mean=3)
z <- rep(0, N)
C <- 10
pi <- pi
model1.string <- "
model {
  for (i in 1:N){
    lambda[i] <- pow(2*pi*sig2, -0.5) * exp(-.5*pow(x[i]-mu, 2)/sig2)
    loglam[i] <- log(lambda[i]) + C
    z[i] ~ dpois(loglam[i])
  }
  mu ~ dnorm(0,.1)
  tau ~ dgamma(1,.1)
  sig2 <- pow(tau, -1)
  sumLL <- sum(log(lambda[]))
}
"
model1.spec <- textConnection(model1.string)
set.seed(519)
jags <- jags.model(model1.spec,
                   data = list('x' = x,
                               'z' = z,
                               'N' = N,
                               'C' = C,
                               'pi' = pi),
                   inits = function() list(tau = 1, mu = 3),
                   n.chains = 4,
                   n.adapt = 100)
samps1 <- coda.samples(jags, c("mu", "sig2"), n.iter = 1000)
summary(samps1)
Iterations = 101:1100
Thinning interval = 1
Number of chains = 4
Sample size per chain = 1000

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

       Mean     SD Naive SE Time-series SE
mu    4.493 2.1566 0.034100         0.1821
sig2  1.490 0.5635 0.008909         0.1144

2. Quantiles for each variable:

       2.5%   25%   50%   75% 97.5%
mu   0.6709 3.541 5.218 5.993 7.197
sig2 0.7909 0.999 1.357 1.850 2.779
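For reference, the zeros trick as usually presented in the BUGS literature sets the Poisson mean to the negative log-likelihood plus a constant chosen large enough to keep the mean non-negative. A generic sketch of that pattern in JAGS syntax (here L[i] and C are placeholders, not variables from the model above):
for (i in 1:N){
  phi[i] <- -log(L[i]) + C   # L[i]: likelihood contribution of observation i; C keeps phi[i] >= 0
  z[i] ~ dpois(phi[i])       # z is supplied as data: a vector of N zeros
}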

How to draw Poisson density curve in R?

I need to show that the number of events in a Poisson process follows a Poisson distribution with parameter lambda * t.
Here is the Poisson process generator:
ppGen <- function(lambda, maxTime){
  taos <- taosGen(lambda, maxTime)
  pp <- NULL
  for(i in 1:maxTime){
    pp[i] <- sum(taos <= i)
  }
  return(pp)
}
Here I try to replicate the process 1000 times and collect the total number of occurrences in each realisation into a vector:
d <- ppGen(0.5,100)
tail(d,n=1)
reps <- 1000
x1 <- replicate(reps, tail(ppGen(0.5,100), n=1))
hist(x1)
Here is the histogram:
Here I am trying to draw a theoretical Poisson density curve with parameter lambda * t:
xfit<-seq(1,100,length=100)
yfit<-dpois(xfit,lambda = 0.5*100)
lines(xfit,yfit)
But the curve doesn't appear anywhere near the histogram. Can anyone suggest the right way to do this?
Maybe you can try curve, as below:
x <- rpois(1000, 0.5 * 100)
dp <- function(x, lbd = 0.5 * 100) dpois(x, lambda = lbd)
curve(dp, 0, 100)
hist(x, freq = FALSE, add = TRUE)
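Alternatively, staying closer to your original approach: the curve sits near zero because hist defaults to frequencies while dpois returns probabilities, so draw the histogram on the density scale and evaluate the Poisson mass at integer counts. A sketch, assuming x1 holds the simulated totals from the question:
# x1 <- replicate(1000, tail(ppGen(0.5, 100), n = 1))   # as computed in the question
hist(x1, freq = FALSE)                                   # density scale instead of raw counts
k <- 0:100
lines(k, dpois(k, lambda = 0.5 * 100), type = "h")       # Poisson pmf at integer values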

How to integrate the product of two functions

Suppose I am seeking to integrate the following function from 0 to 10:
How would I accomplish this in R?
Functions
# Functional form
fn <- function(t) -100*(t)^2 + 20000
# First derivative w.r.t. t
fn_dt <- function(t) -200*t
# Density function phi
phi <- approxfun(density(rnorm(35, 15, 7)))
# Delta t
delta <- 5
How about the following:
First off, we choose a fixed seed for reproducibility.
# Density function phi
set.seed(2017);
phi <- approxfun(density(rnorm(35, 15, 7)))
We define the integrand.
integrand <- function(x) {
  f1 <- -500 * x^2 + 100000;
  f2 <- phi(x);
  f2[is.na(f2)] <- 0;
  return(f1 * f2)
}
By default, approxfun returns NA if x falls outside the interval [min(x), max(x)]; since phi is based on the density of a normal distribution, we can replace NAs with 0.
Let's plot the integrand
library(ggplot2);
ggplot(data.frame(x = 0), aes(x)) + stat_function(fun = integrand) + xlim(-50, 50);
We use integrate to calculate the integral; here I assume you are interested in the interval [-Inf, +Inf].
integrate(integrand, lower = -Inf, upper = Inf)
#-39323.06 with absolute error < 4.6
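If, as stated in the question, the interval of interest is [0, 10] rather than the whole real line, the same integrand can simply be passed with finite limits (a sketch; the numerical value will of course differ from the one above):
integrate(integrand, lower = 0, upper = 10)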

Simulating a mixed linear model and evaluating it with lmerTest in R

I am trying to understand how to use mixed linear models to analyse my data by simulating a model, but I can't reproduce the input parameters. What am I missing?
I want to start simulating a model with a random intercept for each subject. Here is the formula of what I want to simulate and reproduce:
If beta1 (<11) is small, I find gamma00 as the intercept in the fixed-effects section, but I am completely unable to retrieve the slope (beta1). Also, the linear effect is not significant. Where is my conceptual mistake?
library(lmerTest)
# Generating data set
# General values and variables
numObj <- 20
numSub <- 100
e <- rnorm(numObj * numSub, mean = 0, sd = 0.1)
x <- scale(runif(numObj * numSub, min = -100, max = 100))
y <- c()
index <- 1
# Coefficients
gamma00 <- 18
gamma01 <- 0.5
beta1 <- -100
w <- runif(numSub, min = -3, max = 3)
uo <- rnorm(numSub, mean = 0, sd = 0.1)
meanBeta0 <- mean(gamma00 + gamma01*w + uo) # I should be able to retrieve that parameter.
for(j in 1:numSub){
  for(i in 1:numObj){
    y[index] <- gamma00 + gamma01*w[j] + uo[j] + beta1*x[i] + e[index]
    index <- index + 1
  }
}
dataFrame2 <- data.frame(y = y, x = x, subNo = factor(rep(1:numSub, each = numObj)), objNum = factor(rep(1:numObj, numSub)))
model2 <- lmer(y ~ x +
(1 | subNo), data = dataFrame2)
summary(model2)
anova(model2)
No conceptual mistake here, just a mixed-up index value: you should be using index rather than i to index x in your data-generation loop.
Basically, due to the mix-up you were using the first subject's x values to generate data for all the subjects, but using each subject's individual x values in the model.
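Concretely, the data-generation line inside the inner loop would become (a sketch of the fix described above):
y[index] <- gamma00 + gamma01*w[j] + uo[j] + beta1*x[index] + e[index]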
