Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
I need a non-time-series dataset for evaluating various forecasting techniques in R, and I can't find a suitable one. The requirements are: one dependent variable and at least 4 continuous independent variables, with no factor or binary columns. Please help me out if you have such data.
You could make one yourself:
var1 <- runif(100, 10, 100)
var2 <- rnorm(100, 20, 3)
var3 <- runif(100, 0.1, 0.9)
var4 <- rnorm(100, 1000, 2000)
odds <- (var1 + var2 + 50 * var3 + var4 / 100)/100
dv <- rbinom(100, 1, odds/(1 + odds))
df <- data.frame(dv, var1, var2, var3, var4)
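If a continuous response is wanted rather than a binary `dv`, the same idea carries over. This is only a minimal sketch, not part of the answer above: the coefficients, the noise level, and the names `sim` and `y` are arbitrary assumptions chosen for illustration.

```r
set.seed(123)  # make the simulation reproducible
n <- 100

# Four continuous predictors, drawn the same way as in the answer above
var1 <- runif(n, 10, 100)
var2 <- rnorm(n, 20, 3)
var3 <- runif(n, 0.1, 0.9)
var4 <- rnorm(n, 1000, 2000)

# Continuous response: a linear signal plus Gaussian noise
# (all coefficients are arbitrary, hypothetical values)
y <- 5 + 0.3 * var1 - 1.2 * var2 + 40 * var3 + 0.002 * var4 + rnorm(n, 0, 5)

sim <- data.frame(y, var1, var2, var3, var4)
str(sim)  # every column should be numeric, with no factors
```

Because the generating formula is known, you can check how close each technique gets to the coefficients you chose.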
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 1 year ago.
My dataset contains multiple observations (Mean Intensity gfp) for different species (aapl2). Each species has several observations.
I have already grouped the observations by species and calculated the 95th percentile with:
data2 = aggregate(data$"Mean Intensity gfp" ~ data$aapl2, FUN = quantile, probs = c(0.95))
Now I have a problem that I don't know how to solve.
I need to calculate the median and mean of the calculated 95th percentiles, but I don't know how to do it.
Could somebody help me, please?
Thank you very much.
Using iris as example data: to get the mean and median of the values that fall below the 95th percentile of their species:
library(data.table)
data.table(iris)[, keep := Petal.Length < quantile(Petal.Length, 0.95),
                 by = Species][
                   keep == TRUE,
                   .(mean(Petal.Length), median(Petal.Length)),
                   by = Species]
And using dplyr
library(dplyr)
iris %>%
  group_by(Species) %>%
  filter(Petal.Length < quantile(Petal.Length, 0.95)) %>%
  summarize(mean = mean(Petal.Length),
            med = median(Petal.Length))
Write a program (using R) to calculate P(X + Y + Z = k) for arbitrary discrete non-negative random variables (rv's) X, Y, and Z.
This is an exercise from my book. I have no idea how to start.
Thank you for your time.
Here's a start
First, define three vectors, each containing 30 draws from a different binomial distribution (you can define X, Y, and Z as any discrete distribution you want):
X = rbinom(30, 10, 0.8)
Y = rbinom(30, 5, 0.5)
Z = rbinom(30, 8, 0.3)
Then, create a function that estimates the probability as the proportion of element-wise sums X+Y+Z that equal k:
probability <- function(k) {
  combine <- X + Y + Z
  return(sum(combine == k) / length(combine))
}
An example call with k=14 (the result varies from run to run because the draws are random):
> probability(k=14)
[1] 0.2333333
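The value above is an estimate from 30 random draws. If the three variables are independent and have finite support, the probability can also be computed exactly by convolving their probability mass functions. The exercise does not state independence, so that is an assumption here, and the helper names `convolve_pmf` and `prob_sum` are made up for this sketch.

```r
# A pmf is stored as a vector indexed from 0: p[i + 1] = P(V = i)

# pmf of the sum of two independent variables
convolve_pmf <- function(p, q) {
  out <- numeric(length(p) + length(q) - 1)
  for (i in seq_along(p)) {
    idx <- i:(i + length(q) - 1)
    out[idx] <- out[idx] + p[i] * q
  }
  out
}

# P(X + Y + Z = k) for independent X, Y, Z with pmfs px, py, pz
prob_sum <- function(k, px, py, pz) {
  pmf <- convolve_pmf(convolve_pmf(px, py), pz)
  if (k < 0 || k >= length(pmf)) return(0)
  pmf[k + 1]
}

# Same binomial distributions as in the answer above
px <- dbinom(0:10, 10, 0.8)
py <- dbinom(0:5, 5, 0.5)
pz <- dbinom(0:8, 8, 0.3)
prob_sum(14, px, py, pz)
```

Any other discrete distribution works the same way as long as you can write its pmf as a vector starting at 0.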
I want R code for generating 1000 random values each for x and y, between 1 and 7, such that x <= y and all the numbers are identically distributed.
With floating-point numbers:
x <- numeric(1000)
y <- numeric(1000)
for (i in 1:1000) {
  x[i] <- runif(1, 1, 7)      # x uniform on [1, 7]
  y[i] <- runif(1, x[i], 7)   # y uniform on [x[i], 7], so x[i] <= y[i]
  # print or do something you want here
}
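With integers:
x <- integer(1000)
y <- integer(1000)
for (i in 1:1000) {
  x[i] <- sample.int(7, 1)
  # sample(x[i]:7, 1) would draw from 1:7 when x[i] is 7, so offset sample.int() instead
  y[i] <- x[i] - 1L + sample.int(8L - x[i], 1)
  # print or do something you want here
}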
You can try it.
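A loop is not strictly needed. As a vectorized sketch, you could draw two independent uniforms per pair and sort them with `pmin()` and `pmax()`. Note that this is an alternative, not a rewrite of the loops: here the pairs are spread uniformly over the region x <= y, whereas the loops draw y conditionally on x. The names `a` and `b` are just temporaries.

```r
set.seed(1)  # for reproducibility
n <- 1000

# Two independent draws per pair, each uniform on [1, 7]
a <- runif(n, 1, 7)
b <- runif(n, 1, 7)

# The smaller draw becomes x, the larger becomes y, so x <= y always holds
x <- pmin(a, b)
y <- pmax(a, b)

all(x <= y)
```

For integers, replacing `runif(n, 1, 7)` with `sample.int(7, n, replace = TRUE)` gives the same construction.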
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
I am an absolute beginner with R so please bear with me.
I have some generated polynomial (squared) data
x.training <- seq(0, 5, by=0.01) # x data
error.training <- rnorm(n=length(x.training), mean=0, sd=1) # Error (0, 1)
y.training <- x.training^2 + error.training # y data
I want to apply 3 different regression models to this data to demonstrate which one has a better fit. My 3 models are linear, polynomial, and trigonometric (cos).
I have tried the following but the lines either don't show up or are just straight lines. How could I go about applying these models properly?
Full code:
x.training <- seq(0, 5, by=0.01) # x data
error.training <- rnorm(n=length(x.training), mean=0, sd=1) # Error (0, 1)
y.training <- x.training^2 + error.training # y data
linear.model <- lm(y.training~x.training)
poly.model <- lm(y.training~poly(x.training, 2))
trig.model <- lm(y.training~cos(x.training))
linear.predict <- predict(linear.model)
poly.predict <- predict(poly.model)
trig.predict <- predict(trig.model)
plot(x.training, y.training)
lines(linear.predict, col="red")
lines(poly.predict, col="blue")
lines(trig.predict, col="green")
Absolutely simple mistake on my part. I feel silly.
lines(x.training, linear.predict, col="red")
lines(x.training, poly.predict, col="blue")
lines(x.training, trig.predict, col="green")
I wasn't feeding in any X coordinates, and predict only returns Y-hat.
Much better!
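To back up the plot with numbers, one option is to compare the three fits by R-squared and AIC. This is only a sketch: the seed, the `models` list, and the `fit.summary` data frame are my own additions, and other criteria (for example error on held-out data) may suit your purpose better.

```r
set.seed(42)  # reproducible noise
x.training <- seq(0, 5, by = 0.01)
y.training <- x.training^2 + rnorm(n = length(x.training), mean = 0, sd = 1)

# The same three models as in the question
models <- list(
  linear = lm(y.training ~ x.training),
  poly   = lm(y.training ~ poly(x.training, 2)),
  trig   = lm(y.training ~ cos(x.training))
)

# Higher R-squared and lower AIC both indicate a better fit to the training data
fit.summary <- data.frame(
  r.squared = sapply(models, function(m) summary(m)$r.squared),
  aic       = sapply(models, AIC),
  row.names = names(models)
)
fit.summary
```

Since the data were generated from a quadratic, the polynomial model should come out ahead on both columns.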
I'm having trouble solving a normal distribution problem in R. I'm unfamiliar with the syntax and would like some help.
If X~N(2,9), compute
a. P(X>=2)
b. P(1<=X<7)
c. P(-2.5<=X<-1)
d. P(-3<=X-2<3)
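You are looking for the pnorm function, which is the normal CDF. Note that the code below passes 9 as the standard deviation; if N(2, 9) denotes a variance of 9, as is the usual convention, use sd = 3 instead. So you want to do something like: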
# A
1 - pnorm(2, mean = 2, sd = 9) # = 0.5
# B
pnorm(7, mean = 2, sd = 9) - pnorm(1, mean = 2, sd = 9) # = 0.255
I think you can figure out the last two yourself.
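For reference, here is a sketch of parts c and d. It assumes N(2, 9) means mean 2 and variance 9, so the standard deviation is 3; that is a different assumption from the sd = 9 used in the code above. The variable names are made up for illustration.

```r
mu    <- 2
sigma <- sqrt(9)  # assumption: the 9 in N(2, 9) is the variance

# C: P(-2.5 <= X < -1)
p_c <- pnorm(-1, mean = mu, sd = sigma) - pnorm(-2.5, mean = mu, sd = sigma)

# D: P(-3 <= X - 2 < 3) is the same as P(-1 <= X < 5)
p_d <- pnorm(5, mean = mu, sd = sigma) - pnorm(-1, mean = mu, sd = sigma)

c(C = p_c, D = p_d)
```

Part d is the probability of falling within one standard deviation of the mean, so under this assumption it comes out to about 0.683. For a continuous distribution, `<` and `<=` give the same probability.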