Is it possible to show the values of variables in an equation?
(not only for a + b, but for complex formulas too)
Basically:
x = 10
y = 20
x + y
>>> ans =
30.
How can I do something like this:
x = 10
y = 20
x + y
>>> ans = 10 + 20 =
30.
Or:
x = 10
y = 20
x + y
>>> ans =
x + y = 10 + 20 = 30.
(like mathcad explicit)
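The question does not name a language, so the following is only a minimal sketch in R (an assumption, chosen because the rest of this page uses R). The helper name show_values is hypothetical. It captures the unevaluated expression, substitutes the current values of its variables, and prints both forms next to the result:

```r
# Hypothetical helper: show an expression, the same expression with
# variable values plugged in, and the evaluated result.
show_values <- function(expr, env = parent.frame()) {
  e <- substitute(expr)                                    # unevaluated expression, e.g. x + y
  vals <- mget(all.vars(e), envir = env, inherits = TRUE)  # current values of its variables
  filled <- do.call(substitute, list(e, vals))             # expression with values substituted
  paste(deparse(e), "=", deparse(filled), "=", format(eval(e, env)))
}

x <- 10
y <- 20
show_values(x + y)
# [1] "x + y = 10 + 20 = 30"
```

Because all.vars() only collects variable names, function names such as sqrt are left as they are, so the same helper should also work for longer formulas. Very long expressions may be split by deparse() into several strings.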
I have written code that generates x and y data, and I am able to plot it.
# Number of observations
n <- 250
# x randomly drawn from a continuous uniform distribution with bounds [0,1]
x <- runif(n = n, min = 0, max = 1)
# Error term from Normal distribution
error <- rnorm(n = n, mean = 0, sd = 2)
beta_0 <- 1
beta_1 <- -1
y <- beta_0*x + (beta_1*x - error)
library(tibble)
df <- tibble(x = x, y = y)
df
library(ggplot2)
ggplot(data = df, aes(x = x, y = y)) + geom_point() +
  labs(title = "y = f(x)")
I get a graph image like this:
I also get a data table like this of different coordinate data:
x       y
0.139   -2.87
0.981   1.48
I would like to now randomly classify my data, such that my table looks like:
x       y       Group1   Group2
0.139   -2.87   -1       1
0.981   1.48    1        -1
Here 1 means the point is a member of the group and -1 means the point is not affiliated with the group. On the graph, this would show up as blue dots for Group1 membership and red dots for Group2 membership.
Any help with this would be greatly appreciated.
Thank you.
To do it the way you suggested (with one column for group 1 and one column for group 2), you could do:
library(dplyr)
library(ggplot2)
df %>%
  mutate(group1 = sample(c(-1, 1), n, TRUE),
         group2 = -group1) %>%
  ggplot(aes(x = x, y = y, color = factor(group1))) +
  geom_point() +
  # factor levels are -1 (Group 2 member) and 1 (Group 1 member), in that order
  scale_color_brewer('group', palette = 'Set1',
                     labels = c('Group 2', 'Group 1')) +
  labs(title = "y = f(x)")
However, it seems a bit redundant to me having two mutually exclusive binary columns. You could just have a single column called group which is either group 1 or group 2:
df %>%
  mutate(group = sample(c('Group 1', 'Group 2'), n, TRUE)) %>%
  ggplot(aes(x = x, y = y, color = group)) +
  geom_point() +
  scale_color_brewer(palette = 'Set1') +
  labs(title = "y = f(x)")
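If only the table is needed and not the plot, the two mutually exclusive columns can also be added in base R. This is a minimal sketch; the data frame below is a stand-in with made-up values (an assumption), not the original df:

```r
set.seed(42)                      # only to make the sketch reproducible
n <- 250
df <- data.frame(x = runif(n), y = rnorm(n))   # stand-in data (assumption)

# Randomly assign each point to Group1 (1) or not (-1);
# Group2 is the exact opposite, so the columns are mutually exclusive.
df$Group1 <- sample(c(-1, 1), n, replace = TRUE)
df$Group2 <- -df$Group1

head(df)
table(df$Group1, df$Group2)       # only the (-1, 1) and (1, -1) cells are non-zero
```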
If I have the following variables
x <- data.frame(ret = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) )
k <- 4
and I want to get y such that
y[i,1] = x[i,1]*(1/k) + x[i+1,1]*(2/k) + x[i+2,1]*(3/k) + x[i+3,1]*(k/k)
for i = 1, 2, ..., nrow(x) - k + 1.
How can I achieve this?
It is basically a sum over each window of k consecutive values, where the j-th value in the window is multiplied by j/k.
For the given x as input, the output will have the following values:
y
7.5    # y[1,1] = x[1,1]*0.25 + x[2,1]*0.5 + x[3,1]*0.75 + x[4,1]*1
10
12.5
15
17.5
20
22.5
25
27.5
30
32.5
35
Use rollapply with the indicated function:
library(zoo)
wsum <- function(x, k) sum(seq(k) * x) / k
transform(x, ret = rollapply(ret, k, wsum, k = k, align = "left", fill = NA))
Update
An alternative that allows us to omit the k = k argument is:
wsum <- function(x, k = length(x)) sum(seq(k) * x) / k
transform(x, ret = rollapply(ret, k, wsum, align = "left", fill = NA))
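As a cross-check that does not depend on zoo, the same weighted window sum can be sketched in base R. The helper name wsum_window is hypothetical and only used for illustration. Unlike the rollapply call with fill = NA, this version returns just the nrow(x) - k + 1 complete windows:

```r
x <- data.frame(ret = 1:15)
k <- 4

# Weighted sum of one window: the j-th element is multiplied by j / k
wsum_window <- function(v) sum(seq_along(v) * v) / length(v)

# One window per valid starting index i = 1, ..., nrow(x) - k + 1
starts <- seq_len(nrow(x) - k + 1)
y <- sapply(starts, function(i) wsum_window(x$ret[i:(i + k - 1)]))
y
# [1]  7.5 10.0 12.5 15.0 17.5 20.0 22.5 25.0 27.5 30.0 32.5 35.0
```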
Suppose I have
y <- 10:15
x <- 1:6
ggplot()+geom_line(aes(x = x,y = y))+scale_y_continuous(limits = c(min(y),max(y)),sec.axis = sec_axis(trans = ~. ))
I want a transformation on the secondary axis that gives me values from 0 to 1. That is, take the y value, subtract min(y), and then divide the result by (max(y) - min(y)).
The problem is that I have to take the value, subtract min(y), and then transform the result again, and I can't get this to work. If I try trans = ~.-min(y)/(max(y)-min(y)), I don't get what I want. How can I make it understand that (y value - min(y)) is my new value, and then divide that?
You can wrap your calculation steps in a temporary function:
y <- 10:15
x <- 1:6
my_fun <- function(y) {
  (y - min(y)) / (max(y)-min(y))
}
ggplot() +
  geom_line(aes(x = x,y = y)) +
  scale_y_continuous(limits = c(min(y),
                                max(y)),
                     sec.axis = sec_axis(trans = ~ my_fun(.) ))
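The original attempt most likely failed because of operator precedence: in R, division binds tighter than subtraction, so only min(y) gets divided. A small sketch outside of ggplot2 shows the difference (the names wrong and right are just for illustration):

```r
y <- 10:15

# Parsed as y - (min(y) / (max(y) - min(y))), i.e. y - 2
wrong <- y - min(y) / (max(y) - min(y))

# Parentheses force the subtraction to happen first
right <- (y - min(y)) / (max(y) - min(y))

wrong
# [1]  8  9 10 11 12 13
right
# [1] 0.0 0.2 0.4 0.6 0.8 1.0
```

So a formula with the extra parentheses, ~ (. - min(y)) / (max(y) - min(y)), should also work in place of the helper function.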
I'm trying to combine 3 matrices to one plot.
I'm trying to simulate a mark-recapture scenario. However, instead of having 1 population, there are 3 (which are contained in each of their matrices).
Because I want to sample from each population once, the x-axis will range from 0-300, where:
1-100 corresponds to the samples collected from population 1
101-200 from population 2
201-300 from population 3
The only deviation from the picture is that I'd like a continuous line, from 0-300.
I have the code to create these matrices and made each matrix the same size, but I don't know how to 1) convert and plot them using ggplot2, or 2) put all three on one graph.
## Population size
N <- 400
N
## Vector labeling each item in the population
pop <- c(1:N)
pop
## Lower and upper bounds of sample size
lower.bound <- round(x = .05 * N, digits = 0)
lower.bound ## Smallest possible sample size
upper.bound <- round(x = .15 * N, digits = 0)
upper.bound ## Largest possible sample size
## Length of sample size interval
length.ss.interval <- length(c(lower.bound:upper.bound))
length.ss.interval ## total possible sample sizes, ranging from lower.bound to upper.bound
## Determine a sample size randomly (not a global variable...simply for test purposes)
## Between lower and upper bounds set previously
## Give equal weight to each possible sample size in this interval
sample(x = c(lower.bound:upper.bound),
size = 1,
prob = c(rep(1/length.ss.interval, length.ss.interval)))
## Specify number of samples to take
n.samples <- 100
## Initiate empty matrix
## 1st column is population (item 1 through item 400)
## 2nd through nth column are all rounds of sampling
dat <- matrix(data = NA,
nrow = length(pop),
ncol = n.samples + 1)
dat[,1] <- pop
## Take samples of random sizes
## Record results in columns 2 through n
## 1 = sampled (marked)
## 0 = not sampled (not marked)
for(i in 2:ncol(dat)) {
a.sample <- sample(x = pop,
size = sample(x = c(lower.bound:upper.bound),
size = 1,
prob = c(rep(1/length.ss.interval, length.ss.interval))),
replace = FALSE)
dat[,i] <- dat[,1] %in% a.sample
}
## How large was each sample size?
apply(X = dat, MARGIN = 2, FUN = sum)
## 1st element is irrelevant
## 2nd element through nth element: sample size for each of the 100 samples
schnabel.comp <- data.frame(sample = 1:n.samples,
n.sampled = apply(X = dat, MARGIN = 2, FUN = sum)[2:length(apply(X = dat, MARGIN = 2, FUN = sum))]
)
## First column: which sample, 1-100
## Second column: number selected in that sample
## How many items were previously sampled?
## For 1st sample, it's 0
## For 2nd sample, code is different than for remaining samples
n.prev.sampled <- c(0, rep(NA, n.samples-1))
n.prev.sampled
n.prev.sampled[2] <- sum(ifelse(test = dat[,3] == 1 & dat[,2] == 1,
yes = 1,
no = 0))
n.prev.sampled
for(i in 4:ncol(dat)) {
n.prev.sampled[i-1] <- sum(ifelse(test = dat[,i] == 1 & rowSums(dat[,2:(i-1)]) > 0,
yes = 1,
no = 0))
}
schnabel.comp$n.prev.sampled <- n.prev.sampled
## n.newly.sampled: in each sample, how many items were newly sampled?
## i.e., never seen before?
schnabel.comp$n.newly.sampled <- with(schnabel.comp,
n.sampled - n.prev.sampled)
## cum.sampled: how many total items have you seen?
schnabel.comp$cum.sampled <- c(0, cumsum(schnabel.comp$n.newly.sampled)[1:(n.samples - 1)])
## numerator of schnabel formula
schnabel.comp$numerator <- with(schnabel.comp,
n.sampled * cum.sampled)
## denominator of Schnabel formula is n.prev.sampled
## pop.estimate -- after each sample (starting with 2nd -- need at least two samples)
schnabel.comp$pop.estimate <- NA
for(i in 1:length(schnabel.comp$pop.estimate)) {
schnabel.comp$pop.estimate[i] <- sum(schnabel.comp$numerator[1:i]) / sum(schnabel.comp$n.prev.sampled[1:i])
}
## Plot population estimate after each sample
if (!require("ggplot2")) {install.packages("ggplot2"); require("ggplot2")}
if (!require("scales")) {install.packages("scales"); require("scales")}
small.sample.dat <- schnabel.comp
small.sample <- ggplot(data = small.sample.dat,
mapping = aes(x = sample, y = pop.estimate)) +
geom_point(size = 2) +
geom_line() +
geom_hline(yintercept = N, col = "red", lwd = 1) +
coord_cartesian(xlim = c(0, 100), ylim = c(300, 500)) +
scale_x_continuous(breaks = pretty_breaks(11)) +
scale_y_continuous(breaks = pretty_breaks(11)) +
labs(x = "\nSample", y = "Population estimate\n",
title = "Sample sizes are between 5% and 15%\nof the population") +
theme_bw(base_size = 12) +
theme(aspect.ratio = 1)
small.sample
It seems that what you want to do is...
Given three data frames like this:
> d1
x y
1 1 0.899683096
2 2 0.604513234
3 3 0.005824789
4 4 0.442692758
5 5 0.103125175
> d2
x y
1 1 0.35260029
2 2 0.06248654
3 3 0.79272047
> d3
x y
1 1 0.4791399
2 2 0.2583674
3 3 0.1283629
4 4 0.7133847
Construct d:
> d = rbind(d1,d2,d3)
> d$x = 1:nrow(d)
> d
x y
1 1 0.899683096
2 2 0.604513234
3 3 0.005824789
4 4 0.442692758
5 5 0.103125175
6 6 0.352600287
7 7 0.062486543
8 8 0.792720473
9 9 0.479139947
10 10 0.258367356
11 11 0.128362933
12 12 0.713384651
And then plot x against y as normal.
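Applied to the mark-recapture case, the same idea can be sketched as follows. The pop.estimate values here are random placeholders standing in for the three real per-population results (an assumption), and make_pop is a hypothetical helper:

```r
set.seed(1)
n.samples <- 100

# Placeholder data frame for one population (values are made up)
make_pop <- function(id) {
  data.frame(sample = 1:n.samples,
             pop.estimate = runif(n.samples, 300, 500),
             population = id)
}

# Stack the three populations and add a continuous index 1..300
d <- rbind(make_pop(1), make_pop(2), make_pop(3))
d$x <- seq_len(nrow(d))

range(d$x[d$population == 2])
# [1] 101 200
```

With d in this shape, ggplot(d, aes(x = x, y = pop.estimate)) + geom_line() should draw one continuous line across all 300 samples.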
When using recursion in R, it would be useful to have recursive environments as well. For example, in the example below, it would be useful for the code to print 1 to 9. That is, the x in the environment of each recursion would be one more than the x in the parent environment. Is there an easy way to modify the code such that this is the case?
x = 1
y = function() {
print(x)
x = x + 1
if (x <= 10) y()
}
Edit: a more complicated situation would just involve more variables:
w = 1
x = 2
y = 3
z = 4
y = function() {
print(x)
w = w + 1
x = w + x
y = x + y
z = y + z
if (w <= 10) y()
}
Now instead of four variables, say there's 50 variables. This couldn't be solved very easily through argument passing.
Edit 2:
In edit 1, what I'm hoping for would be something like this:
global: w = 1, x = 2, y = 3, z = 4
recursion 1: w = 2, x = 4, y = 7, z = 11
recursion 2: w = 3, x = 7, y = 14, z = 25
etc. Excuse math errors.
You could use a while loop and build up vectors of the variable states at each iteration. If it runs for a long time, this might become inefficient, however.
w <- 1
x <- 2
y <- 3
z <- 4
while(w[1] <= 10){
w <- c(w[1] + 1, w)
x <- c(w[1] + x[1], x)
y <- c(x[1] + y[1], y)
z <- c(y[1] + z[1], z)
}
cbind(w, x, y, z)
If you really want to use recursive environments (although I'd prefer a looping solution) you can get around the problem by passing all four variables along in a vector.
y <- function(v=c(w=1, x=2, y=3, z=4)) {
  print(v["x"])
  v["w"] <- 1 + v["w"]
  v["x"] <- v["w"] + v["x"]
  v["y"] <- v["x"] + v["y"]
  v["z"] <- v["y"] + v["z"]
  if (v["w"] <= 10){
    y(v)
  } else {
    v
  }
}
y()
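For completeness, something close to the "recursive environments" asked about can be sketched by looking variables up in the caller's frame with parent.frame() instead of relying on R's lexical scoping. This is only an illustration under that assumption; the function name trace_x is hypothetical, and this kind of dynamic lookup is usually considered fragile compared with passing arguments:

```r
x <- 1

trace_x <- function(limit = 10) {
  caller <- parent.frame()          # frame of whoever called us
  x <- get("x", envir = caller)     # the caller's x, not the global one
  seen <- x                         # record the value this level sees
  x <- x + 1                        # local x, visible to the next call
  if (x <= limit) seen <- c(seen, trace_x(limit))
  seen
}

trace_x()
# [1]  1  2  3  4  5  6  7  8  9 10
```

Each call reads x from the call that invoked it, so every recursion level sees a value one larger than its parent, and the global x is left unchanged.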