Suppose I have a vector of numbers that I want to find a general cutoff for. For example:
x <- c(35, 2, 3, 30, 1, 4, 33, 6, 36)
In this case, I would want to extract only the subset that contains 35, 30, 33, 36, so the cutoff would be at 30. Rather than hardcoding a fixed cutoff, I would like my code to adapt to different vectors of numbers and find that cutoff itself.
Another example would be:
x <- c(1, 20, 42, 13, 118, 149, 130, 30, 11, 32, 120, 0.5, 0.03)
In this case, a reasonable cutoff would be around 118.
Currently I am hardcoding the cutoffs because I am dealing with simple cases, but I would like to make this process more modular for more variable vectors.
You could use the quantile function
cutoff <- function(y, prob=0.7) y[y > quantile(y, prob)]
x <- c(35, 2, 3, 30, 1, 4, 33, 6, 36)
cutoff(x)
[1] 35 33 36
x <- c(1, 20, 42, 13, 118, 149, 130, 30, 11, 32, 120, 0.5, 0.03)
cutoff(x)
[1] 118 149 130 120
And you can define a different probability as desired
cutoff(x, 0.8)
[1] 149 130 120
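To see why 30 is not returned for the first vector, it may help to look at the threshold itself. The following is a minimal sketch, assuming the default interpolation of quantile() (type = 7); the variable names thr and above are only illustrative.

```r
x <- c(35, 2, 3, 30, 1, 4, 33, 6, 36)

# The 70% quantile falls between 30 and 33, so 30 is excluded
thr <- quantile(x, 0.7)
thr

above <- x[x > thr]
above
# [1] 35 33 36

# A lower probability (here the median) moves the threshold below 30
x[x > quantile(x, 0.5)]
# [1] 35 30 33 36
```

The choice of prob is therefore still a tuning decision; the quantile only makes the cutoff relative to the data rather than absolute.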
Note: The initial problem was "Sensitivity analysis for ODE with parameters that include lists", as the sensRange function gave an error due to the lists passed in the parameters. The question evolved once the list parameters were fixed and a different problem appeared: the sensitivity analysis showed strange results, with a standard deviation of 0 for all runs.
I have a model simulating the concentrations of a chemical over time in R using the deSolve package. The parameters I use are the chemical properties (volume of distribution, halflife etc) as well as weight over time. Weight is given in a list which iterates over each time step in the ODE due to weight increase over time, and the chemical properties are given as a numeric.
I would like to perform a sensitivity analysis for this model and test only the chemical properties. I have not done a sensitivity analysis before, but I am trying to follow examples using sensRange(). However, sensRange() does not seem to allow one of the parameters to be given as a list. I get the error:
Error in yRef[, ivar] : invalid subscript type 'list'
My code for the model and global sensitivity analysis is set up like this:
library(FME)
library(deSolve)
c.weight <- c(3.5, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 22, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 26, 27, 27, 27, 27, 28, 28, 28, 29, 29, 29, 30, 30, 30, 31, 31, 31, 32, 32, 32, 33, 33, 33, 34, 34, 34, 35, 35, 35, 36, 36, 36, 37, 37, 37, 38, 38, 38, 39, 39, 39, 40, 40, 40, 41, 41, 41, 42, 42, 42, 43, 43, 43, 44, 44, 44, 45, 45, 45, 46, 46, 46, 47, 47, 47, 48, 48, 48, 49, 49, 49, 50, 50, 50, 51, 51, 52, 54, 55)
params.sens <- c(vd = 0.2,
pm = 0.05,
halflife = 5,
dose = 0.001,
c.weight = c.weight)
solve.sens <- function(pars) {
  sens.model <- function(times, state, parameters) {
    with(as.list(c(state, parameters)), {
      if (times <= 5) {
        volume <- (((0.2 * times) * c.weight[times+1]) * 30)
        transferred <- pm * volume
      } else if (times > 5 & times <= 14) {
        volume <- ((0.1 * c.weight[times+1]) * 30)
        transferred <- pm * volume
      } else {
        transferred <- 0
      }
      intake.c <- (dose * c.weight[times+1])
      elimination.c <- concentration * vd * c.weight[times+1] * log(2) / (halflife * 12)
      #total
      concentration <- (intake.c + transferred - elimination.c) / (vd * c.weight[times+1])
      list(c(concentration))
    })
  }
  state <- c(concentration = 0.5)
  months <- seq(0, 144, 1)
  return(as.data.frame(ode(y = state, times = months, func = sens.model, parms = params.sens)))
}
out <- solve.sens(params.sens)
parRanges <- data.frame(min = c(0.02, 0.1, 2.1), max = c(0.09, 0.2, 5.5))
rownames(parRanges) <- c("pm", "vd", "halflife")
sens <- sensRange(func = solve.sens, parms = params.sens, dist = "latin", sensvar = "concentration", parRange = parRanges, num = 50)
head(summary(sens))
summ.sens <- summary(sens)
plot(summ.sens, xlab = "months", ylab = "concentration")
I don't know how to go forward. Does anyone have any tips or see where my mistake is?
Edit: I followed the bacterial growth model from Soetaert and Herman (2009) to correct my =/<- errors and added the parameter values from the comments into the model. It now runs without error; however, the summary shows identical values for all statistics (mean, min, max, and all quantiles), so I assume it is not running correctly:
               x      Mean Sd       Min       Max       q05       q25       q50       q75       q95
concentration0 0  0.500000  0  0.500000  0.500000  0.500000  0.500000  0.500000  0.500000  0.500000
concentration1 1  1.246348  0  1.246348  1.246348  1.246348  1.246348  1.246348  1.246348  1.246348
concentration2 2  3.475493  0  3.475493  3.475493  3.475493  3.475493  3.475493  3.475493  3.475493
concentration3 3  7.170403  0  7.170403  7.170403  7.170403  7.170403  7.170403  7.170403  7.170403
concentration4 4 12.314242  0 12.314242 12.314242 12.314242 12.314242 12.314242 12.314242 12.314242
concentration5 5 18.890365  0 18.890365 18.890365 18.890365 18.890365 18.890365 18.890365 18.890365
The original post confused = and <- to create a named vector, so I recommended the following code snippet:
params.sens <- c(vd = p.vd,
pm = p.pm,
halflife = p.hl,
dose = concentrations$dose,
c.weight = variables$c.weight)
After the edit made by the original poster, this answer became mostly obsolete and was deleted. However, a later post showed that the distinction between <- (assignment) and = (parameter matching; here, creation of a named vector) was not yet completely clear.
Here is an example that shows the difference. To avoid confusion, run it in a fresh R session or clear the workspace first:
#rm(list=ls()) # uncomment this to clear the work space
x <- c(a <- 2, b <- 3)
y <- c(d = 2, e = 3)
x
y
ls()
Here we see that y is a named vector, whereas x has no named elements. Instead, a and b were created "on the fly" as variables in the user workspace:
> x
[1] 2 3
> y
d e
2 3
> ls()
[1] "a" "b" "x" "y"
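The same difference matters for the with(as.list(...)) idiom used in the model function, because it can only look up elements that have names. A small sketch with made-up parameter values (the object names named and unnamed are hypothetical):

```r
# '=' inside c() names the elements
named <- c(vd = 0.2, halflife = 5)

# '<-' inside c() assigns global variables and leaves the vector unnamed
unnamed <- c(vd <- 0.2, halflife <- 5)

names(named)    # "vd" "halflife"
names(unnamed)  # NULL

# Works because as.list(named) has elements called vd and halflife
with(as.list(named), vd * halflife)
```

With the unnamed vector, the lookup inside with() would silently fall back to whatever variables called vd and halflife happen to exist in the workspace.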
Thanks to fixing the assignment mistakes, the script now runs through and the behavior can be reproduced.
Now we can see that the parameters passed to solve.sens were not passed down to the ode function.
A fix is to replace params.sens in the ode call with pars, the argument that was passed to the calling function:
return(as.data.frame(ode(y = state, times = months, func = sens.model, parms = pars)))
Then
plot(summ.sens, xlab = "months", ylab = "concentration")
results in a plot of concentration against months (figure not included).
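The zero standard deviation can be reproduced without deSolve or FME. The following toy sketch (all names and values are hypothetical, and a simple exponential stands in for the ODE solution) compares a wrapper that ignores its pars argument with one that uses it:

```r
global.pars <- c(rate = 0.5)

# Ignores its argument and always reads the global vector
run.ignoring <- function(pars) unname(10 * exp(-global.pars["rate"]))

# Uses the parameters it was given
run.using <- function(pars) unname(10 * exp(-pars["rate"]))

# Stand-in for the parameter sets a sampler such as sensRange would try
rates <- c(0.1, 0.5, 0.9)
res.ignoring <- sapply(rates, function(r) run.ignoring(c(rate = r)))
res.using    <- sapply(rates, function(r) run.using(c(rate = r)))

sd(res.ignoring)  # 0: every run returned the same value
sd(res.using)     # greater than 0: runs differ with the parameters
```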
I'm looking to find multiple max values using multiple ranges from a single table without using a loop.
It's difficult to explain, but here's an example:
list_of_value <- c(100, 110, 54, 64, 73, 23, 102)
beginning_of_max_range <- c(1, 2, 4)
end_of_max_range <- c(3, 5, 6)
Expected output:
110, 110, 73
which corresponds to:
max(100, 110, 54)
max(110, 54, 64, 73)
max(64, 73, 23)
You may do this with mapply -
list_of_value <- c(100, 110, 54, 64, 73, 23, 102)
beginning_of_max_range <- c(1, 2, 4)
end_of_max_range <- c(3, 5, 6)
mapply(function(x, y) max(list_of_value[x:y]), beginning_of_max_range, end_of_max_range)
#[1] 110 110 73
We create a sequence from each element of beginning_of_max_range to the corresponding element of end_of_max_range, use it to subset list_of_value, and take the max for each pair.
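The two steps can also be written out separately, which makes the intermediate index ranges visible. This is only a sketch of the same idea using Map() and vapply(); the names ranges and range_max are made up for illustration.

```r
list_of_value <- c(100, 110, 54, 64, 73, 23, 102)
beginning_of_max_range <- c(1, 2, 4)
end_of_max_range <- c(3, 5, 6)

# One index sequence per (begin, end) pair: 1:3, 2:5, 4:6
ranges <- Map(seq, beginning_of_max_range, end_of_max_range)

# vapply() checks that every range yields exactly one number
range_max <- vapply(ranges, function(idx) max(list_of_value[idx]), numeric(1))
range_max
# [1] 110 110  73
```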
I need to create a knight tour plot out of such an exemplary matrix:
Mat = matrix(c(1, 38, 55, 34, 3, 36, 19, 22,
54, 47, 2, 37, 20, 23, 4, 17,
39, 56, 33, 46, 35, 18, 21, 10,
48, 53, 40, 57, 24, 11, 16, 5,
59, 32, 45, 52, 41, 26, 9, 12,
44, 49, 58, 25, 62, 15, 6, 27,
31, 60, 51, 42, 29, 8, 13, 64,
50, 43, 30, 61, 14, 63, 28, 7), nrow=8, ncol=8, byrow=TRUE)
The numbers indicate the order in which the knight moves to create a path.
I have a lot of results like this, with chessboards up to 75 in size, but I have no way of presenting them in a readable form. I found out that R, given the matrix, is capable of creating a plot like this:
link (this one is 50x50 in size)
So for the matrix I presented, lines are drawn between consecutive numbers, 1 - 2 - 3 - 4 - 5 - ... - 64, creating a path like the one in the link, but for the 8x8 chessboard instead of 50x50.
However, I have very limited time to learn R well enough to accomplish this, and I am desperate for any kind of direction. How hard is it going to be to write R code that transforms any such matrix into a plot like this? Or is it something trivial? Any code samples would be a blessing.
You can use geom_path as described here: ggplot2 line plot order
In order to do so you need to convert the matrix into a tibble.
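library(tibble)   # tibble()
library(dplyr)    # %>%, mutate(), arrange()
library(ggplot2)  # plotting
coords <- tibble(col = rep(1:8, 8),
row = rep(1:8, each = 8))
coords %>%
mutate(order = Mat[8 * (col - 1) + row]) %>% # column-major lookup of the move number
arrange(order) %>% # geom_path connects points in row order
ggplot(aes(x = col, y = row)) +
geom_path() +
geom_text(aes(y = row + 0.25, label = order)) +
coord_equal() # Ensures a square board.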
You can subtract .5 from the col and row positions to give a more natural chess board feel.
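For boards of other sizes, the coordinates can be derived from the matrix dimensions instead of the hard-coded 8. The following base R sketch assumes a square matrix holding the move numbers 1 to n^2; tour_coords is a hypothetical helper name, not part of any package.

```r
Mat <- matrix(c(1, 38, 55, 34, 3, 36, 19, 22,
                54, 47, 2, 37, 20, 23, 4, 17,
                39, 56, 33, 46, 35, 18, 21, 10,
                48, 53, 40, 57, 24, 11, 16, 5,
                59, 32, 45, 52, 41, 26, 9, 12,
                44, 49, 58, 25, 62, 15, 6, 27,
                31, 60, 51, 42, 29, 8, 13, 64,
                50, 43, 30, 61, 14, 63, 28, 7),
              nrow = 8, ncol = 8, byrow = TRUE)

# Hypothetical helper: board positions in move order
tour_coords <- function(mat) {
  idx <- order(mat)               # linear indices sorted by move number
  pos <- arrayInd(idx, dim(mat))  # convert them to (row, col) pairs
  data.frame(step = mat[idx], row = pos[, 1], col = pos[, 2])
}

tour <- tour_coords(Mat)
head(tour, 3)

# Sanity check: consecutive squares should be one knight's move apart
is_knight_move <- abs(diff(tour$row)) * abs(diff(tour$col)) == 2
all(is_knight_move)
```

The resulting data frame is already in path order, so it can be passed to geom_path(), or drawn in base R with plot(tour$col, tour$row, type = "l", asp = 1).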
I have x and y values for points (on a grid with discrete steps). I want to find those points that are in the same position or within a certain range from another point. I tried with the functions match(), duplicated(), which(), for loops, and if cases of different kinds and somehow got stuck.
As an example:
x <- c(23, 45, 98, 23, 12)
y <- c(15, 90, 10, 15, 70)
[1] and [4] would 'collide' in this case.
x <- c(24, 45, 98, 23, 12)
y <- c(14, 90, 10, 15, 70)
range<-1
[1] and [4] would again 'collide' in this case.
Either the indices or the values of the points will do; however, I need one result per collision.
This is brute force, but it should work well as long as x and y are not massive.
x <- c(24, 45, 98, 23, 12)
y <- c(14, 90, 10, 15, 70)
range <- 2
temp = as.matrix(dist(cbind(x, y)))
diag(temp) = Inf
unique(t(apply(which(temp < range, arr.ind = TRUE), 1, sort)))
# [,1] [,2]
#4 1 4
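A variation on the same brute-force idea is to look only at the upper triangle of the distance matrix, so that each pair appears once and no sorting or deduplication is needed. This is a sketch under the same assumptions; max_dist and pairs are illustrative names (max_dist replaces range to avoid masking the base function).

```r
x <- c(24, 45, 98, 23, 12)
y <- c(14, 90, 10, 15, 70)
max_dist <- 2

# Pairwise Euclidean distances between all points
d <- as.matrix(dist(cbind(x, y)))

# upper.tri() keeps only pairs with row < col, so each collision is listed once
pairs <- which(d < max_dist & upper.tri(d), arr.ind = TRUE)
pairs
```

Each row of pairs holds the indices of two colliding points; here that is the single pair 1 and 4.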
I am sorry if this question has already been asked, but I could not find it. I have the following data:
AGE<-c(25, 37, 57, 72, 48, 28, 31, 57, 43, 38)
LLS<-c(24, 1, 24, 24, 14, 21, 4, 12, 8, 1)
RLS<-c(11, 1, 14, 21, 7, 21, 22, 8, 27, 12)
dat <- data.frame(AGE, LLS, RLS)
I want to get the maximum of columns LLS and RLS for each row.
Can you please tell me how to do it?
Thanks.
You can try pmax
do.call(pmax, dat[-1])
#[1] 24 1 24 24 14 21 22 12 27 12
If this is for each pair of columns, you can use combn
res <- combn(names(dat),2, FUN=function(x) do.call(pmax,dat[x]))
colnames(res) <- apply(combn(names(dat),2),2, paste, collapse="_")
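For just the two columns in the question, pmax() can also be called directly and the result stored alongside the data. A minimal sketch; the column name MAXLS is made up for illustration.

```r
AGE <- c(25, 37, 57, 72, 48, 28, 31, 57, 43, 38)
LLS <- c(24, 1, 24, 24, 14, 21, 4, 12, 8, 1)
RLS <- c(11, 1, 14, 21, 7, 21, 22, 8, 27, 12)
dat <- data.frame(AGE, LLS, RLS)

# Element-wise (row-wise) maximum of the two columns
dat$MAXLS <- pmax(dat$LLS, dat$RLS)
dat$MAXLS
# [1] 24  1 24 24 14 21 22 12 27 12
```

If the columns may contain missing values, pmax() accepts na.rm = TRUE.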
I believe that for each row you want to return a single value, whichever is higher of RLS or LLS. Right?
If so, Akrun's answer is good. Alternatively, you can use the handy rowMaxs() function in the matrixStats package. In my opinion it is a little more straightforward, but that is the only real advantage.
Here is the code. You could combine it into fewer steps, but I wrote it out to make clear what is going on.
AGE<-c(25, 37, 57, 72, 48, 28, 31, 57, 43, 38)
LLS<-c(24, 1, 24, 24, 14, 21, 4, 12, 8, 1)
RLS<-c(11, 1, 14, 21, 7, 21, 22, 8, 27, 12)
dat <- data.frame(AGE, LLS, RLS)
Create a subset of your dataframe, including only the columns you want
dat2 <- dat[,2:3]
Turn the new dataframe into a matrix so rowMaxs() doesn't complain
dat3 <- as.matrix(dat2)
Load the matrixStats package and call rowMaxs()
library(matrixStats)
rowMaxs(dat3)
[1] 24 1 24 24 14 21 22 12 27 12
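If installing matrixStats is not an option, the same row-wise maximum can be sketched in base R with apply(). This is likely slower on large data, and the name row_max is only illustrative.

```r
AGE <- c(25, 37, 57, 72, 48, 28, 31, 57, 43, 38)
LLS <- c(24, 1, 24, 24, 14, 21, 4, 12, 8, 1)
RLS <- c(11, 1, 14, 21, 7, 21, 22, 8, 27, 12)
dat <- data.frame(AGE, LLS, RLS)

# MARGIN = 1 applies max() to each row of the selected columns
row_max <- apply(dat[, c("LLS", "RLS")], 1, max)
unname(row_max)
# [1] 24  1 24 24 14 21 22 12 27 12
```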