Display order of objects in vector in R - r

Let's say I create a simple vector:
x <- seq(1, 50, by = 5)
Then I might want to display its contents to see which item is the 7th:
print(x)
[1] 1 6 11 16 21 26 31 36 41 46
Is there a simple way to display the contents such that each item is numbered?
[1] 1, [2] 6, [3] 11, etc.

One simple option is to column-bind a counter next to the original vector:
cbind(1:length(x), x)
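A minimal sketch of the same idea with labelled columns. It uses seq_along(x) in place of 1:length(x) because that also behaves sensibly for an empty vector; the names indexed, index and value are illustrative and not part of the original answer.

```r
x <- seq(1, 50, by = 5)

# Bind a position counter next to the values; column names are illustrative
indexed <- cbind(index = seq_along(x), value = x)

# Look up the 7th item by row
seventh <- indexed[7, "value"]
print(indexed)
print(seventh)
```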

An elegant way might be to use ?names.
x <- seq(1, 50, by = 5)
names(x) <- seq_along(x)
x
1 2 3 4 5 6 7 8 9 10
1 6 11 16 21 26 31 36 41 46

I assume you're asking how to modify print to include the index of every element of a vector x.
Here is one possibility:
x <- seq(1, 50, by = 5)
cat(sapply(seq_along(x), function(i) (sprintf("[%i] %i", i, x[i]))), "\n")
#[1] 1 [2] 6 [3] 11 [4] 16 [5] 21 [6] 26 [7] 31 [8] 36 [9] 41 [10] 46
Or you could define a custom my.print function that starts a new line after every nmax entries, which helps with long vectors:
my.print <- function(x, nmax = 6) {
  os <- 0  # offset: number of elements already printed
  while (length(x) > 0) {
    n <- min(length(x), nmax)  # size of this chunk (nmax, not a hard-coded 6)
    cat(sapply(seq_len(n), function(i)
      sprintf("[%i] %i", i + os, x[i])), "\n")
    x <- x[-seq_len(n)]
    os <- os + n
  }
}
my.print(x)
#[1] 1 [2] 6 [3] 11 [4] 16 [5] 21 [6] 26
#[7] 31 [8] 36 [9] 41 [10] 46
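Since sprintf is vectorised, the labels can also be built without sapply. This is only a sketch: label_vec is a hypothetical helper name, and it assumes whole-number values as in the example vector.

```r
# Hypothetical helper: build "[index] value" labels in one vectorised call
label_vec <- function(x) sprintf("[%d] %d", seq_along(x), x)

x <- seq(1, 50, by = 5)
labels <- label_vec(x)
cat(labels, "\n")
```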

Related

Making a "Race" Between Two Variables

I would like to make two variables ("a" and "b") that keep:
taking a random value that is ALWAYS less than their current value (i.e. a1 > a2 > a3 ... > an, b1 > b2 > b3 ... > bn ALWAYS)
until one of them is less than or equal to 0.
I showed a demo below:
#iteration 1
a1 = 100 - rnorm(1,5,10)
b1 = 100 -rnorm(1,5,10)
a2 = a1 - rnorm(1,5,10)
b2 = b1 -rnorm(1,5,10)
a3 = a2 - rnorm(1,5,10)
b3 = b2 -rnorm(1,5,10)
#etc.
I would then like to repeat this many times. In the end, this would look something like this:
Currently, I am doing this manually, and then using the bind_rows() command to "pile" each iteration on top of each other. Can someone please show me a faster way to do this?
Thank you!
You could write a small recursive function:
fun <- function(x){
if(any(x < 0)) x
else rbind(x, fun(x - abs(rnorm(length(x),5,10)) ))
}
Now for 1 draw of A and B:
set.seed(1)
fun(c(A=100, B=100))
A B
x 100.00000 100.000000
x 98.73546 93.163567
x 95.37918 72.210759
x 87.08410 69.006075
x 77.20981 56.622828
x 66.45199 54.676712
x 46.33418 45.778279
x 45.12178 28.631280
x 28.87247 24.080617
x 24.03437 9.642254
10.82216 -1.296759
We can wrap this in a function and replicate it. I will stay in base R, although this could be simplified with the tidyverse:
random_seq <- function(n, start){
fun <- function(x){
if(any(x < 0)) c(x)
else rbind(x, fun(x - abs(rnorm(length(x),5,10)) ))
}
R <-replicate(n, data.frame(fun(start), row.names = NULL), simplify = FALSE)
S <- do.call(rbind, Map(cbind, id = seq(R), R))
U <-transform(S, time = ave(id, id, FUN = seq_along))
reshape(U, dir='wide', idvar = 'id', sep='')
}
set.seed(1)
random_seq(4, c(A=20,B=20))
id A1 B1 A2 B2 A3 B3 A4 B4
1 1 20 20 18.7354619 13.163567 15.379176 -7.789241 NA NA
4 2 20 20 11.7049223 16.795316 1.830632 4.412069 -8.927182 2.465953
8 3 20 20 -0.1178117 11.101568 NA NA NA NA
10 4 20 20 18.7875942 2.853001 2.538285 -1.697663 NA NA
BONUS:
if interested, fun can directly reproduce the names:
fun <- function(x){
nms <- as.numeric(sub('\\D+', '',names(x))) + 1
names(x) <- paste0(sub("\\d+", '', names(x)), nms)
if(any(x < 0)) c(x)
else c(x, Recall(x - abs(rnorm(length(x),5,10)) ))
}
fun(c(A0=20, B0=30))
A1 B1 A2 B2 A3 B3
20.000000 30.000000 11.234808 23.323201 -9.611483 1.544311
Here's a function that runs a single start to 0, nicely configurable, and we can use replicate to run it as many times as needed, returning a list.
to_0 = function(start = 100, fun = runif, ..., n = 1000) {
if(start <= 0) stop("Must start greater than 0")
result = start - c(0, cumsum(fun(n, ...)))
if(all(result > 0)) stop("Didn't reach 0, set a higher n or check inputs.")
first_0 = match(TRUE, result <= 0)  # <= so that a run landing exactly on 0 also stops
result[seq_len(first_0)]
}
I used runif as the default instead of your rnorm because you say you want the series to be strictly decreasing, but rnorm is sometimes positive and sometimes negative so it will sometimes lead to increases.
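To illustrate the point about step signs, here is a small sketch. It assumes steps drawn with abs(rnorm()) as in the recursive answer above, and the names raw_steps, abs_steps and path are illustrative.

```r
set.seed(1)
raw_steps <- rnorm(20, 5, 10)   # can be negative, which would push the series up
abs_steps <- abs(raw_steps)     # non-negative steps only

# Subtracting the running total of non-negative steps can only move down
path <- 100 - cumsum(abs_steps)

any(raw_steps < 0)       # whether the raw draws contain negative steps
all(diff(path) < 0)      # whether the path is strictly decreasing
```

Any step distribution that only produces positive values (runif, rexp, abs(rnorm()), ...) gives the same guarantee.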
I cut off the series at the first negative value. Since the lengths of each run are different, a data.frame seems like a bad choice, keeping them in a list is better. We can use lengths() to see how long each vector in the list is.
The function is parametrized, so you can easily try out other distributions or custom functions, e.g., to_0(start = 100, fun = rexp, rate = 0.1). Below I demonstrate with the uniform distribution starting at 10.
set.seed(47)
race = replicate(n = 100, to_0(start = 10))
head(race)
# [[1]]
# [1] 10.00000000 9.02303800 8.64912196 7.88761993 7.06512831 6.49158390 5.80017147 5.41110962 4.94216364 4.39885390 3.47396185
# [12] 3.33516427 2.63317707 2.47098343 1.87167641 1.36564030 0.46366678 0.06316398 0.03221901 -0.03913915
#
# [[2]]
# [1] 10.00000000 9.27320918 8.54814801 7.77974923 7.34440424 7.27499236 6.76825217 6.75134855 6.20214287 5.43031741 4.56633348
# [12] 3.59288910 3.24547860 2.60269295 1.75639299 1.73279651 1.72371866 1.38211688 0.71933800 0.04916749 -0.40714758
#
# [[3]]
# [1] 10.00000000 9.08923490 9.06189460 8.69397353 8.30179409 8.11077841 7.96295850 7.49701585 6.52812608 6.26480567 5.34558158
# [12] 5.31801508 4.90573089 3.98774633 3.89046321 3.70358854 3.61482042 3.53824450 3.36900151 2.86522484 2.23295349 1.80544403
# [23] 0.82311022 0.73664857 -0.09385818
#
# [[4]]
# [1] 10.0000000 9.2172681 8.4175584 8.1672679 7.3683421 7.3373712 7.0319788 6.6512214 5.7210315 5.2732412 4.6817849 4.1065416
# [13] 3.9452541 3.4009742 2.5018050 1.5316136 0.7175295 0.4410275 -0.1859260
#
# [[5]]
# [1] 10.00000000 9.91914621 9.90238843 9.82993154 9.33156028 8.90827720 8.44160294 7.46348397 6.76539075 6.27298443 5.97401412
# [12] 5.03395592 4.55537992 3.75737919 2.82175869 2.75045000 2.70081885 2.67523320 2.20266408 2.12695183 1.25880525 0.57011279
# [23] 0.03173135 -0.79275633
#
# [[6]]
# [1] 10.0000000 9.9292630 9.6154147 9.0754730 8.7814754 8.5273701 7.6998567 6.8127609 5.9944598 5.6232599 5.1505038 4.8676191
# [13] 4.6337121 4.5868438 4.0435219 3.0981151 2.2621741 1.9925101 1.2104707 0.9334569 0.7574446 0.1643009 -0.5220925
lengths(race)
# [1] 20 21 25 19 24 23 21 24 23 22 25 24 19 19 23 17 19 23 25 21 24 25 18 22 24 25 19 19 23 22 19 26 20 23 24 24 22 21 25 23 21 28 19 20 16 20
# [47] 22 25 20 22 23 23 24 22 19 23 23 23 22 18 22 23 24 21 21 23 21 22 20 25 22 23 21 17 20 20 16 25 21 21 21 20 20 19 24 19 23 24 26 25 20 21
# [93] 23 17 27 18 30 24 21 23
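If a single table is still wanted, for example for plotting, one way to pile the runs on top of each other is a long data frame with a run id. This is a sketch with a small hand-made list standing in for race; the column names id, step and value are illustrative.

```r
# Stand-in for the list returned by replicate(); the values are made up
runs <- list(c(10, 6, 2, -1), c(10, 4, -3))

long <- data.frame(
  id    = rep(seq_along(runs), lengths(runs)),  # which run each row belongs to
  step  = sequence(lengths(runs)),              # position within the run
  value = unlist(runs)
)
print(long)
```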

Twin primes less than 87 in R

I am trying to list the twin primes less than 87. I'm using the Eratosthenes approach. Here is what I've worked on so far:
Eratosthenes <- function(n) {
# Return all prime numbers up to n (based on the sieve of Eratosthenes)
if (n >= 2) {
sieve <- seq(2, n) # initialize sieve
primes <- c() # initialize primes vector
for (i in seq(2, n)) {
if (any(sieve == i)) { # check if i is in the sieve
primes <- c(primes, i) # if so, add i to primes
sieve <- sieve[(sieve %% i) != 0] # remove multiples of i from sieve
}
}
return(primes)
} else {
stop("Input value of n should be at least 2.")
}
}
Era <- c(Eratosthenes(87))
i <- 2:86
for (i in Era){
if (Era[i]+2 == Era[i+1]){
print(c(Era[i], Era[i+1]))
}
}
The first thing I don't understand is this error:
Error in if (Era[i] + 2 == Era[i + 1]) { :
missing value where TRUE/FALSE needed
The second thing is that some twin primes are missing from the list, for example (29,31).
Within your for loop, i is no longer an index but an element of Era. In this case, you can use (i+2) %in% Era to check whether i+2 is the twin of i:
for (i in Era){
if ((i+2) %in% Era){
print(c(i,i+2))
}
}
which gives
[1] 3 5
[1] 5 7
[1] 11 13
[1] 17 19
[1] 29 31
[1] 41 43
[1] 59 61
[1] 71 73
A simpler way might be using diff, e.g.,
i <- Era[c(diff(Era)==2,FALSE)]
print(cbind(i,j = i+2))
which gives
> print(cbind(i,j = i+2))
i j
[1,] 3 5
[2,] 5 7
[3,] 11 13
[4,] 17 19
[5,] 29 31
[6,] 41 43
[7,] 59 61
[8,] 71 73
Firstly, (23,29) is not a twin prime pair.
Secondly, your answer may be found here.
Edit: I've tried your code and found that the length of Era is 23.
When if (Era[i] + 2 == Era[i+1]) runs with i = 23, the index i+1 reaches 24, Era[24] is NA, and that probably causes the error.
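A small sketch of that failure mode, using a shortened stand-in for Era. The names past_end and msg are illustrative, and the error is caught with tryCatch only so the example runs to the end.

```r
Era <- c(2, 3, 5, 7, 11, 13)   # stand-in for the full prime vector

# Indexing one past the end yields NA rather than an error
past_end <- Era[length(Era) + 1]

# Comparing against NA gives NA, and if() cannot branch on NA
msg <- tryCatch(
  if (Era[6] + 2 == past_end) "twin" else "not twin",
  error = function(e) conditionMessage(e)
)
print(past_end)
print(msg)
```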
for (i in Era) sets i to 2, then 3, then 5, and so on (the prime values themselves, not their positions), which is not what you intended. Loop over the indices instead with for (i in seq_len(length(Era) - 1)):
for (i in seq_len(length(Era) - 1)){
if (Era[i] + 2 == Era[i + 1]){
print(c(Era[i], Era[i + 1]))
}
}
#> [1] 3 5
#> [1] 5 7
#> [1] 11 13
#> [1] 17 19
#> [1] 29 31
#> [1] 41 43
#> [1] 59 61
#> [1] 71 73

Designing a function to output the smallest plurality winner possible

I'm trying to design a function/formula that takes two integers, for example 5 and 100. The first number could represent 5 ice cream flavours on a survey, and the second the 100 people sampled and asked for their favourite flavour.
I want the function/formula to produce the combination of counts where 1 of the 5 flavours wins by the smallest possible plurality (so, I'm guessing, in most cases by 1), with the winning count as small as possible given the number of flavours on the survey.
So with 5 ice cream flavours and 100 respondents, I would want R to produce a vector of (order not really important):
[1] 21 20 20 20 19
21 is the smallest possible count for a plurality winner with 100 respondents and 5 flavours. The function also needs to handle cases where the number of respondents is not evenly divisible by the number of choices.
Desired output
combinations_function <- function(x, y) {
?????
}
combinations_function(5, 100)
[1] 21 20 20 20 19
combinations_function(5, 38)
[1] 9 8 7 7 7
combinations_function(7, 48)
[1] 8 7 7 7 7 6 6
Think I got it:
smallest_margin <- function(choices, respondents)
{
values = rep(respondents %/% choices, choices)
remainder = respondents %% choices
while(remainder != 0)
{
values[which.min(values)] <- values[which.min(values)] + 1
remainder = remainder - 1
}
if(length(which(values == max(values))) > 1)
values[which(values == max(values))[1:2]] <-
values[which(values == max(values))[1:2]] + c(-1, 1)
return(sort(values))
}
smallest_margin(5, 100)
# [1] 19 20 20 20 21
smallest_margin(1, 100)
# [1] 100
smallest_margin(5, 99)
# [1] 19 19 20 20 21
smallest_margin(12, 758)
# [1] 63 63 63 63 63 63 63 63 63 63 63 65
Here is a code-golfian approach
f <- function(x,y)
rep(y%/%x, x) + ifelse(rep(y%%x>0, x),
c(1^(1:(y%%x)), 0*((y%%x+1):x)), 0) + c((-1)^(0:1), 0*(3:x))
Example
f(5, 100)
# [1] 21 19 20 20 20
f(5, 38)
# [1] 9 7 8 7 7
f(5, 48)
# [1] 11 9 10 9 9
f(7, 48)
# [1] 8 6 7 7 7 7 6
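To sanity-check the output of either answer, a small checker can confirm that the counts add up to the number of respondents and that there is exactly one top count. This is a sketch; is_plurality is a hypothetical helper name, and it does not test whether the winning count is the smallest possible.

```r
# Hypothetical checker: TRUE when the counts sum to `respondents`
# and exactly one entry holds the maximum (a unique plurality winner)
is_plurality <- function(v, respondents) {
  sum(v) == respondents && sum(v == max(v)) == 1
}

ok_example  <- is_plurality(c(21, 20, 20, 20, 19), 100)  # unique winner
tie_example <- is_plurality(c(20, 20, 20, 20, 20), 100)  # five-way tie
print(c(ok_example, tie_example))
```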

Storing the output from a loop as a list in R

I am running a small loop to randomly assign a list of numbers (1 to 30) to a subset of 4 groups. I would like to store the outputs of the loop (for 4 subsets) as a single line in one variable and use the results elsewhere. I am also getting some warnings, though the output is correctly displayed on the screen.
list = as.vector(c(6, 9, 3, 12))
start <- 1
end <- 6
i <- 1
while(i<=list){
print(sample(start:end, replace=T))
start <- start+list[i]
end <- end + list[i+1]
i <- i+1
}
[1] 3 5 6 1 5 6
[1] 9 13 12 7 11 12 14 11 14
[1] 16 17 17
[1] 28 22 26 21 28 26 22 28 26 30 21 19
Error in start:end : NA/NaN argument
In addition: Warning messages:
1: In while (i <= list) { :
the condition has length > 1 and only the first element will be used
2: In while (i <= list) { :
the condition has length > 1 and only the first element will be used
3: In while (i <= list) { :
the condition has length > 1 and only the first element will be used
4: In while (i <= list) { :
the condition has length > 1 and only the first element will be used
5: In while (i <= list) { :
the condition has length > 1 and only the first element will be used
I am unable to find the reasons for this error. Please help. Thanks.
This works with a for loop instead of a while loop. Iterating over seq(list) also removes the need to increment i by hand:
list = c(6, 9, 3, 12)
start <- 1
end <- 6
for(i in seq(list)){
if(i <= list[i]){
start <- start+list[i]
end <- end + (list[i]+1)
print(sample(start:end, replace=T))
}
}
[1] 10 8 11 7 11 10 12
[1] 23 17 18 21 22 18 20 21
[1] 25 21 27 23 26 26 23 25 22
[1] 33 32 37 37 35 40 32 37 34 38
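The warnings in the question come from while(i <= list), which compares i against the whole vector, so only the first element (6) is used; the loop then runs past the fourth group and list[i+1] becomes NA. (In recent versions of R, a condition of length greater than one is an error rather than a warning.) Below is a sketch of one way to avoid the explicit loop and keep the results in a list. It assumes the groups are meant to be the consecutive ranges 1:6, 7:15, 16:18 and 19:30, as the output in the question suggests; the names sizes, starts, ends and groups are illustrative.

```r
set.seed(42)
sizes  <- c(6, 9, 3, 12)
ends   <- cumsum(sizes)        # 6 15 18 30
starts <- ends - sizes + 1     # 1  7 16 19

# One sample per group, stored as a list that can be reused later.
# Note: sample(s:e) assumes each range has more than one value.
groups <- Map(function(s, e) sample(s:e, replace = TRUE), starts, ends)
print(groups)
```

Individual groups can then be retrieved with groups[[2]], and unlist(groups) gives everything as a single vector.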

replacing specific elements of a vector

I am trying to write the user-defined function below in R:
wrkexpcode.into.month <- function(vec) {
tmp.vec <- vec
tmp.vec[tmp.vec == 0 | tmp.vec == 9] <- NA
tmp.vec[tmp.vec == 1] <- 4
tmp.vec[tmp.vec == 2] <- 13
tmp.vec[tmp.vec == 3] <- 31
tmp.vec[tmp.vec == 4] <- 78
tmp.vec[tmp.vec == 5] <- 174
tmp.vec[tmp.vec == 6] <- 240
return (tmp.vec)
}
but when I execute with a simple command like
wrkexpcode.into.month(c(3,2,2,3,1,3,5,6,4))
the result comes like
[1] 31 13 13 31 78 31 174 240 78
but I expect the result like
[1] 31 13 13 31 **4** 31 174 240 78
How can I fix this?
Follow the flow of your function carefully and evaluate the values at each step. You expect 1 to be replaced by 4 by tmp.vec[tmp.vec == 1] <- 4, but the later line tmp.vec[tmp.vec == 4] <- 78 then replaces that new 4 with 78. The cause is that the replacements modify tmp.vec while tmp.vec is also used to decide what gets replaced. As @MattewPlourde said, you need to base the replacement on vec:
tmp.vec[vec == 1] <- 4
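A two-element sketch of the difference, using only the codes 1 and 4. The names chained and fixed are illustrative.

```r
vec <- c(1, 4)

# Conditions tested on the vector being modified: the new 4 is hit again
chained <- vec
chained[chained == 1] <- 4
chained[chained == 4] <- 78

# Conditions tested on the untouched input: each code is translated once
fixed <- vec
fixed[vec == 1] <- 4
fixed[vec == 4] <- 78

print(chained)
print(fixed)
```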
Although I would simply replace the code by:
wrkexpcode.into.month <- function(vec) {
translation_vector = c('0' = NA, '1' = 4, '2' = 13, '3' = 31,
'4' = 78, '5' = 174, '6' = 240, '9' = NA)
return(translation_vector[as.character(vec)])
}
wrkexpcode.into.month(c(3,2,2,3,1,3,5,6,4))
# 3 2 2 3 1 3 5 6 4
# 31 13 13 31 4 31 174 240 78
See also a blogpost I wrote recently about this kind of operation.
I think it will be much easier to use one of the many recode functions designed for this purpose instead of hard-coding it. It's then just a one-liner, e.g.
library(likert)
x <- c(3,2,2,3,1,3,5,6,4)
recode(x, from=c(0:6, 9), to=c(NA, 4,13,31,78,174,240,NA))
[1] 31 13 13 31 4 31 174 240 78
And if desired, wrap it into a function, e.g.
wrkexpcode.into.month <- function(x)
recode(x, from=c(0:6, 9), to=c(NA, 4,13,31,78,174,240,NA))
wrkexpcode.into.month(x)
[1] 31 13 13 31 4 31 174 240 78
You could create a matrix that maps each input value (column 1) to the desired output value (column 2):
table=matrix(c(0,1,2,3,4,5,6,9,NA,4,13,31,78,174,240,NA),ncol=2)
Then use sapply on the vector c(3,2,2,3,1,3,5,6,4):
sapply(c(3,2,2,3,1,3,5,6,4), function(x) table[which(table[,1] == x),2] )
which gives the desired output too.
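The same lookup can be written without sapply by using match from base R, which returns the position of each input value in the from vector. This is a sketch; the names from, to and months are illustrative.

```r
from <- c(0:6, 9)
to   <- c(NA, 4, 13, 31, 78, 174, 240, NA)
x    <- c(3, 2, 2, 3, 1, 3, 5, 6, 4)

# match() finds where each code sits in `from`; index `to` at those positions
months <- to[match(x, from)]
print(months)
```

Codes that do not appear in from come back as NA, because match returns NA for them.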
