I am experiencing some strange behavior in R when trying to index a matrix with another matrix. I run into an error of subscript out of bounds with indexing with a 2 column matrix, but not with a four column matrix. See the following reproducible code. Any insight would be appreciated!
This
data <- matrix(rbinom(100, 1, .5), nrow = 10)
idx <- cbind(1:50, 51:100)
data[idx]
results in:
Error in data[idx] : subscript out of bounds
However
data[cbind(idx,idx)]
works.
My session info:
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin15.5.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)
The key insight as to why this is wrong isn't working is given in ?'[':
When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.
and it is clear when the subscript out of bounds error arises; data doesn't have 50 rows and 100 columns.
What's happening in the second example the indexing matrix is just being treated as a vector because it has more columns than the matrix being indexed has dimensions, and is extracting elements c(1:100, 1:100) from data.
This is more easily see with
m <- matrix(1:100, ncol = 10, byrow = TRUE)
and indexing with cbind(idx, idx) gives
> m[cbind(idx,idx)]
[1] 1 11 21 31 41 51 61 71 81 91 2 12 22 32 42 52 62 72
[19] 82 92 3 13 23 33 43 53 63 73 83 93 4 14 24 34 44 54
[37] 64 74 84 94 5 15 25 35 45 55 65 75 85 95 6 16 26 36
[55] 46 56 66 76 86 96 7 17 27 37 47 57 67 77 87 97 8 18
[73] 28 38 48 58 68 78 88 98 9 19 29 39 49 59 69 79 89 99
[91] 10 20 30 40 50 60 70 80 90 100 1 11 21 31 41 51 61 71
[109] 81 91 2 12 22 32 42 52 62 72 82 92 3 13 23 33 43 53
[127] 63 73 83 93 4 14 24 34 44 54 64 74 84 94 5 15 25 35
[145] 45 55 65 75 85 95 6 16 26 36 46 56 66 76 86 96 7 17
[163] 27 37 47 57 67 77 87 97 8 18 28 38 48 58 68 78 88 98
[181] 9 19 29 39 49 59 69 79 89 99 10 20 30 40 50 60 70 80
[199] 90 100
which is the same as
m[c(idx[,1], idx[,2], idx[,1], idx[,2])]
or specifically,
m[c(1:50, 51:100, 1:50, 51:100)]
Related
In Stata, when changing values to variables (or other related operations), the output includes a comment regarding the number of changes. E.g:
Is there a way to obtain similar commentary in RStudio?
For instance, sometimes I want to check how many changes a command made (partly to see if command worked, or to count the extent of a potential problem in the data). Currently, I have to inspect the data manually or do a pretty uninformative comparison using all(), for instance.
Base R doesn't do this, but you could write a function to do it, and then instead of saying
x <- y
you'd say
x <- showChanges(x, y)
For example,
library(waldo)
showChanges <- function(oldval, newval) {
print(compare(oldval, newval))
newval
}
set.seed(123)
x <- 1:100
x <- showChanges(x, x + rbinom(100, size = 1, prob = 0.01))
#> `old[21:27]`: 21 22 23 24 25 26 27
#> `new[21:27]`: 21 22 23 25 25 26 27
x
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#> [19] 19 20 21 22 23 25 25 26 27 28 29 30 31 32 33 34 35 36
#> [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
#> [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
#> [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
#> [91] 91 92 93 94 95 96 97 98 99 100
Created on 2021-10-21 by the reprex package (v2.0.0)
I want to assign some value to a vecter like:
a = rep(0, 101)
for(i in seq(0, 1, 0.01)){
u <- 100 * i + 1
a[u] <- u
}
a
plot(a)
The output is
> a
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 30 0 31 32 33 34
[35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 59 0 60 61 62 63 64 65 66 67 68
[69] 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101
There are problems on the 29th and the 59th elements. They should be 29 and 59, but it turns out to be 0, the default value. And the previous values, the 28th and 58th, are also incorrect. Why is this happening? Thank you!
There is a problem with your indexing. I don't know how to explain why it doesn't work as written, but here is a modification to your code that works:
a = rep(0, 101)
s<-seq(0, 1, 0.01)
for(i in 1:101){
a[i] <- 100 * s[i] + 1
}
a
plot(a)
In general it is best to avoid multiple indexes in the same loop as it can be confusing and difficult to diagnose problems.
This is a rather simple question: why is this code in R not printing numbers from 1 to 100, but jumps with the value of i? Is there a way to prevent this?
t <-5
for (i in 1:t){
print(20*(i-1)+1:20*i)
}
to get the question closed
t <-5
for (i in 1:t){
print(20*(i-1)+1:20)
}
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
#> [1] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
#> [1] 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
#> [1] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
#> [1] 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
This question already has answers here:
Get a seq() in R with alternating steps
(6 answers)
Closed 6 years ago.
I want to use R to create the sequence of numbers 1:8, 11:18, 21:28, etc. through 1000 (or the closest it can get, i.e. 998). Obviously typing that all out would be tedious, but since the sequence increases by one 7 times and then jumps by 3 I'm not sure what function I could use to achieve this.
I tried seq(1, 998, c(1,1,1,1,1,1,1,3)) but it does not give me the results I am looking for so I must be doing something wrong.
This is a perfect case of vectorisation( recycling too) in R. read about them
(1:100)[rep(c(TRUE,FALSE), c(8,2))]
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32
#[27] 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57 58 61 62 63 64
#[53] 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96
#[79] 97 98
rep(seq(0,990,by=10), each=8) + seq(1,8)
You want to exclude numbers that are 0 or 9 (mod 10). So you can try this too:
n <- 1000 # upper bound
x <- 1:n
x <- x[! (x %% 10) %in% c(0,9)] # filter out (0, 9) mod (10)
head(x,80)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27
# 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85
# 86 87 88 91 92 93 94 95 96 97 98
Or in a single line using Filter:
Filter(function(x) !((x %% 10) %in% c(0,9)), 1:100)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# [48] 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96 97 98
With a cycle: for(value in c(seq(1,991,10))){vector <- c(vector,seq(value,value+7))}
The getOption("max.print") can be used to limit the number of values that can be printed from a single function call. For example:
options(max.print=20)
print(cars)
prints only the first 10 rows of 2 columns. However, max.print doesn't work very well lists. Especially if they are nested deeply, the amount of lines printed to the console can still be infinite.
Is there any way to specify a harder cutoff of the amount that can be printed to the screen? For example by specifying the amount of lines after which the printing can be interrupted? Something that also protects against printing huge recursive objects?
Based in part on this question, I would suggest just building a wrapper for print that uses capture.output to regulate what is printed:
print2 <- function(x, nlines=10,...)
cat(head(capture.output(print(x,...)), nlines), sep="\n")
For example:
> print2(list(1:10000,1:10000))
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12
[13] 13 14 15 16 17 18 19 20 21 22 23 24
[25] 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48
[49] 49 50 51 52 53 54 55 56 57 58 59 60
[61] 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84
[85] 85 86 87 88 89 90 91 92 93 94 95 96
[97] 97 98 99 100 101 102 103 104 105 106 107 108