What does this R expression do?

sp_full_in is a matrix:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1 0 1 1 1 1 2 2 2 1 1 1 1 1 2 1 1 1 1 1 1 2
2 1 0 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1
3 2 2 0 2 2 2 2 2 2 1 1 2 2 2 1 2 1 1 1 2 1
4 1 2 1 0 2 2 2 1 2 1 1 1 2 2 1 2 1 1 2 2 1
5 2 2 2 2 0 2 2 2 2 1 1 2 1 2 1 2 1 1 1 2 2
6 2 1 1 1 1 0 1 1 1 2 2 2 2 2 1 2 1 2 2 1 1
7 2 1 1 2 1 1 0 1 1 2 1 1 2 1 1 2 1 1 1 2 1
8 1 2 1 1 1 2 2 0 1 1 1 2 2 2 1 2 1 1 2 1 1
9 2 2 1 2 1 1 2 2 0 1 1 2 1 2 1 2 1 1 2 2 2
10 2 2 1 1 1 2 2 1 1 0 2 2 2 2 1 1 1 1 1 2 2
11 2 2 1 1 1 2 1 1 1 1 0 2 1 2 1 2 1 1 1 1 2
12 1 2 1 1 2 1 1 2 1 1 1 0 2 2 1 2 1 2 1 1 1
13 2 2 2 2 1 3 2 2 2 1 1 3 0 2 1 2 2 1 2 2 2
14 2 2 1 2 1 2 1 2 1 2 2 2 1 0 1 2 1 1 1 1 1
15 2 2 2 2 2 2 2 2 2 1 1 2 2 1 0 2 1 1 1 1 2
16 1 2 2 1 1 2 2 2 1 1 2 2 2 2 1 0 1 1 2 1 2
17 2 2 1 1 1 1 1 2 1 1 1 1 2 2 1 2 0 2 2 1 1
18 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 2 0 1 1 1
19 2 2 1 2 1 2 2 2 2 1 1 2 2 2 1 2 1 1 0 2 2
20 2 2 1 1 1 2 2 2 2 1 2 2 2 2 1 2 1 1 1 0 1
21 1 1 1 1 1 1 1 1 1 2 2 1 2 1 1 2 1 1 2 1 0
mean(sp_full_in[which(sp_full_in != Inf)])
produces the result [1] 1.38322
I'm not quite sure I understand what this does, but the way I read it is: for every cell in sp_full_in, check if it is not infinite; if so, return the output 1; then average all the outputs. Is that correct? If not, how should it be read?

which(sp_full_in != Inf) returns a vector of integer indices (only one of which is 1). That vector is then handed to "[" as indices into sp_full_in, which returns the selected values of sp_full_in as a vector that is passed to the mean function.
It is a good idea to learn to read R expressions from the "inside out". Find the innermost function call and mentally evaluate it, in this case sp_full_in != Inf. That returns a logical matrix (here all TRUE, since the matrix contains no Inf) that gets passed to which(), and since no arr.ind argument is given, it returns an atomic vector of indices.
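The inside-out reading can be sketched step by step on a small matrix. The matrix m and the step names below are made up for illustration; they are not from the question.

```r
# Hypothetical 2x2 matrix containing one infinite entry (filled column by column)
m <- matrix(c(0, 1, Inf, 2), nrow = 2)

step1 <- m != Inf      # innermost call: logical matrix, FALSE only where m is Inf
step2 <- which(step1)  # integer positions of the TRUE cells, in column-major order
step3 <- m[step2]      # the values at those positions, as a plain vector
result <- mean(step3)  # average of the non-infinite values

step2   # 1 2 4
step3   # 0 1 2
result  # 1
```

Each intermediate object corresponds to one layer of mean(m[which(m != Inf)]).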

The other answers are good at explaining why you get the mean of all the finite entries in the matrix, but it's worth noting that in this case the which does nothing. I used to have the bad habit of over-using which as well.
> a <- matrix(rnorm(4), nrow = 2)
> a
[,1] [,2]
[1,] 0.5049551 -0.7844590
[2,] -1.7170087 -0.8509076
> a[which(a != Inf)]
[1] 0.5049551 -1.7170087 -0.7844590 -0.8509076
> a[a != Inf]
[1] 0.5049551 -1.7170087 -0.7844590 -0.8509076
## Similarly if there was an infinite value
> a[1] <- Inf
> a
[,1] [,2]
[1,] Inf -0.7844590
[2,] -1.717009 -0.8509076
> a[which(a != Inf)]
[1] -1.7170087 -0.7844590 -0.8509076
> a[a != Inf]
[1] -1.7170087 -0.7844590 -0.8509076
And, while we're at it, we should also mention the function is.finite which is often preferable to != Inf. is.finite will return FALSE on Inf, -Inf, NA and NaN.
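A small sketch of how != Inf and is.finite differ, using a made-up vector v. It also shows one case where which is not a no-op: when the comparison yields NA, which drops those positions, whereas plain logical indexing returns NA entries.

```r
# Hypothetical vector mixing finite, infinite, and missing values
v <- c(1, Inf, -Inf, NA, NaN, 4)

v != Inf       # TRUE FALSE TRUE NA NA TRUE: -Inf passes, NA/NaN give NA
is.finite(v)   # TRUE FALSE FALSE FALSE FALSE TRUE

with_which    <- v[which(v != Inf)]  # 1 -Inf 4      (NA comparisons dropped)
without_which <- v[v != Inf]         # 1 -Inf NA NA 4 (NA comparisons kept as NA)
finite_only   <- v[is.finite(v)]     # 1 4

mean(finite_only)  # 2.5
```

So is.finite is the safer filter before mean when the data may contain -Inf, NA, or NaN.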

No, but you are close. The comparison sp_full_in != Inf checks every cell of the matrix against the condition (here, not equal to Inf), and which returns the indices of all cells satisfying it. The code then extracts the values of the cells at those indices and finally calculates their mean.

Related

How to find the streaks of a particular value in R?

The rle() function returns a list with values and lengths. I have not found a way to subset the output to isolate the streaks of a particular value that does not involve calling rle() twice, or saving the output into an object to later subset (an added step).
For instance, for runs of heads (1's) in a series of fair coin tosses:
s <- sample(c(0, 1), 100, replace = TRUE)
rle(s)
Run Length Encoding
lengths: int [1:55] 1 2 1 2 1 2 1 2 2 1 ...
values : num [1:55] 0 1 0 1 0 1 0 1 0 1 ...
# Double-call:
rle(s)[[1]][rle(s)[[2]]==1]
[1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2
# Adding an intermediate step:
> r <- rle(s)
> r$lengths[r$values==1]
[1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2
I see that a very easy way of getting the streak lengths just for 1 is to simply tweak the rle() code (answer), but there may be an even simpler way.
In base R:
with(rle(s), lengths[values==1])
[1] 1 3 2 2 1 1 1 3 2 1 1 3 1 1 1 1 1 2 3 1 2 1 3 3 1 2 1 1 2
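For a sequence of outcomes s, when you are interested solely in the lengths of the streaks of outcome oc:
sk <- function(s, oc) {
  n <- length(s)
  y <- s[-1L] != s[-n]        # TRUE where consecutive values differ
  i <- c(which(y), n)         # index of the last element of each run
  diff(c(0L, i))[s[i] == oc]  # run lengths, keeping only the runs of oc
}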
So to get the lengths for 1:
sk(s,1)
[1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2
and likewise for 0:
sk(s,0)
[1] 1 1 1 1 2 2 2 2 4 1 1 2 1 1 1 1 1 1 3 1 1 2 6 2 1 1 4 4
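For a reproducible check, here is a minimal sketch on a short fixed sequence (made up for illustration, so the output does not depend on the random sample). It calls rle() once and uses split() to get the streak lengths for every value at the same time.

```r
# Hypothetical fixed sequence: runs are 1,1 | 0 | 1 | 0,0,0 | 1,1,1
s <- c(1, 1, 0, 1, 0, 0, 0, 1, 1, 1)

r <- rle(s)                            # single call to rle()
streaks <- split(r$lengths, r$values)  # list of run lengths, keyed by value

streaks[["1"]]                          # 2 1 3
streaks[["0"]]                          # 1 3
with(rle(s), lengths[values == 1])      # 2 1 3, same as streaks[["1"]]
```

split() is convenient when the streaks of several outcomes are needed; for a single outcome the with() one-liner is shorter.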

When I run complete(), I am getting an error: Error in (function (classes, fdef, mtable)

I have a dataset which looks like this (A-J are column names):
A B C D E F G H I J
1 2 2 3 2 1 1 1 1
2 1 1 1 1 1 1 1 1 1
2 1 2 2 2 2 2 2 1 1
2 1 2 1 1 1 1 1 1 1
2 1 3 3 3 2 2 2 2
2 1 3 2 2 3 1 1 1 1
1 3 2 1 2 2 2 1 2
2 1 2 2 2 2 2 2 1 1
1 2 2 2 2 1 1 1 1
2 1 2 1 1 1 2 1 1 1
2 1 1 1 1 1 2 2 1 1
2 1 2 1 1 1 1 1 1 2
2 1 1 1 1 1 1 1
2 1 3 3 3 3 1 1 1 2
1 2 2 1 2 1 1 1 1
1 2 2 2 2 2 2 1 1
2 2 4 1 1 1 2 2 1 1
1 1 3 3 3 3
2 1 3 3 1 2 2 2 2 3
I am getting the error below:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘complete’ for signature ‘"mids", "numeric"’
My data has a lot of missing (NULL) values and I am trying to impute them using the code below:
imp_data<-mice(data = data_NA, m = 5, method = "rf", maxit = 5, seed = 500)
I get the error when I run:
complete(imp_data,1)
Please suggest where I am going wrong.
It seems that NA values are not properly assigned in the data_NA data.frame, which is causing the problem.
With the modified data (with NA, listed under Data at the end), imputing with mice worked for me:
library(mice)
imp_data <- mice(data = data_NA, m = 5, method = "rf", maxit = 5, seed = 500)
complete(imp_data, 1)
EDITED: The error seen by the OP was resolved by changing the call to:
mice::complete(imp_data, 1)
Maybe mice::complete was masked by a function of the same name from another package.
#Result
# A B C D E F G H I J
# 1 1 2 2 3 2 1 1 1 1 2
# 2 2 1 1 1 1 1 1 1 1 1
# 3 2 1 2 2 2 2 2 2 1 1
# 4 2 1 2 1 1 1 1 1 1 1
# 5 2 1 1 3 3 3 2 2 2 2
# 6 2 1 3 2 2 3 1 1 1 1
# 7 1 3 2 1 2 2 2 1 2 1
# 8 2 1 2 2 2 2 2 2 1 1
# 9 1 2 2 2 2 1 1 1 1 1
# 10 2 1 2 1 1 1 2 1 1 1
# 11 2 1 1 1 1 1 2 2 1 1
# 12 2 1 2 1 1 1 1 1 1 2
# 13 2 1 1 1 1 1 1 1 1 1
# 14 2 1 3 3 3 3 1 1 1 2
# 15 1 2 2 1 2 1 1 1 1 1
# 16 1 2 2 2 2 2 2 1 1 1
# 17 2 2 4 1 1 1 2 2 1 1
# 18 1 1 3 3 3 3 2 1 2 1
# 19 2 1 3 3 1 2 2 2 2 3
#
Data
data_NA<- read.table(text =
"A B C D E F G H I J
1 2 2 3 2 1 1 1 1 NA
2 1 1 1 1 1 1 1 1 1
2 1 2 2 2 2 2 2 1 1
2 1 2 1 1 1 1 1 1 1
2 1 NA 3 3 3 2 2 2 2
2 1 3 2 2 3 1 1 1 1
1 3 2 1 2 2 2 1 2 NA
2 1 2 2 2 2 2 2 1 1
1 2 2 2 2 1 1 1 1 NA
2 1 2 1 1 1 2 1 1 1
2 1 1 1 1 1 2 2 1 1
2 1 2 1 1 1 1 1 1 2
2 1 1 1 NA NA 1 1 1 1
2 1 3 3 3 3 1 1 1 2
1 2 2 1 2 1 1 1 1 NA
1 2 2 2 2 2 2 1 1 NA
2 2 4 1 1 1 2 2 1 1
1 1 3 3 3 3 NA NA NA NA
2 1 3 3 1 2 2 2 2 3",header = TRUE)
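The masking explanation given in the EDITED note can be illustrated without mice installed. This is only a sketch of the mechanism: a user-defined sd stands in for whatever function masked complete in the OP's session, which the thread does not identify.

```r
# Hypothetical stand-in: a user-defined `sd` that masks stats::sd
sd <- function(x) "masked version"

masked_result    <- sd(c(1, 2, 3))         # resolves to the user-defined function
qualified_result <- stats::sd(c(1, 2, 3))  # namespace-qualified call bypasses the mask

find("sd")  # lists each attached location that defines `sd`, in search order

rm(sd)      # remove the stand-in so stats::sd is found again
```

In the same way, find("complete") should show which attached packages define a complete function, and mice::complete(imp_data, 1) picks the intended one regardless of load order.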

Recoding Number to String R

I am new to R and I am trying to recode a numeric variable, which takes the values 1, 2, 3, to a string. I have seen how to do it, but I do not know why mine is not working. Maybe it is because it should be from string to number?
This is what I got, and thanks in advance!
cars$origin = as.factor(cars$origin)
cars$origin
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 3 2 2 2 2 2 1 1 1 1 1 3 1 3 1 1
[35] 1 1 1 1 1 1 1 1 2 2 2 3 3 2 1 3 1 2 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1
[69] 2 2 3 3 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 3 1 2 1 3 1 1 1
Levels: 1 2 3
cars$origin <- recode(cars$origin, "1='american';2='european';3='japan'")
Error: Argument 2 must be named, not unnamed
The factor function has a labels argument for that:
cars$origin = factor(cars$origin,
                     levels = c(1, 2, 3),
                     labels = c("american", "european", "japan"))
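A self-contained sketch of the same idea on a short made-up vector (the cars data is not available here). The lookup-vector alternative is an addition for comparison, not part of the answer. As an aside, and this is an assumption rather than something the thread confirms, the error text in the question resembles what dplyr::recode reports when it receives the car::recode-style string, which would suggest the two functions were mixed up.

```r
# Hypothetical numeric codes standing in for cars$origin
origin <- c(1, 1, 3, 2, 1)

# Map codes to labels with factor(): levels and labels pair up by position
origin_f <- factor(origin,
                   levels = c(1, 2, 3),
                   labels = c("american", "european", "japan"))
as.character(origin_f)  # "american" "american" "japan" "european" "american"

# Alternative for plain character output: index a lookup vector by the code
lookup <- c("american", "european", "japan")
origin_chr <- lookup[origin]
```

The factor version keeps the level order, which matters for plots and models; the lookup version returns a plain character vector.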

number of occurrences by lines R

I have this array:
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1
[38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[75] 1 1 2 1 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2
[112] 2 1 1 2 2 2 2 2 2 1 2 1 1 2 1 1 2 1 1 2 1 1 2 2 1 2 2 2 2 1 2 2 2 1 2 2 2
And I want to count the number of occurrences of '1' and '2', from [1] to [70] and from [71] to the end.
I tried:
sum(x==1)
But this counts over the whole vector. How can I select a range of elements?
The function sum {base} returns the sum of all the values present in its arguments.
You could define the arguments the following way:
With x[a:b] you can set boundaries (for example, a=1 and b=10 selects the elements from [1] to [10]).
With the operator == you can check which elements between your boundaries equal a specific value c, e.g.: x[a:b]==c
If you want to count more than one value (for example c and d, where c is 1 and d is 2), you can use a simple addition to sum up your results:
sum(x[a:b]==c) + sum(x[a:b]==d)
Here a and b are your boundaries and c and d are the values you want to compare against.
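A minimal sketch of counting per range, on a short made-up vector with a split after position 5 (the question would use 70 instead). The use of table() is an addition that counts every distinct value in one call.

```r
# Hypothetical short vector; `cut` plays the role of 70 in the question
x <- c(1, 1, 2, 1, 2, 2, 1, 2, 2, 2)
cut <- 5

first <- x[1:cut]                # elements [1] to [5]
rest  <- x[(cut + 1):length(x)]  # elements [6] to the end

ones_first <- sum(first == 1)    # 3
twos_first <- sum(first == 2)    # 2
ones_rest  <- sum(rest == 1)     # 1
twos_rest  <- sum(rest == 2)     # 4

table(rest)                      # counts of each value in the second range
```

Note the parentheses in (cut + 1):length(x); without them, cut + 1:length(x) would add cut to every index.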

Off-diagonal and Diagonal symmetry check, Getting off-diagonal and diagonal element(s) without repetition of a Matrix

Suppose I have this matrix
8 3 1 1 2 2 1 1 1 1 1 1 2 2 1 1 3
3 8 3 1 1 2 2 1 1 1 1 1 1 2 2 1 1
1 3 8 3 1 1 2 2 1 1 1 1 1 1 2 2 1
1 1 3 8 3 1 1 2 2 1 1 1 1 1 1 2 2
2 1 1 3 8 3 1 1 2 2 1 1 1 1 1 1 2
2 2 1 1 3 8 3 1 1 2 2 1 1 1 1 1 1
1 2 2 1 1 3 8 3 1 1 2 2 1 1 1 1 1
1 1 2 2 1 1 3 8 3 1 1 2 2 1 1 1 1
1 1 1 2 2 1 1 3 8 3 1 1 2 2 1 1 1
1 1 1 1 2 2 1 1 3 8 3 1 1 2 2 1 1
1 1 1 1 1 2 2 1 1 3 8 3 1 1 2 2 1
1 1 1 1 1 1 2 2 1 1 3 8 3 1 1 2 2
2 1 1 1 1 1 1 2 2 1 1 3 8 3 1 1 2
2 2 1 1 1 1 1 1 2 2 1 1 3 8 3 1 1
1 2 2 1 1 1 1 1 1 2 2 1 1 3 8 3 1
1 1 2 2 1 1 1 1 1 1 2 2 1 1 3 8 3
3 1 1 2 2 1 1 1 1 1 1 2 2 1 1 3 8
I want to check:
Are the off-diagonals symmetric? (In the above matrix, they are.)
Which elements occur in the off-diagonals (without repetition)? In the above matrix, these elements are 1, 2, 3.
Are the elements in the diagonal symmetric? If yes, print the element (like 8 in the above matrix).
# 1
all(mat == t(mat))
[1] TRUE
# 2
unique(mat[upper.tri(mat) | lower.tri(mat)])
[1] 3 1 2
# 3
if(length(unique(diag(mat))) == 1) print(diag(mat)[1])
[1] 8
mat <- as.matrix(read.table('abbas.txt'))
isSymmetric(unname(mat))
# From the isSymmetric help page: "Note that a matrix is only symmetric if its 'rownames' and 'colnames' are identical."
unique(mat[lower.tri(mat)])
all(diag(mat) == rev(diag(mat)))
# I assume you mean the diagonal is symmetric when its reverse is the same with itself.
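The three checks can be tried on a small self-contained matrix. The 3x3 matrix below is made up for illustration and only mimics the pattern of the one in the question.

```r
# Hypothetical 3x3 matrix with a constant diagonal and symmetric off-diagonals
mat <- matrix(c(8, 3, 1,
                3, 8, 3,
                1, 3, 8), nrow = 3, byrow = TRUE)

# 1. Symmetry of the whole matrix (and therefore of the off-diagonals)
is_sym <- isSymmetric(mat)                             # TRUE

# 2. Distinct off-diagonal elements: keep cells where row index != column index
off_diag <- sort(unique(mat[row(mat) != col(mat)]))    # 1 3

# 3. Constant diagonal: print the element if all diagonal entries agree
diag_constant <- length(unique(diag(mat))) == 1        # TRUE
if (diag_constant) print(diag(mat)[1])                 # 8
```

row(mat) != col(mat) selects the same cells as upper.tri(mat) | lower.tri(mat) in the first answer.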

Resources