Perform 'cross product' of two vectors, but with addition - r

I am trying to use R to perform an operation (ideally with similarly displayed output) such as
> x<-1:6
> y<-1:6
> x%o%y
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 2 4 6 8 10 12
[3,] 3 6 9 12 15 18
[4,] 4 8 12 16 20 24
[5,] 5 10 15 20 25 30
[6,] 6 12 18 24 30 36
where each entry is found through addition not multiplication.
I would also be interested in creating the 36 ordered pairs (1,1) , (1,2), etc...
Furthermore, I want to use another vector like
z<-1:4
to create all the ordered triplets possible between x, y, and z.
I am using R to look into the likelihoods of the possible totals when rolling dice with varying numbers of sides.
Thank you for all your help! This site has been a big help to me. I appreciate anyone that takes the time to answer a stranger's question.
UPDATE: I found that `outer(x,y,'+')` does what I wanted first. But I still don't know how to create ordered pairs or ordered triplets.

Your first question is easily handled by outer:
outer(1:6,1:6,"+")
For the others, I suggest you try expand.grid, although there are specialized combination and permutation functions out there as well if you do a little searching.

expand.grid can answer your second question:
expand.grid(1:6,1:6)
expand.grid(1:6,1:6,1:4)
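Since the stated goal is the likelihood of each total, here is a minimal sketch of how the expand.grid result could be turned into a probability table. The column names x, y, z and the variable names rolls, totals and probs are chosen for illustration, and the sketch assumes every ordered triplet is equally likely (fair dice).

```r
x <- 1:6
y <- 1:6
z <- 1:4

# One row per ordered triplet: 6 * 6 * 4 = 144 rows
rolls <- expand.grid(x = x, y = y, z = z)

# Total shown by the three dice in each row
totals <- rowSums(rolls)

# Relative frequency of each total (assumes all rows are equally likely)
probs <- table(totals) / nrow(rolls)
probs
```

The same idea works for two dice with table(outer(x, y, "+")) / (length(x) * length(y)).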

Related

How to extract the values from a raster in R

I want to use R to extract values from a raster. Basically, my raster has values from 0-6 and I want to extract for every single pixel the corresponding value. So that I have at the end a data table containing those two variables.
Thank you for your help, I hope my explanations are precise enough.
Example data
library(raster)
r <- raster(ncol=5, nrow=5, vals=1:25)
To get all values, you can do
values(r)
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
as.matrix(r)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 2 3 4 5
#[2,] 6 7 8 9 10
#[3,] 11 12 13 14 15
#[4,] 16 17 18 19 20
#[5,] 21 22 23 24 25
Also see ?getValues
You can also use indexing
r[2,2]
#7
r[7:8]
#[1] 7 8
For more complex extractions using points, lines or polygons, see ?extract
x is the raster object you are trying to extract values from; y may be a SpatialPoints, SpatialPolygons, SpatialLines, Extent or a vector representing cell numbers (take a look at ?extract). Your code values_raster <- extract(x = values, df=TRUE) will not work because you're not feeding the function any y object/vector.
You could try to build a vector with all the cell numbers of your raster. Imagine your raster has 200 cells. If you do values_raster <- extract(x = values,y=seq(1,200,1), df=TRUE) you'll get a data frame with the values for each cell.
How about simply doing
as.data.frame(s, xy=TRUE) # s is your raster file
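To make the "one row per pixel" result concrete, here is a small sketch that builds the table by hand from cell numbers, coordinates and values. It assumes the raster package is installed (the guard simply skips the example otherwise); the variable names cells and cell_values are illustrative only.

```r
# Assumes the raster package is available; skipped otherwise
if (requireNamespace("raster", quietly = TRUE)) {
  r <- raster::raster(ncol = 5, nrow = 5, vals = 1:25)

  # Cell numbers run from 1 to ncell(r), row by row
  cells <- seq_len(raster::ncell(r))

  # One row per pixel: cell number, centre coordinates, and value
  cell_values <- data.frame(cell = cells,
                            raster::xyFromCell(r, cells),
                            value = raster::values(r))
  head(cell_values)
}
```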

R Pooled DataFrame analysis

I'm trying to perform several analyses on subsets of data in a dataframe in R, and I was wondering if there is a generic way of doing this.
Say, I have a dataframe like:
one two three four
[1,] 1 6 11 16
[2,] 2 7 12 17
[3,] 3 8 11 18
[4,] 4 9 11 19
[5,] 5 10 15 20
How could I apply some computation (e.g. cumulative counting) to the values in column "one", conditioned on (grouped by) the value in column "three"?
That is, I wanna do stuff to one column, based upon grouping in another column. I can do this with loops, but I feel there might be standard ways to do this all at once.
thank you in advance!
ddply(data, .(coln), Stat) from the plyr package does the trick exactly.
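For comparison, a base R sketch of the same kind of grouped computation using ave, which needs no extra package. The data frame mirrors the one in the question; the new column names count_in_group and cumsum_one are made up for illustration.

```r
df <- data.frame(one = 1:5, two = 6:10,
                 three = c(11, 12, 11, 11, 15), four = 16:20)

# Running count of rows within each value of "three"
df$count_in_group <- ave(seq_along(df$three), df$three, FUN = seq_along)

# Running sum of "one" within each value of "three"
df$cumsum_one <- ave(df$one, df$three, FUN = cumsum)

df
```

ave returns a vector aligned with the original rows, so the result can be attached directly as a new column without reordering.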

Set column names while calling a function

Suppose we have a numeric data.frame foo and want to find the sum of each pair of columns:
foo <- data.frame(x=1:5,y=4:8,z=10:14, w=8:4)
bar <- combn(colnames(foo), 2, function(x) foo[,x[1]] + foo[,x[2]])
bar
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 5 11 9 14 12 18
#[2,] 7 13 9 16 12 18
#[3,] 9 15 9 18 12 18
#[4,] 11 17 9 20 12 18
#[5,] 13 19 9 22 12 18
Everything is fine, except the column names that are missing from bar. I want column names of bar to show the related columns in foo, for instance in this example:
colnames(bar) <- apply(combn(colnames(foo),2), 2, paste0,collapse="")
colnames(bar)
#[1] "xy" "xz" "xw" "yz" "yw" "zw"
This is simple, but I want to perform the column labeling in the same bar <- combn(...) command. Is there any way?
It is possible, but it obfuscates your code. The tradeoff between brevity and clarity here is acute.
To understand how it works, I reference this question.
colnames(x) <- y
Is internally rewritten as
x <- `colnames<-`(x,y)
You can then do the translation yourself.
bar <- `colnames<-`(combn(colnames(foo), 2, function(x) foo[,x[1]] + foo[,x[2]]),
apply(combn(colnames(foo),2), 2, paste0,collapse=""))
In many cases, however, it's not worth the mental and syntactic gymnastics required to collapse lines of code in this way. Multiple lines tend to be clearer to follow.
You start with a data.frame, not a matrix. Not that important, but it helps to keep up with the jargon we usually use.
What you're after is not possible. If you look at the code of combn, when the result is simplified, it uses no dimension names.
}
if (simplify)
array(out, dim.use)
else out
}
You can either hack the function and make it add dimension names, or, you can add it manually to your result post festum.
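One way to keep the calling code to a single line without hacking combn is to wrap the two steps in a small helper. The sketch below is only an illustration: combn_named is a hypothetical name, not a function from base R or any package, and it assumes FUN takes a data.frame of the selected columns and returns one value per row.

```r
foo <- data.frame(x = 1:5, y = 4:8, z = 10:14, w = 8:4)

# combn_named is a hypothetical helper, not part of base R
combn_named <- function(df, m, FUN, sep = "") {
  # Apply FUN to every combination of m columns
  out <- combn(colnames(df), m, function(cols) FUN(df[, cols, drop = FALSE]))
  # Build one label per combination by pasting the column names together
  labels <- unlist(combn(colnames(df), m, paste, collapse = sep,
                         simplify = FALSE))
  colnames(out) <- labels
  out
}

bar <- combn_named(foo, 2, rowSums)
colnames(bar)
```

The labeling still happens after combn returns; the helper only hides that step from the caller.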

Filling a matrix with a circular pattern

I want to write a function that fills an m by m matrix, where m is odd, as follows:
1) It starts from the middle cell of the matrix (for example, for a 5 by 5 matrix A, the middle cell is A[3,3]) and puts the number 1 there.
2) It moves one cell forward, adds 1 to the previous cell's value, and puts the result in the second cell.
3) It then goes down and puts 3, left 4, left 5, up 6, up 7, ...
For example, the resulting matrix could look like this:
7 8 9
6 1 2
5 4 3
Could somebody help me implement this?
max_x=5
len=max_x^2
middle=ceiling(max_x/2)
A=matrix(NA,max_x,max_x)
increments=Reduce(
f=function(lhs,rhs) c(lhs,(-1)^(rhs/2+1)*rep(1,rhs)),
x=2*(1:(max_x)),
init=0
)[1:len]
idx_x=Reduce(
f=function(lhs,rhs) c(lhs,rep(c(TRUE,FALSE),each=rhs)),
1:max_x,
init=FALSE
)[1:len]
increments_x=increments
increments_y=increments
increments_x[!idx_x]=0
increments_y[idx_x]=0
A[(middle+cumsum(increments_x)-1)*(max_x)+middle+cumsum(increments_y)]=1:(max_x^2)
Gives
#> A
# [,1] [,2] [,3] [,4] [,5]
#[1,] 21 22 23 24 25
#[2,] 20 7 8 9 10
#[3,] 19 6 1 2 11
#[4,] 18 5 4 3 12
#[5,] 17 16 15 14 13
Explanation:
The vector increments denotes the steps along the path of the increasing numbers. It's either 0/+1/-1 for unchanged/increasing/decreasing row and column indices. Important here is that these numbers do not differentiate between steps along columns and rows. This is managed by the vector idx_x - it masks out increments that are either along a row (TRUE) or a column (FALSE).
The last line takes into account R's indexing logic (matrix index increases along columns).
Edit:
As per request of the OP, here is some more information about how the increments vector is calculated.
You always go two consecutive straight lines of equal length (row-wise or column-wise). The length, however, increases by 1 after you have walked twice. This corresponds to the x=2*(1:(max_x)) argument together with rep(1,rhs). The first two consecutive walks are in increasing column/row direction. Then follow two in negative direction and so on (alternating). This is accounted for by (-1)^(rhs/2+1).
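The walk described above (two runs of equal length, then the length grows by one, cycling through right, down, left, up) can also be written as an explicit loop. This is a plain alternative sketch, not the vectorised method from the answer; spiral_matrix is a hypothetical function name.

```r
# spiral_matrix is a hypothetical helper; m must be odd
spiral_matrix <- function(m) {
  stopifnot(m %% 2 == 1)
  A <- matrix(NA_integer_, m, m)

  # Row and column steps for the directions right, down, left, up
  d_row <- c(0, 1, 0, -1)
  d_col <- c(1, 0, -1, 0)

  i <- j <- (m + 1) / 2   # start in the middle cell
  k <- 1L                 # last number written
  A[i, j] <- k
  dir <- 1L               # index into d_row / d_col
  run <- 1L               # current run length

  while (k < m^2) {
    for (pass in 1:2) {   # each run length is walked twice
      for (s in seq_len(run)) {
        if (k >= m^2) break
        i <- i + d_row[dir]
        j <- j + d_col[dir]
        k <- k + 1L
        A[i, j] <- k
      }
      dir <- dir %% 4L + 1L   # turn to the next direction
    }
    run <- run + 1L
  }
  A
}

spiral_matrix(3)
```

The loop is slower than the cumsum-based indexing for large m, but each step of the path is visible, which may make the pattern easier to check by hand.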

Finding the index of the minimum value which is larger than a threshold in R

This is probably very simple, but I'm missing the correct syntax in order to simplify it.
Given a matrix, find the entry in one column which is the lowest value, greater than some input parameter. Then, return an entry in a different column on that corresponding row. Not very complicated... and I've found something that works but, a more efficient solution would be greatly appreciated.
I found this link: Better way to find a minimum value that fits a condition?
which is great.. but that method of finding the least entry loses the index information required to find a corresponding value in a corresponding row.
Let's say column 2 is the condition column, and column 1 is the one I want to return.... currently I've made this: (note that this only works because row two is full of numbers which are less than 1).
matrix[which.max((matrix[,2]>threshhold)/matrix[,2]),1]
Any thoughts? I'm expecting that there is probably some quick and easy function which has this effect... it's just never been introduced to me haha.
rmk's answer shows the basic way to get a lot of info out of your matrix. But if you know which column you're testing for the minimum value (above your threshold), and then want to return a different value in that row, maybe something like
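incol <- df[,4] # select the column to search
outcol <- 2 # select the element of the found row you want to get
threshold <- 5
above <- which(incol > threshold) # rows whose value exceeds the threshold
df[above[which.min(incol[above])], outcol] # row holding the smallest such value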
You could try the following. Say,
df <- matrix(sample(1:35,35),7,5)
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 18 16 27 19 31
[2,] 24 1 7 12 5
[3,] 28 35 23 4 6
[4,] 33 3 25 26 15
[5,] 14 10 11 21 20
[6,] 9 2 32 17 13
[7,] 30 8 29 22 34
Say your threshold is 5:
apply(df,2,function(x){ x[x<=5] <- max(x);which.min(x)})
[1] 6 7 2 2 3
Corresponding to the values:
[1] 9 8 7 12 6
This should give you the index of the smallest entry in each column greater than threshold according to the original column indexing.
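For the original two-column case (condition in column 2, value to return from column 1), here is a sketch that avoids the division trick from the question, so it also works when the condition column contains zeros or negative numbers. The function name value_at_min_above and the sample matrix are made up for illustration.

```r
# value_at_min_above is a hypothetical helper name, not a base R function
value_at_min_above <- function(m, threshold, cond_col = 2, out_col = 1) {
  above <- which(m[, cond_col] > threshold)      # rows that pass the threshold
  if (length(above) == 0) return(NA)             # nothing exceeds the threshold
  best <- above[which.min(m[above, cond_col])]   # row of the smallest such value
  m[best, out_col]
}

m <- cbind(c(10, 20, 30, 40), c(-1, 7, 3, 9))
value_at_min_above(m, threshold = 2)    # 30: column 2 is 3 in row 3
value_at_min_above(m, threshold = 100)  # NA: no entry is above 100
```

Indexing through which keeps the original row numbers, which is the piece of information the plain min approach loses.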
