How do I check whether a numeric value appears n times consecutively across columns A0 to A37, and return the value of n, in SAS?

I have data that looks like this
I need something like this
Thanks in advance

It is not entirely clear what you want to do, but when you need to look across variables in a data set, arrays are your best friend. After reviewing the documentation, look for one of the many good papers on how to use arrays.
Update your question for a more specific response.

Please type out your data in the future, ideally as shown below. This shows how you can capture a streak.
data have;
input Name A0-A10 MAX;
cards;
1 10 9 12 12 12 12 12 11 12 12 9 12
2 45 67 23 67 9 45 67 67 67 67 67 67
3 12 14 1 16 14 16 16 16 16 16 16 16
4 14 56 7 68 94 35 67 342 252 253 35 342
;
run;
data want;
set have;
*declare array to loop;
array A(0:10) A0-A10;
*use a counter to keep track of consecutive values;
counter=0;
*use a variable to keep track of the largest streak;
max_streak=0;
*loop start;
do i=0 to 10;
*if value matches maximum increase the counter;
if A(i)=max then
counter+1;
*if streak is larger than the largest streak replace the largest streak;
if counter>max_streak then
max_streak=counter;
*set streak back to 0 if value does not match maximum;
if A(i) ne max then
counter=0;
end;
run;

data have;
input Name A0-A10 MAX;
datalines;
1 10 9 12 12 12 12 12 11 12 12 9 12
2 45 67 23 67 9 45 67 67 67 67 67 67
3 12 14 1 16 14 16 16 16 16 16 16 16
4 14 56 7 68 94 35 67 342 252 253 35 342
;
data want(drop = c);
set have;
array a a:;
i = 0; c = 0;
do over a;
if a = max then c + 1;
if c > i then i = c;
if a ne max then c = 0;
end;
run;
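The counter logic in both data steps is not specific to SAS. As a minimal sketch in Python (the function name longest_streak and the list-based rows are assumptions made for illustration, not part of the original answers), the same idea looks like this:

```python
def longest_streak(values, target):
    """Return the length of the longest run of consecutive entries equal to target."""
    counter = 0      # length of the current run
    max_streak = 0   # longest run seen so far
    for value in values:
        if value == target:
            counter += 1
            max_streak = max(max_streak, counter)
        else:
            counter = 0  # run broken, start counting again
    return max_streak


# Rows from the example data: A0-A10 followed by MAX.
rows = [
    [10, 9, 12, 12, 12, 12, 12, 11, 12, 12, 9, 12],
    [45, 67, 23, 67, 9, 45, 67, 67, 67, 67, 67, 67],
    [12, 14, 1, 16, 14, 16, 16, 16, 16, 16, 16, 16],
    [14, 56, 7, 68, 94, 35, 67, 342, 252, 253, 35, 342],
]

streaks = []
for row in rows:
    *values, target = row  # last entry plays the role of MAX
    streaks.append(longest_streak(values, target))

print(streaks)
```

Resetting the counter only when the value differs from the target is what keeps interrupted runs from being added together.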


How do I convert a vector of triplets to a 3xnx3 matrix in Dyalog APL?

I have a vector containing 9000 integer elements, where each group of 9 has 3 sub-groups that I'd like to separate out, resulting in a matrix with the shape 3 1000 3. Here's what I did:
⎕IO←0
m←(9÷⍨≢data) 9⍴data
a←m[;0 1 2]
b←m[;3 4 5]
c←m[;6 7 8]
d←↑a b c
which does what I want -- but can I shape the vector directly?
Solution
1 0 2 ⍉ (9÷⍨≢data) 3 3 ⍴ data
Explanation
By using ⍳45 as placeholder data, we can see what is intended:
data ← ⍳45
m←(9÷⍨≢data) 9⍴data
a←m[;0 1 2]
b←m[;3 4 5]
c←m[;6 7 8]
d←↑a b c
d
 0  1  2
 9 10 11
18 19 20
27 28 29
36 37 38

 3  4  5
12 13 14
21 22 23
30 31 32
39 40 41

 6  7  8
15 16 17
24 25 26
33 34 35
42 43 44
The final shape will clearly be 3 (9÷⍨≢data) 3, but we are filling one row from each layer first, then the second row from each layer, and so on. Compare this to the normal way of filling: all rows of the first layer, then all the rows of the second layer, and so on:
3 (9÷⍨≢data) 3⍴data
 0  1  2
 3  4  5
 6  7  8
 9 10 11
12 13 14

15 16 17
18 19 20
21 22 23
24 25 26
27 28 29

30 31 32
33 34 35
36 37 38
39 40 41
42 43 44
In other words, our job is to swap the filling order of the first two axes. To do this, we list the axis lengths in the order we want them filled:
(9÷⍨≢data) 3 3⍴data
 0  1  2
 3  4  5
 6  7  8

 9 10 11
12 13 14
15 16 17

18 19 20
21 22 23
24 25 26

27 28 29
30 31 32
33 34 35

36 37 38
39 40 41
42 43 44
Now we need to swap the first two axes. This is possible using the dyadic transpose function ⍉ which (for our use case) can be thought of as the "reorder axes" function. The left argument is an array of where you want the corresponding axis to go (first element defines the final location of the first axis and so on). While the normal indices of the axes are 0 1 2 we can swap the first two axes with 1 0 2.
Thus 1 0 2 ⍉ (9÷⍨≢data) 3 3 ⍴ data takes our (9÷⍨≢data) 3 3 shape and puts it into the desired shape of 3 (9÷⍨≢data) 3.
d ≡ 1 0 2 ⍉ (9÷⍨≢data) 3 3 ⍴ data
1
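For readers who do not have an APL interpreter at hand, the equivalence of the two approaches can be checked with a rough Python sketch using nested lists. The function names reshape_by_slicing and reshape_direct are made up for this illustration, and indexing is zero-based to match ⎕IO←0.

```python
def reshape_by_slicing(data):
    """Mimic the original approach: build an n-by-9 matrix, then stack column groups."""
    n = len(data) // 9
    m = [data[9 * j:9 * j + 9] for j in range(n)]            # n rows of 9
    return [[row[3 * i:3 * i + 3] for row in m] for i in range(3)]


def reshape_direct(data):
    """Mimic 1 0 2 ⍉ n 3 3 ⍴ data: reshape to n x 3 x 3, then swap the first two axes."""
    n = len(data) // 9
    cube = [[[data[9 * j + 3 * i + k] for k in range(3)]     # column k
             for i in range(3)]                              # row i within layer j
            for j in range(n)]                               # layer j
    return [[cube[j][i] for j in range(n)] for i in range(3)]


data = list(range(45))  # same placeholder data as ⍳45
d = reshape_by_slicing(data)
print(d == reshape_direct(data))
```

In both functions the element at position [i][j][k] of the result is data[9*j + 3*i + k], which is exactly the "swap the first two axes" relationship described above.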

Computing row products conditionally on the values of specific columns

I have a dataset of this form.
a=data.frame(A=1:5,B=1:5,matrix(seq(50),nrow = 5))
colnames(a)<-c("A","B", paste0(1:10))
A B 1 2 3 4 5 6 7 8 9 10
1 1 1 6 11 16 21 26 31 36 41 46
2 2 2 7 12 17 22 27 32 37 42 47
3 3 3 8 13 18 23 28 33 38 43 48
4 4 4 9 14 19 24 29 34 39 44 49
5 5 5 10 15 20 25 30 35 40 45 50
I intend to use apply to compute the product across each row, conditional on the values of A and B. Take row 2, for instance: A=2 and B=2, so the code should select the columns from "2" to "2+2" (that is, "4") and multiply all the selected elements. The result is thus 7*12*17=1428.
I can do it for a row
prod(a[1,match(a$A[1],colnames(a)):match(a$A[1]+a$B[1],colnames(a))])
but can't figure a way to apply it to all the data.frame. Any help?
Here is one option with apply. Specify the MARGIN as 1 to loop over the rows, create the indices by matching the first two elements ('A' and 'A' + 'B') against the column names, create a sequence (:), subset the values of 'x', and get the prod.
apply(a, 1, function(x) {
i1 <- match(x[1], names(x))
i2 <- match(x[1] + x[2], names(x))
prod(x[i1:i2])
})
#[1] 6 1428 150696 17535024 2362500000
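The same row-wise logic can be sketched outside R to make the index arithmetic explicit. This is only an illustration in Python; the dictionary layout and the name row_product are assumptions, and the table is rebuilt from the column-major fill of matrix(seq(50), nrow = 5).

```python
from math import prod


def row_product(row):
    """Multiply the values in the columns named A through A+B, inclusive."""
    start, stop = row["A"], row["A"] + row["B"]
    return prod(row[str(c)] for c in range(start, stop + 1))


# Rebuild the example table: columns "1".."10" hold 1..50 filled column by column.
rows = []
for r in range(1, 6):
    row = {"A": r, "B": r}
    for c in range(1, 11):
        row[str(c)] = (c - 1) * 5 + r
    rows.append(row)

products = [row_product(row) for row in rows]
print(products)
```

As in the R answer, a row whose A+B exceeds the last column name would fail (here with a KeyError), so real data may need a bounds check.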

R: Extract a circle from a matrix

Given a square matrix with sides of length L, how can one extract in R all the values that fall inside the largest circle that fits in the matrix?
I found "Filled circle in matrix (2D array)" for C++, but how do I test whether the position of each cell of the matrix satisfies the equation? How do I know the X and Y of each cell while using apply, for example?
For some 8x8 matrix m:
m = matrix(1:64,8,8)
create a data frame of the coordinates:
g = expand.grid(1:nrow(m), 1:ncol(m))
compute distance-to-centre:
g$d2 = sqrt ((g$Var1-4.5)^2 + (g$Var2-4.5)^2)
compare with circle radius:
g$inside = g$d2<=4
you now have a data frame of row, column, distance to centre, and is-it-inside:
> head(g)
Var1 Var2 d2 inside
1 1 1 4.949747 FALSE
2 2 1 4.301163 FALSE
3 3 1 3.807887 TRUE
4 4 1 3.535534 TRUE
5 5 1 3.535534 TRUE
Then you can extract from a matrix by a two-column matrix with:
m[as.matrix(g[g$inside,c("Var1","Var2")])]
[1] 3 4 5 6 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
[26] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 50 51 52 53 54 55 59 60
[51] 61 62
From your image, that should be 64 minus 12 cells (three in each corner), so the length of 52 in my answer looks correct.
If you are looking for speed then skip the square root and compare with 16, the distance-squared. But you'll probably find a solution in C++ much faster.
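The squared-distance variant mentioned above can be sketched as follows. This is an illustrative Python version, not the R code from the answer: the name circle_values is an assumption, the radius is taken as half the side length (4 for the 8x8 case, as above), and the matrix is stored as a list of rows filled column by column like matrix(1:64, 8, 8).

```python
def circle_values(matrix):
    """Return the values whose cell centre lies within the inscribed circle."""
    n = len(matrix)
    centre = (n - 1) / 2          # zero-based centre, 3.5 for an 8x8 matrix
    radius_sq = (n / 2) ** 2      # compare squared distances, no square root needed
    return [
        matrix[r][c]
        for c in range(n)         # column-major order, like R's output
        for r in range(n)
        if (r - centre) ** 2 + (c - centre) ** 2 <= radius_sq
    ]


# Equivalent of matrix(1:64, 8, 8): value at row r, column c is c*8 + r + 1.
m = [[c * 8 + r + 1 for c in range(8)] for r in range(8)]
inside = circle_values(m)
print(len(inside), inside[:5])
```

Iterating columns in the outer loop keeps the output in the same order as the R result, which makes the two easy to compare.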

R-Creating a New Column Based On Positions Specified by an Existing Column

I have a dataset (df) that looks like this.
A B C D E Position
67 68 69 70 71 5
20 21 22 23 24 2
98 97 96 95 94 3
2 5 7 9 12 5
4 8 12 16 20 4
I am trying to create a new column (Result) whose value in each row is taken from the column whose position is given in that row's Position column.
For example, if Position is 5 in row 1, Result will hold the value of the 5th column of row 1.
My resulting column will look like this:
A B C D E Position Result
67 68 69 70 71 5 71
20 21 22 23 24 2 21
98 97 96 95 94 3 96
2 5 7 9 12 5 12
4 8 12 16 20 4 16
I used the following command, which does not give me what I need. It lumps all of the position column values in each row. I am unable to determine how I can get the correct result.
Any help is appreciated.
Thanks!
Use matrix indexing to extract the values:
df[cbind(1:nrow(df), df$Position)]
# [1] 71 21 96 12 16
Assign the result in the normal way:
df$Result <- df[cbind(1:nrow(df), df$Position)]
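Matrix indexing pairs each row number with its own column number, so every row contributes exactly one value. A rough Python sketch of the same lookup (the list-of-lists layout and the name pick_by_position are assumptions for illustration) may make that pairing clearer:

```python
def pick_by_position(row):
    """Return the value at the 1-based position stored in the row's last entry."""
    *values, position = row
    return values[position - 1]  # Position is 1-based, Python lists are 0-based


# Columns A-E followed by Position, as in the example data.
table = [
    [67, 68, 69, 70, 71, 5],
    [20, 21, 22, 23, 24, 2],
    [98, 97, 96, 95, 94, 3],
    [2, 5, 7, 9, 12, 5],
    [4, 8, 12, 16, 20, 4],
]

result = [pick_by_position(row) for row in table]
print(result)
```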

Referencing Specific Location in My Data

I am new to R and trying to solve a problem.
Here is an example of my data:
product_id week purchases
53 0 19
53 1 27
53 2 34
53 3 43
53 4 44
For this data, there are three types of product_id, and the week variable runs from 0-15 for each, with a positive purchase value for each.
I would like to add a third variable called percent, and would like it to equal Purchases / the value of purchases when week = 15, for the relevant product data id.
My problem is I don't know how to tell R I want to refer only to week=15 & the product id of whatever row I am on, when writing this equation.
Any help would be appreciated!
Using week==4 instead of 15 (so that it works with your example data). All of these results assume that exactly one row per product has week==4.
You could use ave (and transform). Summing purchases * (week==4) within each product picks out that product's week-4 value:
transform(DF, prop.purchases = purchases / ave(purchases * (week==4), product_id, FUN = sum))
Using data.table
library(data.table)
DT <- data.table(DF)
DT[, prop.purchase := purchases / purchases[week==4], by = product_id]
An alternative approach using keys and by-without-by:
DT <- data.table(DF, key = 'product_id')
DT[DT[week==4], prop.purchase := purchases / i.purchases]
Using plyr and ddply
library(plyr)
ddply(DF, .(product_id), mutate, prop.purchases = purchases / purchases[week==4])
A simple nested ifelse also works for your sample data. The sample data is called sample.
Sample data (with added rows for ids 54 and 55, and week 14 standing in for week 15):
product_id week purchases
53 0 19
53 1 27
53 2 34
53 3 43
53 4 44
53 14 23
54 0 23
54 1 21
54 2 22
54 3 32
54 4 33
54 14 22
55 0 22
55 1 33
55 2 44
55 3 55
55 4 11
55 14 12
sample$percent<-with(sample,ifelse(product_id ==53, purchases/purchases[week==14 &product_id==53],ifelse(product_id ==54, purchases/purchases[week==14 & product_id==54],purchases/purchases[week==14 &product_id==55])))
Output:
product_id week purchases percent
1 53 0 19 0.8260870
2 53 1 27 1.1739130
3 53 2 34 1.4782609
4 53 3 43 1.8695652
5 53 4 44 1.9130435
6 53 14 23 1.0000000
7 54 0 23 1.0454545
8 54 1 21 0.9545455
9 54 2 22 1.0000000
10 54 3 32 1.4545455
11 54 4 33 1.5000000
12 54 14 22 1.0000000
13 55 0 22 1.8333333
14 55 1 33 2.7500000
15 55 2 44 3.6666667
16 55 3 55 4.5833333
17 55 4 11 0.9166667
18 55 14 12 1.0000000
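All of the answers above follow the same two steps: find each product's purchases in the reference week, then divide every row by its own product's value. A minimal Python sketch of that idea (the tuple layout and the name add_percent are assumptions for illustration, and it assumes exactly one reference-week row per product):

```python
def add_percent(records, baseline_week):
    """Append purchases / (purchases in baseline_week for the same product)."""
    # Step 1: one lookup entry per product, taken from its baseline-week row.
    baseline = {
        product_id: purchases
        for product_id, week, purchases in records
        if week == baseline_week
    }
    # Step 2: divide each row by the baseline of its own product.
    return [
        (product_id, week, purchases, purchases / baseline[product_id])
        for product_id, week, purchases in records
    ]


# A subset of the sample data above: (product_id, week, purchases).
records = [
    (53, 0, 19), (53, 4, 44), (53, 14, 23),
    (54, 0, 23), (54, 4, 33), (54, 14, 22),
]

with_percent = add_percent(records, baseline_week=14)
for row in with_percent:
    print(row)
```

Unlike the nested ifelse, the lookup table does not need one branch per product, so it keeps working when more product ids are added.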
