I've got a column A, which has several values, some of them repeating. For example: A = c(5, 9, 6, 5, 5). I need to go through A and count the frequency of each value in A. So, in this example, there are 3 occurrences of 5 in A. I need to save these frequencies so I can use them in another calculation. By the way, I have several other variables in this dataset.
How do I do this?
Thanks.
You can try
library(data.table)  # v1.9.4+
setDT(yourdf)[, .N, by = A]
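If you would rather not load a package, here is a minimal base R sketch of the same idea. It is an alternative to the data.table call above, not a replacement for it. The data frame contents and the column names `other` and `freq` are made up for illustration.

```r
# Hypothetical data: column A from the question plus one other variable.
yourdf <- data.frame(A = c(5, 9, 6, 5, 5), other = letters[1:5])

# One row per distinct value of A, with its count in column N.
freq_table <- as.data.frame(table(A = yourdf$A), responseName = "N")

# Attach each row's frequency to the original data so it can be
# used in later calculations alongside the other variables.
yourdf$freq <- ave(yourdf$A, yourdf$A, FUN = length)

print(freq_table)
print(yourdf)
```

`freq_table` is handy for a summary, while `yourdf$freq` keeps the count next to every original row.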
Basically, I wish to sum (add) the numeric values in rows (1 and 2; 3 and 4; 12; 13 and 14) of column 'tdiff' in a dataframe 'taPa'. I tried taPa[rowSums(1:2, 3:4, 12, 13:14),] but it gives an error: 'x' must be an array of at least two dimensions. Any help would be great. Thanks.
I'm assuming you want to add them all together (1-4, 12-14), right?
If that's the case, you could simply use
sum(taPa$tdiff[c(1, 2, 3, 4, 12, 13, 14)])
to get that sum.
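If you actually wanted one sum per group of rows (1 and 2; 3 and 4; 12; 13 and 14) rather than a single total, a rough sketch could look like this. The contents of `taPa` are invented here, and `row_groups` is just a name I picked.

```r
# Hypothetical data: 14 rows with a numeric 'tdiff' column.
taPa <- data.frame(tdiff = 1:14)

# Each list element holds the row numbers that belong to one group.
row_groups <- list(1:2, 3:4, 12, 13:14)

# Sum 'tdiff' within each group of rows.
group_sums <- sapply(row_groups, function(i) sum(taPa$tdiff[i]))

# Adding the group sums gives the same total as summing all seven rows at once.
total <- sum(group_sums)

print(group_sums)
print(total)
```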
I have two identical vectors, S and T.
S <- seq(from = 0, to = 80, by = 2)
T is the exact same. I am trying to create a data frame so that column one would be all of the S values (0 through 80) and column two would be all of the T values (0 through 80). However, I want it so that row one would be 0, 0. Row 2 would be 0, 2. Row 3 would be 0, 4, etc. And then row 42 would be 2, 0. I believe it would be possible using a for loop, but I am struggling with how to accomplish this. Any advice would greatly help. I understand that there would be close to if not over 1000 rows, but I feel like there is a simple way to accomplish this.
Don't label variables T or t in R. T is a popular abbreviation of TRUE, and t is a function (transpose).
expand.grid() is probably what you're looking for.
S <- seq(from = 0, to = 80, by = 2)
TT <- S
expand.grid(S, TT)
Yes, it's big.
dim(expand.grid(S,TT))
[1] 1681 2
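One thing to watch: expand.grid() varies its first argument fastest, so expand.grid(S, TT) gives row 2 as 2, 0 rather than the 0, 2 asked for. A small sketch of one way to get the requested row order is to pass the fast-changing vector first and then swap the columns. The column names `S` and `TT` and the name `grid` are my own choices.

```r
S <- seq(from = 0, to = 80, by = 2)

# TT goes first so that it changes fastest; S then changes once per block of 41 rows.
grid <- expand.grid(TT = S, S = S)

# Reorder the columns so S is column one and TT is column two.
grid <- grid[, c("S", "TT")]

print(grid[c(1, 2, 3, 42), ])
```

Rows 1 to 3 should come out as 0, 0 then 0, 2 then 0, 4, and row 42 as 2, 0.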
Does anyone have any idea how to print the elements of a set that are greater than some value without using booleans?
E.g., suppose I have the set x, which includes the elements (1, 4, 6, 3, 5, 2, 9).
Obviously, we can print all the values of x greater than 5 using the following code:
x[x>5]
But this way of coding uses booleans (it uses TRUE, FALSE etc.)
But is there any way to do this using solely integers?
I was thinking about some sort of loop that would run over the number of elements, and then do
x[c(variable)]
but I don't know really.
Please help?
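A possible sketch, assuming "using solely integers" means subsetting by integer positions instead of by a logical vector: which() turns the comparison into integer indices, and a loop can build the same indices one at a time. Note that the comparison x > 5 still produces TRUE/FALSE along the way; only the final subsetting is integer-based. The names `idx` and `idx_loop` are just for illustration.

```r
x <- c(1, 4, 6, 3, 5, 2, 9)

# which() returns the integer positions where the condition holds.
idx <- which(x > 5)
print(idx)     # positions 3 and 7
print(x[idx])  # values 6 and 9

# Loop version: collect the positions one element at a time.
idx_loop <- integer(0)
for (i in seq_along(x)) {
  if (x[i] > 5) {
    idx_loop <- c(idx_loop, i)
  }
}
print(x[idx_loop])
```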
I know I can use
head(sample(x),m)
to print a random selection of m rows from my dataset, but in this case each new draw is randomized. What if, instead of randomizing every draw, I wanted to randomize only the starting position for the first draw, while preserving the order of subsequent rows?
To illustrate, imagine we have a dataset of n rows and I wanted to print m of them in order, starting from a random position. The randomly drawn starting position is 5, so my desired function would print 5, 6, 7, ..., m < n.
This is more of a theoretical question, not a diagnostic one, so I don't believe an MWE is needed... please let me know if you think it is and I will be happy to provide one.
We create a numeric index by sampling a single starting row and adding the sequence 0:3, which covers the rows that should follow it (four rows in total here). If the sampled start is near the end of the data, say the last row, some of those indices will exceed the number of rows, so we add a condition to drop them
i1 <- sample(nrow(df1), 1) + 0:3
df1[ i1[i1 <= nrow(df1)], ]
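If you always want exactly m rows back, another option is to restrict the random start so the block never runs past the end. Here is a rough sketch of that variation; `consecutive_sample` is a hypothetical helper name, and the contents of `df1` are made up.

```r
# Hypothetical helper: return m consecutive rows of df, starting at a random row.
consecutive_sample <- function(df, m) {
  n <- nrow(df)
  stopifnot(m >= 1, m <= n)
  # Only starts 1..(n - m + 1) leave room for m rows.
  start <- sample.int(n - m + 1, 1)
  df[start:(start + m - 1), , drop = FALSE]
}

# Hypothetical data for illustration.
df1 <- data.frame(id = 1:10, value = letters[1:10])

set.seed(1)
out <- consecutive_sample(df1, 4)
print(out)
```

The trade-off is that starts near the end are never drawn, whereas the version above allows them and returns fewer rows.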
This is a simple problem, but I cannot find an elegant solution for it:
Given is the following vector series:
series=c(1,2,4,5,6,1,2,4,5,6,7,8,2,4)
I now want to count blocks of this vector in the same vector; e.g. if I have a block size of 2, I would like to count the pairs 1&2, 2&4, 4&5 and so on (in total 8 unique blocks if I did the counting right).
Can you think of an easy way to program that so that I receive an output matrix with a column for the "unique block number" and a corresponding column for the counts?
One idea is to use rollapply from zoo,
library(zoo)
nrow(unique(rollapply(series, 2, by = 1, paste0)))
#[1] 8
You can change '2' to get block sizes of 3, 4, etc.
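To get the per-block counts the question asks for, and not just the number of unique blocks, here is a minimal base R sketch built on embed() instead of zoo. The function name `count_blocks` and the "&" separator are my own choices.

```r
series <- c(1, 2, 4, 5, 6, 1, 2, 4, 5, 6, 7, 8, 2, 4)

# Hypothetical helper: count every run of `size` consecutive values in x.
count_blocks <- function(x, size) {
  # embed() puts each window in a row with the newest value first,
  # so reverse the columns to restore the original order.
  windows <- embed(x, size)[, size:1, drop = FALSE]
  # Collapse each window into a single label such as "1&2".
  keys <- apply(windows, 1, paste, collapse = "&")
  # Tabulate the labels into a two-column data frame.
  as.data.frame(table(block = keys), responseName = "count")
}

block_counts <- count_blocks(series, 2)
print(block_counts)
```

For a block size of 2 this gives 8 rows, one per unique pair, and the counts add up to the 13 overlapping pairs in the series.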