Counting number of sequences in a vector [duplicate] - r

This question already has answers here:
How can I count runs in a sequence?
(2 answers)
Closed 8 years ago.
I have a binary vector and I want to count how many sequences of 1's I've got. So that if I have a vector like:
bin <- c(1,1,0,1,1,1,1,0,0,0,1,0,1,1,0,0,1,1,1)
I would get 5. I haven't found any existing functions that could do this, anyone got any good tips on how one could write one? I don't know how to build the "counter" when the sequences all have different lengths.

The run-length encoding function, rle(), is built for this. It computes the lengths of runs of equal values in a vector and, helpfully, returns those lengths together with the corresponding values. So use rle(bin).
Compare the $values output to your desired value (1) with == and sum the result (each run of 1's yields TRUE, which sum() counts as 1):
sum( rle(bin)$values == 1 )
[1] 5
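To see why this works, it can help to look at both components that rle() returns. The sketch below uses the vector from the question; the variable names runs, n_runs_of_ones and run_lengths_of_ones are only illustrative.

```r
bin <- c(1,1,0,1,1,1,1,0,0,0,1,0,1,1,0,0,1,1,1)

# rle() returns a list with two parallel components:
# $values  - the value of each run
# $lengths - how long each run is
runs <- rle(bin)
runs$values   # 1 0 1 0 1 0 1 0 1
runs$lengths  # 2 1 4 3 1 1 2 2 3

# Number of runs whose value is 1
n_runs_of_ones <- sum(runs$values == 1)

# Lengths of just those runs, in case they are needed as well
run_lengths_of_ones <- runs$lengths[runs$values == 1]
```

Because the lengths come back alongside the values, no explicit counter is needed even though the runs differ in length.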

Related

Count occurrences of value in a set of variables in R (per column) [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 1 year ago.
I have this data and I want to figure out a way to know how many ones and how many zeros are in each column (ie Arts and Crafts). I have been trying different things but it hasn't been working. Does anyone have any suggestions?
You can use the table() function in R, which counts how often each distinct value occurs. Here I have also used unlist() to convert a list to a vector.
df1 <- read.csv("Your_CSV_file_name_here.csv")
table(unlist(df1$ArtsAndCrafts))
If you want to count the zeros and ones row-wise instead, you can refer to this question on Stack Overflow.
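If the counts are needed for every column at once, one possible approach is sketched below. The data frame here is made up for illustration (the original CSV is not available), and the column name Sports is an assumption; only ArtsAndCrafts appears in the answer above.

```r
# Hypothetical stand-in for the data read from the CSV file
df1 <- data.frame(
  ArtsAndCrafts = c(1, 0, 1, 1, 0),
  Sports        = c(0, 0, 1, 0, 0)
)

# Comparing the whole data frame to a value gives a logical matrix;
# colSums() then counts the TRUEs in each column.
ones  <- colSums(df1 == 1)
zeros <- colSums(df1 == 0)

# The same counts as one table per column; fixing the factor levels
# keeps a row for a value even if it never occurs in a column.
counts <- sapply(df1, function(col) table(factor(col, levels = c(0, 1))))
```

Here counts is a matrix with one row per value ("0" and "1") and one column per variable.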

r: How to sample from a population of size 1? [duplicate]

This question already has answers here:
Sample from vector of varying length (including 1)
(4 answers)
Closed 6 months ago.
I appreciate that sampling from a list of length 1 has little practical use yet all the same I tried the following:
When I run the R snippet sample(c(1,2,3),1), I obtain a single random value from the list c(1,2,3).
When I run the R snippet sample(c(3),1), I would expect the number 3 to always be output, but it isn't; I seem to get the same behaviour as above.
Why is this? How can I sample from a list of length 1?
I found that sample(c(3,3),1) does indeed output the intended, but feels not what I had in mind.
See documentation for sample:
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x.
You can use resample() from the gdata package. This saves you having to redefine resample in each new script. Just call
gdata::resample(c(3), 1)
https://www.rdocumentation.org/packages/gdata/versions/2.18.0/topics/resample
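If you would rather not depend on a package, a small helper in base R gives the same behaviour. The sketch below follows the pattern shown in the examples section of ?sample: it samples positions with sample.int() and then indexes into x, so a length-one numeric vector is never expanded to 1:x.

```r
# Sample index positions rather than values, then subset x.
# Extra arguments (size, replace, prob) are passed on to sample.int().
resample <- function(x, ...) x[sample.int(length(x), ...)]

resample(c(3), 1)        # always 3
resample(c(1, 2, 3), 1)  # one of 1, 2, 3

# For comparison: sample(c(3), 1) draws from 1:3, as the documentation says
sample(c(3), 1)
```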

Is there a way to create this type of vector without using a for loop? [duplicate]

This question already has an answer here:
rep() with each equals a vector
(1 answer)
Closed 2 years ago.
I have a vector with a bunch of numbers in it, let's say 3,2,0,0,0,1,2,....
I want to make a vector that has numbers based on the numbers in the above vector.
It's hard to explain, but the vector created from the above numbers would be 1,1,1,2,2,6,7,7
One appears three times because the number in the first spot is a three, two shows up twice because the second number is a two, and so on.
I can do this just fine with a for loop using rep(), but I would love a way to do this with sapply and a custom function (or an already existing one if there is such a thing). I'm not sure how to do it without a counter variable i.
You can use rep in a vectorized way here: pass it the position of each element via seq_along(x), and each position is repeated as many times as the corresponding element of x.
x <- c(3,2,0,0,0,1,2)
rep(seq_along(x), x)
#[1] 1 1 1 2 2 6 7 7
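For comparison, here is a sketch of the apply-style version the question asks about, checked against the vectorized call. The names vectorized and looped are illustrative only.

```r
x <- c(3, 2, 0, 0, 0, 1, 2)

# Vectorized: the second argument (times) may be a vector with one
# count per element of the first argument.
vectorized <- rep(seq_along(x), times = x)

# Equivalent apply-style version: repeat each index i x[i] times,
# then flatten. Zero counts contribute empty vectors.
looped <- unlist(lapply(seq_along(x), function(i) rep(i, x[i])))

identical(vectorized, looped)
```

The vectorized form is shorter and avoids building an intermediate list.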

Why this conditional subsetting from a csv file returns incorrect answer in R? [duplicate]

This question already has answers here:
How to count TRUE values in a logical vector
(8 answers)
Closed 5 years ago.
Suppose I have the following data called D (9 columns, 395 rows):
D = read.csv("https://docs.google.com/uc?id=0B5V8AyEFBTmXQ1QwWVZuS3FXOHc&export=download")
In D, when I try to find out the length of p.values that are less than .05, I get an erroneous answer:
length(D$p.value <= .05) # Returns "395", which is the total number of rows not those <= .05
I'm wondering what the correct code is to return the number of p.values that are less than .05 in D?
Try this:
sum(D$p.value <= .05)
I believe your problem may be that you are simply counting the size of the comparison vector. The comparison returns one TRUE or FALSE per row, so its length is always the number of rows in the data frame. Instead, my answer counts only the entries for which the inequality is actually true.
@RichScriven edit: Summing the inequality will automatically convert the booleans to numbers, either 0 or 1.
Note that if you take a sum of a vector containing even one NA value then the resulting sum will also be NA. One option would be to ignore those NA values by removing them via:
sum(D$p.value <= .05, na.rm = TRUE)
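The difference between the three calls is easy to check on a small vector. The values below are made up for illustration and stand in for D$p.value.

```r
# Hypothetical p-values, including one missing value
p <- c(0.01, 0.20, NA, 0.04, 0.50)

n_total   <- length(p <= .05)             # length of the logical vector: 5
n_with_na <- sum(p <= .05)                # NA, because one comparison is NA
n_signif  <- sum(p <= .05, na.rm = TRUE)  # counts only the TRUEs: 2

# Subsetting first also works; which() drops NA comparisons
n_signif_alt <- length(which(p <= .05))
```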

R compare multiple values with vector and return vector [duplicate]

This question already has answers here:
Test if a vector contains a given element
(8 answers)
Closed 7 years ago.
I have a vector "A", and for each element of A I want to check whether it is equal to any element from a second vector "Targets". I want a vector of logical values with the length of A as return.
The same problem is mentioned here. Here is a related discussion of how to test if a vector contains a specific value, but I did not know how to apply this to my question because I want a vector as output and not a single boolean.
Here is example code:
#example data
A <- c(rep("A",4),rep(c("B","C"),8))
Targets <- c("A","C","D")
I would like a vector which tells me for each element of A if it is equal to at least one of the elements in Targets. For the example, this should produce a vector that is identical with:
result <- c(rep(TRUE,4), rep(c(FALSE,TRUE), 8))
In case this is relevant, eventually vector A will be much longer (ca 20000 elements), and the vector Targets will contain approximately 30 elements.
Just try:
A %in% Targets
The %in% operator tells you, for each element of its first argument, whether it equals one of the elements of its second argument, which is exactly what you are looking for.
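A quick check on the example data from the question, as a sketch; the names hits and expected are illustrative. The last line relies on the fact that %in% is a thin wrapper around match().

```r
A <- c(rep("A", 4), rep(c("B", "C"), 8))
Targets <- c("A", "C", "D")

# One logical value per element of A
hits <- A %in% Targets

# The logical vector described in the question
expected <- c(rep(TRUE, 4), rep(c(FALSE, TRUE), 8))
identical(hits, expected)

# The result can be used directly for counting or subsetting
sum(hits)  # 12
A[hits]

# The same test written with match()
same_via_match <- match(A, Targets, nomatch = 0L) > 0L
```

With around 20000 elements in A and 30 in Targets, as mentioned in the question, this remains a single vectorized call.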
