Tallying values in single column and separating into Rows in R [duplicate] - r

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 6 years ago.
I have a single column of numbers. I'm wondering how I can tally how many times each value occurs and output the counts as one column per value. I've tried playing around with "separate" but I can't figure out how to make it work.
Here's my data frame:
2
2
2
2
2
4
4
4
I'd like it to be
2 4
5 3

You can use the table() function.
> df
V1
1 2
2 2
3 2
4 2
5 2
6 4
7 4
8 4
> table(df$V1)
2 4
5 3
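If the counts are needed as a regular data frame rather than a table object, a minimal sketch along these lines may help. The column names value and n are my own choice, not something from the question.

```r
# Rebuild the example data from the question
df <- data.frame(V1 = c(2, 2, 2, 2, 2, 4, 4, 4))

# table() counts how often each distinct value occurs
counts <- table(df$V1)

# Convert to a data frame: one row per distinct value
long <- as.data.frame(counts, stringsAsFactors = FALSE)
names(long) <- c("value", "n")  # hypothetical column names

# A single count can be looked up by its label (labels are character strings)
n_fours <- counts[["4"]]
```

Note that the labels come back as character strings, so convert with as.numeric(long$value) if you need numbers again.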

We can use tabulate, which would be faster:
tabulate(factor(df$V1))
#[1] 5 3
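One thing to keep in mind is that tabulate() returns bare counts without the value labels. A small sketch, assuming you want the labels back, could attach the factor levels as names:

```r
df <- data.frame(V1 = c(2, 2, 2, 2, 2, 4, 4, 4))

# Convert to a factor once so the levels can be reused as labels
f <- factor(df$V1)

# tabulate() counts the integer codes of the factor; name the result by level
counts <- setNames(tabulate(f, nbins = nlevels(f)), levels(f))

# Without factor(), tabulate() treats the values as bin numbers 1..max,
# so the values 1 and 3, which never occur, show up as zero counts
sparse <- tabulate(df$V1)
```

Here counts carries the names "2" and "4", while sparse has one entry for every integer from 1 to 4.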

Related

increasing value by one with each occurrence of non-repeated number [duplicate]

This question already has answers here:
Increment by 1 for every change in column
(6 answers)
Closed 2 years ago.
v <- c(1,1,2,3,3,3,1,1,3,4,4)
I'm trying to create a vector of elements in which the first occurrence of a non-repeated number always increases by one relative to the previous number.
This is the desired output
1,1,2,3,3,3,4,4,5,6,6
What would be an efficient way of doing this?
A base R option with rle
> with(rle(v),rep(seq_along(values),lengths))
[1] 1 1 2 3 3 3 4 4 5 6 6
or with data.table::rleidv (data.table::rleid works here too)
> data.table::rleidv(v)
[1] 1 1 2 3 3 3 4 4 5 6 6
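To see why the rle approach works, it may help to look at the intermediate pieces. This is just an unpacked sketch of the one-liner above, plus an equivalent cumsum formulation that I am adding for comparison.

```r
v <- c(1, 1, 2, 3, 3, 3, 1, 1, 3, 4, 4)

# rle() splits the vector into runs of identical consecutive values
r <- rle(v)
# r$values  holds the value of each run:  1 2 3 1 3 4
# r$lengths holds the length of each run: 2 1 3 2 1 2

# Number the runs 1, 2, 3, ... and repeat each number by its run length
run_id <- rep(seq_along(r$values), r$lengths)

# Equivalent idea: a new run starts wherever a value differs from its predecessor
run_id2 <- cumsum(c(TRUE, v[-1] != v[-length(v)]))
```

Both run_id and run_id2 give the desired 1 1 2 3 3 3 4 4 5 6 6.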

Repeating rows in data frame by using the content of a column in R [duplicate]

This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
I want to create a data frame by repeating each row the number of times given in one of its columns. Below is the source data frame.
data.frame(c("a","b","c"), c(4,5,6), c(2,2,3)) -> df
colnames(df) <- c("sample", "measurement", "repeat")
df
sample measurement repeat
1 a 4 2
2 b 5 2
3 c 6 3
I want to repeat the rows by using the "repeat" column and its content to get a data frame like the one below. Ideally, I would like to have a function to do this.
sample measurement repeat
1 a 4 2
2 a 4 2
3 b 5 2
4 b 5 2
5 c 6 3
6 c 6 3
7 c 6 3
Thanks in advance!
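Solved. df[rep(rownames(df), df[["repeat"]]), ] did the job. Note that repeat is a reserved word in R, so df$repeat does not parse; use df[["repeat"]] or backtick-quote the name.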
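Since the question asked for a function, here is one possible way to wrap the same idea. The name repeat_rows and its arguments are hypothetical, and the sketch assumes the count column holds non-negative whole numbers.

```r
# Hypothetical helper: repeat each row of `data` as many times as
# the value in the column named by `times_col`
repeat_rows <- function(data, times_col) {
  idx <- rep(seq_len(nrow(data)), times = data[[times_col]])
  out <- data[idx, , drop = FALSE]
  rownames(out) <- NULL  # renumber rows 1..n instead of 1, 1.1, 2, ...
  out
}

# Data from the question
df <- data.frame(c("a", "b", "c"), c(4, 5, 6), c(2, 2, 3))
colnames(df) <- c("sample", "measurement", "repeat")

res <- repeat_rows(df, "repeat")
```

Passing the column name as a string and using [[ ]] avoids any trouble with column names that are reserved words.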

Merging Overlapping Intervals in R [duplicate]

This question already has answers here:
Overlap join with start and end positions
(5 answers)
Merge overlapping ranges into unique groups, in dataframe
(2 answers)
Collapse rows with overlapping ranges
(5 answers)
Closed 4 years ago.
I have a problem where I get information on the range of occupied cells. Each test may have multiple start and end entries, and these ranges can overlap. Not every "test" has entries.
I have a data frame in R and want to merge the overlapping ranges for each "test" into unique ranges.
x<-data.frame(test=c(2,3,3,2,3,4),start=c(1,1,1,2,3,4),end=c(1,2,3,3,4,4))
> x
test start end
1 2 1 1
2 3 1 2
3 3 1 3
4 2 2 3
5 3 3 4
6 4 4 4
I would like to transform this data frame into:
test start end
1 2 1 1
2 2 2 3
3 3 1 4
4 4 4 4
In the end I just want to know how many cells are occupied by the ranges for each "row" (that is, each value of test). Row 2 has (1,1) and (2,3), which means 3 cells. Row 3 has (1,4), so 4 cells. Row 4 has (4,4), so 1 cell. Rows 1 and 5 to n have no entries, so they have 0 cells. With y holding the merged data frame shown above:
u<-unique(y[,1])
a<-rep(0,length(u))
for(i in seq_along(u)){
rows<-which(y[,1]==u[i])
# cells per range = end - start + 1, summed over all ranges of this test
a[i]<-sum(y[rows,3]-y[rows,2])+length(rows)
}
> a
[1] 3 4 1
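No answer for the merge step itself is recorded here, so the following is only a rough base R sketch of one way to build y. It assumes ranges count as overlapping when a start is less than or equal to an earlier end, which matches the desired output above, where (1,1) and (2,3) stay separate. The helper name merge_intervals is hypothetical.

```r
x <- data.frame(test = c(2, 3, 3, 2, 3, 4),
                start = c(1, 1, 1, 2, 3, 4),
                end = c(1, 2, 3, 3, 4, 4))

# Hypothetical helper: merge the overlapping ranges of a single test
merge_intervals <- function(d) {
  d <- d[order(d$start, d$end), ]
  # largest end seen among all earlier ranges (-Inf for the first one)
  prev_end <- c(-Inf, head(cummax(d$end), -1))
  # a new group starts when a range begins after everything seen so far
  grp <- cumsum(d$start > prev_end)
  data.frame(test = d$test[1],
             start = as.vector(tapply(d$start, grp, min)),
             end = as.vector(tapply(d$end, grp, max)))
}

# Apply per test and stack the results
y <- do.call(rbind, lapply(split(x, x$test), merge_intervals))
rownames(y) <- NULL

# Occupied cells per test: each merged range covers end - start + 1 cells
cells <- tapply(y$end - y$start + 1, y$test, sum)
```

With this y, cells reproduces the counts 3, 4 and 1 for tests 2, 3 and 4. Tests with no entries simply do not appear and would need to be filled in with 0 separately.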

R - Subset dataframe where 2 columns have values [duplicate]

This question already has answers here:
Remove rows with all or some NAs (missing values) in data.frame
(18 answers)
Closed 6 years ago.
How can I subset a data frame to the rows where both of 2 columns have values?
For example:
A B
1 2
3
5 6
8
becomes
A B
1 2
5 6
> subset(df, !is.na(df$A) & !is.na(df$B))
> df[!is.na(df$A) & !is.na(df$B),]
> df[!is.na(rowSums(df)),]
> na.omit(df)
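These are all equivalent here. Note that the rowSums version requires numeric columns, and na.omit drops rows with an NA in any column.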
The easiest way is to use na.omit (if you are targeting NA values).
See the following R code snippet:
> x
a b
1 1 2
2 3 NA
3 5 6
4 NA 8
> na.omit(x)
a b
1 1 2
3 5 6
Another way is to use complete.cases as shown below:
> x[complete.cases(x),]
a b
1 1 2
3 5 6
You can also use na.exclude as shown below:
> na.exclude(x)
a b
1 1 2
3 5 6
Hope it works for you!
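If the data frame has more columns than the two you care about, na.omit and a bare complete.cases will also drop rows that are NA elsewhere. A small sketch, assuming an extra column called note that is not in the original example, restricts the check to the two columns of interest:

```r
# Example data with a third, hypothetical column that also contains an NA
x <- data.frame(a = c(1, 3, 5, NA),
                b = c(2, NA, 6, 8),
                note = c(NA, "p", "q", "r"))

# Keep rows where both a and b are present, ignoring NAs in other columns
both <- x[complete.cases(x[, c("a", "b")]), ]

# For comparison: na.omit() checks every column, so row 1 is dropped as well
strict <- na.omit(x)
```

Here both keeps rows 1 and 3, while strict keeps only row 3.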

Transpose multiple columns to rows in R [duplicate]

This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Closed 6 years ago.
I have a data set as follows, in which each ID has multiple rows for different attributes.
ID<-c(1,1, 2,2,3,3)
Score<-c(4,5, 5,7,8,9)
Attribute<-c("Att_1","Att_2", "Att_1","Att_2", "Att_1","Att_2")
T<-data.frame(ID, Score, Attribute)
I need to transform it into the following format so that each ID has one row:
ID Att_1 Att_2
1 4 5
2 5 7
3 8 9
There are threads on how to do this in Excel; I'm just wondering whether there is a neat way to do it in R. Thanks a lot!
You could try this:
library(reshape2)
dcast(T, ID ~ Attribute, value.var="Score")
# ID Att_1 Att_2
#1 1 4 5
#2 2 5 7
#3 3 8 9
This can be done with reshape():
reshape(data.frame(ID,Score,Attribute),idvar='ID',timevar='Attribute',direction='wide')
## ID Score.Att_1 Score.Att_2
## 1 1 4 5
## 3 2 5 7
## 5 3 8 9
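The reshape() result keeps a "Score." prefix on the new columns and the original row numbers. If you want the exact layout from the question, a small cleanup sketch along these lines should work. The object names long and wide are my own.

```r
# Data from the question
long <- data.frame(ID = c(1, 1, 2, 2, 3, 3),
                   Score = c(4, 5, 5, 7, 8, 9),
                   Attribute = c("Att_1", "Att_2", "Att_1", "Att_2", "Att_1", "Att_2"))

# Long to wide with base R
wide <- reshape(long, idvar = "ID", timevar = "Attribute", direction = "wide")

# Strip the "Score." prefix that reshape() adds to the value columns
names(wide) <- sub("^Score\\.", "", names(wide))

# Renumber the rows 1..n
rownames(wide) <- NULL
```

After this, wide has the columns ID, Att_1 and Att_2 with one row per ID.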
