Assigning values to correlative series in R

I hope you can help me with this issue.
I have a big data frame. Simplified, it looks like this:
df <- data.frame(radius = c (2,3,5,7,4,6,9,8,3,7,8,9,2,4,5,2,6,7,8,9,1,10,8))
df$num <- c(1,2,3,4,5,6,7,8,9,10,11,1,12,13,1,14,15,16,17,18,19,1,1)
df
The column $num contains correlative (consecutive) series (1-11, 1, 12-13, 1, 14-19, 1, 1).
I would like to assign a sequential group number to each correlative series and store it as a new column. The outcome should look like this:
df$outcome <- c(1,1,1,1,1,1,1,1,1,1,1,2,3,3,4,5,5,5,5,5,5,6,7)
df
thanks a lot!
A.

We can take the difference between adjacent elements of 'num' with diff and check whether it is not equal to 1; every such position starts a new series. The logical result is one element shorter than the 'num' vector, so we pad it with a leading TRUE and apply cumsum to get the expected output.
df$outcome <- cumsum(c(TRUE,diff(df$num)!=1))
df$outcome
#[1] 1 1 1 1 1 1 1 1 1 1 1 2 3 3 4 5 5 5 5 5 5 6 7
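To make the three steps visible, here is a minimal sketch on a shorter vector. The vector `num` and the intermediate names `steps`, `breaks` and `group` are made up for illustration and are not part of the original answer.

```r
# Hypothetical short example: series 1-3, then 1, then 4-5, then 1, then 1
num <- c(1, 2, 3, 1, 4, 5, 1, 1)

# Step 1: difference between each element and the one before it
steps <- diff(num)              # 1 1 -2 3 1 -4 0

# Step 2: a new series starts wherever the step is not exactly 1;
# the first element always starts a series, hence the leading TRUE
breaks <- c(TRUE, steps != 1)   # TRUE FALSE FALSE TRUE TRUE FALSE TRUE TRUE

# Step 3: the running count of series starts is the group number
group <- cumsum(breaks)
group                           # 1 1 1 2 3 3 4 5
```

Because cumsum only looks at where a break occurs, the same idea should work for any rule that marks the first row of a group.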

Related

To count how many times one row is equal to a value

I have a df here:
df <- data.frame('v1'=c(1,2,3,4,5),
'v2'=c(1,2,1,1,2),
'v3'=c(NA,2,1,4,'1'),
'v4'=c(1,2,3,NaN,5),
'logical'=c(1,2,3,4,5))
I would like to know, for each row, how many of the columns v1 to v4 are equal to the value of the variable 'logical', and store the result in a new variable 'count'.
I wrote a for loop like this:
attach(df)
df$count <- 0
for(i in colnames(v1:v4)){
if(df$logical == i){
df$count <- df$count+1}
}
but it doesn't work: the new variable 'count' is still all 0.
Please help me fix it.
The desired result should look like this:
df <- data.frame('v1'=c(1,2,3,4,5),
'v2'=c(1,2,1,1,2),
'v3'=c(NA,2,1,4,'1'),
'v4'=c(1,2,3,NaN,5),
'logical'=c(1,2,3,4,5),
'count'=c(3,4,2,2,2))
Many thanks from a beginner.
We can use rowSums after creating a logical matrix
df$count <- rowSums(df[1:4] == df$logical, na.rm = TRUE)
df$count
#[1] 3 4 2 2 2
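As a rough illustration of why this works, the sketch below splits the one-liner into the comparison and the row sum. The data frame and the column name `target` are hypothetical and purely numeric, chosen to keep the comparison easy to follow.

```r
# Hypothetical data: three value columns and a per-row target to compare against
df <- data.frame(v1 = c(1, 2, 3),
                 v2 = c(1, 5, 4),
                 v3 = c(NA, 2, 0),
                 target = c(1, 2, 3))

# Comparing a data frame with a vector of length nrow(df) works column by
# column, so every cell is tested against the target of its own row.
# The result is a logical matrix; the NA cell stays NA.
hits <- df[c("v1", "v2", "v3")] == df$target

# TRUE counts as 1; na.rm = TRUE skips the comparisons that came out as NA
count <- rowSums(hits, na.rm = TRUE)
count   # 2 2 1
```

Without na.rm = TRUE the first row would return NA instead of 2, which is why the argument matters for data with missing values.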
Personally, I think the solution by @akrun is both elegant and the most efficient way to add the column count.
Another way to attach the count column to the end of df (I don't know whether it has the elegance you are looking for) is to use within, i.e.,
df <- within(df, count <- rowSums(df[1:4]==logical,na.rm = T))
such that you will get
> df
  v1 v2   v3  v4 logical count
1  1  1 <NA>   1       1     3
2  2  2    2   2       2     4
3  3  1    1   3       3     2
4  4  1    4 NaN       4     2
5  5  2    1   5       5     2

R: Assign Rank 1 to Predefined Largest Value

I have a dataset like this:
Value
5
4
2
1
I want the largest value to have the smallest rank and the lowest value to have the highest rank.
In this dataset, Value=1 should be recoded to 5 and Value=5 to 1.
However, because Value=3 is missing from my dataset, the rank function rank(-Value) only gives me this:
Value Rank
5 1
4 2
2 3
1 4
Is there any way in R to get something like this?
Value Rank
5 1
4 2
2 4
1 5
Try it like this:
df <- data.frame(Value = c(5, 4, 2, 1))
df$fact <- as.factor(df$Value)
df$Rank <- as.numeric(rev(levels(df$fact)))[df$fact]
> (df <- df[, -2])
Value Rank
1 5 1
2 4 2
3 2 4
4 1 5
You can do this by finding the max and min values of your vector and then looking up each value's position in the complete sequence that runs from the max down to the min.
v <- c(5,4,2)
x <- min(v)
y <- max(v)
y:x
match(v,y:x)
[1] 1 2 4
Using the levels of a factor as J.Win. suggests will work as long as there is a 1 in your vector but otherwise, the highest value will not have a rank of 1. Sorry, I do not have enough reputation to add this as a comment.
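Since the question asks for the largest value to receive rank 1, the lookup can be wrapped in a small helper that searches the sequence running downwards from the maximum. This is only a sketch: the function name `rank_with_gaps` is hypothetical, and it assumes the values are whole numbers.

```r
# Hypothetical helper: position of each value in the sequence max(v):min(v),
# so the largest value gets 1 and missing values leave gaps in the ranks.
# Assumes v contains whole numbers only.
rank_with_gaps <- function(v) {
  match(v, max(v):min(v))
}

rank_with_gaps(c(5, 4, 2, 1))   # 1 2 4 5
rank_with_gaps(c(5, 4, 2))      # 1 2 4 (still works when there is no 1)
```

For whole numbers this is equivalent to max(v) - v + 1, which avoids building the sequence when the range is very wide.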

R: loop matrix sort columns individually for specific rows

I want to sort my Matrix (U) columnwise for the rows, which have the same name. My (very large) matrix looks similar to this:
   1  2
1  5  6
1 -4  4
1  6 -2
2  7 -2
2 -2  3
2 -2 3
Now I want to loop through the matrix looking for the same rows and then sort the columns which have the same row.name resulting in this matrix:
   1  2
1 -4 -2
1  5  4
1  6  6
2 -2 -2
2  7  3
My code until now looks like this:
First step was the row count, which works:
z <- 1
for(i in (1:nrow(U))){
if(row.names(U)[i] != row.names(U)[i-1]){
z = (sum(row.names(U) == row.names(U)[i]))+1}}
Now I wanted to add after the row count a sorting function and I tried this for the first set of rows manually:
x <- 1
for(x in (1:ncol(U))){
U[1:3,x]<- U[do.call(order, lapply(x:NCOL(U), function(x) U[1:3, x]
However, this loop is very slow, and it only fills in the first column correctly.
Do you have a recommendation for how I could improve my sorting function while taking performance into account?
EDIT: I guess my first version was confusing. The first "column" shown above holds the row.names, so this example is a 5x2 matrix.
Here's an approach which just uses order() first by row name, then by each column in turn. Is this what you're after?
U <- matrix(c(5,6,-4,4,6,-2,7,-2,-2,3), byrow=TRUE, ncol=2, dimnames=list(c(1,1,1,2,2), c(1,2)))
apply(U, 2, function(j) j[order(rownames(U), j)])
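If the rows that share a name are not guaranteed to sit next to each other, a possible base R variant is to sort within groups using ave, which writes the sorted values back into the positions of each group. This is an alternative sketch rather than the method from the answer above; the names `sorted` and `col` are made up for illustration.

```r
# Same example matrix as in the question, with repeated row names
U <- matrix(c(5, 6, -4, 4, 6, -2, 7, -2, -2, 3), byrow = TRUE, ncol = 2,
            dimnames = list(c("1", "1", "1", "2", "2"), c("1", "2")))

# For every column, sort the values separately within each row-name group;
# ave() puts each sorted group back where that group's rows were
sorted <- apply(U, 2, function(col) ave(col, rownames(U), FUN = sort))
sorted
```

On the example this should reproduce the matrix requested in the question, with rows (-4, -2), (5, 4), (6, 6) for name 1 and (-2, -2), (7, 3) for name 2.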
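We can use data.table: convert to a data.table, group by the first column ('U'), then loop through the remaining columns and sort each one. Here m1 is assumed to hold the data with the row names stored in a column named U.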
library(data.table)
as.data.table(m1)[, lapply(.SD, sort), by = U]
An alternative using dplyr
df = read.table(textConnection("U 1 2
1 5 6
1 -4 4
1 6 -2
2 7 -2
2 -2 3"), header= TRUE)
library(dplyr)
df %>% group_by(U) %>% transmute(sort(X1),sort(X2))

I want to do data.frame1[1,] <- data.frame[1,], but I ran into trouble

Consider this sample data
df1<-data.frame(c(1,2,1),c(3,3,2),c(2,5,8))
df2<-data.frame("a","a","a")
The result that I want is
> df1
1 2 3
1 a a a
2 2 3 5
3 1 2 8
but after I do this: df1[1,] <- df2[1,]
> df1
1 2 3
1 1 1 1
2 2 3 5
3 1 2 8
Why? What should I do to get the result I want?
All values within a single data frame column must have the same type. The key thing here is that the values in df2 are factors, not characters (because stringsAsFactors = TRUE, which was the default before R 4.0.0). Factors have an underlying integer representation, so when you combine a factor and a numeric in the same vector, the factor is converted to its integer codes. The first level of a factor corresponds to 1, which is why a became 1.
Regarding the factor vs character type conversion, note the following:
c("a", 2, 3)
## "a" "2" "3"
c(factor("a"), 2, 3)
## 1 2 3
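The integer codes behind a factor can be inspected directly, which may make the conversion easier to see. The factor `f` below is a made-up example, not data from the question.

```r
# Hypothetical factor with two levels; levels are sorted alphabetically
f <- factor(c("b", "a", "b"))

levels(f)         # "a" "b"
as.integer(f)     # 2 1 2  (position of each value in levels(f))
as.character(f)   # "b" "a" "b"  (the labels, which is usually what you want)
```

Converting with as.character before combining with other data is one way to keep the labels instead of the codes.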
@Chaconne's answer gives a good explanation. If you really want to do what you say you want, you can do this:
df1<-data.frame(v1=c(1,2,1),v2=c(3,3,2),v3=c(2,5,8))
df2<-data.frame("a","a","a",stringsAsFactors=FALSE)
df1[1,] <- df2[1,]
but it will convert ("coerce") all of your data to character type, which is probably not what you want ...
Perhaps you want names(df1) <- df2[1,] ?

How many times does a pair of 1s occur in a vector

I have a problem.
I have a vector that consists of 0s and 1s, for example (011011111011100001111). In R, I need to count how many times two 1s in a row appear in the vector, three 1s in a row, four 1s in a row and so on. In this example vector I have 11 once, 111 once, 1111 once and 11111 once.
Thanks a lot, Peter
I'm assuming you have an actual vector like c(0, 1, 1, 0...).
Here is a solution using table and rle. I've also provided some longer sample data to make it a bit more interesting.
set.seed(1)
myvec <- sample(c(0, 1), 100, replace = TRUE)
temp <- rle(myvec)
table(temp$lengths[temp$values == 1])
#
# 1 2 3 4 6
# 15 8 1 2 1
If, indeed, you are dealing with a crazy-long character string of ones and zeroes, just use strsplit and follow the same logic as above.
myvec <- "00110111100010101101101000001001001110101111110011010000011010001001"
myvec <- as.numeric(strsplit(myvec, "")[[1]])
Here, I've converted to numeric, but that's just so you can use the same code as earlier. You can use rle on a character vector too.
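As a small check of the character-vector route, the sketch below applies rle to the example string from the question without converting to numeric. The variable names `bits`, `runs`, `one_runs` and `counts` are made up for illustration.

```r
# Example string from the question, split into single characters
bits <- strsplit("011011111011100001111", "")[[1]]

# Run-length encoding works on character vectors as well
runs <- rle(bits)

# Keep only the runs of "1" and drop runs shorter than two
one_runs <- runs$lengths[runs$values == "1"]   # 2 5 3 4
counts <- table(one_runs[one_runs >= 2])
counts   # run lengths 2, 3, 4 and 5 each occur once
```

Note that the comparison is against the string "1" here, since the values were never converted to numbers.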
rle is your friend:
vec <-c(0,1,1,0,1,1,1,1,1,0,1,1,1,0,0,0,0,1,1,1,1)
res <-data.frame(table(rle(vec)))
res[res$values==1,]
   lengths values Freq
6        1      1    0
7        2      1    1
8        3      1    1
9        4      1    1
10       5      1    1
