Group together identical elements in a vector using r - r

A very simple question but couldn't find an answer.
I have a vector of characters (for example - "a" "a" "a" "c" "c" "c" "b" "b" "b").
I would like to group together the elements to "a" "c" "b".
Is there a specific function for that?
Thank you

You can using sqldf librayr and using group by:
require(sqldf)
vector<- data.frame(v=c("a", "a", "a", "c", "c", "c", "b", "b", "b"))
sqldf("SELECT v from vector group by v")

Here you go
vector <- c("a", "a", "a", "c", "c", "c", "b", "b", "b")
sorted <- sort(vector)

If you just want the unique elements then there is, well, unique.
> unique(c("a", "a", "a", "c", "c", "c", "b", "b", "b"), sort=TRUE)
[1] "a" "c" "b"
Update
With the new description of the problem, this would be my solution
shifted <- c(NA, vector[-length(vector)])
vector[is.na(shifted) | vector != shifted]
I shift the vector one to the right, putting NA at the front because I have no better idea of what to put there, and then pick out the elements that are not NA and not equal to the previous element.
If the vector contains NA, some additional checks will be needed. It is not obvious how to put something that isn't the first element in the first position of the shifted vector without knowing a bit more. For example, you could extract all the elements form the vector and pick one that isn't the first, but that would fail if the vector only contains identical elements.
Another question now: is there a smarter way to implement the shift operation? I couldn't think of one, but there might be an more canonical solution.

Related

R: Is there a method in R, to substiute the values of a vector using a dictionary (2 column dataframe with old and new value)

Is there a method in R, to substitute the values of a vector using a dictionary (2 column dataframe with old and new value)
The only method I know is to extract the old value into a dataframe and merge it with, what I call,the dictionary (which is a two column dataframe with old and new values). Afterwards reassign the new value to the original old value. However, it seems when using merge (at least since R v4.1, the order of the x value is not maintained, so I am using join now which keeps the original order of dataframe x intact. I am thinking that there must be an easier way, I just have not found it. Hope this is understandable, I appreciate any help.
cheers Hermann
You could use a named character vector as a dict for replacement by unquoting with !!! inside of dplyr::recode. If you have your "dict" stored as a two-column dataframe, then tidyr::deframe might be handy.
library(tidyverse)
x <- c("a", "b", "c")
dict <- tribble(
~old, ~new,
"a", "d",
"b", "e",
"c", "f"
)
recode(x, !!!deframe(dict))
#> [1] "d" "e" "f"
Created on 2021-06-14 by the reprex package (v1.0.0)
You can use match to substitute the values of a vector using a dictionary:
D$new[match(x, D$old)]
#[1] "d" "e" "f"
You can also use the names to get the new values:
L <- setNames(D$new, D$old)
L[x]
#"d" "e" "f"
Data:
x <- c("a", "b", "c")
D <- data.frame(old = c("a", "b", "c"), new = c("d", "e", "f"))

Returning the values of a list based on "two" parameters

Very new to R. So I am wondering if you can use two different parameters to get the position of both elements from a list. See the below example...
x <- c("A", "B", "A", "A", "B", "B", "C", "C", "A", "A", "B")
y <- c(which(x == "A"))
[1] 1 3 4 9 10
x[y]
[1] "A" "A" "A" "A" "A"
x[y+1]
[1] "B" "A" "B" "A" "B"
But I would like to return the positions of both y and y+1 together in the same list. My current solution is to merge the two above lists by row number and create a dataframe from there. I don't really like that and was wondering if there is another way. Thanks!
I dont know what exactly you want, but this could help:
newY = c(which(x == "A"),which(x == "A")+1)
After that you can sort it with
finaldata <- newY[order(newY)]
Or you do both in one step:
finaldata <- c(which(x == "A"),which(x == "A")+1)[order(c(which(x == "A"),which(x == "A")+1))]
Then you could also delete duplicates if you want to. Please tell me if this is what you wanted.

How to merge only specific elements of a vector in R

This is probably pretty straightforward but I'm really stuck: Let's say that I have a vector=c("a", "b", "c","d","e"). How can I concatenate only some specific elements? For example, how do I merge "b" and "c", which will lead me to the vector=c("a","bc","d","e") ?
Thank you
We can do
i1 <- vector %in% c("b", "c")
c(vector[!i1], paste(vector[i1], collapse=""))

Order a numeric vector by length in R

I've got two numeric vectors that I want to order by the length of the their observations, i.e., the number of times each observation appears.
For example:
x <- c("a", "a", "a", "b", "b", "b", "b", "c", "e", "e")
Here, b occurs four times, a three times, e two and c one time. I'd like my result in this order.
ans <- c("b", "b", "b", "b", "a", "a", "a", "e", "e", "c")
I´ve tried this:
x <- x[order(-length(x))] # and some similar lines.
Thanks
Using rle you can get values lenghts. You order lengths, and use values to recreate the vector again using the new order:
xx <- c('a', 'a', 'a', 'b', 'b', 'b','b', 'c', 'e', 'e')
rr <- rle(xx)
ord <- order(rr$lengths,decreasing=TRUE)
rep(rr$values[ord],rr$length[ord])
## [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"
You may also use ave when calculating the lengths
x[order(ave(x, x, FUN = length), decreasing = TRUE)]
# [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"

How to calculate how many times vector appears in a list? in R

I have a list of 10,000 vectors, and each vector might have different elements and different lengths. I would like to know how many unique vectors I have and how often each unique vector appears in the list.
I guess the way to go is the function "unique", but I don't know how I could use it to also get the number of times each vector is repeated.
So what I would like to get is something like that:
"a" "b" "c" d" 301
"a" 277
"b" c" 49
being the letters, the contents of each unique vector, and the numbers, how often are repeated.
I would really appreciate any possible help on this.
thank you very much in advance.
Tina.
Maybe you should look at table:
Some sample data:
myList <- list(A = c("A", "B"),
B = c("A", "B"),
C = c("B", "A"),
D = c("A", "B", "B", "C"),
E = c("A", "B", "B", "C"),
F = c("A", "C", "B", "B"))
Paste your vectors together and tabulate them.
table(sapply(myList, paste, collapse = ","))
#
# A,B A,B,B,C A,C,B,B B,A
# 2 2 1 1
You don't specify whether order matters (that is, is A, B the same as B, A). If it does, you can try something like:
table(sapply(myList, function(x) paste(sort(x), collapse = ",")))
#
# A,B A,B,B,C
# 3 3
Wrap this in data.frame for a vertical output instead of horizontal, which might be easier to read.
Also, do be sure to read How to make a great R reproducible example? as already suggested to you.
As it is, I'm just guessing at what you're trying to do.

Resources