Remove duplicates and its intrinsic value [duplicate] - r

Consider the following vector:
just_a_random_vector <- c("A", "B", "B", "C", "C", "D")
If a value has duplicates, I want to drop every copy of it, including the first occurrence, so that only the values that appear exactly once remain:
# A D
Is there a way to get this output?

Call duplicated() in both the forward and the reverse direction to get two logical vectors, combine them with OR (|) so that every copy of a repeated value is flagged TRUE, negate (!) the result, and use it to subset the vector:
just_a_random_vector[!(duplicated(just_a_random_vector)|
duplicated(just_a_random_vector, fromLast = TRUE))]
[1] "A" "D"
Another option is table(): count how often each value occurs, keep the names of the values whose count equals 1, and subset the vector with %in%:
just_a_random_vector[just_a_random_vector %in%
names(which(table(just_a_random_vector) == 1))]
[1] "A" "D"
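For reuse, the first approach can be wrapped in a small function. The sketch below is only an illustration: the name keep_singletons is hypothetical and does not come from the answer. It also shows one difference between the two approaches that is worth checking on your own data: table() ignores NA by default, so the table-based filter drops a lone NA, whereas the duplicated()-based filter keeps it.

```r
# Hypothetical helper (name chosen for illustration): keep only the values
# that occur exactly once in x.
keep_singletons <- function(x) {
  x[!(duplicated(x) | duplicated(x, fromLast = TRUE))]
}

just_a_random_vector <- c("A", "B", "B", "C", "C", "D")
singletons <- keep_singletons(just_a_random_vector)
print(singletons)

# With a missing value, the two approaches behave differently.
with_na <- c("A", "B", "B", NA)
# duplicated() treats the single NA like any other unique value.
via_duplicated <- keep_singletons(with_na)
# table() skips NA by default, so the NA never appears among the names.
via_table <- with_na[with_na %in% names(which(table(with_na) == 1))]
```

Which behaviour is preferable depends on whether a single NA should count as a unique value in your data.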

Related

Find which elements (row and column) differ between two data frames in R? [duplicate]

Is there a simple way to find which elements (row and column) differ between two data frames in R? I know I can find which rows differ using setdiff() or dplyr::anti_join(). I also know this is possible with other software, but I'd like to know whether I can do it inside R (or RStudio).
Have a look at the waldo package; its compare() function reports the differing cells:
> library(waldo)
> a = data.frame(x=1:3, y=c("a", "b", "c"))
> b = data.frame(x=1:3, y=c("a", "B", "c"))
> compare(a, b)
old vs new
           y
  old[1, ] a
- old[2, ] b
+ new[2, ] B
  old[3, ] c
`old$y`: "a" "b" "c"
`new$y`: "a" "B" "c"
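If installing a package is not an option, base R can report the positions of differing cells directly. This is a minimal sketch, assuming both data frames have the same dimensions and column order and contain no NA values; stringsAsFactors = FALSE is set explicitly so that the text columns are compared as character vectors.

```r
a <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)
b <- data.frame(x = 1:3, y = c("a", "B", "c"), stringsAsFactors = FALSE)

# a != b compares the data frames cell by cell and returns a logical matrix;
# which(..., arr.ind = TRUE) turns the TRUE cells into row/column indices.
diffs <- which(a != b, arr.ind = TRUE)
print(diffs)

# Translate the column index into a column name.
diff_cols <- names(a)[diffs[, "col"]]
print(diff_cols)
```

Here the only mismatch is in row 2 of column y, which agrees with the waldo report.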

I Want Frequency Value with the Maximum Sample Using R [duplicate]

I want the frequency that corresponds to the maximum value in df:
df <- data.frame(Freq = c(1,2,3,4,5,6,7,8,9,10), Valu = c(10,5,11,7,13,15,9,6,12,12))
apply(df, 2, which.max)
What I want
I want it to print just the Freq of the maximum Valu, which is 6.
We could use which.max on the 'Valu' column to get the index of its maximum, and use that index to extract ([) the corresponding 'Freq' value
with(df, Freq[which.max(Valu)])
#[1] 6
If the column names may change, index the columns by position instead
df[[1]][which.max(df[[2]])]
[1] 6
Alternatively, sort by 'Valu' in decreasing order with order and take the first element
df[[1]][order(-df[[2]])][1]
[1] 6
Looping over the columns with apply (MARGIN = 2) calls which.max on each column separately, so it returns the index of the maximum of every column (10 for Freq and 6 for Valu) rather than the Freq value at the maximum Valu.
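The difference between the two calls can be checked side by side. The sketch below reuses df from the question; the last assignment is an addition that is not part of the answer and assumes you want every Freq that attains the maximum when there are ties, since which.max returns only the first position.

```r
df <- data.frame(Freq = c(1,2,3,4,5,6,7,8,9,10),
                 Valu = c(10,5,11,7,13,15,9,6,12,12))

# One index per column: the position of each column's own maximum.
per_column <- apply(df, 2, which.max)
print(per_column)

# The Freq value located at the position of the maximum Valu.
best_freq <- with(df, Freq[which.max(Valu)])
print(best_freq)

# Assumption: if several rows share the maximum Valu, return all of them.
all_best <- df$Freq[df$Valu == max(df$Valu)]
```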

Subset a data.table based on column value where column is selected based on sequence [duplicate]

Let's say I have the data.table below:
library(data.table)
DT <- data.table(x=sample(letters, 1e6, TRUE), y=rnorm(1e6), v=runif(1e6))
Now I want to subset DT to the rows where the first column is one of letters[1:2].
If I choose the column by name, this is straightforward:
DT[x %in% letters[1:2]]
However, I want to select the column by position, i.e. the first column, the 4th column, etc.
The code below doesn't work:
DT[1 %in% letters[1:2]]
Any pointer to the right syntax for selecting a column by position would be helpful.
You can extract the column by position with [[ and use the resulting vector in the filter:
DT[DT[[1]] %in% letters[1:2]]
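To see why the original attempt fails, note that 1 %in% letters[1:2] asks whether the number 1 is one of "a" or "b", which is FALSE, so no column is ever consulted. A small sketch, assuming the data.table package is installed; the tiny table and the variable names pos and keep are made up for illustration:

```r
library(data.table)

# Small deterministic table instead of the random one in the question.
DT <- data.table(x = c("a", "b", "c", "a", "d"), y = 1:5)

# The failing filter reduces to a single FALSE, not a per-row test.
failing_condition <- 1 %in% letters[1:2]
print(failing_condition)

pos <- 1L             # hypothetical: position of the column to filter on
keep <- letters[1:2]  # hypothetical: values to keep

# DT[[pos]] pulls the column out as a plain vector, so %in% is evaluated
# once per row and the result can be used as a row filter.
res <- DT[DT[[pos]] %in% keep]
print(res)
```

Keeping the position in a variable makes it easy to switch to another column without touching the filter itself.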

Generate all the possible combinations of a vector [duplicate]

I would like to generate all possible combinations of a vector, for example,
v1 <- c("A", "Aa", "B")
and all the possible combinations would be
"A-Aa" "Aa-A" "B-A" "B-Aa" "A-B" "Aa-B"
I saw a post, How can I generate all the possible combinations of a vector, but it does not give correct results for v1. The same happens with v1 <- c("Testis_NOA_ID","Testis_NOA_IDSt"), which returns only "Testis_NOA_ID-Testis_NOA_IDSt", but I also expect "Testis_NOA_IDSt-Testis_NOA_ID".
I checked some other vectors, such as v1 <- c("a","b","c") or v1 <- c("normal", "cancer"), for which it gives correct results. The problem appears when the vector's elements repeat part of each other, as in v1 <- c("Testis_NOA_ID","Testis_NOA_IDSt"). How can I fix it?
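No answer is recorded for this question, so the following is only one possible approach, not the method from the linked post. It builds every ordered pair with expand.grid and drops the pairs whose two members are equal. Because the comparison uses != on whole strings, it makes no difference that one element is a prefix of another. The column names first and second are arbitrary, and the sketch assumes the elements of v1 are distinct.

```r
v1 <- c("A", "Aa", "B")

# All ordered pairs, including each element paired with itself.
pairs <- expand.grid(first = v1, second = v1, stringsAsFactors = FALSE)

# Drop the self-pairs by comparing complete strings.
pairs <- pairs[pairs$first != pairs$second, ]

combos <- paste(pairs$first, pairs$second, sep = "-")
print(combos)

# The same steps applied to the vector that caused trouble in the question.
v2 <- c("Testis_NOA_ID", "Testis_NOA_IDSt")
pairs2 <- expand.grid(first = v2, second = v2, stringsAsFactors = FALSE)
pairs2 <- pairs2[pairs2$first != pairs2$second, ]
combos2 <- paste(pairs2$first, pairs2$second, sep = "-")
```

The order of the results follows expand.grid, so it may differ from the order listed in the question.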

Looping a column in a dataframe while changing its value [duplicate]

I'm quite new to R (I picked it up less than two weeks ago) and was wondering why this doesn't work. Basically, I'm trying to loop through a newly added column, compare the value of another column in the same row, and, based on a condition, change the value in the column I'm looping over.
myDataFrame["column2"] <- "a"
refValue = x
for(i in nrow(myDataFrame){
if(column1[i] >= refValue){
column2[i] <- "b"
}}
I tried to run it, but the values don't change:
View(myDataFrame)
At the moment, myDataFrame is:
column1      column2
someValue    a
someValue    a
someValue    a
someValue    a
After the loop finishes, I want some of the 'a's changed to 'b's, depending on the value of column1 in the corresponding row.
There is no need for a loop here. You can replace your code with a vectorized ifelse:
myDataFrame$column2 <- with(myDataFrame, ifelse(column1 >= x, "b", "a"))
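For comparison, here is what a working loop could look like next to the vectorized version. This is a sketch with made-up data: the column1 values and refValue are hypothetical. Compared with the loop in the question, it iterates over seq_len(nrow(...)) so that every row is visited, since for (i in nrow(...)) runs only once with i equal to the row count, and it reads and writes the columns through the data frame rather than as standalone variables.

```r
# Hypothetical data and threshold, chosen only for illustration.
myDataFrame <- data.frame(column1 = c(1, 5, 3, 8))
refValue <- 4

# Vectorized version, as in the answer.
vectorized <- ifelse(myDataFrame$column1 >= refValue, "b", "a")

# Equivalent explicit loop.
myDataFrame$column2 <- "a"
for (i in seq_len(nrow(myDataFrame))) {
  if (myDataFrame$column1[i] >= refValue) {
    myDataFrame$column2[i] <- "b"
  }
}
print(myDataFrame)
```

Both versions produce the same column; the vectorized form is shorter and avoids the indexing pitfalls of the loop.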
