R, Removing duplicate along with the original value [duplicate]

This question already has answers here:
R: Removing duplicate elements in a vector [duplicate]
(4 answers)
Closed 6 years ago.
I'm trying to use R to delete all duplicated values without keeping the original value.
For example, I have the following list:
1. Car
2. Car
3. Food
The unique() function will remove the duplicate and keep the original:
1. Car
2. Food
I need to remove all the duplicated values and keep only the values that occurred once:
1. Food
Thank you in advance

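We can use duplicated() in both directions, so that every occurrence of a repeated value is flagged and dropped: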
df1[!(duplicated(df1$col)|duplicated(df1$col, fromLast=TRUE)),, drop=FALSE]
# col
#3. Food
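As a minimal sketch of why the answer above works, the following rebuilds the question's example in a hypothetical data frame named df1 with a column col (both names are taken from the answer; the data itself is an assumption). It also shows one possible alternative based on table(), which is not part of the original answer.

```r
# Hypothetical data frame matching the example in the question
df1 <- data.frame(col = c("Car", "Car", "Food"), stringsAsFactors = FALSE)

# duplicated() marks the second and later occurrences of a value;
# with fromLast = TRUE it marks all but the last occurrence instead.
# Combining both marks every member of a repeated group.
is_dup <- duplicated(df1$col) | duplicated(df1$col, fromLast = TRUE)

# Keep only the rows whose value occurs exactly once
singletons <- df1[!is_dup, , drop = FALSE]
print(singletons)

# Alternative sketch: count occurrences and keep values seen exactly once
counts <- table(df1$col)
once <- df1[df1$col %in% names(counts)[counts == 1], , drop = FALSE]
```

The drop = FALSE argument keeps the result a data frame even when only one column remains.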

Related

Count occurrences of value in a set of variables in R (per column) [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 1 year ago.
I have this data and I want to figure out a way to know how many ones and how many zeros are in each column (ie Arts and Crafts). I have been trying different things but it hasn't been working. Does anyone have any suggestions?
You can use the table() function in R, which counts how often each distinct value occurs in your data. Here I have also used unlist() to convert a list to a vector.
df1 <- read.csv("Your_CSV_file_name_here.csv")
table(unlist(df1$ArtsAndCrafts))
If you want to count the zeros and ones row-wise instead, you can refer to this question on Stack Overflow.
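To count zeros and ones in every column at once, one possible approach is to apply table() column by column. The sketch below uses made-up data; the column names ArtsAndCrafts and Sports are assumptions standing in for the asker's real columns. For a plain data frame column, unlist() is not strictly needed, since the column is already a vector.

```r
# Hypothetical data: each column is a 0/1 indicator for an activity
df1 <- data.frame(
  ArtsAndCrafts = c(1, 0, 1, 1, 0),
  Sports        = c(0, 0, 1, 0, 0)
)

# Counts for a single column
table(df1$ArtsAndCrafts)

# Counts for every column at once; fixing the factor levels keeps
# a row for 0 and a row for 1 even if a column lacks one of them
counts <- sapply(df1, function(x) table(factor(x, levels = c(0, 1))))
print(counts)

# For strictly 0/1 data, comparing and summing gives the same numbers
ones  <- colSums(df1 == 1)
zeros <- colSums(df1 == 0)
```

The result of sapply() here is a matrix with one row per value (0 and 1) and one column per variable.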

How do I shorten row names in R? Please read below [duplicate]

This question already has answers here:
R - remove anything after comma from column
(5 answers)
Closed 2 years ago.
I have a table of values in R where the row names are very large. I want to shorten them. My row names look like this:
GSM1051550_7800246087_R02C01
I want to rename every row to only have the first part of the name, i.e., GSM1051550. How can I do this in R?
Building on jay.sf's comment (assuming your table is named ABC):
row.names(ABC) <- sub("\\_.*", "", row.names(ABC))
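A small self-contained sketch of the same idea follows. The table ABC and its value column are made up for illustration; the pattern "_.*" is a slightly simplified form of the one above, since the underscore does not need escaping.

```r
# Hypothetical table with long row names like the one in the question
ABC <- data.frame(
  value = c(1.2, 3.4),
  row.names = c("GSM1051550_7800246087_R02C01",
                "GSM1051551_7800246087_R03C01")
)

# Drop the first underscore and everything after it
row.names(ABC) <- sub("_.*", "", row.names(ABC))
print(row.names(ABC))

# Row names must stay unique; if two names share a prefix the
# assignment above fails, so it may be worth checking beforehand
shortened <- sub("_.*", "", c("A_1", "A_2"))
has_clash <- anyDuplicated(shortened) > 0
```

If has_clash is TRUE for your real names, storing the shortened names in an ordinary column instead of the row names avoids the uniqueness requirement.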

Select duplicate rows by comparing multiple columns in R [duplicate]

This question already has answers here:
Find duplicate values in R [duplicate]
(5 answers)
Closed 4 years ago.
I have an issue selecting duplicate rows in R. My data frame has 14 columns and 1 million rows. I have to compare rows, i.e. find identical rows, which count as duplicates. I want to get the duplicate rows by this method. My data frame looks like this:
Data frame sample
The last two rows are identical, so they need to be marked with a flag value of 1.
I don't know how to start with this.
I have tried this code:
df <- unique(data[,1:97]) # this gives me the unique rows, not the number of duplicates
dim(data[duplicated(data),])[1] # this gives me the number of duplicates, but not their ids
I need to know the duplicate ids.
My intention is to check each row and return the total number of duplicate rows or their line numbers.
Look into the duplicated() function. It can be used to remove duplicated rows or, conversely, to keep only them.
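A minimal sketch of how duplicated() could produce both the flag column and the row numbers the question asks for. The data frame below and its columns id, name and score are invented for illustration; the real data has 14 columns.

```r
# Hypothetical small data frame; rows 3 and 4 are identical
data <- data.frame(
  id    = c(1, 2, 3, 3),
  name  = c("a", "b", "c", "c"),
  score = c(10, 20, 30, 30)
)

# Mark every row that has an identical twin, including the first occurrence
dup_all <- duplicated(data) | duplicated(data, fromLast = TRUE)

# Flag column: 1 for duplicated rows, 0 otherwise
data$flag <- as.integer(dup_all)

# Row numbers of all rows that belong to a duplicated group
dup_rows <- which(dup_all)

# Number of extra copies (rows beyond the first of each group)
n_dups <- sum(duplicated(data[, c("id", "name", "score")]))
```

To compare only some of the columns, pass a subset such as data[, c("id", "name")] to duplicated() instead of the whole data frame.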

sequential counts of a column based on unique identifier [duplicate]

This question already has answers here:
How do you find duplicated elements of a vector that outputs the integer appearance of that duplicate instead of a logical?
(1 answer)
Add column with order counts
(2 answers)
Closed 7 years ago.
I wish to create a new column of numbers (count) that numbers the occurrences of each unique identifier (unique_id) in the first column (using R).
For example:
unique_id <- c("1a","2b","2b","1c","1c","1c","3d","3d","1e","3b","3b","3b","3b")
count <- c(1,1,2,1,2,3,1,2,1,1,2,3,4)
data <- data.frame(unique_id,count)
I've had a good search, but because I'm unsure of the correct terminology for this (still new to R), I was unable to find any relevant info.
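One possible way to build such a column in base R is ave() combined with seq_along(). This is only a sketch and is not taken from the linked answers; it numbers occurrences per identifier across the whole vector, which matches the example because equal identifiers sit next to each other.

```r
unique_id <- c("1a","2b","2b","1c","1c","1c","3d","3d","1e","3b","3b","3b","3b")
data <- data.frame(unique_id, stringsAsFactors = FALSE)

# ave() applies FUN within each group of unique_id;
# seq_along() numbers the elements of a group 1, 2, 3, ...
data$count <- ave(seq_along(data$unique_id), data$unique_id, FUN = seq_along)

print(data)
```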

How do I find how many times a value has been repeated in a column - R [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Count unique values in R
I have just one column full of names, and I need to know how many times each name is on this column.
I can do a
summary(dfUL)
where dfUL is my user list data frame
This will give me a summary with the number of times a particular value is repeated, but it will only do it for the top 6. How can I do that for the entire data frame?
Have you already tried table(dfUL)?
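Another possibly useful function is match(), which returns the position of the first match rather than a count.
match(x, dfUL$somecol) # where x is the value in somecol you are looking for; returns the index of its first occurrence
match(max(dfUL$somecol), dfUL$somecol) # returns the index of the first row with the maximum value of somecol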
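For the counting question itself, a short sketch with table() on made-up data follows. The data frame dfUL is named as in the question, but its column name and contents are assumptions.

```r
# Hypothetical user list with a single column of names
dfUL <- data.frame(name = c("ann", "bob", "ann", "cat", "ann", "bob"),
                   stringsAsFactors = FALSE)

# table() counts every distinct value, not just the most frequent ones
counts <- table(dfUL$name)

# Sort from most to least frequent and convert to a data frame
counts_df <- as.data.frame(sort(counts, decreasing = TRUE),
                           stringsAsFactors = FALSE)
names(counts_df) <- c("name", "n")
print(counts_df)
```

Converting the table to a data frame makes it easier to filter, for example to keep only names that occur more than once.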
