Subsetting number of observations [duplicate] - r

This question already has answers here:
Remove last N rows in data frame with the arbitrary number of rows
(4 answers)
Closed 2 years ago.
I have a dataset consisting of 250 observations. I want to select all observations except the last one. I know I can do this with the following code, but how can I do it if I do not know the exact number of observations?
data(mtcars)
## selecting all observations except the last
mtcars_lag <- mtcars[1:31, ]
## skipping first observation and selecting all
mtcars_forward <- mtcars[2:32, ]

nrow() returns the number of observations (rows) in the dataset, so mtcars_subset <- mtcars[1:(nrow(mtcars)-1), ] selects all observations except the last one.
EDIT: Added parentheses in line with the suggestion from MrFlick.
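A minimal sketch of the nrow() approach, using the built-in mtcars dataset from the question. The head() and tail() variants with a negative n are a swapped-in alternative, not part of the original answer; they may be more convenient because they also behave sensibly when the data frame has only one row.

```r
# mtcars ships with R, so no data loading is required
n <- nrow(mtcars)

# All observations except the last: parentheses are needed because
# ":" binds more tightly than "-"
mtcars_lag <- mtcars[1:(n - 1), ]

# All observations except the first
mtcars_forward <- mtcars[2:n, ]

# Alternative (not from the original answer): a negative n drops rows
mtcars_lag2 <- head(mtcars, -1)      # drop the last row
mtcars_forward2 <- tail(mtcars, -1)  # drop the first row
```

Neither version hard-codes the row count, so the same lines work for a dataset of 250 observations or any other size.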

Related

Count occurrences of value in a set of variables in R (per column) [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 1 year ago.
I have this data and I want to find out how many ones and how many zeros are in each column (e.g., Arts and Crafts). I have tried different things, but nothing has worked. Does anyone have any suggestions?
You can use the table() function in R, which counts how often each distinct value occurs. I have also used unlist() here to convert a list to a vector.
df1 <- read.csv("Your_CSV_file_name_here.csv")
table(unlist(df1$ArtsAndCrafts))
If you want to count the zeros and ones row-wise instead, you can refer to this question on Stack Overflow.
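To get the counts for every column at once rather than one column at a time, one option is colSums() on a logical comparison. This is only a sketch: the data frame below is made up for illustration (the Sports column is hypothetical), and it assumes every column contains only 0s and 1s.

```r
# Hypothetical stand-in for the CSV data in the question
df1 <- data.frame(
  ArtsAndCrafts = c(1, 0, 1, 1, 0),
  Sports        = c(0, 0, 1, 0, 0)
)

# df1 == 1 is a logical matrix; colSums() counts the TRUEs per column
ones  <- colSums(df1 == 1)
zeros <- colSums(df1 == 0)

# One row per value, one column per variable
counts <- rbind(zeros, ones)
counts
```

If the data contains missing values, passing na.rm = TRUE to colSums() keeps them from turning the counts into NA.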

Calculate mean for recurring observations of one column but with differing values of other column [duplicate]

This question already has answers here:
Mean per group in a data.frame [duplicate]
(8 answers)
Closed 1 year ago.
I have this data frame of about 35'000 observations. The problem is that there are about 5'000 occurrences (as exemplified by the first two and last two rows of the image) where I have two observations relating to the same COD_DOM but with differing values of RENDIMENTO. I would like to calculate the average RENDIMENTO for every COD_DOM that appears twice, and thus keep only one observation with the average value.
If your data.frame is just these two columns, you should be able to use:
library(dplyr)
# df stands for your own data frame
new_df <- df %>%
  group_by(COD_DOM) %>%
  summarize(RENDIMENTO = mean(RENDIMENTO))
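For readers without dplyr, roughly the same result can be sketched in base R with aggregate(). This is a swapped-in alternative, not the answer's method, and the values below are invented for illustration; only the column names COD_DOM and RENDIMENTO come from the question.

```r
# Made-up sample: COD_DOM 101 and 103 each appear twice
df <- data.frame(
  COD_DOM    = c(101, 101, 102, 103, 103),
  RENDIMENTO = c(1000, 2000, 1500, 800, 1200)
)

# Mean RENDIMENTO per COD_DOM; groups with one row keep their value
new_df <- aggregate(RENDIMENTO ~ COD_DOM, data = df, FUN = mean)
new_df
```

As with the dplyr version, any columns other than these two are dropped from the result.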

Remove rows with at least 1 value less than a certain limit in R [duplicate]

This question already has answers here:
R keep rows with at least one column greater than value
(3 answers)
Delete rows in R if a cell contains a value larger than x
(1 answer)
Closed 2 years ago.
I have a matrix, and I want to remove all rows that contain at least one element less than a given value, 3 in this example. Sample data:
A <- matrix(c(10,2,4,8,5,4,8,10,5), byrow=TRUE, ncol=3)
# Remove rows that contain at least one value less than 3
final_matrix <- matrix(c(8,5,4,8,10,5), byrow=TRUE, ncol=3)
How do I get from the initial matrix A to the final matrix? My real matrix contains thousands of rows and tens of columns; this is a toy example. I tried A=A[A>3,] but I get the error "logical subscript too long".
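One possible approach, sketched here under the assumption that a row should be kept only when all of its values are at least 3. The attempt in the question fails because A > 3 yields one logical value per element (9 of them), which is longer than the number of rows (3) it is being used to index. Reducing the comparison to one value per row avoids that; the name limit is introduced only for this example.

```r
A <- matrix(c(10, 2, 4, 8, 5, 4, 8, 10, 5), byrow = TRUE, ncol = 3)
limit <- 3  # hypothetical name for the threshold

# A < limit is a logical matrix; rowSums() counts offending values per row
keep <- rowSums(A < limit) == 0

# drop = FALSE keeps the result a matrix even if only one row survives
final_matrix <- A[keep, , drop = FALSE]
final_matrix
```

Because rowSums() is vectorised, this should scale to matrices with thousands of rows and tens of columns.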

Select duplicate rows by comparing multiple columns in R [duplicate]

This question already has answers here:
Find duplicate values in R [duplicate]
(5 answers)
Closed 4 years ago.
I have an issue selecting duplicate rows in R. A data frame has 14 columns and 1 million rows. I need to compare rows, i.e., find the rows that are identical and are therefore duplicates. I want to retrieve the duplicate rows this way. My data frame looks like this:
Data frame sample
The last two rows are identical, so they need to be marked with a flag value of 1.
I don't know how to start with this.
I have tried this code:
df <- unique(data[,1:97]) # this gives me the unique set, not the number of duplicates
dim(data[duplicated(data),])[1] # this gives me the number of duplicates, but not the ids
I need to know the duplicate ids.
My intention is to check each row and return the total number of duplicate rows or their line numbers.
Look into the duplicated() function. It returns a logical vector marking each row that repeats an earlier row, which can be used either to remove the duplicated rows or, inversely, to keep only them.
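A minimal sketch of how duplicated() could be used to flag duplicate rows and report their row numbers. The small data frame and its column names (id, value, flag) are invented for illustration. Note that duplicated() on its own marks only the second and later copies; combining it with fromLast = TRUE marks every member of a duplicate group.

```r
# Made-up sample in which the last two rows are identical
df <- data.frame(
  id    = c(1, 2, 3, 3),
  value = c("a", "b", "c", "c")
)

# TRUE only for rows that repeat an earlier row
dup_later <- duplicated(df)

# TRUE for every row that has an identical twin anywhere in the data
dup_all <- duplicated(df) | duplicated(df, fromLast = TRUE)

dup_rows <- which(dup_all)  # row numbers of all duplicated rows
n_dups   <- sum(dup_later)  # number of redundant copies

# Flag column: 1 for duplicated rows, 0 otherwise
df$flag <- as.integer(dup_all)
df
```

The flag is added last so that it does not take part in the comparison itself.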

Find total frequency of number In Column [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 5 years ago.
I am new to R. How can I find the frequency of each number in a column that contains multiple numbers? I just want simple code. You can imagine that the dataset is named Oct-TT. Thanks.
Here is the answer:
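# Example data: 20 values drawn at random from 10 to 20
df <- data.frame(Numbers = sample(10:20, 20, replace = TRUE))
View(df)  # opens the data viewer in an interactive session
# Frequency of each number, returned in columns Var1 and Freq
as.data.frame(table(df$Numbers))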
