How to select alternating blocks of 12 rows from a data frame in R [duplicate]

This question already has an answer here:
Selecting multiple parts of a list
(1 answer)
Closed 5 years ago.
Suppose I have a data frame containing 192 rows and I want to select alternating blocks of 12 rows,
i.e. select rows 1 to 12, then rows 25 to 36, then rows 49 to 60, and so on.
How can I do that in R?

Using the iris data as an example.
Simply use iris[1:12,] for the first 12 rows:
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa
#2 4.9 3.0 1.4 0.2 setosa
#3 4.7 3.2 1.3 0.2 setosa
#4 4.6 3.1 1.5 0.2 setosa
#5 5.0 3.6 1.4 0.2 setosa
#6 5.4 3.9 1.7 0.4 setosa
#7 4.6 3.4 1.4 0.3 setosa
#8 5.0 3.4 1.5 0.2 setosa
#9 4.4 2.9 1.4 0.2 setosa
#10 4.9 3.1 1.5 0.1 setosa
#11 5.4 3.7 1.5 0.2 setosa
#12 4.8 3.4 1.6 0.2 setosa
Use iris[25:36,] for rows 25 to 36, and so on.
Replace iris with the name of your data frame. The index before the comma selects rows and the index after it selects columns, so iris[,1:3] selects the first 3 columns of the data frame.

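You can do this in one vectorized step by relying on R's recycling of logical indices (df is your data frame). The pattern of 12 TRUE followed by 12 FALSE is recycled along the rows, so blocks of 12 rows are alternately kept and dropped: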
df[rep(c(TRUE, FALSE), each = 12),]
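A minimal sketch that checks the recycling approach on a hypothetical 192-row data frame (the id column is invented here purely to record the original row number). It also shows an equivalent version that computes the block number explicitly, which makes the logic visible:

```r
# Hypothetical data frame with 192 rows; `id` records the original row number
df <- data.frame(id = 1:192)

# Logical pattern: keep 12 rows, skip 12 rows; R recycles it along the rows
keep <- rep(c(TRUE, FALSE), each = 12)
sel_recycled <- df[keep, , drop = FALSE]

# Equivalent explicit version: number the blocks of 12 rows starting at 0
# and keep the rows that fall in even-numbered blocks
block <- (seq_len(nrow(df)) - 1) %/% 12
sel_explicit <- df[block %% 2 == 0, , drop = FALSE]

# Inspect which original rows were kept
head(sel_recycled$id, 15)
```

drop = FALSE is only needed here because the example data frame has a single column; with several columns, df[keep, ] already returns a data frame.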

Related

Subsetting a data frame in R

I have a data frame and I want to create a subset, <Frame>, of just the species variable and display the first five records. How can I subset it in R?
There are 10 rows and 7 columns. One column is species:
netID - fishID - species - tl - wtag - scale
Use select from dplyr, then head with n = 5 (head returns six rows by default):
library(dplyr)
head(select(dataframe, species), 5)
Assuming your data frame is called df, you can subset with dplyr:
library(dplyr)
df <- iris[1:10,]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5.0 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
newdf <- df %>% select(Species) %>% slice(1:5)
Here select keeps only the Species column, and slice then picks the range of rows you need. The output of newdf is:
Species
1 setosa
2 setosa
3 setosa
4 setosa
5 setosa
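For comparison, a base R sketch of the same subset that needs no packages. It assumes the same df built from iris[1:10, ] as above; drop = FALSE keeps the result a one-column data frame instead of collapsing it to a vector:

```r
# Same example data as above: the first 10 rows of iris
df <- iris[1:10, ]

# Rows 1 to 5 of the Species column, kept as a data frame
newdf <- df[1:5, "Species", drop = FALSE]
newdf
```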

How to Create a New Column Based on an if else statement in R? [duplicate]

This question already has an answer here:
Create new column with binary data or presence/absence data in R [duplicate]
(1 answer)
Closed 2 years ago.
I have a list of Noxious Weed species in California and a table with all of the species ever seen in a certain site. I want to create a column in the table that will denote which species are Noxious Weeds.
I've been hitting dead ends with this all day and I'm not sure how to continue!
data(iris)
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
iris$new_col <- ifelse(iris$Species=="setosa",1,0)
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species new_col
1 5.1 3.5 1.4 0.2 setosa 1
2 4.9 3.0 1.4 0.2 setosa 1
3 4.7 3.2 1.3 0.2 setosa 1
4 4.6 3.1 1.5 0.2 setosa 1
5 5.0 3.6 1.4 0.2 setosa 1
6 5.4 3.9 1.7 0.4 setosa 1
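The same ifelse idea can be applied to a lookup list with %in%. This is only a sketch: the site table, the noxious vector, and the placeholder species names are all hypothetical, since the original data are not shown.

```r
# Hypothetical table of species seen at a site (placeholder names)
site <- data.frame(
  species = c("species_a", "species_b", "species_c", "species_d"),
  stringsAsFactors = FALSE
)

# Hypothetical list of noxious weeds (placeholder names)
noxious <- c("species_c", "species_a", "species_x")

# 1 if the species appears in the noxious list, 0 otherwise
site$noxious <- ifelse(site$species %in% noxious, 1, 0)
site
```

%in% returns a logical vector with one element per row of site, so the new column can also be left as TRUE/FALSE by dropping the ifelse wrapper.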

Renaming columns based on condition about their names

I would like to add a prefix to my dataset column names only if they already begin with a certain string, and I would like to do it (if possible) using a dplyr pipeline.
Taking the iris dataset as toy example, I was able to get the expected result with base R (with a quite cumbersome line of code):
data("iris")
colnames(iris)[startsWith(colnames(iris), "Sepal")] <- paste0("YAY_", colnames(iris)[startsWith(colnames(iris), "Sepal")])
head(iris)
YAY_Sepal.Length YAY_Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
In this example, the prefix YAY_ has been added to all the column names starting with Sepal. Is there a way to obtain the same result with a dplyr command/pipeline?
One option is rename_at:
library(tidyverse)
iris %>%
rename_at(vars(starts_with("Sepal")), ~ str_c("YAY_", .))
# YAY_Sepal.Length YAY_Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa
#2 4.9 3.0 1.4 0.2 setosa
#3 4.7 3.2 1.3 0.2 setosa
#4 4.6 3.1 1.5 0.2 setosa
#5 5.0 3.6 1.4 0.2 setosa
#6 5.4 3.9 1.7 0.4 setosa
# ...
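As a shorter base R alternative to the line in the question, the prefix can be added with a single regular-expression substitution on the names. This is a sketch working on a copy named df (an assumed name) so that iris itself is left untouched:

```r
# Work on a copy so the built-in iris data set is not modified
df <- iris

# Prepend "YAY_" only to names that begin with "Sepal"
names(df) <- sub("^Sepal", "YAY_Sepal", names(df))
names(df)
```

In dplyr 1.0.0 and later, rename_at is superseded by rename_with, so the pipeline version can also be written as iris %>% rename_with(~ paste0("YAY_", .x), starts_with("Sepal")).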

How can I shift one row of a data frame to the first row?

How can I shift one row of a data frame to the first row in R? I want the id row to be the first row.
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa
4.6 3.1 1.5 0.2 setosa
5.0 3.6 1.4 0.2 setosa
id A B C D
We can use grepl to build a logical vector that flags the row containing 'id' in Sepal.Length, then use that row as the column names while dropping it from the data:
i1 <- grepl("id", df1$Sepal.Length)
setNames(df1[!i1,], unlist(df1[i1,]))
# id A B C D
#1 5.1 3.5 1.4 0.2 setosa
#2 4.9 3.0 1.4 0.2 setosa
#3 4.7 3.2 1.3 0.2 setosa
#4 4.6 3.1 1.5 0.2 setosa
#5 5.0 3.6 1.4 0.2 setosa
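A small reproducible sketch of this approach. The df1 below is a made-up three-column stand-in for the data in the question; because the id row holds text, every column is assumed to be stored as character, so the last step converts the columns back to their natural types:

```r
# Hypothetical stand-in for the question's data: the header row sits at the bottom,
# which forces every column to be character
df1 <- data.frame(
  Sepal.Length = c("5.1", "4.9", "id"),
  Sepal.Width  = c("3.5", "3.0", "A"),
  Species      = c("setosa", "setosa", "D"),
  stringsAsFactors = FALSE
)

# Flag the row that holds the new column names
i1 <- grepl("id", df1$Sepal.Length)

# Use that row as names and drop it from the data
out <- setNames(df1[!i1, ], unlist(df1[i1, ]))

# Convert each column back from character to a suitable type
out[] <- lapply(out, type.convert, as.is = TRUE)
str(out)
```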
You could do the following (assuming your ID is the Nth row):
df <- iris # Example of data.frame
myIdrow <- 5 # as an example id row
df2 <- df[c(myIdrow, (1:nrow(df))[-myIdrow]), ]
That said, I would recommend using the ID row as the column names instead.
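To check the reordering, here is a sketch of the same idea written with seq_len and setdiff, using iris and row 5 as the example (both taken from the snippet above):

```r
df <- iris      # example data frame
myIdrow <- 5    # position of the row to move to the top

# Put the chosen row first, followed by all other rows in their original order
df2 <- df[c(myIdrow, setdiff(seq_len(nrow(df)), myIdrow)), ]
head(df2, 3)
```

The original row names are kept, so the moved row is still labelled 5 in the result.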

Subsetting observations based on a duplicate values

How do I retain just one observation when two rows of my dataset have duplicate values in two columns? For example, in the dataset below, rows 1 and 2 contain the same values in the columns Sepal.Length and Petal.Length: (5.1, 1.4) and (5.1, 1.4).
I want to remove the second row and retain only the first.
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 5.1 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 5.0 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
Reproducible test data:
test12 <- head(iris)
test12[2,1] <- 5.1
Thanks in advance.
Use duplicated to compare those specific columns:
test12[!duplicated(test12[,c(1,3)]),]
## or referencing the column names themselves:
test12[!duplicated(test12[,c("Sepal.Length","Petal.Length")]),]
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa
#3 4.7 3.2 1.3 0.2 setosa
#4 4.6 3.1 1.5 0.2 setosa
#5 5.0 3.6 5.0 0.2 setosa
#6 5.4 3.9 1.7 0.4 setosa
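Put together with the reproducible test data from the question, the approach can be run end to end as below. The names key_cols and dedup are introduced here only for illustration:

```r
# Test data from the question: make row 2 match row 1 in Sepal.Length
test12 <- head(iris)
test12[2, 1] <- 5.1

# Columns that define what counts as a duplicate
key_cols <- c("Sepal.Length", "Petal.Length")

# duplicated() flags every repeat after the first occurrence; negate to keep firsts
dedup <- test12[!duplicated(test12[, key_cols]), ]
rownames(dedup)
```

To keep the last occurrence of each duplicate instead of the first, duplicated accepts fromLast = TRUE.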
To keep only the first row:
row1 <- test12[1, ]
To drop the second row of your data frame:
dropRow <- test12[-2, ]
