R: Adding data from different rows to other rows - r

Currently, I have managed to clean and merged my data to only contain those variables/observations of interest that I want. However, I really want to add all the N85 and N_unknown data to the respective N80_84 years.
head of data.frame <- the col names
rows I want to add <- the rows I want to add in the above example.
For eg, I want to add YEAR 1986 (col 1), for age groups N85 (col2) data to the respective YEAR 1986, age groups N80_84.
Like row 13 + row 96 = newN80_84 in year 1986; row 11 + row 105 = new80_84 in year 1987 etc.
Is there a code for that? To add to their respective years and not a lump sum? I wanted to use rowSums(), but it doesn't add specifically to their respective years.
Also, I only wanted to add cols 3 and 4, not the last column with 500 as values. Is it possible to "specify" which cols to add?

Using mtcars dataset you can add two specific columns as follows:
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
So you can add mpg column to cyl column and then add it to your dataframe as shown below:
newcol <- c(mtcars['mpg']+mtcars['cyl']) # by using column names
# or by using column number you can add as follows:
newcol <- mtcars[,1]+mtcars[,2]
newdf=cbind(mtcars,newcol)
head(newdf)
mpg cyl disp hp drat wt qsec vs am gear carb newcol
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 27.0
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 27.0
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 26.8
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 27.4
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 26.7
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 24.1
You can see new column named mpg added at the end of the dataframe.

Related

RStudio: colnames() function not showing name of very first column

When I run colnames(), it never shows the name of this first column.
For example, after wasting a lot of time researching online, I discovered the name of the first column in mtcars is das_Auto.
Why doesn't this name show when I run this code?
[colnames(mtcars)][1]
What's the easiest way to determine the name of the first column in a data set?
This is because the first 'column' of mtcars is not actually a column but an index. If you want to convert it to a column you can run the below:
df <- cbind(das_Auto = rownames(mtcars), mtcars)
rownames(df) <- 1:nrow(mtcars)
head(df)
das_Auto mpg cyl disp hp drat wt qsec vs am gear carb
1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1

Looping through rows and columns in R does not work

I am just trying to fill gaps but in a loop. It is a monthly data, and fill_gaps produces NAs for every day. I am not sure why.
for (x in 2:length(differencing)){
for(micky in 1:length(differencing$`d_ BA`)){
if(is.na(differencing[micky,x])== T){
differencing[micky,x] = differencing[micky-1,x]
}
}
}
here is the error that I am getting:
Error: Assigned data `differencing[(micky - 1), x]` must be compatible with row subscript `micky`.
x 1 row must be assigned.
x Assigned data has 0 rows.
i Row updates require a list value. Do you need `list()` or `as.list()`?
Run `rlang::last_error()` to see where the error occurred.
This can be easily done using fill
library(tidyr)
library(dplyr)
differencing %>%
fill(everything())
Or we can use na.locf from zoo
library(zoo)
na.locf(differencing)
In the OP's loop, in the first line, it would be
for (x in 2:length(differencing$`d_ BA`)
...
as length of a data.frame will be the number of columns (as mentioned in the comments) and is different from length of a column i.e. vector
As the OP mentioned none of them works (OP didn't provide any example), using a small reproducible example ('tmp')
tmp %>%
fill(everything())
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 6 258 110 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 258 110 2.76 3.460 20.22 1 0 3 1
or using na.locf
na.locf(tmp)
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 6 258 110 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 258 110 2.76 3.460 20.22 1 0 3 1
data
tmp <- head(mtcars)
tmp[c(2, 5, 6), c(3, 4, 2)] <- NA

Specifying where columns should be placed in r

When I create a new variable, is there a way to specify in the function where to place it?
Right now, it adds it to the end of the dataframe, but for ease of viewing in Excel for example, I'd like to place a new calculated column beside the columns I used for the calculation.
Here's an example of code:
rawdata2 <- (rawdata1 %>% unite(location, locations1,locations2, locations3,
na.rm = TRUE, remove=TRUE)
%>% select(-location7, -location16)
%>% unite(Sector, Sectors, na.rm=TRUE, remove=TRUE)
%>% unite(TypeofSpace, TypesofSpace, type.of.spaceOther, na.rm=TRUE,
remove=TRUE)
)
You can rearrange the columns in your data frame. It looks like you are using dplyr::select in your example.
library(dplyr)
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
mtcars2 <- mtcars %>%
select(mpg, carb, everything()) ## moves carb up behind mpg
head(mtcars2)
# mpg carb cyl disp hp drat wt qsec vs am gear
# Mazda RX4 21.0 4 6 160 110 3.90 2.620 16.46 0 1 4
# Mazda RX4 Wag 21.0 4 6 160 110 3.90 2.875 17.02 0 1 4
# Datsun 710 22.8 1 4 108 93 3.85 2.320 18.61 1 1 4
# Hornet 4 Drive 21.4 1 6 258 110 3.08 3.215 19.44 1 0 3
# Hornet Sportabout 18.7 2 8 360 175 3.15 3.440 17.02 0 0 3
# Valiant 18.1 1 6 225 105 2.76 3.460 20.22 1 0 3
You can do the same thing with base subsetting, for example with a data frame with 11 columns you can move the 11th behind the second by
mtcars3 <- mtcars[,c(1,11,2:10)]
identical(mtcars2, mtcars3)
# [1] TRUE
I ended up using relocate, documentation here: dplyr.tidyverse.org/reference/relocate.html

Rename ONLY the last column in each dataframe in a list of dataframes [duplicate]

This question already has answers here:
Using lapply to change column names of a list of data frames
(3 answers)
Closed 3 years ago.
I would like to rename the final column in each one of my dataframes which are in a list of dataframes. The new name will be the same for all.
I made code that extracts the last column (below), but none that renames the final column header.
List <- lapply(List, function(x) x[,ncol(x)])
You could do
lapply(lst, function(x) {names(x) <- c(names(x[-length(x)]), "new_name");x})
#[[1]]
# mpg cyl disp hp drat wt qsec vs am gear new_name
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
#[[2]]
# mpg cyl disp hp drat wt qsec vs am gear new_name
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Or a simpler version as mentioned by #Shree
lapply(lst, function(x) {names(x)[ncol(x)] <- "new_name";x})
We can also use rename_at
library(dplyr)
library(purrr)
map(lst,. %>% rename_at(ncol(.), ~"new_name"))
data
lst <- list(head(mtcars), head(mtcars))

how to swap the third column with the last column, and then delete the swapped last column in R

i want to swap a specific column with the last column, and then delete the last column after swapping. After delete ncol(testFrame) will decrease by 1
Usually a reproducible example is expected but your description is clear enough to understand what you want to do.
Using mtcars as sample data
df <- mtcars
head(df)
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
swap_column <- 3
cols <- seq_len(ncol(df))
df1 <- df[replace(cols, cols == swap_column, ncol(df))][-ncol(df)]
head(df1)
# mpg cyl carb hp drat wt qsec vs am gear
#Mazda RX4 21.0 6 4 110 3.90 2.620 16.46 0 1 4
#Mazda RX4 Wag 21.0 6 4 110 3.90 2.875 17.02 0 1 4
#Datsun 710 22.8 4 1 93 3.85 2.320 18.61 1 1 4
#Hornet 4 Drive 21.4 6 1 110 3.08 3.215 19.44 1 0 3
#Hornet Sportabout 18.7 8 2 175 3.15 3.440 17.02 0 0 3
#Valiant 18.1 6 1 105 2.76 3.460 20.22 1 0 3
We replace the column number swap_column with last column number (ncol(df)) and then remove the last column (-ncol(df)).
We can do this conveniently with add_column from tibble. The .after and .before parameters can take either column index or column name. Suppose, we need to shift last column to third position
library(tibble)
data(mtcars)
df1 <- add_column(mtcars[-ncol(mtcars)], mtcars[ncol(mtcars)], .after = 2)
head(df1)
# mpg cyl carb disp hp drat wt qsec vs am gear
#Mazda RX4 21.0 6 4 160 110 3.90 2.620 16.46 0 1 4
#Mazda RX4 Wag 21.0 6 4 160 110 3.90 2.875 17.02 0 1 4
#Datsun 710 22.8 4 1 108 93 3.85 2.320 18.61 1 1 4
#Hornet 4 Drive 21.4 6 1 258 110 3.08 3.215 19.44 1 0 3
#Hornet Sportabout 18.7 8 2 360 175 3.15 3.440 17.02 0 0 3
#Valiant 18.1 6 1 225 105 2.76 3.460 20.22 1 0 3

Resources