In R, how to copy an object

In R, how can I refer to an object by name in code? For example, there is an object 'df1' (a data frame) in memory:
library(tidyverse)
df1 <- data.frame(dt=c(1:100))
df1_copy <- sym(paste0("df","1"))
df1_copy is not the same as df1: df1_copy is a "symbol" (its value is the name df1, not the data frame itself). How can I fix it? Thanks!

If you want to programmatically make copies of objects in your environment, you can combine assign() with eval(as.name()), along the lines of the second answer to this post.
df1 <- data.frame(v1 = 1:10)
df2 <- data.frame(V1 = 11:20)
original.objects <- ls(pattern="df[0-9]+")
for (i in seq_along(original.objects)) {
  # look up each object by name and bind its value to a new "copy_" name
  assign(paste0("copy_", original.objects[i]), eval(as.name(original.objects[i])))
}
ls()
#> [1] "copy_df1" "copy_df2" "df1" "df2"
#> [5] "i" "original.objects"
print(list(df1, df2, copy_df1, copy_df2))
#> [[1]]
#> v1
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
#> 7 7
#> 8 8
#> 9 9
#> 10 10
#>
#> [[2]]
#> V1
#> 1 11
#> 2 12
#> 3 13
#> 4 14
#> 5 15
#> 6 16
#> 7 17
#> 8 18
#> 9 19
#> 10 20
#>
#> [[3]]
#> v1
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
#> 7 7
#> 8 8
#> 9 9
#> 10 10
#>
#> [[4]]
#> V1
#> 1 11
#> 2 12
#> 3 13
#> 4 14
#> 5 15
#> 6 16
#> 7 17
#> 8 18
#> 9 19
#> 10 20
Created on 2023-01-12 with reprex v2.0.2
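For the single-object case in the question, a shorter route is probably get(), which looks an object up by its name given as a string. The following is only a minimal sketch; df2 and the copies list are hypothetical names added for illustration.

```r
df1 <- data.frame(dt = 1:100)
df2 <- data.frame(dt = 101:200)  # hypothetical second object

# get() returns the value bound to the name, not a symbol
df1_copy <- get(paste0("df", "1"))
identical(df1_copy, df1)

# mget() does the same for several names at once and returns a named list
copies <- mget(c("df1", "df2"))
names(copies)
```

Keeping the copies in a named list, rather than creating many new variables with assign(), usually makes later code easier to loop over.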

Related

How can I change the label of row structure into a string?

I'm trying to change the type of the row labels (the ones in the red rectangle) into strings (character). Any ideas or suggestions on how I can change it?
Set the rownames() for the data.frame. You might also find the rownames_to_column(), rowid_to_column(), and column_to_rownames() functions from the {tibble} package useful:
dat <- data.frame(x = 1:26)
head(dat)
#> x
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
rownames(dat) <- letters
head(dat)
#> x
#> a 1
#> b 2
#> c 3
#> d 4
#> e 5
#> f 6
tibble::rownames_to_column(dat, var = "rowname") |>
head()
#> rowname x
#> 1 a 1
#> 2 b 2
#> 3 c 3
#> 4 d 4
#> 5 e 5
#> 6 f 6
tibble::rowid_to_column(dat, var = "rowid") |>
head()
#> rowid x
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 4 4
#> 5 5 5
#> 6 6 6
dat <- data.frame(x = 1:26, rowname = letters)
head(dat)
#> x rowname
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
#> 5 5 e
#> 6 6 f
tibble::column_to_rownames(dat, var = "rowname") |>
head()
#> x
#> a 1
#> b 2
#> c 3
#> d 4
#> e 5
#> f 6
Created on 2022-07-22 by the reprex package (v2.0.1)
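If you would rather not depend on {tibble}, the same round trip can be sketched in base R. This is only an illustration under the assumption that the goal is to have the labels available as a character column; the column name rowname is arbitrary.

```r
dat <- data.frame(x = 1:26)
rownames(dat) <- letters

# row names of a data.frame are always stored as character strings
is.character(rownames(dat))

# copy the labels into a regular column, then reset the row names
dat$rowname <- rownames(dat)
rownames(dat) <- NULL
head(dat)
```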

How can I replace values with the continuous count every n rows in R

I am a bit stuck with replacing values.
I have a column that counts frames per second.
I appended the file, but the appended part starts with frame "1" again, at row [12] below.
So what I need to do is replace the last four frames featuring "1" with "4" and so on. Or in other words, assign a new value to every four rows.
Let's say df$frames is:
[,1] 1
[1,] 1
[2,] 1
[3,] 1
[4,] 2
[5,] 2
[6,] 2
[7,] 2
[8,] 3
[9,] 3
[10,] 3
[11,] 3
[12,] 1
[13,] 1
[14,] 1
[15,] 1
A quick hint would help me a lot :D
Best :*
I'm not sure I understand the problem; if you have:
df <- data.frame(frames = c(rep(1:3, each = 4), rep(1, 4)))
df
#> frames
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
#> 6 2
#> 7 2
#> 8 2
#> 9 3
#> 10 3
#> 11 3
#> 12 3
#> 13 1
#> 14 1
#> 15 1
#> 16 1
Created on 2022-07-09 by the reprex package (v2.0.1)
And you want:
df <- data.frame(frames = c(rep(1:4, each = 4)))
df
#> frames
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
#> 6 2
#> 7 2
#> 8 2
#> 9 3
#> 10 3
#> 11 3
#> 12 3
#> 13 4
#> 14 4
#> 15 4
#> 16 4
Created on 2022-07-09 by the reprex package (v2.0.1)
You can change the value of the last 4 rows:
df <- data.frame(frames = c(rep(1:3, each = 4), rep(1, 4)))
df
#> frames
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
#> 6 2
#> 7 2
#> 8 2
#> 9 3
#> 10 3
#> 11 3
#> 12 3
#> 13 1
#> 14 1
#> 15 1
#> 16 1
df[(nrow(df)-3):nrow(df),] <- 4
df
#> frames
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
#> 6 2
#> 7 2
#> 8 2
#> 9 3
#> 10 3
#> 11 3
#> 12 3
#> 13 4
#> 14 4
#> 15 4
#> 16 4
Created on 2022-07-09 by the reprex package (v2.0.1)
Or you can change the value of every 4 rows using rep(), e.g.:
df <- data.frame(frames = c(rep(1:6, each = 4)))
df
#> frames
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
#> 6 2
#> 7 2
#> 8 2
#> 9 3
#> 10 3
#> 11 3
#> 12 3
#> 13 4
#> 14 4
#> 15 4
#> 16 4
#> 17 5
#> 18 5
#> 19 5
#> 20 5
#> 21 6
#> 22 6
#> 23 6
#> 24 6
Created on 2022-07-09 by the reprex package (v2.0.1)
Or is there something I'm missing?
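To renumber an existing column in place rather than building a new data frame, rep() can be sized from the number of rows. This is a sketch that assumes every frame value spans exactly block rows; block is a hypothetical helper variable.

```r
# data as in the question: the counter restarts at 1 after the append
df <- data.frame(frames = c(rep(1:3, each = 4), rep(1, 4)))

block <- 4  # assumed number of rows per frame value

# one running counter per block of rows, trimmed to the length of the data
df$frames <- rep(seq_len(ceiling(nrow(df) / block)),
                 each = block, length.out = nrow(df))
df$frames
```

Because of length.out, this also works when the number of rows is not an exact multiple of block; the last block is simply shorter.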

R+dplyr: conditionally swap the elements of two columns

Consider the dataframe df at the end of the post.
I simply would like to swap the elements of columns x and y whenever x>y.
There may be other columns in the dataframe which I do not want to touch.
In a sense, I would like to sort row wise the columns x and y.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df<-tibble(x=1:10, y=10:1, extra=LETTERS[1:10])
df
#> # A tibble: 10 × 3
#> x y extra
#> <int> <int> <chr>
#> 1 1 10 A
#> 2 2 9 B
#> 3 3 8 C
#> 4 4 7 D
#> 5 5 6 E
#> 6 6 5 F
#> 7 7 4 G
#> 8 8 3 H
#> 9 9 2 I
#> 10 10 1 J
Created on 2021-10-06 by the reprex package (v2.0.1)
Base solution:
Use which(df$x > df$y) to find the rows you want to change, then use rev() to reverse the order of the two selected columns for those rows:
idx <- which(df$x > df$y)
df[idx, c("x", "y")] <- rev(df[idx, c("x", "y")])
df
# x y extra
# <int> <int> <chr>
# 1 1 10 A
# 2 2 9 B
# 3 3 8 C
# 4 4 7 D
# 5 5 6 E
# 6 5 6 F
# 7 4 7 G
# 8 3 8 H
# 9 2 9 I
# 10 1 10 J
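Another way to look at the two-column case is as a row-wise minimum and maximum. The sketch below uses base R's pmin() and pmax() inside transform(); it uses a plain data.frame rather than a tibble, and swapped is a name chosen for illustration.

```r
df <- data.frame(x = 1:10, y = 10:1, extra = LETTERS[1:10])

# transform() evaluates both expressions against the original columns,
# so y is computed from the old x, not from the already-replaced one
swapped <- transform(df, x = pmin(x, y), y = pmax(x, y))
swapped
```

Note that dplyr::mutate() evaluates its expressions in order, so the analogous mutate() call would need a temporary column to hold the minimum before x is overwritten.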
Thanks everyone!
I wrote a small function which does what I need and generalizes to the case of multiple variables.
See the reprex
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
set.seed(1234)
set_colnames <- `colnames<-`
df<-tibble(x=1:10, y=10:1, z=rnorm(10), extra=LETTERS[1:10]) %>%
rowwise()
df
#> # A tibble: 10 × 4
#> # Rowwise:
#> x y z extra
#> <int> <int> <dbl> <chr>
#> 1 1 10 -1.21 A
#> 2 2 9 0.277 B
#> 3 3 8 1.08 C
#> 4 4 7 -2.35 D
#> 5 5 6 0.429 E
#> 6 6 5 0.506 F
#> 7 7 4 -0.575 G
#> 8 8 3 -0.547 H
#> 9 9 2 -0.564 I
#> 10 10 1 -0.890 J
sort_rows <- function(df, col_names, dec = FALSE) {
  temp <- df %>%
    select(all_of(col_names))
  extra_names <- setdiff(colnames(df), col_names)
  temp2 <- df %>%
    select(all_of(extra_names))
  res <- t(apply(temp, 1, sort, decreasing = dec)) %>%
    as_tibble %>%
    set_colnames(col_names) %>%
    bind_cols(temp2)
  return(res)
}
col_names <- c("x", "y", "z")
df_s <- df %>%
sort_rows(col_names, dec=F)
#> Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
#> Using compatibility `.name_repair`.
df_s
#> # A tibble: 10 × 4
#> x y z extra
#> <dbl> <dbl> <dbl> <chr>
#> 1 -1.21 1 10 A
#> 2 0.277 2 9 B
#> 3 1.08 3 8 C
#> 4 -2.35 4 7 D
#> 5 0.429 5 6 E
#> 6 0.506 5 6 F
#> 7 -0.575 4 7 G
#> 8 -0.547 3 8 H
#> 9 -0.564 2 9 I
#> 10 -0.890 1 10 J
Created on 2021-10-06 by the reprex package (v2.0.1)
This looks like sorting to me:
library(tidyverse)
df <- tibble(x=1:10, y=10:1, extra=LETTERS[1:10])
df
#> # A tibble: 10 x 3
#> x y extra
#> <int> <int> <chr>
#> 1 1 10 A
#> 2 2 9 B
#> 3 3 8 C
#> 4 4 7 D
#> 5 5 6 E
#> 6 6 5 F
#> 7 7 4 G
#> 8 8 3 H
#> 9 9 2 I
#> 10 10 1 J
extra_cols <- df %>% colnames() %>% setdiff(c("x", "y"))
extra_cols
#> [1] "extra"
df %>%
  mutate(row = row_number()) %>%
  pivot_longer(-c(row, extra_cols)) %>%
  group_by_at(c("row", extra_cols)) %>%
  transmute(
    value = value %>% sort(),
    name = c("x", "y"),
  ) %>%
  pivot_wider() %>%
  ungroup() %>%
  select(-row)
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(extra_cols)` instead of `extra_cols` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> # A tibble: 10 x 3
#> extra x y
#> <chr> <int> <int>
#> 1 A 1 10
#> 2 B 2 9
#> 3 C 3 8
#> 4 D 4 7
#> 5 E 5 6
#> 6 F 5 6
#> 7 G 4 7
#> 8 H 3 8
#> 9 I 2 9
#> 10 J 1 10
Created on 2021-10-06 by the reprex package (v2.0.1)
Try using apply() over the rows (MARGIN = 1) of the x and y columns and transpose the result with t(), then use as_tibble() to convert it to a tibble. Note that this keeps only x and y.
Then finally change the column names:
df <- as_tibble(t(apply(df[c("x", "y")], 1, sort)))
names(df) <- c('x', 'y')
df
# A tibble: 10 x 2
x y
<int> <int>
1 1 10
2 2 9
3 3 8
4 4 7
5 5 6
6 5 6
7 4 7
8 3 8
9 2 9
10 1 10

How can I split large dataset iteratively to get smaller datasets by rows

I have a dataset that has a column with days spanning from 1-182. I want to split this dataset into smaller 30 days data frames. However, I want the data frames to form as follows:
Dataframe 1: Day 1 - Day 30 (Row 1-30)
Dataframe 2: Day 2 - Day 31 (Row 2-31)
Dataframe 3: Day 3 - Day 32 (Row 3-32) and so on.
I already know how to split by 30 days but can't find a way to split like this! Please let me know how I can do this with some function in R.
Here is my take on what you asked for.
dat <- data.frame(jday = 1:182,
value = rnorm(182, 10, 1))
# window interval
windx <- 30
# iterate over every start row; the last full window starts at nrow(dat) - windx + 1
res <- lapply(seq_len(nrow(dat) - windx + 1), function(i) {
  dat[i:(i + (windx - 1)), ]
})
# 153 data.frames
length(res)
#> [1] 153
# 30 rows
nrow(res[[1]])
#> [1] 30
# look at first 6 values from first 6 data.frames
lapply(head(res), head)
#> [[1]]
#> jday value
#> 1 1 13.062751
#> 2 2 9.468940
#> 3 3 9.371270
#> 4 4 11.477544
#> 5 5 11.072019
#> 6 6 9.598129
#>
#> [[2]]
#> jday value
#> 2 2 9.468940
#> 3 3 9.371270
#> 4 4 11.477544
#> 5 5 11.072019
#> 6 6 9.598129
#> 7 7 9.349836
#>
#> [[3]]
#> jday value
#> 3 3 9.371270
#> 4 4 11.477544
#> 5 5 11.072019
#> 6 6 9.598129
#> 7 7 9.349836
#> 8 8 10.149530
#>
#> [[4]]
#> jday value
#> 4 4 11.477544
#> 5 5 11.072019
#> 6 6 9.598129
#> 7 7 9.349836
#> 8 8 10.149530
#> 9 9 9.521323
#>
#> [[5]]
#> jday value
#> 5 5 11.072019
#> 6 6 9.598129
#> 7 7 9.349836
#> 8 8 10.149530
#> 9 9 9.521323
#> 10 10 9.726165
#>
#> [[6]]
#> jday value
#> 6 6 9.598129
#> 7 7 9.349836
#> 8 8 10.149530
#> 9 9 9.521323
#> 10 10 9.726165
#> 11 11 8.876201
# all data.frames are 30 rows long
all(sapply(res, nrow) == 30)
#> [1] TRUE
Created on 2020-12-03 by the reprex package (v0.3.0)
Assuming you have a data.frame like this:
set.seed(1)
d <- data.frame(matrix(sample(20, 30, TRUE), ncol = 3))
# X1 X2 X3
# 1 6 5 19
# 2 8 4 5
# 3 12 14 14
# 4 19 8 3
# 5 5 16 6
# 6 18 10 8
# 7 19 15 1
# 8 14 20 8
# 9 13 8 18
# 10 2 16 7
... create a matrix that identifies the rows of interest. Here, I'm interested in every three rows, thus 1-3, 2-4, 3-5, ... , 8-10. Change "3" to 30 for your case.
m <- embed(1:nrow(d), 3)
m
# [,1] [,2] [,3]
# [1,] 3 2 1
# [2,] 4 3 2
# [3,] 5 4 3
# [4,] 6 5 4
# [5,] 7 6 5
# [6,] 8 7 6
# [7,] 9 8 7
# [8,] 10 9 8
Once you have those, use lapply across the indices to extract the relevant rows.
lapply(1:nrow(m), function(x) d[rev(m[x, ]), ])
# [[1]]
# X1 X2 X3
# 1 6 5 19
# 2 8 4 5
# 3 12 14 14
#
# [[2]]
# X1 X2 X3
# 2 8 4 5
# 3 12 14 14
# 4 19 8 3
#
# [[3]]
# X1 X2 X3
# 3 12 14 14
...
...
# [[7]]
# X1 X2 X3
# 7 19 15 1
# 8 14 20 8
# 9 13 8 18
#
# [[8]]
# X1 X2 X3
# 8 14 20 8
# 9 13 8 18
# 10 2 16 7
The result is a list of your data.frames. You can use list2env if you really want to have all the subsets as separate data.frames in your workspace.
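Scaled up to the question's 182 days and 30-day windows, the embed() approach might look like the sketch below. The data frame dat, the win_ prefix, and the separate environment are assumptions made for illustration.

```r
dat <- data.frame(jday = 1:182)  # hypothetical stand-in for the real data
win <- 30

# each row of idx holds the row numbers of one window, in reverse order
idx <- embed(seq_len(nrow(dat)), win)

# drop = FALSE keeps each subset a data.frame even with a single column
windows <- lapply(seq_len(nrow(idx)), function(i) {
  dat[rev(idx[i, ]), , drop = FALSE]
})

# list2env() needs a named list; a dedicated environment avoids
# cluttering the workspace with many separate objects
names(windows) <- paste0("win_", seq_along(windows))
env <- list2env(windows, envir = new.env())
length(windows)
```

With 182 rows and a window of 30 there are 182 - 30 + 1 = 153 overlapping windows, the last one covering days 153 to 182.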

Passing different arguments to function for each dataframe in a list

Here's a simplified example of my data:
I have a list of dataframes
set.seed(1)
data1 <- data.frame(A = sample(1:10))
data2 <- data.frame(A = sample(1:10))
data3 <- data.frame(A = sample(1:10))
data4 <- data.frame(A = sample(1:10))
list1 <- list(data1, data2, data3, data4)
And a dataframe containing the same number of values as there are dataframes in list1
data5 <- data.frame(B = c(10, 20, 30, 40))
I would like to create a new column C in each of the dataframes within list1 where:
C = A * (B/nrow(A))
with the value for B coming from data5, so that B = 10 for the first dataframe in list1 (i.e. data1), and B = 20 for the second dataframe data2 and so on.
From what I've read, mapply is probably the solution, but I'm struggling to work out how to specify a single value of B across all rows in each of the dataframes in list1.
Any suggestions would be hugely appreciated.
You need to use Map to loop over several vectors or lists in parallel:
Map(function(df, B) transform(df, C = A * (B / nrow(df))), list1, data5$B)
#> [[1]]
#> A C
#> 1 8 8
#> 2 10 10
#> 3 1 1
#> 4 6 6
#> 5 7 7
#> 6 9 9
#> 7 3 3
#> 8 4 4
#> 9 2 2
#> 10 5 5
#>
#> [[2]]
#> A C
#> 1 10 20
#> 2 3 6
#> 3 2 4
#> 4 1 2
#> 5 9 18
#> 6 4 8
#> 7 6 12
#> 8 5 10
#> 9 8 16
#> 10 7 14
#>
#> [[3]]
#> A C
#> 1 5 15
#> 2 7 21
#> 3 1 3
#> 4 4 12
#> 5 2 6
#> 6 6 18
#> 7 10 30
#> 8 3 9
#> 9 8 24
#> 10 9 27
#>
#> [[4]]
#> A C
#> 1 3 12
#> 2 9 36
#> 3 6 24
#> 4 4 16
#> 5 2 8
#> 6 1 4
#> 7 10 40
#> 8 5 20
#> 9 7 28
#> 10 8 32
You can be a bit more compact using the tidyverse:
library(tidyverse)
map2(list1, data5$B, ~mutate(.x, C = A*(.y/nrow(.x))))
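Since the question mentions mapply, here is a sketch of the same idea with it. Map() is a thin wrapper around mapply() that never simplifies, so the key detail is SIMPLIFY = FALSE; the helper name add_c is hypothetical, and the data are rebuilt here only so that the block is self-contained.

```r
set.seed(1)
list1 <- lapply(1:4, function(i) data.frame(A = sample(1:10)))
data5 <- data.frame(B = c(10, 20, 30, 40))

# hypothetical helper: add column C to one data frame using a single value b
add_c <- function(df, b) {
  df$C <- df$A * (b / nrow(df))
  df
}

# SIMPLIFY = FALSE keeps the result as a list of data frames
res <- mapply(add_c, list1, data5$B, SIMPLIFY = FALSE)
str(res[[2]])
```

Each call receives one data frame and one scalar b, which R recycles across all rows of that data frame, so there is no need to repeat the value of B per row.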
