R Switch first part and second part of string at punctuation

I have a data.frame with columns:
names(data) = c("newid","Player.WR","data_col.WR","Trend.WR","Player.QB","data_col.QB","Trend.QB","Player.RB","data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE")
However, I need to flip the first and second portions of each name at the period so it looks like this:
names(data) = c("newid", "WR.Player", "WR.data_col", "WR.Trend", "QB.Player", "QB.data_col", "QB.Trend", "RB.Player", "RB.data_col", "RB.Trend", "TE.Player", "TE.data_col", "TE.Trend")
My initial thought was to try to do a strsplit and then somehow do an lapply statement to reorder, but I wasn't sure how to make the lapply work.
Thanks!

With a vector of names v, you could also try:
v <- c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE")
gsub(
'(.*)\\.(.*)',
'\\2\\.\\1',
v
)
Output:
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend" "RB.Player"
[9] "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
And to directly assign it to names:
names(data) <- gsub('(.*)\\.(.*)', '\\2\\.\\1', v)
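The same idea can be wrapped in a small helper and applied to names(data) directly. This is only a sketch: swap_at_dot is a hypothetical name, the data frame is a toy stand-in for the asker's data, and the pattern assumes each name has at most one period (names with zero or several periods are returned unchanged).

```r
# Toy data frame standing in for the asker's data (assumption).
data <- data.frame(newid = 1, Player.WR = "a", data_col.WR = 2, Trend.WR = 3)

# Hypothetical helper: swap the two parts of a name around a single period.
# [^.] matches any character except a literal period.
swap_at_dot <- function(x) sub("^([^.]+)\\.([^.]+)$", "\\2.\\1", x)

names(data) <- swap_at_dot(names(data))
names(data)
# [1] "newid"       "WR.Player"   "WR.data_col" "WR.Trend"
```

Anchoring the pattern with ^ and $ makes the single-period assumption explicit instead of relying on greedy matching.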

I would suggest the following approach, using a function to exchange the position of the values together with lapply():
#Data
vec <- c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE" )
#Split
L <- lapply(vec,strsplit,split='\\.')
#Format function
myfun <- function(x)
{
y <- x[[1]]
#if check
if(length(y)!=1)
{
z <- paste0(y[c(2,1)],collapse = '.')
} else
{
z <- y
}
return(z)
}
#Apply
L2 <- lapply(L,FUN = myfun)
#Bind
do.call(c,L2)
Output:
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend"
[8] "RB.Player" "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
The final output can be saved in a new vector, e.g. vecnamesnew <- do.call(c,L2)

Arg0naut91's answer is quite concise, and I would recommend using Arg0naut91's approach. However, for the sake of providing a (somewhat) concise solution using strsplit and lapply with (perhaps) a bit more readability for those unfamiliar with gsub syntax, I submit the following:
names<-c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE" )
newnames<-sapply(names,function(x) paste(rev(unlist(strsplit(x,split="\\."),use.names=FALSE)),collapse="."),USE.NAMES=FALSE)
print(newnames)
which yields
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend"
[8] "RB.Player" "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
as output.

Related

Can't manipulate global/local variables inside a function in R

dna = c("A","G","C","T")
x =sample(dna,50,replace =TRUE)
dna_f = function(x){
dnastring <- ""
for (val in x){
paste(dnastring,val,sep="")
}
return(dnastring)
}
dna_f(x)
I'm trying to produce a single string that contains all the randomly sampled letters. x contains all 50 letters, and I'm trying to combine them into one string using the paste function, but when I run this, the output is an empty string. I tried making dnastring a global variable because I thought maybe function scope works differently in R (I'm new to R), but I got the same output. Some help would be appreciated, thanks.

You don't need a for loop here. Try paste0 with the collapse argument.
dna_f = function(x){
paste0(x, collapse = '')
}
dna_f(x)
#[1] "CCTACCAACCCTTTCTAGCCCACTATGCATCACAACTGCGGTCTCATCAC"

You forgot the dnastring <- assignment. paste() returns a new string; it does not modify dnastring in place.
dna = c("A","G","C","T")
x =sample(dna,50,replace =TRUE)
dna_f = function(x){
dnastring <- ""
for (val in x){
dnastring <- paste(dnastring,val,sep="")
}
return(dnastring)
}
Output:
> dna_f(x)
[1] "GGTCTGGCCGAACTACTGTACACCCCAAAGACAACGCCCCCGACGCTCTA"
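Both answers can be checked against each other. A minimal sketch, assuming a fixed seed (added here only for reproducibility) and the illustrative variable names unchanged and loop_result:

```r
set.seed(42)  # assumption: seed added only so the run is reproducible
dna <- c("A", "G", "C", "T")
x <- sample(dna, 50, replace = TRUE)

# Calling paste() without assigning the result leaves the variable untouched.
unchanged <- ""
for (val in x) paste(unchanged, val, sep = "")
unchanged
# [1] ""

# Assigning the result back accumulates the string.
loop_result <- ""
for (val in x) loop_result <- paste(loop_result, val, sep = "")

# The vectorised form gives the same string without a loop.
identical(loop_result, paste0(x, collapse = ""))
# [1] TRUE
```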

Change data type of elements in a nested list

Is it possible to scan a list of lists for elements with a certain name and change their datatype but retain their value?
As an example, the following list containing elements 'N' of class 'character' or 'numeric'
x = list(list(N=as.character(1)),
list(a=1,b=2,c="another element",N=as.character(5)),
list(a=2,b=2,N=as.character(7),c=NULL),
list(a=2,b=2,list(N=as.character(3))))
should then become:
x = list(list(N=as.numeric(1)),
list(a=1,b=2,c="another element",N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7),c=NULL),
list(a=2,b=2,list(N=as.numeric(3))))
To be clear, the solution should allow for deeper nesting, and respect the data type of fields with names other than "N". I have not been able to find a general solution that works for lists with an arbitrary structure.
I have tried something along the lines of the solution given in this post:
a <- as.relistable(x)
u <- unlist(a)
u[names(u) == "N"] <- as.numeric(u[names(u) == "N"])
relist(u, a)
Unfortunately, the substitution does not work in its current form. In addition, relist does not seem to work when the list contains NULL elements.

Use lapply to repeat the process over the list elements with a condition to check for your element of interest, so you don't inadvertently add elements to your sublists:
x <- lapply(x, function(i) {
if(length(i$N) > 0) {
i$N <- as.numeric(i$N)
}
return(i)
})

A solution that works only on a list of lists containing numbers or strings with numbers:
x <- list(list(N=as.character(1)),
list(a=1,b=2,N=as.character(5)),
list(a=2,b=2,N=as.character(7)),
list(a=2,b=2))
y1 <- lapply(x, function(y) lapply(y, as.numeric))
y2 <- list(list(N=as.numeric(1)),
list(a=1,b=2,N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7)),
list(a=2,b=2))
identical(y1,y2)
# [1] TRUE
EDIT. Here is more general code that works on nested lists of numbers and strings. It uses a recursive function as_num and the list.apply function of the rlist package.
library(rlist)
x = list(list(N=as.character(1)),
list(a=1,b=2,c="another element",N=as.character(5)),
list(a=2,b=2,N=as.character(7),c=NULL),
list(a=2,b=2,list(N=as.character(3))))
# Test if the string contains a number
is_num <- function(x) grepl("[-]?[0-9]+[.]?[0-9]*|[-]?[0-9]+[L]?|[-]?[0-9]+[.]?[0-9]*[eE][0-9]+",x)
# A recursive function for numeric conversion of strings containing numbers
as_num <- function(x) {
if (!is.null(x)) {
if (class(x)!="list") {
y <- x
if (is.character(x) & is_num(x)) y <- as.numeric(x)
} else {
y <- list.apply(x, as_num)
}
} else {
y <- x
}
return(y)
}
y <- list.apply(x, as_num)
z = list(list(N=as.numeric(1)),
list(a=1,b=2,c="another element",N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7),c=NULL),
list(a=2,b=2,list(N=as.numeric(3))))
identical(y,z)
# [1] TRUE

The answer provided by marco sandri can be further generalised to:
is_num <- function(x) grepl("^[-]?[0-9]+[.]?[0-9]*|^[-]?[0-9]+[L]?|^[-]?[0-9]+[.]?[0-9]*[eE][0-9]+",x)
as_num <- function(x) {
if (is.null(x)||length(x) == 0) return(x)
if (class(x)=="list") return(lapply(x, as_num))
if (is.character(x) & is_num(x)) return(as.numeric(x))
return(x)
}
y <- as_num(x)
identical(y,z)
This solution also allows list elements to contain numeric(0) and mixed strings such as 'data2005'.
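The question asked for only the elements named N to be converted, at any depth, leaving other fields alone. A minimal base-R sketch of that narrower behaviour follows; convert_named is a hypothetical helper, not taken from any answer here. It assumes the N values are strings that as.numeric can parse (anything else becomes NA with a warning).

```r
# Hypothetical helper: recursively convert elements named `target` to numeric.
convert_named <- function(x, target = "N") {
  if (!is.list(x)) return(x)
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  out <- lapply(seq_along(x), function(i) {
    el <- x[[i]]
    if (is.list(el)) {
      convert_named(el, target)   # descend into sublists
    } else if (identical(nms[i], target)) {
      as.numeric(el)              # convert only the targeted name
    } else {
      el                          # leave everything else, including NULL
    }
  })
  names(out) <- names(x)          # restore the original names
  out
}

x <- list(list(N = as.character(1)),
          list(a = 1, b = 2, c = "another element", N = as.character(5)),
          list(a = 2, b = 2, N = as.character(7), c = NULL),
          list(a = 2, b = 2, list(N = as.character(3))))
result <- convert_named(x)
str(result)
```

Iterating over indices rather than assigning into the list keeps NULL elements in place, since assigning NULL to a list element would delete it.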

how to add value to existing variable from inside a loop?

I want to add a computed value to an existing vector from within a loop, where the vector is looked up by name inside the loop. That is, I'm looking for a function similar to assign() that lets me add values to an existing variable rather than creating a new one.
Example:
Say I have 3 variables:
sp=3
for(i in 1:sp){
name<-paste("sp",i,sep="")
assign(name,rnorm(5))
}
and now I want to access the last value in each of the variables, double it, and add the result to the vector:
for(i in 1:sp){
name<-paste("sp",i,sep="")
name[6]<-name[5]*2
}
The problem here is that "name" is a string. How can R identify it as a variable name and access it?

What you are asking for is something like this:
get(name)
In your code it would look like this:
v <- 1:10
var <- "v"
for (i in v){
tmp <- get(var)
tmp[6] <- tmp[5]*2
assign(var, tmp)
}
# [1] 1 2 3 4 5 10 7 8 9 10
Does that help you in any way?
However, I agree with the other answer, that lists and the lapply/sapply-functions are better suited!

This is how you can do this with a list:
sp=3
mylist <- vector(mode = "list", length = sp) #initialize a list
names(mylist) <- paste0("sp",seq_len(sp)) #set the names
for(i in 1:sp){
mylist[[i]] <- rnorm(5)
}
for(i in 1:sp){
mylist[[i]] <- c(mylist[[i]], mylist[[i]][5] * 2)
}
mylist
#$sp1
#[1] 0.6974563 0.7714190 1.1980534 0.6011610 -1.5884306 -3.1768611
#
#$sp2
#[1] -0.2276942 0.2982770 0.5504381 -0.2096708 -1.9199551 -3.8399102
#
#$sp3
#[1] 0.235280995 0.276813498 0.002567075 -0.774551774 0.766898045 1.533796089
You can then access the list elements as described in help("["), i.e., mylist$sp1, mylist[["sp1"]], etc.
Of course, this is still very inefficient code and it could be improved a lot. E.g., since all three variables are of same type and length, they really should be combined into a matrix, which could be filled with one call to rnorm and which would also allow doing the second operation with vectorized operations.
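The matrix approach mentioned above could look roughly like this. It is a sketch under assumptions: the seed and the object name m are illustrative, and each column plays the role of one of the sp1..sp3 vectors.

```r
sp <- 3
set.seed(1)  # assumption: seed added only for reproducibility

# One call to rnorm fills all columns; each column is one "variable".
m <- matrix(rnorm(5 * sp), nrow = 5,
            dimnames = list(NULL, paste0("sp", seq_len(sp))))

# Vectorised second step: append a sixth row holding twice the fifth row.
m <- rbind(m, m[5, ] * 2)

dim(m)
# [1] 6 3
m[, "sp1"]  # access one "variable" by column name
```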

@Roland is absolutely right, and you absolutely should use a list for this type of problem. It's cleaner and easier to work with. Here's another way of working with what you have (it can be easily generalised):
sp <- replicate(3, rnorm(5), simplify=FALSE)
names(sp) <- paste0("sp", 1:3)
sp
#$sp1
#[1] -0.3723205 1.2199743 0.1226524 0.7287469 -0.8670466
#
#$sp2
#[1] -0.5458811 -0.3276503 -1.3031100 1.3064743 -0.7533023
#
#$sp3
#[1] 1.2683564 0.9419726 -0.5925012 -1.2034788 -0.6613149
newsp <- lapply(sp, function(x){x[6] <- x[5]*2; x})
newsp
#$sp1
#[1] -0.3723205 1.2199743 0.1226524 0.7287469 -0.8670466 -1.7340933
#
#$sp2
#[1] -0.5458811 -0.3276503 -1.3031100 1.3064743 -0.7533023 -1.5066046
#
#$sp3
#[1] 1.2683564 0.9419726 -0.5925012 -1.2034788 -0.6613149 -1.3226297
EDIT: If you are truly, sincerely dedicated to doing this despite being recommended otherwise, you can do it this way:
for(i in 1:sp){
name<-paste("sp",i,sep="")
assign(name, `[<-`(get(name), 6, `[`(get(name), 5) * 2))
}

Set atomic vector names by reference

I am wondering if it is possible to set vector names by reference in R.
I often use data.table::fread to read text files, and then I clean up the variable names by wrapping setnames (which also works on a plain data.frame) and a string cleanup function similar to:
clean_var_name <- function(s) {
gsub("^_+|_+$","",gsub("(\\s|\\-|[[:punct:]])+", "_", tolower(s) ) )
}
so my function looks like:
clean_names <- function(x){
require(data.table)
if(is.data.frame(x)){setnames(x, names(x), clean_var_name(names(x)))} # this part works
else if(is.vector(x)){ do_something_here } # this is the question
}
I'm wondering if there is a way to include the case of vectors in the same function in a way that performs names(x) <- clean_var_name(names(x)) by reference.
v <- c(`thIs.Is.A.Terrible-Name`=1, `this One is TOO`=2)
dt <- data.table(t(v))
clean_names(dt)
dt
#    this_is_a_terrible_name this_one_is_too
# 1:                       1               2
# would like to be able to do same for clean_names(v)
I'm also open to explanations of why this is a bad idea (side effects, functional programming, etc.)

Use the setattr function:
library(data.table)
x <- 1:10
address(x)
# [1] "0x713cfd0"
setattr(x,"names",letters[1:10])
address(x)
# [1] "0x713cfd0"
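Plugged into the question's clean_names, the vector branch might look like the sketch below. This is an assumption about how the pieces fit together, not a tested recommendation: because setattr changes the object in place, every other binding that shares the same vector sees the new names too.

```r
library(data.table)

clean_var_name <- function(s) {
  gsub("^_+|_+$", "", gsub("(\\s|\\-|[[:punct:]])+", "_", tolower(s)))
}

clean_names <- function(x) {
  if (is.data.frame(x)) {
    setnames(x, names(x), clean_var_name(names(x)))
  } else if (is.vector(x) && !is.null(names(x))) {
    # setattr modifies the names attribute by reference (no copy of x)
    setattr(x, "names", clean_var_name(names(x)))
  }
  invisible(x)
}

v <- c(`thIs.Is.A.Terrible-Name` = 1, `this One is TOO` = 2)
clean_names(v)
names(v)
# [1] "this_is_a_terrible_name" "this_one_is_too"
```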

Getting names from ... (dots)

In improving an rbind method, I'd like to extract the names of the objects passed to it so that I might generate unique IDs from those.
I've tried all.names(match.call()) but that just gives me:
[1] "rbind" "deparse.level" "..1" "..2"
Generic example:
rbind.test <- function(...) {
dots <- list(...)
all.names(match.call())
}
t1 <- t2 <- ""
class(t1) <- class(t2) <- "test"
> rbind(t1,t2)
[1] "rbind" "deparse.level" "..1" "..2"
Whereas I'd like to be able to retrieve c("t1","t2").
I'm aware that in general one cannot retrieve the names of objects passed to functions, but it seems like with ... it might be possible, as substitute(...) returns t1 in the above example.

I picked this one up from Bill Dunlap on the R Help List Serve:
rbind.test <- function(...) {
sapply(substitute(...()), as.character)
}
I think this gives you what you want.

Using the guidance from "How to use R's ellipsis feature when writing your own function?"
e.g. substitute(list(...))
and combining it with as.character:
rbind.test <- function(...) {
.x <- as.list(substitute(list(...)))[-1]
as.character(.x)
}
You can also use:
rbind.test <- function(...){as.character(match.call(expand.dots = FALSE)$...)}
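When the arguments may be expressions rather than bare names, deparsing each captured expression keeps it as a single string, whereas as.character applied to a call object splits it into its components. A sketch using a plain function (dot_names is a hypothetical name, and S3 dispatch through rbind is not exercised here):

```r
# Hypothetical helper: return the unevaluated argument expressions as strings.
dot_names <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # drop the leading `list`
  vapply(exprs, function(e) paste(deparse(e), collapse = ""), character(1))
}

t1 <- t2 <- ""
dot_names(t1, t2)
# [1] "t1" "t2"

# For comparison: as.character on a call splits it into pieces.
as.character(quote(t1[1:2]))
# [1] "["   "t1"  "1:2"
```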
