Can't manipulate global/local variables inside a function in R

dna = c("A","G","C","T")
x =sample(dna,50,replace =TRUE)
dna_f = function(x){
dnastring <- ""
for (val in x){
paste(dnastring,val,sep="")
}
return(dnastring)
}
dna_f(x)
I'm trying to produce a single string that contains all the randomly sampled letters. x contains all 50 letters, and I'm trying to combine them into one string using the paste function. But when I run this, the output is an empty string. I tried making dnastring a global variable because I thought function scope might work differently in R (I'm new to R), but I got the same output. Some help would be appreciated, thanks.

You don't need a for loop here. Use paste0 with the collapse argument, which joins all elements of a vector into a single string.
dna_f = function(x){
paste0(x, collapse = '')
}
dna_f(x)
#[1] "CCTACCAACCCTTTCTAGCCCACTATGCATCACAACTGCGGTCTCATCAC"

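You forgot to assign the result back with dnastring <-. paste() returns a new string rather than modifying dnastring in place, so the result has to be stored: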
dna = c("A","G","C","T")
x =sample(dna,50,replace =TRUE)
dna_f = function(x){
dnastring <- ""
for (val in x){
dnastring <- paste(dnastring,val,sep="")
}
return(dnastring)
}
Output:
> dna_f(x)
[1] "GGTCTGGCCGAACTACTGTACACCCCAAAGACAACGCCCCCGACGCTCTA"
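The point made in both answers can be checked in isolation. The following is a minimal sketch, not taken from the original thread; the variable names (s, unchanged, letters_in, looped, collapsed) are made up for illustration.

```r
# paste() returns a new string; it never modifies its arguments.
s <- ""
paste(s, "A", sep = "")        # the result is computed, printed, and discarded
unchanged <- s                 # s is still the empty string here

s <- paste(s, "A", sep = "")   # assigning the result back is what updates s

# The loop with assignment and the collapse form build the same string.
letters_in <- c("A", "G", "C", "T", "T")
looped <- ""
for (val in letters_in) {
  looped <- paste0(looped, val)
}
collapsed <- paste0(letters_in, collapse = "")
identical(looped, collapsed)
```

The collapse form is usually preferable because it avoids growing a string one element at a time.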

Related

R Switch first part and second part of string at punctuation

I have a data.frame with columns:
names(data) = c("newid","Player.WR","data_col.WR","Trend.WR","Player.QB","data_col.QB","Trend.QB","Player.RB","data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE" )
However, I need to flip the first and second portions of each name at the period so it looks like this:
names(data) = c("newid", "WR.Player", "WR.data_col", "WR.Trend", "QB.Player", "QB.data_col", "QB.Trend", "RB.Player", "RB.data_col", "RB.Trend", "TE.Player", "TE.data_col", "TE.Trend")
My initial thought was to try to do a strsplit and then somehow do an lapply statement to reorder, but I wasn't sure how to make the lapply work.
Thanks!
With a vector of names v, you could also try:
v <- c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE")
gsub(
'(.*)\\.(.*)',
'\\2.\\1',
v
)
Output:
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend" "RB.Player"
[9] "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
And to directly assign it to names:
names(data) <- gsub('(.*)\\.(.*)', '\\2.\\1', v)
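One detail of the pattern above is worth knowing if a name can contain more than one period: the first group `(.*)` is greedy, so the swap happens at the last period. The sketch below illustrates this with a made-up name "a.b.c" that is not part of the original data; the second pattern is an alternative I am assuming you would want if the split should happen at the first period instead.

```r
v <- c("newid", "Player.WR", "a.b.c")

# Greedy first group: the name is split at the LAST period.
at_last <- sub("(.*)\\.(.*)", "\\2.\\1", v)
at_last

# Restricting the first group to non-period characters splits at the FIRST period.
at_first <- sub("^([^.]*)\\.(.*)$", "\\2.\\1", v)
at_first
```

Names without a period, such as "newid", do not match either pattern and are returned unchanged.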
I would suggest the following approach, which uses a helper function to swap the two parts and lapply() to apply it to each name:
#Data
vec <- c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE" )
#Split
L <- lapply(vec,strsplit,split='\\.')
#Format function
myfun <- function(x)
{
y <- x[[1]]
#if check
if(length(y)!=1)
{
z <- paste0(y[c(2,1)],collapse = '.')
} else
{
z <- y
}
return(z)
}
#Apply
L2 <- lapply(L,FUN = myfun)
#Bind
do.call(c,L2)
Output:
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend"
[8] "RB.Player" "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
The last output can be saved in a new vector, for example vecnamesnew <- do.call(c,L2)
Arg0naut91's answer is quite concise, and I would recommend using Arg0naut91's approach. However, for the sake of providing a (somewhat) concise solution using strsplit and lapply with (perhaps) a bit more readability for those unfamiliar with gsub syntax, I submit the following:
names<-c("newid","Player.WR","data_col.WR","Trend.WR",
"Player.QB","data_col.QB","Trend.QB","Player.RB",
"data_col.RB","Trend.RB","Player.TE","data_col.TE","Trend.TE" )
# lapply() returns a list, so unlist() is needed to get a character vector back
newnames<-unlist(lapply(names,function(x) paste(rev(unlist(strsplit(x,split="\\."),use.names=FALSE)),collapse=".")))
print(newnames)
which yields
[1] "newid" "WR.Player" "WR.data_col" "WR.Trend" "QB.Player" "QB.data_col" "QB.Trend"
[8] "RB.Player" "RB.data_col" "RB.Trend" "TE.Player" "TE.data_col" "TE.Trend"
as output.

How to take value from one column and store it in newly created column using function call

First, sorry if this is a stupid question. I am learning R and don't have much experience yet.
I have the following function in R, which takes a value and returns a value.
dec2binSingle <- function(decimal) {
print(decimal)
binaryValue <- ""
index <- 0
decimal <- as.numeric(decimal)
while(decimal != 0) {
print(decimal)
temp <- as.numeric(decimal) %% 2
if (temp == 1) {
binaryValue <- paste("1", binaryValue, sep="", collapse = NULL)
decimal <- decimal - 1
} else {
binaryValue <- paste("0", binaryValue, sep="", collapse = NULL)
}
index <- index + 1
decimal <- decimal / 2
}
return(binaryValue)
}
The function converts a decimal number into its binary equivalent.
When I call the function on a column, it completes without any error, but when I try to view the data, the following error appears:
Error in View : 'names' attribute [200] must be the same length as the vector [1]
This is how the function is being called:
test_function <- function(value1) {return(dec2binSingle(as.numeric(unlist(value1))))}
data_example$tv <- with(data_example, test_function(data_example[which(colnames(data_example) == "numbers")]))
Any help is appreciated... thanks
EDIT:
I called the function for single value and it works as expected.
> dec2binSingle(23)
[1] "10111"
>
I hope this is what you wanted to achieve with your code.
#sample data
df <- data.frame(char1=c("abc","def","xyz"), num1=c(1,34,12), num2=c(34,20,8))
df
#function to convert decimal into binary
bin_func <- function(x) {gsub("^0+","",paste(rev(as.numeric(intToBits(x))), collapse=""))}
#verify which all columns are numeric
num_col <- sapply(df,is.numeric)
df1 <- as.data.frame(lapply(df[,num_col], FUN = function(x) {sapply(x, FUN = bin_func)}))
names(df1) <- paste(names(df1),"_converted",sep="")
#final dataframe having original as well as converted columns
df <- cbind(df,df1)
df
Please don't forget to let us know if it helped :)
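A likely cause of the original problem is that dec2binSingle handles one number at a time, while the call in the question passes the whole column at once. The sketch below shows one way to apply a scalar converter element by element with vapply(). The dec2bin function here is a shortened stand-in written for this example, not the asker's function, and it assumes non-negative whole numbers; data_example and its numbers column are named as in the question, but the values are made up.

```r
# Convert ONE non-negative whole number to a binary string.
dec2bin <- function(n) {
  n <- as.numeric(n)
  if (n == 0) return("0")
  bits <- character(0)
  while (n > 0) {
    bits <- c(n %% 2, bits)  # prepend the current lowest bit
    n <- n %/% 2             # integer division drops that bit
  }
  paste(bits, collapse = "")
}

# Hypothetical data standing in for the asker's data frame.
data_example <- data.frame(numbers = c(1, 23, 34))

# vapply() calls dec2bin once per element and checks that each result
# is a single character string, so the new column has one value per row.
data_example$tv <- vapply(data_example$numbers, dec2bin, character(1))
data_example
```

The same vapply() call should also work with the original dec2binSingle in place of dec2bin, since both take a single value.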

String splitting in R Programming

Currently the script below splits a combined item code into specific item codes.
library(stringr)  # provides str_extract()
rule2 <- c("MR")
df_1 <- test[grep(paste("^",rule2,sep="",collapse = "|"),test$Name.y),]
SpaceName_1 <- function(s){
num <- str_extract(s,"[0-9]+")
if(nchar(num) >3){
former <- substring(s, 1, 4)
latter <- strsplit(substring(s,5,nchar(s)),"")
latter <- unlist(latter)
return(paste(former,latter,sep = "",collapse = ","))
}
else{
return (s)
}
}
df_1$Name.y <- sapply(df_1$Name.y, SpaceName_1)
For example, the combined item code Room 324-326 is split into MR324 MR325 MR326.
However, the combined item code Room 309-311 is split into MR309 MR300 MR301.
How should I amend the script so that it gives me MR309 MR310 MR311?
You can try something along these lines:
range <- "324-326"
x <- as.numeric(unlist(strsplit(range, split="-")))
paste0("MR", seq(x[1], x[2]))
[1] "MR324" "MR325" "MR326"
I assume that you can obtain the numerical room sequence by some means, and then use the snippet I gave you above.
If your combined item codes always have the form Room xxx-yyy, then you can extract the range using gsub:
range <- gsub("Room ", "", "Room 324-326")
If your item codes were in a vector called codes, then you could obtain a vector of ranges using:
ranges <- sapply(codes, function(x) gsub("Room ", "", x))
We can also evaluate the string after replacing the - with : and then paste the prefix "MR".
paste0("MR", eval(parse(text=sub("\\S+\\s+(\\d+)-(\\d+)", "\\1:\\2", range))))
#[1] "MR324" "MR325" "MR326"
Wrap it as a function for convenience
fChange <- function(prefixStr, RangeStr){
paste0(prefixStr, eval(parse(text=sub("\\S+\\s+(\\d+)-(\\d+)",
"\\1:\\2", RangeStr))))
}
fChange("MR", range)
fChange("MR", range1)
#[1] "MR309" "MR310" "MR311"
For multiple elements, just loop over and apply the function
sapply(c(range, range1), fChange, prefixStr = "MR")
data
range <- "Room 324-326"
range1 <- "Room 309-311"
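If you would rather not evaluate text with eval(parse()), the same expansion can be done by pulling the two numbers out of the string directly. This is a minimal sketch under the assumption that each code contains exactly two runs of digits (start and end of the range); expand_rooms is a made-up helper name, not something from the answers above.

```r
# Expand a code such as "Room 309-311" into "MR309" "MR310" "MR311".
expand_rooms <- function(code, prefix = "MR") {
  # Extract every run of digits from the string, e.g. "309" and "311".
  bounds <- as.integer(regmatches(code, gregexpr("[0-9]+", code))[[1]])
  stopifnot(length(bounds) == 2)
  # Build the full numeric sequence, then attach the prefix.
  paste0(prefix, seq(bounds[1], bounds[2]))
}

rooms_a <- expand_rooms("Room 324-326")
rooms_b <- expand_rooms("Room 309-311")
rooms_b

# For several codes at once, lapply() keeps one result vector per code.
all_rooms <- lapply(c("Room 324-326", "Room 309-311"), expand_rooms)
```

Because the numbers are treated as a sequence rather than as individual characters, ranges that cross a tens boundary (309 to 311) come out correctly.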

Dynamic variable names in plots, files and compatibility with loop

I am trying to write a function that makes a plot and saves it into a file automatically.
The part I struggle with is doing both dynamically [plotname=varname & filename=varname],
and to make it compatible with calling it from a loop.
# Create data
my_df = cbind(uni=runif (100),norm=rnorm (100),bino=rbinom(100,20, 0.5)); head (my_df)
my_vec = my_df[,'uni'];
# How to make plot and file-name meaningful if you call the variable in a loop?
# if you call by name, the plotname is telling. It is similar what I would like to see.
hist(my_df[,'bino'])
for (plotit in colnames(my_df)) {
hist(my_df[,plotit])
print (plotit)
# this is already not meaningful
}
# step 2 write it into files
hist_auto <- function(variable, col ="gold1", ...) {
if ( length (variable) > 0 ) {
plotname = paste(substitute(variable), sep="", collapse = "_"); print (plotname); is (plotname)
# I would like to define plotname, and later tune it according to my needs
FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
hist (variable, main = plotname)
#this is apparently not working: I do not get my_df[, "bino"] or anything similar
dev.copy2pdf (file=FnP )
} else { print ("var empty") }
}
hist_auto (my_vec)
# name works, and is meaningful [as much as the var name ... ]
hist_auto (my_df[,'bino'])
# name sort of works, but falls apart
assign (plotit, my_df[,'bino'])
hist_auto (get(plotit))
# name works, but meaningless
# Now in a loop
for (plotit in colnames(my_df)) {
my_df[,plotit]
hist(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
hist_auto(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
assign (plotit, my_df[,plotit])
hist_auto (get(plotit))
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
My aim is to have a function that iterates over, e.g., the columns of a matrix, and plots and saves each one with a unique and meaningful name.
The solution will probably involve a smart combination of substitute(), parse(), eval() and paste(), but lacking a solid understanding of them I could not figure it out.
My basis of experimentation was:
how to dynamically call a variable?
How about something like this? You may need to install.packages("ggplot2")
library(ggplot2)
my_df <- data.frame(uni=runif(100),
norm=rnorm(100),
bino=rbinom(100, 20, 0.5))
# Note: aes_string() is deprecated in recent ggplot2 releases;
# aes(x = .data[[varname]]) is the documented replacement.
get_histogram <- function(df, varname, binwidth=1, save=TRUE) {
stopifnot(varname %in% names(df))
title <- sprintf("Histogram of %s", varname)
p <- (ggplot(df, aes_string(x=varname)) +
geom_histogram(binwidth=binwidth) +
ggtitle(title))
if(save) {
filename <- sprintf("histogram_%s.png", gsub(" ", "_", varname))
ggsave(filename, p, width=10, height=8)
}
return(p)
}
for(var in names(my_df))
get_histogram(my_df, var, binwidth=0.5) # If you want to save them
get_histogram(my_df, "uni", binwidth=0.1, save=FALSE) # If you want to look at a specific one
So I ended up with 2 functions: one that takes a data frame plus a column name, and another that takes a single vector. Using parts of Adrian's [thanks!] solution:
hist_dataframe <- function(df, colName, col ="gold1", ...) {
stopifnot(colName %in% colnames(df))
variable = df[,colName]
stopifnot(length (variable) >1 )
# deparse(substitute(df)) recovers the name of the data frame as passed by the caller
plotname = paste(deparse(substitute(df)),'__', colName, sep="")
FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
hist (variable, main = plotname, col = col, ...)
dev.copy2pdf (file=FnP )
}
And the one for simple vectors stays as in the question.
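The core idea in both answers is to pass the column name as a string and use that string for both the plot title and the file name. A base-graphics sketch of that idea follows. It is not the poster's final code: save_histograms is a made-up name, and I am assuming that writing straight to a pdf() device in a temporary directory is acceptable instead of copying the screen device with dev.copy2pdf().

```r
# Plot one histogram per column and save each to its own PDF file.
save_histograms <- function(df, out_dir = tempdir(), col = "gold1") {
  files <- character(0)
  for (nm in colnames(df)) {
    # The column name drives both the file name and the plot title,
    # so every file is unique and self-describing.
    file_path <- file.path(out_dir, paste0(nm, ".hist.pdf"))
    pdf(file_path)
    hist(df[, nm], main = nm, xlab = nm, col = col)
    dev.off()
    files <- c(files, file_path)
  }
  invisible(files)  # return the paths without printing them
}

my_df <- cbind(uni = runif(100), norm = rnorm(100), bino = rbinom(100, 20, 0.5))
written <- save_histograms(my_df)
basename(written)
```

Iterating over names rather than over the data itself avoids the need for substitute(), assign() or get() altogether.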

How can I include a variable name in a function call in R?

I'm trying to change the name of a variable that is included inside a for loop and function call. In the example below, I'd like column_1 to be passed to the plot function, then column_2 etc. I've tried using do.call, but it returns "object 'column_j' not found". But object column_j is there, and the plot function works if I hard-code them in. Help much appreciated.
for (j in 2:12) {
column_to_plot = paste("column_", j, sep = "")
do.call("plot", list(x, as.name(column_to_plot)))
}
I do:
x <- runif(100)
column_2 <-
column_3 <-
column_4 <-
column_5 <-
column_6 <-
column_7 <-
column_8 <-
column_9 <-
column_10 <-
column_11 <-
column_12 <- rnorm(100)
for (j in 2:12) {
column_to_plot = paste("column_", j, sep = "")
do.call("plot", list(x, as.name(column_to_plot)))
}
And I get no errors. Maybe you could post the hard-coded version that (according to your question) works; then it will be simpler to find the reason for the error.
(I know that I can generate vectors using loop and assign, but I want to provide clear example)
You can do it without the paste() command in your for loop. Simply assign the columns via the function colnames() in your loop:
column_to_plot <- colnames(dataframeNAME)[j]
Hope that helps as a first kludge.
Are you trying to retrieve an object in the workspace by a character string? In that case, parse() might help:
for (j in 2:12) {
column_to_plot = paste("column_", j, sep = "")
plot(x, eval(parse(text=column_to_plot)))
}
In this case you could use do.call(), but it would not be required.
Edit: wrapped parse() in eval().
Here is one way to do it:
tmp.df <- data.frame(col_1=rnorm(10),col_2=rnorm(10),col_3=rnorm(10))
x <- seq(2,20,by=2)
plot(x, tmp.df$col_1)
for(j in 2:3){
name.list <- list("x",paste("col_",j,sep=""))
with(tmp.df, do.call("lines",lapply(name.list,as.name)))
}
You can also do colnames(tmp.df)[j] instead of paste(..) if you'd like.
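For completeness, base R's get() looks up an object by its name given as a string, which covers the same case as eval(parse(text = ...)) with less machinery. The sketch below uses made-up data (x, column_2, column_3, dat) and only two columns to stay short; the second loop assumes the vectors can live in a data frame, in which case [[ ]] indexing by name is usually the simpler route.

```r
x <- runif(100)
column_2 <- rnorm(100)
column_3 <- rnorm(100)

# Variant 1: separate objects in the workspace, retrieved by name with get().
for (j in 2:3) {
  column_to_plot <- paste0("column_", j)
  plot(x, get(column_to_plot), ylab = column_to_plot)
}

# Variant 2: columns of a data frame, selected by name with [[ ]].
dat <- data.frame(column_2 = column_2, column_3 = column_3)
for (nm in names(dat)) {
  plot(x, dat[[nm]], ylab = nm)
}
```

In both variants the string is also passed as the axis label, so each plot shows which column it came from.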
