Paste multiple elements in R

I wrote this code using a for loop. (Its main purpose is to build a list of web page addresses, which are then fetched via httr and rvest.)
r = "asdgkjkhdf"
t = "osrt"
all = c()
for(i in 1:400)
{
y = paste(r, i, sep = '')
d = paste(y, t, sep = '')
all = c(all, d)
}
all
I got results like these (the pasted numbers accumulate in each result):
[1]asdgkjkhdf1osrt
[2]asdgkjkhdf12osrt
[3]asdgkjkhdf123osrt
[4]asdgkjkhdf1234osrt
...
But I want results like these, regardless of how many numbers I put in the for() statement:
[1]asdgkjkhdf1osrt
[2]asdgkjkhdf2osrt
...
[400]asdgkjkhdf400osrt
What should I change to get this result?
Should I use paste(substr(), substr(), sep='')?

If you really want to use a for loop, you can use the following:
r = "asdgkjkhdf"
t = "osrt"
all = c()
for (idx in 1:400)
all = c(all, paste0(r, idx, t))
However, in R you should generally prefer vectorized code over for loops: it is usually more readable, and growing a vector with c() inside a loop hurts performance. The solution without the for loop (given by Roland in the comments) is
all <- paste0(r, 1:400, t)
Note that paste0("string") is just shorthand for paste("string", sep = '').
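As a quick check of the two approaches above, here is a minimal sketch that builds the 400 strings both ways and compares them. The names prefix, suffix, loop_result and vec_result are illustrative only and do not appear in the original question.

```r
# Illustrative names; the original question used r, t and all.
prefix <- "asdgkjkhdf"
suffix <- "osrt"

# Loop version: append one string per iteration.
loop_result <- character(0)
for (idx in 1:400) {
  loop_result <- c(loop_result, paste0(prefix, idx, suffix))
}

# Vectorized version: paste0 recycles the length-1 strings over 1:400.
vec_result <- paste0(prefix, 1:400, suffix)

# paste0(...) and paste(..., sep = "") build the same strings.
sep_result <- paste(prefix, 1:400, suffix, sep = "")

head(vec_result, 2)
```

Both versions give the same vector; the vectorized call simply avoids copying the growing vector on every iteration.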

Related

Dynamically create subsets in R with a for loop

I am trying to create different subsets of a table, and with each iteration I want to shift one column upwards. So far I have done this with the following code, but it is hard-coded rather than dynamic:
subset_cor_lag00 <- subset(data_24h, select = c(price_return, sentiment_return, tweet_return))
korr_tab_lag00 <- cor(subset_cor_lag00)
subset_cor_lag01 <- transform(subset_cor_lag00, price_return = lead(price_return))
subset_cor_lag01 <- na.omit(subset_cor_lag01)
korr_tab_lag01 <- cor(subset_cor_lag01)
I then tried to do this dynamically, but I got stuck. Maybe someone has a hint; I would really appreciate it. I tried this:
for(i in 1:5) {
paste0("subset_cor_lag0", i) <- transform(paste0("subset_cor_lag0", i-1), price_return = lead(price_return))
paste0("subset_cor_lag0", i) <- na.omit(paste0("subset_cor_lag0", i))
paste0("korr_tab_lag0", i) <- cor(paste0("subset_cor_lag0", i))
}
You can use assign for this, but usually having sequentially named variables isn't nice to work with. The better way is to use a list:
subset_cor_lag = list(subset(data_24h, select = c(price_return, sentiment_return, tweet_return)))
for(i in 2:6) {
temp = transform(subset_cor_lag[[i - 1]], price_return = lead(price_return))
subset_cor_lag[[i]] = na.omit(temp)
}
korr_tab = lapply(subset_cor_lag, cor)
## add names, if desired:
name_vec = paste0("lag", 0:5)
names(subset_cor_lag) = name_vec
names(korr_tab) = name_vec
You can then access, e.g., subset_cor_lag[["lag2"]] or subset_cor_lag[[3]], which is easy to do programmatically in a loop or with lapply.
See my answer at How to make a list of data frames? for more discussion and examples.
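To make the list approach concrete, here is a self-contained sketch on made-up data. The toy data_24h values and the base-R helper lead1 (a stand-in for dplyr's lead) are assumptions for illustration, not part of the original post.

```r
# Toy data standing in for the asker's data_24h (values are random).
set.seed(1)
data_24h <- data.frame(
  price_return     = rnorm(20),
  sentiment_return = rnorm(20),
  tweet_return     = rnorm(20)
)

# Hypothetical base-R replacement for dplyr::lead(): shift up by one, pad with NA.
lead1 <- function(x) c(x[-1], NA)

# Element 1 is lag 0; each further element shifts price_return up once more.
subset_cor_lag <- list(data_24h[, c("price_return", "sentiment_return", "tweet_return")])
for (i in 2:6) {
  temp <- transform(subset_cor_lag[[i - 1]], price_return = lead1(price_return))
  subset_cor_lag[[i]] <- na.omit(temp)
}

# One correlation matrix per lag, named lag0 ... lag5.
korr_tab <- lapply(subset_cor_lag, cor)
names(korr_tab) <- paste0("lag", 0:5)

korr_tab[["lag2"]]
```

Each shift drops one row through na.omit, so the lag-5 subset has five fewer rows than the original table.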

How to convert for loop to apply in r [closed]

Closed 3 years ago.
I have the following loop. I am trying to convert it into using the apply function instead of a loop, but I don't know how to rewrite the code.
for (i in 1:dim(Y)[2]) {
K = K_origin
print(i)
e = lmmlite::eigen_rotation(K, t(Y)[i,], use_cpp = T)
VC = lmmlite::fitLMM(
e$Kva,
e$y,
e$X,
reml = T,
use_cpp = T,
tol = 1e-6,
check_boundary = T
)
write.table(
VC$sigmasq_g,
"Vg_temp.txt",
row.names = F,
col.names = F,
append = T,
quote = F,
sep = "\n"
)
write.table(
VC$sigmasq_e,
"Ve_temp.txt",
row.names = F,
col.names = F,
append = T,
quote = F,
sep = "\n"
)
}
I want results like these
Vg Ve
1.15521325512487 0.755118863386436
0.579039221720728 1.21733212837417
0.372439354137817 0.296327744338075
0.0668396114713355 0.300417453013007
0.00771158861391208 0.100176380868691
0.210174870097273 0.141907482831872
R's apply functions have to be formulated as 1) something to iterate over, and 2) a function to apply to each element in (1).
But! Whether you will gain anything from converting your particular loop into an apply, is doubtful. If your loop is slow, I am guessing it is due to the operations performed, not that "R is slow on loops". If you only have 6 rows in Y, you will gain nothing from re-formulating the loop into an apply!
For your loop, each i is independent (as opposed to looping over i, when a result depends on the calculation on i-1). So that makes it very easy to re-formulate. Generally,
for (i in some_sequence) {
do something with i
}
can be reformulated to
my_do_something <- function(i) {
do something
}
for (i in some_sequence) {
my_do_something(i)
}
which can in turn be directly reformulated to
sapply(some_sequence, my_do_something)
In your case, this would be
my_rotate <- function(i) {
e = lmmlite::eigen_rotation(K, t(Y)[i,], use_cpp = T)
VC = lmmlite::fitLMM( ... )
write.table(...)
write.table(...)
NULL
}
sapply(seq_len(dim(Y)[2]), my_rotate)
Notice how I added a NULL at the bottom of the function? That is because sapply gathers the values returned by the iterated function, and without it the function would return whatever the last write.table call returns. Try the function without the final NULL and see what sapply returns.
But wait, there's more!
Since you are iterating over the rows of t(Y), i.e. the columns of Y (and asking about apply in particular), let's just drop the i stuff and feed the function the column directly:
my_rotate_row <- function(x) {
# you might or might not need to either use x as is, transpose it as t(x) or double transpose it, t(t(x)), to get the correct orientation.
# x is most likely an atomic vector, whereas `eigen_rotation` might be requiring either a row-vector or a column vector.
e = lmmlite::eigen_rotation(K, x, use_cpp = T)
VC = lmmlite::fitLMM( ... )
# lets collect data directly into a data.frame or matrix, instead of using files:
c(VC$sigmasq_g, VC$sigmasq_e)
}
Now you can use apply:
apply(Y, 2, my_rotate_row)
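The loop-to-apply conversion can be tried without lmmlite. The sketch below uses a hypothetical stand-in, fit_column, that just returns two numbers per column; it is not the real model fit. Note that apply(Y, 2, f) puts one column per input column in its result, so t() is needed to get one row per column of Y, as in the desired Vg/Ve table.

```r
# Toy matrix: 5 observations (rows) for 6 traits (columns).
set.seed(42)
Y <- matrix(rnorm(30), nrow = 5, ncol = 6)

# Hypothetical stand-in for the eigen_rotation + fitLMM step:
# returns two named numbers per column instead of fitted variance components.
fit_column <- function(x) c(Vg = var(x), Ve = mean(x)^2)

# Loop version: fill a preallocated result matrix row by row.
res_loop <- matrix(NA_real_, nrow = ncol(Y), ncol = 2,
                   dimnames = list(NULL, c("Vg", "Ve")))
for (i in seq_len(ncol(Y))) {
  res_loop[i, ] <- fit_column(Y[, i])
}

# apply version: the result is 2 x 6, so transpose to get one row per column of Y.
res_apply <- t(apply(Y, 2, fit_column))

res_apply
```

Collecting the values in a matrix like this also removes the need for the temporary text files used in the original loop.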

Using for loop and rbind to iterate over multiple files

I have a small R script of 14 functions and want to run it for 81 files. While I have read several posts on Stack Overflow that address similar issues, I am still having trouble getting this working. I am using a for loop and rbind.
All functions within the { } of the loop work. I have tested them without the for loop and I get the vector of data that I need. But when I run the for loop I only get an output for the last file in the folder. I am not sure what is going on.
Is the for loop working right (is it iterating through the files) and simply overwriting the previous runs? If the for loop is working then I assume I have a problem with my rbind. Or, is the for loop only running the last file in list.files()?
In the end, I want a matrix (or table) with the results of the 14 functions for each of the 81 files.
Here is the code:
res=(1:14)
for(i in list.files())
{
nd = read.csv(i, header= TRUE, row.names =1, check.names = FALSE)
mx = as.matrix(nd)
res[1]=basename(i)
res[2]=-99 #this is just a place holder
res[3]=gden(mx)
res[4]=centralization(mx,degree)
deg = degree(mx, gmode="graph", diag=FALSE, rescale=FALSE)
res[5]=mean(deg)
res[6]=sd(deg)
res[7]=max(deg)
res[8]=min(deg)
Ndeg = degree(mx, gmode="graph", diag=FALSE, rescale=TRUE)*1000
res[9]=mean(Ndeg)
res[10]=sd(Ndeg)
res[11]=max(Ndeg)
res[12]=min(Ndeg)
iso = isolates(mx, diag=FALSE)
res[13]=length(iso)
res[14]=nrow(mx)
}
results=rbind(res)
results
Wrap your set of function calls in a new function and sapply it over every element of list.files():
out <- sapply(list.files(), function(i){
nd = read.csv(i, header= TRUE, row.names =1, check.names = FALSE)
mx = as.matrix(nd)
res = numeric(14)
res[1]=basename(i)
res[2]=-99 #this is just a place holder
res[3]=gden(mx)
res[4]=centralization(mx,degree)
deg = degree(mx, gmode="graph", diag=FALSE, rescale=FALSE)
res[5]=mean(deg)
res[6]=sd(deg)
res[7]=max(deg)
res[8]=min(deg)
Ndeg = degree(mx, gmode="graph", diag=FALSE, rescale=TRUE)*1000
res[9]=mean(Ndeg)
res[10]=sd(Ndeg)
res[11]=max(Ndeg)
res[12]=min(Ndeg)
iso = isolates(mx, diag=FALSE)
res[13]=length(iso)
res[14]=nrow(mx)
return(res)
})
out
You have to call rbind inside the loop. results = rbind(res) on its own is not enough; you need something like results = rbind(results, res), so that each iteration appends a row.
How you store the results (as a matrix, a list of vectors, etc.) is up to you.
You would also be better off using lapply, sapply, etc. instead of a loop.
I also posted this question on my university listserv and a fellow student provided the following fixes. And now it works :)
res=(1:14)
summary=(1:14)
for(i in list.files())
{
....code as above.....
summary=rbind(summary, res)
}
summary
# then to put into a .csv
write.csv(summary, "nameoffile.csv")
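The same pattern can be written without growing an object in the loop and without a placeholder first row. This is a minimal sketch, assuming three small CSV files created in a temporary directory; summarise_file and its two summary columns are hypothetical stand-ins for the 14 network statistics.

```r
# Create three small demo CSV files in a temporary directory.
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
for (k in 1:3) {
  write.csv(matrix(k, nrow = k + 1, ncol = 2),
            file.path(tmp_dir, paste0("file", k, ".csv")))
}

# Hypothetical per-file summary: returns a one-row data frame,
# so the file name stays character and the statistics stay numeric.
summarise_file <- function(path) {
  mx <- as.matrix(read.csv(path, header = TRUE, row.names = 1, check.names = FALSE))
  data.frame(file = basename(path), n_rows = nrow(mx), mean_value = mean(mx))
}

# One result per file, stacked into a single table after the loop.
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
results <- do.call(rbind, lapply(files, summarise_file))

results
```

Returning a one-row data frame per file avoids the coercion that happens when a file name and numbers are stored in the same vector.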

Substituting variables in a loop?

I am trying to write a loop in R, but I think the syntax is not correct, as it does not create the new objects. Here is a simplified example of what I am trying to do:
for i in (1:8) {
List_i <-List
colsToGrab_i <-grep(predefinedRegex_i, colnames(List_i$table))
List_i$table <- List_i$table[,predefinedRegex_i]
}
I have created 'predefinedRegex'es 1 to 8, which the grep should use to search.
The loop creates an object called "List_i" and then fails to find "predefinedRegex_i".
I have tried putting quotes around the "i", putting $ in front of the i, and also [i], but none of these work.
Any help much appreciated. Thank you.
Using @RyanGrammel's answer below:
#CREATING regular expressions for grabbing sets groups 1 -7 ::::
g_1 <- "DC*"
g_2 <- "BN_._X.*"
g_3 <- "BN_a*"
g_4 <- "BN_b*"
g_5 <- "BN_a_X.*"
g_6 <- "BN_b_X.*"
g_7 <- "BN_._Y.*"
for (i in 1:8)
{
assign(x = paste("tableA_", i, sep=""), value = BigList$tableA)
assign(x = paste("Forgrep_", i, sep=""), value = colnames(get(x = paste("tableA_", i, sep=""))))
assign(x = paste("grab_", i, sep=""), value = grep((get(x = paste("g_",i, sep=""))), (get(x = paste("Forgrep_",i, sep="")))))
assign(x = paste("tableA_", i, sep=""), value = BigList$tableA[,get(x = paste("grab_",i, sep=""))])
}
This loop is repeated for each table inside "BigList".
I found I could not extract columnnames from
(get(x = paste("BigList_", i, "$tableA" sep=""))))
or from
(get(x = paste("BigList_", i, "[[2]]" sep=""))))
so it was easier to extract the tables first. I will now write a loop to repack the lists up.
Problem
Your syntax is off: R does not substitute the value of i into variable names.
for(i in 1:10) name_i <- 1
The above code doesn't assign name_1, name_2, ..., name_10. It assigns "name_i" over and over again.
To create a list, you call 'list()', not List.
Creating a variable List_i in a loop doesn't assign List_1, List_2, ..., List_8.
It repeatedly assigns a list to the name 'List_i'. Think about it: if R named variables the way you tried to, it would be equally likely to name your variables L1st_1, L2st_2, ... See 'Solution' for some valid R code that does something similar.
'predefinedRegex_i' isn't interpreted as an attempt to get the variables 'predefinedRegex_1', 'predefinedRegex_2', and so on.
However, get(paste0("predefinedRegex_", i)) is interpreted in this way. Just make sure i actually has a value when using this. See below.
Solution:
In general, use this to dynamically assign variables (List_1, List_2, ...):
assign(x = paste0("prefix_", i), value = i)
If i is equal to 199, this code assigns the value 199 to the variable prefix_199.
In general, use this to dynamically get the variables you assigned using the above snippet of code:
get(x = paste0("prefix_", i))
If i is equal to 199, this code gets the variable prefix_199.
That should solve the crux of your problem; if you need any further help feel free to ask for clarification here, or contact me via my Twitter Feed.
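A short sketch of both calls in action, followed by a named-list alternative that avoids building variable names altogether. The patterns and column names below are made up for illustration and are not the asker's real data.

```r
# Dynamic assignment: creates prefix_1, prefix_2, prefix_3.
for (i in 1:3) {
  assign(x = paste0("prefix_", i), value = i * 10)
}

# Dynamic lookup: fetch prefix_2 by building its name as a string.
value_2 <- get(x = paste0("prefix_", 2))

# Alternative: keep related objects in one named list and iterate over it.
# These patterns and column names are illustrative only.
patterns  <- list(g_1 = "^DC", g_2 = "^BN_a")
col_names <- c("DC_1", "DC_2", "BN_a_X1", "BN_b_X1")
grabbed   <- lapply(patterns, function(p) grep(p, col_names, value = TRUE))

grabbed
```

With the list version, the loop index never has to be pasted into a variable name, and the results stay together in one object.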

a reliable way to tell if = is for assignment in R code?

I'm a stubborn useR who uses = instead of <- all the time, and apparently many R programmers will frown on this. I wrote the formatR package which can replace = with <- based on the parser package. As some of you might know, parser was orphaned on CRAN a few days ago. Although it is back now, this made me hesitant to depend on it. I'm wondering if there is another way to safely replace = with <-, because not all ='s mean assignment, e.g. fun(a = 1). Regular expressions are unlikely to be reliable (see line 18 of the mask.inline() function in formatR), but I will certainly appreciate it if you can improve mine. Perhaps the codetools package can help?
A few test cases:
# should replace
a = matrix(1, 1)
a = matrix(
1, 1)
(a = 1)
a =
1
function() {
a = 1
}
# should not replace
c(
a = 1
)
c(
a = c(
1, 2))
This answer uses regular expressions. There are a few edge cases where it will fail but it should be okay for most code. If you need perfect matching then you'll need to use a parser, but the regexes can always be tweaked if you run into problems.
Watch out for
#quoted function names
`my cr*azily*named^function!`(x = 1:10)
#Nested brackets inside functions
mean(x = (3 + 1:10))
#assignments inside if or for blocks
if((x = 10) > 3) cat("foo")
#functions running over multiple lines will currently fail
#maybe fixable with paste(original_code, collapse = "\n")
mean(
x = 1:10
)
The code is based upon an example on the ?regmatches page. The basic idea is: swap function contents for a placeholder, do the replacement, then put your function contents back.
#Sample code. For real case, use
#readLines("source_file.R")
original_code <- c("a = 1", "b = mean(x = 1)")
#Function contents are considered to be a function name,
#an open bracket, some stuff, then a close bracket.
#Here function names are considered to be a letter or
#dot or underscore followed by optional letters, numbers, dots or
#underscores. This matches a few non-valid names (see ?make.names
#and warning above).
function_content <- gregexpr(
"[[:alpha:]._][[:alnum:]._]*\\([^)]*\\)",
original_code
)
#Take a copy of the code to modify
copy <- original_code
#Replace all instances of function contents with the word PLACEHOLDER.
#If you have that word inside your code already, things will break.
copy <- mapply(
function(pattern, replacement, x)
{
if(length(pattern) > 0)
{
gsub(pattern, replacement, x, fixed = TRUE)
} else x
},
pattern = regmatches(copy, function_content),
replacement = "PLACEHOLDER",
x = copy,
USE.NAMES = FALSE
)
#Replace = with <-
copy <- gsub("=", "<-", copy)
#Now substitute back your function contents
(fixed_code <- mapply(
function(pattern, replacement, x)
{
if(length(replacement) > 0)
{
gsub(pattern, replacement, x, fixed = TRUE)
} else x
},
pattern = "PLACEHOLDER",
replacement = regmatches(original_code, function_content),
x = copy,
USE.NAMES = FALSE
))
#Write back to your source file
#writeLines(fixed_code, "source_file_fixed.R")
Kohske sent a pull request to the formatR package which solved the problem using the codetools package. The basic idea is to set up a code walker to walk through the code; when it detects = as the function symbol of a call, it is replaced by <-. This works because of the "Lisp nature" of R: x = 1 is actually `=`(x, 1) (we replace it by `<-`(x, 1)); of course, = is treated differently in the parse tree of fun(x = 1).
The formatR package (>= 0.5.2) has since got rid of dependency on the parser package, and replace.assign should be robust now.
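The parse-tree idea can be illustrated with base R alone. The sketch below is not the formatR or codetools implementation; replace_eq is a hypothetical helper that walks a parsed expression and renames `=` calls to `<-`. It assumes the code contains no calls with empty arguments (such as x[1, ]), and deparse() discards comments and the original layout.

```r
# Hypothetical helper: recursively rename `=` calls to `<-` in a parsed expression.
# Named arguments such as fun(a = 1) are not `=` calls in the parse tree,
# so they are left untouched.
replace_eq <- function(e) {
  if (!is.call(e)) {
    return(e)
  }
  if (identical(e[[1]], as.name("="))) {
    e[[1]] <- as.name("<-")
  }
  # Rebuild the call from its (recursively processed) components.
  as.call(lapply(as.list(e), replace_eq))
}

code  <- c("a = 1", "b = mean(x = 1)", "c(a = 1)")
exprs <- parse(text = code, keep.source = FALSE)

# Turn each processed expression back into source text.
fixed <- vapply(
  exprs,
  function(e) paste(deparse(replace_eq(e)), collapse = "\n"),
  character(1)
)

fixed
```

Because the decision is made on the parse tree rather than on the text, the multi-line and nested cases that trip up regular expressions are handled the same way as the one-line ones.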
The safest (and probably fastest) way to replace = by <- is directly typing <- instead of trying to replace it.
