I'm very new to R and I would like to write a loop that returns the search volume (through an API call) for a list of keywords.
Here is the code that I used:
install.packages("SEMrushR")
library(SEMrushR)
mes_keywords_to_check <- readLines("voyage.txt") # List of keywords to check
mes_keywords_to_check <- as.character(mes_keywords_to_check)
# Loop
for (i in 1:length(mes_keywords_to_check)) {
test_keyword <- as.character(mes_keywords_to_check[i])
df_test_2 <- keyword_overview_all(test_keyword, "fr","API KEY NUMBER") ##keyword_overview_all is the function from the Semrush package
}
By doing this, I only get the search volume for one keyword in the list. My purpose is of course to get the data required for the full list of keywords.
Here is the table that I get:
[Image: screenshot of the resulting table, not available]
Do you have any idea how I could solve this issue?
Well, you need to add your results to some kind of container, for example a list. As of now, you have just one object that is overwritten with the data from the most recent iteration of your loop.
results = list()
for (i in 1:length(mes_keywords_to_check)) {
test_keyword <- as.character(mes_keywords_to_check[i])
df_test_2 <- keyword_overview_all(test_keyword, "fr","API KEY NUMBER") ##keyword_overview_all is the function from the Semrush package
results[[i]] <- df_test_2
}
However, many R users would suggest avoiding an explicit loop here:
library("plyr")
result <- plyr::ldply(mes_keywords_to_check, function(x) keyword_overview_all(as.character(x), "fr","API KEY NUMBER"))
I did not test this, and it probably needs some tweaking, but it should point you in the right direction.
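To make the list-based approach above concrete without calling the real API, here is a minimal sketch. `fake_keyword_overview` is a hypothetical stand-in for `keyword_overview_all`: it only assumes that each call returns a one-row data frame, and the search volume it produces is made up.

```r
# Hypothetical stand-in for keyword_overview_all(): no API call is made,
# it only mimics "one row of data per keyword".
fake_keyword_overview <- function(keyword, database, api_key) {
  data.frame(keyword = keyword,
             database = database,
             search_volume = nchar(keyword) * 100,  # made-up number
             stringsAsFactors = FALSE)
}

keywords <- c("voyage", "vol pas cher", "hotel")

# Pre-allocate one slot per keyword, then fill the slots in the loop.
results <- vector("list", length(keywords))
for (i in seq_along(keywords)) {
  results[[i]] <- fake_keyword_overview(keywords[i], "fr", "API KEY NUMBER")
}

# Bind all one-row data frames into a single data frame.
final_result <- do.call(rbind, results)
print(final_result)
```

Using `seq_along()` instead of `1:length()` also avoids a stray iteration when the keyword vector is empty.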
It looks like you're reading in the text file with readLines("voyage.txt"), which returns a character vector with one element per line. These lines are then passed to the for loop one at a time. The code below converts the lines to words. There are various approaches, but this one uses a loop within a loop, so that you can keep using for() and search line by line, word by word. It also uses a regex to split on non-alphanumeric characters, so that punctuation around words is dropped.
mes_lines <- readLines("voyage.txt") # List of keywords to check
mes_lines <- as.character(mes_lines)
search_results <- list()
for (i in 1:length(mes_lines)) {
mes_keywords_to_check <- unlist(strsplit(mes_lines[i],"[^[:alnum:]]")) # split only the current line
mes_keywords_to_check <- mes_keywords_to_check[nchar(mes_keywords_to_check)>0]
if (length(mes_keywords_to_check)==0) next
for (w in 1:length(mes_keywords_to_check))
{
test_keyword <- as.character(mes_keywords_to_check[w])
print(paste0("Checking word=",test_keyword))
df_test_2 <- keyword_overview_all(test_keyword, "fr","API KEY NUMBER") ##keyword_overview_all is the function from the Semrush package
search_results[[length(search_results) + 1]] <- df_test_2 # append() would splice the data frame's columns into the list
}
}
search_results
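The splitting step can also be done for all lines at once, since strsplit() is vectorised over its input. This is only a sketch with made-up example lines (`example_lines` is not from the original file), and it makes no API call:

```r
# Made-up lines standing in for the contents of voyage.txt.
example_lines <- c("voyage pas cher", "", "billet avion, hotel")

# Split every line on runs of non-alphanumeric characters, flatten the result,
# and drop any empty strings left over from blank lines or punctuation.
words <- unlist(strsplit(example_lines, "[^[:alnum:]]+"))
words <- words[nzchar(words)]
print(words)
```

The resulting vector could then be looped over exactly like mes_keywords_to_check in the inner loop above.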
Thanks for pointing me in the right direction.
Here is what I did, and it works:
final_result <- data.frame()
mes_keywords_to_check <- readLines("voyage.txt")
mes_keywords_to_check <- as.character(mes_keywords_to_check)
for (i in 1:length(mes_keywords_to_check)) {
test_keyword <- as.character(mes_keywords_to_check[i])
df_test_2 <- keyword_overview_all(test_keyword, "fr","API KEY")
final_result <- rbind(final_result,df_test_2)
}
Related
I regularly come up against the issue of how to categorise dataframes from a list of dataframes according to certain values within them (e.g. numeric values, factor strings, etc.). I am using a simplified version with vectors here.
After writing messy for loops for this task a bunch of times, I am trying to write a function to repeatedly solve the problem. The code below returns a subscripting error (given at the bottom), however I don't think this is a subscripting problem, but to do with my use of return.
As well as fixing this, I would be very grateful for any pointers on whether there are any cleaner / better ways to code this function.
library(plyr)
library(dplyr)
#dummy data
segmentvalues <- c('1_P', '2_B', '3_R', '4_M', '5_D', '6_L')
trialvec <- vector()
for (i in 1:length(segmentvalues)){
for (j in 1:20) {
trialvec[i*j] <- segmentvalues[i]
}
}
#vector categorisation
vcategorise <- function(categories, data) {
#categorises a vector into a list of vectors
#requires plyr and dplyr
assignment <- list()
catlength <- length(categories)
for (i in 1:length(catlength)){
for (j in 1:length(data)) {
if (any(contains(categories[i], ignore.case = TRUE,
as.vector(data[j])))) {
assignment[[i]][j] <- data[j]
}
}
}
return (assignment)
}
result <- vcategorise(categories = segmentvalues, data = trialvec)
Error in *tmp*[[i]] : subscript out of bounds
You are indexing assignment -- which is ok, even if at an index that doesn't have a value, that just gives you NULL -- and then indexing into what you get there -- which won't work if you get NULL. And NULL is what you will get, because you haven't allocated the list at the right size.
In any case, I don't think it is necessary for you to allocate a table. You are already using a flat indexing structure in your test data generation, so why not do the same with assignment and then set its dimensions afterwards?
Something like this, perhaps?
vcategorise <- function(categories, data) {
assignment <- vector("list", length = length(data) * length(categories))
n <- length(data)
for (i in 1:length(categories)){
for (j in 1:length(data)) {
assignment[(i-1)*n + j] <-
if (any(contains(categories[i],
ignore.case = TRUE,
as.vector(data[j])))) {
data[j]
} else {
NA
}
}
}
dim(assignment) <- c(length(data), length(categories))
assignment
}
It is not the prettiest code, but without fully understanding what you want to achieve, I don't know how to go further.
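If the goal is simply one vector per category, a base R sketch along the following lines may be enough. It is an assumption about the intent, not the original poster's method: `vcategorise_sketch` is a hypothetical name, matching is done with grepl() as a case-insensitive literal substring test, and the dummy data is assumed to be meant as 20 copies of each segment value.

```r
segmentvalues <- c("1_P", "2_B", "3_R", "4_M", "5_D", "6_L")
# Assumed intent of the dummy data: 20 copies of each segment value.
trialvec <- rep(segmentvalues, each = 20)

vcategorise_sketch <- function(categories, data) {
  # One list element per category, holding the data values that contain it.
  out <- lapply(categories, function(category) {
    hits <- grepl(tolower(category), tolower(data), fixed = TRUE)
    data[hits]
  })
  names(out) <- categories
  out
}

result <- vcategorise_sketch(segmentvalues, trialvec)
print(lengths(result))
```

Because lapply() builds the list itself, no element has to be allocated before it is filled, which sidesteps the subscript error entirely.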
I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list, going one by one. Right now it only gives me the final value; it doesn't save each loop iteration as a separate value in a list or vector.
The reason your code does not work is that the object a is overwritten in each iteration. The following code, for instance, does precisely what you want:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something that works. Hmmmm, geomean. Cool. The last line coerces the result back to a list.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can try defining the variable that stores the results first:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
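For comparison, the cumulative geometric mean can also be computed without a loop, because the geometric mean of the first i values equals exp of the running mean of their logs. The sketch below assumes strictly positive values; `geomean_sketch` is a hypothetical base R stand-in for the geomean function used above, and `x` is made-up data rather than er1$CW.

```r
# Hypothetical stand-in for geomean(): geometric mean via logs (positive values only).
geomean_sketch <- function(x) exp(mean(log(x)))

x <- c(2, 8, 4, 1, 16)  # made-up data

# Loop version with a pre-allocated result vector.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- geomean_sketch(x[1:i])
}

# Vectorised version: running sum of logs divided by the running count.
vectorised <- exp(cumsum(log(x)) / seq_along(x))

print(all.equal(loop_result, vectorised))
```

Pre-allocating with numeric(length(x)) avoids regrowing the vector on every iteration, which is what the c() and rbind() variants do.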
I have problems storing user-defined functions in an R list when they are added to it in a for loop.
I have to define some segment-specific functions based on some parameters, so I create the functions and put them in a list while looping through the segments with a for loop. The problem is that I get the same function everywhere in the resulting list.
The code looks like this:
n <- 100
segmenty <- 1:n
segment_functions <- list()
for (i in segmenty){
segment_functions[[i]] <- function(){return(i)}
}
When I run the code, what I get is the same function (the last one created in the loop) for all indexes:
## for all k
segment_functions[[k]]()
[1] 100
There is no problem when I put the functions into the list manually, e.g.
segment_functions[[1]] <- function(){return(1)}
segment_functions[[2]] <- function(){return(2)}
segment_functions[[3]] <- function(){return(3)}
works just fine.
I honestly have no idea what's wrong. Could you help?
You need to use the force function to ensure that the evaluation of i is done during the assignment into the list:
n <- 100
segmenty <- 1:n
segment_functions <- list()
f <- function(i) { force(i); function() return(i) }
for (i in segmenty){
segment_functions[[i]] <- f(i)
}
I'd use lapply and capture i in a closure of the wrapper:
segment_functions <- lapply(1:100, function(i) function() i)
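The difference between the two behaviours can be seen side by side in a small sketch. The names `late`, `early` and `make_constant` are made up for illustration; the factory follows the force() pattern from the answer above.

```r
# Loop version: each function looks up the variable i only when it is called.
late <- list()
for (i in 1:3) {
  late[[i]] <- function() i
}
# The loop has finished, so i is 3 for every function.
late_values <- vapply(late, function(f) f(), integer(1))

# Factory version: each call to make_constant() gets its own environment,
# and force(i) evaluates the argument immediately.
make_constant <- function(i) {
  force(i)
  function() i
}
early <- list()
for (i in 1:3) {
  early[[i]] <- make_constant(i)
}
early_values <- vapply(early, function(f) f(), integer(1))

print(late_values)
print(early_values)
```

The functions created directly in the loop all share the environment in which the loop ran, which is why they all report the final value of i.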
I've created a simple loop to calculate the efficiency of some simulated data. It performs perfectly well whilst as a loop:
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums("NSE")
NSE_cal <- c(NSE_cal, object_cal)
}
#NSE_cal
#[1] 0.8466699 0.7577019 0.8128499 0.9163561 0.7868013 0.8462228
However, I want to apply this loop quite a few times - I need to vary the object flowSummary_NSE# and I have four different transformation types to apply. As a start, I put the loop inside a function, with only transformation needing to be specified, like so:
badFunction <- function(transformation){
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums(transformation)
NSE_cal <- c(NSE_cal, object_cal)
}
print(NSE_cal)
}
badFunction("NSE")
# [1] 0.8462228 0.8462228 0.8462228 0.8462228 0.8462228 0.8462228
The function has exactly the same information input as in the for loop on its own, except, for some reason, it outputs the same value for each case of i.
It is clear that I have done something wrong. But as far as I can see, it must be something simple contained within the function itself. However, in case the error is elsewhere, I have attached the code that generates the necessary data and dependent functions (here)
Any help would be much appreciated.
You need to pass objects into the nested function as arguments.
In your function_NSEsums.r script change the first line to NSEsums <- function(i, Qobs, Qsim) {
In your example_script.r change your code to the following:
badFunction <- function(transformation){
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums(transformation, Qobs = Qobs, Qsim = Qsim)
NSE_cal <- c(NSE_cal, object_cal)
}
print(NSE_cal)
}
badFunction("NSE")
# [1] 0.8466699 0.7577019 0.8128499 0.9163561 0.7868013 0.8462228
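The underlying cause is R's lexical scoping: a function that refers to a free variable looks it up where the function was defined, not where it is called from. That would be consistent with the repeated value in the question being the last one produced by the standalone loop. A minimal sketch with made-up names (`report_global`, `report_argument`, `wrapper`) and a made-up value for Qobs:

```r
Qobs <- "global"

# Refers to Qobs as a free variable: it is looked up where the function was defined.
report_global <- function() Qobs

# Receives Qobs explicitly as an argument.
report_argument <- function(Qobs) Qobs

wrapper <- function() {
  Qobs <- "local"  # local to wrapper(); invisible to report_global()
  c(via_global = report_global(), via_argument = report_argument(Qobs))
}

print(wrapper())
```

Only the version that takes Qobs as an argument sees the value set inside wrapper(), which is why passing Qobs and Qsim explicitly fixes the function.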
I am trying to keep an assigned object from a function (building a ts function to begin to model a univariate process, simple I know!). I am having trouble finding a method to keep objects in my workspace. It works fine just using a for loop but I would like to parameterize the following:
ts.builder<-function(x,y,z){
for(i in 9:13){
assign(paste(x,i,sep="_"),ts(yardstick[1:528,i], freq=24))
assign(paste(y,i,sep="_"),ts(yardstick[529:552,i], freq=24))
assign(paste(z,i,sep="_"),ts(yardstick[1:552,i], freq=24))
}
}
ts.builder("yard.book.training","yard.book.small.valid", "yard.book.valid")
Any pointers?
I am thinking it may need a return statement, yet I have not found this to be of use yet.
Untested (a reproducible example helps a lot):
ts.builder <- function() {
xd <- list()
yd <- list()
zd <- list()
for (i in 9:13) {
xd[[i]] <- ts(yardstick[1:528,i], freq=24)
yd[[i]] <- ts(yardstick[529:552,i], freq=24)
zd[[i]] <- ts(yardstick[1:552,i], freq=24)
}
list(yard.book.training=xd, yard.book.small.valid=yd, yard.book.valid=zd)
}
l <- ts.builder()
Then here are the returned values:
l$yard.book.training[[9]]
etc.
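A parameterised variant of the same idea might look like the sketch below. Everything beyond the row ranges and the frequency of 24 is an assumption: `yardstick_demo` is random data standing in for yardstick, and `ts_builder_sketch` and its argument names are hypothetical.

```r
set.seed(1)
# Random stand-in for the yardstick data: 552 rows, 13 columns.
yardstick_demo <- matrix(rnorm(552 * 13), nrow = 552, ncol = 13)

ts_builder_sketch <- function(data, cols,
                              train_rows = 1:528,
                              valid_rows = 529:552,
                              freq = 24) {
  # Build one ts object per requested column, named by column index.
  build <- function(rows) {
    out <- lapply(cols, function(i) ts(data[rows, i], frequency = freq))
    names(out) <- as.character(cols)
    out
  }
  # Return everything in one list instead of assigning into the workspace.
  list(training    = build(train_rows),
       small.valid = build(valid_rows),
       valid       = build(c(train_rows, valid_rows)))
}

l <- ts_builder_sketch(yardstick_demo, cols = 9:13)
print(length(l$training[["9"]]))
```

Naming the elements by column index avoids the empty slots 1 to 8 that appear when a list is filled with xd[[i]] for i in 9:13.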