My dataset test[[1]] can be found here.
I'm defining a function and using it in a for loop in the following code. The function is supposed to paste together pieces such as test[[1]], '$' and names(test[[1]])[1], then evaluate the resulting string as an R expression. So in this example, those pieces go in and test[[1]]$V1 comes out.
I then iterate the function over the variables in test[[1]].
Unfortunately, I keep getting this error: Error in stvar(test[[1]], j) <- NULL : could not find function "stvar<-".
stvar <- function(df,num) {
eval(parse(text=paste(deparse(substitute(df)),'$',names(df)[num],sep='')))
}
for (j in 1:length(names(test[[1]]))){
if (trimws(as.character(stvar(test[[1]],j)[1]))=="Div" &
grepl("^M",stvar(test[[1]],j)[3])==0) {
stvar(test[[1]],j) <- NULL
}
}
Also, not sure if this is important, but the for-loop finds columns containing certain characteristics (first observation == "Div", third observation doesn't start with 'M') and removes matching columns.
Is there a way I can make the loop recognize my function?
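On the error itself: R reads f(x) <- value as a call to a replacement function named "f<-", so stvar(test[[1]], j) <- NULL fails because only stvar exists, not "stvar<-". A minimal sketch of one way to sidestep the problem, assuming the goal is only to drop the matching columns. The toy data frame df stands in for test[[1]], and the name to_drop is made up for illustration:

```r
# Toy stand-in for test[[1]] (hypothetical data; the real dataset is not shown)
df <- data.frame(
  V1 = c("Div", "x", "Mxx"),
  V2 = c(" Div ", "y", "Kyy"),
  V3 = c("Other", "z", "Nzz"),
  stringsAsFactors = FALSE
)

# TRUE for columns whose first value is "Div" and whose third value
# does not start with "M"
to_drop <- vapply(df, function(col) {
  trimws(as.character(col[1])) == "Div" && !grepl("^M", as.character(col[3]))
}, logical(1))

# Keep everything else; drop = FALSE keeps a data frame even if one column remains
df <- df[, !to_drop, drop = FALSE]
names(df)
```

Working on the columns directly avoids eval(parse()) and also avoids deleting columns while the loop index is still counting over them.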
Related
I wish to merge tables in R only if the table with a given name exists. To do this, I made a character vector of table names that may or may not exist, then used a for loop with an if check to combine the tables. All the tables, if they exist, share a common "names" column. My code is as follows:
Designation.Attrition1<- data.frame(names)
x<- c("despivot2020new", "despivot2019new", "despivot2018new", "despivot2017new")
for( i in 1: length(x)){if (exists(x[i])){Designation.Attrition1<- merge(Designation.Attrition1, x[i] , by = "names")}}
However, I'm getting the error "Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column".
One possible reason for the error is that merge() treats the element of x as a string rather than as a variable name.
x[i] is still a string and not a data frame. Use get() to fetch the data by name before merging.
for( i in seq_along(x)) {
if (exists(x[i])) {
Designation.Attrition1 <- merge(Designation.Attrition1,get(x[i]),by = 'names')
}
}
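The same idea can be sketched without an explicit loop: keep only the names that exist, fetch them all with mget(), and fold merge() over the result with Reduce(). The two toy tables and their columns n2020 and n2018 are invented for illustration:

```r
# Hypothetical toy tables; only two of the four names exist
despivot2020new <- data.frame(names = c("a", "b"), n2020 = c(1, 2))
despivot2018new <- data.frame(names = c("a", "b"), n2018 = c(3, 4))

x <- c("despivot2020new", "despivot2019new", "despivot2018new", "despivot2017new")

env <- environment()                                      # where the tables live
present <- x[vapply(x, exists, logical(1), envir = env)]  # names that exist
tables <- mget(present, envir = env)                      # named list of data frames

# Merge the tables pairwise on the shared "names" column
merged <- Reduce(function(a, b) merge(a, b, by = "names"), tables)
merged
```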
I have a few problems concerning the same topic.
(1) I am trying to loop over:
premium1999 <- as.data.frame(coef(summary(data1999_mod))[c(19:44), 1])
for 10 years, in which I wrote:
for (year in seq(1999,2008)) {
paste0('premium',year) <- as.data.frame(coef(summary(paste0('data',year,'_mod')))[c(19:44), 1])
}
Note:
data1999_mod holds regression results, from which I want to extract some of the estimates as a one-column data frame.
The coef(summary(data1999_mod)) looks like this:
#A matrix: ... of type dbl
Estimate Std. Error t value Pr(>|t|)
age 0.0388573570 2.196772e-03 17.6883885 3.362887e-6
age_sqr -0.0003065876 2.790296e-05 -10.9876373 5.826926e-28
relation 0.0724525759 9.168118e-03 7.9026659 2.950318e-15
sex -0.1348453659 8.970138e-03 -15.0326966 1.201003e-50
marital 0.0782049161 8.928773e-03 8.7587533 2.217825e-18
reg 0.1691004469 1.132230e-02 14.9351735 5.082589e-50
...
However, it returns Error: $ operator is invalid for atomic vectors, even though I did not use the $ operator here.
(2) Also,
I want to create a column 'year' containing repeated values of the associated year and am trying to loop over this:
premium1999$year <- 1999
In which I wrote:
for (i in seq(1999,2008)) {
assign(paste0('premium',i)[['year']], i)
}
In this case, it returns Error in paste0("premium", i)[["year"]]: subscript out of bounds
(3) Moreover, I'd like to repeat some rows and loop over:
premium1999 <- rbind(premium1999, premium1999[rep(1, 2),])
for 10 years again and I wrote:
for (year in seq(1999,2008)) {
paste0('premium',year) <- rbind(paste0('premium',year), paste0('premium',year)[rep(1, 2),])
}
This time it returns Error in paste0("premium", year)[rep(1, 2), ]: incorrect number of dimensions
I also tried to loop over a few other similar things but I always get Error.
Each code works fine individually.
I could not find what I did wrong. Any help or suggestions would be very highly appreciated.
The problem with the code is that paste0() returns a character string; it does not return the object whose name matches that string. For example, paste0('data',year,'_mod') returns a character vector of length 1, i.e., "data1999_mod", and not the object data1999_mod.
To make this concrete, there is a huge difference between "data1999_mod"["Estimate"] and data1999_mod["Estimate"]. Subsetting the output of paste0() gives you the former, whereas the expected output comes only from the latter. That is why you are getting Error: $ operator is invalid for atomic vectors.
The same mistake appears in all of your snippets. In order to fetch an object by the name that paste0() produces, wrap the call in get(). Likewise, the left-hand side of <- cannot be a paste0() call; to create or overwrite an object by name, use assign().
As you have not supplied a reproducible sample, I couldn't test it. However, you can try running these.
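#(1)
for (year in seq(1999,2008)) {
  # get() fetches the model by name; assign() creates premium<year> by name
  mod <- get(paste0('data',year,'_mod'))
  assign(paste0('premium',year), as.data.frame(coef(summary(mod))[c(19:44), 1]))
}
#(2)
for (i in seq(1999,2008)) {
  # fetch the data frame, add the column, then write it back under the same name
  df <- get(paste0('premium',i))
  df[['year']] <- i
  assign(paste0('premium',i), df)
}
#(3)
for (year in seq(1999,2008)) {
  df <- get(paste0('premium',year))
  assign(paste0('premium',year), rbind(df, df[rep(1, 2),]))
}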
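A different technique from the get()/assign() approach above, offered only as a sketch: keep the yearly models in a named list, so no object names have to be built from strings. Everything here is a stand-in. The models are fitted on subsets of the built-in mtcars data, the years 1999 to 2001 are assigned arbitrarily, and the terms wt and hp replace rows 19:44 of the real coefficient table.

```r
# Toy stand-ins for data1999_mod, data2000_mod, ... (hypothetical models)
pieces <- split(mtcars, mtcars$cyl)     # three toy subsets
names(pieces) <- 1999:2001              # pretend each subset is one year
mods <- lapply(pieces, function(d) lm(mpg ~ wt + hp, data = d))

# One data frame per year: selected estimates plus a year column
premium <- lapply(names(mods), function(y) {
  est <- coef(summary(mods[[y]]))[c("wt", "hp"), "Estimate"]
  data.frame(term = names(est), estimate = unname(est), year = as.integer(y))
})
names(premium) <- names(mods)

# Stack all years into a single data frame if that is more convenient
all_years <- do.call(rbind, premium)
all_years
```

With this layout premium[["1999"]] plays the role of premium1999, and each further step (adding a column, repeating rows) becomes one more lapply() over the list.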
I'm writing code to solve a sudoku puzzle, following a YouTube video that implements the same algorithm in Python. The code requires three functions to:
Find an empty square.
Insert a number into the empty square.
Test whether this number is valid to solve the puzzle.
The solver uses a backtracking algorithm.
I am having an issue when calling the functions together, where I get the error:
Error in free_squ(x) : argument "x" is missing, with no default
In addition: Warning message:
In if (empty_sq == FALSE) { :
the condition has length > 1 and only the first element will be used
Called from: free_squ(x)
This is confusing as I only get it when running this code. I can write other functions that call the individual functions and pass along the argument given to the outer function:
function1(argument){
function2(argument){
function3(argument){
***DO STUFF***}}}
Why, in the following code, does the function called inside the main function not recognise the argument?
sudoku_solve <- function(x){
empty_sq <- free_squ(x) # Define a new object to give coordinates of empty square
if(empty_sq == FALSE){ # If no empty square can be found
return(x) # Return the matrix
} else{
empty_sq <- empty_sq # Pointless line kept for clarity
}
for(i in c(1:9)){ # Integers to insert into the found empty square
if(valid(x, i, empty_sq) == TRUE){ # can the intiger be placed in this square?
x[empty_sq[1], empty_sq[2]] = i # if i valid, insert into empty square
}
if(sudoku_solve()){ # are all i's valid?
return(TRUE) # All i's valid
} else{
x[empty_sq[1], empty_sq[2]] = 0 # reset the initial try and try again with another
}
}
return(FALSE)
}
I have named the sudoku puzzle 'puzzle', and call the function by the following:
sudoku_solve(puzzle)
I think in the following statement, you are not passing any value to the function and x does not have a default value either.
if(sudoku_solve()){ # are all i's valid?
return(TRUE) # All i's valid
}
Hence, although the argument is passed in the initial call, the recursive call inside the loop is made without one. Inside that call, x is missing when it reaches free_squ(x), which gives the error.
empty_sq <- free_squ(x)
Make sure you are passing a value to sudoku_solve(), or else set a default value for x either in sudoku_solve() or in free_squ().
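A minimal sketch of the solver with the recursive call fixed. The helpers free_squ() and valid() are not shown in the question, so the versions below are guesses at what they do. Because R passes matrices by value, this sketch returns the filled grid (or NULL on a dead end) instead of TRUE/FALSE, which is a change from the Python original. Returning NULL from free_squ() also avoids comparing a two-element coordinate vector with FALSE, which is what triggers the "condition has length > 1" warning.

```r
# Hypothetical helper: (row, col) of the first empty (0) cell, or NULL if none
free_squ <- function(x) {
  idx <- which(x == 0, arr.ind = TRUE)
  if (nrow(idx) == 0) return(NULL)
  idx[1, ]
}

# Hypothetical helper: can value i be placed at pos = (row, col)?
valid <- function(x, i, pos) {
  r <- pos[1]; cl <- pos[2]
  if (any(x[r, ] == i) || any(x[, cl] == i)) return(FALSE)
  br <- 3 * ((r - 1) %/% 3) + 1:3        # rows of the enclosing 3x3 box
  bc <- 3 * ((cl - 1) %/% 3) + 1:3       # columns of the enclosing 3x3 box
  !any(x[br, bc] == i)
}

sudoku_solve <- function(x) {
  empty_sq <- free_squ(x)
  if (is.null(empty_sq)) return(x)       # no empty square left: solved
  for (i in 1:9) {
    if (valid(x, i, empty_sq)) {
      x[empty_sq[1], empty_sq[2]] <- i
      result <- sudoku_solve(x)          # pass the updated grid along
      if (!is.null(result)) return(result)
      x[empty_sq[1], empty_sq[2]] <- 0   # undo and try the next value
    }
  }
  NULL                                   # nothing fits here: backtrack
}

puzzle <- matrix(c(
  5, 3, 0, 0, 7, 0, 0, 0, 0,
  6, 0, 0, 1, 9, 5, 0, 0, 0,
  0, 9, 8, 0, 0, 0, 0, 6, 0,
  8, 0, 0, 0, 6, 0, 0, 0, 3,
  4, 0, 0, 8, 0, 3, 0, 0, 1,
  7, 0, 0, 0, 2, 0, 0, 0, 6,
  0, 6, 0, 0, 0, 0, 2, 8, 0,
  0, 0, 0, 4, 1, 9, 0, 0, 5,
  0, 0, 0, 0, 8, 0, 0, 7, 9
), nrow = 9, byrow = TRUE)

solved <- sudoku_solve(puzzle)
```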
I'm trying to use as.name(x) to refer to a list to input into a function. Here's a simplified version of my stats function, followed by the for loop I'm using to output all the data at once.
get<-function(data,x) {
for (i in x) {
lm(as.formula(paste(i,'~',variable)),data)
}
}
lists<-c("a","b","c")
# where each of a, b, and c are lists that refer to column names of my data
for (j in lists) {
get(data,as.name(j))
}
I keep getting the following error:
Error in for (i in x) { : invalid for() loop sequence
If I just do get(data,a) each time it works but not when I try and do a loop.
Is each of a, b and c a list that contains only one value? I ask because your lm() formula has i on the left-hand side, and that can only be a vector.
If that's the case, then replacing as.name(j) with j should make your code work.
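For context, as.name(j) produces a symbol, and for() cannot iterate over a symbol, which is where "invalid for() loop sequence" comes from. Note also that naming the function get masks base R's get(). A hedged sketch of an alternative layout, keeping the column-name vectors in a named list. The function name run_models, the predictor wt and the mtcars columns are all assumptions for illustration:

```r
# Hypothetical replacement for the user-defined get(): fit one model per response
run_models <- function(data, responses, predictor) {
  lapply(responses, function(resp) {
    lm(reformulate(predictor, response = resp), data = data)
  })
}

# Each element is a vector of column names (toy stand-ins for a, b and c)
groups <- list(a = c("mpg", "qsec"), b = "disp", c = "drat")

# One list of fitted models per group; results are kept rather than discarded
fits <- lapply(groups, run_models, data = mtcars, predictor = "wt")
formula(fits$a[[1]])
```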
I'm currently writing a utility to run a series of tests on a set of data. I have the data in a data.frame and would like to run N tests on each row of data. (Apologies if my terminology isn't all there: I've been using R for all of five hours).
In my utility, I would like to split the tests into different files and in the main program, load all those tests and run them once for each data.frame row. Here's what I'm doing to source the relevant files:
file.sources = list.files(pattern="validator-.*.R$")
sapply(file.sources,source,verbose = TRUE)
This works well, and if I do this in each matched file:
b <- function(a) {
if(grep("^[[:blank:]]*$", a)) {
return(FALSE)
} else {
return(TRUE)
}
}
test.functions <- append(test.functions, b)
Then I end up with a test.functions list which accurately contains all the test functions to run, but this is where I get stuck. I've tried variations of sapply() and I think do.call() is also relevant here. This is my current attempt:
process.entry <- function(a) {
lapply(test.functions,do.call,a)
}
sapply(all.data,process.entry)
My attempt here was to create a function which takes one row of data as its argument, iterates over test.functions and calls do.call() with the function and row of data as arguments. This doesn't quite seem to work, and the error thrown is:
Error in FUN(X[[i]], ...) : second argument must be a list
However, I'm not entirely sure where this error occurs, and quite possibly: there are other, cleaner, ways of doing what I intend!
# I would write it like this:
process.entry <- function(a) {
  # call each test function on a;
  # I think an anonymous function is easier here
  lapply(test.functions, function(f) f(a))
}
# sapply iterates over the columns of a data.frame by default;
# to iterate over rows, use a for loop or apply
apply(all.data, 1, process.entry)
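The original error arises because lapply(test.functions, do.call, a) ends up calling do.call(f, a), and do.call() requires its second argument to be a list. A small runnable sketch of the approach above, with made-up test functions and data. It uses vapply() instead of lapply() so each row yields a logical vector, and grepl() because grep() returns integer(0) when nothing matches, which if() cannot handle:

```r
# Hypothetical test functions; each takes one row and returns TRUE or FALSE
test.functions <- list(
  not_blank = function(a) !any(grepl("^[[:blank:]]*$", a)),
  short     = function(a) all(nchar(a) <= 10)
)

# Toy data: the second row contains a blank field
all.data <- data.frame(id = c("a1", "b2"), note = c("ok", " "),
                       stringsAsFactors = FALSE)

# Run every test on one row
process.entry <- function(a) {
  vapply(test.functions, function(f) f(a), logical(1))
}

# apply() passes each row as a character vector; t() puts rows back as rows
results <- t(apply(all.data, 1, process.entry))
results
```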