I'm quite new to R, and I'm trying to write some code with input checking built in.
For example, say I have a function that takes (n, c, l),
where the arguments are supposed to be of numeric, character, and logical type.
Is there a way to check these things?
For example, I've tried is.integer(3), but this returns FALSE.
Ideally, suppose abc() is some function that tests whether l==T or l==F (that is, whether a proper logical was passed in).
Then abc(T) gives TRUE, and abc(2) gives FALSE.
Also, is there a way to check whether n is specifically an integer? I could check n%%1==0, but is there a specific function for this?
Thank you kindly in advance for what may seem to be a very basic question.
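As suggested in the comments, you can use the is.* family of functions:
is.numeric(3)        # check whether numeric
is.integer(3L)       # check whether integer (the L suffix makes an integer literal)
is.logical(TRUE)     # check whether logical
is.logical(2)        # returns FALSE
is.character("abc")  # check whether character
is.character(4)      # returns FALSE
Note that is.integer(3) returns FALSE because a plain 3 is stored as a double: is.integer() tests the storage type, not whether the value is a whole number. You can check for other data types in R in the same way. Hope this is of help.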
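To tie the checks together, here is a minimal sketch of what the abc()-style helpers from the question could look like. The names is_whole_number, is_flag and check_args are hypothetical, not base R functions, and the sketch assumes each argument is meant to be a single value.

```r
# Hypothetical helper: TRUE if x is a single finite number with no
# fractional part (3 and 3L both pass, 3.5 does not).
is_whole_number <- function(x) {
  is.numeric(x) && length(x) == 1 && is.finite(x) && x %% 1 == 0
}

# Hypothetical helper: TRUE only for a single TRUE or FALSE.
is_flag <- function(x) {
  is.logical(x) && length(x) == 1 && !is.na(x)
}

# Hypothetical example function taking (n, c, l) as in the question.
# stopifnot() raises an error naming the first check that fails.
check_args <- function(n, c, l) {
  stopifnot(is_whole_number(n), is.character(c), is_flag(l))
  invisible(TRUE)
}

is_flag(TRUE)          # a proper logical
is_flag(2)             # not a logical
is_whole_number(3)     # whole number stored as double
is_whole_number(3.5)   # has a fractional part
```

Calling check_args(3, "a", TRUE) passes silently, while check_args(3.5, "a", TRUE) stops with an error, which is usually what you want at the top of a function.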
I'm trying to declare an array in R, something logically equivalent to the following Java code:
Object[][] array = new Object[6][32]
After I declare this array, I plan to loop over the indices and assign values to them.
I don't know exactly what you are planning to do in R, but explicit loops are often not the idiomatic choice. I would say this is especially true when you don't know the length of the output in advance.
You might want to look for a "vectorized" solution first; if there isn't one, something in the apply family might also be helpful.
Disclaimer: I am certain there is more nuance to this discussion based on what I have read, so I don't want to claim to be an expert on this subject.
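For reference, one rough equivalent of the Java declaration is a list with dimensions, since a list can hold arbitrary objects in each cell. This is only a sketch: the variable name arr and the values stored are made up for illustration.

```r
# A 6 x 32 "matrix of objects": a list of 192 empty (NULL) slots
# with a dim attribute, so cells can hold values of any type.
arr <- matrix(vector("list", 6 * 32), nrow = 6, ncol = 32)

# Looping over the indices and assigning, as described in the question.
# [[i, j]] addresses a single cell of the list-matrix.
for (i in seq_len(nrow(arr))) {
  for (j in seq_len(ncol(arr))) {
    arr[[i, j]] <- i * j
  }
}

dim(arr)      # 6 32
arr[[2, 3]]   # value stored in row 2, column 3

# If every cell is numeric, a plain numeric matrix is simpler, and
# the same table can be built without a loop:
num <- outer(1:6, 1:32)
```

If all cells share one type, the plain matrix is usually preferable, because vectorized operations then work on it directly.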
Hi, I was looking to see if anyone could help me.
So I am new to using R and I'm working through a workbook provided by my university, trying out the functions and changing various numbers to see how that changes the output.
Generally, I have understood most of it, but I don't quite understand some of the following code.
Could anyone explain to me what
vec.mean <- numeric(N)
means? I'm not entirely sure what it is doing.
Thanks in advance for the help.
If you read the help file with help(numeric):
Description -
Creates or coerces objects of type "numeric".
Arguments -
length
A non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.
The first argument of numeric() is length =. Therefore, numeric(500) creates a numeric vector of length 500.
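This is considered good practice: when a for loop later fills in the vector element by element, writing into a vector that already has its full length avoids growing it on every iteration, which improves performance (subtly for small N, more noticeably as N grows).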
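A small sketch of how such a preallocated vector is typically used. The loop body (the mean of ten random draws) is an assumption for illustration, since the workbook's actual loop is not shown in the question.

```r
N <- 5

# numeric(N) creates a vector of N zeros to be filled in later.
vec.mean <- numeric(N)
print(vec.mean)

set.seed(1)  # only so the example is reproducible
for (i in seq_len(N)) {
  # Overwrite slot i; the vector keeps its length throughout.
  vec.mean[i] <- mean(rnorm(10))
}

length(vec.mean)  # still N
```

The alternative, starting from an empty vector and appending with c() inside the loop, gives the same result but reallocates the vector on each pass.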
I am trying to use the update function on a survey.design object. For instance, I want to create a variable that is the mean of several other variables, as follows:
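library(survey)  # provides svydesign() and the update() method for designs
x1 <- runif(3)
x2 <- runif(3)
x3 <- runif(3)
population <- 10000
testdf <- data.frame(x1, x2, x3, population)
testsvy <- svydesign(id = ~1, weights = c(30, 30, 30), data = testdf)
# mean(c(x1, x2, x3)) is a single grand mean, so every row gets the same value
testsvy <- update(testsvy, avg = mean(c(x1, x2, x3)))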
However, this assigns the same number to every person, so something must be wrong. Alternatively, I could modify testsvy$variables directly, but I don't feel that this is the easiest way...
OK, I found the answer myself... I wish it could be simpler, since I have to type the object name three times...
testsvy<-update(testsvy,avg2=rowMeans(testsvy$variables[,c("x1","x2","x3")],na.rm=TRUE))
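The difference between the two calls can be seen on a plain data frame, without the survey package. This sketch uses made-up values instead of runif() so the numbers are easy to follow.

```r
# Made-up data standing in for testdf.
testdf <- data.frame(x1 = c(1, 2, 3),
                     x2 = c(4, 5, 6),
                     x3 = c(7, 8, 9))

# c(x1, x2, x3) pools all nine values, so mean() returns one number,
# which is then recycled to every row when stored as a column.
grand_mean <- with(testdf, mean(c(x1, x2, x3)))

# rowMeans() averages across columns within each row.
per_row <- rowMeans(testdf[, c("x1", "x2", "x3")], na.rm = TRUE)

# Plain vectorized arithmetic gives the same per-row result.
per_row_arith <- with(testdf, (x1 + x2 + x3) / 3)

print(grand_mean)
print(per_row)
```

Since update() on a survey design evaluates its expressions using the design's variables, something like update(testsvy, avg = (x1 + x2 + x3) / 3) should presumably avoid repeating the object name. This is untested here and, unlike rowMeans(..., na.rm = TRUE), it yields NA for any row with a missing value.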
I'm working with some big data.frames in R and want to know which of these two options is faster:
df[which(condition), ] = value
or
df[condition, ] = value
Assume that most of the data doesn't fulfill the condition, so length(which(condition)) is much smaller than the length of the boolean vector.
Is it more efficient to ask for specific indices than to go through the whole data.frame/vector and select each row/element where the boolean vector is TRUE?
Or does calling another function (which) just add overhead?
I assumed someone had already asked this, but I could not find an answer. This seems relevant, but the discussions I saw there only cover the case where you need the boolean vector/indices again.
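One way to settle this for your own data is to time both forms directly. The sketch below uses made-up data (the column names, the size n and the 1% condition are all assumptions) and makes no claim about which form wins, since that can depend on the R version and the data.

```r
set.seed(42)
n <- 1e5
df <- data.frame(a = runif(n), b = runif(n))
condition <- df$a > 0.99  # TRUE for roughly 1% of the rows

df1 <- df
df2 <- df

# system.time() evaluates the assignment in the calling environment,
# so df1 and df2 are actually modified here.
t_which   <- system.time(df1[which(condition), ] <- 0)
t_logical <- system.time(df2[condition, ] <- 0)

print(t_which)
print(t_logical)

# Both forms should produce the same data frame.
identical(df1, df2)

# One behavioural difference: which() silently drops NA entries,
# whereas a logical index keeps them.
which(c(TRUE, NA, FALSE))
```

For more stable numbers, repeat each timing several times or use a dedicated benchmarking package. If the condition can contain NA, the two forms are not interchangeable, so the choice may be about correctness rather than speed.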
I am putting together an R function that takes some undefined input through the ... argument described in the docs as:
"..." the special variable length argument ***
The idea is that the user will enter a number of column names here, each belonging to a dataset also specified by the user. These columns will then be cross-tabulated against the dependent variable by tapply. The function is to return a table (independent variable x independent variable).
Thus, I tried:
plotter <- function(dataset, dependent_variable, ...)
{
  indi_variables <- list(...)  # making a list of the ... input as described in the docs
  result <- with(dataset, tapply(dependent_variable, indi_variables, mean))  # this fails
  result
}
I figured this should work as tapply can take a list as input.
But it does not in this case ('Error in tapply...arguments must have same length') and I think it is because indi_variables is a list of strings.
If I input the contents of the list by hand and leave out the quotation marks, everything works just fine.
However, if the user passes the column names as non-strings, R will interpret them as variable names, and I cannot figure out how to transform the list indi_variables in the right way. I unsuccessfully tried things like this:
indi_variables=lapply(indi_variables, as.factor)
So I am wondering:
1. What causes the error described above? Is my interpretation correct?
2. How would one go about transforming the list created through ... in the right way?
3. Is there an overall better way of doing this, in the input or the implementation of tapply?
Any help is much appreciated!
Thanks to Joran's helpful reading, I have come up with these improvements that make things work out...
indi_variables=substitute(list(...));
result=with (dataset, tapply(dependent_variable, eval(indi_variables, dataset), FUN=mean));
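On the first question: the error is consistent with the interpretation given. When the column names are passed as strings, each element of list(...) is a single string of length 1, while the dependent variable has one value per row, so tapply complains that the arguments do not have the same length. As an alternative to substitute()/eval(), here is a minimal sketch that accepts the column names as strings and looks them up in the dataset. The function name cross_mean and the toy data are hypothetical, not from the original post.

```r
# Hypothetical alternative: all column names are passed as strings.
cross_mean <- function(dataset, dependent, ...) {
  group_cols <- c(...)  # character vector of grouping column names
  stopifnot(is.character(dependent), length(dependent) == 1,
            is.character(group_cols),
            all(c(dependent, group_cols) %in% names(dataset)))
  # dataset[group_cols] is a data frame, i.e. a list of columns, each
  # as long as the dependent variable, which is what tapply expects.
  tapply(dataset[[dependent]], dataset[group_cols], mean)
}

# Toy data for illustration only.
toy <- data.frame(y = c(1, 2, 3, 4),
                  a = c("u", "u", "v", "v"),
                  b = c("p", "q", "p", "q"))

res <- cross_mean(toy, "y", "a", "b")
print(res)
```

Taking strings avoids non-standard evaluation entirely, which tends to make the function easier to call from other code. The substitute() version above is more convenient for interactive use, where typing bare column names is natural.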