R in BERT won't use na.rm=TRUE in sum function - r

I installed BERT (an R-language-to-Excel interface). In the functions.R file that is included, I modified the included Add function to use the na.rm argument, as follows.
Add <- function( ... ) {
sum(..., na.rm=TRUE);
}
However, it appears that the na.rm argument is ignored. That is, the Add() function works fine in Excel if all values in the range are present.
[that is =R.Add(A1:A5) in Excel works fine if all of cells A1:A5 contain values]
But if I delete any value in the range (so the Excel cell is blank), I get #NULL! returned.
Is it possible to use the na.rm argument with BERT, so that R functions that accept na.rm take it into account, and blank cells within the Excel range are skipped (computing on the remaining values) instead of returning #NULL!?

This is a little complicated because the behavior is different if you are passing in one argument (a range) or multiple arguments (individual cells). But in the case of a single argument, if you pass in a range that has empty cells, this will be passed as a list. In that case, you will need to call unlist, e.g.
Add <- function( ... ) {
sum(unlist(...), na.rm=TRUE);
}
Excel can have ranges that include different types (e.g. strings and numbers), but R can't. So when BERT passes data from Excel to R that has mixed types, it uses a list of lists.
There is a hint about this in the console when it runs, which is a good place to start -- it says that the argument is an invalid type (list).
As I said it's complicated because the three-dots argument could refer to multiple arguments, each of which could be a list or a single value (a scalar), each of which could be different types. In that case you'd need to use one of the apply functions to unlist different arguments. But try the above first.
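For the multi-argument case described above, here is a minimal sketch. It assumes that blank cells arrive as NA (or NULL) inside a list, which is not confirmed here for every BERT version, and it uses a recursive unlist over list(...) rather than an apply function, which should flatten scalars, vectors, and nested lists alike:

```r
# Sketch: flatten every argument, whether it is a scalar, a vector, or a (nested) list.
Add <- function(...) {
  # list(...) captures all arguments; unlist() flattens nested lists recursively
  values <- unlist(list(...), use.names = FALSE)
  # Non-numeric entries (e.g. strings) become NA here and are then dropped by na.rm
  values <- suppressWarnings(as.numeric(values))
  sum(values, na.rm = TRUE)
}

Add(list(1, NA, 3))              # one range with a blank cell
Add(list(1, 2), 3, list(NA, 4))  # a mix of ranges and single cells
```

Silently dropping strings is a design choice of this sketch; if text in the range should be treated as an error, remove the suppressWarnings() call and check for NA values instead.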

Related

How to write formulas for counting different values in data frame

The assignment is as follows.
Q2. Write a function with one argument, say, data.
The function does following,
If the argument data is a character vector, count the total number of characters;
If the argument data is numeric vector, calculate the mean, sd, min, and max values;
The function should return the value using list, containing also data.
I am very new to this and would like to use basic R code to solve it. I don't really understand the syntax I ought to use.
You may want to upload some data examples. I am assuming your 'data' is pure character or numbers. See code below. But detailed information will be needed if it does not work for you.
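myfunc <- function(data) {
  if (is.character(data)) {
    # character vector: total number of characters across all elements
    res <- list(data = data, Nchar = sum(nchar(data)))
  } else if (is.numeric(data)) {
    # numeric vector: basic summary statistics
    res <- list(data = data, mean = mean(data), sd = sd(data), max = max(data), min = min(data))
  } else {
    # any other type would otherwise fail with "object 'res' not found"
    stop("data must be a character or numeric vector")
  }
  return(res)
}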
#usage
data1=c("a","bbb")
myfunc(data1)
data2=c(1,2,3)
myfunc(data2)

not error, but not results either in R

I am trying to write a function in R that calculates the mean of nitrate, sulfate and ID. My original data frame has 4 columns (date, nitrate, sulfate, ID), so I wrote the following code:
prueba<-read.csv("C:/Users/User/Desktop/coursera/001.csv",header=T)
columnmean<-function(y, removeNA=TRUE){ #y will be a matrix
whichnumeric<-sapply(y, is.numeric)#which columns are numeric
onlynumeric<-y[ , whichnumeric] #selecting just the numeric columns
nc<-ncol(onlynumeric) #lenght of onlynumeric
means<-numeric(nc)#empty vector for the means
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
}
columnmean(prueba)
When I run the steps line by line on my data, outside the function, I get the mean values. However, when I call the function so that it performs all the steps itself, it raises no error but also returns no value; my environment contains only the data frame 'prueba' and the columnmean function.
What am I doing wrong?
A reproducible example would be nice (although not absolutely necessary in this case).
You need a final line return(means) at the end of your function. (Some old-school R users maintain that means alone is OK - R automatically returns the value of the last expression evaluated within the function whether return() is specified or not - but I feel that using return() explicitly is better practice.)
colMeans(y[sapply(y, is.numeric)], na.rm=TRUE)
is a slightly more compact way to achieve your goal (although there's nothing wrong with being a little more verbose if it makes your code easier for you to read and understand).
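As a sketch of how the compact version could be wrapped in the asker's function, using a small made-up data frame (the values below are invented for illustration and are not from 001.csv):

```r
# Column means of the numeric columns only; removeNA controls NA handling.
columnmean <- function(y, removeNA = TRUE) {
  # y[logical] keeps a data frame even when only one column is selected,
  # unlike y[, logical], which would drop to a plain vector
  numeric_cols <- y[sapply(y, is.numeric)]
  colMeans(numeric_cols, na.rm = removeNA)
}

# Hypothetical example data with the same column layout as in the question
df <- data.frame(
  Date    = c("2003-01-01", "2003-01-02", "2003-01-03"),
  nitrate = c(1, NA, 3),
  sulfate = c(2, 4, NA),
  ID      = c(1L, 1L, 1L)
)
means <- columnmean(df)
print(means)
```

Because colMeans() is the last expression evaluated, its value is what the function returns.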
The result of an R function is the value of the last expression. Your last expression is:
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
It may seem strange that the value of that expression is NULL, but that's the way it is with for-loops in R. The means vector does get changed sequentially, which means that BenBolker's advice to use return(.) is correct (as his advice almost always is). For-loops in R are a notable exception to the functional programming paradigm. They provide a mechanism for looping (as do the various *apply functions), but the commands inside the loop exert their effects in the calling environment via side effects (unlike the apply functions).
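A small sketch of this behavior, using two hypothetical functions that differ only in their last expression:

```r
# The last expression is the for-loop, whose value is NULL
squares_no_return <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i]^2
  }
}

# Same loop, but the filled vector is the last expression
squares_with_return <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i]^2
  }
  return(out)
}

r1 <- squares_no_return(1:3)
r2 <- squares_with_return(1:3)
print(is.null(r1))  # the loop's value is NULL, so nothing useful comes back
print(r2)
```

The NULL from a for-loop is returned invisibly, which is why calling the first function at the console prints nothing at all.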

R - Please explain this code and how to make a function that outputs like it?

I am new to R and mostly working with old code written by someone else. And I am trying to create my own R functions.
I found some of the following code used for eigenvalue decomposition.
eigenMatrix = eigen(myMatrix)[[2]]
eigenVals = eigen(myMatrix)[[1]]
Here a single function outputs 2 different data structures, a vector and a matrix, depending on the value in the brackets.
When I search for functions with multiple outputs, they usually use lists to output multiple variables at once, which does not seem to work here, possibly because of the different types.
I don't understand why there are two sets of brackets and how the underlying function would work.
The posted code takes the eigen function, which returns a list with 2 values.
Then the [[]] are used to extract the first and second items from the list.
The [[]] is needed to return the underlying structure, and is better explained here: How to Correctly Use Lists in R?
Also, since the eigen function is run twice the code in the question is inefficient.
resultList = eigen(myMatrix)
eigenMatrix = resultList[[2]]
eigenVals = resultList[[1]]
This code is better because eigen is run only once: the result of the function is saved as a list, and the values are then read from that list.
The function itself can be coded like any function with multiple outputs, such as here: https://stat.ethz.ch/pipermail/r-help/2007-March/126851.html or here: How to assign from a function with multiple outputs?
The list values can hold any structure and [[]] can be used to return the underlying structure of each value.
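To illustrate, here is a minimal sketch of a function that returns a vector and a matrix together in one list. The function name and its list elements are made up for this example and are not how eigen itself is implemented:

```r
# Hypothetical function returning two outputs of different types in one list
summarize_matrix <- function(m) {
  list(
    row_sums   = rowSums(m),  # a numeric vector
    transposed = t(m)         # a matrix
  )
}

res <- summarize_matrix(matrix(1:4, nrow = 2))

sums   <- res[["row_sums"]]  # [[ ]] extracts the underlying vector
tm     <- res[[2]]           # extraction by position works too
sub    <- res["row_sums"]    # single [ ] keeps the list wrapper (a list of length 1)

print(sums)
print(tm)
print(class(sub))
```

A list can hold elements of different types and shapes, so mixing a vector and a matrix in the return value is not a problem.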

Naming columns of coefficient matrix in a VAR

I am searching for a fast and simple way to give comprehensible names to the columns of a VAR-coefficient matrix.
What I would like to use is the function VAR.names, which is used in the function VAR.est() in the VAR.etp-package. When I use the function VAR.est(), this works perfectly, but as soon as I modify VAR.est (by adding another element to the list of values which are returned), I receive an error message stating "could not find function VAR.names".
I could not find any information on the function VAR.names.
Example:
library(VAR.etp)
data(dat)
M=VAR.est(dat,p=2,type="const")
M$coef
Another possibility would be to use a loop as in the function VAR() from the vars package, but if VAR.names would actually work, this would be a lot more elegant!

My data is stored as a matrix and as a list at the same time?

I am using the tabular() function to produce tables in r (tables library).
I want to compute CI's from the data in the output (let mytable be the output from tabular()). Simple enough I thought, except when I go to call a value from the matrix, I get the error Error in mytable[1, i] - 1 : non-numeric argument to binary operator. I thought this was odd, because when I call up a particular cell of the matrix (is.matrix returned TRUE for mytable), for example mytable[1, i] for some i, I get an integer. I then ran is.list on mytable and got TRUE as well, so I am not sure what this means. I guess the tabular() function stores the results as a special kind of matrix.
I am only trying to pull out the mean,sdev, and n, which I am able to just by typing the cell location, for example mytable[1, i] would return an 86. However, when I try to call up the value in qt(.975,df=(mytable[1,i]-1)) for example, I get the error above. Not sure really how to approach this except to manually enter the values into another matrix (which I would like to avoid). Or, if I can compute CI's directly in the tabular() function that would work also. Cheers.
I shall quote for you the Value section of the documentation on the function ?tabular:
An object of S3 class "tabular". This is a matrix of mode list, whose
entries are computed summary values, with the following attributes:
rowLabels - A matrix of labels for the rows. This will have the same
number of rows as the main matrix, but may have multiple columns for
different nested levels of labels. If a label covers multiple rows, it
is entered in the first row, and NA is used to fill following rows.
colLabels - Like rowLabels, but labelling the columns.
table - The original table expression being displayed. A list of the
original format specifications are attached as a "fmtlist" attribute.
formats - A matrix of the same shape as the main result, containing NA
for default formatting, or an index into the format list.
As the documentation says, each element of the matrix is a list. If your tabular object is called tab, type tab[1,1] and you should see a list containing one of your table values. If I wanted to modify that value, I would probably do something like:
tab[1,1]$term <- value
just like you would modify values in any other list.
Type attributes(tab) and you'll see the items listed above, containing a lot of the formatting information and row/col headers.
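The error from the question can be reproduced with a plain base R matrix of mode list. This is only an analogy built without the tables package (an actual tabular object carries extra attributes and its own methods), and the cell values are invented:

```r
# A base R matrix whose cells are list elements (mode "list")
m <- matrix(list(86, 12.5, 3.1, 40), nrow = 2)

print(is.matrix(m))  # TRUE
print(is.list(m))    # TRUE: it is a matrix and a list at the same time

cell  <- m[1, 1]     # single bracket: a list of length 1, not a number
value <- m[[1, 1]]   # double bracket: the underlying number

# Arithmetic on the list fails with "non-numeric argument to binary operator"
attempt <- try(cell - 1, silent = TRUE)
print(inherits(attempt, "try-error"))

# Arithmetic on the extracted value works
crit <- qt(0.975, df = value - 1)
print(crit)
```

If double-bracket indexing does not behave this way on a given tabular object, converting the needed cells to a numeric vector first (for example with unlist) is another option to try.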

Resources