Extend exists to nested dictionary/list - r

How can I extend the exists function to check whether an element of a nested dictionary (list) exists?
For example: if(exists("mylists[[index]]['TSI']")), where mylists is a dictionary-style lookup whose entries should themselves contain a nested dictionary.
Now mylists will look like:
[[index]]["TSI"]=c(0="a",1="b")
How should I check that this exists, so that I can append to it and get:
[[index]]["TSI"]=c(0="a",1="b",2="c")
Here is more code that illustrates the problem better (index is an ID):
if(!is.null(listsar[[index]]["TSI"])) {
  print("extending existing")
  listsar[[index]][["TSI"]] <- c(listsar[[index]][["TSI"]], risktype=myTSI)
} else {
  print("creating new")
  listsar[[index]][["TSI"]] <- c(risktype=myTSI)
}
However, this does not seem to work: I always get "extending existing" and never "creating new". If I change the test to:
if(!is.null(listsar[[index]][["TSI"]]))
I get the other message instead:
"creating new"

You can test for NULL in most cases. First, some sample data (which is something you should have given us along with working code; what is c(0="a",1="b",2="c") supposed to be?):
> mylists=list()
> mylists[["foo"]]=list()
> mylists[["foo"]][["TSI"]]=c(a=0,b=1)
Does a "foo" exist at the top level?
> !is.null(mylists[["foo"]])
[1] TRUE
Yes.
Does a "fnord" exist at the top level?
> !is.null(mylists[["fnord"]])
[1] FALSE
No.
Does a "TSI" exist within "foo"?
> !is.null(mylists[["foo"]][["TSI"]])
[1] TRUE
Yes.
Does a "FNORD" exist within "foo"?
> !is.null(mylists[["foo"]][["FNORD"]])
[1] FALSE
No.
Does a "FNORD" exist within a top-level (and nonexistent) "fnord":
> !is.null(mylists[["fnord"]][["FNORD"]])
[1] FALSE
No.
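Applied to the code in the question, a minimal sketch (the helper name append_nested and the sample values are made up for illustration): single brackets return a sub-list, which is never NULL once the parent list exists, whereas double brackets return the element itself. Because c(NULL, x) is just x, the "create" and "extend" branches can also be merged into a single assignment.

```r
# Hypothetical helper, not from the original post: append `value` to
# lst[[id]][[key]], creating the nested entries when they do not exist yet.
append_nested <- function(lst, id, key, value) {
  if (is.null(lst[[id]])) lst[[id]] <- list()      # create the inner list on first use
  lst[[id]][[key]] <- c(lst[[id]][[key]], value)   # c(NULL, value) is just value
  lst
}

listsar <- list(id1 = list())
single <- is.null(listsar[["id1"]]["TSI"])     # FALSE: `[` returns a length-1 list
double <- is.null(listsar[["id1"]][["TSI"]])   # TRUE: `[[` returns the missing element

listsar <- append_nested(listsar, "id1", "TSI", c(risktype = "a"))
listsar <- append_nested(listsar, "id1", "TSI", c(risktype = "b"))
listsar <- append_nested(listsar, "id2", "TSI", c(risktype = "c"))
print(listsar)
```

This is why the single-bracket test in the question always reports "extending existing" once listsar[[index]] is a list.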

Related

make file.exists() case insensitive

I have a line of code in my script that checks if a file exists (actually, many files, this one line gets looped for a bunch of different files):
file.exists(Sys.glob(file.path(getwd(), "files", "*name*")))
This looks for any file in the directory /files/ that has "name" in it, e.g. "filename.csv". However, some of my files are named "fileName.csv" or "thisfileNAME.csv", and they do not get recognized. How can I make file.exists perform this check in a case-insensitive way?
In my other code I usually make any imported names or lists lowercase immediately with the tolower function, but I don't see any option to include that in the file.exists function.
Suggested solution using list.files:
If we have many files we might want to do this only once; otherwise we can put it in the function (and pass path_to_root_directory instead of found_files to the function):
found_files <- list.files(path_to_root_directory, recursive=FALSE)
Behaviour as file.exists (return value is boolean):
fileExIsTs <- function(file_path, found_files) {
return(tolower(file_path) %in% tolower(found_files))
}
Return value is file with spelling as found in directory or character(0) if no match:
fileExIsTs <- function(file_path, found_files) {
return(found_files[tolower(found_files) %in% tolower(file_path)])
}
Edit:
New solution to fit new requirements:
keywordExists <- function(keyword, found_files) {
return(any(grepl(keyword, found_files, ignore.case=TRUE)))
}
keywordExists("NaMe", found_files=c("filename.csv", "morefilenames.csv"))
Returns:
[1] TRUE
Or
The return value is the matching files, spelled as found in the directory, or character(0) if there is no match:
keywordExists2 <- function(keyword, found_files) {
return(found_files[grepl(keyword, found_files, ignore.case=TRUE)])
}
keywordExists2("NaMe", found_files=c("filename.csv", "morefilenames.csv"))
Returns:
[1] "filename.csv" "morefilenames.csv"
The following should return 1 if any filename matches, regardless of case, and 0 if none does.
max(grepl("name", list.files(), ignore.case=TRUE))
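Another option is to let list.files() do the case-insensitive matching itself through its pattern and ignore.case arguments. The following is only a sketch: the helper name file_exists_ci and the demo directory are made up for illustration, and the keyword is interpreted as a regular expression rather than a glob.

```r
# Hypothetical helper: TRUE if any file name in `dir` contains `keyword`,
# ignoring case. list.files() treats `keyword` as a regular expression.
file_exists_ci <- function(dir, keyword) {
  matches <- list.files(dir, pattern = keyword, ignore.case = TRUE)
  length(matches) > 0
}

# Demonstration in a throwaway directory (assumed layout, for illustration only)
demo_dir <- file.path(tempdir(), "files_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("fileName.csv", "thisfileNAME.csv", "other.txt")))

found_name  <- file_exists_ci(demo_dir, "name")    # matches fileName.csv and thisfileNAME.csv
found_fnord <- file_exists_ci(demo_dir, "fnord")   # no file contains "fnord"
print(c(found_name, found_fnord))
```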

R: Searching for a certain, delimited string

I'm looking for a way in R to search for a certain, delimited string.
In my example I need to get TRUE if a cell contains "HDT2", but not if it contains "HDT21" or "HDT24" and so on, even though those strings contain "HDT2" as well.
So right now I am using
grepl("HDT2",data.label[d,2])
in a for-loop to check each row of the second column of data.label for "HDT2". The problem is that this also returns TRUE when the cell contains more than just "HDT2": for example, it also returns TRUE for "HDT21" or "HDT24", which is not what I want.
Is there a way to only check for a certain, delimited string?
Thanks!
EDIT: The strings I have to check are longer than just "HDT2". The string is for example "HDT2 (Arm 1: reference)".
You can use the following regular expression in grepl(). This will return true for an exact match of "HDT2", with nothing coming before or after it.
grepl("^HDT2$",data.label[d,2])
Usage:
> grepl("^HDT2$", "HDT2")
[1] TRUE
> grepl("^HDT2$", "AHDT2")
[1] FALSE
> grepl("^HDT2$", "HDT2 (Arm 1: reference)")
[1] FALSE
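For the longer strings mentioned in the EDIT, anchoring the whole string is too strict, as the last call above shows. One possible alternative, sketched here with made-up sample labels, is to require word boundaries around "HDT2" instead. Since grepl() is vectorised, the whole column can be passed at once rather than looping over rows.

```r
# Sample labels are assumptions for illustration; only "HDT2", "HDT21", "HDT24"
# and "HDT2 (Arm 1: reference)" appear in the question.
labels <- c("HDT2", "HDT2 (Arm 1: reference)", "HDT21 (Arm 2)", "AHDT2", "HDT24")

# \\b matches a word boundary, so "HDT2" must not be directly followed or
# preceded by another letter or digit.
hits <- grepl("\\bHDT2\\b", labels, perl = TRUE)
print(hits)
```

With the data from the question this would be applied as grepl("\\bHDT2\\b", data.label[, 2], perl = TRUE).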

(R) IF statement: stop & warn if FALSE else continue

I'm making a function and before it does any of the hard stuff I need it to check that all the column names listed in the 'samples' dataset are also present in the 'grids' dataset (the function maps one onto the other).
all(names(samples[expvar]) %in% names(grids))
This does that: the expression inside all() asks whether each of the column names selected by 'expvar' in 'samples' is also a name in 'grids'. For a correct expvar of length 3, %in% gives TRUE TRUE TRUE; all() then asks whether all are TRUE, so the result here is TRUE. I want to write an IF statement along the lines of:
if(all(names(samples[expvar]) %in% names(grids)) = FALSE) {stop("Not all expvar column names found as column names in grids")}
No else needed, it'll just carry on. The problem is that the '= FALSE' is redundant because all() is a logically evaluable statement... is there a "carry on" function, e.g.
if(all(etc)) CARRYON else {stop("warning")}
Or, can anyone think of a way I can restructure this to make it work?
You're looking for the function stopifnot.
However you don't need to implement it as
if (okay) {
# do stuff
} else {
stop()
}
which is what you have. Instead you can do
if (!okay) {
stop()
}
# do stuff
since the lines will execute in sequential order. But, again, it might be more readable to use stopifnot, as in:
stopifnot(okay)
# do stuff
I would code it:
if(!all(...))
stop(...)
... rest of program ...
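Put together for the column check in the question, a sketch could look like this. The function name check_columns and the example data frames are hypothetical; using setdiff() is an extra so that the error message can name the missing columns.

```r
# Hypothetical wrapper around the check from the question: stop with an
# informative message if any expvar column of `samples` is absent from `grids`.
check_columns <- function(samples, grids, expvar) {
  missing_cols <- setdiff(names(samples[expvar]), names(grids))
  if (length(missing_cols) > 0) {
    stop("expvar column names not found in grids: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)   # nothing failed, so the caller simply carries on
}

# Made-up example data
samples <- data.frame(depth = 1:2, temp = 3:4, salinity = 5:6)
grids   <- data.frame(depth = 1:2, temp = 3:4)

ok  <- check_columns(samples, grids, c("depth", "temp"))
msg <- tryCatch(check_columns(samples, grids, c("depth", "salinity")),
                error = function(e) conditionMessage(e))
print(msg)
```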

R numeric variable, non null, non na but empty

Hi everyone.
I have a problem with R that I can't fix. I'm currently working with the GEOquery package and I want to retrieve some information from the metadata of GSE files.
More precisely, I'm looking for the channel label (for example Cye3). Here's a sample of my code:
>library(GEOquery)
>gse<-getGEO("GSE2253",GSEMatrix=TRUE,destdir=".")
>gse<-gse[[1]]
>gse$label_ch1[1]
V2
Levels: According to Affymetrix protocol (biotin)
And here's my problem:
> is.na(gse$label_ch1[1])
V2
FALSE
> is.null(gse$label_ch1[1])
[1] FALSE
This GSE file is a text file, and the line corresponding to the label (!Sample_label_ch1) has no value. So here's what I've done as a workaround:
if(is.na(gse$label_ch1[1])){
color<-"Non specified"
} else {
label<-gse$label_ch1[1]
}
So if I get no information for the channel I just say "Non specified"; otherwise I return the value. But I get an error with this if/else statement in my script:
Error in if (file == "") file <- stdout() else if (is.character(file)) { :
the length of argument is null
Sorry if the translation of the error is not exact; my R version is in French.
I tried
if(as.character(gse$label_ch1[1])=="")
but it doesn't work either.
If someone has an idea, I'd appreciate the help.
Thanks in advance!
Script:
sample<-NULL
output<-NULL
gse<-NULL
color<-NULL
series_matrix<-dir(getwd(),pattern="*series_matrix.txt")
series_matrix<-unlist(strsplit(series_matrix,"_")[1])
for(i in 1:length(series_matrix)){
gse<-getGEO(series_matrix[i],GSEMatrix=TRUE,destdir=".")
gse<-gse[[1]]
if(length(gse$label_ch1[1])==0){
color<-"Non specified"
} else {
color<-gse$label_ch1[1]
}
print (color)
sample<-cbind(as.character(gse$title),as.character(gse$geo_accession))
outputsample<-paste(getwd(),"/sample.txt",sep="")
write.table(paste("txt",color,sep=""),output,
row.names=FALSE,col.names=FALSE,sep="\t",quote=FALSE)
write.table(sample,outputsample,
row.names=FALSE,col.names=FALSE,sep="\t",quote=FALSE,append=TRUE)
Feature_Num<-list(1:length(featureNames(gse)))
Gene_Symbol<-pData(featureData(gse)[,11])
Probe_Name<-pData(featureData(gse)[,1])
Control_Type<-pData(featureData(gse)[,3])
liste<-as.character(sampleNames(gse))
for(i in 1:length(liste)){
values<-cbind(Feature_Num,Gene_Symbol,Probe_Name,Control_Type,exprs(gse)[,i])
colnames(values)<-c("Feature_Num","Gene_Symbol",
"Probe_Name","Control_Type","gMedianSignal")
write.table(values,paste(getwd(),"/Ech",liste[i],".txt",sep=""),
row.names=FALSE,quote=FALSE,sep="\t")
}
}
Don't hesitate to ask if you want an explanation of any lines in this script.
Yes, in R you can create a zero-length object:
foo<-vector()
foo
logical(0)
Then change it:
foo<-NULL
foo
NULL
It's confusing at first, but if you ever took some abstract algebra, you may remember the difference between the "empty set" and a set whose only element is the "empty set."
After asking on another forum I finally got a solution for this non-NULL/non-NA problem:
gse$label_ch1[1] has length 1
> length(gse$label_ch1[1])
[1] 1
but we can convert this variable to character:
> as.character(gse$label_ch1[1])
[1] ""
and with this line
> nchar(as.character(gse$label_ch1[1]))
[1] 0
we can check whether the gse$label_ch1[1] value is really empty or not.
Thank you all for your help!
Cheers
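The checks above can be folded into one small test. This is only a sketch: the helper name is_blank is made up, and a stand-in factor is used instead of real GEOquery output. Separately, the quoted error text looks like the file check inside write.table(), and the posted script passes output while it is still NULL, so that call may also be worth checking; this is an observation about the script, not something confirmed in the thread.

```r
# Hypothetical helper: TRUE when the first element of `x` is missing, NA,
# or an empty/whitespace-only string after conversion to character.
is_blank <- function(x) {
  x <- as.character(x)
  length(x) == 0 || is.na(x[1]) || !nzchar(trimws(x[1]))
}

# Stand-in for gse$label_ch1[1]: a factor whose value is the empty string
label <- factor("", levels = c("", "biotin"))

color <- if (is_blank(label[1])) "Non specified" else as.character(label[1])
print(color)
```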

Documentation of squared bracket `[` function

I have a function in R that looks somewhat like this:
setMethod('[', signature(x="stack"),definition=function(x,i,j,drop){
new('class', as(x, "SpatialPointsDataFrame")[i,]) })
I use it to get a single element out of a stacked object. For the package I'm building I need a .Rd file to document the function. I stored it as [.Rd but somehow the R CMD check does not see this. It returns:
Undocumented S4 methods: generic '[' and siglist 'MoveStack,ANY,ANY'
The [.Rd file starts with these lines:
\name{[}
\alias{[}
\alias{[,stack,ANY,ANY-method}
\docType{methods}
\title{Returns an object from a stack}
\description{Returning a single object}
\usage{
\S4method{\[}{stack,ANY,ANY}(x,i,y,drop)
}
Any idea how I make R CMD check aware of this file?
If you look at the source code of the sp package, for example SpatialPolygons-class.Rd, the Methods section is where the method for [ is documented:
\section{Methods}{
Methods defined with class "SpatialPolygons" in the signature:
\describe{
\item{[}{\code{signature(obj = "SpatialPolygons")}: select subset of (sets of) polygons; NAs are not permitted in the row index}
\item{plot}{\code{signature(x = "SpatialPolygons", y = "missing")}:
plot polygons in SpatialPolygons object}
\item{summary}{\code{signature(object = "SpatialPolygons")}: summarize object}
\item{rbind}{\code{signature(object = "SpatialPolygons")}: rbind-like method}
}
}
The name and alias in that file are
\name{SpatialPolygons-class}
\alias{[,SpatialPolygons-method}
If you look at the help page for ?SpatialPolygons you should see
> Methods
>
> Methods defined with class "SpatialPolygons" in the signature:
>
> [ signature(obj = "SpatialPolygons"): select subset of (sets of)
> polygons; NAs are not permitted in the row index
>
So I would venture a guess that if you use a proper (ASCII-named) file name and give it an alias as in the above example, you should be fine.

Resources