Creating a vector of graph objects in R

I am currently working with the 'igraph' package in R.
I have written two functions that each build a table of statistics for a graph object. They work well when called directly on a single graph object (here is an example of what they look like):
Sfn <- function(x) # Give a table of statistics for nodes
{
Name <- deparse(substitute(x))
Nodes <- V(x)$name
Dtotal <- degree(x, mode="all")
Eigenvector <- eigen_centrality(x)$vector # eigen_centrality() returns a list; the scores are in $vector
statistics_table <- data.frame(Nodes,
Dtotal,
Eigenvector)
colnames(statistics_table) <- c("Nodes","Total Degrees",
"Eigenvector centrality")
write.table(statistics_table,
file = paste0("Table_of_",Name,"_nodes.csv"),
sep=",",
row.names = FALSE)
print("Success.")
}
Since I have several graph objects, I would rather not write one call per object, such as:
Sfn(g)
Sfn(g2)
Sfn(g3)
# etc...
Sfn(n)
I would therefore like to create a list in which I can collect all my graph objects. I created something like this:
G <- c(
list(CC1),list(CC2),list(CC3),
list(CC4),list(CC5),list(CC6),
list(CC7),list(CC8),list(CC9),
list(CC10),list(CC11),list(CC12))
Yet this solution is not optimal. First, it is too long to write if I have, for example, 100 graph objects. Second, if I write my script with a for() loop, the name of the variable sent to my function will be the name of the loop variable of for(), which ruins the variable Name in my function Sfn. In other words, the script for(i in G) {Sfn(i)} does not work, because the variable Name will be equal to "i":
# In my function Sfn, Name <- deparse(substitute(i)),
for(i in G) {print(deparse(substitute(i)))}
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
[1] "i"
Also, the solution in "Change variable name in for loop using R" does not work, because my graph objects have very different, arbitrarily assigned names (such as "CC1", "g2", "CT3", "CC1T3", etc.).
Do you have any idea how I could:
1 - find a better way of creating a vector of graph objects?
2 - make the name seen inside my function the same as the actual name of the variable passed to it?

Using deparse(substitute()) makes it hard to do what you want. If you really want to do this without changing your Sfn function, you'll need to construct a call to it as a string and parse that. For example:
names <- c("CC1", "g2")
Sfn <- function(x) deparse(substitute(x)) # just return the name
result <- list()
for (n in names) {
result[[n]] <- eval(parse(text = paste("Sfn(", n, ")")))
}
result
#> $CC1
#> [1] "CC1"
#>
#> $g2
#> [1] "g2"
Created on 2021-09-12 by the reprex package (v2.0.0)
This could be much simpler if you passed the name you want as a string to Sfn, instead of trying to get it using deparse(substitute()), e.g.
names <- c("CC1", "g2")
Sfn <- function(x, name) name
result <- list()
for (n in names)
result[[n]] <- Sfn(n, n)
result
#> $CC1
#> [1] "CC1"
#>
#> $g2
#> [1] "g2"
Created on 2021-09-12 by the reprex package (v2.0.0)
Edited to add: Not only is the second solution cleaner, it's safer too. If you don't have complete control of the names vector, there's a huge security risk: someone could set the "name" to some executable code (see https://xkcd.com/327/) and it would be executed.
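As a rough sketch of the second approach applied to the original question, the objects can be gathered into a named list with mget() and the name passed along explicitly. The stand-in objects and the helper Sfn2 below are hypothetical and only build the output file name; a real version would compute the igraph statistics. For many objects, the vector of names could also be built with ls(pattern = ...).

```r
# Stand-ins for graph objects (hypothetical; real code would hold igraph graphs).
CC1 <- list(n_nodes = 3)
g2  <- list(n_nodes = 5)

# Collect the objects into a named list by their variable names.
graph_names <- c("CC1", "g2")
G <- mget(graph_names)

# Hypothetical variant of Sfn that receives the name as an explicit argument
# instead of recovering it with deparse(substitute()).
Sfn2 <- function(x, name) {
  paste0("Table_of_", name, "_nodes.csv")
}

# Loop over the names, looking up each object in the list.
files <- vapply(names(G), function(nm) Sfn2(G[[nm]], nm), character(1))
print(files)
```

Because the name travels as ordinary data, no string is ever parsed or evaluated as code.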

Related

Extract strings based on multiple patterns

I have thousands of DNA sequences that look like this:
ref <- c("CCTACGGTTATGTACGATTAAAGAAGATCGTCAGTC", "CCTACGCGTTGATATTTTGCATGCTTACTCCCAGTC",
"CCTCGCGTTGATATTTTGCATGCTTACTCCCAGTC")
I need to extract every sequence between CTACG and CAGTC. However, many of these sequences contain an error (deletion, insertion, substitution). Is there any way to account for mismatches based on Levenshtein distance?
ref <- c("CCTACGGTTATGTACGATTAAAGAAGATCGTCAGTC", "CCTACGCGTTGATATTTTGCATGCTTACTCCCAGTC",
"CCTCGCGTTGATATTTTGCATGCTTACTCCCAGTC")
qdapRegex::ex_between(ref, "CTACG", "CAGTC")
#> [[1]]
#> [1] "GTTATGTACGATTAAAGAAGATCGT"
#>
#> [[2]]
#> [1] "CGTTGATATTTTGCATGCTTACTCC"
#>
#> [[3]]
#> [1] NA
Created on 2021-12-18 by the reprex package (v2.0.1)
That way I would also be able to extract the sequence in the last case.
UPDATE: can I create a dictionary with a certain Levenshtein distance and then match it to each sequence?
You can use aregexec to find the approximate matches, build a regex pattern from them with sprintf, and finally remove the matches using gsub. Wrapping this in a Vectorized function avoids cluttering the script with lapply calls or loops.
In the regex, the .* refers to everything before (resp. after) the respective letters. Note that you will probably need to adapt max.distance= to your real data.
fu <- Vectorize(function(x) {
p1 <- regmatches(x, aregexec('.*CTACG', x, max.distance=0.1))
p2 <- regmatches(x, aregexec('CAGTC.*', x, max.distance=0.1))
gsub(sprintf('%s|%s', p1, p2), '', x, perl=TRUE)
})
fu(ref)
# CCTACGGTTATGTACGATTAAAGAAGATCGTCAGTC CCTACGCGTTGATATTTTGCATGCTTACTCCCAGTC
# "GTTATGTACGATTAAAGAAGATCGT" "CGTTGATATTTTGCATGCTTACTCC"
# CCTCGCGTTGATATTTTGCATGCTTACTCCCAGTC
# "CGTTGATATTTTGCATGCTTACTCC"
Data:
ref <- c("CCTACGGTTATGTACGATTAAAGAAGATCGTCAGTC", "CCTACGCGTTGATATTTTGCATGCTTACTCCCAGTC",
"CCTCGCGTTGATATTTTGCATGCTTACTCCCAGTC")
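To get a feel for a suitable tolerance before tuning max.distance=, one option is to measure how far each read is from a flank with utils::adist() and partial = TRUE, which reports the smallest edit distance between the pattern and any substring of the read. This is only a sketch; the threshold max_edits is an assumed value for illustration.

```r
ref <- c("CCTACGGTTATGTACGATTAAAGAAGATCGTCAGTC",
         "CCTACGCGTTGATATTTTGCATGCTTACTCCCAGTC",
         "CCTCGCGTTGATATTTTGCATGCTTACTCCCAGTC")

# Smallest edit distance between the left flank and any substring of each read.
left_dist <- drop(adist("CTACG", ref, partial = TRUE))

# Hypothetical tolerance: accept reads whose flank is at most one edit away.
max_edits <- 1
usable <- left_dist <= max_edits

print(left_dist)
print(usable)
```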

Extracting coefficients while looping over variable names

I'm working on some time-series stuff in R (version 3.4.1), and would like to extract coefficients from regressions I ran, in order to do further analysis.
All results are so far saved as uGARCHfit objects, which are basically complicated list objects, from which I want to extract the coefficients in the following manner.
What I want is in essence this:
for(i in list){
i_GARCH_mxreg <- i_GARCH@fit$robust.matcoef[5,1]
}
"list" is a list object, where every element is the name of one observation. For now, I want my loop to create a new numeric object named as I specified in the loop.
Now this obviously doesn't work because the index, 'i', isn't replaced as I would want it to be.
How do I rewrite my loop appropriately?
Minimal working example:
list <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
for (i in list){
i_b <- i_a
}
What this should give me is:
> one_b
[1] 1
> two_b
[1] 2
> three_b
[1] 3
Clarification:
I want to extract the coefficients from multiple list objects. These are named in the manner 'string'_obj. The problem is that I don't have a function that would extract these coefficients, and the list "is not subsettable", so I have to call the individual objects via obj@fit$robust.matcoef[5,1] (or is there another way?). I wanted to use the loop to take my list of strings and, in every iteration, take one string, evaluate 'string'_obj@fit$robust.matcoef[5,1], and save this value into an object, again named " 'string'_name ".
It might well be easier to have this in a list rather than in individual objects, as someone suggested with lapply, but this is not my primary concern right now.
There is likely an easy way to do this, but I am unable to find it. Sorry for any confusion and thanks for any help.
The following should match your desired output:
# your list
l <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
# my workspace: note that there is no one_b, two_b, three_b
ls()
[1] "l" "one_a" "three_a" "two_a"
for (i in l){
# first, let's define the names as characters, using paste:
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# then let's assign the values. Since we are working with
# characters, the functions assign and get come in handy:
assign(dest, get(orig) )
}
# now let's check my workspace again. Note one_b, two_b, three_b
ls()
[1] "dest" "i" "l" "one_a" "one_b" "orig" "three_a"
[8] "three_b" "two_a" "two_b"
# let's check that the values are correct:
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
To comment on the functions used: assign takes a character as first argument, which is supposed to be the name of the newly created object. The second argument is the value of that object. get takes a character and looks up the value of the object in the workspace with the same name as that character. For instance, get("one_a") will yield 1.
Also, just to follow up on my comment earlier: If we already had all the coefficients in a list, we could do the following:
# hypothetical coefficients stored in list:
lcoefs <- list(1,2,3)
# let's name the coefficients:
lcoefs <- setNames(lcoefs, paste0(c("one", "two", "three"), "_c"))
# push them into the global environment:
list2env(lcoefs, envir = .GlobalEnv)
# look at environment:
ls()
[1] "dest" "i" "l" "lcoefs" "one_a" "one_b" "one_c"
[8] "orig" "three_a" "three_b" "three_c" "two_a" "two_b" "two_c"
one_c
[1] 1
two_c
[1] 2
three_c
[1] 3
And to address the comments, here is a slightly more realistic example, taking the list structure into account:
l <- as.list(c("one", "two", "three"))
# let's "hide" the values in a list:
one_a <- list(val = 1)
two_a <- list(val = 2)
three_a <- list(val = 3)
for (i in l){
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# let's get the list-object:
tmp <- get(orig)
# extract value:
val <- tmp$val
assign(dest, val )
}
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
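If keeping the results in one named vector is acceptable, the same extraction might be sketched without assign() at all. The objects and the val element below are the toy stand-ins from the example above, not real uGARCHfit objects, and keys is a hypothetical name for the vector of prefixes.

```r
# Toy stand-ins for the fitted objects, as in the example above.
one_a   <- list(val = 1)
two_a   <- list(val = 2)
three_a <- list(val = 3)

keys <- c("one", "two", "three")

# Fetch all objects by name in one step, then pull the value out of each.
objs  <- mget(paste0(keys, "_a"))
coefs <- vapply(objs, function(o) o$val, numeric(1))
names(coefs) <- paste0(keys, "_b")

print(coefs)
```

Individual values are then available as coefs[["two_b"]], and the whole vector can be passed on to further analysis in one piece.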

R: Argument as variablename and string in function? [duplicate]

I am looking for the reverse of get().
Given an object name, I wish to have the character string representing that object extracted directly from the object.
Trivial example with foo being the placeholder for the function I am looking for.
z <- data.frame(x=1:10, y=1:10)
test <- function(a){
mean.x <- mean(a$x)
print(foo(a))
return(mean.x)}
test(z)
Would print:
"z"
My work around, which is harder to implement in my current problem is:
test <- function(a="z"){
mean.x <- mean(get(a)$x)
print(a)
return(mean.x)}
test("z")
The old deparse-substitute trick:
a<-data.frame(x=1:10,y=1:10)
test<-function(z){
mean.x<-mean(z$x)
nm <-deparse(substitute(z))
print(nm)
return(mean.x)}
test(a)
#[1] "a" ... this is the side-effect of the print() call
# ... you could have done something useful with that character value
#[1] 5.5 ... this is the result of the function call
Edit: Ran it with the new test-object
Note: this will not succeed inside a local function when a set of list items are passed from the first argument to lapply (and it also fails when an object is passed from a list given to a for-loop.) You would be able to extract the ".Names"-attribute and the order of processing from the structure result, if it were a named vector that were being processed.
> lapply( list(a=4,b=5), function(x) {nm <- deparse(substitute(x)); strsplit(nm, '\\[')} )
$a # This "a" and the next one in the print output are put in after processing
$a[[1]]
[1] "X" "" "1L]]" # Notice that there was no "a"
$b
$b[[1]]
[1] "X" "" "2L]]"
> lapply( c(a=4,b=5), function(x) {nm <- deparse(substitute(x)); strsplit(nm, '\\[')} )
$a
$a[[1]] # but it's theoretically possible to extract when its an atomic vector
[1] "structure(c(4, 5), .Names = c(\"a\", \"b\"))" ""
[3] "1L]]"
$b
$b[[1]]
[1] "structure(c(4, 5), .Names = c(\"a\", \"b\"))" ""
[3] "2L]]"
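When the names are needed inside the function, a simpler route than picking apart the deparsed call is to hand them over as a second argument. A minimal sketch with Map(), using the same toy list (vals and labelled are hypothetical names):

```r
vals <- list(a = 4, b = 5)

# Map() walks over the values and their names in parallel.
labelled <- Map(function(value, name) paste0(name, "=", value),
                vals, names(vals))

print(labelled)
```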
deparse(quote(var))
My intuitive understanding:
quote() freezes the variable or expression so that it is not evaluated,
and deparse(), which is the inverse of parse(), turns that frozen symbol back into a string.
Note that for print methods the behavior can be different.
print.foo=function(x){ print(deparse(substitute(x))) }
test = list(a=1, b=2)
class(test)="foo"
#this shows "test" as expected
print(test)
#this (just typing 'test' on the R command line)
test
#shows
#"structure(list(a = 1, b = 2), .Names = c(\"a\", \"b\"), class = \"foo\")"
Other comments I've seen on forums suggest that the last behavior is unavoidable. This is unfortunate if you are writing print methods for packages.
To elaborate on Eli Holmes' answer:
myfunc works beautifully when called directly.
I was tempted to call it within another function (as discussed in his Aug 15, '20 comment).
That fails.
Within a function, coded directly (rather than called from an external function), the deparse(substitute()) trick works well.
This is all implicit in his answer, but for the benefit of peeps with my degree of obliviousness, I wanted to spell it out.
an_object <- mtcars
myfunc <- function(x) deparse(substitute(x))
myfunc(an_object)
#> [1] "an_object"
# called within another function
wrapper <- function(x){
myfunc(x)
}
wrapper(an_object)
#> [1] "x"
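One possible workaround, sketched here under the assumption that you control both functions, is to give the inner function an explicit name argument whose default is the deparse(substitute()) result, and to have the wrapper capture the name itself and forward it. The names myfunc2 and wrapper2 are hypothetical.

```r
an_object <- 1:3

# The default still captures the caller's expression when no name is supplied.
myfunc2 <- function(x, name = deparse(substitute(x))) name

# The wrapper captures the name at its own level and passes it down.
wrapper2 <- function(x) {
  nm <- deparse(substitute(x))
  myfunc2(x, name = nm)
}

direct  <- myfunc2(an_object)
wrapped <- wrapper2(an_object)
print(c(direct, wrapped))
```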

How to write a function that will return the output as an S3 Class

I am new to R and I am having some difficulty understanding how to return the output of a function as an S3 class. I have some text and I need to write a summary method for it that will count the number of words in the text and the frequency of the top 3 words in the text. I have a function countwords that will count the words. The text is above the code:
text = 'The time of year was spring the sun shone for the birds who were not singing yet. The Local farmer was out in the fields preparing for the summer ahead. He had a spring in his step, for he was whistling.'
#counts the number of words in the text
countwords = function(x) {
# Read in the words from the text and separate into a vector
txt = unlist(strsplit(x,' '))
# Loop through each word
k = 0
for(i in 1:length(txt)) {
k = k + 1
}
return(k)
}
countwords(text)
How do I return the output of this as an S3 class? How do I write a summary method/function to count the words and also the top 3 words in the text? I am new to R and need some help understanding S3 classes, methods, and functions. Is a function the same as a method?
Thank you
Here's one way to do both things, which illustrates the way to add a class and not have to write all the methods you might need for that class. I've also tweaked your function a bit to be more efficient and to work on a vector of strings as inputs. You also don't need the return() call; IIRC it is slightly more efficient to not call return explicitly but to use the fact that R returns automatically the result of the final statement in the function.
mystring <- "The time of year was spring the sun shone for the birds who were not singing yet. The Local farmer was out in the fields preparing for the summer ahead. He had a spring in his step, for he was whistling."
# counts the number of words in the text
countwords <- function(x) {
# Read in the words from the text and separate into a vector
txt <- strsplit(x, " ")
n <- sapply(txt, length)
top3 <- lapply(txt, function(x) names(tail(sort(table(x)), 3)))
out <- list(n = n, top3 = top3)
class(out) <- c("mysummary", "list")
out # implied that we return out here
}
countwords(mystring)
This gets us:
> countwords(mystring)
$n
[1] 41
$top3
$top3[[1]]
[1] "for" "was" "the"
attr(,"class")
[1] "mysummary" "list"
Which isn't pretty, but we can sort that later with a print method. Notice that this is just a list, hence I used class(out) <- c("mysummary", "list") as my S3 class(es) to indicate inheritance from class "list"
> str(countwords(mystring))
List of 2
$ n : int 41
$ top3:List of 1
..$ : chr [1:3] "for" "was" "the"
- attr(*, "class")= chr [1:2] "mysummary" "list"
That means we can subset it like any list without writing those methods:
> cw <- countwords(mystring)
> cw$n
[1] 41
> cw[[2]]
[[1]]
[1] "for" "was" "the"
That's all you really need for an S3 class. This doesn't change even if you stick this in a package. (What you need to do extra then relates to methods for your class, and we don't have any of those as we inherit from class "list".)
> inherits(cw, "list")
[1] TRUE
If you want to add a print method we can just do:
`print.mysummary` <- function(x, ...) {
writeLines(strwrap("Number of words:", prefix = "\n"))
print(x$n, ...)
writeLines(strwrap("Top 3 Words:", prefix = "\n"))
print(x$top3, ...)
invisible(x)
}
which then produces:
> cw
Number of words:
[1] 41
Top 3 Words:
[[1]]
[1] "for" "was" "the"
To make @Roland's comments a little more explicit:
First, you'll create an S3 class of your own, let's call it myTextClass, and then assign it as the class attribute of your object text.
class(text)<-c('myTextClass')
At this point, the class myTextClass isn't doing much for us, so we need to make a method for this class. In particular, we'll make a new method for the summary function so that whenever summary encounters an object of class myTextClass it will execute our desired method instead of any others.
summary.myTextClass<-function(t){
z1<- paste('Length of text is ', length(unlist(strsplit(t,' '))) ,sep=' ')
tb<-table(unlist(strsplit(t,' ')))
topWords<-paste(names(tb)[1:3],tb[1:3],sep=':')
z2<- paste(c('Top words are... ', topWords) ,collapse=' ')
return(c(z1,z2))
}
This method does some of the basic things you mention such as word counts and so on. Now when we call the generic function summary on an object of class myTextClass, this particular method will be called.
summary(text)
[1] "Length of text is 41" "Top words are... a:1 ahead.:1 birds:1"
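Note that the method above reports the first three words in alphabetical order rather than the most frequent ones. A variant that sorts the counts first, and returns its own summary class with a matching print method, could look like the sketch below. The class names wordtext and summary.wordtext and the short sample string are assumptions made for illustration.

```r
# A short sample text tagged with a hypothetical S3 class.
txt <- structure("a b a c a b d", class = "wordtext")

# summary method: count the words and keep the three most frequent ones.
summary.wordtext <- function(object, ...) {
  words  <- unlist(strsplit(unclass(object), " ", fixed = TRUE))
  counts <- sort(table(words), decreasing = TRUE)
  structure(list(n = length(words), top = head(counts, 3)),
            class = "summary.wordtext")
}

# print method for the summary object.
print.summary.wordtext <- function(x, ...) {
  cat("Number of words:", x$n, "\n")
  cat("Top words:",
      paste0(names(x$top), ":", as.vector(x$top), collapse = " "), "\n")
  invisible(x)
}

s <- summary(txt)
print(s)
```

Returning a classed list rather than pasted strings keeps the counts available for later computation, while the print method takes care of display.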

R get objects' names from the list of objects

I am trying to get an object's name from the list containing this object. I searched through similar questions and found some suggestions about using the deparse(substitute(object)) formula:
> my.list <- list(model.product, model.i, model.add)
> lapply(my.list, function(model) deparse(substitute(model)))
and the result is:
[[1]]
[1] "X[[1L]]"
[[2]]
[1] "X[[2L]]"
[[3]]
[1] "X[[3L]]"
whereas I want to obtain:
[1] "model.product", "model.i", "model.add"
Thank you in advance for being of some help!
You can write your own list() function so it behaves like data.frame(), i.e., uses the un-evaluated arg names as entry names:
List <- function(...) {
names <- as.list(substitute(list(...)))[-1L]
setNames(list(...), names)
}
my.list <- List(model.product, model.i, model.add)
Then you can just access the names via:
names(my.list)
names(my.list) #..............
Oh wait, you didn't actually create names did you? There is actually no "memory" for the list function. It returns a list with the values of its arguments but not from whence they came, unless you add names to the pairlist given as the argument.
You won't be able to extract the information that way once you've created my.list.
The underlying way R works is that expressions are not evaluated until they're needed; using deparse(substitute()) will only work before the expression has been evaluated. So:
deparse(substitute(list(model.product, model.i, model.add)))
should work, while yours doesn't.
To save stuffing around, you could employ mget to collect your free-floating variables into a list with the names included:
one <- two <- three <- 1
result <- mget(c("one","two","three"))
result
#$one
#[1] 1
#
#$two
#[1] 1
#
#$three
#[1] 1
Then you can follow @DWin's suggestion:
names(result)
#[1] "one" "two" "three"
