r sna equiv.clust more than one graph

I would like to provide more than one graph as input to the equiv.clust function in the sna package. For example
library(ergm)
library(sna)
data(florentine)
flobusiness # first relation
flomarriage # second relation
eq<-equiv.clust(flobusiness)
b<-blockmodel(flobusiness,eq,h=10)
plot(b)
So far so good. I get the output I expect. However, how do I include both relations in the equiv.clust and blockmodel commands?
According to the Usage in documentation
equiv.clust(dat, g=NULL, equiv.dist=NULL, equiv.fun="sedist",
method="hamming", mode="digraph", diag=FALSE,
cluster.method="complete", glabels=NULL, plabels=NULL, ...)
where
dat one or more graphs.
Specifically, I would like to know how to provide two or more graphs as the dat argument. Thanks a ton

Try entering the graphs as a list, as in:
equiv.clust(list(flobusiness,flomarriage))
I am not sure whether that will work, but in general I think you need to use lists to analyze multiple graphs. In this case, though, it depends on what you want. If you want two separate blockmodels, you could just loop or use
lapply(list(flobusiness,flomarriage), equiv.clust)
and then a slightly more complicated statement for the blockmodel. If you want a blockmodel of the combined network, you could just add them together.
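As a rough illustration of the two options above, here is a minimal base R sketch on toy adjacency matrices. The 4-actor matrices and the names business, marriage and relations are made up for the example; they are not the florentine data. It shows the lapply argument order (data first, function second), an element-wise sum for the combined network, and stacking the matrices into a graphs x actors x actors array, which I believe is the layout sna uses for sets of graphs (check ?equiv.clust before relying on that).

```r
# Two toy relations on the same 4 actors (hypothetical data)
actors <- LETTERS[1:4]
business <- matrix(0, 4, 4, dimnames = list(actors, actors))
marriage <- business
business[1, 2] <- business[2, 1] <- 1
business[3, 4] <- business[4, 3] <- 1
marriage[1, 2] <- marriage[2, 1] <- 1
marriage[2, 3] <- marriage[3, 2] <- 1

relations <- list(business = business, marriage = marriage)

# Option 1: one result per relation. Note the order: data first, function second.
tie_counts <- lapply(relations, sum)

# Option 2: a single combined network.
combined_sum <- Reduce(`+`, relations)   # counts how many relations link each pair
combined_any <- (combined_sum > 0) * 1   # 1 if the pair is tied in either relation

# Stack the relations into a 2 x 4 x 4 array (graphs along the first dimension)
graph_stack <- aperm(simplify2array(relations), c(3, 1, 2))
```

With real data you would pass sum's place to equiv.clust in option 1, or hand combined_any (or the stack) to equiv.clust in option 2.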

Related

What are the rules for ppp objects? Is selecting two variables possible for an sapply function?

I am working with code that describes a Poisson cluster process in spatstat, breaking it down one line at a time to understand it. The beginning is easy.
library(spatstat)
lambda<-100
win<-owin(c(0,1),c(0,1))
n.seeds<-lambda*win$xrange[2]*win$yrange[2]
Once the window is defined I then generate my points using a random generation function
x=runif(min=win$xrange[1],max=win$xrange[2],n=pmax(1,n.seeds))
y=runif(min=win$yrange[1],max=win$yrange[2],n=pmax(1,n.seeds))
This can be plotted straight away I know using the ppp function
seeds<-ppp(x=x,
y=y,
window=win)
plot(seeds)
In the next line I add marks to the ppp object. They apparently describe the angle of rotation of the points. I don't understand how this works right now, but that is okay; I will figure it out later.
marks<-data.frame(angles=runif(n=pmax(1,n.seeds),min=0,max=2*pi))
seeds1<-ppp(x=x,
y=y,
window=win,
marks=marks)
The first problem I encounter is that an object called pops, describing the populations of the window, is added to the ppp object. I understand how the values are derived: it is a Poisson distribution given the input value mu, which can be any value, with the total number of observations equal to the number of points in the window.
seeds2<-ppp(x=x,
y=y,
window=win,
marks=marks,
pops=rpois(lambda=5,n=pmax(1,n.seeds)))
My first question is, how is it possible to add a variable that has no classification in the ppp object? I checked the ppp documentation and there is no mention of pops.
The second question I have is about using double variables, the next line requires an sapply function to define dimensions.
dim1<-pmax(1,sapply(seeds1$marks$pops, FUN=function(x)rpois(n=1,sqrt(x))))
I have never seen the $ operator used twice, and seeds2$marks$pop returns the error "$ operator is invalid for atomic vectors". Could you explain what is going on here?
Many thanks.
That's several questions - please ask one question at a time.
From your post it is not clear whether you are trying to understand someone else's code, or developing code yourself. This makes a difference to the answer.
Just to clarify, this code does not come from inside the spatstat package; it is someone's code using the spatstat package to generate data. There is code in the spatstat package to generate simulated realisations of a Poisson cluster process (which is I think what you want to do), and you could look at the spatstat code for rPoissonCluster to see how it can be done correctly and efficiently.
The code you have shown here has numerous errors. But I will start by answering the two questions in your title.
The rules for creating ppp objects are set out in the help file for ppp. The help says that if the argument window is given, then unmatched arguments ... are ignored. This means that in the line seeds2<-ppp(x=x,y=y,window=win,marks=marks,pops=rpois(lambda=5,n=pmax(1,n.seeds)))
the argument pops will be ignored.
The idiom sapply(seeds1$marks$pops, FUN=f) is perfectly valid syntax in R. If the object seeds1 is a structure or list which has a component named marks, which in turn is a structure or list which has a component named pops, then the idiom seeds1$marks$pops would extract it. This has nothing particularly to do with sapply.
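To make the chained $ concrete, here is a small base R sketch. The object pattern is a hypothetical stand-in (a plain list, not a real ppp object); it only mimics the "component named marks that has a component named pops" situation described above, and then shows how the same expression fails when marks is an atomic vector.

```r
# Hypothetical stand-in for a marked point pattern: a plain list, not a ppp
pattern <- list(
  x = c(0.1, 0.5, 0.9),
  y = c(0.2, 0.4, 0.8),
  marks = data.frame(angles = c(0, pi / 2, pi), pops = c(4L, 6L, 5L))
)

# Each $ extracts one level: first the data frame, then its column
pops_a <- pattern$marks$pops
pops_b <- pattern[["marks"]][["pops"]]   # the same extraction written with [[ ]]

# If marks is an atomic vector instead, the second $ has nothing to extract from
vector_marks <- list(marks = c(0, pi / 2, pi))
msg <- tryCatch(vector_marks$marks$pops,
                error = function(e) conditionMessage(e))
```

Here msg holds the error text rather than a value, which matches the message quoted in the question.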
Now turning to errors in the code,
The line n.seeds<-lambda*win$xrange[2]*win$yrange[2] is presumably meant to calculate the expected number of cluster parents (cluster seeds) in the window. This would only work if the window is a rectangle with bottom left corner at the origin (0,0). It would be safer to write n.seeds <- lambda * area(win).
However, the variable n.seeds is used later as if it were the number of cluster parents (cluster seeds). The author has forgotten that the number of seeds is random with a Poisson distribution. So, the more correct calculation would be n.seeds <- rpois(1, lambda * area(win))
However this is still not correct because cluster parents (seed points) outside the window can also generate offspring points inside the window. So, seed points must actually be generated in a larger window obtained by expanding win. The appropriate command used inside spatstat to generate the cluster parents is bigwin <- grow.rectangle(Frame(win), cluster_diameter) ; Parents <- rpoispp(lambda, bigwin)
The author apparently wants to assign two mark values to each parent point: a random angle and a random number pops. The correct way to do this is to make the marks a data frame with two columns, for example marks(seeds1) <- data.frame(angles=runif(n.seeds, max=2*pi), pops=rpois(n.seeds, 5))
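The last two corrections can be tried out without spatstat. The following is a base R sketch only: win_area is an assumed stand-in for area(win) on the unit square, the seed is arbitrary, and no ppp object is built. It draws a Poisson number of parents, builds the two-column marks data frame, and then runs the sapply line from the question on the pops column.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility

lambda   <- 100
win_area <- 1                    # assumption: stands in for area(win), unit square

# The number of parents is itself Poisson, not fixed at lambda * area
n.seeds <- rpois(1, lambda * win_area)

# Two mark values per parent: a random angle and a random count
marks <- data.frame(
  angles = runif(n.seeds, min = 0, max = 2 * pi),
  pops   = rpois(n.seeds, lambda = 5)
)

# With pops stored as a column, marks$pops is a plain integer vector
dim1 <- pmax(1, sapply(marks$pops, function(p) rpois(n = 1, sqrt(p))))
```

In spatstat itself the data frame would be attached with marks(seeds1) <- marks, as in the answer.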

How to use Cheminformatics Toolkit for R to compare a set of SMILES structures

I have a set of SMILES codes of different molecules and I would like to know how to determine similarity among them. I have decided to use the ChemmineR package based on this tutorial. The issue is that I cannot understand how to connect my dataframe and use it like a ChemmineR object in order to run the analysis on SMILES.
DrugName<-c("alclofenac","alosetron")
DrugID_CID<-c("30951","2099")
DrugID<-c("CHEMBL94081","DB00969")
DrugBank<-c("DB13167","DB00969")
SMILES<-c("OC(=O)Cc1ccc(OCC=C)c(Cl)c1","Cc1[nH]cnc1CN1CCc2c(C1=O)c1ccccc1n2C")
Target<-c("PTGS1","HTR3A")
test<-data.frame(DrugName,DrugID_CID,DrugID,DrugBank,SMILES,Target)
I have used the read.SMIset function which imports one or many molecules from a SMILES file and stores them in a SMIset container but I cannot understand how to further proceed with this.
library("ChemmineR")
test; smiset <- smisample
write.SMI(smiset, file="sub.smi")
smiset <- read.SMIset("sub.smi")
data(smisample) # Loads the same SMIset provided by the library
smiset <- smisample
smiset
view(smiset)
cid(smiset)
smi <- as.character(smiset)
as(smi, "SMIset")
It's not entirely clear what you want to compare with what. However, here is one way to proceed with the SMILES in your example data frame.
First you need to convert the SMILES to an SDFset. This is the first step in most ChemmineR operations.
test_sdf <- smiles2sdf(test$SMILES)
For pairwise comparison using atom pairs, you need to convert again to an APset:
test_ap <- sdf2ap(test_sdf)
You could now compare, for example, the first compound in the APset with the second:
cmp.similarity(test_ap[1], test_ap[2])
[1] 0.1313131
I would spend some time reading and working through the ChemmineR vignette linked in your question. It's a lot of information but it is well-presented, very clear and covers most things that you'll want to do.
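For intuition about the number returned above: as far as I know, cmp.similarity reports a Tanimoto coefficient by default, i.e. shared descriptors divided by all descriptors present in either compound. The sketch below does not use ChemmineR; the feature labels (f1, f2, ...) and the names drugA and drugB are invented placeholders rather than real atom pairs, and it treats the descriptors as plain sets, which is a simplification.

```r
# Set-based Tanimoto coefficient: |A intersect B| / |A union B|
tanimoto <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical descriptor sets for two compounds (placeholders, not atom pairs)
features <- list(
  drugA = c("f1", "f2", "f3", "f4"),
  drugB = c("f3", "f4", "f5")
)

# All-against-all similarity matrix, labelled by compound name
sim <- sapply(features, function(a) sapply(features, function(b) tanimoto(a, b)))
```

The two sets share 2 of 5 distinct features, so the off-diagonal entries of sim are 0.4 and the diagonal is 1.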

In R, looking for a more detailed str() showing full names or a tree

I want to change parts of a ggplot2 object made by a function and returned as a result, to remove the Y-axis label. No, the function does not allow that to be specified in the first place so I want to change it after the fact.
str(theObject) ## shows the nested structure with parts shortened to ".." and I want to be able to type something like:
theObject$A$B$C$myLabel <- ""
So how can I either make a str-like listing with full paths like that, or perhaps draw a tree structure showing the inner workings of the object?
Yes, I can figure things out using names(theObject) and finding which branch leads to what I am looking for, then switching to that branch and repeating but it looks like there could be a better automated way to find a leaf node such as:
leaf_str(obj=theObject, leaf="myLabel")
might return zero or more lines like:
theObject$A$B$C$myLabel
theObject$A$X$Y$Z$myLabel
Or, the entire structure could be put out as a series of such lines.
I have searched and found nothing quite like this. I can see lots of uses, especially in teaching what an object is. Yes, S4 objects might also use @ as well as $.
The tree function in the xfun package may be useful. See https://yihui.org/xfun/ for more details.
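If you would rather roll your own, a minimal sketch of the leaf_str helper imagined in the question could look like the following. It is an illustration under assumptions, not an existing function: it only walks named list-like components (so environments, S4 slots and unnamed elements are not followed, which may matter for some plot objects), and the prefix argument is my addition so the printed paths start with the object's name.

```r
# Hypothetical helper: return the $-paths of every component named `leaf`
leaf_str <- function(obj, leaf, prefix = "obj") {
  # Only named lists are descended into; anything else ends the branch
  if (!is.list(obj) || is.null(names(obj))) {
    return(character(0))
  }
  paths <- character(0)
  for (nm in names(obj)) {
    if (!nzchar(nm)) next                      # skip unnamed elements
    here <- paste0(prefix, "$", nm)
    if (identical(nm, leaf)) {
      paths <- c(paths, here)                  # found a match at this level
    }
    paths <- c(paths, leaf_str(obj[[nm]], leaf, here))  # keep searching below
  }
  paths
}

# Toy nested object mirroring the paths in the question
theObject <- list(
  A = list(
    B = list(C = list(myLabel = "y label")),
    X = list(Y = list(Z = list(myLabel = "another label")))
  )
)

found <- leaf_str(obj = theObject, leaf = "myLabel", prefix = "theObject")
```

For the toy object, found contains the two paths theObject$A$B$C$myLabel and theObject$A$X$Y$Z$myLabel, which can be pasted back into an assignment.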

(wx)Maxima plot point by point, numbered

I have a list of roots and I want to plot the real/imaginary parts. If s=allroots(), r=realpart() and i=imagpart(), all with makelist(). Since length(s) can get ...lengthy, is there a way to plot point by point and have them numbered? Actually, the numbering part is what concerns me most. I can simply use points(r,i) and get the job done, but I'd like to know their occurrence before and after some sorting algorithms. It's not always necessary to plot all the points, I can plot up until some number, but I do have to be able to see the order in which they were sorted.
I have tried multiplot_mode but it doesn't work:
multiplot_mode(wxt)$
for i:1 thru length(s) do draw2d(points([r[i]],[i[i]]))$
multiplot_mode(none)$
All I get is a single point. Now, if this should work, using draw2d's label(["label",posx,posy]) is very handy, but can I somehow evaluate i in the for loop inside the ""?
Or, is there any other way to do it? With Octave? or Scilab? I'm on Linux, btw.
Just to be clear, here's what I currently do: (I can't post images, here's the link: i.stack.imgur.com/hNYZF.png )
...and here is the wxMaxima code:
ptest:sortd(pp2); length(ptest);
draw2d(proportional_axes=xy,xrange=[sort(realpart(s))[1]-0.1,sort(realpart(s))[length(s)]+0.1],
yrange=[sort(imagpart(s))[1]-0.1,sort(imagpart(s))[length(s)]+0.1],point_type=0,
label(["1",realpart(ptest[1]),imagpart(ptest[1])]),points([realpart(ptest[1])],[imagpart(ptest[1])]),
label(["2",realpart(ptest[2]),imagpart(ptest[2])]),points([realpart(ptest[2])],[imagpart(ptest[2])]),
label(["3",realpart(ptest[3]),imagpart(ptest[3])]),points([realpart(ptest[3])],[imagpart(ptest[3])]),
label(["4",realpart(ptest[4]),imagpart(ptest[4])]),points([realpart(ptest[4])],[imagpart(ptest[4])]),
label(["5",realpart(ptest[5]),imagpart(ptest[5])]),points([realpart(ptest[5])],[imagpart(ptest[5])]),
label(["6",realpart(ptest[6]),imagpart(ptest[6])]),points([realpart(ptest[6])],[imagpart(ptest[6])]),
label(["7",realpart(ptest[7]),imagpart(ptest[7])]),points([realpart(ptest[7])],[imagpart(ptest[7])]),
label(["8",realpart(ptest[8]),imagpart(ptest[8])]),points([realpart(ptest[8])],[imagpart(ptest[8])]),
label(["9",realpart(ptest[9]),imagpart(ptest[9])]),points([realpart(ptest[9])],[imagpart(ptest[9])]),
label(["10",realpart(ptest[10]),imagpart(ptest[10])]),points([realpart(ptest[10])],[imagpart(ptest[10])]),
label(["11",realpart(ptest[11]),imagpart(ptest[11])]),points([realpart(ptest[11])],[imagpart(ptest[11])]),
label(["12",realpart(ptest[12]),imagpart(ptest[12])]),points([realpart(ptest[12])],[imagpart(ptest[12])]),/*
label(["13",realpart(ptest[13]),imagpart(ptest[13])]),points([realpart(ptest[13])],[imagpart(ptest[13])]),
label(["14",realpart(ptest[14]),imagpart(ptest[14])]),points([realpart(ptest[14])],[imagpart(ptest[14])]),*/
color=red,point_type=circle,point_size=3,points_joined=false,points(realpart(pp2),imagpart(pp2)),points_joined=false,
color=black,key="",line_type=dots,nticks=50,polar(1,t,0,2*%pi) )$
This is for 14 zeroes, only. For higher orders it would be very painful.
I gather that the problem is that you want to automatically construct all the points([realpart(...), imagpart(...)]). My advice is to construct the list of points expressions via makelist, then append that list to any other plotting arguments, then apply the plotting function to the appended list. Something like:
my_labels_and_points :
apply (append,
makelist ([label ([sconcat (i), realpart (ptest[i]), imagpart (ptest[i])]),
points ([realpart (ptest[i])], [imagpart (ptest[i])])],
i, 1, length (ptest)));
all_plot_args : append ([proportional_axes=..., ...], my_labels_and_points, [color=..., key=..., ...]);
apply (draw2d, all_plot_args);
The general idea is to build up the list of plotting arguments and then apply the plotting function to that.

Finding What You Need in R: focused searching within R and all (3,500+) CRAN Packages

Often in R, there are a dozen functions scattered across as many packages--all of which have the same purpose but of course differ in accuracy, performance, documentation, theoretical rigor, and so on.
How do you locate these--from within R and even from among the CRAN Packages which you have not installed?
So for instance: the generic plot function. Setting secondary ticks is much easier using a function outside of the base package:
minor.tick(nx=n, ny=n, tick.ratio=n)
Of course plot is in R core, but minor.tick is not, it's actually in Hmisc.
Of course, that doesn't show up in the documentation for plot, nor should you expect it to.
Another example: data-input arguments to plot can be supplied by an object returned from the function hexbin, again, this function is from a library outside of R core.
What would be great obviously is a programmatic way to gather these function arguments from the various libraries and put them in a single namespace?
*edit: (trying to re-state my example just above more clearly:) the arguments to plot supplied in R core, e.g., setting the axis tick frequency are xaxp/yaxp; however, one can also set a/t/f via a function outside of the base package, again, as in the minor.tick function from the Hmisc package--but you wouldn't know that just from looking at the plot method signature. Is there a meta function in R for this?*
So far, as i come across them, i've been manually gathering them, each set gathered in a single TextMate snippet (along with the attendant library imports). This isn't that difficult or time consuming, but i can only update my snippet as i find out about these additional arguments/parameters. Is there a canonical R way to do this, or at least an easier way?
Just in case that wasn't clear, i am not talking about the case where multiple packages provide functions directed to the same statistic or view (e.g., 'boxplot' in the base package; 'boxplot.matrix' in gplots; and 'bplots' in Rlab). What i am talking about is the case in which the function name is the same across two or more packages.
The "sos" package is an excellent resource. Its primary interface is the "findFn" command, which accepts a string (your search term), scans the "function" entries in Jonathan Baron's site search database, and returns the entries that contain the search term in a data frame (of class "findFn").
The columns of this data frame are: Count, MaxScore, TotalScore, Package, Function, Date, Score, Description, and Link. Clicking on "Link" in any entry's row will immediately pull up the help page.
An example: suppose you wanted to find all convolution filters across all 1800+ R packages.
library(sos)
cf = findFn("convolve")
This query will look up the term "convolve"; in other words, it doesn't have to be the function name.
Keying in "cf" returns an HTML table of all matches found (23 in this case). This table is an HTML rendering of the data frame i mentioned just above. What is particularly convenient is that each column ("Count", "MaxScore", etc.) is sortable by clicking on the column header, so you can view the results by "Score", by "Package Name", etc.
(As an aside: when running that exact query, one of the results was the function "panel.tskernel" in a package called "latticeExtra". I was not aware this package had any time series filters in it and i doubt i would have discovered it otherwise.)
Your question is not easy to answer. There is not one definitive function.
formals is the function that gives the named arguments to a function and their defaults in a named list, but you can always have variable arguments through the ... parameter and hidden named arguments with an embedded hasArg function. To get a list of those you would have to use getAnywhere and then scan the expression for hasArg. I can't think of an automatic way of doing it yourself, that is, if the function's hidden arguments are not documented.
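A short sketch of what that looks like in practice. The function draw_thing and its hidden colour argument are invented for the example; formals, body, all.names and methods::hasArg are the real base/methods functions. The scan here only checks whether hasArg appears anywhere in the function body, which is a crude stand-in for properly inspecting the expression.

```r
# Named arguments (and defaults) of a base graphics function
plot_args <- names(formals(graphics::plot.default))
has_dots  <- "..." %in% plot_args      # extra arguments can travel through ...

# Hypothetical function with a "hidden" argument only reachable through ...
draw_thing <- function(x, ...) {
  if (methods::hasArg(colour)) "colour supplied" else "colour not supplied"
}

# formals() only shows the declared arguments, not the hidden one
visible_args <- names(formals(draw_thing))

# Crude scan of the function body for a call to hasArg
uses_hasArg <- "hasArg" %in% all.names(body(draw_thing))

with_colour    <- draw_thing(1, colour = "red")
without_colour <- draw_thing(1)
```

So formals reports only x and ... for draw_thing, while the body scan is what reveals that some undeclared argument is being looked for.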

Resources