Which has more entropy: a directed or an undirected graph?

Suppose we have a directed graph and an undirected graph. Is there a way to compare their entropy? If so, which kind of graph usually has higher entropy?
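A comparison only makes sense once a single entropy definition is fixed for both graphs. As a minimal sketch, assuming the Shannon entropy of the degree distribution is the quantity of interest (only one of several graph entropy definitions), the code below computes it for a hypothetical 4-node directed graph and for its undirected version. The adjacency matrix A and the helper shannon_entropy are made up for illustration, and the result says nothing about which kind of graph has higher entropy in general.

```r
# Shannon entropy (in bits) of the empirical distribution of the values in x.
shannon_entropy <- function(x) {
  p <- table(x) / length(x)
  -sum(p * log2(p))
}

# Hypothetical directed graph: A[i, j] = 1 means there is an edge i -> j.
A <- matrix(c(0, 1, 1, 1,
              0, 0, 1, 0,
              0, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)

# Undirected version: keep an edge if it exists in either direction.
U <- (A + t(A)) > 0

out_deg <- rowSums(A)  # out-degrees of the directed graph
und_deg <- rowSums(U)  # degrees of the undirected graph

H_directed   <- shannon_entropy(out_deg)
H_undirected <- shannon_entropy(und_deg)
print(c(directed = H_directed, undirected = H_undirected))
```

For a directed graph the in-degree distribution, or the joint in/out-degree distribution, could be used instead of the out-degrees; each choice generally gives a different value.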

Related

How can I calculate the cluster coefficients of a directed graph in R?

In R, the following code can be used to calculate cluster coefficients.
transitivity(g,type = "global") #"g" is my graph
However, this is for undirected graphs, and the result differs from the values reported by Cytoscape and Gephi.
How can I get the same cluster coefficient values in R for directed graphs as in other software?
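One possible starting point is to compute a direction-aware coefficient directly from the adjacency matrix. The sketch below uses the directed local clustering coefficient proposed by Fagiolo (2007) for binary graphs without self-loops. Whether Cytoscape or Gephi use this exact definition is an assumption that would need to be checked against their documentation; the function name directed_clustering and the two test graphs are hypothetical.

```r
# Directed local clustering coefficient (Fagiolo 2007), binary graph, no self-loops.
# A[i, j] = 1 means there is an edge i -> j.
directed_clustering <- function(A) {
  S <- A + t(A)                     # symmetrised adjacency matrix
  d_tot <- rowSums(A) + colSums(A)  # total degree = out-degree + in-degree
  d_bi <- diag(A %*% A)             # number of reciprocated links per node
  closed <- diag(S %*% S %*% S)     # closed 3-walks, ignoring edge direction
  # Nodes with fewer than two neighbours give 0/0 and therefore NaN.
  closed / (2 * (d_tot * (d_tot - 1) - 2 * d_bi))
}

# Directed 3-cycle: 1 -> 2 -> 3 -> 1.
cycle <- matrix(c(0, 1, 0,
                  0, 0, 1,
                  1, 0, 0), nrow = 3, byrow = TRUE)
cc_cycle <- directed_clustering(cycle)

# Fully reciprocated triangle: every ordered pair is connected.
full <- matrix(1, 3, 3) - diag(3)
cc_full <- directed_clustering(full)

print(cc_cycle)  # 0.5 for every node
print(cc_full)   # 1 for every node
```

With an igraph object g, the adjacency matrix could be obtained with as.matrix(as_adjacency_matrix(g)), and a graph-level value as the mean of the local coefficients.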

Neural networks: Getting the equivalent of an SAS “iteration plot” with R?

With SAS, you get an iteration plot when you generate a neural network. The plot shows how the error changes between the training and validation data for each iteration, and the point where the algorithm has chosen the optimal stopping point (to maximize generalization) is shown. How do you get a plot like this with R?
With the R neuralnet package, you can define stepmax to limit the number of steps per repetition. Do you have to use stepmax in a loop to produce a meaningful graph? For example, if you determine beforehand that the network trains in 750 steps in 1 repetition, you could start with 650 steps (stepmax), then 660, ..., 850, using a for loop that adds 10 to stepmax each time, and then plot the errors. I assume this would work. Is there a better way? Is there a different neural net function that will give you this information directly?
I would like a loss vs. epoch plot, and an accuracy vs. epoch plot (with training and validation accuracy). If I lower stepmax the algorithm doesn't converge, and it doesn't assign weights, so how can I get these plots? With my model the algorithm ran for 52,000 steps. If I increase rep above 1, it has no effect on the model.
With no plots like these, how do we know the model will generalize well?
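To illustrate what such a plot needs, here is a minimal sketch that does not use neuralnet at all: a hand-rolled logistic regression (effectively a network with no hidden layer) trained by full-batch gradient descent on simulated data, with the training and validation loss recorded after every epoch. The data, learning rate, number of epochs and all variable names are assumptions chosen for illustration only.

```r
set.seed(1)
n <- 200
x <- cbind(1, matrix(rnorm(n * 2), ncol = 2))         # intercept + 2 simulated features
y <- as.numeric(x %*% c(-0.5, 2, -1) + rnorm(n) > 0)  # simulated binary response
train <- 1:150
valid <- 151:200

# Mean log loss (cross-entropy) of weights w on data (x, y).
log_loss <- function(w, x, y) {
  p <- 1 / (1 + exp(-x %*% w))
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

epochs <- 200
lr <- 0.1
w <- rep(0, 3)
train_loss <- numeric(epochs)
valid_loss <- numeric(epochs)

for (e in seq_len(epochs)) {
  # One full-batch gradient step on the training set.
  p <- 1 / (1 + exp(-x[train, ] %*% w))
  grad <- t(x[train, ]) %*% (p - y[train]) / length(train)
  w <- w - lr * as.vector(grad)
  # Record both losses after the update.
  train_loss[e] <- log_loss(w, x[train, ], y[train])
  valid_loss[e] <- log_loss(w, x[valid, ], y[valid])
}

# Epoch with the lowest validation loss, i.e. an early-stopping candidate.
best_epoch <- which.min(valid_loss)

matplot(seq_len(epochs), cbind(train_loss, valid_loss), type = "l", lty = 1,
        col = c("black", "red"), xlab = "epoch", ylab = "log loss")
abline(v = best_epoch, lty = 2)
legend("topright", legend = c("training", "validation"),
       col = c("black", "red"), lty = 1)
```

The same pattern applies to any training loop that exposes its intermediate weights: evaluate the loss (or accuracy) on both data sets after each epoch and plot the two curves together.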

Inhomogeneous K-function to indicate need for spatial dependence/interaction term in Poisson point process model

I am mapping and modelling a disease of sheep. I have approx 4200 point locations in my dataset, each of which represents the centroid of a given sheep farm.
I have created a K-function difference plot (below) to assess whether my disease-positive farm density layer shows evidence of spatial dependence above and beyond that shown by my disease-negative farm density layer. From this plot I identified spatial dependence in my dataset out to a distance of 500m from a given disease-positive farm.
I have built a Poisson point process model and been through a model selection process. My model residuals appear to be relatively well behaved. See the lurking variable plots below for the raw and Pearson residuals.
To assess the need (or not) for a spatial dependence/interaction term in my Poisson point process model, I created an inhomogeneous K-function plot from a density surface estimated from my final model. See inhomogeneous K-function plot below.
My questions:
1) Based on these plots, should I be including a spatial dependence/interaction term in my model? If so why?
2) Should the repulsion between points shown by the inhomogeneous K-function be accounted for in my Poisson point process model if it is not due to the disease itself? The inhomogeneous K-function plot shows no evidence that disease-positive farms cluster, but does show evidence consistent with repulsion. I believe this repulsion is an artifact of my data and not associated with the disease itself: I am using points to represent the area of a farm, so points can never be closer to each other than their farm borders.
Thanks in advance for any answers, I am very very appreciative!
[Figures: K-function difference plot; raw residuals from ppm; Pearson residuals from ppm; inhomogeneous K-function]
It's not appropriate to use point process models for this problem. The farm locations are fixed, while the farm status (diseased or non-diseased) is the response variable.
A Poisson point process model would state that the farm locations are independent, and that's clearly not realistic. The results are consistent with farms being spaced apart which is realistic, but not informative for your real question.
In the spatstat package you could use the function relrisk to estimate the spatially-varying disease risk. But to evaluate evidence for contagion, conditional on the farm locations, you'd best use a package like spdep.
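As a rough sketch of the second suggestion, the code below runs a join count test with spdep on simulated data, treating the locations as fixed and the disease status as the response. The coordinates, the random status labels and the choice of 4 nearest neighbours are all assumptions made for illustration; real farm data would need a neighbourhood definition chosen on substantive grounds.

```r
library(spdep)

set.seed(1)
n <- 100
coords <- cbind(runif(n), runif(n))  # simulated farm centroids
status <- factor(sample(c("negative", "positive"), n, replace = TRUE))

# Neighbour structure: 4 nearest neighbours, symmetrised.
nb <- knn2nb(knearneigh(coords, k = 4), sym = TRUE)
lw <- nb2listw(nb, style = "B")      # binary spatial weights

# Join count test: are same-status farms neighbours more often than expected
# if the status labels were assigned at random to the fixed locations?
jc <- joincount.test(status, lw)
print(jc)
```

Because the labels here are assigned at random, no clustering is expected in this simulated example.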

How to do Hierarchical Clustering for Ordinal data-set in R?

I am trying to do hierarchical clustering on a dataset where the columns are ordinal on a scale of 1 to 5.
Hierarchical clustering can be done using the hclust() function.
For analysis with ordinal data, I understand we should use the "maximum" (Chebyshev) distance.
But which linkage should I use with Chebyshev distance, given that several linkage methods are based on squared Euclidean distance? For example, Ward, centroid and median linkage use squared Euclidean distance.
Available linkage methods: ward.D, ward.D2, single, complete, average, centroid, median.
So which linkage should I use with Chebyshev distance to do hierarchical clustering for ordinal data?
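Single, complete and average linkage only need a dissimilarity matrix, so they can be combined with any distance, including Chebyshev. A minimal sketch on simulated 1-to-5 ratings follows; the data, the choice of average linkage and the cut into 3 clusters are assumptions for illustration, not a recommendation for a particular dataset.

```r
set.seed(42)
# Simulated ordinal data: 20 respondents, 4 items rated 1 to 5.
ratings <- matrix(sample(1:5, 20 * 4, replace = TRUE), nrow = 20)

# Chebyshev distance is called "maximum" in dist().
d <- dist(ratings, method = "maximum")

# Average linkage works on the dissimilarities alone;
# "single" and "complete" could be used in the same way.
hc <- hclust(d, method = "average")

groups <- cutree(hc, k = 3)  # cut the tree into 3 clusters
print(table(groups))
plot(hc)
```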

How to draw the Gaussian graphical model in R

I have the correlation coefficient matrix R and the partial correlation coefficient matrix Rp. How can I draw the Gaussian graphical model in R?
Recommendations for introductory books on Gaussian graphical models would also be welcome. I don't actually know what one is yet, but the first thing I need to do is draw it. Many thanks!
#the Correlation coefficient matrix
R=c(1,0.55,0.55,0.41,0.39,0.55,1,0.61,0.49,0.44,0.55,0.61,1,0.71,
0.66,0.41,0.49,0.71,1,0.61,0.39,0.44,0.66,0.61,1)
dim(R)=c(5,5)
#the Partial correlation coefficient matrix
library("corpcor")
Rp=cor2pcor(R)
Then how could I draw the Gaussian graphical model?
If you want to plot the corresponding graph, you can use the igraph package.
library(igraph)
# Connect two variables when their absolute partial correlation exceeds 0.1
g <- graph_from_adjacency_matrix((abs(Rp) > 0.1) * 1, mode = "undirected", diag = FALSE)
plot(g, layout = layout_with_fr)
I am not familiar with the term "Gaussian graphical model" although I feel I should be (I'll read up on it, thanks).
But to visualize (partial) correlation matrices you could use the qgraph package, which is designed to do just that. For example:
library("qgraph")
qgraph(round(Rp,5),edge.labels=TRUE)
Computing partial correlations is built in via the graph argument:
qgraph(round(R,5),edge.labels=TRUE,graph="concentration")
Gives the same result.
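For readers unsure what the partial correlations represent: in a Gaussian graphical model an edge between two variables corresponds to a non-zero partial correlation, and the partial correlations can be read off the inverse of the correlation matrix. The sketch below recomputes them in base R from the matrix R in the question; the names P and Rp_manual are introduced here for illustration, and the result should agree with cor2pcor(R) up to numerical precision.

```r
# Correlation matrix from the question.
R <- matrix(c(1, 0.55, 0.55, 0.41, 0.39,
              0.55, 1, 0.61, 0.49, 0.44,
              0.55, 0.61, 1, 0.71, 0.66,
              0.41, 0.49, 0.71, 1, 0.61,
              0.39, 0.44, 0.66, 0.61, 1), nrow = 5)

# Precision (concentration) matrix: inverse of the correlation matrix.
P <- solve(R)

# Partial correlation between i and j given all other variables:
# -P[i, j] / sqrt(P[i, i] * P[j, j]), with 1 on the diagonal.
Rp_manual <- -P / sqrt(outer(diag(P), diag(P)))
diag(Rp_manual) <- 1

print(round(Rp_manual, 2))
```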
