I need to read in a list of polygons from an Object File Format (.off) file (in C++). The format of .off files is basically like this:
Header information
x y z //co-ords for each vertex
...
NVertices v1 v2 v3 ... vN //Number of vertices for each polygon,
//followed by each vertex's index
...
.off files allow any number of vertices per polygon, which brings me to my question. How do you know which vertices are connected to which? For example, if the .off file read:
Header stuff
-0.500000 -0.500000 0.500000
0.500000 -0.500000 0.500000
-0.500000 0.500000 0.500000
0.500000 0.500000 0.500000
-0.500000 0.500000 -0.500000
0.500000 0.500000 -0.500000
-0.500000 -0.500000 -0.500000
0.500000 -0.500000 -0.500000
4 0 1 3 2
4 2 3 5 4
4 4 5 7 6
4 6 7 1 0
4 1 7 5 3
4 6 0 2 4
The polygons are four sided, but not all vertices are connected. If you simply connect each vertex to each other vertex, you end up with four three sided polygons instead of one four sided polygon. I was hoping vertices were listed in a way similar to cycle notation, but I can't seem to find any information on this, so I'm guessing not.
So my question is:
Is there any format that .off files use to show this connection? If not, is there any other way to determine which vertices are connected in an .off file?
In an .off file, each polygon's vertices are connected sequentially in their listed order, with the last one connecting back to the first. In your example, the first polygon has 4 vertices, listed as "0 1 3 2", which means there are connections (i.e. edges) from 0 to 1, from 1 to 3, from 3 to 2, and from 2 back to 0.
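To make the wrap-around rule concrete, here is a minimal sketch in R (chosen only to match the rest of the code on this page; the same indexing carries over to a C++ loop using `v[(k + 1) % n]`). The function name `face_edges` is made up for illustration and is not part of any OFF library.

```r
# Hypothetical helper: turn one OFF face line into its list of edges.
# Vertex k is joined to vertex k+1, and the last vertex wraps to the first.
face_edges <- function(face_line) {
  fields <- as.integer(strsplit(trimws(face_line), "\\s+")[[1]])
  n <- fields[1]                 # number of vertices in this polygon
  v <- fields[2:(n + 1)]         # the vertex indices, in listed order
  cbind(from = v, to = v[c(seq_len(n)[-1], 1)])
}

edges <- face_edges("4 0 1 3 2")
print(edges)
```

For the first face of the example this gives the four edges 0-1, 1-3, 3-2 and 2-0, with no diagonal between 0 and 3 or between 1 and 2.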
I have a rather large tree-like / dendrogram-like structure (think pedigree) and I want to create a list of singly connected leaves / nodes.
In genealogy I believe something similar is called "Spitzenahnen" (German), but I believe that term is specific to 'no known parents', not necessarily no descendants. So basically, dead ends anywhere in the structure, not just at the top or bottom, are what I am looking to find.
I saw the post on creating an edge list from a matrix, as well as the one on how to access the attributes of a dendrogram in R, but I am not sure how to apply them to get the specific results I am looking for.
I have thousands of nodes with multiple starting and end points. I want to create a list of nodes/leaves where there is only one connected node that attaches it to the tree. So if there are two or more connections to the node (some have up to two dozen), I do not want to see it in my list.
Using a marked-up graphic from "Drawing pedigree diagrams with R and graphviz" by Jing Hua Zhao, I only want to see the highlighted nodes, but some of the applicable nodes may be buried deep within the web and not necessarily on the 'edge'.
It looks like you're using this data:
pre <- read.table(text="pid id father mother sex affected
10081 1 2 3 2 1
10081 2 0 0 1 2
10081 3 0 0 2 1
10081 4 2 3 2 1
10081 5 2 3 2 2
10081 6 2 3 1 2
10081 7 2 3 2 2
10081 8 0 0 1 2
10081 9 8 4 1 2
10081 10 0 0 2 2
10081 11 2 10 2 2
10081 12 2 10 2 1
10081 13 0 0 1 2
10081 14 13 11 1 2
10081 15 0 0 1 2
10081 16 15 12 2 2",header=T)
If you're looking at graph-like data, you might consider using the igraph library. Here's one way to create a similar plot.
library(igraph)
# one pseudo-node ("family unit") per father/mother pair
unit<-as.character(interaction(pre$father, pre$mother))
# edges: parent -> unit, then unit -> child
el<-rbind(
data.frame(person=as.character(c(pre$father, pre$mother)), unit=unit, stringsAsFactors=F),
data.frame(person=unit, unit=pre$id, stringsAsFactors=F)
)
# drop links that involve an unknown parent (coded as 0)
el<-subset(el, person!="0" & person !="0.0" & unit!="0" & unit!="0.0")
gg<-simplify(graph.data.frame(el, vertices=rbind(
data.frame(id=pre$id, type="person", affected=pre$affected==1, sex=pre$sex),
data.frame(id=unique(unit), type="family", affected=FALSE, sex=0))))
# styling: small unlabeled family nodes, large labeled person nodes
V(gg)$color <- "grey"
V(gg)[type=="person" & !affected]$color <- "deepskyblue"
V(gg)$label <- ""
V(gg)[type=="person"]$label <- V(gg)[type=="person"]$name
V(gg)$size <-2
V(gg)[type=="person"]$size <- 15
V(gg)$shape<-"circle"
V(gg)[sex==1]$shape<-"square"
plot(gg)
Which produces
(or something similar, the default layout algorithm is stochastic).
It's a bit messy to reshape the data, but the idea is that I create pseudo-nodes for each union resulting in a child. Then I connect parents as incoming nodes and children as outgoing nodes.
Basically the nodes you describe all have one connection so that means in the graph setting, they all have degree 1. We can change these labels to red to get
V(gg)$label.color<-"black"
V(gg)[degree(gg)==1 & type=="person"]$label.color<-"red"
plot(gg)
or you can just get the names with
V(gg)[degree(gg)==1 & type=="person"]$name
# [1] "1" "3" "5" "6" "7" "8" "9" "10" "13" "14" "15" "16"
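If you only need the list and not the plot, the same degree count can be sketched in base R without building a graph. This is an illustration under the same modelling assumption as above (each father/mother pair is one "family unit", and a person's degree is the number of distinct units they touch as a child or as a parent); the names `links`, `deg` and `dead_ends` are made up here.

```r
pre <- read.table(text = "pid id father mother sex affected
10081 1 2 3 2 1
10081 2 0 0 1 2
10081 3 0 0 2 1
10081 4 2 3 2 1
10081 5 2 3 2 2
10081 6 2 3 1 2
10081 7 2 3 2 2
10081 8 0 0 1 2
10081 9 8 4 1 2
10081 10 0 0 2 2
10081 11 2 10 2 2
10081 12 2 10 2 1
10081 13 0 0 1 2
10081 14 13 11 1 2
10081 15 0 0 1 2
10081 16 15 12 2 2", header = TRUE)

# Family unit of each row; NA when both parents are unknown (coded 0).
unit <- ifelse(pre$father == 0 & pre$mother == 0, NA,
               paste(pre$father, pre$mother, sep = "."))

# One row per distinct person-unit link: the child and both parents.
links <- unique(rbind(
  data.frame(person = pre$id,     unit = unit),
  data.frame(person = pre$father, unit = unit),
  data.frame(person = pre$mother, unit = unit)))
links <- links[!is.na(links$unit) & links$person != 0, ]

# Degree = number of distinct units a person is attached to.
deg <- table(links$person)
dead_ends <- names(deg)[as.vector(deg) == 1]
print(dead_ends)
```

On this data set it should return the same twelve ids as the igraph version.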
I have a data frame in R that contains 2 columns named x and y (co-ordinates). The data frame represents a journey with each line representing the position at the next point in time.
x y seconds
1 0.0 0.0 0
2 -5.8 -8.5 1
3 -11.6 -18.2 2
4 -16.9 -30.1 3
5 -22.8 -40.8 4
6 -29.0 -51.6 5
I need to break the journey up into segments where each segment starts once the distance from the start of the previous segment crosses a certain threshold (e.g. 200).
I have recently switched from using SAS to R, and this is the first time I've come across anything I can do easily in SAS but can't even think of the way to approach the problem in R.
I've posted the SAS code I would use below to do the same job. It creates a new column called segment.
%let cutoff=200;
data segments;
set journey;
retain segment distance x_start y_start;
if _n_=1 then do;
x_start=x;
y_start=y;
segment=1;
distance=0;
end;
distance + sqrt((x-x_start)**2+(y-y_start)**2);
if distance>&cutoff then do;
x_start=x;
y_start=y;
segment+1;
distance=0;
end;
keep x y seconds segment;
run;
Edit: Example output
If the cutoff were 200 then an example of required output would look something like...
x y seconds segment
1 0.0 0.0 0 1
2 40.0 30.0 1 1
3 80.0 60.0 2 1
4 120.0 90.0 3 1
5 160.0 120.0 4 2
6 120.0 150.0 5 2
7 80.0 180.0 6 2
8 40.0 210.0 7 2
9 0.0 240.0 8 3
If your data set is dd, something like
cutoff <- 200
origin <- dd[1,c("x","y")]
cur.seg <- 1
dd$segment <- NA
for (i in 1:nrow(dd)) {
dist <- sqrt(sum((dd[i,c("x","y")]-origin)^2))
if (dist>cutoff) {
cur.seg <- cur.seg+1
origin <- dd[i,c("x","y")]
}
dd$segment[i] <- cur.seg
}
should work. There are some refinements (it might be more efficient to compute distances of the current origin to all rows, then use which(dist>cutoff)[1] to jump to the first row that goes beyond the cutoff), and it would be interesting to try to come up with a completely vectorized solution, but this should be OK. How big is your data set?
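The refinement mentioned above (measure from the current origin to all remaining rows, then jump to the first row past the cutoff) might look roughly like this. It is a sketch, not a benchmarked solution; `segment_journey` is a made-up name, and it uses straight-line distance from the segment start, like the loop above.

```r
# Hypothetical helper: assign segment numbers by jumping from origin to origin.
segment_journey <- function(dd, cutoff = 200) {
  n <- nrow(dd)
  segment <- integer(n)
  start <- 1L      # row index of the current segment's origin
  cur.seg <- 1L
  while (start <= n) {
    idx <- start:n
    # distance from the current origin to every remaining row
    d <- sqrt((dd$x[idx] - dd$x[start])^2 + (dd$y[idx] - dd$y[start])^2)
    hit <- which(d > cutoff)[1]      # first row beyond the cutoff, NA if none
    if (is.na(hit)) {
      segment[idx] <- cur.seg        # the rest of the journey is one segment
      break
    }
    nxt <- start + hit - 1L          # that row starts the next segment
    segment[start:(nxt - 1L)] <- cur.seg
    cur.seg <- cur.seg + 1L
    start <- nxt
  }
  segment
}

dd <- data.frame(x = c(0, 40, 80, 120, 160, 200),
                 y = c(0, 30, 60, 90, 120, 150),
                 seconds = 0:5)
dd$segment <- segment_journey(dd, cutoff = 200)
print(dd)
```

The while loop runs once per segment rather than once per row, so it only pays off when segments are long relative to the data set.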
I've imported a csv file using read.csv.
It gives me a data frame with 18k observations of 1 variable, which looks like this:
V1
1 Energies (kJ/mol)
2 Bond Angle Proper Dih. Improper Dih. LJ-14
3 3.12912e+04 4.12307e+03 1.63677e+04 1.25619e+02 1.04394e+04
4 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
5 9.21339e+04 2.82339e+05 -1.15807e+06 -7.21252e+05 -7.25781e+03
6 Step Time Lambda
7 1 1.00000 0.00000
8 Energies (kJ/mol)
9 Bond Angle Proper Dih. Improper Dih. LJ-14
10 2.71553e+04 4.11858e+03 1.63855e+04 1.22226e+02 1.03903e+04
11 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
12 9.20926e+04 2.65253e+05 -1.15928e+06 -7.43766e+05 -7.27887e+03
13 Step Time Lambda
14 2 2.00000 0.00000
...
I want to extract the Potential energy in a vector. I've tried grep and readLines in multiple varieties and functions, but nothing works. Does anybody have an idea how to solve this problem?
Thanks! :)
So is this the right answer (from a former fizzsics major):
Lines <- readLines(textConnection("1 Energies (kJ/mol)
2 Bond Angle Proper Dih. Improper Dih. LJ-14
3 3.12912e+04 4.12307e+03 1.63677e+04 1.25619e+02 1.04394e+04
4 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
5 9.21339e+04 2.82339e+05 -1.15807e+06 -7.21252e+05 -7.25781e+03
6 Step Time Lambda
7 1 1.00000 0.00000
8 Energies (kJ/mol)
9 Bond Angle Proper Dih. Improper Dih. LJ-14
10 2.71553e+04 4.11858e+03 1.63855e+04 1.22226e+02 1.03903e+04
11 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
12 9.20926e+04 2.65253e+05 -1.15928e+06 -7.43766e+05 -7.27887e+03
13 Step Time Lambda
14 2 2.00000 0.00000"))
> grep("Potential", Lines) # identify the lines with "Potential"
[1] 4 11
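Need to move to the next line and get the 5th item (it is the 5th only because these lines carry a leading row number; without it, Potential is the 4th value):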
> read.table(text=Lines[ grep("Potential", Lines)+1])[ , 5]
[1] -721252 -743766
I have a large number of adjacency matrices in csv format, exported from Excel. I also have a large number of .csv files with vertex attribute data.
I have linked them in SNA, but igraph goes further functionally, so I am looking to move to it. However, I have not managed to build a graph that combines the matrix with the attribute files.
I am looking to set up some code that will be a workhorse for doing a range of plots.
Although there seem to be many ways to link these two data sets, this seemed the simplest:
To make the adjacency matrix in the csv a data frame (cut down for missing vertex data) I use:
m <- read.table(header=TRUE, check.names=FALSE, textConnection("
2 3 4 5 6 7
2 0 1 1 0 1 0
3 1 0 0 0 1 0
4 0 0 0 0 0 0
5 1 0 1 0 0 1
6 0 0 0 0 0 0
7 1 1 0 1 0 0
"))
In the case of having both vertex and row names in the original file, the imported attributes file has both vertex names and 'row.names' which correspond to the node names. Hex.ed[1,1] gives the value of the attribute for the first node in the m network, i.e. node 2:
Hex.ed <- read.table(header=TRUE, textConnection("
HH Emo Extra Aggr Consci OTE
2 3.3750 3.0000 3.0000 3.0000 3.0625 3.4375
3 3.5625 2.9375 3.0625 3.0000 3.3125 3.6250
4 3.2500 2.8750 3.7500 3.2500 3.8750 3.5000
5 3.6875 3.1250 3.3750 3.5625 3.6250 3.3125
6 3.3125 3.0000 3.3125 3.8750 3.2500 3.6875
7 3.8125 3.2500 3.5625 2.8750 3.6875 3.4375
"))
g <- graph.data.frame(m, directed=TRUE, vertices=Hex.ed)
However, I get the error: Error in graph.data.frame(m, directed = TRUE, vertices = Hex.ed) : Duplicate vertex names
I get a different error message:
Error in graph.data.frame(m, directed = TRUE, vertices = Hex.ed) :
Some vertex names in edge list are not listed in vertex data frame
but that is possibly because you ran your complete data set rather than the example in the question.
Anyway, graph.data.frame does not use adjacency matrices. From the docs at http://igraph.sourceforge.net/doc/R/graph.data.frame.html:
... the first two columns of d are used as a symbolic edge list and
additional columns as edge attributes. The names of the attributes are
taken from the names of the columns.
The manual page also has an example at the bottom.
If you have an adjacency matrix, then you can use graph.adjacency to create the graph, and then add the vertex attributes one by one:
g <- graph.adjacency(as.matrix(m))
for (i in seq_len(ncol(Hex.ed))) {
g <- set.vertex.attribute(g, colnames(Hex.ed)[i], value=Hex.ed[,i])
}
g
# IGRAPH DN-- 6 11 --
# + attr: name (v/c), HH (v/n), Emo (v/n), Extra (v/n), Aggr (v/n),
# Consci (v/n), OTE (v/n)
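If you would rather keep using graph.data.frame, it needs an edge list instead of the matrix. A base R sketch of that conversion, assuming any non-zero cell means an edge from the row vertex to the column vertex (the names `idx` and `el` are just for illustration):

```r
m <- as.matrix(read.table(header = TRUE, check.names = FALSE, text = "
  2 3 4 5 6 7
2 0 1 1 0 1 0
3 1 0 0 0 1 0
4 0 0 0 0 0 0
5 1 0 1 0 0 1
6 0 0 0 0 0 0
7 1 1 0 1 0 0
"))

# Row/column positions of every non-zero cell.
idx <- which(m != 0, arr.ind = TRUE)

# Translate positions into vertex names: row = source, column = target.
el <- data.frame(from = rownames(m)[idx[, "row"]],
                 to   = colnames(m)[idx[, "col"]],
                 stringsAsFactors = FALSE)
print(el)
```

This yields 11 rows, matching the 11 edges reported for the graph above. For graph.data.frame the vertex data frame would then need the vertex names in its first column, which `Hex.ed` does not have as read here (the names sit in its row names).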
I am working with triangular meshes in R. For those not familiar, the PLY format has two main components, a 3 by n matrix of vertex x,y,z coordinates, where n is the number of vertices, and a 3 by m matrix of faces where each number references one line from the vertex matrix, and so defining three corners of a triangular face. I am trying to find the mesh boundary edges, which are the "sides" of the triangles that are only referenced once in the faces matrix.
Therefore my question is, how do I find unique pairs of numbers across rows where there are three columns?
face 1 4 6 7
face 2 7 6 8
face 3 9 11 12
face 4 10 9 12
Here line (face) 1 has the edge 4-7 that only appears once, while 6-7 appears twice, as does 9-12.
unique() works across rows, but looks for unique rows, and expects the numbers to be in the same order. Any suggestions?
What you want to do is hash each pair, then make a table of the hashes. You also want (x,y) to hash the same as (y,x). Note that the /100 hash used below is only unambiguous while every vertex index is below 100.
R>data
V1 V2 V3 V4 V5
1 face 1 4 6 7
2 face 2 7 6 8
3 face 3 9 11 12
4 face 4 10 9 12
R>e1 <- pmin(data[3], data[4]) + pmax(data[3], data[4])/100
R>e2 <- pmin(data[3], data[5]) + pmax(data[3], data[5])/100
R>e3 <- pmin(data[4], data[5]) + pmax(data[4], data[5])/100
R>table(c(e1,e2,e3, recursive=TRUE))
4.06 4.07 6.07 6.08 7.08 9.1 9.11 9.12 10.12 11.12
1 1 2 1 1 1 1 2 1 1
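For larger meshes, a variant that uses string keys instead of a numeric hash avoids any dependence on the size of the vertex indices. A minimal sketch, assuming the faces are held in an m-by-3 matrix of vertex indices (the names `faces`, `key` and `boundary` are made up here):

```r
faces <- matrix(c( 4,  6,  7,
                   7,  6,  8,
                   9, 11, 12,
                  10,  9, 12), ncol = 3, byrow = TRUE)

# The three sides of every triangle, stacked into a two-column matrix.
edges <- rbind(faces[, c(1, 2)], faces[, c(2, 3)], faces[, c(1, 3)])

# Order-independent key: smaller index first, so 7-6 and 6-7 match.
key <- paste(pmin(edges[, 1], edges[, 2]),
             pmax(edges[, 1], edges[, 2]), sep = "-")

# Boundary edges are the ones that occur exactly once.
counts <- table(key)
boundary <- names(counts)[as.vector(counts) == 1]
print(boundary)
```

Here 6-7 and 9-12 are each shared by two faces, so they drop out and the remaining eight edges form the boundary.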