Units vs real world - scale

I've got a DXF (rev 10) CAD file with some 2D drawings and I'm implementing a reader. Until now I've successfully loaded everything and rasterized with ImageMagick.
But the point is, I have manually set the zoom on the coordinates to a number that made sense for me. How do I know what was the original size of the components and what unit was used to draw? Is there any specific group I have to look at?
My header is like this:
0
SECTION
2
HEADER
9
$ACADVER
1
AC1006
9
$EXTMIN
10
-14.610075
20
-14.723197
9
$EXTMAX
10
14.556421
20
15.530217
9
$LTSCALE
40
0.000394
9
$PDMODE
70
35
9
$PDSIZE
40
0.000315
0
ENDSEC
I've read what each part is about and I don't seem to find anything that helps me.
I want to know the units, because I want to be able to change the drawing accurately as it will be plotted, e.g. move a point by 2 inches.

When implementing a viewer for a dxf file, you don't actually need to know anything about the units. Unless of course, you are going to implement a Measure function in your viewer, then it gets more complicated.
Your initial 'zoom' size in your viewer can be determined from the header information that you have shown: $EXTMIN and $EXTMAX are the 2 key pieces of info you need. In your example the minimum coordinate used in the DXF file is (-14.610075, -14.723197) and the maximum coordinate used is (14.556421, 15.530217). This gives you a total drawing size of 29.166496 (width) x 30.253414 (height).
For a simple viewer, you can just assume that the units in the DXF file are equal to the units in your viewer (pixels or points or whatever you are using).
Then the base drawing size in your viewer will be 29.166496x30.253414, and you can scale that up (zoom) to make it fill whatever display area you have available.
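As a rough illustration of the zoom-to-fit idea above, here is a minimal Python sketch. The function names `fit_to_viewport` and `to_pixel`, and the 800x600 viewport in the usage note, are made up for this example; the sketch also assumes you want the drawing centered and the Y axis flipped for raster output.

```python
def fit_to_viewport(ext_min, ext_max, view_w, view_h):
    """Return (scale, off_x, off_y) mapping drawing units to viewport pixels."""
    width = ext_max[0] - ext_min[0]
    height = ext_max[1] - ext_min[1]
    if width <= 0 or height <= 0:
        raise ValueError("empty or inverted extents")
    # Use the smaller ratio so the whole drawing fits in both directions.
    scale = min(view_w / width, view_h / height)
    # Center the scaled drawing inside the viewport.
    off_x = (view_w - width * scale) / 2 - ext_min[0] * scale
    off_y = (view_h - height * scale) / 2 - ext_min[1] * scale
    return scale, off_x, off_y


def to_pixel(x, y, scale, off_x, off_y, view_h):
    """Map a drawing coordinate to a pixel; DXF Y grows up, raster Y grows down."""
    return x * scale + off_x, view_h - (y * scale + off_y)
```

With the extents from the question, `fit_to_viewport((-14.610075, -14.723197), (14.556421, 15.530217), 800, 600)` picks the height ratio as the scale, because the drawing is slightly taller than it is wide.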
EDIT
DXF files are by no means 'unitless', so in the case where you absolutely need to know the units, you will need to read the $INSUNITS header variable (its value is stored under group code 70), and to double-check it, you can also read the $MEASUREMENT header variable.
The R2000 DXF spec, or any of the other versions, contains all the info you need on what those values mean. If you go to the 'HEADER Section Group Codes' page, and search for 'units', you will be able to find the listing of all the unit types. For example:
$INSUNITS
70
4
indicates that the dxf file is using metric units, specifically millimeters, as the base unit. So any dimensional or coordinate value stored by the dxf file will be in millimeters.
Default drawing units for AutoCAD DesignCenter blocks: 0 = Unitless; 1 = Inches; 2 = Feet; 3 = Miles; 4 = Millimeters; 5 = Centimeters; 6 = Meters; 7 = Kilometers; 8 = Microinches; 9 = Mils; 10 = Yards; 11 = Angstroms; 12 = Nanometers; 13 = Microns; 14 = Decimeters; 15 = Decameters; 16 = Hectometers; 17 = Gigameters; 18 = Astronomical units; 19 = Light years; 20 = Parsecs
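To make the lookup concrete, here is a minimal Python sketch that collects the header variables from a text DXF and translates $INSUNITS with the table quoted above. The names `read_header_vars` and `drawing_units` are hypothetical, and the sketch assumes a well-formed ASCII DXF in which group codes and values strictly alternate line by line.

```python
# Unit codes as listed in the DXF reference excerpt quoted above.
INSUNITS = {
    0: "unitless", 1: "inches", 2: "feet", 3: "miles", 4: "millimeters",
    5: "centimeters", 6: "meters", 7: "kilometers", 8: "microinches",
    9: "mils", 10: "yards", 11: "angstroms", 12: "nanometers",
    13: "microns", 14: "decimeters", 15: "decameters", 16: "hectometers",
    17: "gigameters", 18: "astronomical units", 19: "light years",
    20: "parsecs",
}


def read_header_vars(lines):
    """Return {variable name: [(group_code, raw_value), ...]} for the HEADER section."""
    cleaned = [ln.strip() for ln in lines]
    pairs = zip(cleaned[0::2], cleaned[1::2])  # (group code, value) pairs
    header, name, in_header = {}, None, False
    for code, value in pairs:
        if not in_header:
            in_header = (code == "2" and value == "HEADER")
            continue
        if code == "0" and value == "ENDSEC":
            break
        if code == "9":  # group code 9 introduces a new header variable
            name = value
            header[name] = []
        elif name is not None:
            header[name].append((int(code), value))
    return header


def drawing_units(header):
    """Return the unit name for $INSUNITS, or None if the variable is absent."""
    values = header.get("$INSUNITS")
    if not values:
        return None
    return INSUNITS.get(int(values[0][1]), "unknown")
```

For the header shown in the question, `drawing_units` returns `None`, because that file carries no $INSUNITS entry at all.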
EDIT
I just noticed you are using a very old dxf format (R10). If I remember right, the units were not introduced into the DXF spec until about R12. Before that time, the actual size of the drawing entities didn't change based on which units were being assumed. Only the labels on the dimensions were different from imperial to metric units.
If you are set on using the old R10 format, you will just have to make an arbitrary decision on what the units are, assuming you don't have any dimension labels on your drawings that would indicate what units are being implied.

Related

How to calculate steps needed for arrow to fill all given sectors of the clock?

the problem's data are:
An analog clock is divided into 512 equal sections. The arrow/handle starts its movement at 0°, and each tick/step moves it by 4.01°. The arrow/handle can move only clockwise. What is the minimum number of ticks/steps needed for the arrow/handle to visit all sections of the clock?
I'm trying to write a formula to calculate the count but can't quite wrap my head around it.
Is it possible to do it? If yes, how can I do it?
This site is for programmers, isn't it?
So we can hire our silicon friend to work for us ;)
Full circle is 360*60*60*4=5184000 units (unit is a quarter of angular second)
One step is 4*(4*3600+36) = 57744 units
One section is 4*360*3600/512 = 10125 units (we use quarters to make this value integer)
cntr = set()
an = 0
step = 57744
div = 10125
mod = 5184000
c = 0
while len(cntr) < 512:
    sec = (an % mod) // div
    cntr.add(sec)
    an += step
    c += 1
print(c)
>>804
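The same simulation can be written without hand-picking an integer unit, by using exact fractions. This is only a sketch: the function name `ticks_to_cover` and the `max_ticks` safety limit are my own additions. Note that it counts ticks (moves) rather than visited positions, so its result is one less than the counter `c` in the loop above, which also counts the starting position at 0°.

```python
from fractions import Fraction


def ticks_to_cover(sections, step_deg, max_ticks=1_000_000):
    """Return the number of ticks until every section was visited, or None.

    Pass step_deg as a string such as "4.01" so it is converted exactly.
    """
    step = Fraction(step_deg)
    width = Fraction(360, sections)  # angular width of one section
    seen = set()
    angle = Fraction(0)
    for ticks in range(max_ticks + 1):
        seen.add(angle // width)  # index of the section the arrow is in
        if len(seen) == sections:
            return ticks
        angle = (angle + step) % 360
    return None  # not covered within max_ticks (some steps never cover all)
```

For example, `ticks_to_cover(4, "90")` needs 3 ticks, while `ticks_to_cover(4, "180", max_ticks=50)` returns `None` because the arrow only ever alternates between two sections.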
Unfortunately I can't fully answer your question, but the following may help:
Dividing the 360° into 512 sections gives you 0.703125° per section (equivalently, about 1.4222 sections per degree).
Each round you cover 90 different sections when starting between 0°-3.11° and 89 when starting between 3.12°-4.00°.
For starting the rounds this gives you a change in starting degree of 0.9° every round except after the fourth, where it is only 0.89° (within the possible range of 0°-4°, so all calculated mod 4).
So you have 0.9°->1.8°->2.7°->3.6°->0.49°->1.39°...0.08°...
I hope this helps you in developing an algorithm.

How to read GDB 13 database? And is there any easy way to clean this data?

5
0001 -417.031
C 1.04168, -0.05620, -0.07148 1.041682, -0.056200, -0.071481
H 2.15109, -0.05620, -0.07150 2.130894, -0.056202, -0.071496
H 0.67187, 0.17923, -1.09059 0.678598, 0.174941, -1.072044
H 0.67188, 0.70866, 0.64196 0.678613, 0.694746, 0.628980
H 0.67188, -1.05649, 0.23421 0.678614, -1.038285, 0.228641
8
0002 -711.117
C 0.99571, 0.01149, -0.09922 0.995914, 0.011511, -0.099221
C 2.51489, 0.01148, -0.09922 2.514686, 0.011466, -0.099226
H 0.61911, 0.74910, -0.83887 0.597259, 0.729877, -0.819596
H 0.61911, 0.28325, 0.90938 0.597259, 0.276170, 0.883106
H 0.61909, -0.99785, -0.36818 0.597278, -0.971531, -0.361167
H 2.89151, 1.02083, 0.16973 2.913322, 0.994509, 0.162719
H 2.89149, -0.26027, -1.10783 2.913341, -0.253192, -1.081553
H 2.89149, -0.72612, 0.64042 2.913341, -0.706900, 0.621148
These two data points are from the chemical database GDB-13, and I am trying to understand what these numbers represent. I know 5 and 8 are atomic numbers; 0001 and 0002 are atom IDs; and -417.031 and -711.117 are atomization energies. However, I don't quite understand what the numbers below them mean. I am pretty sure they are the geometry in three-dimensional space, but if so, why are there 6 numbers on each line? How do I read those 6 numbers?
I am also trying to use the BOB representation to reformat the data. Is there any way to do that other than hard coding it? If not: I am using R, so is R able to do that?
Have a look at the original paper in Int. J. Quantum Chem., 2015, 115, 1058-1073 (DOI).
The Extended XYZ format is explained in Fig. 7 of the article.
You are right that the first line denotes the number of atoms k, while the second line consists of an identifier and the energy of atomization for the particular molecule.
The next k lines contain two sets of Cartesian coordinates (in Ångström). The left block contains the x,y,z coordinates from a force-field calculation (UFF), while the coordinates on the right stem from a DFT calculation.
A common tool to read and convert coordinate files in various formats is Open Babel. Have a look at the accompanying paper in J. Cheminformatics, 2011, 3:33 (DOI).
There exist various bindings for Open Babel, and apparently there is one for R too. Have a look.
I just ran a quick test on the first entry in the data from the supplement of the paper by Matthias Rupp using Open Babel 2.3.2:
obabel -ixyz c1.xyz -oxyz -O c1a.xyz
Apparently, only the left coordinate block is read in! If you suspect that the coordinates from UFF and DFT calculations differ significantly, you're probably on your own. However, given that the file format is documented, this should not be a major problem.
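If you need both coordinate blocks, a small hand-written reader is one option. The following Python sketch assumes the layout described above (atom count, then identifier and energy, then one line per atom with six comma-separated numbers); the function name `parse_molecules` and the dictionary keys are made up for illustration.

```python
def parse_molecules(text):
    """Parse extended-XYZ records into a list of dicts.

    Each atom keeps both coordinate triples: 'uff' (left block)
    and 'dft' (right block).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    molecules = []
    i = 0
    while i < len(lines):
        n_atoms = int(lines[i])                   # line 1: number of atoms
        mol_id, energy = lines[i + 1].split()     # line 2: identifier, energy
        atoms = []
        for row in lines[i + 2 : i + 2 + n_atoms]:
            fields = row.replace(",", " ").split()
            coords = [float(v) for v in fields[1:7]]
            atoms.append({
                "symbol": fields[0],
                "uff": tuple(coords[:3]),
                "dft": tuple(coords[3:]),
            })
        molecules.append({"id": mol_id, "energy": float(energy), "atoms": atoms})
        i += 2 + n_atoms
    return molecules
```

Keeping the identifier as a string preserves the leading zeros of entries such as 0001.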
If you don't mind a remark, the title of your question is somewhat misleading. The data in question is only remotely related to GDB-13. To my knowledge, the GDB files from Jean-Louis Reymond do not contain any coordinates. They are large collections of SMILES strings, from which coordinates would have to be generated for each entry.

2D grid based games : represent passability

Consider a tile-based game where each agent can move straight or diagonally (in 8 directions).
Basically, a map like this can be represented as a regular 2D grid, where 0 represents a walkable location and 1 an unwalkable location (I'm using Lua):
-- Example : 3x3 sized map
local map = {
  {0,0,0},
  {0,1,1},
  {0,0,0},
}
At this point, how can we represent tile walkability depending on the direction an agent comes from?
I.e. cell [2][2] above, which is statically unwalkable, would now be walkable if coming from [1][2] (above) or [2][1] (left), but not, for instance, from [3][2] (below).
I have given this some thought, but I couldn't come up with anything that felt clean enough to me.
Thanks in advance.
I'd overlay another 2D grid of single bytes. Each bit of the byte corresponds to a possible entrance direction, with a 1 meaning the cell can be walked on from that direction and a 0 meaning it cannot. You can then check for enterability using binary masking.
If most of your cells can be entered from any direction, then you may consider storing only the exceptions in a map, with the tile's absolute ID (X*MaxY+Y, for instance) as the key and the byte scheme described above as the value indicating enterability. This is slower to access, but takes less space.
For instance, let the directions be laid out as so:
Bit #      X offset      Y offset
1 2 3      -1  0  1      -1 -1 -1
4   5      -1  0  1       0  0  0
6 7 8      -1  0  1       1  1  1
If I go in the northeast direction, this corresponds to bit #3. I can perform masking by translating the above values into bit masks:
 1   2   4
 8      16
32  64 128
I can enter from a direction if the following returns true
Enterability(CurrentX+Xoffset(Dir), CurrentY+Yoffset(Dir)) & BitMask(Dir)
(Sorry, I'm afraid I don't know Lua well enough to write this up in that language)
Edit
So, say my directions and such are as above and I want a square that can be entered only from the North. To do this, I set bit #2:
Enterability(X)=2
If I want a square that is enterable from both the north and the southwest, I would use:
Enterability(X)=2 | 32
where | is the bitwise OR operation.
If I want a square to be enterable from any direction but west I use:
Enterability(X)=(~8)
where ~ is the not operation.
If I need to close a door, say to the east, I could unset that bit:
Enterability(X)=Enterability(X) & (~16)
To open the door again, I use:
Enterability(X)=Enterability(X) | 16
or, more simply,
Enterability(X)|=16
The ~16 produces a bitfield which is all ones except for the bit referring to 16. Using this with the AND operator (&) leaves all the bits on, except the one referring to 16.
Also note that hexadecimal addressing can be more convenient:
Decimal          Hexadecimal
 1   2   4       0x1  0x2  0x4
 8      16   =   0x8       0x10
32  64 128       0x20 0x40 0x80
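Here is the scheme sketched in Python (not Lua, sorry). All names (`DIRS`, `can_move`, `open_dir`, `close_dir`) are hypothetical. One assumption to be aware of: following the check `Enterability(target) & BitMask(Dir)`, a bit here is indexed by the direction of travel, so the "S" bit on a cell means "may be entered by an agent moving south", i.e. coming from the cell above. If you would rather have bits name the side the agent comes from, swap each mask with its opposite.

```python
# (dx, dy, mask) per direction of travel; y grows downward as in the grids above.
DIRS = {
    "NW": (-1, -1, 0x01), "N": (0, -1, 0x02), "NE": (1, -1, 0x04),
    "W":  (-1,  0, 0x08),                     "E":  (1,  0, 0x10),
    "SW": (-1,  1, 0x20), "S": (0,  1, 0x40), "SE": (1,  1, 0x80),
}
ALL_DIRS = 0xFF  # every bit set: enterable by a move in any direction


def can_move(enter, x, y, direction):
    """True if an agent at (x, y) may step one cell in the given direction."""
    dx, dy, mask = DIRS[direction]
    tx, ty = x + dx, y + dy
    if not (0 <= ty < len(enter) and 0 <= tx < len(enter[ty])):
        return False  # target is off the map
    return bool(enter[ty][tx] & mask)


def close_dir(enter, x, y, direction):
    """Forbid entering cell (x, y) by a move in the given direction."""
    enter[y][x] &= ~DIRS[direction][2] & ALL_DIRS


def open_dir(enter, x, y, direction):
    """Allow entering cell (x, y) by a move in the given direction."""
    enter[y][x] |= DIRS[direction][2]
```

With `enter = [[ALL_DIRS] * 3 for _ in range(3)]` and `enter[1][1] = DIRS["S"][2]`, the center cell can be reached only from the cell directly above it.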

Point handling for closed loop searching

I have a set of line segments. Each contains only 2 nodes. I want to find the closed cycles that are produced by joining line segments. Actually, I am looking for the smallest loop if more than one exists. If you can, please give me a good solution for this.
For example, I have added the line list below together with the point indices, to give an idea of my case. (The first value is the line number; the second 2 values are the point indices.)
0 - 9 11
1 - 9 18
2 - 9 16
3 - 11 26
4 - 11 45
5 - 16 25
6 - 16 49
7 - 18 26
8 - 18 25
9 - 18 21
10 - 25 49
11 - 26 45
So, assume I start from line 1, that is, I start looking for connected loops from points 9 and 18. Could you please explain (step by step) how I can get the "closed loops" from that line?
Well, I don't see any C++ code, but I'll try to suggest a C++ solution (although I'm not going to write it for you).
If your graph is undirected (if it's directed, s/adjacent/in-edges' vertices/), and you want to find all the shortest cycles passing through some vertex N, then I think you could follow this procedure:
G <= a graph
N <= some vertex in G
P <= a path (set of vertexes/edges connecting them)
P_heap <= a priority queue, ascending by distance(P) where P is a path
for each vertex in adjacent(N):
    G' = G - edge(vertex, N)
    P = dijkstraShortestPath(vertex, N, G')
    push(P, P_heap)
Each path P, together with the removed edge(vertex, N), closes a loop through N, so if the edges have different lengths you should rank the loops by distance(P) plus the length of that edge. You could also just throw out all but the shortest loop, but that's less succinct. As long as you don't allow negative edge weights (which, since you'll be using line segment length for weights, you don't), I think this should work. Also, fortunately Boost.Graph provides all of the necessary functionality to do this in C++ (you don't even have to implement Dijkstra's algorithm)! You can find documentation about it here:
http://www.boost.org/doc/libs/1_47_0/libs/graph/doc/table_of_contents.html
EDIT: you will first have to create the graph from the data you listed. Define your graph's property_map accordingly, and before inserting a vertex, make sure its distance to every vertex currently in the graph is greater than zero; otherwise the vertex is already in the graph and you don't want to insert it again.
Happy graphing!
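For readers who want to try the idea before reaching for Boost.Graph, here is a small Python sketch of the same procedure for a single starting line. It is not the Boost API: `build_graph`, `shortest_path` and `smallest_cycle_through_edge` are hypothetical helpers, and because the question lists only point indices, every segment is assumed to have weight 1 (pass your own `weight` function to use real segment lengths).

```python
import heapq
from collections import defaultdict

# Point-index pairs from the question (lines 0..11).
SEGMENTS = [(9, 11), (9, 18), (9, 16), (11, 26), (11, 45), (16, 25),
            (16, 49), (18, 26), (18, 25), (18, 21), (25, 49), (26, 45)]


def build_graph(segments, weight=lambda a, b: 1.0):
    """Undirected adjacency map: graph[a][b] = weight of segment a-b."""
    graph = defaultdict(dict)
    for a, b in segments:
        graph[a][b] = graph[b][a] = weight(a, b)
    return graph


def shortest_path(graph, src, dst, banned_edge):
    """Dijkstra from src to dst that ignores one edge; returns (dist, path) or None."""
    banned = frozenset(banned_edge)
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if frozenset((u, v)) == banned:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return None


def smallest_cycle_through_edge(graph, a, b):
    """Smallest loop containing segment a-b as (length, vertices), or None."""
    found = shortest_path(graph, a, b, (a, b))
    if found is None:
        return None
    d, path = found
    return d + graph[a][b], path  # the removed segment closes the loop
```

For line 1 (points 9 and 18) this finds a loop of four segments; with unit weights there are two such loops (via 11 and 26, or via 16 and 25), and which one is returned depends on tie-breaking. Line 9 (points 18 and 21) lies on no loop, so the function returns `None` for it.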

How to determine endpoints of Arcs in GraphicsPath PathPoints and PathTypes arrays?

I have the following PathPoints and PathTypes arrays (format: X, Y, Type):
-177.477900, 11021.670000, 1
-614.447200, 11091.820000, 3
-1039.798000, 10842.280000, 3
-1191.761000, 10426.620000, 3
-1591.569000, 10493.590000, 3
-1969.963000, 10223.770000, 3
-2036.929000, 9823.960000, 3
-2055.820000, 9711.180000, 3
-2048.098000, 9595.546000, 3
-2014.380000, 9486.278000, 3
Here is what this GraphicsPath physically looks like. The 2 Arcs are very distinguishable:
I know that this GraphicsPath.PathData array was created by 2 AddArc commands. Stepping through the code in the debugger, I saw that the first 4 PathData values were added by the first AddArc command, and the remaining 6 points were added by the 2nd AddArc command.
By examining the raw PathPoints/PathTypes arrays (without knowing beforehand that they came from 2 AddArc commands, and therefore that there are 2 start and end points), I would like to determine the start and end point of each arc.
I have tried several Bezier calculations to 'recreate' the points in the array, but I am at a loss as to how to determine the separate start and end points. It appears that GDI+ is combining the start/end point between the arcs (they are the same point and the arcs are connected), and I am losing the fact that one arc is ending and the other one is starting.
Any ideas?
Use the GraphicsPathIterator class in combination with the GraphicsPath.SetMarkers method.
For example:
Dim gp As New GraphicsPath
gp.AddArc(-50, 0, 100, 50, 270, 90) 'Arc1
gp.SetMarkers()
gp.AddArc(0, 25, 100, 50, 270, 90) 'Arc2
Dim iterator As New GraphicsPathIterator(gp)
Dim i As Integer = 0
Dim MyPts(3) As PointF
Dim temp As New GraphicsPath
Do Until i > 2
    iterator.NextMarker(temp)
    MyPts(i) = temp.PathPoints(0)
    MyPts(i + 1) = temp.GetLastPoint()
    i += 2
Loop
'Free system resources...
iterator.Dispose()
temp.Dispose()
Arc1 -> start: MyPts(0); end: MyPts(1)
Arc2 -> start: MyPts(2); end: MyPts(3)
Hope this helps!
Take a look at the PathPointType Enum (System.Drawing.Drawing2D).
Value   Meaning
0       Start (path)
1       Line
3       Bezier/Bezier3
7       PathTypeMask
16      DashMode
32      PathMarker
128     CloseSubpath
This one was bugging me a lot too! I had a path created beyond my control, without markers, and couldn't figure out the curve endpoints.
In this case you'd expect that the curve starts at [i + 1], but it does not! It turns out that GDI+ combines path points, probably to make the points array shorter. In this case the points of the first curve are: [0], [1], [2], [3].
It seems that if PathPointType.Start or PathPointType.Line is followed by PathPointType.Bezier, then you have to treat the PathPointType.Start or PathPointType.Line point as the first point of your Bezier curve, so in your example it should be like this:
-177.47, 11021.67, 1 // Draw line to this point AND use it as a Bezier start!
-614.44, 11091.82, 3 // Second Bezier point
-1039.79, 10842.28, 3 // Third Bezier point
-1191.76, 10426.62, 3 // Fourth Bezier point AND first point of the next Bezier!
-1591.56, 10493.59, 3 // Second Bezier point
-1969.96, 10223.77, 3 // Third Bezier point
-2036.92, 9823.96, 3 // Fourth Bezier point AND first point of the next Bezier!
-2055.82, 9711.18, 3 // Second Bezier point
-2048.09, 9595.54, 3 // Third Bezier point
-2014.38, 9486.27, 3 // Fourth Bezier point
So, when analysing the PathPoints array point by point, you also have to check the current and following indices. The docs on PathPointType might come in handy. In most cases you can probably ignore the additional data stored in bits other than the first three (these three define Start, Line and Bezier). The only exception is CloseSubpath, but it's irrelevant if you consider the next advice.
It's also worth noting that if you have a complex path that consists of a huge number of PathPoints, then it might be handy to split the path into chunks using GraphicsPathIterator. This simplifies the whole procedure, as PathPointType.CloseSubpath can be ignored: it will always be the last point of your GraphicsPath chunk.
A quick look into Reflector or here might be helpful if you want to better understand PointTypes array.
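The rule described above can be sketched in a few lines of Python, working on plain arrays rather than the .NET classes. The function name `bezier_segments` is hypothetical, and the sketch assumes the types array follows the layout discussed in this answer: the low three bits give the point type, and Bezier points come in groups of three that share their start point with the preceding point.

```python
START, LINE, BEZIER, TYPE_MASK = 0, 1, 3, 7  # values from the PathPointType table


def bezier_segments(points, types):
    """Return a list of (start, control1, control2, end) tuples, one per Bezier."""
    segments = []
    i = 0
    while i < len(points):
        kind = types[i] & TYPE_MASK  # drop marker / close-subpath flag bits
        if kind == BEZIER:
            if i == 0 or i + 2 >= len(points):
                raise ValueError("malformed Bezier run at index %d" % i)
            # The previous point (Start, Line, or the end of the previous
            # Bezier) doubles as the first point of this curve.
            segments.append((points[i - 1], points[i], points[i + 1], points[i + 2]))
            i += 3
        else:
            i += 1
    return segments
```

Applied to the ten points in the question, this yields three Bezier segments with endpoints at indices 0-3, 3-6 and 6-9. Without markers the arrays alone do not say which Beziers belong to which AddArc call, so grouping them into arcs still needs extra knowledge (here: one Bezier for the first arc, two for the second).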
