I'm writing some software to read DICOM files and I am not sure how to deal with tags that have an undefined length. The standard says that
"If the Value Field has an Explicit Length, then the Value Length Field shall contain a value equal to the length (in bytes) of the Value Field. Otherwise, the Value Field has an Undefined Length and a Sequence Delimitation Item marks the end of the Value Field."
So to read a value with undefined length FFFFFFFF I would continue reading bytes until I hit a sequence delimitation item FFFEE0DD. What happens if the value contains a series of bytes that happen to be equal to a sequence delimitation item? How do I correctly locate the sequence delimitation item?
Undefined lengths are used in DICOM both for SQ attributes and for Pixel Data (7FE0,0010). In both cases, the chunks of data are encoded using one or more Item Tags (FFFE,E000), and the end of the attribute is signaled by the Sequence Delimitation Item (FFFE,E0DD).
In the case of Pixel Data, each fragment of pixel data is encoded with an Item Tag (FFFE,E000) followed by an explicit length. Each frame of the pixel data can be composed of one or more fragments. The first item encoded in the pixel data is a basic offset table. If a frame is encoded with more than one fragment, this offset table gives the offset of each frame within the pixel data. If the offset table is zero length, then each frame is encoded in a single fragment. You can see an example of the encoding in DICOM Part 5, Table A.4-1. In any case, you should be able to parse the contents of the pixel data by repeatedly reading 8 bytes to get the Item Tag or Sequence Delimitation Item plus the length of the fragment, then reading the number of bytes specified by that length, and repeating. Because you skip over each fragment by its length instead of scanning for the delimiter, bytes inside a fragment that happen to equal FFFEE0DD are never mistaken for the end of the value.
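The read-8-bytes-then-skip loop described above can be sketched in Python as follows. This is a minimal sketch, not a full DICOM reader: the function name `read_fragments` is hypothetical, and it assumes the buffer starts immediately after the Pixel Data element header and uses little-endian byte order, as the encapsulated transfer syntaxes do.

```python
import struct

ITEM_TAG = (0xFFFE, 0xE000)        # Item
SEQ_DELIM_TAG = (0xFFFE, 0xE0DD)   # Sequence Delimitation Item
UNDEFINED_LENGTH = 0xFFFFFFFF


def read_fragments(buf, offset=0):
    """Return (items, next_offset) for encapsulated pixel data.

    Assumption: buf[offset:] begins right after the Pixel Data element
    header (tag, VR, reserved bytes and the FFFFFFFF length).
    items[0] is the basic offset table, which may be empty.
    """
    items = []
    while True:
        if offset + 8 > len(buf):
            raise ValueError("data ended before the Sequence Delimitation Item")
        # 2-byte group, 2-byte element, 4-byte length, all little endian
        group, element, length = struct.unpack_from("<HHI", buf, offset)
        offset += 8
        if (group, element) == SEQ_DELIM_TAG:
            return items, offset
        if (group, element) != ITEM_TAG:
            raise ValueError("unexpected tag (%04X,%04X)" % (group, element))
        if length == UNDEFINED_LENGTH or offset + length > len(buf):
            raise ValueError("pixel data fragment has an invalid length")
        # Skip over the fragment by its length; its bytes are never inspected.
        items.append(buf[offset:offset + length])
        offset += length
```

Since the loop only ever interprets the 8 bytes at a fragment boundary as a tag, a fragment whose payload contains the bytes FE FF DD E0 is returned intact.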
You can see an example of SQ attribute encoding in Part 5 of DICOM, Table 7.5-2 and Table 7.5-3. This works in a similar fashion, with the exception that the length associated with an Item Tag can itself be "undefined length". In the case of sequences, however, you can simply keep parsing the contents of each item, since they are just ordinary tag data; an item with undefined length ends at its Item Delimitation Item (FFFE,E00D).
I am a somewhat newbie to SQLite (and KMyMoney). KMyMoney (an open source personal finance manager) allows one-click exporting data into an SQLite database.
On browsing the SQLite database output, the dollar amount data is stored in a table called kmmSplits as several text fields in a strange format based on “value” and “valueFormatted” (see screen shot below). The “value” field is apparently written as a division equation (in a text format) which apparently yields the “valueFormatted” field (again in text format). The “valueFormatted” field holds the correct amount, but the problem is that parentheses are used to indicate a negative number instead of a simple minus sign in front of the value. This is apparently an accounting number format, but I don’t know how to parse it into a float value for running calculated SQL queries, etc. The positive values (without parentheses) are no problem to convert to FLOATs.
I’ve tried using CAST to FLOAT, but this does not do the division math, nor does it convert the parentheses into negative values (see screen shot).
The basic question is: how do I parse a text value containing parentheses in the “valueFormatted” field (accounting money format) into a common number format or, alternatively, how do I convert a division equation in the “value” field into an actual calculation?
Use a CASE expression to check whether valueFormatted is a numeric value inside parentheses and, if it is, multiply the substring starting from the 2nd character by -1 (the closing parenthesis is discarded by SQLite during this implicit type cast):
SELECT *,
       CASE
         WHEN valueFormatted LIKE '(%)' THEN (-1) * SUBSTR(valueFormatted, 2)
         ELSE valueFormatted
       END AS value
FROM kmmSplits;
Or replace '(' with '-' and add 0 to convert the result to a number:
SELECT *,
       REPLACE(valueFormatted, '(', '-') + 0 AS value
FROM kmmSplits;
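To try the REPLACE approach without touching the real export, here is a small Python sketch using the standard sqlite3 module and an in-memory database. The helper name `parse_accounting_amounts` and the one-column stand-in for the kmmSplits table are assumptions made for illustration only.

```python
import sqlite3


def parse_accounting_amounts(values):
    """Convert accounting-style strings such as '(12.50)' to numbers via SQLite."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical one-column stand-in for the exported kmmSplits table.
    conn.execute("CREATE TABLE kmmSplits (valueFormatted TEXT)")
    conn.executemany("INSERT INTO kmmSplits VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        "SELECT REPLACE(valueFormatted, '(', '-') + 0 "
        "FROM kmmSplits ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

For example, parse_accounting_amounts(["(12.50)", "7.25"]) yields a negative and a positive number. Adding 0 forces the numeric conversion, and SQLite ignores the trailing ')' because it only reads the leading numeric part of the text.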
I need to count number of lines in each block and count number of blocks in order to read it properly afterwards. Can anybody suggest a sample piece of code in Fortran?
My input file goes like this:
# Section 1 at 50% (Name of the block and its number and value)
1 2 3 (Three numbers in line with random number of lines)
...
1 2 3
# Section 2 at 100% (And then again Name of the block)
1 2 3...
and so on.
The code is below. It works fine with one set of data, but when it meets "#" again it just stops, providing data about only one section. It cannot jump to the next section:
integer IS, NOSEC, count
double precision SPAN
character(LEN=100):: NAME, NAME2, AT
real whatever
101 read (10,*,iostat=ios) NAME, NAME2, IS, AT, SPAN
if (ios/=0) go to 200
write(6,*) IS, SPAN
count = 0
102 read(10,*,iostat=ios) whatever
if (ios/=0) go to 101
count = count + 1
write(6,*) whatever
go to 102
200 write(6,*) 'Section span =', SPAN
So the first loop (101) is supposed to read the parameters of the block, and the second (102) counts the number of lines in the block, with 'count' as the only value that is needed. However, when it is supposed to jump from 102 back to 101 to start a new block, it goes to 200 instead (printing the results of the operation), which means it couldn't read the data for the second block.
Let's say your file contains two valid types of lines:
Block headers, which begin with '# ', and
Data lines which begin with a digit 0 through 9
Let's add further conditions:
Leading whitespace is ignored,
Lines which don't match the first two patterns are considered comments and are ignored
Comment lines do not terminate a block; blocks are only terminated when a new block is found or the end of the file is reached,
Data lines must follow a block header (the first non-comment line in a file must be a block header),
Blocks may be empty, and
Files may contain no blocks
You want to know the number of blocks and how many data lines are in each block but you don't know how many blocks there might be. A simple dynamic data structure will help with record-keeping. The number of blocks may be counted with just an integer, but a singly-linked list with nodes containing a block ID, a data line count, and a pointer to the next node will gracefully handle an arbitrarily large blob of data. Create a head node with ID = 0, a data line count of 0, and the pointer nullify()'d.
The Fortran Wiki has a pile of references on singly-linked lists: http://fortranwiki.org/fortran/show/Linked+list
Since the parsing is simple (e.g. no backtracking), you can process each line as it is read. Iterate over the lines in the file, use adjustl() to dispose of leading whitespace, then check the first two characters: if they are '# ', increment your block counter by one, add a new node to the list, set its ID to the value of the block counter (1 for the first block), and process the next line.
Aside: I have a simple character function called munch() which is just trim(adjustl()). Great for stripping whitespace off both ends of a string. It doesn't quite act like Perl's chop() or chomp() and Fortran's trim() is more of an rtrim() so munch() was the next best name.
If the line doesn't match a block header, check whether the first character is a digit; index('0123456789', line(1:1)) is greater than zero if the first character of line is a digit, otherwise it returns 0. For a data line, increment the data line count in the head node of the linked list and go on to process the next line.
Note that if the block count is zero, this is an error condition; write out a friendly "Data line seen before block header" error message with the last line read and (ideally) the line number in the file. It takes a little more effort but it's worth it from the user's standpoint, especially if you're the main user.
Otherwise if the line isn't a block header or a data line, process the next line.
Eventually you'll hit the end of the file and you'll be left with the block counter and a linked list that has at least one node. Depending on how you want to use this data later, you can dynamically allocate an array of integers the length of the block counter, then transfer the data line count from the linked list to the array. Then you can deallocate the linked list and get direct access to the data line count for any block because the block index matches the array index.
I use a similar technique for reading arbitrarily long lists of data. The singly-linked list is extremely simple to code and it avoids the irritation of having to reallocate and expand a dynamic array. But once the amount of data is known, I carve out a dynamic array the exact size I need and copy the data from the linked list so I can have fast access to the data instead of needing to walk the list all the time.
Since Fortran doesn't have a standard library worth mentioning, I also use a variant of this technique with an insertion sort to simultaneously read and sort data.
So sorry, no code but enough to get you started. Defining your file format is key; once you do that, the parser almost writes itself. It also makes you think about exceptional conditions: data before block header, how you want to treat whitespace and unrecognized input, etc. Having this clearly written down is incredibly helpful if you're planning on sharing data; the Fortran world is littered with poorly-documented custom data file formats. Please don't add to the wreckage...
Finally, if you're really ambitious/insane, you could write this as a recursive routine and make your functional programming friends' heads explode. :)
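The line-classification rules described above can be sketched compactly. This is an illustration in Python, not Fortran, and it swaps the singly-linked list for a plain growable list, so it shows the parsing logic only and not the record-keeping technique the answer recommends. The function name `count_blocks` is hypothetical.

```python
def count_blocks(lines):
    """Return a list holding the number of data lines in each block.

    The number of blocks is simply the length of the returned list.
    """
    counts = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()  # leading (and trailing) whitespace is ignored
        if line.startswith("# "):
            # Block header: open a new block with zero data lines so far.
            counts.append(0)
        elif line and line[0] in "0123456789":
            # Data line: it must belong to an already opened block.
            if not counts:
                raise ValueError(
                    "Data line seen before block header at line %d: %r"
                    % (lineno, raw)
                )
            counts[-1] += 1
        # Anything else is a comment and does not terminate the block.
    return counts
```

Passing an open file object works as well as a list of strings, since both yield one line at a time. An empty input gives an empty list, which matches the rule that files may contain no blocks.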
As the title says :
What is the maximum length of a string that can be stored inside a custom dimension in google analytics?
So the maximum length of Custom Dimension value must not exceed 150 bytes.
If you are using plain text, it can be up to 150 characters (best case, one byte per character in UTF-8).
Worst case: 37 characters (four bytes per character).
There are various tools to calculate the byte count of a string. One that I found: https://mothereff.in/byte-counter
Reference: https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference#dimension
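The byte count can also be checked locally. A minimal Python sketch, assuming the limit is measured on the UTF-8 encoding of the value (which is what the 150 and 37 character figures above imply); the function name `fits_custom_dimension` is hypothetical:

```python
def fits_custom_dimension(value, limit_bytes=150):
    """Return True if the UTF-8 encoding of value is within limit_bytes."""
    return len(value.encode("utf-8")) <= limit_bytes
```

A string of 150 ASCII letters fits, while 38 four-byte characters (such as emoji) do not, because they encode to 152 bytes.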
Update - GA4 limitations
Events can have a maximum of 25 parameters.
Events can have a maximum of 25 user properties.
User property names must be 24 characters or fewer.
User property values must be 36 characters or fewer.
Parameter names (including item parameters) must be 40 characters or fewer, may only contain alpha-numeric characters and underscores, and must start with an alphabetic character.
Parameter values (including item parameter values) must be 100 characters or fewer.
The post body must be smaller than 130kB.
Reference: https://developers.google.com/analytics/devguides/collection/protocol/ga4/sending-events?client_type=gtag#required_parameters
It is possible to use getValue(), getCalculatedValue() and getOldCalculatedValue() to retrieve the value of a cell in phpexcel.
Is there a way to determine programmatically the content type of the cell and apply the corresponding correct method? I need to use this in a general way, i.e. to display the same value as when opening Excel.
I know there is something called getDataType() but I am not sure how to apply it in this case (it is not in the documentation). In my experience, sometimes only one of these three retrieves the correct value.
(i.e. sometimes getOldCalculatedValue() works but not getCalculatedValue() for a formula, for example; other times only getValue() works, etc.)
getOldCalculatedValue() is used to retrieve the result of a previous calculation in MS Excel itself; and should not be relied on, because it is possible to disable autocalculate in MS Excel, which can leave this field empty, or even with an incorrect value. It is used within PHPExcel as a "fallback" for cell formulae that are reliant on external spreadsheet data, but it still shouldn't be trusted as an absolute.
getValue() returns the "raw" value of the cell. The returned value may require "interpretation". A cell containing a date and/or time is simply a float value in MS Excel, so it will return that float (e.g. 42017.7916666667 instead of a human-readable date/time like 13-Jan-2015 19:00); it will return the actual formula if a cell contains a formula (e.g. =TODAY()); and it will return 0.8 for a value that is formatted as a percentage and appears as 80% in MS Excel itself.
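To illustrate what "interpretation" of a raw date value means, here is a minimal Python sketch (not PHPExcel code) that converts an Excel serial number to a date/time. It assumes the workbook uses the default 1900 date system, for which the usual base date for arithmetic is 1899-12-30; the function name `excel_serial_to_datetime` is hypothetical.

```python
from datetime import datetime, timedelta

# Assumed base date for the Excel 1900 date system (valid for serials
# from March 1900 onwards).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Interpret a raw Excel float as days (and fraction of a day) since the epoch."""
    return EXCEL_EPOCH + timedelta(days=serial)
```

Applied to 42017.7916666667, this gives a value within a few microseconds of 13-Jan-2015 19:00, the human-readable form mentioned above.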
getCalculatedValue() will attempt to execute a formula calculation if a cell contains a formula, and return the result of that calculation. If the cell doesn't contain a formula, then it will return the "raw" value, in the same way as getValue(). While PHPExcel has a fairly good calculation engine, it isn't perfect (it can't handle 3d cell ranges or array formulae for example), so it is possible for some formulae to fail. Likewise, formulae containing references to external resources may also fail, and while PHPExcel will attempt to use the getOldCalculatedValue() in that circumstance, it isn't (as mentioned above) guaranteed to maintain the correct result.
getFormattedValue() will execute getCalculatedValue(), and then apply any number formatting mask that applies to that cell against the result, so that (for example) a float with a date mask will be displayed as a date.
However, if you've loaded a spreadsheet file with setReadDataOnly(true), then that tells PHPExcel not to load any formatting, including number format masks, so it will not be able to format the result.
In general, the method whose result is closest to the values displayed in MS Excel itself is getFormattedValue().
I have some products which have 2d GS1 bar codes on them. Most have the format 01.17.10 which is GTIN.Expiry Date.Lot Number.
This makes sense as 01 and 17 are fixed length, so can be parsed easily, just by splitting the string in the appropriate place.
However, I also have some in the format 01.10.17.21 (GTIN.Lot.Expiry.Serial Number) which doesn't make sense because Lot and Serial number are variable length, meaning I cannot use position to decode the various elements. Also, I cannot search for the AIs as they could legitimately appear in the data.
It seems that I've no way of reliably decoding this format. Am I missing something?
Thanks!
According to the GS1 website, "More than one AI can be carried in one bar code. When this happens, AIs with a fixed length data content (e.g., SSCC has a fixed length of 18 digits) are placed at the beginning and AI with variable lengths are placed at the end. If more than one variable length AI is placed in one bar code, then a special "function" character is used to tell the scanner system when one ends and the other one starts."
So it looks like they intend for you to order your AIs with the fixed-width identifiers first, then separate the variable-width fields with a function character, which appears to be FNC1. How that is implemented will depend on the barcode symbology you are using; it may be different between DataMatrix, Code 128 and QR Code, for example.
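The separator-based decoding described above can be sketched in Python. This is a minimal sketch under stated assumptions: the scanner is assumed to transmit the FNC1 separator as the ASCII group separator character (code 29), which is common but depends on the scanner configuration, and only the four AIs mentioned in the question are handled. The function name `parse_gs1` and the two lookup tables are hypothetical.

```python
GS = "\x1d"  # assumed representation of the FNC1 separator in the scanned string

FIXED_LENGTH_AIS = {"01": 14, "17": 6}  # GTIN, expiry date (YYMMDD)
VARIABLE_LENGTH_AIS = {"10", "21"}      # lot number, serial number


def parse_gs1(data):
    """Split a scanned element string into a dict of AI -> value."""
    result = {}
    pos = 0
    while pos < len(data):
        ai = data[pos:pos + 2]
        pos += 2
        if ai in FIXED_LENGTH_AIS:
            end = pos + FIXED_LENGTH_AIS[ai]
        elif ai in VARIABLE_LENGTH_AIS:
            # A variable-length value runs to the next separator or the end.
            end = data.find(GS, pos)
            if end == -1:
                end = len(data)
        else:
            raise ValueError("unsupported AI %r at position %d" % (ai, pos - 2))
        result[ai] = data[pos:end]
        pos = end
        if pos < len(data) and data[pos] == GS:
            pos += 1  # skip the separator itself
    return result
```

Because AIs are only looked for at the start of a field, a lot number such as "ABC17" is read whole even though it contains "17"; the separator, not the content, marks where it ends.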