I am trying to convert data from Act 2000 to a MySQL database. I have successfully imported the DBF files into individual MySQL tables. However, I am having issues with the *.BLB file, which seems to be a non-standard memo file.
The DBF files identify themselves as dBase III Plus, no memo format. There is a single *.BLB file, which is a memo file shared by multiple DBFs for BLOB data.
If you read this document: http://cicorp.com/act/sdk/ACT6-SDK-ChapterA.htm#_Toc483994053
you can see that the REGARDING column is a 6-character field. The description is: "This 6-byte field is supplied by the system and contains a reference to a field in the Binary Large Object (BLOB) Database."
Now, upon opening the *.BLB file, I can see that the block size is 64 bytes. All the blocks of text are NULL-padded out to that size.
Where I am stumbling is trying to convert the values stored in the REGARDING column into block locations in the BLB file. My assumption is that the 6-character field is an offset.
For example, one value for REGARDING is (ignoring the square brackets): [ ",J$]
In my Googling, I found this: http://ulisse.elettra.trieste.it/services/doc/dbase/DBFstruct.htm#C1.5
It explains that in memo fields (in normal DBF files at least) the space value is ignored (i.e. it just pads out the column).
Therefore, if I'm correct, (again ignoring the square brackets) [",J$] should be the offset into my BLB file. Luckily I still have access to the original ACT 2000 software, so I can compare the full text between the program, MySQL and the BLB file.
Using my example value, I know that the DB row with a REGARDING value of [ ",J$] corresponds to a 1024-byte offset (or 16 blocks, assuming my guess of a 64-byte block size is right).
I've tried reading some Python code from open-source projects that read DBF files, but I'm in over my head.
I think what I need to do is unpack the characters to binary, but I'm not sure.
How can I find the 64-byte block to read from, based on what's stored in the DBF files?
EDIT by Jerry Dodge
I've attempted to reverse-engineer the strings in this field to hexadecimal values, and then to an integer value using StrToInt64, but the result still does not match up with the blob file. I've also tried multiplying this integer value by 64 and not multiplying, but the result keeps winding up outside of the size of the blob file, not actually finding any data.
For example, a value of ___/BD (_ = space) translates to $2f4244 hexadecimal, which in turn translates to the integer value 3097156, but this does not correspond to any relevant portion of data in the blob file, even when multiplied or divided by 64.
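For reference, here is a minimal sketch (in R) of how the candidate interpretations could be checked. The space-stripping and the 64-byte block size are assumptions taken from above, not anything confirmed by the SDK, and the helper itself is hypothetical:

# Hypothetical helper: strip the space padding, then read the remaining bytes
# of a REGARDING value as an unsigned integer in both byte orders, with and
# without multiplying by the assumed 64-byte block size.
candidate_offsets <- function(regarding, block_size = 64) {
  bytes <- as.integer(charToRaw(gsub(" ", "", regarding)))
  be <- sum(bytes * 256 ^ (rev(seq_along(bytes)) - 1))  # big-endian reading
  le <- sum(bytes * 256 ^ (seq_along(bytes) - 1))       # little-endian reading
  c(big_endian = be, little_endian = le,
    big_endian_x64 = be * block_size, little_endian_x64 = le * block_size)
}

candidate_offsets("   /BD")  # big_endian gives 3097156, matching the edit above

Each candidate number can then be tried as a seek position in the BLB file to see whether it lands on readable text.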
According to the SDK you linked, the following happens, as I understand it:
There is a TYPE field (right behind REGARDING) that encodes what REGARDING is used for (see the second table of the linked chapter). So I'd assume that if type=6 (meeting not held), REGARDING is either irrelevant or only contains a meeting ID reference from some other table. On that line of thought, I would only expect REGARDING to be a BLB offset if type=101 (or possibly 100). I'd also not abandon the thought that in these relevant cases REGARDING might be a concatenation of a BLB file index and an offset (because there is a mention that each file must not be longer than 30K characters, and I'd really expect to be able to store much more data even in one table).
I would like to write a line into a text file at a given position (i) while avoiding reading the file sequentially.
There is the writeLines base function, but I don't know how to insert the text at the position (i) given as a parameter.
Thanks
Dave
This is — unrelated to R — fundamentally impossible. Most (all common) filesystems do not support inserting or removing content in the middle of a file. The only supported operations are appending (or truncation) at the end, and R only supports appending, not truncation.
The way virtually all software solves your problem is by reading the file, modifying it, and writing it back to disk. If you want to get fancy because the file is very large (at least in the order of hundreds of MiB), you can stream edit the file: Read a part, edit that part, write it back to a new file. Rinse and repeat.
Technical aside: There is one exception to the above with low-level file operations, since files are stored as non-contiguous “blocks”. But even if R supported this it wouldn’t help you, since it doesn’t permit byte-level or line-level granularity: blocks are typically at least 4 kiB in size.
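For example, a minimal base-R sketch of that read-modify-write approach (the file name and position are placeholders):

# Read the whole file, insert the new line so that it becomes line i,
# and write everything back out.
insert_line <- function(path, i, new_line) {
  lines <- readLines(path)
  lines <- append(lines, new_line, after = i - 1)
  writeLines(lines, path)
}

insert_line("data.txt", 3, "this becomes line 3")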
I have an SQLite database in which I altered a table to add a column that will contain a kind of permanently unique ID for each row (in addition to the existing INTEGER PRIMARY KEY, which might be reassigned and is thus not permanent). I also want to avoid accidentally mixing up the normal IDs and the new "permanent IDs", therefore I decided to use a TEXT column and give each value a prefix, for example pid-.
So I simply added a column named perma_id with the type TEXT and ran UPDATE mytable SET perma_id = 'pid-' || _rowid_ to assign values for the existing rows. I then saved and compacted/vacuumed the database and compressed it into a zip-file because I will include it in an Android APK.
I noticed that the filesize had gone up from 379kB to 417kB after adding the new column. This is of course expected. But as an experiment, I thought maybe I could reduce the filesize by just using p... instead of pid-... for the perma_id column values, so I reassigned all the values. But to my surprise, the filesize had instead increased to 420kB! I experimented a bit further, and I can consistently get the (compressed) filesize to become 417kB with pid-... and 420kB with p.... As expected, using an INTEGER column reduces the filesize further, but only to 414kB.
This makes me wonder - what is the black magic behind the smaller file size when using a longer string as a prefix in the perma_id column? And is there a way to determine which string would produce the smallest filesize?
Edit
Just tried using the prefix perma-id-..., which results in a compressed file size of 414kB - i.e. same as using an INTEGER column with just the number after the prefix. So I tried very-long-permanent-id-with-the-value-... as prefix - 413kB. Mind = blown.
Did you try running the VACUUM command on the database before zipping each time?
When you shortened the perma_id values, it may have reduced the size of the data but kept the .DB file the same size, as SQLite doesn't automatically reduce the file size; it just marks chunks of the file as 'overwriteable'. Until, that is, you run VACUUM to throw away all this spare space.
I'm guessing the 'overwriteable' proportion of your file was hard to zip. Then, when it got filled up with lots of repeating text saying "permanent-id-with-the-value-", it got easier to zip!
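In case it helps, a rough sketch of vacuuming from R before zipping, using the DBI and RSQLite packages (the file name is a placeholder):

library(DBI)
file.size("mydb.db")                           # size before
con <- dbConnect(RSQLite::SQLite(), "mydb.db")
dbExecute(con, "VACUUM")                       # compact the file, releasing the free pages
dbDisconnect(con)
file.size("mydb.db")                           # size after; zip this compacted file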
Hi guys, I encrypted a school project, but the text file in which I saved my AES key has been deleted. I had taken a picture of it beforehand, so I typed the key into a new file. But the new AES key file does not match the key shown in the JPEG; some character must be wrong and I couldn't find it. Could you please help me?
Pic : https://i.stack.imgur.com/pAXzl.jpg
Text file : http://textuploader.com/dfop6
If you directly convert bytes with arbitrary values to Unicode you may lose information, because some bytes will not correspond to a Unicode character at all, or will map to a whitespace character or something else that cannot be easily distinguished in printed-out form.
Of course there may be ways to brute force your way out of this, but this could easily result in very complex code and possibly near infinite running time. Better start over, and if you want to use screen shots or similar printed text: base 64 or hex encode your results; those can be easily converted back.
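To illustrate the round-trip property, here is a quick sketch in base R (the key bytes are random stand-ins, not your actual key):

key  <- as.raw(sample(0:255, 16, replace = TRUE))  # stand-in for a 128-bit AES key
hex  <- paste(as.character(key), collapse = "")    # safe to print, photograph and retype
back <- as.raw(strtoi(substring(hex, seq(1, 31, 2), seq(2, 32, 2)), base = 16L))
identical(key, back)                               # TRUE: nothing is lost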
Setting:
I have (simple) .csv and .dat files created by laboratory devices and other programs, storing information on measurements or calculations. I have found solutions for this for other languages, but not for R.
Problem:
Using R, I am trying to extract values so I can quickly display results without opening the created files. There are two typical settings:
a) I need to read a priori unknown values after known key words
b) I need to read lines after known key words or lines
I can't make functions such as scan() and grep() work.
c) Finally, I would like to loop over dozens of files in a folder and produce a summary (to make the picture complete: I will manage this part myself).
I would appreciate any form of help.
OK, it works for the key value (although it's perhaps not very nice):
ks <- scan("file.csv", what = character(), sep = "")
returns a character vector of everything.
values <- ks[grep("keyword", ks) + 2]  # + 2 because the actual value is stored two places further on
returns the sought values as characters.
as.numeric(gsub(",", ".", values))
For completeness: the data had to be converted to numbers, and the "," vs "." decimal-separator problem needed to be solved.
In one line:
data <- as.numeric(gsub(",", ".", ks[grep("Ks_Boden", ks) + 2]))
Perseverance is not too bad an asset ;-)
The rest isn't finished yet; I will post it once it is.
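In case it is useful, a possible sketch for part (c), looping over a folder (the folder name, file pattern and keyword are placeholders):

read_key_value <- function(file, keyword = "Ks_Boden") {
  ks <- scan(file, what = character(), sep = "")
  as.numeric(gsub(",", ".", ks[grep(keyword, ks) + 2]))
}

files <- list.files("data_folder", pattern = "\\.csv$", full.names = TRUE)
sapply(files, read_key_value)  # one value per file, named by file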
In the PHPExcel library, when I assign a value to cell IW4, the assigned value is not generated there.
Steps:
We are using the following code to set a cell value in PHPExcel:
$objPHPExcel->getActiveSheet()->setCellValue('A1', 'cell value here');
When I use it to set a value in cell IW4, the value does not get generated:
$objPHPExcel->getActiveSheet()->setCellValue('IW4', 'cell value here');
Please help me find the solution.
BIFF-format Excel files only allow 256 columns (up to IV); OfficeOpenXML allows more.
If you set a value in a column beyond the limit, PHPExcel only knows it's invalid at the point where you save (when it knows whether you're saving as an Excel5 or Excel2007 file). Rather than trigger an exception at that point (which would be much more frustrating if it were a long-running script), it silently discards the invalid columns or rows.
This is similar to the behaviour of Excel itself when you open an xlsx file in an earlier version of Excel that doesn't support as many rows and columns.