In describing how find (and find-and-replace) works, the Atom Flight Manual refers to the buffer as one search scope, with the entire project being another. What is the buffer? It seems like it would be the current file, but I suspect it is more than that.
From the Atom Flight Manual:
A buffer is the text content of a file in Atom. It's basically the same as a file for most descriptions, but it's the version Atom has in memory. For instance, you can change the text of a buffer and it isn't written to its associated file until you save it.
Also came across this from The Craft of Text Editing by Craig Finseth, Chapter 6:
A buffer is the basic unit of text being edited. It can be any size, from zero characters to the largest item that can be manipulated on the computer system. This limit on size is usually set by such factors as address space, amount of real and/or virtual memory, and mass storage capacity. A buffer can exist by itself, or it can be associated with at most one file. When associated with a file, the buffer is a copy of the contents of the file at a specific time. A file, on the other hand, can be associated with any number of buffers, each one being a copy of that file's contents at the same or at different times.
In a Mach-O executable, I am trying to increase the size of the __LLVM segment that precedes the __LINKEDIT segment (with a home-grown tool). I am considering two strategies: (a) move the __LLVM segment to after the __LINKEDIT segment, producing a file that is not what ld would create (now with a gap and section addresses out of order), and (b) move the __LINKEDIT segment to allow resizing of the __LLVM segment that precedes it. I need the result to be accepted for downstream processing, e.g. generating an .ipa file or sending to the App Store.
This question is about my assumptions and the viability of these approaches. Specifically, what are the potential pitfalls of each that might lead them to fail?
I implemented the first approach (a). The result is understood by segedit's -extract option, but its -replace option complains that the segments are out of order. I append a new segment to the file and update the address and length values in the corresponding load command to refer to this new segment data (both in the file and in destination memory). This might be fine, as long as the other downstream processing will accept the result (still to be checked; e.g. any local signature is likely invalidated).
The second approach (b) seems cleaner, as long as there are no references into the __LINKEDIT segment, which I gather contains linking information (symbol tables etc., rather than code). I have not tried this yet, though segedit will presumably be happy with the result, which suggests other processing might be happier too. Are there likely to be any references that are invalidated by simply moving this segment? I am guessing that I will have to update further load commands (they seem to reference into the __LINKEDIT segment), which I have not examined, but this should be fairly straightforward.
EDIT: Replaced my confused use of "section" with "segment" (mentioned in answer).
ADDED: The context is that I have no control over generating the original executable. I need to post-process it, essentially performing a 'segedit -replace' operation, wherein a section in the segment is replaced with one that is larger than the space previously allocated for the segment.
Follow-up clarifying question: It seems from the answer that moving the __LINKEDIT segment will break it. Can this be fixed by adjusting load commands only (e.g. LC_DYLD_INFO_ONLY, LC_LOAD_DYLINKER, LC_LOAD_DYLIB), without touching data in any segment? I am not yet familiar with these load commands and would like to know whether this is worth pursuing.
So basically the segments and sections describe how the physical file maps onto virtual memory.
As I mentioned in a previous iteration of this answer, there are limitations on the segment order:
The __TEXT segment must start at physical file offset 0 of the executable
The __LINKEDIT segment must not start at physical file offset 0
__LINKEDIT's file offset + file size should equal the physical executable size (this implies __LINKEDIT being the last segment); otherwise code signing won't work (sketch below)
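A minimal check of that last constraint could look like the sketch below. This is a hypothetical helper (linkedit_is_last is my name, not an existing API), assuming a thin, little-endian 64-bit Mach-O mapped read-only at buf and the physical size obtained from stat(2); error handling omitted:

    #include <mach-o/loader.h>
    #include <stdint.h>
    #include <string.h>

    /* Sketch: return 1 if __LINKEDIT is the last thing in the file,
     * i.e. its fileoff + filesize equals the physical file size. */
    static int linkedit_is_last(const uint8_t *buf, uint64_t file_size)
    {
        const struct mach_header_64 *mh = (const struct mach_header_64 *)buf;
        const uint8_t *p = buf + sizeof(*mh);   /* load commands follow the header */

        for (uint32_t i = 0; i < mh->ncmds; i++) {
            const struct load_command *lc = (const struct load_command *)p;
            if (lc->cmd == LC_SEGMENT_64) {
                const struct segment_command_64 *seg =
                    (const struct segment_command_64 *)lc;
                if (strncmp(seg->segname, SEG_LINKEDIT, 16) == 0)
                    return seg->fileoff + seg->filesize == file_size;
            }
            p += lc->cmdsize;
        }
        return 0; /* no __LINKEDIT segment found */
    }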
The LC_DYLD_INFO_ONLY load command contains file offsets to the dyld loading/binding opcodes for:
rebase
bind at load
weak bind
lazy bind
export
For each kind there is a file offset and size entry in LC_DYLD_INFO_ONLY describing data in the file that, in a "regular" ld-linked executable, falls inside __LINKEDIT. LC_DYLD_INFO_ONLY does not use any segment & section information from __LINKEDIT directly; the file offsets and sizes are enough.
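For reference, the load command itself (struct dyld_info_command, declared in <mach-o/loader.h>) is nothing more than file offset/size pairs; no segment or section names appear in it:

    struct dyld_info_command {
        uint32_t cmd;            /* LC_DYLD_INFO or LC_DYLD_INFO_ONLY */
        uint32_t cmdsize;        /* sizeof(struct dyld_info_command) */
        uint32_t rebase_off;     /* file offset of rebase info */
        uint32_t rebase_size;    /* size of rebase info */
        uint32_t bind_off;       /* file offset of binding info */
        uint32_t bind_size;      /* size of binding info */
        uint32_t weak_bind_off;  /* file offset of weak binding info */
        uint32_t weak_bind_size; /* size of weak binding info */
        uint32_t lazy_bind_off;  /* file offset of lazy binding info */
        uint32_t lazy_bind_size; /* size of lazy binding info */
        uint32_t export_off;     /* file offset of export info */
        uint32_t export_size;    /* size of export info */
    };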
EDIT: also, as mentioned in kirelagin's answer here:
"Apparently, the new version of dyld from 10.12 Sierra performs a check that previous versions did not perform: it makes sure that the LC_SYMTAB symbols table is entirely within the __LINKEDIT segment."
I assume that since you want to inflate the size of the preceding __LLVM segment, you also want some extra data in the file itself. Typically the data described by __LINKEDIT (i.e. not the segment & sections themselves, but the actual data) won't use 100% of its space, so it could be modified to start "later" and occupy less space.
A tool called jtool by Jonathan Levin could probably do it for you.
I know this is an old question, but I solved this problem while solving another problem.
Define the slide amount; this must be page-aligned, so I chose 0x4000.
Add the slide amount to the file offsets in the relevant load commands; this includes, but is not limited to:
__LINKEDIT segment (duh)
dyld_info_command
symtab_command
dysymtab_command
linkedit_data_commands
Physically move the __LINKEDIT data in the file (a sketch of the load-command adjustments follows below).
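A minimal sketch of the offset adjustments, assuming a thin, little-endian 64-bit Mach-O mapped read/write at buf and a page-aligned slide (slide_linkedit_offsets is a hypothetical helper; newer binaries may carry further __LINKEDIT-referencing commands such as LC_DYLD_EXPORTS_TRIE and LC_DYLD_CHAINED_FIXUPS, which are also linkedit_data_commands):

    #include <mach-o/loader.h>
    #include <stdint.h>
    #include <string.h>

    /* Sketch: add 'slide' to every file offset that points into __LINKEDIT.
     * The __LINKEDIT bytes themselves must then be moved by the same amount. */
    static void slide_linkedit_offsets(uint8_t *buf, uint64_t slide)
    {
        struct mach_header_64 *mh = (struct mach_header_64 *)buf;
        uint8_t *p = buf + sizeof(*mh);

        for (uint32_t i = 0; i < mh->ncmds; i++) {
            struct load_command *lc = (struct load_command *)p;
            switch (lc->cmd) {
            case LC_SEGMENT_64: {
                struct segment_command_64 *seg = (struct segment_command_64 *)lc;
                if (strncmp(seg->segname, SEG_LINKEDIT, 16) == 0) {
                    seg->fileoff += slide;
                    seg->vmaddr  += slide;   /* if the VM layout shifts by the same amount */
                }
                break;
            }
            case LC_DYLD_INFO:
            case LC_DYLD_INFO_ONLY: {
                struct dyld_info_command *di = (struct dyld_info_command *)lc;
                if (di->rebase_off)    di->rebase_off    += slide;
                if (di->bind_off)      di->bind_off      += slide;
                if (di->weak_bind_off) di->weak_bind_off += slide;
                if (di->lazy_bind_off) di->lazy_bind_off += slide;
                if (di->export_off)    di->export_off    += slide;
                break;
            }
            case LC_SYMTAB: {
                struct symtab_command *st = (struct symtab_command *)lc;
                if (st->symoff) st->symoff += slide;
                if (st->stroff) st->stroff += slide;
                break;
            }
            case LC_DYSYMTAB: {
                struct dysymtab_command *ds = (struct dysymtab_command *)lc;
                if (ds->tocoff)         ds->tocoff         += slide;
                if (ds->modtaboff)      ds->modtaboff      += slide;
                if (ds->extrefsymoff)   ds->extrefsymoff   += slide;
                if (ds->indirectsymoff) ds->indirectsymoff += slide;
                if (ds->extreloff)      ds->extreloff      += slide;
                if (ds->locreloff)      ds->locreloff      += slide;
                break;
            }
            case LC_FUNCTION_STARTS:
            case LC_DATA_IN_CODE:
            case LC_SEGMENT_SPLIT_INFO:
            case LC_CODE_SIGNATURE: {
                struct linkedit_data_command *led = (struct linkedit_data_command *)lc;
                if (led->dataoff) led->dataoff += slide;
                break;
            }
            }
            p += lc->cmdsize;
        }
    }

After patching the commands, the __LINKEDIT bytes themselves are moved by the same amount in the file, and any existing code signature will need to be regenerated (as noted above, signing expects __LINKEDIT to end exactly at the end of the file).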
Most file systems use locking to handle concurrent reads and writes. But what if, after a read call, a write call is executed that deletes the data preceding the previous read?
Is the pointer for a file open for reading updated to reflect the new start of the now smaller file?
The question isn't really valid, because you can't delete data using the write system call. You can overwrite data using the write(2) system call, but you can't delete data. Now, you can truncate the file using the truncate(2) system call. This changes the size of the file (reported via the st_size field of the stat(2) system call); any data beyond the new end of file is discarded, and if the file is later extended, that region reads back as zeros. You can also increase the size of the file using the truncate system call by requesting a new size which is larger than the current size. It is undefined (per the POSIX specification) whether this is allowed, or what the system will do when it receives a truncate larger than the current size of the file. On many file systems it will simply set the size of the file to the requested size.
OK, a few more concepts. Associated with each open file structure is a file offset pointer. Attempts to read or write a file using the read(2) or write(2) system call will advance the offset pointer by the number of bytes read or written. If you open a file twice using the open(2) system call, you will get two file descriptors, each referring to a different open file structure, and in that case a read(2) or write(2) using one file descriptor will not change the file offset of the other file descriptor. (If you clone a file descriptor using the dup(2) system call, then you will get a second file descriptor which points to the same file structure, and changes made to the file structure via one file descriptor, using the read(2), write(2), or lseek(2) system calls, will be reflected via the cloned file descriptor. But that's a side issue, so that's all I will say on this topic for now.)
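A small sketch of that offset behaviour (hypothetical file name; assumes data.txt exists and is at least 32 bytes, error checking omitted):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[16];

        /* Two separate open(2) calls: two open file structures, two offsets. */
        int a = open("data.txt", O_RDONLY);
        int b = open("data.txt", O_RDONLY);
        ssize_t n = read(a, buf, sizeof buf);            /* advances a's offset only */
        printf("read %zd, a at %lld, b at %lld\n", n,
               (long long)lseek(a, 0, SEEK_CUR),         /* 16 */
               (long long)lseek(b, 0, SEEK_CUR));        /* 0  */

        /* dup(2) clones the descriptor but shares the open file structure. */
        int c = dup(a);
        n = read(c, buf, sizeof buf);                    /* advances the shared offset */
        printf("read %zd, a at %lld, c at %lld\n", n,
               (long long)lseek(a, 0, SEEK_CUR),         /* 32 */
               (long long)lseek(c, 0, SEEK_CUR));        /* 32 */

        close(a); close(b); close(c);
        return 0;
    }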
Now, if you truncate the file, this doesn't change the file offset in the open file structure. However, if that offset now lies at or beyond the new end of the file, a subsequent read(2) will return 0 (end of file) rather than data. So the answer is that the file offset pointer won't be updated by the truncate; a read at the old offset simply reports end of file, and it will only see (zero-filled) data again if the file is later extended past that offset.
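A short sketch of that behaviour (hypothetical file name; assumes a writable scratch.txt of at least 100 bytes and a system where ftruncate(2) can extend a regular file; error checking omitted):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[64];
        int fd = open("scratch.txt", O_RDWR);

        lseek(fd, 80, SEEK_SET);                         /* offset is now 80 */
        ftruncate(fd, 50);                               /* shrink the file to 50 bytes */

        printf("offset after truncate: %lld\n",
               (long long)lseek(fd, 0, SEEK_CUR));       /* still 80 */
        printf("read at old offset: %zd\n",
               read(fd, buf, sizeof buf));               /* 0 => end of file */

        ftruncate(fd, 200);                              /* grow the file again */
        ssize_t n = read(fd, buf, sizeof buf);           /* 64 bytes, all zero */
        printf("read after extend: %zd, first byte: %d\n", n, buf[0]);

        close(fd);
        return 0;
    }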
I am pretty new to MongoDB. What I want to do is insert a 3 MB PDF file using the Java driver, change the chunk size from the default 256 KB to 1 MB, and then retrieve the second chunk, say the 2nd page of the PDF document.
How can I do this?
Thank you.
Generally, once a document has been written into GridFS you will need to re-write it (delete and save again) to modify the chunk size.
Since GridFS does not know anything about the format of the data in the file, it cannot help you get to the "2nd page". The InputStream implementation returned from GridFSDBFile does avoid reading blocks when you use the skip(long) method, so if you know that the "2nd page" is N bytes into the file, you can skip that many bytes in the stream and start reading.
HTH, Rob
P.S. Remember that skip(long) returns the number of bytes actually skipped. You should not assume that skip(12) always skips 12 bytes.
P.P.S. Starting to read from the middle of a PDF and making sense of what is there is going to be hard unless you have preserved state from the previous page(s).
This question may look like a duplicate, but I am not finding the answer I am looking for.
The problem is that, on Unix, a 4GL binary is fetching data from a table using a cursor and writing the data to a .txt file.
The table contains around 50 million records.
The binary takes a long time and does not complete, and the .txt file stays at 0 bytes.
I want to know the possible reasons why the records are not being written to the .txt file.
Note: there is enough disk space available.
Also, for 30 million records I get the data in the .txt file as expected.
The information you provide is insufficient to tell for sure why the file is not written.
In UNIX, a text file is just like any other file - a collection of bytes. No specific limit (or structure) is enforced on "row size" or "row count", although obviously some programs may have limits on the maximum supported line length and the like (depending on their implementation).
When a program starts writing data to a file (i.e. once the internal buffer is flushed for the first time), the file will no longer be zero size, so clearly your binary is doing something else all that time (unless it wipes out the file as part of its cleanup).
Try running your executable via strace to see the file I/O activity - that would give some clues as to what is going on.
Try closing the writer if you are using one to write to the file. It achieves the dual purpose of closing the resource along with flushing the remaining contents of the buffer.
Output that has been computed still needs to be flushed if you are using any kind of buffered writer. I have encountered such situations a few times, and in almost all cases the issue was that the output had not been flushed.
In Java specifically, the usual practice for writing data involves buffers: data is written to the file when the buffer fills up, but not before. Output can be lost when the program exits without flushing (or closing) the buffered writer.
So, in your case, if the processing time is reasonable and the output is still not in the file, it may mean that the output was computed and is sitting in RAM but was never written to the file on disk because it was not flushed.
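The same effect is easy to reproduce outside of Java; here is a minimal C sketch (hypothetical file name) of a file that stays at 0 bytes until the stdio buffer is flushed or the stream is closed:

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *out = fopen("out.txt", "w");

        /* For a regular file, stdio is typically fully buffered, so this
         * short line sits in the in-memory buffer, not on disk. */
        fprintf(out, "first row\n");
        sleep(60);      /* during this minute, out.txt is still 0 bytes */

        fflush(out);    /* now the buffered bytes reach the file */
        /* ... long-running work producing more rows ... */
        fclose(out);    /* closing also flushes whatever is left */
        return 0;
    }

Running your binary under strace, as suggested above, would show whether any write(2) calls are happening at all.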
You can also consider the answers to this question.
I am using Connect:Direct with scp and trying to send some PDF files from Unix to a mainframe.
On the Unix end, I have an archive containing PDFs, which I simply rename to ABC.XYZ.LMN.PQR (the mainframe file name) and then send to the mainframe.
The archive contains variable-length PDF files.
However, the requirement is:
For any variable-length file, the mainframe needs to know the maximum possible length of any record in the file. For example, say the LRECL is 1950.
How do I specify the LRECL when the PDF files inside the archive being sent are of variable length?
An alternative would be to transfer the files to a Unix System Services file (z/OS Unix) instead of a "traditional" z/OS dataset. Then the folks on the mainframe side could use their utilities to copy the file to a "traditional" mainframe dataset if that's what they need.
For variable blocked (VB) datasets only! If your maximum record size is 1950, you will want to specify RECFM=VB,LRECL=1954, i.e. 4 bytes more than your maximum record. This 4-byte allowance is for the Record Descriptor Word (RDW). If you need to specify BLKSIZE, the minimum is the LRECL plus another 4 bytes.
So in your example, your JCL will have DCB parameters that looks like: RECFM=VB,LRECL=1954,BLKSIZE=1958
Ideally, for optimal storage, BLKSIZE should be set to the largest size that does not exceed the device-specific recommendation, e.g. tape devices typically use BLKSIZE=32760 (32 * 1024 - 8, allowing for the RDW and BDW). Disk drives vary, but in our shop BLKSIZE=23476 is considered optimal.
Incorrect blocking factors can waste tremendous amounts of space. When in doubt, let your system defaults apply or consult your local system gurus for their device-specific recommendations.