I am attempting to read data files stored as a .txt, some of which are very large (>1 GB). It seems that every time QFile attempts to use the .open() method on files larger than 600MB, it freezes and crashes. Is there a better way to open large files than QFile? None of the code below the if (_file.open(QIODevice::ReadOnly)) line shown below executes, so I believe the crash occurs where the open method is called.
I understand from answers to similar questions that reading in large text files is not a great way to handle huge amounts of data, but unfortunately these are log files that I have no control over. I need to be able to read these files OR elegantly handle/ignore an oversized file, but I can't find information on how to detect the maximum read size. I would rather not have to manually open and split these files in a text editor, as I have about a terabyte of these to process and manually splitting could lead to loss of important information. I am not overly concerned with the responsiveness of this program, and any method used to open files can sit and think for quite a while, as this program will be used for data processing, not any kind of user interaction.
Thanks for your help
Code:
void FileRead::openNewFile()
{
    if(_listOfFiles.size()>0)
    {
        _file.setFileName(_listOfFiles.at(0));
        if (_file.open(QIODevice::ReadOnly)) //file opened successfully
        {
            _file.reset();
            emit fileOpened();
            emit fileOpened(_file.fileName());
            qDebug()<<"File Opened";
            qDebug()<<_file.fileName();
        }
        else
        {
            qDebug()<<"Unable to open file";
            qDebug()<<_listOfFiles;
            _listOfFiles.removeAt(0);
            emit fileSent();
        }
    }
    else
    {
        qDebug()<<"All files processed";
    }
}
I think you're re-using a QFile that's already open, and this might be problematic.
The call to reset() is pointless - you've just opened the file, it is reset by definition.
You have not provided a backtrace showing where exactly your code crashes. I cannot reproduce your problem - I have a 16GB sparse file that I can open, read from, and close successfully, on both Qt 4.8 and Qt 5.2, on both Windows 7 and OS X.
If you write a minimal test case for this (a stand-alone application that does nothing but open the file, read a few bytes from it, and close it), you'll likely find that it doesn't crash - the problem is elsewhere in your code.
Related
So I have ffmpeg writing its progress to a text file, and I need to read the new values (lines) from the said file. How should I approach this using Qt classes in order to minimize the amount of code I have to write?
I don't even have an idea where to start, other than doing ugly things like seeking to the end, storing this pos, then seeking to the end again a bit later and comparing the new pos to the previous one. It's unclear to me if QTextStream can be used here or not, for instance.
Some time ago I used the Win32 API's own interface for file system notifications, and it worked 100% reliably. Modern OSes provide notifications for file changes, and Qt incorporates such functionality as well. Specifically, to track file changes I would use the QFileSystemWatcher::fileChanged signal to trigger a slot such as myFileReadNextBuffer() only when the file has changed. You would still need to work out how many bytes were added by subtracting the previous file length from the new one. There is also a related question here: How to know when and which files are changed in windows filesystem with winapi.
If the file is only growing:
Whether the file is text-based or not, I would open it in shared mode, read to the end, and then read on to the new end each time a notification is received.
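The "remember the previous length, read only what was added" idea can be sketched as follows. This is a minimal illustration that uses the C++ standard library rather than Qt so that it stands alone; readAppended and its offset parameter are hypothetical names, and it assumes the file only ever grows. In a Qt program the same logic would presumably live in the slot connected to QFileSystemWatcher::fileChanged.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the bytes appended to `path` since `offset`
// and advances `offset` past them. Returns an empty string if nothing is new
// or the file cannot be opened. Truncation of the file is not handled.
std::string readAppended(const std::string& path, std::streamoff& offset)
{
    assert(offset >= 0);
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return {};

    // Find the current length of the file.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= offset)
        return {};

    // Read only the part between the remembered offset and the new end.
    std::string fresh(static_cast<std::size_t>(size - offset), '\0');
    in.seekg(offset, std::ios::beg);
    in.read(&fresh[0], static_cast<std::streamsize>(fresh.size()));
    fresh.resize(static_cast<std::size_t>(in.gcount()));
    offset += static_cast<std::streamoff>(fresh.size());
    return fresh;
}
```

The caller keeps one offset per watched file, starting at 0, and calls the helper on every change notification. Because the writer may be in the middle of a line, the returned text can end with a partial line that has to be buffered until the next call.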
I need very simple text file logging. I'll only append lines to the file, never change or delete existing ones. If it were an XML file, it would be easier to bind it to grids for viewing, but the question applies to both text files and XML files, as both live in the file system.
On a web server there will be file locking while appending log entries, and maybe also while reading them, so this method has to be thread safe. Multiple instances can write data to the file at the same moment.
I know there are some third party tools like serilog etc but I want to know:
How can I append (not change) lines to a text file (or XML file) without worrying about file locks?
If I read the XML file into a dataset, add a new row to it and save it as XML, I would lose entries made by other instances.
If I open a text file with a StreamWriter and append a line to it, other instances would get a lock error.
If I fetch the list of logs from the admin panel, the file will again be locked and instances won't be able to append logs.
Any ideas?
After long hours of research and experiments, I found that using NLog is the best option for me. The most important thing is that the people who use it are very happy with it. I created a small example page that writes a log entry every time it is called, and tested it with a multithreaded application that calls this sample page again and again. It was fast enough that I could not follow the thread counters, and no problem has come up so far.
So, I'll stick with NLog.
Best.
I am new to Qt, and I am learning from its Getting Started page. I want to know what the following statements mean and why they are required.
In Open function:
if (!file.open(QIODevice::ReadOnly)) {
    QMessageBox::critical(this, tr("Error"), tr("Could not open file"));
    return;
}
Also in Save function:
if (!file.open(QIODevice::WriteOnly)) {
    // error message
}
I was unable to run these functions without these lines. I tried reading about error handling in the documentation but couldn't find out exactly what these statements mean.
You can open files for reading and for writing. With the QIODevice::WriteOnly or QIODevice::ReadOnly flag you specify the mode in which a particular file is opened.
But, why does it matter?
Suppose you have one file opened in several instances of different programs, and that there is no such thing as specifying a file mode. If all the programs are only reading the file, this is not a problem: each has its own pointer to its current position in the file, so every program gets the latest and correct information. But if even one program writes something into the file, your data becomes inconsistent, so the other programs may read wrong data.
An intuitive approach would be to send a message to all programs attached to this file so they could update themselves. But what should happen if the file is deleted? Or if there is no way to set the proper position in the new data? Also, every program would need an interface for being notified, and the whole message-passing idea can be very slow (besides not working).
So a simple convention was adopted: multiple programs can open a file for reading, since they will all see the same, consistent data. But if even one program signals to the operating system that it wants write access, the file must not be open in any other program, neither for reading nor for writing. Depending on the implementation, the operating system may block the caller until all other handles are closed, or it can reject the call and return an error to the caller. The latter is often the better idea, as the program (or the user) can wait and try again later, ask the user to save to another destination, or show an unfriendly error message; either way it will not be able to write to the file.
The last paragraph describes what is known as the multiple-readers, single-writer technique, so you may want to look it up online or in a textbook on concurrency.
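Whatever the reason an open fails (a missing file, missing permissions, or another program holding the file), the check in the question is what stops the rest of the function from working on a file that was never opened. A minimal sketch of the same pattern, using the C++ standard library instead of QFile so that it stands alone; loadText is a hypothetical helper name:

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: loads a whole (small) file into `out`.
// Returns false if the file could not be opened; `out` is then left untouched.
bool loadText(const std::string& path, std::string& out)
{
    assert(!path.empty());
    std::ifstream in(path, std::ios::binary);   // open for reading only
    if (!in)                                    // plays the role of `if (!file.open(QIODevice::ReadOnly))`
        return false;                           // report the failure instead of reading garbage
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}
```

The caller decides what to do on failure (show a message box, retry later, pick another file), just as the Getting Started code shows an error dialog and returns.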
How do I read a file bigger than 600 MB in Qt?
I am trying to read a file using file.readAll(). It works for small files, but it gives a bad_alloc error for large files. What should I do?
Try adding
QMAKE_LFLAGS += -Wl,--large-address-aware
to your Qt .pro file. From what I understand, it allows a 32-bit process to address more than 2 GB of memory.
Don't do it.
It's rarely necessary to load a huge file into memory in one operation.
You can't be loading this much information for user navigation or manipulation, so if, as I suspect, you are simply acting as an intermediary between the file on disk and some other destination, then use a mechanism that treats the QFile as a QIODevice instead of loading it completely into a QString or QByteArray.
If you (or your customers) are on Windows using a 32-bit system but are likely to have more than 2 GB of RAM at your disposal, you might want to be aware of the /LARGEADDRESSAWARE linker option, which allows you to use addresses above 2 GB and may also improve your situation if you truly cannot avoid loading the file into memory.
Consider reading the file in chunks instead of all at once. Of course, your goal might be to display the entire file in a text editor, in which case loading it partially is more complicated. You're being very vague, so it's hard to be more specific.
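Reading in chunks could look roughly like the sketch below. It uses the C++ standard library rather than Qt so that it stands alone; forEachChunk and its callback signature are hypothetical. With a QFile the loop would be the same, calling read() with a fixed buffer size (or readLine() for line-oriented logs) in place of readAll().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: feeds `path` to `handle` in pieces of at most
// `chunkSize` bytes, so memory use stays bounded regardless of file size.
// Returns the total number of bytes read, or -1 if the file cannot be opened.
std::int64_t forEachChunk(const std::string& path, std::size_t chunkSize,
                          const std::function<void(const char*, std::size_t)>& handle)
{
    assert(chunkSize > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return -1;

    std::vector<char> buffer(chunkSize);
    std::int64_t total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();   // may be short on the last chunk
        if (got <= 0)
            break;
        handle(buffer.data(), static_cast<std::size_t>(got));
        total += got;
    }
    return total;
}
```

A chunk boundary can fall in the middle of a line, so a line-based consumer has to carry the unfinished tail over to the next call. A buffer size somewhere in the range of tens of kilobytes to a few megabytes is a common starting point, though the best value depends on the workload.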
I have a Qt desktop application that works on Linux and Windows. At some point I'm hoping to port it to MacOS X and other *nix systems too.
My problem is that part of the application lets users drag and drop files into and out of an archive. The UI is kind of like that of WinZip or similar GUI-based archivers. But Qt's drag and drop system wants the data to be prepared when the user starts dragging files from the archive.
What I currently do is extract the dragged files to a temporary location and supply those file names as data. But this is undesirable because extracting deep directory trees takes a good amount of time, and it causes the GUI to freeze during that time. It would be nice if I could do that operation when the user decides to drop the files, not when he/she starts to drag. Unfortunately Qt docs don't say anything about this.
I know how to achieve this using Windows API, and I'm pretty sure that most systems have a way to do that too. But I want to avoid writing platform specific code as much as possible.
Is there a Qt way to achieve that? Am I missing something?
I may be misunderstanding, but I think what you want to do is supply enough information in a QMimeData for the QDrag you create that you can find the files in the archive after the user drops it without having to extract them first. So, if your code on the drop end doesn't know which archive the files have come from, you need to supply the path to the archive in your mime data, too.
The drag starts as a message:
"I'm dragging Archive1:FileA, Archive1:FileB" but no file extraction.
It ends on the other end by interpreting the message and then extracting the files. I'd probably set up some kind of simple ICD for both sides of the message transfer. If you can only drag from one archive at a time, maybe a string list with the first element being the archive and following ones being the files:
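QStringList list;
list << archivePath;   // first element: the archive being dragged from
list << fileName1;     // following elements: files inside that archive
list << fileName2;
QByteArray ba;
// A QDataStream that writes into a QByteArray needs an explicit open mode.
QDataStream stream(&ba, QIODevice::WriteOnly);
stream << list;
QMimeData* mime = new QMimeData;
mime->setData("yourType", ba);   // "yourType" is a placeholder MIME type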
I hope this helps!