Peek on QTextStream - qt

I would like to peek at the next characters of a QTextStream reading a QFile, in order to write an efficient tokenizer.
However, I can't find a satisfactory way to do so.
QFile f("test.txt");
f.open(QIODevice::WriteOnly);
f.write("Hello world\nHello universe\n");
f.close();
f.open(QIODevice::ReadOnly);
QTextStream s(&f);
int i = 0;
while (!s.atEnd()) {
++i;
qDebug() << "Peek" << i << s.device()->peek(3);
QString v;
s >> v;
qDebug() << "Word" << i << v;
}
Gives the following output:
Peek 1 "Hel" # it works only the first time
Word 1 "Hello"
Peek 2 ""
Word 2 "world"
Peek 3 ""
Word 3 "Hello"
Peek 4 ""
Word 4 "universe"
Peek 5 ""
Word 5 ""
I tried several implementations, including ones based on QTextStream::pos() and QTextStream::seek(). That works better, but pos() is buggy (it returns -1 when the file is too big).
Does anyone have a solution to this recurring problem? Thank you in advance.

You peek from the QIODevice but then read from the QTextStream, which is why peek works only once: the stream has already pulled the data out of the device. Try this:
while (!s.atEnd()) {
++i;
qDebug() << "Peek" << i << s.device()->peek(3);
QByteArray v = s.device()->readLine();
qDebug() << "Word" << i << v;
}
Unfortunately, QIODevice does not support reading single words, so you would have to do it yourself with a combination of peek and read.

Try disabling QTextStream's Unicode auto-detection with setAutoDetectUnicode(false). The detection may read ahead on the device and cause your problem.
Also set a codec, just in case.
Log s.device()->pos() and s.device()->bytesAvailable() to verify that.
I've checked the QTextStream code. It looks like it always caches as much data as possible, and there is no way to disable this behavior. I expected it to use peek on the device, but it only reads greedily. The bottom line is that you can't use QTextStream and peek on the device at the same time.
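Since the stream and the device cannot share a read position, one way out is to own the lookahead buffer yourself and route both peeking and reading through it. The following is only a minimal sketch of that idea using the C++ standard library instead of Qt; PeekableReader, peek, readWord and atEnd are hypothetical names, not part of QTextStream or QIODevice, and the sketch works on bytes and ignores text encoding.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not a Qt class): it owns the lookahead buffer, so
// peeking and reading always see the same position in the input.
class PeekableReader {
public:
    explicit PeekableReader(std::istream &in) : in_(in) {}

    // Return up to n upcoming characters without consuming them.
    std::string peek(std::size_t n) {
        fill(n);
        const std::size_t count = n < buffer_.size() ? n : buffer_.size();
        return std::string(buffer_.begin(),
                           buffer_.begin() + static_cast<std::ptrdiff_t>(count));
    }

    // Consume and return the next whitespace-delimited word ("" at end of input).
    std::string readWord() {
        while (!atEnd() && isSpace(buffer_.front()))
            buffer_.pop_front();
        std::string word;
        while (!atEnd() && !isSpace(buffer_.front())) {
            word.push_back(buffer_.front());
            buffer_.pop_front();
        }
        return word;
    }

    // True once neither the buffer nor the underlying stream has data left.
    bool atEnd() {
        fill(1);
        return buffer_.empty();
    }

private:
    static bool isSpace(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }

    // Pull characters from the stream until the buffer holds n of them or input ends.
    void fill(std::size_t n) {
        char c;
        while (buffer_.size() < n && in_.get(c))
            buffer_.push_back(c);
    }

    std::istream &in_;
    std::deque<char> buffer_;
};

// Usage sketch: peek, then read, on the same reader.
inline void peekThenReadExample() {
    std::istringstream in("Hello world\nHello universe\n");
    PeekableReader reader(in);
    assert(reader.peek(3) == "Hel");
    assert(reader.readWord() == "Hello");
    assert(reader.peek(3) == " wo");  // peeking still works after a read
}
```

The same structure should carry over to Qt by filling the buffer from the device (or from a QString read once up front), but that part is left as an assumption here.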

Related

QRegularExpression: how to get the failing position?

I guess this must have been asked before, but I cannot find anything about it.
Maybe the answer is right there and I just can't see it.
So, if QRegularExpression::match() finds no match, how do I know the position of the character that made the match fail?
I'm pretty sure that internally there is some variable storing the "current position" as the string is being evaluated.
Yes, there may be backtracking during that evaluation, so if the exact failing character is hard to get, at least the last good one could be easier.
Any hints? Thank you.
Edit (2022-08-08):
I'm starting to think that maybe no one has asked this before after all, given that people read my question as "why does my regex not work". That is not my case.
This is not about a particular regular expression. It's about Qt's class QRegularExpression.
I apologize if I have not been clear. I have tried to explain it as well as I could from the very beginning.
Anyway, let's say you have a string to be evaluated against some (ANY) regex. No match is found. Then I want to know, if possible, the point where the evaluation failed.
This regex: "abc"
This string: "abd", failing position: 2
This regex: "abc"
This string: "acb", failing position: 1
This regex: "abc"
This string: "xyz", failing position: 0
I feel very stupid asking this, mostly because I think it's a very basic question.
But it's not what you might think at first glance. I swear I searched for answers as thoroughly as I could, but everything I found was about errors in the regexes themselves.
I hate this, but it works.
int getFailingPosition(QString sRegEx, QString sText) {
    int iResult;
    QRegularExpression rxRegEx;
    QRegularExpressionMatch rxmMatch;
    // Anchor the pattern so it has to cover the whole (shortened) text.
    rxRegEx.setPattern(QRegularExpression::anchoredPattern(sRegEx));
    // Chop one character at a time until the remaining prefix matches
    // completely or at least partially; its length is the failing position.
    for (iResult = sText.length(); iResult > 0; iResult--) {
        rxmMatch = rxRegEx.match(sText);
        if (rxmMatch.hasMatch())
            break;
        else {
            rxmMatch = rxRegEx.match(
                sText,
                0,
                QRegularExpression::MatchType::PartialPreferCompleteMatch
            );
            if (rxmMatch.hasPartialMatch())
                break;
        }
        sText.chop(1);
    }
    return iResult;
}
Tests:
#define REGEX_USA_ZIPCODE "\\d{4}?\\d$|^\\d{4}?\\d-\\d{4}"
#define REGEX_SIGNED_NUMBER "[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"
#define REGEX_ISO8601_DATE "\\d{4}-(0[1-9]|1[012])-(0[1-9]|[12]\\d|3[0-1])"
#define REGEX_USA_PHONE "\\(?\\d{1,3}?\\)?[-.\\s]?\\d{1,4}[-.\\s]?\\d{1,4}[-.\\s]?\\d{1,9}"
qDebug() << getFailingPosition("abc","abcd"); // 3
qDebug() << getFailingPosition("abc","abd"); // 2
qDebug() << getFailingPosition("abc","acb"); // 1
qDebug() << getFailingPosition("abc","xyz"); // 0
qDebug() << getFailingPosition("abc","x"); // 0
qDebug() << getFailingPosition("abc",""); // 0
qDebug() << getFailingPosition("abc","a"); // 1
qDebug() << getFailingPosition("abc","ab"); // 2
qDebug() << getFailingPosition(REGEX_USA_ZIPCODE,"12345-1"); // 7 (missing chars)
qDebug() << getFailingPosition(REGEX_SIGNED_NUMBER,"-0.123e"); // 7 (missing chars)
qDebug() << getFailingPosition(REGEX_ISO8601_DATE,"2021-23-31"); // 5 (unexpected char)
qDebug() << getFailingPosition(REGEX_USA_PHONE,"202-3(24)-3000"); // 5 (unexpected char)
getFailingPosition() should be called only once we're sure there is no match; otherwise it returns the string length, giving the wrong impression that something is missing.
This really should be a built-in function...

QString remove last characters

How do I remove /Job from /home/admin/job0/Job?
QString name = "/home/admin/job0/Job";
I want to remove everything from the last "/" onward.
You have QString::chop() for the case when you already know how many characters to remove.
It does the same as QString::remove(), except that it works from the end of the string.
Find the last slash with QString::lastIndexOf().
Then take the substring up to the position of that last slash with QString::left():
QString name = "/home/admin/job0/Job";
int pos = name.lastIndexOf(QChar('/'));
qDebug() << name.left(pos);
This will print:
"/home/admin/job0"
You should check pos for -1 to be sure that a slash was found at all.
To include the last slash in the output, add 1 to the position found:
qDebug() << name.left(pos+1);
Will output:
"/home/admin/job0/"
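The "find the last slash, check for not found, keep the left part" steps can be sketched as a small function. This version uses std::string rather than QString so that it is self-contained; stripLastComponent is a hypothetical helper name, and std::string::npos plays the role of the -1 that QString::lastIndexOf() returns.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop everything from the last '/' onward.
// If the input contains no '/', it is returned unchanged.
std::string stripLastComponent(const std::string &path) {
    const std::string::size_type pos = path.rfind('/');
    if (pos == std::string::npos)
        return path;              // the "not found" case that must be checked
    return path.substr(0, pos);   // use pos + 1 to keep the trailing slash
}

// Usage sketch.
inline void stripLastComponentExample() {
    assert(stripLastComponent("/home/admin/job0/Job") == "/home/admin/job0");
    assert(stripLastComponent("Job") == "Job");
}
```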
Probably the easiest version for later readers to understand is:
QString s("/home/admin/job0/Job");
s.truncate(s.lastIndexOf(QChar('/')));
qDebug() << s;
as the code literally says what you intended.
You can do something like this:
QString s("/home/admin/job0/Job");
s.remove(QRegularExpression("\\/(?:.(?!\\/))+$"));
// s is "/home/admin/job0" now
If you are using Qt 6 or later and are sure that the string contains "/", you can use QString::first(qsizetype n) const instead of QString::left(qsizetype n) const.
Example:
QString url = "/home/admin/job0/Job";
QString result = url.first(url.lastIndexOf(QChar('/')));
If you run this code:
QElapsedTimer timer;
timer.start();
for (int j=0; j<10000000; j++)
{
QString name = "/home/admin/job0/Job";
int pos = name.lastIndexOf("/");
name.left(pos);
}
qDebug() << "left method" << timer.elapsed() << "milliseconds";
timer.start();
for (int j=0; j<10000000; j++)
{
QString name = "/home/admin/job0/Job";
int pos = name.lastIndexOf(QChar('/'));
name.first(pos);
}
qDebug() << "first method" << timer.elapsed() << "milliseconds";
Results:
left method 10034 milliseconds
first method 8098 milliseconds
Sorry for replying to this post after 4 years, but I have (I think) the most efficient answer.
You can use
qstr.remove(0, 1); // removes the first character
qstr.chop(1); // removes the last character
That's everything you have to do to delete characters ONE BY ONE (first or last) from a QString.

QRegExp does not match even though regex101.com does

I need to extract some data from string with simple syntax. The syntax is this:
_IMPORT:[any text] - [HEX number] #[decimal number]
Therefore I created regex you can see below in the code:
//SYNTAX: _IMPORT:%1 - %2 #%3
static const QRegExp matchImportLink("^_IMPORT:(.*?) - ([A-Fa-f0-9]+) #([0-9]+)$");
QRegExp importLink(matchImportLink);
QString qtWtf(importLink.pattern());
const int index = importLink.indexIn(mappingName);
qDebug()<< "Input string: "<<mappingName;
qDebug()<< "Regular expression:"<<qtWtf;
qDebug()<< "Result: "<< index;
For some reason, that does not work, I get this output:
Input string: "_IMPORT:ddd - 92806f0f96a6dea91c37244128f7d00f #0"
Regular expression: "^_IMPORT:(.*?) - ([A-Fa-f0-9]+) #([0-9]+)$"
Result: -1
I even tried removing the anchors ^ and $, but that didn't help and is also undesired. The annoying thing is that this regexp works perfectly if I paste the output into regex101.com, as you can see here: https://regex101.com/r/oT6cY3/1
Can anyone explain what is wrong here? Did I stumble upon a Qt bug? I use Qt 5.6. Is there any workaround for this?
It seems that Qt does not recognize the quantifier *? as valid. Check your pattern with QRegExp::isValid(). In my case the pattern failed for exactly this reason, and the documentation says that an invalid pattern will never match.
So the first thing I tried was dropping the ?, which still matches your string with all capturing groups. Here is my code.
QString str("_IMPORT:ddd - 92806f0f96a6dea91c37244128f7d00f #0");
QRegExp exp("^_IMPORT:(.*) - ([A-Fa-f0-9]+) #([0-9]+)$");
qDebug() << "pattern:" << exp.pattern();
qDebug() << "valid:" << exp.isValid();
int pos = 0;
while ((pos = exp.indexIn(str, pos)) != -1) {
for (int i = 1; i <= exp.captureCount(); ++i)
qDebug() << "pos:" << pos << "len:" << exp.matchedLength() << "val:" << exp.cap(i);
pos += exp.matchedLength();
}
And here is the resulting output.
pattern: "^_IMPORT:(.*) - ([A-Fa-f0-9]+) #([0-9]+)$"
valid: true
pos: 0 len: 49 val: "ddd"
pos: 0 len: 49 val: "92806f0f96a6dea91c37244128f7d00f"
pos: 0 len: 49 val: "0"
Tested using Qt 5.6.1.
Also note that you can switch the whole pattern to non-greedy (minimal) matching with QRegExp::setMinimal(bool).
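As a cross-check that the original pattern, lazy quantifier included, is fine in an engine that supports per-quantifier laziness, here is a sketch that uses std::regex (ECMAScript grammar) instead of QRegExp. ImportLink and parseImportLink are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type for the "_IMPORT:%1 - %2 #%3" syntax.
struct ImportLink {
    bool ok;
    std::string text;
    std::string hex;
    std::string number;
};

// Parse a mapping name with the question's original pattern, "(.*?)" included.
ImportLink parseImportLink(const std::string &s) {
    static const std::regex re("^_IMPORT:(.*?) - ([A-Fa-f0-9]+) #([0-9]+)$");
    std::smatch m;
    if (!std::regex_match(s, m, re))
        return {false, "", "", ""};
    return {true, m[1].str(), m[2].str(), m[3].str()};
}

// Usage sketch with the input string from the question.
inline void parseImportLinkExample() {
    const ImportLink link =
        parseImportLink("_IMPORT:ddd - 92806f0f96a6dea91c37244128f7d00f #0");
    assert(link.ok);
    assert(link.text == "ddd");
}
```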

Qt QVariant toList not working

I have a Qt (4.7) program that takes a QByteArray and should break it into a list of QVariants, after using a parser to transform it into a QVariant. The problems seem to arise when I try to use the toList() function. I have something similar to this:
QVariant var = //whatever the value passed in is...
std::cout << "Data = " << var.toString().toStdString() << std::endl;
QList<QVariant> varlist = var.toList();
std::cout << "List Size = " << varlist.size() << std::endl;
which would return this:
Data = variant1 variant2 variant3
List Size = 0
where the size should clearly be 3. Does anyone have an idea what I may be doing wrong? thanks!
The documentation of toList() says:
Returns the variant as a QVariantList if the variant has userType() QMetaType::QVariantList or QMetaType::QStringList; otherwise returns an empty list.
My guess is, your variant's userType() is neither of those two.
You probably need to construct your variant differently, e.g.
QVariantList list;
list << variant1 << variant2 << variant3;
QVariant var = list;
So, I have no idea why, but when I put the command I specified above into a separate function, i.e. QList<QVariant> myClass::ToList(QVariant v){return v.toList();}, and then call varlist = myClass::ToList(v), it works. It still doesn't work the original way, but this way it's fine. I guess I'll just chalk it up to one of the quirks of Qt...

quint16 on qbytearray

I need to put a quint16 at the first position of a QByteArray and read it back afterwards. How can I do that?
I have tried this:
quint16 pos = 0;
QFile file(m_pathFile);
if (file.open(QFile::ReadOnly))
{
qDebug() << "the file exists";
m_udpSocket->bind(m_port);
QByteArray datagram;
while (!file.atEnd())
{
datagram.begin();
datagram.append(pos++);
datagram = file.read(m_blockSize);
qDebug() << "Sec" << datagram.at(0);
}
}
Thank you very much.
I managed to add it with:
datagram.begin();
datagram.setNum(pos, 10);
datagram.append(file.read(m_blockSize));
but I don't know how to read it back.
Thanks
OK, first of all, that call to datagram.begin() does nothing useful: it returns an iterator that you never use. If you want to insert a number at the first position of a QByteArray, you can do something like:
datagram.insert(0, QByteArray::number(pos++));
To read it, the simplest way is to use a QTextStream like this:
QTextStream str(datagram);
quint16 num;
str >> num;
Also, take a look at the docs before posting, because the Qt ones are really simple and helpful if you know how to search (and it's not that difficult, trust me).
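One caveat with the text prefix above: without a delimiter, the number runs into the payload whenever the payload itself starts with a digit (prefix "12" followed by data "3abc" reads back as 123). A fixed-width binary prefix avoids that. The sketch below shows the idea with the C++ standard library only; withSequence and readSequence are hypothetical names, and the big-endian byte order is an assumption. In Qt, QDataStream would be the natural equivalent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Prepend seq as exactly two bytes, most significant byte first (big-endian).
std::vector<std::uint8_t> withSequence(std::uint16_t seq,
                                       const std::vector<std::uint8_t> &payload) {
    std::vector<std::uint8_t> out;
    out.reserve(payload.size() + 2);
    out.push_back(static_cast<std::uint8_t>(seq >> 8));
    out.push_back(static_cast<std::uint8_t>(seq & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Read the two-byte prefix back; returns false if the datagram is too short.
bool readSequence(const std::vector<std::uint8_t> &datagram, std::uint16_t &seq) {
    if (datagram.size() < 2)
        return false;
    seq = static_cast<std::uint16_t>((datagram[0] << 8) | datagram[1]);
    return true;
}

// Usage sketch: the payload starts with the digit '3', yet the prefix survives intact.
inline void sequencePrefixExample() {
    const std::vector<std::uint8_t> payload{0x33, 0x61};  // "3a"
    const std::vector<std::uint8_t> datagram = withSequence(12, payload);
    std::uint16_t seq = 0;
    assert(readSequence(datagram, seq));
    assert(seq == 12);
}
```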
