Copying data from the InputStream returned by urlConnection.getInputStream() into a ByteArrayOutputStream takes more than 1 minute.
CODE SNIPPET
URL requestUrl= new URL(_sampleurl_);
HttpURLConnection urlConnection=(HttpURLConnection)requestUrl.openConnection();
urlConnection.setConnectTimeout(10000);
urlConnection.setReadTimeout(10000);
urlConnection.setRequestMethod("GET");
urlConnection.connect();
int statusCode=urlConnection.getResponseCode(); //200
long contentLengthByTen = urlConnection.getHeaderFieldLong("Content-Length", _defaultSize_); //7631029
InputStream inputStream = urlConnection.getInputStream();
final byte[] byteArray = new byte[16384];
int length;
ByteArrayOutputStream byteArrOutStrm = new ByteArrayOutputStream();
int k = 0;
while ((length = inputStream.read(byteArray)) != -1)
{
byteArrOutStrm.write(byteArray, 0, length);
k++;
}
Some of the observations are:
The while loop alone runs for more than one minute and iterates around 2650 times.
The HttpURLConnection response code is 200, so the entire content is available in the InputStream.
The Content-Length of the file is 7631029 (bytes).
I have two questions:
Although the byte array size is 16384 and the status code is 200, the inputStream.read method reads only about 2800 bytes on average. Why, and which factors determine this number?
What is the proper way to reduce the processing time?
Although the byte array size is 16384 and the status code is 200, the inputStream.read method reads only about 2800 bytes on average. Why, and which factors determine this number?
The number of bytes available in the socket receive buffer, which in turn is a function of the speed of the sender, the frequency of reads at the receiver, the network bandwidth, and the path MTU. What you're seeing isn't surprising: it indicates that you're keeping up with the sender pretty well.
What is the proper way to reduce the processing time?
Have the sender send faster. There is nothing wrong with your receiving code, although I wonder why you're collecting the entire response in a ByteArrayOutputStream before doing anything with it. This just wastes memory and adds latency.
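As a rough illustration of both points, here is a minimal sketch. TrickleInputStream is a hypothetical stand-in for a socket stream that never returns more than a fixed number of bytes per read, and copy is an assumed helper that forwards each chunk to its destination as soon as it arrives instead of collecting the whole response first. Neither name comes from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopySketch {

    /** Hypothetical stand-in for a socket stream: returns at most maxPerRead bytes per read. */
    static final class TrickleInputStream extends InputStream {
        private final InputStream source;
        private final int maxPerRead;

        TrickleInputStream(InputStream source, int maxPerRead) {
            this.source = source;
            this.maxPerRead = maxPerRead;
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Like a socket, hand out only what is "available", regardless of the buffer size.
            return source.read(b, off, Math.min(len, maxPerRead));
        }
    }

    /** Copies in to out chunk by chunk; returns {totalBytes, numberOfReads}. */
    static long[] copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[16384];
        long total = 0;
        long reads = 0;
        int length;
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length); // forward the chunk immediately
            total += length;
            reads++;
        }
        return new long[] {total, reads};
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[28000];
        InputStream in = new TrickleInputStream(new ByteArrayInputStream(body), 2800);
        // The destination is a ByteArrayOutputStream only to keep the demo self-contained;
        // in practice it could be a file or whatever consumes the data.
        ByteArrayOutputStream destination = new ByteArrayOutputStream();
        long[] result = copy(in, destination);
        System.out.println(result[0] + " bytes in " + result[1] + " reads");
    }
}
```

The 16384-byte buffer is only an upper bound per read. How much each read actually returns is decided by what has arrived, which is why a larger buffer does not speed up the loop.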
Consider the following scenario:
The sender has sent 10000 contiguous bytes.
The receiver has ACKed bytes 0 - 8000 (the sender has received those ACKs).
Now the sender will send a reset frame. The question is how to set the "final size":
a. final size is 10000
b. final size is 8000
Looking at RFC 9000 (https://datatracker.ietf.org/doc/html/rfc9000#section-4.5), which says:
4.5. Stream Final Size
The final size is the amount of flow control credit that is
consumed by a stream. Assuming that every contiguous byte on the
stream was sent once, the final size is the number of bytes sent.
More generally, this is one higher than the offset of the byte with
the largest offset sent on the stream, or zero if no bytes were
sent.
I think the final size should be 10000, and the sender must not send anything more (neither transmission nor retransmission) on the identified stream. Am I right?
https://datatracker.ietf.org/doc/html/rfc9000#section-19.4
After sending a RESET_STREAM, an endpoint ceases transmission and
retransmission of STREAM frames on the identified stream. A receiver
of RESET_STREAM can discard any data that it already received on that
stream.
Also, nginx's QUIC implementation may have a problem here: nginx will retransmit frames after it has sent a reset frame (for example: nginx HTTP/3 has sent all data to ctx->frames, and then qs is deleted).
void
ngx_quic_resend_frames(ngx_connection_t *c, ngx_quic_send_ctx_t *ctx)
{
case NGX_QUIC_FT_STREAM:
qs = ngx_quic_find_stream(&qc->streams.tree, f->u.stream.stream_id);
if (qs) {
if (qs->send_state == NGX_QUIC_STREAM_SEND_RESET_SENT
|| qs->send_state == NGX_QUIC_STREAM_SEND_RESET_RECVD)
{
ngx_quic_free_frame(c, f);
break;
}
}
}
The final size is definitely 10000: flow control credit is consumed by sending data on a stream, whether or not that data has been acknowledged.
Indeed, after sending the RESET_STREAM frame:
no new data should be sent (this would violate the final size, leading to a fatal error as described in section 4.5)
no retransmissions should be performed
The third question is unclear to me. I'm not an nginx-quic expert, but from the code you posted I would guess that it does not retransmit, as the frame is freed ;-)
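To make the final-size rule concrete, here is a minimal sketch of sender-side bookkeeping. The class FinalSizeSketch and its methods are hypothetical and are not taken from nginx or any other QUIC stack. It only illustrates that the value placed in RESET_STREAM follows from what was sent, not from what was acknowledged.

```java
final class FinalSizeSketch {
    private long highestOffsetPlusOne = 0; // one past the largest offset ever sent
    private long ackedUpTo = 0;            // tracked for loss recovery, irrelevant for final size

    /** Records that bytes [offset, offset + length) were sent in a STREAM frame. */
    void onSent(long offset, long length) {
        if (length > 0) {
            highestOffsetPlusOne = Math.max(highestOffsetPlusOne, offset + length);
        }
    }

    /** Records that all bytes below upTo have been acknowledged. */
    void onAcked(long upTo) {
        ackedUpTo = Math.max(ackedUpTo, upTo);
    }

    long ackedUpTo() {
        return ackedUpTo;
    }

    /** Final size for RESET_STREAM: zero if nothing was sent, else largest offset sent plus one. */
    long finalSize() {
        return highestOffsetPlusOne;
    }

    public static void main(String[] args) {
        FinalSizeSketch stream = new FinalSizeSketch();
        for (long offset = 0; offset < 10000; offset += 1000) {
            stream.onSent(offset, 1000); // ten frames of 1000 bytes each
        }
        stream.onAcked(8000);
        System.out.println("acked=" + stream.ackedUpTo() + " finalSize=" + stream.finalSize());
    }
}
```

For the scenario in the question this yields a final size of 10000 even though only 8000 bytes were acknowledged.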
I have a TCP client in Rust, which should communicate with a Java server. I got the basics working and can send byte arrays between them.
But for the byte array buffer, I need to know the length of the byte array, and I don't know how I should obtain it. At the moment, I only have a fixed size for the buffer.
My Rust code looks like this:
loop {
let mut buffer = vec![0; 12]; //fixed buffer length
let n = stream.read(&mut buffer).await;
let text = from_utf8(&buffer).unwrap();
println!("{}", text);
}
In Java, you can send the size of the buffer directly as an Integer with DataInputStream. Is there any option to do that in Rust?
For example, this is how I'm doing it in Java:
public String readMsg(Socket socket) throws IOException {
DataInputStream in = new DataInputStream(new BufferedInputStream(socket.getInputStream()));
byte[] bytes = new byte[in.readInt()]; //dynamic buffer length
in.readFully(bytes);
return new String(bytes, StandardCharsets.US_ASCII);
}
What you want to know is a property of the protocol that you are using. It's not a property of the programming language you use. Based on your Java code it seems like you are using a protocol which sends a 4 byte length field before the message data (signed/unsigned?).
If that is the case you can handle reading the message the same way in Rust:
1. Read the 4 bytes in order to obtain the length information
2. Read the remaining data
3. Deserialize the data
use std::io::{self, Read};

fn read_message(mut stream: impl Read) -> io::Result<String> {
    let mut buffer = [0u8; 4];
    // Read the length information
    stream.read_exact(&mut buffer)?;
    // Deserialize the length (big-endian, which is what Java's DataInputStream.readInt uses)
    let size = u32::from_be_bytes(buffer) as usize;
    // Allocate a buffer for the message
    // Be sure to check against a maximum size before doing this in production
    let mut payload = vec![0u8; size];
    // With an async stream (e.g. tokio), the read_exact calls are awaited instead
    stream.read_exact(&mut payload)?;
    // Convert the buffer into a string
    let text = String::from_utf8(payload)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    println!("{}", text);
    Ok(text)
}
This obviously is only correct if your protocol uses length-prefixed messages with a 4-byte unsigned int prefix. This is something that you need to check.
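For reference, the Java side of such a length-prefixed protocol could look roughly like the sketch below. It assumes the 4-byte big-endian prefix discussed above. writeMsg, the class name, and the MAX_MESSAGE_SIZE limit are assumptions added for illustration. It wraps the stream in a DataInputStream only, because creating a fresh BufferedInputStream for every message can swallow bytes that belong to the next one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class LengthPrefixedSketch {
    static final int MAX_MESSAGE_SIZE = 1 << 20; // assumed upper bound, pick one that fits the protocol

    /** Writes a 4-byte big-endian length followed by the ASCII bytes of msg. */
    static void writeMsg(OutputStream out, String msg) throws IOException {
        byte[] bytes = msg.getBytes(StandardCharsets.US_ASCII);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(bytes.length);
        data.write(bytes);
        data.flush();
    }

    /** Reads one message written by writeMsg; rejects negative or oversized length prefixes. */
    static String readMsg(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int size = data.readInt(); // signed in Java, so a huge unsigned value shows up as negative
        if (size < 0 || size > MAX_MESSAGE_SIZE) {
            throw new IOException("unreasonable message size: " + size);
        }
        byte[] bytes = new byte[size];
        data.readFully(bytes); // blocks until all size bytes have arrived
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMsg(wire, "hello");
        writeMsg(wire, "world!");
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readMsg(in));
        System.out.println(readMsg(in));
    }
}
```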
We have two Qt applications. App1 accepts a connection from App2 through QTcpServer and stores it in an instance of QTcpSocket* tcpSocket. App1 runs a simulation with 30 Hz. For each simulation run, a QByteArray consisting of a few kilobytes is sent using the following code (from the main/GUI thread):
QByteArray block;
/* lines omitted which write data into block */
tcpSocket->write(block, block.size());
tcpSocket->waitForBytesWritten(1);
The receiver socket listens to the QTcpSocket::readDataBlock signal (in main/GUI thread) and prints the corresponding time stamp to the GUI.
When both App1 and App2 run on the same system, the packages are perfectly in sync. However, when App1 and App2 run on different systems connected through a network, App2 is no longer in sync with the simulation in App1. The packages come in much more slowly. Even more surprising (and indicating that our implementation is wrong) is the fact that when we stop the simulation loop, no more packages are received. This surprises us, because we expect from the TCP protocol that all packages will arrive eventually.
We built the TCP logic based on Qt's fortune example. The fortune server, however, is different, because it only sends one package per incoming client. Could someone identify what we have done wrong?
Note: we use MSVC2012 (App1), MSVC2010 (App2) and Qt 5.2.
Edit: By a package I mean the result of a single simulation experiment, which is a bunch of numbers written into QByteArray block. The first bits, however, contain the length of the QByteArray, so that the client can check whether all data has been received. This is the code which is called when the signal QTcpSocket::readDataBlock is emitted:
QDataStream in(tcpSocket);
in.setVersion(QDataStream::Qt_5_2);
if (blockSize == 0) {
if (tcpSocket->bytesAvailable() < (int)sizeof(quint16))
return; // cannot yet read size from data block
in >> blockSize; // read data size for data block
}
// if the whole data block is not yet received, ignore it
if (tcpSocket->bytesAvailable() < blockSize)
return;
// if we get here, the whole object is available to parse
QByteArray object;
in >> object;
blockSize = 0; // reset blockSize for handling the next package
return;
The problem in our implementation was caused by data packages being piled up and incorrect handling of packages which had only arrived partially.
The solution goes in the direction of the answer to "Tcp packets using QTcpSocket". However, that answer could not be applied in a straightforward manner, because we rely on QDataStream instead of plain QByteArray.
The following code (run each time QTcpSocket::readDataBlock is emitted) works for us and shows how a raw series of bytes can be read from QDataStream. Unfortunately it seems that it is not possible to process the data in a clearer way (using operator>>).
QDataStream in(tcpSocket);
in.setVersion(QDataStream::Qt_5_2);
while (tcpSocket->bytesAvailable())
{
if (tcpSocket->bytesAvailable() < (int)(sizeof(quint16) + sizeof(quint8)+ sizeof(quint32)))
return; // cannot yet read size and type info from data block
in >> blockSize;
in >> dataType;
char* temp = new char[4]; // read and ignore quint32 value for serialization of QByteArray in QDataStream
int bufferSize = in.readRawData(temp, 4);
delete[] temp; // allocated with new[], so it must be released with delete[]
temp = NULL;
QByteArray buffer;
int objectSize = blockSize - (sizeof(quint16) + sizeof(quint8)+ sizeof(quint32));
temp = new char[objectSize];
bufferSize = in.readRawData(temp, objectSize);
buffer.append(temp, bufferSize);
delete[] temp; // allocated with new[], so it must be released with delete[]
temp = NULL;
if (buffer.size() == objectSize)
{
//ready for parsing
}
else if (buffer.size() > objectSize)
{
//buffer size larger than expected object size, but still ready for parsing
}
else
{
// buffer size smaller than expected object size
while (buffer.size() < objectSize)
{
tcpSocket->waitForReadyRead();
char* temp = new char[objectSize - buffer.size()];
int bufferSize = in.readRawData(temp, objectSize - buffer.size());
buffer.append(temp, bufferSize);
delete[] temp; // allocated with new[], so it must be released with delete[]
temp = NULL;
}
// now ready for parsing
}
if (dataType == 0)
{
// deserialize object
}
}
Please note that the first three bytes of the expected QDataStream are part of our own protocol: blockSize indicates the number of bytes for a complete single package, and dataType helps with deserializing the binary chunk.
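The underlying idea, handling packages that pile up as well as packages that arrive only partially, can be sketched independently of Qt. The class below is a hypothetical illustration, not a translation of the code above. It assumes a 3-byte header (a big-endian 16-bit blockSize that counts the whole package including the header, followed by an 8-bit dataType) and leaves out the extra 4-byte length that QDataStream writes when serializing a QByteArray.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PackageAssemblerSketch {
    static final int HEADER_SIZE = 3; // 2 bytes blockSize + 1 byte dataType (assumed layout)

    private byte[] pending = new byte[0]; // bytes received so far that do not form a full package

    /** Builds one package; the payload must be small enough for blockSize to fit in 16 bits. */
    static byte[] pack(int dataType, byte[] payload) {
        int blockSize = HEADER_SIZE + payload.length;
        byte[] out = new byte[blockSize];
        out[0] = (byte) (blockSize >>> 8);
        out[1] = (byte) blockSize;
        out[2] = (byte) dataType;
        System.arraycopy(payload, 0, out, HEADER_SIZE, payload.length);
        return out;
    }

    /** Appends newly received bytes and returns the payloads of all packages that are now complete. */
    List<byte[]> feed(byte[] chunk) {
        byte[] joined = Arrays.copyOf(pending, pending.length + chunk.length);
        System.arraycopy(chunk, 0, joined, pending.length, chunk.length);

        List<byte[]> complete = new ArrayList<>();
        int pos = 0;
        while (joined.length - pos >= HEADER_SIZE) {
            int blockSize = ((joined[pos] & 0xFF) << 8) | (joined[pos + 1] & 0xFF);
            if (blockSize < HEADER_SIZE) {
                throw new IllegalStateException("corrupt header, blockSize=" + blockSize);
            }
            if (joined.length - pos < blockSize) {
                break; // partial package: keep the bytes and wait for more
            }
            complete.add(Arrays.copyOfRange(joined, pos + HEADER_SIZE, pos + blockSize));
            pos += blockSize; // continue: several packages may have piled up in one read
        }
        pending = Arrays.copyOfRange(joined, pos, joined.length);
        return complete;
    }

    public static void main(String[] args) {
        PackageAssemblerSketch assembler = new PackageAssemblerSketch();
        byte[] first = pack(0, new byte[] {1, 2, 3});
        System.out.println(assembler.feed(Arrays.copyOfRange(first, 0, 2)).size());
        System.out.println(assembler.feed(Arrays.copyOfRange(first, 2, first.length)).size());
    }
}
```

Because leftover bytes are kept between calls, the receiver never has to block waiting for the rest of a package. It simply returns and continues the next time data arrives.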
Edit
For reducing the latency of sending objects through the TCP connection, disabling packet bunching was very useful:
// disable Nagle's algorithm to avoid delay and bunching of small packages
tcpSocketPosData->setSocketOption(QAbstractSocket::LowDelayOption,1);
In a TcpClient/TcpListener set up, is there any difference from the receiving end point of view between:
// Will sending a prefixed length before the data...
client.GetStream().Write(data, 0, 4); // Int32 payload size = 80000
client.GetStream().Write(data, 0, 80000); // payload
// Appear as 80004 bytes in the stream?
// i.e. there is no end of stream to demarcate the first Write() from the second?
client.GetStream().Write(data, 0, 80004);
// Which means I can potentially read more than 4 bytes on the first read
var read = client.GetStream().Read(buffer, 0, 4082); // read could be any value from 0 to 4082?
I noticed that DataAvailable and the return value of GetStream().Read() do not reliably tell whether more data is on the way. Do I always need to write a Read() loop to read exactly the first 4 bytes?
// Read() loop
var ms = new MemoryStream();
while (ms.Length < 4)
{
    read = client.GetStream().Read(buffer, 0, 4 - (int)ms.Length);
    if (read == 0)
        throw new EndOfStreamException(); // the remote side closed the connection
    ms.Write(buffer, 0, read);
}
The answer seems to be yes: we always have to be responsible for reading the same number of bytes that was sent. In other words, there has to be an application-level protocol to read exactly what was written to the underlying stream, because the stream itself does not know where a message starts or ends.
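The same read-exactly loop, written as a small Java sketch for comparison. readExactly, readLengthPrefixed and the trickle helper are hypothetical names. The 4-byte prefix is assumed to be big-endian here, whereas the byte order in the original .NET code depends on how the Int32 was serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ReadExactlySketch {

    /** Keeps reading until exactly count bytes have arrived, or fails if the stream ends early. */
    static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] result = new byte[count];
        int filled = 0;
        while (filled < count) {
            int read = in.read(result, filled, count - filled);
            if (read == -1) {
                throw new EOFException("stream ended after " + filled + " of " + count + " bytes");
            }
            filled += read;
        }
        return result;
    }

    /** Reads a 4-byte big-endian length (assumed format), then exactly that many payload bytes. */
    static byte[] readLengthPrefixed(InputStream in) throws IOException {
        int size = ByteBuffer.wrap(readExactly(in, 4)).getInt();
        if (size < 0) {
            throw new IOException("negative length prefix: " + size);
        }
        return readExactly(in, size);
    }

    /** Simulates a network stream that returns at most maxPerRead bytes per read call. */
    static InputStream trickle(byte[] data, int maxPerRead) {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() {
                return source.read();
            }

            @Override
            public int read(byte[] b, int off, int len) {
                return source.read(b, off, Math.min(len, maxPerRead));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        byte[] wire = ByteBuffer.allocate(4 + payload.length).putInt(payload.length).put(payload).array();
        // Whether the sender used one write or two makes no difference to what arrives here.
        byte[] received = readLengthPrefixed(trickle(wire, 3));
        System.out.println(new String(received, StandardCharsets.US_ASCII));
    }
}
```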
Using .NET 4.0, IIS 7.5 (Windows Server 2008 R2). I would like to stream out a binary content of about 10 MB. The content is already in a MemoryStream. I wonder if IIS7 automatically chunks the output stream. From the client receiving the stream, is there any difference between these two approaches:
//#1: Output the entire stream in one single chunk
Response.OutputStream.Write(memoryStr.ToArray(), 0, (int) memoryStr.Length);
Response.Flush();
//#2: Output by 4K chunks
byte[] buffer = new byte[4096];
int byteReadCount;
while ((byteReadCount = memoryStr.Read(buffer, 0, buffer.Length)) > 0)
{
Response.OutputStream.Write(buffer, 0, byteReadCount);
Response.Flush();
}
Thanks in advance for any help.
I didn't try your 2nd suggestion passing the original data stream. The memory stream was indeed populated from a Response Stream of a Web Request. Here is the code,
HttpWebRequest webreq = (HttpWebRequest) WebRequest.Create(this._targetUri);
using (HttpWebResponse httpResponse = (HttpWebResponse)webreq.GetResponse())
{
using (Stream responseStream = httpResponse.GetResponseStream())
{
byte[] buffer = new byte[4096];
int byteReadCount = 0;
MemoryStream memoryStr = new MemoryStream(4096);
while ((byteReadCount = responseStream.Read(buffer, 0, buffer.Length)) > 0)
{
memoryStr.Write(buffer, 0, byteReadCount);
}
// ... etc ... //
}
}
Do you think the responseStream can safely be passed to Response.OutputStream.Write()? If yes, can you suggest an economical way of doing so? How do I send a byte array plus the exact stream length to Response.OutputStream.Write()?
The second option is the best one as ToArray will in fact create a copy of the internal array stored in the MemoryStream.
Alternatively, and preferably, you can use memoryStr.GetBuffer(), which returns a reference to this internal array. In this case you need to use the memoryStr.Length property, because the buffer returned by GetBuffer() is in general bigger than the actual stored data (it is allocated chunk by chunk, not byte by byte).
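The difference between copying the data and using the internal buffer together with its logical length can be illustrated with a Java analogy. This is not the .NET API: ExposedBuffer is a hypothetical subclass of java.io.ByteArrayOutputStream whose internalBuffer() and length() play the roles of GetBuffer() and Length, while toByteArray() corresponds to ToArray().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class BufferCopySketch {

    /** Hypothetical subclass exposing the protected internal array and the valid byte count. */
    static final class ExposedBuffer extends ByteArrayOutputStream {
        ExposedBuffer(int capacity) {
            super(capacity);
        }

        byte[] internalBuffer() {
            return buf; // no copy; usually larger than the stored data
        }

        int length() {
            return count; // number of valid bytes at the start of buf
        }
    }

    /** Copying variant: allocates a second array as large as the stored data. */
    static void sendWithCopy(ExposedBuffer memory, OutputStream response) throws IOException {
        byte[] copy = memory.toByteArray();
        response.write(copy, 0, copy.length);
    }

    /** Non-copying variant: writes only the valid prefix of the internal array. */
    static void sendWithoutCopy(ExposedBuffer memory, OutputStream response) throws IOException {
        response.write(memory.internalBuffer(), 0, memory.length());
    }

    public static void main(String[] args) throws IOException {
        ExposedBuffer memory = new ExposedBuffer(16);
        memory.write(new byte[] {1, 2, 3, 4, 5}, 0, 5);
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        sendWithoutCopy(memory, response);
        System.out.println("capacity=" + memory.internalBuffer().length + " sent=" + response.size());
    }
}
```

Passing the internal array's own length instead of length() would send the unused tail of the buffer as well, which is the pitfall described above.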
Note that it would be best to pass the original data as a stream directly to the ASP.NET outputstream, instead of using an intermediary MemoryStream. It depends on how you get your binary data in the first place.
Another option, if you serve the exact same content often, is to save this MemoryStream to a physical file (using a FileStream) and use Response.TransmitFile on all subsequent requests. Response.TransmitFile uses low-level Windows socket layers, and there is hardly a faster way to send a file.