Reading from TcpStream results in empty buffer

I want to read data from a TCP stream but it results in an empty Vec:
extern crate net2;
use net2::TcpBuilder;
use std::io::Read;
use std::io::Write;
use std::io::BufReader;
let tcp = TcpBuilder::new_v4().unwrap();
let mut stream = tcp.connect("127.0.0.1:3306").unwrap();
let mut buf = Vec::with_capacity(1024);
stream.read(&mut buf);
println!("{:?}", buf); // prints []
When I use stream.read_to_end the buffer is filled but this takes way too long.
In Python I can do something like
import socket
TCP_IP = '127.0.0.1'
TCP_PORT = 3306
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
#s.send(MESSAGE)
data = s.recv(BUFFER_SIZE)
s.close()
print "received data:", data
How can I achieve this in Rust?

The two methods you tried don't work for different reasons:
read(): "does not provide any guarantees about whether it blocks waiting for data". In general, a single read() call may return fewer bytes than you asked for, so from a user's perspective it is best treated as a building block for higher-level functions like read_to_end().
But maybe more importantly, you have a bug in your code: you create your vector via with_capacity(), which reserves memory internally but doesn't change the length of the vector. It is still empty! When you now slice it like &mut buf, you pass an empty slice to read(), so read() cannot read any actual data. To fix that, the elements of your vector need to be initialized: let mut buf = vec![0; 1024] or something like that.
read_to_end(): calls read() repeatedly until EOF is encountered. On a TCP stream, EOF only arrives when the peer closes the connection, so this doesn't really make sense in most TCP stream situations and explains why the call seemed to take too long.
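The with_capacity() pitfall can be reproduced without a network connection. The following is a minimal sketch, assuming a byte slice as a stand-in for the TCP stream (a slice implements Read); the helper name read_once is made up for this illustration.

```rust
use std::io::Read;

/// Reads once from `source` into `buf` and returns how many bytes arrived.
fn read_once<R: Read>(source: &mut R, buf: &mut [u8]) -> usize {
    source.read(buf).expect("read failed")
}

fn main() {
    // Assumption: a byte slice stands in for the TcpStream.
    let mut source: &[u8] = b"hello";

    // Capacity is reserved, but the length is still 0,
    // so read() receives an empty slice and cannot store anything.
    let mut empty: Vec<u8> = Vec::with_capacity(1024);
    let n = read_once(&mut source, &mut empty);
    println!("len = {}, read = {}", empty.len(), n);

    // 1024 initialized elements give read() room to write into.
    let mut filled = vec![0u8; 1024];
    let n = read_once(&mut source, &mut filled);
    println!("read = {}, data = {:?}", n, &filled[..n]);
}
```

Only the first n bytes of the initialized buffer hold received data, so the slice &filled[..n] is what should be inspected.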
So what should you use instead? In your Python code you read a specific number of bytes into a buffer. You can do that in Rust, too: read_exact(). It works like this:
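use std::io::Read;
const BUFFER_SIZE: usize = 1024;
let mut stream = ...;
let mut buf = [0; BUFFER_SIZE];
// Blocks until exactly BUFFER_SIZE bytes have arrived; returns an error on early EOF.
stream.read_exact(&mut buf).unwrap();
println!("{:?}", buf);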
You could also use take(). That way you can use read_to_end():
const BUFFER_SIZE: usize = 1024;
let mut stream = ...;
let mut buf = Vec::with_capacity(BUFFER_SIZE);
// take() expects a u64 limit; read_to_end() then stops at the limit or at EOF.
stream.take(BUFFER_SIZE as u64).read_to_end(&mut buf).unwrap();
println!("{:?}", buf);
If you want to use the stream multiple times, you probably want to use by_ref() before calling take().
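The two code snippets are not equivalent, though! read_exact() fails with an UnexpectedEof error if the stream ends before the buffer is full, whereas take(n).read_to_end() returns whatever arrived before EOF, up to n bytes. Both also block until n bytes or EOF have been seen, unlike Python's recv(), which returns as soon as any data is available; a single read() into an initialized buffer is the closest match to that. Please read the documentation for more details.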
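The difference between the two approaches shows up when fewer bytes are available than requested. Here is a small sketch, assuming an in-memory Cursor in place of the TCP stream and an 8-byte limit chosen only to keep the example short; the helper names read_fixed and read_up_to are hypothetical.

```rust
use std::io::{self, Cursor, Read};

/// Fills a fixed 8-byte buffer; returns an error if the source ends early.
fn read_fixed<R: Read>(source: &mut R) -> io::Result<[u8; 8]> {
    let mut buf = [0u8; 8];
    source.read_exact(&mut buf)?;
    Ok(buf)
}

/// Reads at most `limit` bytes; a short source is not an error.
/// by_ref() borrows the source, so it can still be used afterwards.
fn read_up_to<R: Read>(source: &mut R, limit: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    source.by_ref().take(limit).read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() {
    // Only 5 bytes are available, fewer than either call asks for.
    let mut short = Cursor::new(b"hello".to_vec());
    let err = read_fixed(&mut short).unwrap_err();
    println!("read_exact: {:?}", err.kind());

    let mut short = Cursor::new(b"hello".to_vec());
    let data = read_up_to(&mut short, 8).unwrap();
    println!("take + read_to_end: {:?}", data);
}
```

Because read_up_to goes through by_ref(), calling it repeatedly on the same source continues where the previous call stopped.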

Related

How to multiplex a stream in Rust using sinks and tokio?

I would like to split a stream in Rust, but I'd like to do it idiomatically without side effects in my stream functions.
This is what I have now where I pipe the stream result into a for_each
let (sender, receiver) = tokio::sync::broadcast::channel(1);
let mut base_stream = inf_async_stream_gen().await
.map(function1)
.for_each(|x| {
let inner_sender = sender.clone();
async move { inner_sender.send(x); }
});
let stream_1 = BroadcastStream::new(receiver.resubscribe());
let stream_2 = BroadcastStream::new(receiver.resubscribe());
But this feels a little bit odd. Isn't receiver conceptually a Sink (despite not implementing the trait)?

Rust TCP how to get bytearray length?

I have a TCP client in Rust, which should communicate with a Java server. I got the basics working and can send byte arrays between them.
But for the byte array buffer, I need to know the length of the byte array, and I don't know how I should obtain it. At the moment, I only have a fixed size for the buffer.
My Rust code looks like this:
loop {
let mut buffer = vec![0; 12]; //fixed buffer length
let n = stream.read(&mut buffer).await;
let text = from_utf8(&buffer).unwrap();
println!("{}", text);
}
In Java, you can send the size of the buffer directly as an Integer and read it with DataInputStream. Is there any option to do that in Rust?
For example, this is how I'm doing it in Java:
public String readMsg(Socket socket) throws IOException {
DataInputStream in = new DataInputStream(new BufferedInputStream(socket.getInputStream()));
byte[] bytes = new byte[in.readInt()]; //dynamic buffer length
in.readFully(bytes);
return new String(bytes, StandardCharsets.US_ASCII);
}
What you want to know is a property of the protocol that you are using. It's not a property of the programming language you use. Based on your Java code, it seems like you are using a protocol which sends a 4-byte length field before the message data (Java's readInt() interprets it as a signed, big-endian 32-bit integer).
If that is the case you can handle reading the message the same way in Rust:
1. Read the 4 bytes in order to obtain the length information
2. Read the remaining data
3. Deserialize the data
use std::io::{self, Read};
fn read_message(stream: &mut impl Read) -> io::Result<String> {
    let mut buffer = [0u8; 4];
    // Read the length information
    stream.read_exact(&mut buffer[..])?;
    // Deserialize the length (big-endian) and widen it for use as a Vec size
    let size = u32::from_be_bytes(buffer) as usize;
    // Allocate a buffer for the message
    // Be sure to check against a maximum size before doing this in production
    let mut payload = vec![0; size];
    // Read the remaining data
    stream.read_exact(&mut payload[..])?;
    // Convert the buffer into a string
    let text = String::from_utf8(payload).map_err(/* omitted */)?;
    println!("{}", text);
    Ok(text)
}
This obviously is only correct if your protocol uses length-prefixed messages with a 4-byte unsigned int prefix. This is something that you need to check.
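For reference, a complete, runnable variant of the three steps might look like the sketch below. It assumes a blocking std::io::Read source and a big-endian 4-byte prefix; the MAX_MESSAGE_SIZE limit of 1 MiB and the mapping of UTF-8 errors to InvalidData are choices made for this illustration, not something the protocol prescribes.

```rust
use std::io::{self, Cursor, Read};

/// Upper bound on accepted payloads; 1 MiB is an arbitrary value for this sketch.
const MAX_MESSAGE_SIZE: usize = 1024 * 1024;

fn read_message<R: Read>(stream: &mut R) -> io::Result<String> {
    // 1. Read the 4-byte big-endian length prefix.
    let mut prefix = [0u8; 4];
    stream.read_exact(&mut prefix)?;
    let size = u32::from_be_bytes(prefix) as usize;

    // Reject oversized lengths before allocating anything.
    if size > MAX_MESSAGE_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "message too large"));
    }

    // 2. Read exactly `size` payload bytes.
    let mut payload = vec![0u8; size];
    stream.read_exact(&mut payload)?;

    // 3. Decode the payload as UTF-8 text.
    String::from_utf8(payload).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    // Two frames back to back, as they might arrive on a socket.
    let mut wire = Cursor::new(b"\x00\x00\x00\x05hello\x00\x00\x00\x02hi".to_vec());
    println!("{}", read_message(&mut wire)?);
    println!("{}", read_message(&mut wire)?);
    Ok(())
}
```

Since each call consumes exactly one frame, the function can be called in a loop on the same stream.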

Buffer in rust-tokio streams: is there a way to use something other than &[u8]?

I am trying to make an echo server that capitalizes a String when it replies, as an exercise to practice with tokio. I used an array as a buffer, which is annoying: what if the string overflows the buffer?
I would like to know if there is a better way to do this without an array, ideally using just a String or a vector without needing to create the buffer array.
I tried read_from_string(), but it is not async and ends up blocking the socket.
extern crate tokio;
use tokio::net::TcpListener;
use tokio::prelude::*;
fn main() {
let addr = "127.0.0.1:6142".parse().unwrap();
let listener = TcpListener::bind(&addr).unwrap();
let server = listener
.incoming()
.for_each(|socket| {
let (mut reader, mut writer) = socket.split();
let mut buffer = [0; 16];
reader.poll_read(&mut buffer)?;
let s = std::str::from_utf8(&buffer).unwrap();
s.to_uppercase();
writer.poll_write(&mut s.as_bytes())?;
Ok(())
})
.map_err(|e| {
eprintln!("something went wrong {}", e);
});
tokio::run(server);
}
Results:
"012345678901234567890" becomes -> "0123456789012345"
I could increase the buffer of course but it would just kick the can down the road.
I believe tokio_codec is the right tool for such tasks. Tokio documentation: https://tokio.rs/docs/going-deeper/frames/
It uses Bytes / BytesMut as its buffer, a very powerful structure that lets you process your data however you want and avoid unnecessary copies.
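To illustrate the framing idea without pulling in tokio, here is a blocking, std-only sketch. It is not the tokio_codec API: it assumes newline-terminated messages and uses BufRead::read_line with a growable String instead of a fixed array. The function name uppercase_lines is made up for this example.

```rust
use std::io::{self, BufRead, BufReader, Write};

/// Echoes every newline-terminated line from `input` to `output` in upper case.
fn uppercase_lines<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut line = String::new();
    loop {
        line.clear();
        // read_line grows `line` as needed, so long input is never truncated.
        if input.read_line(&mut line)? == 0 {
            return Ok(()); // EOF
        }
        // to_uppercase() returns a new String; the result has to be used.
        output.write_all(line.to_uppercase().as_bytes())?;
    }
}

fn main() -> io::Result<()> {
    // Assumption: an in-memory byte slice stands in for the socket.
    let input = BufReader::new(&b"012345678901234567890\nhello\n"[..]);
    let mut output = Vec::new();
    uppercase_lines(input, &mut output)?;
    print!("{}", String::from_utf8_lossy(&output));
    Ok(())
}
```

A codec follows the same pattern asynchronously: bytes accumulate in a growable buffer, and a frame is handed out only once it is complete.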

Deserialize from tokio socket

I am using tokio to implement a server which communicates with messages serialized with serde (bincode). Without async code and futures, I would do:
extern crate tokio_io;
extern crate bincode;
extern crate serde;
extern crate bytes;
extern crate futures;
#[macro_use] extern crate serde_derive;
use tokio_io::{AsyncRead, AsyncWrite};
use tokio_io::io::{read_exact, write_all};
use bincode::{serialize, deserialize, deserialize_from, Infinite, serialized_size};
use std::io::Read;
use std::io::Cursor;
use futures::future::Future;
type Item = String; // Dummy, this is a complex struct with derived Serialize
type Error = bincode::Error;
// This works
fn decode<R>(reader: &mut R) -> Result<Item, Error> where R: Read {
let message: Item = deserialize_from(reader, Infinite)?;
Ok(message)
}
fn main() {
let ser = serialize("Test", Infinite).unwrap();
let buf = Cursor::new(ser);
let mut reader = std::io::BufReader::new(buf);
println!("{:?}", decode(&mut reader))
}
But what I need is a decode function which can work with an asynchronous socket, such as:
// I need this since I get the reader from a (tokio) socket as
// let socket = TcpListener::bind(&addr, &handle).unwrap();
// let (reader, writer) = socket.split();
fn decode_async<R>(reader: R) -> Result<Item, Error> where R: AsyncRead {
// Does not work:
let message: Item = deserialize_from(reader, Infinite)?;
Ok(message)
}
The only idea I have is to manually write the length into the buffer during encoding and then use read_exact:
// Encode with size
fn encode_async(item: &Item) -> Result<Vec<u8>, Error>{
let size = serialized_size(item);
let mut buf = serialize(&size, Infinite).unwrap();
let ser = serialize(item, Infinite).unwrap();
buf.extend(ser);
Ok(buf)
}
// Decode with size
fn decode_async<R>(reader: R) -> Box<Future<Item = Item, Error = std::io::Error>>
where R: AsyncRead + 'static {
let read = read_exact(reader, vec![0u8; 8]).and_then(|(reader, buf)| {
let size = deserialize::<u64>(&mut &buf[..]).unwrap();
Ok((reader, size as usize))
}).and_then(|(reader, size)| {
read_exact(reader, vec![0u8; size])
}).and_then(|(reader, buf)| {
let item = deserialize(&mut &buf[..]).unwrap();
Ok(item)
});
Box::new(read)
}
fn main() {
let ser = encode_async(&String::from("Test")).unwrap();
let buf = Cursor::new(ser);
let mut reader = std::io::BufReader::new(buf);
let dec = decode_async(reader).wait();
println!("{:?}", dec)
}
Is there a better way to implement the decoding?
deserialize_from can't handle IO errors, especially not of the kind WouldBlock which is returned by async (non-blocking) Readers when they are waiting for more data. That is limited by the interface: deserialize_from doesn't return a Future or a partial state, it returns the full decoded Result and wouldn't know how to combine the Reader with an event loop to handle WouldBlock without busy looping.
Theoretically, it is possible to implement an async_deserialize_from, but not by using the interfaces provided by serde unless you read the full data to decode in advance, which would defeat the purpose.
You need to read the full data first: use tokio_io::io::read_to_end if the stream ends after the message, or tokio_io::io::read_exact (what you're currently using) if you know the size of the encoded data in an "endless" stream (or in a stream followed by other data).
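The WouldBlock problem can be demonstrated with the standard library alone. In this sketch, NotReadyOnce is a hypothetical reader that mimics a non-blocking socket, and decode_u32 is a stand-in for a synchronous decoder such as deserialize_from; neither is part of bincode or tokio.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader that mimics a non-blocking socket: the first call
/// reports that no data is ready yet, later calls deliver the bytes.
struct NotReadyOnce<'a> {
    data: &'a [u8],
    first_call: bool,
}

impl<'a> Read for NotReadyOnce<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.first_call {
            self.first_call = false;
            return Err(io::Error::new(ErrorKind::WouldBlock, "no data yet"));
        }
        self.data.read(buf)
    }
}

/// Stand-in for a synchronous decoder: it needs all four bytes in one go
/// and has no way to pause and resume later.
fn decode_u32<R: Read>(reader: &mut R) -> io::Result<u32> {
    let mut bytes = [0u8; 4];
    reader.read_exact(&mut bytes)?;
    Ok(u32::from_be_bytes(bytes))
}

fn main() {
    let mut reader = NotReadyOnce { data: &[0, 0, 0, 42], first_call: true };
    // The decoder sees WouldBlock as a plain error and gives up.
    let first = decode_u32(&mut reader);
    println!("{:?}", first.as_ref().map_err(|e| e.kind()));
    // Only a retry from scratch succeeds once the data is there.
    let second = decode_u32(&mut reader);
    println!("{:?}", second);
}
```

With a real decoder, any bytes consumed before the WouldBlock would be lost on retry, which is why the complete frame has to be buffered before decoding.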
Stefan's answer is correct; however, you might be interested in looking at the tokio-serde-* family of crates, which do this for you, specifically tokio-serde-bincode. From the readme:
Utilities needed to easily implement a Tokio Bincode transport using serde for serialization and deserialization of frame values. Based on tokio-serde.
The crate has several examples of how to use it.

How do you read the contents of &mut [0; 128]?

I've been reading through the Rust documentation and made it to section 4.26 before looking at the libraries included. std::net::TcpStream caught my eye but I don't understand the following lines:
let _ = stream.write(&[1]);
let _ = stream.read(&mut [0; 128]);
I have seen [0; 128] before under Vectors as vec![0;10] // Ten 0s so I can see that a buffer of 128 0s is passed in. The documentation for read says "Pull some bytes from this source into the specified buffer, returning how many bytes were read." so how can you access the data that was read into the buffer?
The comment in the code indicates that the result is ignored:
let _ = stream.read(&mut [0; 128]); // ignore here too
To get the data, you need to create a variable:
let mut buffer = [0; 128];
let _ = stream.read(&mut buffer);
// The data is in buffer.
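Note that read() also returns how many bytes it actually wrote, and only that prefix of the buffer holds received data. A minimal sketch, assuming a byte slice in place of the TcpStream (a slice implements Read); the helper name read_some is made up for this example.

```rust
use std::io::{self, Read};

/// Performs a single read and returns only the bytes that were filled in.
fn read_some<R: Read>(source: &mut R) -> io::Result<Vec<u8>> {
    let mut buffer = [0u8; 128];
    // read() returns how many bytes it wrote to the front of the buffer.
    let n = source.read(&mut buffer)?;
    Ok(buffer[..n].to_vec())
}

fn main() -> io::Result<()> {
    // Assumption: a byte slice stands in for the TcpStream.
    let mut source: &[u8] = &[1, 2, 3];
    let data = read_some(&mut source)?;
    println!("{:?}", data);
    Ok(())
}
```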
