How to efficiently send large files across a single network connection?

I'm sending large objects over a network and noticed that using a single network connection is significantly slower than using multiple.
Server code:
use async_std::{
io::{BufWriter, Write},
net::TcpListener,
prelude::*,
task,
};
use bench_utils::{end_timer, start_timer};
use futures::stream::{FuturesOrdered, StreamExt};
async fn send(buf: &[u8], writer: &mut (impl Write + Unpin)) {
// Send the message length
writer.write_all(&(buf.len() as u64).to_le_bytes()).await.unwrap();
// Send the rest of the message
writer.write_all(&buf).await.unwrap();
writer.flush().await.unwrap();
}
fn main() {
task::block_on(async move {
let listener = TcpListener::bind("0.0.0.0:8000").await.unwrap();
let mut incoming = listener.incoming();
let mut writers = Vec::with_capacity(16);
for _ in 0..16 {
let stream = incoming.next().await.unwrap().unwrap();
writers.push(BufWriter::new(stream))
};
let buf = vec![0u8; 1 << 30];
let send_time = start_timer!(|| "Sending buffer across 1 connection");
send(&buf, &mut writers[0]).await;
end_timer!(send_time);
let send_time = start_timer!(|| "Sending buffer across 16 connections");
writers
.iter_mut()
.zip(buf.chunks(buf.len() / 16))
.map(|(w, chunk)| {
send(chunk, w)
})
.collect::<FuturesOrdered<_>>()
.collect::<Vec<_>>()
.await;
end_timer!(send_time);
});
}
Client code:
use async_std::{
io::{BufReader, Read},
net::TcpStream,
prelude::*,
task,
};
use bench_utils::{end_timer, start_timer};
use futures::stream::{FuturesOrdered, StreamExt};
async fn recv(reader: &mut (impl Read + Unpin)) {
// Read the message length
let mut len_buf = [0u8; 8];
reader.read_exact(&mut len_buf).await.unwrap();
let len: u64 = u64::from_le_bytes(len_buf);
// Read the rest of the message
let mut buf = vec![0u8; usize::try_from(len).unwrap()];
reader.read_exact(&mut buf[..]).await.unwrap();
}
fn main() {
let host = &std::env::args().collect::<Vec<_>>()[1];
task::block_on(async move {
let mut readers = Vec::with_capacity(16);
for _ in 0..16 {
let stream = TcpStream::connect(host).await.unwrap();
readers.push(BufReader::new(stream));
}
let read_time = start_timer!(|| "Reading buffer from 1 connection");
recv(&mut readers[0]).await;
end_timer!(read_time);
let read_time = start_timer!(|| "Reading buffer from 16 connections");
readers
.iter_mut()
.map(|r| recv(r))
.collect::<FuturesOrdered<_>>()
.collect::<Vec<_>>()
.await;
end_timer!(read_time);
});
}
Server result:
Start: Sending buffer across 1 connection
End: Sending buffer across 1 connection....................................55.134s
Start: Sending buffer across 16 connections
End: Sending buffer across 16 connections..................................4.19s
Client result:
Start: Reading buffer from 1 connection
End: Reading buffer from 1 connection......................................55.396s
Start: Reading buffer from 16 connections
End: Reading buffer from 16 connections....................................3.914s
I assume this difference arises because the sender has to wait for an ACK whenever the TCP buffer fills up (both machines have TCP window scaling enabled). Is that correct? It doesn't appear that Rust provides an API to modify the size of these buffers.
Is there any way to achieve similar throughput on a single connection? It seems annoying to have to pass around multiple connections, since all of this goes through a single network interface anyway.
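To separate protocol effects from network effects, it can help to have a dependency-free baseline of the same framing. The following is a minimal sketch, assuming blocking std::net sockets on the loopback interface instead of async_std; send_frame, recv_frame and loopback_roundtrip are hypothetical names, and the sketch does not by itself explain the slowdown measured above.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Write one frame: an 8-byte little-endian length, then the payload.
fn send_frame(writer: &mut impl Write, payload: &[u8]) -> io::Result<()> {
    writer.write_all(&(payload.len() as u64).to_le_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Read one frame written by `send_frame` and return its payload.
fn recv_frame(reader: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 8];
    reader.read_exact(&mut len_buf)?;
    let len = usize::try_from(u64::from_le_bytes(len_buf))
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}

/// Send `size` bytes over one loopback connection and return how many
/// payload bytes the receiving side got.
fn loopback_roundtrip(size: usize) -> io::Result<usize> {
    // Port 0 lets the OS pick a free port (an assumption for the demo).
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let sender = thread::spawn(move || -> io::Result<()> {
        let (mut stream, _) = listener.accept()?;
        send_frame(&mut stream, &vec![7u8; size])
    });
    let mut stream = TcpStream::connect(addr)?;
    let received = recv_frame(&mut stream)?;
    sender.join().expect("sender thread panicked")?;
    Ok(received.len())
}
```

Timing loopback_roundtrip with the same payload size on each machine gives a number that excludes the physical network, which can then be compared with the cross-machine figures.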

Related

Simple Rust TcpStream client and server problem where client crash with "peer reset"

The server sends data to the client, and the client tries to receive all of it. Once a certain amount of data has been sent and received, both applications exit.
The client code:
use std::{env::args, io::Read, net::TcpStream, time::Instant};
fn main() {
let addr = args()
.nth(1)
.unwrap_or_else(|| "localhost:2233".to_string());
let size_m = args().nth(2).unwrap_or_else(|| "10".to_string());
let size_m = size_m.parse::<usize>().unwrap();
let mut stream = TcpStream::connect(addr.clone()).unwrap();
println!(
"Connected to {}, local address: {}",
addr,
stream.local_addr().unwrap()
);
let now = Instant::now();
let mut buffer = vec![0; 1024 * 1024];
for i in 0..size_m {
stream.read_exact(buffer.as_mut_slice()).unwrap();
println!("read {}MB", i + 1);
println!(
"average speed: {}MB/s",
(i + 1) as f64 / now.elapsed().as_secs_f64()
);
}
println!("total time: {} s", now.elapsed().as_secs_f64());
println!(
"speed: {} MB/s",
size_m as f64 / now.elapsed().as_secs_f64()
);
}
The server code:
use std::env::args;
use std::io::Write;
use std::net::TcpListener;
fn main() {
let addr = args().nth(1).unwrap_or_else(|| ":::2233".to_string());
let size_m = args().nth(2).unwrap_or_else(|| "10".to_string());
let size_m = size_m.parse::<usize>().unwrap();
let listener = TcpListener::bind(addr).unwrap();
println!("Listening on {}", listener.local_addr().unwrap());
if let Ok((mut stream, addr)) = listener.accept() {
println!("Accepted connection from {}", addr);
// create 1M buffer
let buffer = vec![0; 1024 * 1024];
for i in 0..size_m {
stream.write_all(buffer.as_slice()).unwrap();
println!("write {}MB", i + 1);
}
stream.flush().unwrap();
//close stream
// stream.shutdown(std::net::Shutdown::Both).unwrap();
}
}
I ran the server like this: cargo run --bin speedtestserver :::2233 2
and the client like this: cargo run --bin speedtestclient ::1:2233 2
This succeeds when both run on the same machine, but when they are deployed to two different machines, the client fails with "failed to fill whole buffer" or "peer reset".
OK, I found the problem: it was the Docker container. I ran the server in a Docker container using docker --rm ..., and when the program finished, Docker shut down the container, so I guess the remaining packets never got out. When I ran the server on the host machine, the error disappeared.
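A related precaution, independent of Docker, is to close the connection in an orderly way before the process exits: shut down the write half, then wait until the peer has closed its side. The following is a minimal sketch using only std::net; send_then_drain, recv_all and demo_graceful_close are hypothetical names, and whether this would have avoided the container problem described above is untested.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Send `payload`, close the write half, then wait until the peer closes
/// its side, so the sender does not exit while data is still in flight.
fn send_then_drain(mut stream: TcpStream, payload: &[u8]) -> io::Result<()> {
    stream.write_all(payload)?;
    // Signals end of stream to the peer once the queued data is sent.
    stream.shutdown(Shutdown::Write)?;
    let mut sink = [0u8; 256];
    // read() returning Ok(0) means the peer has closed its side too.
    while stream.read(&mut sink)? != 0 {}
    Ok(())
}

/// Read until end of stream and return the number of bytes received.
fn recv_all(mut stream: TcpStream) -> io::Result<usize> {
    let mut data = Vec::new();
    stream.read_to_end(&mut data)?;
    Ok(data.len())
    // `stream` is dropped here, which closes the receiving side.
}

/// Run sender and receiver over loopback; return the bytes received.
fn demo_graceful_close(size: usize) -> io::Result<usize> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || -> io::Result<()> {
        let (stream, _) = listener.accept()?;
        send_then_drain(stream, &vec![0u8; size])
    });
    let received = recv_all(TcpStream::connect(addr)?)?;
    server.join().expect("server thread panicked")?;
    Ok(received)
}
```

With this pattern the sender only returns after the receiver has read everything and closed, so the process (or its container) is not torn down in the middle of the transfer.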

Why is async TcpStream blocking?

I'm working on a project to implement a distributed key-value store in Rust. I've written the server-side code using Tokio's asynchronous runtime. I'm running into an issue where my asynchronous code seems to block: when I have multiple connections to the server, only one TcpStream is processed. I'm new to async code, both in general and in Rust, but I thought that other streams would be accepted and processed if there was no activity on a given TCP stream.
Is my understanding of async wrong or am I using tokio incorrectly?
This is my entry point:
use std::error::Error;
use std::net::SocketAddr;
use std::path::{Path, PathBuf};
use std::str::FromStr;
use std::sync::{Arc, Mutex};
use env_logger;
use log::{debug, info};
use structopt::StructOpt;
use tokio::net::TcpListener;
extern crate blue;
use blue::ipc::message;
use blue::store::args;
use blue::store::cluster::{Cluster, NodeRole};
use blue::store::deserialize::deserialize_store;
use blue::store::handler::handle_stream;
use blue::store::wal::WriteAheadLog;
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let opt = args::Opt::from_args();
let addr = SocketAddr::from_str(format!("{}:{}", opt.host, opt.port).as_str())?;
let role = NodeRole::from_str(opt.role.as_str()).unwrap();
let leader_addr = match role {
NodeRole::Leader => addr,
NodeRole::Follower => SocketAddr::from_str(opt.follow.unwrap().as_str())?,
};
let wal_name = addr.to_string().replace(".", "").replace(":", "");
let wal_full_name = format!("wal{}.log", wal_name);
let wal_path = PathBuf::from(wal_full_name);
let mut wal = match wal_path.exists() {
true => {
info!("Existing WAL found");
WriteAheadLog::open(&wal_path)?
}
false => {
info!("Creating WAL");
WriteAheadLog::new(&wal_path)?
}
};
debug!("WAL: {:?}", wal);
let store_name = addr.to_string().replace(".", "").replace(":", "");
let store_pth = format!("{}.pb", store_name);
let store_path = Path::new(&store_pth);
let mut store = match store_path.exists() {
true => deserialize_store(store_path)?,
false => message::Store::default(),
};
let listener = TcpListener::bind(addr).await?;
let cluster = Cluster::new(addr, &role, leader_addr, &mut wal, &mut store).await?;
let store_path = Arc::new(store_path);
let store = Arc::new(Mutex::new(store));
let wal = Arc::new(Mutex::new(wal));
let cluster = Arc::new(Mutex::new(cluster));
info!("Blue launched. Waiting for incoming connection");
loop {
let (stream, addr) = listener.accept().await?;
info!("Incoming request from {}", addr);
let store = Arc::clone(&store);
let store_path = Arc::clone(&store_path);
let wal = Arc::clone(&wal);
let cluster = Arc::clone(&cluster);
handle_stream(stream, store, store_path, wal, cluster, &role).await?;
}
}
Below is my handler (handle_stream from the above). I excluded all the handlers in match input as I didn't think they were necessary to prove the point (full code for that section is here: https://github.com/matthewmturner/Bradfield-Distributed-Systems/blob/main/blue/src/store/handler.rs if it actually helps).
Specifically the point that is blocking is the line let input = async_read_message::<message::Request>(&mut stream).await;
This is where the server is waiting for communication from either a client or another server in the cluster. The behavior I currently see is that after a client connects to the server, the server doesn't receive any of the requests to add other nodes to the cluster; it only handles the client stream.
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::path::Path;
use std::str::FromStr;
use std::sync::{Arc, Mutex};
use log::{debug, error, info};
use serde_json::json;
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream as asyncTcpStream;
use super::super::ipc::message;
use super::super::ipc::message::request::Command;
use super::super::ipc::receiver::async_read_message;
use super::super::ipc::sender::{async_send_message, send_message};
use super::cluster::{Cluster, NodeRole};
use super::serialize::persist_store;
use super::wal::WriteAheadLog;
// TODO: Why isnt async working? I.e. connecting servers after client is connected stays on client stream.
pub async fn handle_stream<'a>(
mut stream: asyncTcpStream,
store: Arc<Mutex<message::Store>>,
store_path: Arc<&Path>,
wal: Arc<Mutex<WriteAheadLog<'a>>>,
cluster: Arc<Mutex<Cluster>>,
role: &NodeRole,
) -> io::Result<()> {
loop {
info!("Handling stream: {:?}", stream);
let input = async_read_message::<message::Request>(&mut stream).await;
debug!("Input: {:?}", input);
match input {
...
}
}
}
This is the code for async_read_message
pub async fn async_read_message<M: Message + Default>(
stream: &mut asyncTcpStream,
) -> io::Result<M> {
let mut len_buf = [0u8; 4];
debug!("Reading message length");
stream.read_exact(&mut len_buf).await?;
let len = i32::from_le_bytes(len_buf);
let mut buf = vec![0u8; len as usize];
debug!("Reading message");
stream.read_exact(&mut buf).await?;
let user_input = M::decode(&mut buf.as_slice())?;
debug!("Received message: {:?}", user_input);
Ok(user_input)
}
Your problem lies with how you're handling messages after clients have connected:
handle_stream(stream, store, store_path, wal, cluster, &role).await?;
This .await means your listening loop will wait for handle_stream to return, but (making some assumptions) this function won't return until the client has disconnected. What you want is to tokio::spawn a new task that can run independently:
tokio::spawn(handle_stream(stream, store, store_path, wal, cluster, &role));
You may have to change some of your parameter types to avoid lifetimes; tokio::spawn requires 'static since the task's lifetime is decoupled from the scope where it was spawned.
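The same idea can be seen without Tokio. The sketch below is a std::thread analogue, not the asker's code: thread::spawn stands in for tokio::spawn, the handler is a trivial echo, and handle_stream, serve, start_server and second_client_is_served are hypothetical names. The point it illustrates is that the accept loop hands each connection off and immediately goes back to accepting.

```rust
use std::io::{self, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Echo one byte at a time until the client disconnects.
fn handle_stream(mut stream: TcpStream) -> io::Result<()> {
    let mut byte = [0u8; 1];
    while stream.read(&mut byte)? != 0 {
        stream.write_all(&byte)?;
    }
    Ok(())
}

/// Accept loop that hands each connection to its own thread, so an idle
/// client never delays the next accept().
fn serve(listener: TcpListener) {
    for stream in listener.incoming() {
        if let Ok(stream) = stream {
            // The closure owns `stream`, so it borrows nothing from this loop.
            thread::spawn(move || {
                let _ = handle_stream(stream);
            });
        }
    }
}

/// Start the server on a free port and return its address.
fn start_server() -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || serve(listener));
    Ok(addr)
}

/// Connect an idle client first, then check that a second client is
/// still served while the first connection stays open.
fn second_client_is_served() -> io::Result<u8> {
    let addr = start_server()?;
    let _idle = TcpStream::connect(addr)?; // never sends anything
    let mut active = TcpStream::connect(addr)?;
    active.write_all(&[42])?;
    let mut reply = [0u8; 1];
    active.read_exact(&mut reply)?;
    Ok(reply[0])
}
```

If serve called handle_stream directly instead of spawning, the idle first client would hold the loop and the second client would never get its reply, which mirrors the behavior described in the question. The move closure also shows why spawned work must own its data, matching the 'static requirement mentioned above.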

Read/write problem in async tokio application (beginner)

I'm new to network programming and threading in Rust, so I may be missing something obvious here. I've been following along with this, trying to build a simple chat application. The difference is that he does it with the standard library and I'm trying to do it with tokio. The functionality is very simple: the client sends a message to the server, and the server acknowledges it and sends it back to the client. Here's my code for the client and server, stripped down as much as I can:
server.rs
#[tokio::main]
async fn main() {
let server = TcpListener::bind("127.0.0.1:7878").await.unwrap();
let mut clients = vec![];
let (tx, mut rx) = mpsc::channel(32);
loop {
if let Ok((socket, addr)) = server.accept().await {
let tx = tx.clone();
let (mut reader, writer) = split(socket);
clients.push(writer);
tokio::spawn(async move {
loop {
let mut buffer = vec![0; 1024];
reader.read(&mut buffer).await.unwrap();
//get message written by the client and print it
//then transmit it on the channel
let msg = buffer.into_iter().take_while(|&x| x != 0).collect::<Vec<_>>();
let msg = String::from_utf8(msg).expect("Invalid utf8 message");
println!("{}: {:?}", addr, msg);
match tx.send(msg).await {
Ok(_) => { ()}
Err(_) => { println!("Error");}
}
}
});
}
//write each message received back to its client
if let Some(msg) = rx.recv().await {
clients = clients.into_iter().filter_map(|mut x| {
println!("writing: {:?}", &msg);
x.write(&msg.clone().into_bytes());
Some(x)
}).collect::<Vec<_>>();
}
}
}
client.rs
#[tokio::main]
async fn main() {
let client = TcpStream::connect("127.0.0.1:7878").await.unwrap();
let (tx, mut rx) = mpsc::channel::<String>(32);
tokio::spawn(async move {
loop {
let mut buffer = vec![0; 1024];
// get message sent by the server and print it
match client.try_read(&mut buffer) {
Ok(_) => {
let msg = buffer.into_iter().take_while(|&x| x != 0).collect::<Vec<_>>();
println!("Received from server: {:?}", msg);
}
Err(ref err) if err.kind() == io::ErrorKind::WouldBlock => {
()
}
Err(_) => {
println!("Connection with server was severed");
break;
}
}
// get message transmitted from user input loop
// then write it to the server
match rx.try_recv() {
Ok(message) => {
let mut buffer = message.clone().into_bytes();
buffer.resize(1024, 0);
match client.try_write(&buffer) {
Ok(_) => { println!("Write successful");}
Err(_) => { println!("Write error");}
}
}
Err(TryRecvError::Empty) => (),
_ => break
}
}
} );
// user input loop here
// takes user message and transmits it on the channel
}
Sending to the server works fine, and the server appears to be successfully writing as indicated by its output:
127.0.0.1:55346: "test message"
writing: "test message"
The issue is the client never reads back the message from the server, instead getting WouldBlock errors every time it hits the match client.try_read(&mut buffer) block.
If I stop the server while keeping the client running, the client is suddenly flooded with successful reads of empty messages:
Received from server: []
Received from server: []
Received from server: []
Received from server: []
Received from server: []
Received from server: []
Received from server: []
Received from server: []
...
Can anyone tell me what's going on?
Here's what happens in your server:
1. Wait for a client to connect.
2. When the client is connected, spawn a background task to receive from the client.
3. Try to read from the channel. It is very unlikely that the client has already sent anything at this point, so the channel is empty and rx.recv().await waits for the first message.
4. Loop → wait for another client to connect.
5. While waiting for another client, the background task receives further messages from the first client and sends them to the channel, but the main task is blocked waiting for another client and never tries to read from the channel again.
Note also that x.write(...) in the server is never awaited. Tokio's write returns a future that does nothing until it is polled, so the "writing" line is printed but nothing is actually sent to the client.
The easiest way to get it to work is to get rid of the channel in the server and simply echo the message from the spawned task.
Another solution is to spawn an independent task to process the channel and write to the clients.
As for what happens when you kill the server: once the connection is closed, reading from the socket does not return an error. Instead the read succeeds with zero bytes (Ok(0)), which signals end of stream, so your buffer stays empty.
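The end-of-stream behavior described above can be sketched with the standard library's blocking sockets (chosen here only to keep the example dependency-free; Tokio's read calls report a closed stream the same way, with a zero-byte read). read_until_eof and demo_eof are hypothetical names. The sketch also uses the byte count returned by read instead of scanning the buffer for zero bytes.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Collect everything the peer sends until it closes the connection.
fn read_until_eof(mut stream: TcpStream) -> io::Result<Vec<String>> {
    let mut chunks = Vec::new();
    let mut buffer = [0u8; 1024];
    loop {
        match stream.read(&mut buffer)? {
            // Ok(0): the peer closed the stream. Stop here instead of
            // looping forever on empty reads.
            0 => break,
            // Only the first `n` bytes are valid data. TCP is a byte
            // stream, so a chunk may hold part of a message or several.
            n => chunks.push(String::from_utf8_lossy(&buffer[..n]).into_owned()),
        }
    }
    Ok(chunks)
}

/// A server that sends one message and closes; the client reads to EOF.
fn demo_eof() -> io::Result<String> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || -> io::Result<()> {
        let (mut stream, _) = listener.accept()?;
        stream.write_all(b"test message")
        // `stream` is dropped here, which closes the connection.
    });
    let chunks = read_until_eof(TcpStream::connect(addr)?)?;
    server.join().expect("server thread panicked")?;
    Ok(chunks.concat())
}
```

In the client from the question, treating a zero-byte result as "connection closed" and breaking out of the loop would stop the flood of empty "Received from server: []" lines.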

Rust TCP Proxy to pipe http, ssh and other tcp traffic

I am new to Rust, and as a learning project I want to create a TCP proxy. Tokio overwhelmed me, and I could not find documentation that helped me understand how it works. So I tried the std net modules instead, but I got stuck at piping the traffic properly. I bind multiple listeners, each in its own thread, and want to pipe the traffic back and forth like a proper TCP proxy would. Unfortunately, I don't seem to understand how that has to work in Rust. Can someone give me an example without third-party dependencies?
Here is my code. handle_connection gets called from accepting the connection.
fn forward(direction: &str, input: &mut BufReader<TcpStream>, output: &mut BufWriter<TcpStream>) {
loop {
let mut buffer = [0;1024];
debug!("{} Reading", direction);
match input.read(&mut buffer) {
Ok(bytes) => {
debug!("Read {} bytes", bytes);
if bytes < 1 {
break;
}
match output.write_all(&mut buffer) {
Ok(_) => {
debug!("Forwarded {:#?} bytes", bytes);
output.flush();
if bytes < 1024 {
break; // abort when everything is sent
}
},
Err(error) => panic!("Could not forward: {}", error)
}
},
Err(error) => panic!("Could not read: {}", error)
}
}
}
fn handle_connection(mapping: ProxyMapping, incoming: TcpStream) -> Result<(), String> {
info!("Incoming connection from {}", incoming.peer_addr().map_err(|e| format!("{}", e))?);
// Try to connect to target
let outgoing = TcpStream::connect_timeout(
&mapping.get_target_addr()?,
Duration::from_secs(1)
).map_err(|error| format!("Couldn't connect {}", error))?;
// forward tcp stream
debug!("Start sync");
let input_clone = incoming.try_clone()
.map_err(|error| format!("Couldn't clone {}", error))?;
let mut input_read = BufReader::new(incoming);
let mut input_write = BufWriter::new(input_clone);
let output_clone = outgoing.try_clone()
.map_err(|error| format!("Couldn't clone {}", error))?;
let mut output_read = BufReader::new(outgoing);
let mut output_write = BufWriter::new(output_clone);
debug!("spawn sync");
loop {
debug!("Forward");
forward("forward", &mut input_read, &mut output_write);
debug!("Backward");
forward("backward",&mut output_read, &mut input_write);
}
Ok(())
}
So far it proxies some data, but for HTTP requests it stops at the backward read, because I don't yet know when I should shut down a connection. After some time it just floods the log with reads of 0 bytes. I added a loop around the sync because multiple packets can be sent over one TCP connection. Any ideas how to improve this?
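One possible std-only approach is to run the two directions concurrently instead of alternating between them, since a blocking read in one direction otherwise stalls the other. The sketch below is an assumption-laden illustration, not a drop-in replacement for the code above: pipe, proxy_connection, run_proxy and demo_proxy are hypothetical names, it omits ProxyMapping and logging, and it uses std::io::copy, which forwards exactly the bytes that were read and returns at end of stream.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Copy bytes from `from` to `to` until EOF, then pass the EOF on.
fn pipe(mut from: TcpStream, mut to: TcpStream) {
    let _ = io::copy(&mut from, &mut to);
    // Closing only the write half lets the reply keep flowing back.
    let _ = to.shutdown(Shutdown::Write);
}

/// Proxy one client connection to `target`, one thread per direction.
fn proxy_connection(client: TcpStream, target: SocketAddr) -> io::Result<()> {
    let upstream = TcpStream::connect(target)?;
    let (client_rd, upstream_wr) = (client.try_clone()?, upstream.try_clone()?);
    let forward = thread::spawn(move || pipe(client_rd, upstream_wr));
    pipe(upstream, client); // backward direction on the current thread
    let _ = forward.join();
    Ok(())
}

/// Accept connections on `listener` and proxy each one to `target`.
fn run_proxy(listener: TcpListener, target: SocketAddr) {
    for client in listener.incoming() {
        if let Ok(client) = client {
            thread::spawn(move || proxy_connection(client, target));
        }
    }
}

/// Demo: an upstream that replies with the reversed request, reached
/// through the proxy. Returns the reply the client sees.
fn demo_proxy() -> io::Result<Vec<u8>> {
    let upstream = TcpListener::bind("127.0.0.1:0")?;
    let target = upstream.local_addr()?;
    thread::spawn(move || -> io::Result<()> {
        let (mut stream, _) = upstream.accept()?;
        let mut request = Vec::new();
        stream.read_to_end(&mut request)?;
        request.reverse();
        stream.write_all(&request)
    });
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let proxy_addr = listener.local_addr()?;
    thread::spawn(move || run_proxy(listener, target));

    let mut client = TcpStream::connect(proxy_addr)?;
    client.write_all(b"abc")?;
    client.shutdown(Shutdown::Write)?; // tell the proxy the request is done
    let mut reply = Vec::new();
    client.read_to_end(&mut reply)?;
    Ok(reply)
}
```

Each direction ends when its source reports end of stream (a read of 0 bytes), which answers the "when should I shut down" question for this design: a connection is finished once both directions have seen EOF. A short read such as bytes < 1024 is not treated as the end of a message, because TCP does not preserve message boundaries.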

TcpStream not returning from Read until connection closed

I'm trying to implement a simple TCP messaging protocol in Rust with the standard TCP Sockets. The protocol is like so:
[C0 FF EE EE] header
[XX XX] type U16LE
[XX XX] size U16LE
... (size - 8) bytes of arbitrary data ...
I have come up with the following code:
const REQUEST_HEADER: [u8; 4] = [0xC0, 0xFF, 0xEE, 0xEE];
pub fn run_server(host: &str, port: &str) {
let listener = TcpListener::bind(format!("{}:{}", host, port))
.expect("Could not bind to the requested host/port combination.");
for stream in listener.incoming() {
thread::spawn(|| {
let stream = stream.unwrap();
handle_client(stream);
});
}
}
fn handle_client(stream: TcpStream) {
stream.set_nodelay(true).expect("set_nodelay call failed");
loop {
match get_packet(stream) {
Ok(deframed) => {
match deframed.msgid {
Some(FrameType::Hello) => say_hello(stream),
Some(FrameType::Text) => {
println!("-- Got text message: {:X?}", deframed.content);
save_text(deframed);
},
Some(FrameType::Goodbye) => {
println!("-- You say goodbye, and I say hello");
break; // end of connection
},
_ => println!("!! I don't know this packet type")
}
},
Err(x) => {
match x {
1 => println!("!! Malformed frame header"),
2 => println!("!! Client gone?"),
_ => println!("!! Unknown error")
}
break;
}
}
}
}
// Read one packet from the stream
fn get_packet(reader: TcpStream) -> Result<MyFrame, u8> {
// read the packet header in here
let mut heading = [0u8; 0x8];
match reader.read_exact(&mut heading) {
Ok(_) => {
// check if header is OK
if heading[..4] == REQUEST_HEADER
{
// extract metadata
let mid: u16 = ((heading[4] as u16) << 0)
+ ((heading[5] as u16) << 8);
let len: u16 = ((heading[6] as u16) << 0)
+ ((heading[7] as u16) << 8);
let remain: u16 = len - 0x8;
println!("-> Pkt: 0x{:X}, length: 0x{:X}, remain: 0x{:X}", mid, len, remain);
let mut frame = vec![0; remain as usize];
reader.read_exact(&mut frame).expect("Read packet failed");
let deframed = MyFrame {
msgid: FromPrimitive::from_u16(mid),
content: frame
};
Ok(deframed)
} else {
println!("!! Expected packet header but got {:X?}", heading.to_vec());
Err(1)
}
},
_ => Err(2)
}
}
This is the Python client:
import socket, sys
sock = socket.socket()
sock.connect(("127.0.0.1", 12345))
f = open("message.bin",'rb')
dat = f.read()
sock.sendall(dat)
while True:
    sock.recv(64)
When the program receives multiple messages in a row, they seem to be processed fine. If the client sends a message and then waits for a response, the program gets stuck reading the frame content until ~16 more bytes are received. As a result, the program cannot send a response to the client.
I can tell it's stuck in there and not somewhere else because the ->Pkt log line appears with the correct metadata but it doesn't continue further processing despite the client having already sent everything.
I've tried replacing read_exact with read but it just keeps being stuck there. Once the client software drops the connection the message is suddenly processed as normal.
Is this a design problem, or am I missing a setting that I need to change on the socket?
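One thing worth ruling out when read_exact waits for more bytes than the client sent is a mismatch between the size field and the payload actually written. The sketch below isolates the header parsing as a pure function so it can be checked against the bytes in message.bin. It assumes, based on the "(size - 8)" line in the protocol description, that the size field counts the whole frame including the 8-byte header; parse_header, HeaderError and HEADER_LEN are hypothetical names, and this is not a confirmed diagnosis of the hang.

```rust
/// Magic bytes that start every frame, as given in the question.
const REQUEST_HEADER: [u8; 4] = [0xC0, 0xFF, 0xEE, 0xEE];
/// Header size: 4 magic bytes + 2 type bytes + 2 size bytes.
const HEADER_LEN: u16 = 8;

#[derive(Debug, PartialEq)]
enum HeaderError {
    /// The first four bytes are not the expected magic value.
    BadMagic,
    /// The size field is smaller than the header itself.
    SizeTooSmall(u16),
}

/// Parse the 8-byte header and return (message type, payload bytes
/// that still have to be read from the stream).
fn parse_header(heading: &[u8; 8]) -> Result<(u16, usize), HeaderError> {
    if heading[..4] != REQUEST_HEADER {
        return Err(HeaderError::BadMagic);
    }
    // Both fields are little-endian u16 values.
    let mid = u16::from_le_bytes([heading[4], heading[5]]);
    let len = u16::from_le_bytes([heading[6], heading[7]]);
    // Assumption: `len` covers the whole frame, header included.
    // checked_sub avoids the overflow that `len - 0x8` hits when len < 8
    // (a panic in debug builds, a wrap to a huge value in release builds).
    let remain = len
        .checked_sub(HEADER_LEN)
        .ok_or(HeaderError::SizeTooSmall(len))?;
    Ok((mid, remain as usize))
}
```

If the value returned for a real frame is larger than the number of payload bytes the client writes after the header, read_exact will keep waiting for the difference, which would look exactly like a read that only completes when more data or a disconnect arrives.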
