How to find the largest UDP packet I can send without fragmenting? - networking

I need to know what the largest UDP packet I can send to another computer is without fragmentation.
This size is commonly known as the MTU (Maximum Transmission Unit). Supposedly, between 2 computers, will be many routers and modems that may have different MTUs.
I read that the TCP implementation in windows automatically finds the maximum MTU in a path.
I was also experimenting, and I found out that the maximum MTU from my computer to a server was 57712 bytes+header. Anything above that was discarded. My computer is on a LAN, isn't the MTU supposed to be around 1500 bytes?

The following doesn't answer your question directly but you might find it interesting; it says that IP packets can be disassembled/reassembled, and therefore bigger than limit on the underling media (e.g. 1500-byte Ethernet): Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC
More on this topic:
Re: UDP fragmentation says you should use ICMP instead of UDP to discover MTU
Path MTU Discovery says that a TCP connection might include implicit MTU negotiation via ICMP
I don't know about generating ICMP via an API on Windows: at one time such an API was proposed, and was controversial because people argued that would make it easy to write software that implements denial-of-service functionality by generating a flood of ICMP messages.
No, it looks like it is implemented: see for example Winsock Programmer's FAQ Examples: Ping: Raw Sockets Method.
So, to discover MTU, generate ping packets with the 'do not fragment' flag.
Maybe there's an easier API than this, I don't know; but I hope I've given you to understand the underlying protocol[s].

In addition to all the previous answers, quoting the classic:
IPv4 and IPv6 define minimum reassembly buffer size, the minimum datagram size that we are guaranteed any implementation must support. For IPv4, this is 576 bytes. IPv6 raises this to 1,280 bytes.
This pretty much means that you want to limit your datagram size to under 576 if you work over public internet and you control only one side of the exchange - that's what most of the standard UDP-based protocols do.
Also note that PMTU is a dynamic property of the path. This is one of the things TCP deals with for you. Unless you are ready to re-implement lots of sequencing, timing, and retransmission logic, use TCP for any critical networking. Benchmark, test, profile, i.e. prove that TCP is your bottleneck, only then consider UDP.

This is an interesting topic for me. Perhaps some practical results might be of interest when delivering chunky UDP data around the real world internet via UDP, and with a transmission rate of 1 packet a second, data continues to turn up with minimal packet loss up to about 2K. Over this and you start running into issues, but regularly we delivered 1600+ bytes packets without distress - this is over GPRS mobile networks as well as WAN world wide. At ~1K assuming the signal is stable (its not!) you get low packet loss.
Interestingly its not the odd packet, but often a squall of packets for a few seconds - which presumably is why VoIP calls just collapse occasionally.

Your own MTU is available in the registry, but the MTU in practice is going to the smallest MTU in the path between your machine and the destination. Its both variable and can only be determined empirically. There are a number of RFCs showing how to determine it.
LAN's can internally have very large MTU values, since the network hardware is typically homogeneous or at least centrally administrated.

For UDP applications you must handle end-to-end MTU yourself if you want to avoid IP fragmentation or dropped packets. The recommended approach for any application is to do your best to use PMTU to pick your maximum datagram, or send datagrams < minimum PMTU
https://www.rfc-editor.org/rfc/rfc5405#section-3.2
Unicast UDP Usage Guidelines for Application Designers "SHOULD NOT send datagrams that exceed the PMTU, SHOULD discover PMTU or send datagrams < minimum PMTU
Windows appears to settings and access to PMTU information via it's basic socket options interface:
You can make sure PMTU discover is on via IP_MTU_DISCOVER, and you can read the MTU via IP_MTU.
https://learn.microsoft.com/en-us/windows/desktop/winsock/ipproto-ip-socket-options

Here's a bit of Windows PowerShell that I wrote to check for Path MTU issues. (The general technique is not too hard to implement in other programming languages.) A lot of firewalls and routers are configured to drop all ICMP by people who don't know any better. Path MTU Discovery depends on being able to receive an ICMP Destination Unreachable message with Fragementation Needed set in response to sending a packet with Don't Fragment set. The Resolve IPv4 Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPsec actually does a really good job of explaining how discovery works.
function Test-IPAddressOrName($ipAddressOrName)
{
$ipaddress = $null
$isValidIPAddressOrName = [ipaddress]::TryParse($ipAddressOrName, [ref] $ipaddress)
if ($isValidIPAddressOrName -eq $false)
{
$hasResolveDnsCommand = $null -ne (Get-Command Resolve-DnsName -ErrorAction SilentlyContinue)
if ($hasResolveDnsCommand -eq $true)
{
$dnsResult = Resolve-DnsName -DnsOnly -Name $ipAddressOrName -ErrorAction SilentlyContinue
$isValidIPAddressOrName = $null -ne $dnsResult
}
}
return $isValidIPAddressOrName
}
function Get-NameAndIPAddress($ipAddressOrName)
{
$hasResolveDnsCommand = $null -ne (Get-Command Resolve-DnsName -ErrorAction SilentlyContinue)
$ipAddress = $null
$validIPAddress = [ipaddress]::TryParse($ipAddressOrName, [ref] $ipAddress)
$nameAndIp = [PSCustomObject] #{ 'Name' = $null; 'IPAddress' = $null }
if ($validIPAddress -eq $false)
{
if ($hasResolveDnsCommand -eq $true)
{
$dnsResult = Resolve-DnsName -DnsOnly $ipAddressOrName -Type A -ErrorAction SilentlyContinue
if ($null -ne $dnsResult -and $dnsResult.QueryType -eq 'A')
{
$nameAndIp.Name = $dnsResult.Name
$nameAndIp.IPAddress = $dnsResult.IPAddress
}
else
{
Write-Error "The name $($ipAddressOrName) could not be resolved."
$nameAndIp = $null
}
}
else
{
Write-Warning "Resolve-DnsName not present. DNS resolution check skipped."
}
}
else
{
$nameAndIp.IPAddress = $ipAddress
if ($hasResolveDnsCommand -eq $true)
{
$dnsResult = Resolve-DnsName -DnsOnly $ipAddress -Type PTR -ErrorAction SilentlyContinue
if ($null -ne $dnsResult -and $dnsResult.QueryType -eq 'PTR')
{
$nameAndIp.Name = $dnsResult.NameHost
}
}
}
return $nameAndIp
}
<#
.Synopsis
Performs a series of pings (ICMP echo requests) with Don't Fragment specified to discover the path MTU (Maximum Transmission Unit).
.Description
Performs a series of pings with Don't Fragment specified to discover the path MTU (Maximum Transmission Unit). An ICMP echo request
is sent with a random payload with a payload length specified by the PayloadBytesMinimun. ICMP echo requests of increasing size are
sent until a ping response status other than Success is received. If the response status is PackeTooBig, the last successful packet
length is returned as a reliable MTU; otherwise, if the respone status is TimedOut, the same size packet is retried up to the number
of retries specified. If all of the retries have been exhausted with a response status of TimedOut, the last successful packet
length is returned as the assumed MTU.
.Parameter UseDefaultGateway
If UseDefaultGateway is specified the default gateway reported by the network interface is used as the destination host.
.Parameter DestinationHost
The IP Address or valid fully qualified DNS name of the destination host.
.Parameter InitialTimeout
The number of milliseconds to wait for an ICMP echo reply. Internally, this is doubled each time a retry occurs.
.Parameter Retries
The number of times to try the ping in the event that no reply is recieved before the timeout.
.Parameter PayloadBytesMinimum
The minimum number of bytes in the payload to use. The minimum MTU for IPv4 is 68 bytes; however, in practice, it's extremely rare
to see an MTU size less than 576 bytes so the default value is 548 bytes (576 bytes total packet size minus an ICMP header of 28
bytes).
.Parameter PayloadBytesMaximum
The maximum number of bytes in the payload to use. An IPv4 MTU for jumbo frames is 9000 bytes. The default value is 8973 bytes (9001
bytes total packet size, which is 1 byte larger than the maximum IPv4 MTU for a jumbo frame, minus an ICMP header of 28 bytes).
.Example
Discover-PathMTU -UseDefaultGateway
.Example
Discover-PathMTU -DestinationHost '192.168.1.1'
.Example
Discover-PathMTU -DestinationHost 'www.google.com'
#>
function Discover-PathMtu
{
[CmdletBinding(SupportsShouldProcess = $false)]
param
(
[Parameter(Mandatory = $true, ParameterSetName = 'DefaultGateway')]
[switch] $UseDefaultGateway,
[Parameter(Mandatory = $true, Position = 0, ValueFromPipeline = $true, ParameterSetName = 'IPAddressOrName')]
[ValidateScript({ Test-IPAddressOrName $_ })]
[string] $DestinationHost,
[Parameter(ParameterSetName = 'IPAddressOrName')]
[Parameter(ParameterSetName = 'DefaultGateway')]
[int] $InitialTimeout = 3000,
[Parameter(ParameterSetName = 'IPAddressOrName')]
[Parameter(ParameterSetName = 'DefaultGateway')]
[int] $Retries = 3,
[Parameter(ParameterSetName = 'IPAddressOrName')]
[Parameter(ParameterSetName = 'DefaultGateway')]
$PayloadBytesMinimum = 548,
[Parameter(ParameterSetName = 'IPAddressOrName')]
[Parameter(ParameterSetName = 'DefaultGateway')]
$PayloadBytesMaximum = 8973
)
begin
{
$ipConfiguration = Get-NetIPConfiguration -Detailed | ?{ $_.NetProfile.Ipv4Connectivity -eq 'Internet' -and $_.NetAdapter.Status -eq 'Up' } | Sort { $_.IPv4DefaultGateway.InterfaceMetric } | Select -First 1
$gatewayIPAddress = $ipConfiguration.IPv4DefaultGateway.NextHop
$pingOptions = New-Object System.Net.NetworkInformation.PingOptions
$pingOptions.DontFragment = $true
$pinger = New-Object System.Net.NetworkInformation.Ping
$rng = New-Object System.Security.Cryptography.RNGCryptoServiceProvider
}
process
{
$pingIpAddress = $null
if ($UseDefaultGateway -eq $true)
{
$DestinationHost = $gatewayIPAddress
}
$nameAndIP = Get-NameAndIPAddress $DestinationHost
if ($null -ne $nameAndIP)
{
Write-Host "Performing Path MTU discovery for $($nameAndIP.Name) $($nameAndIP.IPAddress)..."
$pingReply = $null
$payloadLength = $PayloadBytesMinimum
$workingPingTimeout = $InitialTimeout
do
{
$payloadLength++
# Use a random payload to prevent compression in the path from potentially causing a false MTU report.
[byte[]] $payloadBuffer = (,0x00 * $payloadLength)
$rng.GetBytes($payloadBuffer)
$pingCount = 1
do
{
$pingReply = $pinger.Send($nameAndIP.IPAddress, $workingPingTimeout, $payloadBuffer, $pingOptions)
if ($pingReply.Status -notin 'Success', 'PacketTooBig', 'TimedOut')
{
Write-Warning "An unexpected ping reply status, $($pingReply.Status), was received in $($pingReply.RoundtripTime) milliseconds on attempt $($pingCount)."
}
elseif ($pingReply.Status -eq 'TimedOut')
{
Write-Warning "The ping request timed out while testing a packet of size $($payloadLength + 28) using a timeout value of $($workingPingTimeout) milliseconds on attempt $($pingCount)."
$workingPingTimeout = $workingPingTimeout * 2
}
else
{
Write-Verbose "Testing packet of size $($payloadLength + 28). The reply was $($pingReply.Status) and was received in $($pingReply.RoundtripTime) milliseconds on attempt $($pingCount)."
$workingPingTimeout = $InitialTimeout
}
Sleep -Milliseconds 10
$pingCount++
} while ($pingReply.Status -eq 'TimedOut' -and $pingCount -le $Retries)
} while ($payloadLength -lt $PayloadBytesMaximum -and $pingReply -ne $null -and $pingReply.Status -eq 'Success')
if ($pingReply.Status -eq 'PacketTooBig')
{
Write-Host "Reported IPv4 MTU is $($ipConfiguration.NetIPv4Interface.NlMtu). The discovered IPv4 MTU is $($payloadLength + 27)."
}
elseif ($pingReply.Status -eq 'TimedOut')
{
Write-Host "Reported IPv4 MTU is $($ipConfiguration.NetIPv4Interface.NlMtu). The discovered IPv4 MTU is $($payloadLength + 27), but may not be reliable because the packet appears to have been discarded."
}
else
{
Write-Host "Reported IPv4 MTU is $($ipConfiguration.NetIPv4Interface.NlMtu). The discovered IPv4 MTU is $($payloadLength + 27), but may not be reliable, due to an unexpected ping reply status."
}
return $payloadLength + 27
}
else
{
Write-Error "The name $($DestinationHost) could not be resolved. No Path MTU discovery will be performed."
}
}
end
{
if ($null -ne $pinger)
{
$pinger.Dispose()
}
if ($null -ne $rng)
{
$rng.Dispose()
}
}
}

Related

DPDK TX checksum offload for TCP has fixed offset from correct checksum

I try to compute checksum for IP and TCP via DPDK checksum offload. The code does the following:
ipv4_hdr->hdr_checksum = 0;
mb->l2_len = eth_hdr_len;
mb->l3_len = ipv4_hdr_len;
mb->ol_flags = PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
tcp_hdr->cksum = rte_ipv4_phdr_cksum(ipv4_hdr, mb->ol_flags);
rte_eth_conf has been set:
port_conf.txmode.offloads = DEV_TX_OFFLOAD_IPV4_CKSUM | DEV_TX_OFFLOAD_TCP_CKSUM;
the IP checksum is right, TCP checksum has fixed offset from the correct checksum (under my environment, it is 4)
Any ideas?
Please refer to the answer DPDK HW and SW checksum. For enabling HW checksum one has to set the current checksum value to 0.
/* HW check sum */
ip_hdr->hdr_checksum = 0;
tcp_hdr = (struct tcp_hdr *)((char *)ip_hdr + info->l3_len);
tcp_hdr->cksum = 0;

Opening a serialport, transmitting on txd, dtr and rts

Morning all,
I have developed an app with C# and Windows Forms that opens a serialport and transmits on txd, dtr and rts.
I want to be able to do this with tcl/tk but finding serialport tutorials on tcl/tk is proving quite difficult. I did find something on Stackoverflow Here. But when running it:
It says "couldn't open serial "COM7": permission denied"
Does anyone know why permission is denied and how to grant permission? Also does this code even work.
Does anyone have any sample code I can try or can point me to a good understandable source please?
You could have a look at this example that uses a Tcl/tk to read data from a serial port:
############################################
# A first quick test if you have a modem
# open com2: for reading and writing
# For UNIX'es use the appropriate devices /dev/xxx
set serial [open com2: r+]
# setup the baud rate, check it for your configuration
fconfigure $serial -mode "9600,n,8,1"
# don't block on read, don't buffer output
fconfigure $serial -blocking 0 -buffering none
# Send some AT-command to your modem
puts -nonewline $serial "AT\r"
# Give your modem some time, then read the answer
after 100
puts "Modem echo: [read $serial]"
############################################
# Example (1): Poll the comport periodically
set serial [open com2: r+]
fconfigure $serial -mode "9600,n,8,1"
fconfigure $serial -blocking 0 -buffering none
while {1} {
set data [read $serial] ;# read ALL incoming bytes
set size [string length $data] ;# number of received byte, may be 0
if { $size } {
puts "received $size bytes: $data"
} else {
puts "<no data>"
update ;# Display output, allow to close wish-window
}
############################################
# Example (2): Fileevents
set serial [open com2: r+]
fconfigure $serial -mode "9600,n,8,1" -blocking 0 -buffering none -translation binary
fileevent $serial readable [list serial_receiver $serial]
proc serial_receiver { chan } {
if { [eof $chan] } {
puts stderr "Closing $chan"
catch {close $chan}
return
}
set data [read $chan]
set size [string length $data]
puts "received $size bytes: $data"
}
(disclaimer: this is taken verbatim from here)
EDIT: I am sorry but I do not have enough reputation to comment, but it is probably a good idea to specify the platform to narrow down what the permission issues might be related to.

Simple UDP server OCaml/Async

I'm trying to do a simple UDP server using OCaml and the Async API but I'm stuck. I can't make this simple example work.
let wait_for_datagram () : unit Deferred.t =
let port = 9999 in
let addr = Socket.Address.Inet.create Unix.Inet_addr.localhost ~port in
let%bind socket = Udp.bind addr in
let socket = Socket.fd socket in
let stop = never () in
let config = Udp.Config.create ~stop () in
let callback buf _ : unit = failwith "got a datagram" in
Udp.recvfrom_loop ~config socket callback
I test it with:
echo -n "hello goodbye" > /dev/udp/localhost/9999
Nothing happens in my program. I tried to investigate with other tools.
I see a destination unreachable packet with Wireshark and lsof shows me this:
> lsof -i :9999
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
main.exe 77564 nemo 5u IPv4 0x25251bcc3485235f 0t0 UDP localhost:distinct
What am I doing wrong here?
The code looks ok to me. I think localhost is resolved to IPv6 address by default, and you just send it there.
Try to force using IPv4 protocol
echo -n "hello goodbye" | nc -4 -u -w0 localhost 9999
or specify explicit IPv4 address
echo -n "hello goodbye" > /dev/udp/127.0.0.1/9999

GOLANG, HTTP having "use of closed network connection" error

I am getting a lot of error like below mentioned,
read tcp xx.xx.xx.xx:80: use of closed network connection
read tcp xx.xx.xx.xx:80: connection reset by peer
//function for HTTP connection
func GetResponseBytesByURL_raw(restUrl, connectionTimeOutStr, readTimeOutStr string) ([]byte, error) {
connectionTimeOut, _ /*err*/ := time.ParseDuration(connectionTimeOutStr)
readTimeOut, _ /*err*/ := time.ParseDuration(readTimeOutStr)
timeout := connectionTimeOut + readTimeOut // time.Duration((strconv.Atoi(connectionTimeOutStr) + strconv.Atoi(readTimeOutStr)))
//timeout = 200 * time.Millisecond
client := http.Client{
Timeout: timeout,
}
resp, err := client.Get(restUrl)
if nil != err {
logger.SetLog("Error GetResponseBytesByURL_raw |err: ", logs.LevelError, err)
return make([]byte, 0), err
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
return body, err
}
Update (July 14):
Server : NumCPU=8, RAM=24GB, GO=go1.4.2.linux-amd64
I am getting such error during some high traffic.
20000-30000 request per minutes, and I have a time frame of 500ms to fetch response from third party api.
netstat status from my server (using : netstat -nat | awk '{print $6}' | sort | uniq -c | sort -n) to get frequency
1 established)
1 Foreign
9 LISTEN
33 FIN_WAIT1
338 ESTABLISHED
5530 SYN_SENT
32202 TIME_WAIT
sysctl -p
**sysctl -p**
fs.file-max = 2097152
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 65536
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv6.bindv6only = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
error: "net.ipv4.icmp_ignore_bogus_error_messages" is an unknown key
kernel.exec-shield = 1
kernel.randomize_va_space = 1
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
When making connections at a high rate over the internet, it's very likely you're going to encounter some connection problems. You can't mitigate them completely, so you may want to add retry logic around the request. The actual error type at this point probably doesn't matter, but matching the error string for use of closed network connection or connection reset by peer is about the best you can do if you want to be specific. Make sure to limit the retries with a backoff, as some systems will drop or reset connections as a way to limit request rates, and you may get more errors the faster you reconnect.
Depending on the number of remote hosts you're communicating with, you will want to increase Transport.MaxIdleConnsPerHost (the default is only 2). The fewer hosts you talk to, the higher you can set this. This will decrease the number of new connections made, and speed up the requests overall.
If you can, try the go1.5 beta. There have been a couple changes around keep-alive connections that may help reduce the number of errors you see.
I recommend implementing an exponential back off or some other rate limiting mechanism on your side of the wire. There's not really anything you can do about those error, and using exponential back off won't necessarily make you get the data any faster either. But it can ensure that you get all the data and the API you're pulling from will surely appreciate the reduced traffic. Here's a link to one I found on GitHub; https://github.com/cenkalti/backoff
There was another popular option as well though I haven't used either. Implementing one yourself isn't terribly difficult either and I could provide some sample of that on request. One thing I do recommend based off my experience is make sure you're using a retry function that has an abort channel. If you get to really long back off times then you'll want some way for the caller to kill it.

wireshark count packets by port

I have a very large trace file and am trying to use Wireshark to determine which dest port has the most packets sent to it. Is there a way to get counts of packets sent to particular ports? Or to sort by number of packets sent a port?
You can write a simple wireshark listener in lua.
local tap
local ports = {}
local function packet(pinfo, tvb, userdata)
-- store number of packets per each port
local port = pinfo.dst_port
ports[port] = (ports[port] or 0) + 1
end
local function draw(userdata)
local maxi,maxv = 0,0
-- print all gathered statictics and find max
for i,v in pairs(ports) do
print(i .. ":", v)
if maxv < v then
maxi,maxv = i,v
end
end
print ("Max:", maxi, maxv)
end
local function reset(userdata)
ports = {}
end
local function show_ports()
tap = Listener.new()
tap.packet = packet
tap.draw = draw
tap.reset = reset
end
register_stat_cmd_arg('ports', show_ports)
Try it:
tshark -X lua_script:ports.lua -z ports -r in.pcap

Resources