Are TCP/UDP IP packets with a source port below 1024 possible - tcp

I am analyzing some events against dns servers running unbound. In the course of this investigation I am running into traffic involving queries to the dns servers that are reported as having in some cases a source port between 1 and 1024. As far as my knowledge goes these are reserved for services so there should never be traffic originating / initiated from those to a server.
Since I also know this is a practice, not a law, that evolved over time, I know there is no technical limitation to put any number in the source port field of a packet. So my conclusion would be that these queries were generated by some tool in which the source port is filled with a random value (the frequency is about evenly divided over 0-65535, except for a peak around 32768) and that this is a deliberate attack.
Can someone confirm/deny the source port theory and vindicate my conclusion or declare me a total idiot and explain why?
Thanks in advance.
Edit 1: adding more precise info to settle some disputes below that arose due to my incomplete reporting.
It's definitely not a port scan. It was traffic arriving on port 53 UDP and unbound accepted it apparently as an (almost) valid dns query, while generating the following error messages for each packet:
notice: remote address is <ipaddress> port <sourceport>
notice: sendmsg failed: Invalid argument
$ cat raw_daemonlog.txt | egrep -c 'notice: remote address is'
256497
$ cat raw_daemonlog.txt | egrep 'notice: remote address is' | awk '{printf("%s\n",$NF)}' | sort -n | uniq -c > sourceportswithfrequency.txt
$ cat sourceportswithfrequency.txt | wc -l
56438
So 256497 messages, 56438 unique source ports used
$ cat sourceportswithfrequency.txt | head
5 4
3 5
5 6
So the lowest source port seen was 4 which was used 5 times
$ cat sourceportswithfrequency.txt | tail
8 65524
2 65525
14 65526
1 65527
2 65528
4 65529
3 65530
3 65531
3 65532
4 65534
So the highest source port seen was 65534 and it was used 4 times.
$ cat sourceportswithfrequency.txt | sort -n | tail -n 25
55 32786
58 35850
60 32781
61 32785
66 32788
68 32793
71 32784
73 32783
88 32780
90 32791
91 32778
116 2050
123 32779
125 37637
129 7077
138 32774
160 32777
160 57349
162 32776
169 32775
349 32772
361 32773
465 32769
798 32771
1833 32768
So the peak around 32768 is real.
My original question still stands: does this traffic pattern suggest an attack or is there an logical explanation for, for instance, the traffic with source ports < 1024?

As far as my knowledge goes these are reserved for services so there should never be traffic originating / initiated from those to a server.
It doesn't matter what the source port number is, as long as it's between 1 and 65,535. It's not like a source port of 53 means that there is a DNS server listening on the source machine.
The source port is just there to allow multiple connections / in-flight datagrams from one machine to another machine on the same destination port.
See also Wiki: Ephemeral port:
The Internet Assigned Numbers Authority (IANA) suggests the range 49152 to 65535 [...] for dynamic or private ports.[1]

That sounds like a port scan.
There are 65536 distinct and usable port numbers. (ibid.)
FYI: The TCP and UDP port 32768 is registered and used by IBM FileNet TMS.

Related

How to fix high latency and retransmission rate in Ubuntu 18.04

I installed Ubuntu 18.04 on Hyper-V Win Server 2016.
And network performance of the Ubuntu is bad: I'm hosting few sites (Apache + PHP) and sometime response time is > 10 seconds. Sometimes it is fast.
As I troubleshooted, I see this netstat results:
# netstat -s | egrep -i 'loss|retran'
3447700 segments retransmitted
226 times recovered from packet loss due to fast retransmit
Detected reordering 6 times using reno fast retransmit
TCPLostRetransmit: 79831
45 timeouts after reno fast retransmit
6247 timeouts in loss state
2056435 fast retransmits
107095 retransmits in slow start
TCPLossProbes: 220607
TCPLossProbeRecovery: 3753
TCPSynRetrans: 90564
What can be cause of such high "segments retransmitted" number? And how to fix it?
Few notes:
- VMQ is disabled for Ubuntu VM
- The host system Network adapter is Intel I210
- I disabled IPv6 both on host and in VM
Here is WireShark showing, that it takes ~7 seconds to connect (just initial connection) to my site Propovednik.com:
Sep 20: So far, the issue seems to be caused by OVH / SoYouStart bad network:
This command shows 20-30% packets loss:
sudo ping us.soyoustart.com -c 10 -i 0.2 -p 00 -s 1200 -l 5
The problem could be anywhere along the network, including the workstation where you work from. I suggest you check the network as retransmissions and packetloss means that either something is malfunctioning or misconfigured. If this is on a wireless network, you could be out of range of your router.
I am pinging the website you noted from my computer and there is no packetloss.

ipvsadm -L -n suddenly showing no active connections

I have a very odd problem in a proxy cluster of four Squid proxies:
One of the machine is the master. The mater is running ldirectord which is checking the availability of all four machines, distributing new client connections.
All over a sudden, after years of operation I'm encountering this problem:
1) The machine serving the master role is not being assigned new connections, old connections are served until a new proxy is assigned to the clients.
2) The other machines are still processing requests, taking over the clients from the master (so far, so good)
3) "ipvsadm -L -n" shows ever-decreasing ActiveConn and InActConn values.
Once I migrate the master role to another machine, "ipvsadm -L -n" is showing lots of active and inactive connections, until after about an hour the same thing happens on the new master.
Datapoint: This happened again this afternoon, and now "ipvsadm -L -n" shows:
TCP 141.42.1.215:8080 wlc persistent 1800
-> 141.42.1.216:8080 Route 1 98 0
-> 141.42.1.217:8080 Route 1 135 0
-> 141.42.1.218:8080 Route 1 1 0
-> 141.42.1.219:8080 Route 1 2 0
No change in the numbers quite some time now.
Some more stats (ipvsadm -L --stats -n):
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Conns InPkts OutPkts InBytes OutBytes
-> RemoteAddress:Port
TCP 141.42.1.215:8080 1990351 87945600 0 13781M 0
-> 141.42.1.216:8080 561980 21850870 0 2828M 0
-> 141.42.1.217:8080 467499 23407969 0 3960M 0
-> 141.42.1.218:8080 439794 19364749 0 2659M 0
-> 141.42.1.219:8080 521378 23340673 0 4335M 0
Value for "Conns" is constant now for all realservers and the virtual server now. Traffic is still flowing (InPkts increasing).
I examined the output of "ipvsadm -L -n -c" and found:
25 FIN_WAIT
534 NONE
977 ESTABLISHED
Then I waited a minute and got:
21 FIN_WAIT
515 NONE
939 ESTABLISHED
It turns out that a local bird installation was injecting router for the IP of the virtual server and thus taking precedence over ARP.

How do I connect to kafka cluster on Virtualbox?

Here's my setup on local: 3 VMs (using Virtualbox), kafka and zookeeper installed on all three. They are all talking to each other as well.
I am trying to use kafka-console-producer from my local, which requires broker-list. I am supplying the IPs of my VMs but it doesn't seem to be working. I've tried the advertised.host properties on the VMs too but has no effect for me. Here's my server.properties from the three machines:
Server 1:
broker.id=4
port=9092
host.name=10.30.3.4
advertised.host.name=10.30.3.4
advertised.port=9092
zookeeper.connect=10.30.3.4:2181
zookeeper.connection.timeout.ms=6000
Server 2:
broker.id=3
port=9092
host.name=10.30.3.3
advertised.host.name=10.30.3.3
advertised.port=9092
zookeeper.connect=10.30.3.3:2181
zookeeper.connection.timeout.ms=6000
Server 3:
broker.id=2
port=9092
host.name=10.30.3.2
advertised.host.name=10.30.3.2
advertised.port=9092
zookeeper.connect=10.30.3.2:2181
zookeeper.connection.timeout.ms=6000
My virtualbox also has port forwarding setup:
Similarly for other two machines too ports are only tweaked a bit.
I am able to connect to zookeeper just fine, so:
bin/zkCli.sh -server 127.0.0.1:9999
is able to connect to zookeeper on VM. But if I try connecting kafka-console-producer it fails when I try sending messages:
bin/kafka-console-producer.sh --broker-list 127.0.0.1:9502 --topic partition2replica2 --timeout 3000
leads to:
[2016-02-17 15:06:36,943] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
hi
there
[2016-02-17 15:07:23,699] WARN Failed to send producer request with correlation id 3 to broker 3 with data for partitions [partition2replica2,1] (kafka.producer.async.DefaultEventHandler)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.producer.SyncProducer.send(SyncProducer.scala:101)
at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:255)
at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:106)
at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:100)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:100)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-02-17 15:07:25,318] WARN Failed to send producer request with correlation id 7 to broker 3 with data for partitions [partition2replica2,1] (kafka.producer.async.DefaultEventHandler)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.producer.SyncProducer.send(SyncProducer.scala:101)
at kafka.producer.async.DefaultEventHandler.kafka$producer$async
Not sure what I am doing wrong here? Any ideas? (I can provide ifconfig output if anyone wants). Any help will be appreciated.
[Edit 1]: Adding output of zookeeper quorum:
That seems to be in quorum:
echo stat| nc 10.30.3.2 2181
Received: 81
Sent: 80
Connections: 1
Outstanding: 0
Mode: follower
Node count: 149
echo stat| nc 10.30.3.3 2181
Received: 660
Sent: 664
Connections: 1
Outstanding: 0
Zxid: 0x600000109
Mode: leader
Node count: 149
echo stat| nc 10.30.3.4 2181
Received: 293
Sent: 295
Connections: 1
Outstanding: 0
Zxid: 0x600000109
Mode: follower
Node count: 149
As far as I can understand your setup, the zookeepers on each node should also been in Quorum with each other to support the 3 kafka servers instances as one cluster. You have provided kafka config only, so I cannot make out if they are configured that way.
You can check by using the 4 letter command on each zookeeper node like below
echo stat | nc <zk ip> <zk port>
echo mntr | nc <zk ip> <zk port>
One should be a "leader" and other two should be "followers".
I am not sure if they will work as expected if they are not configured to be in quorum.

Internet Provider with "Private WAN" to the clients?

This is strange. How this actually works. So far I know it is "impossible" to have a network like this.
I'm going to explain in details how my network works.
I have a LAN. 192.168.1.0/24 and router is on 192.168.1.1, This router has a public address.
I can share the IP address because I'm running there a server for tests nothing more. It is ok so far.
Now the magic happens.
When I trace the route to an IP I got this (To google DNS):
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
1 zonhub.home (192.168.1.1) 1.160 ms 1.676 ms 1.340 ms
2 * * *
3 10.137.211.97 (10.137.211.97) 12.915 ms 12.526 ms 12.145 ms
4 10.255.49.90 (10.255.49.90) 10.349 ms 10.255.49.102 (10.255.49.102) 11.483 ms 11.042 ms
5 80.157.128.249 (80.157.128.249) 34.577 ms 80.157.130.41 (80.157.130.41) 32.917 ms 80.157.130.33 (80.157.130.33) 30.602 ms
6 mad-sa3-i.MAD.ES.NET.DTAG.DE (217.5.95.161) 33.396 ms 80.157.128.22 (80.157.128.22) 27.107 ms mad-sa3-i.MAD.ES.NET.DTAG.DE (217.5.95.161) 29.510 ms
7 80.157.128.22 (80.157.128.22) 28.050 ms 72.14.235.20 (72.14.235.20) 32.767 ms 80.157.128.22 (80.157.128.22) 27.932 ms
8 72.14.235.20 (72.14.235.20) 29.780 ms 72.14.235.18 (72.14.235.18) 27.020 ms 26.706 ms
9 216.239.43.233 (216.239.43.233) 49.456 ms 209.85.240.191 (209.85.240.191) 44.034 ms 216.239.43.233 (216.239.43.233) 51.935 ms
10 72.14.236.191 (72.14.236.191) 53.374 ms 209.85.253.20 (209.85.253.20) 50.699 ms 216.239.43.233 (216.239.43.233) 44.918 ms
11 209.85.251.231 (209.85.251.231) 50.151 ms * 216.239.49.45 (216.239.49.45) 47.309 ms
12 google-public-dns-a.google.com (8.8.8.8) 51.536 ms 50.180 ms 45.505 ms
What is that 2nd, 3rd and 4th hop? How can it be on class A private address when 192.168.1.1 is running the NAT service where I have my LAN and my external 3 publics addresses (yes I have 3 and is 3 "class A" IP's on 88,89,93 network).
Another thing is how on 4th hop we have the 2nd octet 255?
Anyone feel free to traceroute my no-ip domain: synackfiles.no-ip.org
Just don't mess with my router (it blocks if you port scan or fail to log in in ssh or http auth so you get banned for this. If you just traceroute it is fine) :P
Now, second magic and weird stuff happens.
I'm going to run nmap. So i GOT THIS:
sudo nmap -sV -A -O 10.137.211.113 -vv -p 1-500 -Pn
Starting Nmap 6.00 ( http://nmap.org ) at 2013-11-14 15:24 WET
NSE: Loaded 93 scripts for scanning.
NSE: Script Pre-scanning.
NSE: Starting runlevel 1 (of 2) scan.
NSE: Starting runlevel 2 (of 2) scan.
Initiating Parallel DNS resolution of 1 host. at 15:24
Completed Parallel DNS resolution of 1 host. at 15:24, 0.04s elapsed
Initiating SYN Stealth Scan at 15:24
Scanning 10.137.211.113 [500 ports]
SYN Stealth Scan Timing: About 30.40% done; ETC: 15:26 (0:01:11 remaining)
SYN Stealth Scan Timing: About 60.30% done; ETC: 15:26 (0:00:40 remaining)
Completed SYN Stealth Scan at 15:26, 101.14s elapsed (500 total ports)
Initiating Service scan at 15:26
Initiating OS detection (try #1) against 10.137.211.113
Initiating Traceroute at 15:26
Completed Traceroute at 15:26, 9.05s elapsed
Initiating Parallel DNS resolution of 1 host. at 15:26
Completed Parallel DNS resolution of 1 host. at 15:26, 0.01s elapsed
NSE: Script scanning 10.137.211.113.
NSE: Starting runlevel 1 (of 2) scan.
Initiating NSE at 15:26
Completed NSE at 15:26, 0.00s elapsed
NSE: Starting runlevel 2 (of 2) scan.
Nmap scan report for 10.137.211.113
Host is up (0.0010s latency).
All 500 scanned ports on 10.137.211.113 are filtered
Device type: general purpose|specialized|media device
Running: Barrelfish, Microsoft Windows 2003|PocketPC/CE|XP, Novell NetWare 3.X, Siemens embedded, Telekom embedded
OS CPE: cpe:/o:barrelfish:barrelfish cpe:/o:microsoft:windows_server_2003::sp1 cpe:/o:microsoft:windows_server_2003::sp2 cpe:/o:microsoft:windows_ce cpe:/o:microsoft:windows_xp:::professional cpe:/o:novell:netware:3.12
Too many fingerprints match this host to give specific OS details
TCP/IP fingerprint:
SCAN(V=6.00%E=4%D=11/14%OT=%CT=%CU=%PV=Y%G=N%TM=5284EBAB%P=armv7l-unknown-linux-gnueabi)
T7(R=Y%DF=N%TG=80%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=R)
U1(R=N)
IE(R=N)
TRACEROUTE (using proto 1/icmp)
HOP RTT ADDRESS
1 2.32 ms zonhub.home (192.168.1.1)
2 ... 30
NSE: Script Post-scanning.
NSE: Starting runlevel 1 (of 2) scan.
NSE: Starting runlevel 2 (of 2) scan.
Read data files from: /usr/bin/../share/nmap
OS and Service detection performed. Please report any incorrect results at http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 122.65 seconds
Raw packets sent: 1109 (49.620KB) | Rcvd: 4 (200B)
Well, this is odd. I don't know how my country's WAN is designed and built.
I'm from Portugal and my ISP is "ZON TVCABO". You can search now. :P
This is very very very interesting..
Sincerely,
int3
I cannot tell you how your providers WAN is built - but in order to
save public IPs - one can design an ISPs internal network with private
IPs. The routers that are not needed to be available from public will
have private IPs only - the IPs assigned to you can be routed to your
uplink over routers that are ISP internally only.
the 2nd hop has no tracing allowed, but allows to forward them.
the 4th - 10.255.x.x is a private IP in the 10.0.0.0/8 A range. (you
can use numbers from 0-255)

Scaling nginx with static files -- non-Persistent requests kill req/s

Working on a project where we need to server a small static xml file ~40k / s.
All incoming requests are sent to the server from HAProxy. However, none of the requests will be persistent.
The issue is that when benchmarking with non-Persistent requests, the nginx instance caps out at 19 114 req/s. When persistent connections are enabled, performance increases by nearly an order of magnitude, to 168 867 req/s. The results are similar with G-wan.
When benchmarking non-persistent requests, CPU usage is minimal.
What can I do to increase performance with non-persistent connections and nginx?
[root#spare01 lighttpd-weighttp-c24b505]# ./weighttp -n 1000000 -c 100 -t 16 "http://192.168.1.40/feed.txt"
finished in 52 sec, 315 millisec and 603 microsec, 19114 req/s, 5413 kbyte/s
requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored
status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 290000000 bytes total, 231000000 bytes http, 59000000 bytes data
[root#spare01 lighttpd-weighttp-c24b505]# ./weighttp -n 1000000 -c 100 -t 16 -k "http://192.168.1.40/feed.txt"
finished in 5 sec, 921 millisec and 791 microsec, 168867 req/s, 48640 kbyte/s
requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored
status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 294950245 bytes total, 235950245 bytes http, 59000000 bytes data
Your 2 tests are similar (except HTTP Keep-Alives):
./weighttp -n 1000000 -c 100 -t 16 "http://192.168.1.40/feed.txt"
./weighttp -n 1000000 -c 100 -t 16 -k "http://192.168.1.40/feed.txt"
And the one with HTTP Keep-Alives is 10x faster:
finished in 52 sec, 19114 req/s, 5413 kbyte/s
finished in 5 sec, 168867 req/s, 48640 kbyte/s
First, HTTP Keep-Alives (persistant connections) make HTTP requests run faster because:
Without HTTP Keep-Alives, the client must establish a new CONNECTION for EACH request (this is slow because of the TCP handshake).
With HTTP Keep-Alives, the client can send all requests at once (using the SAME CONNECTION). This is faster because there are less things to do.
Second, you say that the static file XML size is "small".
Is "small" nearer to 1 KB or 1 MB? We don't know. But that makes a huge difference in terms of available options to speedup things.
Huge files are usually served through sendfile() because it works in the kernel, freeing the usermode server from the burden of reading from disk and buffering.
Small files can use more flexible options available for application developers in usermode, but here also, file size matters (bytes and kilobytes are different animals).
Third, you are using 16 threads with your test. Are you really enjoying 16 PHYSICAL CPU Cores on BOTH the client and the server machines?
If that's not the case, then you are simply slowing-down the test to the point that you are no longer testing the web servers.
As you see, many factors have an influence on performance. And there are more with OS tuning (the TCP stack options, available file handles, system buffers, etc.).
To get the most of a system, you need to examinate all those parameters, and pick the best for your particular exercise.

Resources