em1: Watchdog timeout -- resetting - freebsd 8.3 / network down - networking

i have a major issue that i can't find nor heads nor tails of. I have googled this error, but i have not found any relevant solutions.
The problem:
I have about 8 servers, all running freebsd 8.3 p3 / p4. This fileserver is pushing around 300-400 mb/s.
This is the second time it happens. The network card just seems to die. I have 2 network cards in it, and i can reach the server via private network, and it all works okay, only that the public network is completely down. I have tried restarting the network interfaces: /etc/rc.d/netif restart && service routing restart | ifconfig em1 down && ifconfig em1 up, but with no success.
I can only bring the connectivity back if i reboot the server.
Below is the output from dmesg.boot that shows the network card drivers info.
em0: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xf020-0xf03f mem 0xf7b00000-0xf7b1ffff,0xf7b25000-0xf7b25fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:25:90:7a:8e:9f
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xf7b24000-0xf7b243ff irq 16 at device 26.0 on pci0
em1: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xd000-0xd01f mem 0xf7900000-0xf791ffff,0xf7920000-0xf7923fff irq 16 at device 0.0 on pci3
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:25:90:7a:8e:9e
----------------------------
pciconf -lv
em1#pci0:3:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
class = network
subclass = ethernet
em0#pci0:0:25:0: class=0x020000 card=0x150215d9 chip=0x15028086 rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
class = network
subclass = ethernet
I would really love some help to debug and fix this, because it usually happens while i am sleeping, at random days, and it's driving me crazy. I love my sleep.

This is a supermicro server, right?
cat /var/run/dmesg.boot | grep MSI
em0: Using an MSI interrupt
em1: Using MSIX interrupts with 3 vectors
Your answer is probably here: http://forums.freebsd.org/showthread.php?t=27736

Related

rPi OS upgrade introduced Predictable Network Interface Names; can't get eth0 back and dhcp working again [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed last month.
Improve this question
System: Host: rpi32 Kernel: 5.15.56-v7+ armv7l bits: 32 Console: tty 0 Distro: Raspbian GNU/Linux 11 (bullseye)
Machine: Type: ARM Device System: Raspberry Pi 3 Model B Rev 1.2 details: BCM2835 rev: a22082 serial: 000000009a5073f1
I had a working machine before the upgrade, ntp, dhcp (is actually isc-dhcpserver), dns all working.
Then upgraded the OS (to Bullseye) and could no longer connect to the rPi.
dmesg revealed that eth0 could not be connected to.
The interface was identified as enxb827eb5073f1. en = Ethernet plus MAC address.
Some research revealed that what I am seeing is called "Predictable Network Interface Names".
It said this is the new standard/approach, due to multi-interface machines not necessarily assigning the interface name at kernel boot; e.g., it could be eth0 on one boot, and eth1 during another; not good for firewalls, etc.
So I changed the following config files to get dhcp working:
/etc/default/isc-dhcp-server
/etc/network/interfaces
/etc/dhcp/dhcpd.conf
... and changed eth0 to enxb827eb5073f1.
No luck.
sudo service dhcpcd status
● dhcpcd.service - dhcpcd on all interfaces
Loaded: loaded (/lib/systemd/system/dhcpcd.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/dhcpcd.service.d
└─wait.conf
Active: failed (Result: exit-code) since Fri 2022-08-19 15:04:18 AEST; 28min ago
Process: 859 ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -w (code=exited, status=6)
CPU: 11ms
Aug 19 15:04:18 rpi32 systemd[1]: Starting dhcpcd on all interfaces...
Aug 19 15:04:18 rpi32 dhcpcd[859]: Not running dhcpcd because /etc/network/interfaces
Aug 19 15:04:18 rpi32 dhcpcd[859]: defines some interfaces that will use a
Aug 19 15:04:18 rpi32 dhcpcd[859]: DHCP client or static address
Aug 19 15:04:18 rpi32 systemd[1]: dhcpcd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Aug 19 15:04:18 rpi32 systemd[1]: dhcpcd.service: Failed with result 'exit-code'.
Aug 19 15:04:18 rpi32 systemd[1]: Failed to start dhcpcd on all interfaces.
and
dhcpd -t /etc/dhcp/dhcpd.conf
/etc/dhcp/dhcpd.conf: interface name too long (is 20)
Researching this topic pointed to incorrect dhcpd config, pointing to udev rules, and I do not understand, and from what I could see, did not contain interface reference.
I read here: https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/ that this naming scheme can be reverted by adding this: net.ifnames=0 to the kernel command line (/boot/cmdline.txt).
This is what I did. I reverted all changes in the three config files listed above, plus in the cmdline.txt.
(I rebooted as required after these changes.)
and dhcpd -t /etc/dhcp/dhcpd.conf still returns:
/etc/dhcp/dhcpd.conf: interface name too long (is 20)
All services work, except dhcp (ntp is back up as well, as no changes where made here WRT eth0 changes).
Now I wonder what else I need to do to get dhcp working again.
Config files:
ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.8 netmask 255.255.255.0 broadcast 192.168.1.255
ether b8:27:eb:50:73:f1 txqueuelen 1000 (Ethernet)
RX packets 14682 bytes 1148952 (1.0 MiB)
RX errors 0 dropped 3460 overruns 0 frame 0
TX packets 7079 bytes 1063400 (1.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Local Loopback)
RX packets 105 bytes 10173 (9.9 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 105 bytes 10173 (9.9 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
cat /etc/default/isc-dhcp-server
# Path to dhcpd's config file (default: /etc/dhcp/dhcpd.conf).
DHCPDv4_CONF=/etc/dhcp/dhcpd.conf
#DHCPDv6_CONF=/etc/dhcp/dhcpd6.conf
# Path to dhcpd's PID file (default: /var/run/dhcpd.pid).
DHCPDv4_PID=/var/run/dhcpd.pid
#DHCPDv6_PID=/var/run/dhcpd6.pid
#OPTIONS=""
# On what interfaces should the DHCP server (dhcpd) serve DHCP requests?
# Separate multiple interfaces with spaces, e.g. "eth0 eth1".
#INTERFACESv4="enxb827eb5073f1"
INTERFACESv4="eth0"
#INTERFACESv6=""
cat /etc/dhcpcd.conf
# A sample configuration for dhcpcd.
# Inform the DHCP server of our hostname for DDNS.
hostname
# Use the hardware address of the interface for the Client ID.
clientid
# Persist interface configuration when dhcpcd exits.
persistent
# Rapid commit support.
option rapid_commit
# A list of options to request from the DHCP server.
option domain_name_servers, domain_name, domain_search, host_name
option classless_static_routes
# Respect the network MTU. This is applied to DHCP routes.
option interface_mtu
# Most distributions have NTP support.
#option ntp_servers
# A ServerID is required by RFC2131.
require dhcp_server_identifier
# Generate SLAAC address using the Hardware Address of the interface
#slaac hwaddr
# OR generate Stable Private IPv6 Addresses based from the DUID
slaac private
cat /etc/dhcp/dhcpd.conf
# 190803-1530 installed DHCP server on rPi32
#
# 170611-1933 MaxG: changed from none to interim
#ddns-update-style none;
ddns-update-style interim;
# 170612-2300 MaxG: added based on
# https://blog.bigdinosaur.org/running-bind9-and-isc-dhcp/
ddns-updates on;
update-static-leases on;
ddns-domainname "argylecourt.lan";
ddns-rev-domainname "in-addr.arpa.";
authoritative;
# 190804-1424 MaxG: added key and 2 zones
key DHCP_UPDATER {
algorithm HMAC-MD5.SIG-ALG.REG.INT;
# Important: Replace this key with your generated key.
# Also note that the key should be surrounded by quotes.
secret "someKeyBlah";
};
zone argylecourt.lan. {
primary 127.0.0.1;
key DHCP_UPDATER;
}
zone 1.168.192.in-addr.arpa. {
primary 127.0.0.1;
key DHCP_UPDATER;
}
# 150301 MaxG - added to shut up Windows PC from clogging
# syslog with DHCPACK and DHCPINFORM msgs (WPAD)
option wpad-url code 252 = text;
# my subnet specifications
subnet 192.168.1.0 netmask 255.255.255.0 {
#interface enxb827eb5073f1;
# pool range; can have multiple ranges in this file
range 192.168.1.50 192.168.1.199;
option subnet-mask 255.255.255.0;
option routers 192.168.1.1;
ddns-domainname "argylecourt.lan";
ddns-rev-domainname "in-addr.arpa";
option broadcast-address 192.168.1.255;
option domain-name "argylecourt.lan";
option domain-name-servers 192.168.1.8;
option ntp-servers 192.168.1.8; # Default NTP server to be used by DHCP clients
default-lease-time 86400; # 1 day
max-lease-time 604800; # 7 days
option wpad-url "\n";
}
# reservations; must NOT be in pool
# sorted by assinged IP address
host maxg-x570 {
# MaxG's PC -- x570
# added 20220409-2106
hardware ethernet 04:42:1a:95:2b:37;
fixed-address 192.168.1.13;
}
host brother-mfc {
# Brother Network Printer -- BRN_368926
hardware ethernet 00:80:77:36:89:26;
fixed-address 192.168.1.33;
ddns-hostname "brothermfc8820d";
}
I ran into the same situation and was not able to tell where the mistake was.
try $ dhcpd /etc/dhcp/
this will search the whole file and will point directly where the mistake is
Well, well... how embarrassing!
The solution is simple:
sudo service isc-dhcp-server start
Start the correct service. It is not dhcp, it is isc-dhcp-server!
What I do not understand is why this service was no longer auto-starting.
Anyway, problem, or rather stupidity solved.

Bluepy BLE Disconnecting after receiving several packets from Bluno Beetle

I am required to connect 6 bluno beetles to my Raspi 3B+ to receive some data concurrently. However, with only 1 connection to one bluno beetle, I am already having constant disconnection after receiving a few packets. Sometimes I am able to receive 20 packets before disconnecting, while sometimes I am able to receive 5 packets before disconnecting. The number of packets received fluctuates.
Is this suppose to be normal?
My raspi 3B+ is installed with Raspbian GNU/Linux 10 (buster). I have python3 installed with bluepy version 1.3.0 installed.
The Bluno Beetle is an Arduino Uno based board with bluetooth 4.0
The Raspi 3B+ has a Bluetooth HCI Version: 5.0 (0x9)
I have tried to handle disconnection by reconnecting and it works fine. But the time taken to reconnect takes a while (4-5 seconds) which I would have data from Bluno beetle side.
How can I further enhance the robustness of BLE?
This is my python code below where I am only listening to data sent from the Bluno Beetle.
from bluepy import btle
from bluepy.btle import BTLEException, Scanner, BTLEDisconnectError
import threading
# Global Vars
connectionObjects = [] # Total of 6 Connections expected
connectedThreads = [] # Total of 6 Connections expected
threads = list()
class MyDelegate(btle. DefaultDelegate):
def __init__(self, connection_index):
btle.DefaultDelegate.__init__(self)
def handleNotification(self, cHandle, data):
print("handling notification...")
data_string = str(data)
print(data_string)
class ConnectionHandlerThread (threading.Thread):
def __init__(self, connection_index, BTAddress):
threading.Thread.__init__(self)
self.connection_index = connection_index
self.BTAddress = BTAddress
self.connection = connectionObjects[self.connection_index]
def connect(self):
self.connection.setDelegate(MyDelegate(self.connection_index))
def run(self):
self.connect()
while True:
try:
if self.connection.waitForNotifications(1.0):
continue
print("Waiting...")
except:
try:
self.connection.disconnect()
except:
pass
finally:
reestablish_connection(self.connection, self.BTAddress, self.connection_index)
def reestablish_connection (connection, BTAddr, index):
while True:
try:
print("trying to reconnect with " + str(BTAddr) )
connection.connect(str(BTAddr))
print("re-connected to "+ str(BTAddr) +", index = " + str(index))
return
except:
continue
BTAddress = ['1c:ba:8c:1d:3a:c1']
for index in range(len(BTAddress)):
while True:
try:
p = btle.Peripheral()
p.connect(BTAddress[index], btle.ADDR_TYPE_PUBLIC)
break
except btle.BTLEException as e:
print("Connection Fail. Retrying now...")
continue
print ("Successful Connection to BLE " + str(BTAddress[index]))
connectionObjects.append(p) # first index is 0, first connected is beetle num 1
thread = ConnectionHandlerThread(len(connectionObjects)-1, BTAddress[index])
thread.setName("BLE-Thread-" + str(len(connectionObjects)-1))
thread.start()
connectedThreads.append(thread)
My hciconfig:
hci0: Type: Primary Bus: UART
BD Address: B8:27:EB:D4:EC:5D ACL MTU: 1021:8 SCO MTU: 64:1
UP RUNNING
RX bytes:204543 acl:2383 sco:0 events:8342 errors:0
TX bytes:49523 acl:152 sco:0 commands:4072 errors:0
Connecting via bluetoothctl seems fine:
root#raspberrypi:~# bluetoothctl
Agent registered
[bluetooth]# connect 1c:ba:8c:1d:3a:c1
Attempting to connect to 1c:ba:8c:1d:3a:c1
[CHG] Device 1C:BA:8C:1D:3A:C1 Connected: yes
Connection successful
I have been searching around the net for a robust solution but could not find a way to prevent disconnection or increase the reconnection time. Would appreciate if you could help me with this!
Could the reason be RPI firmaware makes BLE unstable?? https://github.com/IanHarvey/bluepy/issues/396
I can conclude that RPI firmware has issue after testing with another board.
Could the reason be RPI firmaware makes BLE unstable??
https://github.com/IanHarvey/bluepy/issues/396
Works without disconnection when tested the same setup with Ultra96 v2 running on Ubuntu 18.04 bionic. I am able to read/write to bluno device every 1ms and the connection is robust.
Hence, it seems like the problem was with RPI firmware. I was using Rasbian Buster on Raspberry pi 3b+.

PFSense DHCP Server doesn't work with RHEL 6 machines

I have a DHCP Server working on PFSense 2.4.4. While it works perfectly with RHEL 7/CentOS 7 machines, it doesn't work on RHEL6/CentOS 6 (both with fixed IP or dynamic range).
This is what DHCP Server Logs show (IP and MAC are obfuscated):
DHCPREQUEST for xxx.xx.255.15 from aa:bb:cc:dd:ee:ff via bge0
DHCPACK on xxx.xx.255.15 to aa:bb:cc:dd:ee:ff via bge0
send_packet: Host is down
dhcp.c:3976: Failed to send 318 byte long packet over fallback interface.
Here is what service network restart shows in CentOS 6:
Restarting network service
And here is what /var/log/messages shows (xxx.xxx.255.3 is the Pfsense DHCP Server address; xxx.xxx.255.1 is the default route; xxx.xxx.255.15 is the supposed address that should be bound to the machine):
Messages file
Lastly, here is my PFSense server info if it helps:
BIOS Vendor: Dell Inc.
Version: 2.6.0
Release Date: Tue Oct 31 2017
Version 2.4.4-RELEASE (amd64)
built on Thu Sep 20 09:03:12 EDT 2018
FreeBSD 11.2-RELEASE-p3
CPU Type Intel(R) Xeon(R) CPU E5-2620 v3 # 2.40GHz
24 CPUs: 2 package(s) x 6 core(s) x 2 hardware threads
AES-NI CPU Crypto: Yes (inactive)
I've tried rebooting those Centos 6 machines, rebooting PFSense, and I made sure the machines and PFSense packages are all updated. Nothing works.
Any help is appreciated.
After struggling with this I found this in DHCP Server option in PfSense:
Additional BOOTP/DHCP Options
I configured it like this:
Additional config
Turns out DHCP wasn't providing the Subnet Mask to CentOS 6 instances and with this option enabled, the mask is appended to the lease file.

How to fix high latency and retransmission rate in Ubuntu 18.04

I installed Ubuntu 18.04 on Hyper-V Win Server 2016.
And network performance of the Ubuntu is bad: I'm hosting few sites (Apache + PHP) and sometime response time is > 10 seconds. Sometimes it is fast.
As I troubleshooted, I see this netstat results:
# netstat -s | egrep -i 'loss|retran'
3447700 segments retransmitted
226 times recovered from packet loss due to fast retransmit
Detected reordering 6 times using reno fast retransmit
TCPLostRetransmit: 79831
45 timeouts after reno fast retransmit
6247 timeouts in loss state
2056435 fast retransmits
107095 retransmits in slow start
TCPLossProbes: 220607
TCPLossProbeRecovery: 3753
TCPSynRetrans: 90564
What can be cause of such high "segments retransmitted" number? And how to fix it?
Few notes:
- VMQ is disabled for Ubuntu VM
- The host system Network adapter is Intel I210
- I disabled IPv6 both on host and in VM
Here is WireShark showing, that it takes ~7 seconds to connect (just initial connection) to my site Propovednik.com:
Sep 20: So far, the issue seems to be caused by OVH / SoYouStart bad network:
This command shows 20-30% packets loss:
sudo ping us.soyoustart.com -c 10 -i 0.2 -p 00 -s 1200 -l 5
The problem could be anywhere along the network, including the workstation where you work from. I suggest you check the network as retransmissions and packetloss means that either something is malfunctioning or misconfigured. If this is on a wireless network, you could be out of range of your router.
I am pinging the website you noted from my computer and there is no packetloss.

El capitan pololu isp dongle pgm03a not working anymore

I have a Pololu PGM03A isp dongle to programm microcontrollers via avrdude. It worked great under Yosemite, but under El Capitan, it seems to disconnect and reconnect randomly every 1 - 5 sec. I can see this by the green status LED and in the terminal:
Macintosh:~ xxxxx$ ls /dev/cu.*
/dev/cu.Bluetooth-Incoming-Port /dev/cu.usbmodem00070153
/dev/cu.serial1 /dev/cu.usbmodem2169
/dev/cu.usbmodem00070151
Macintosh:~ xxxxx$ ls /dev/cu.*
/dev/cu.Bluetooth-Incoming-Port /dev/cu.usbmodem00070153
/dev/cu.serial1
Macintosh:~ xxxxx$ ls /dev/cu.*
/dev/cu.Bluetooth-Incoming-Port /dev/cu.usbmodem00070153
/dev/cu.serial1 /dev/cu.usbmodem2177
/dev/cu.usbmodem00070151
As you can see, the usbmodem00070151 is there, then not, and then again available. The usbmodem00070153 belongs to the ISP Dongle too, but it provides an different feature than programming the microcontroller (it's an USART Port or something I think).
Edit: The latter modem now also disconnects.
When I try to flash the microcontroller, it sometimes manages it to write some bytes, but the time between 2 disconnects is to short to finish the job:
Writing | ######################## | 47% 0.08s
avrdude: ser_recv(): read error: Device not configured
make: *** [all] Error 1
Does anyone know how to fix this?

Resources