I am trying to create a swap partition using the following command:
parted -s /dev/nvme0n2 mkpart extended linux-swap 1000MB 2000MB
[root@rhel8vmware ~]# parted -s /dev/nvme0n2 print free
Model: NVMe Device (nvme)
Disk /dev/nvme0n2: 5369MB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type     File system  Flags
        1024B   1049kB  1048kB           Free Space
 1      1049kB  1000MB  999MB   primary
        1000MB  5369MB  4368MB           Free Space
[root@rhel8vmware ~]# parted -s /dev/nvme0n2 mkpart extended linux-swap 1000MB 2000MB
Error: Invalid number.
I am trying to run a Julia script in parallel on a cluster.
The cluster uses Moab and Torque for the scheduler and resource manager.
Since SSH seems to be restricted, I use MPI for multiprocessing.
I submit the following job, requesting 3 nodes:
#!/bin/bash
#PBS -l walltime=1:00:00
#PBS -l pmem=10gb
#PBS -l nodes=3:ppn=1
#PBS -j oe
#PBS -A open
#PBS -o (some path)
#PBS -e (some path)
cd (some path)
echo ""
echo "JOB Started on $(hostname -s) at $(date)"
echo ""
module purge
module use (some path)/modules
module load julia
module load openmpi
mpirun -np 3 -display-allocation julia --project=. "(some path)/test.jl"
echo ""
echo "JOB ended at $(date)"
But if I look at the job output, it seems that only one node, comp-bc-0384, is recognized:
JOB Started on comp-bc-0384 at Sat Mar 19 22:05:12 EDT 2022
====================== ALLOCATED NODES ======================
comp-bc-0384: slots=24 max_slots=0 slots_inuse=0 state=UP
=================================================================
--------------------------------------------------------------------------
[[12308,1],2]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: comp-bc-0384
Another transport will be used instead, although this may result in
lower performance.
NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
[comp-bc-0384.acib.production.int.aci.ics.psu.edu:10656] 2 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[comp-bc-0384.acib.production.int.aci.ics.psu.edu:10656] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
10.214858 seconds (116.21 k allocations: 6.110 MiB)
JOB ended at Sat Mar 19 22:05:36 EDT 2022
I was expecting the ALLOCATED NODES section to display the other node(s) I was assigned to.
A similar question in the past (openMPI/mpich2 doesn't run on multiple nodes) suggests that it has something to do with host file.
Therefore I also tried with mpirun -hostfile $PBS_NODEFILE -np 3 -display-allocation julia --project=. "(some path)/test.jl" . It then returns the following:
JOB Started on comp-bc-0384 at Sat Mar 19 22:16:15 EDT 2022
Host key verification failed.
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
JOB ended at Sat Mar 19 22:16:16 EDT 2022
What could be the cause here?
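One way to check what Torque actually allocated, independently of what mpirun reports, is to count the entries in the node file from inside the job. The sketch below assumes the usual Torque behaviour of writing one hostname per allocated slot to the file named by $PBS_NODEFILE; the helper name count_slots_per_host is made up for this illustration.

```python
import os
from collections import Counter


def count_slots_per_host(nodefile_path):
    """Return a Counter mapping hostname -> number of slots listed in the node file."""
    with open(nodefile_path) as f:
        # One hostname per line; blank lines are ignored.
        hosts = [line.strip() for line in f if line.strip()]
    return Counter(hosts)


if __name__ == "__main__":
    # PBS_NODEFILE is only set inside a running Torque job.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        for host, slots in sorted(count_slots_per_host(path).items()):
            print(f"{host}: {slots} slot(s)")
    else:
        print("PBS_NODEFILE is not set; run this inside a Torque job")
```

With nodes=3:ppn=1 one would expect three distinct hostnames with one slot each. If that is what the file shows, the allocation itself is fine and the difference lies in how mpirun discovers or reaches the other nodes.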
I have a cephfs and I need to mount this file system.
I have two pools cephfs_data and cephfs_meta.
ceph -s output is:
cluster:
id: 9f3e7f80-4515-4b5f-92f0-4eb49f3cbf44
health: HEALTH_OK
services:
mon: 2 daemons, quorum mon1,osd0
mgr: osd0(active), standbys: mon1
mds: mycephfs-1/1/1 up {0=mon1=up:active}
osd: 1 osds: 1 up, 1 in
data:
pools: 3 pools, 72 pgs
objects: 24 objects, 35 KiB
usage: 1.1 GiB used, 837 GiB / 838 GiB avail
pgs: 72 active+clean
I created a user with these properties:
[client.foo]
key = AQA4d5xdlAklBxAA+Q5T+b3HLAxj2kRKzXUOSA==
caps mds = "allow r"
caps mon = "allow r"
caps osd = "allow rw tag cephfs data=mycephfs"
When I try to run this command:
sudo mount -t fuse.ceph conf=/etc/ceph/ceph.conf /mnt/cephfs/
this happens:
mount: /mnt/cephfs: wrong fs type, bad option, bad superblock on conf=/etc/ceph/ceph.conf, missing codepage or helper program, or other error.
or
when I try to run this command:
sudo mount.ceph mon1:6789:/ /mnt/cephfs/
this happens:
mount error 110 = Connection timed out
or
when I try to run this command:
sudo ceph-fuse -n client.foo /mnt/cephfs/
this happens:
ceph-fuse[64711]: starting ceph client
2019-10-21 16:21:17.329932 7f58cedbb500 -1 init, newargv = 0x55a6c11f0340 newargc=9
and then it hangs indefinitely. I never see "starting fuse".
What am I doing wrong? Which approach should I follow?
The syntax of your commands is incorrect.
You can mount the CephFS using
mount -t ceph mon1:6789:/ /mnt/ceph -o name=foo,secretfile=/path/to/keyring/file
There are many more mount options; they are listed in the mount.ceph documentation.
I am using a linux machine.
I want to set the number of TCP SYN retransmits to zero. I am using the command below to change it:
sudo sysctl -w net.ipv4.tcp_syn_retries=0
The above command does not work and gives me the error below:
error: "Invalid argument" setting key "net.ipv4.tcp_syn_retries"
However, this command works -> sudo sysctl -w net.ipv4.tcp_syn_retries=1
According to the documentation -> man tcp :
tcp_syn_retries (integer; default: 5; since Linux 2.2)
The maximum number of times initial SYNs for an active TCP connection attempt will be retransmitted. This value should not be higher than 255. The default value is 5, which corresponds to approximately 180 seconds.
Does setting sudo sysctl -w net.ipv4.tcp_syn_retries=0 mean disabling SYN packets altogether, or only SYN retries? This is unclear to me after reading the documentation.
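To make the "retries" wording concrete, here is a small arithmetic sketch of how a retry count maps to a total connection timeout. It assumes a simple model that is not taken from the man page excerpt above: one initial SYN, then the given number of retransmissions, with the wait doubling after each send and no upper cap on the wait. The function name and the initial_rto values (1 second on current kernels, 3 seconds on older ones) are assumptions for illustration.

```python
def syn_connect_timeout(retries, initial_rto=1.0):
    """Seconds until a connection attempt gives up under a plain doubling model.

    The initial SYN plus `retries` retransmissions gives retries + 1 waits,
    of initial_rto, 2 * initial_rto, 4 * initial_rto, ... seconds.
    """
    return sum(initial_rto * 2 ** i for i in range(retries + 1))


if __name__ == "__main__":
    for retries in (0, 1, 5, 6):
        print(retries, syn_connect_timeout(retries))
    # Older kernels used a 3 second initial timeout (assumption stated above).
    print(5, syn_connect_timeout(5, initial_rto=3.0))
```

Under this model, 5 retries with a 3 second initial timeout add up to 189 seconds, which is in the neighbourhood of the "approximately 180 seconds" quoted above. A value of 0 would correspond to a single SYN with no retransmission, not to disabling SYN. Whether a given kernel accepts 0 is a separate matter, and the error above shows that this one does not.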
I created a new VM instance using the "Ubuntu Server 10.04 LTS (Lucid Lynx) - 32 bits" image and the m1.small flavor, which has a 20 GB disk (OpenStack Icehouse). When I log in to the VM and run df -h, I find that the VM does not use the entire assigned disk. The command output is as follows:
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 1.4G 595M 721M 46% /
none 1005M 144K 1005M 1% /dev
none 1007M 0 1007M 0% /dev/shm
none 1007M 36K 1007M 1% /var/run
none 1007M 0 1007M 0% /var/lock
none 1007M 0 1007M 0% /lib/init/rw
"fdisk -l" shows that the disk size is 20 GB:
Disk /dev/vda: 21.5 GB, 21474836480 bytes
4 heads, 32 sectors/track, 327680 cylinders
Units = cylinders of 128 * 512 = 65536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000cb9da
Device Boot Start End Blocks Id System
/dev/vda1 * 17 32768 2096128 83 Linux
I need the VM to use the full space assigned to it. Any idea how I could fix this? I want the solution to apply to each VM I create, so I do not want to manually update the VM after instantiation. I also must use the 10.04 image (I cannot upgrade to 14.04).
The problem here is the image. I grabbed that one and booted it; it's simple enough to run
sudo resize2fs /dev/vda1
which will resize the filesystem to the size of the partition, which seems to be 2GB. Beyond that, you have to increase the partition size. For that I think you're probably best off using virt-resize, there are some good howto's out there e.g. askubuntu, in essence:
SSH into your openstack controller node
source keystonerc_admin (or whatever yours may be called)
nova list --all-tenants | grep <instance_name> or just grab the server guid from horizon
nova show <server_guid> and note which nova host your machine is running on. Also note the instance name (e.g. instance-00000adb)
SSH into that nova node
virsh dumpxml instance-00000adb and look for the image file. On mine, this is /var/lib/nova/instances/<server_guid>/disk but that may not always be the case?
yum install libguestfs-tools
truncate -r /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
truncate -s +2G /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
virt-resize --expand /dev/sda1 /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
mv disk disk.old ; mv disk.new disk
NB: mine didn't quite work when I booted it up again. I haven't had time to investigate yet, but it can't be far off, and hopefully this helps.
Once you've managed to boot that up again, then you can shut it down and create a snapshot from horizon. You can then use that snapshot just like any other image, and launch all subsequent VMs directly from there.
HTH.
I'm running an Ubuntu Server 12.04 instance on EC2, with an installation of IRCD-hybrid 7.2 on it. Right now, I'm trying to load test the server by making a bunch of connections and seeing how much the server can handle. I have a script that connects to the room.
My problem is that I can get a maximum of 4026 connections to the server. Any further socket connections just don't seem to work. I have max clients set to 100k just to be safe, and 50k for the max number per IP.
When I run
sysctl fs.file-nr -> fs.file-nr = 4576 0 1513750
Also, my ulimits have been set:
ulimit -S -> 65536
My ulimit -n is 1024, but since I can get 4026 connections, I don't see how that's affecting it.
ulimit -n -> 1024
Memory and CPU are also nowhere even close to maximum when I run into this.
My code is this:
import random
import sys
import socket
import string
import time
n = ''.join(random.choice(string.letters) for i in xrange(40))
HOST="<MYHOST IS HERE>"
PORT=6666
NICK=n
IDENT=n
REALNAME=n
readbuffer=""
s=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.send("NICK %s\r\n" % NICK)
s.send("USER %s %s %s :%s\r\n" % (IDENT, HOST, REALNAME, REALNAME))
s.send('JOIN #foobar\r\n')
while 1:
    readbuffer=readbuffer+s.recv(1024)
    temp=string.split(readbuffer, "\n")
    readbuffer=temp.pop( )
    for line in temp:
        line=string.rstrip(line)
        line=string.split(line)
        if 'PRIVMSG' in line:
            print line
        if(line[0]=="PING"):
            s.send("PONG %s\r\n" % line[1])
Is there a setting on ircd-hybrid that sets this? The terminal window says that "Server is full" when I try to connect with a regular client and I already have 4026 connections.
There are two types of ulimit, hard and soft. A process may raise the soft limit on a particular resource up to the hard limit, but no further.
On my box (Ubuntu 12.04), the soft file descriptor limit is 1024, but the hard limit is 4096:
$ulimit -n
1024
moment@moment:~/tmp 20:26:04 0
$ulimit -n -H
4096
moment@moment:~/tmp 20:26:16 0
$ulimit -n -S
1024
It's entirely plausible that your irc server is increasing up to this hard limit.
A horrible hack to increase your ulimit temporarily works like this.
sudo su
ulimit -n 10000
su USERNAME
Long term, you would need to increase the limit system-wide or, preferably, increase the ulimit for just the process you are running. For daemons I normally do this using the limit stanza (for example, limit nofile) in Upstart configuration files.
In general, strace can be useful for debugging problems like this (it will probably show an earlier call that raises the file ulimit).
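The soft/hard distinction can also be seen from inside a process. The sketch below uses Python's standard resource module on Linux to read the open-files limit and raise the soft value to the hard value, which is the kind of call a daemon might make at startup. Whether ircd-hybrid does exactly this is an assumption, not something confirmed here, and the helper name raise_nofile_soft_limit is made up for illustration.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the soft open-files limit to the hard limit.

    Returns ((old_soft, old_hard), (new_soft, new_hard)).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may raise its soft limit up to, but not beyond,
    # the hard limit. The hard limit itself is left unchanged.
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return (soft, hard), resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    old, new = raise_nofile_soft_limit()
    print("before:", old)
    print("after: ", new)
```

On a box with the limits shown above, this would move the soft limit from 1024 to 4096 without needing root, which would be consistent with a server topping out at roughly 4000 connections.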