Stress-ng - Overload Memory - unix

I want to test the system's reaction to a process that tries to consume more memory than is available.
I run stress-ng with the following command (on a machine with 6 GB of RAM):
stress-ng --vm-bytes 8G --vm-keep -m 1 --aggressive
but I get this error:
stress-ng: error: [5035] stress-ng-vm: gave up trying to mmap, no available memory
Is it possible to force the program to ignore its own safety mechanism?

Try adding the parameter --vm 4.
I was having the same problem and it went away after that.
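For reference, a sketch of what the suggested invocation could look like (-m N is the short form of --vm N, so --vm 4 takes the place of -m 1; the remaining flags are carried over from the question):
# four vm workers; --vm-bytes is what each worker attempts to allocate
stress-ng --vm 4 --vm-bytes 8G --vm-keep --aggressive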

Related

airflow scheduler: WARNING - (parsing_processes = 2) when using sqlite. So we set parallelism to 1

How can I fix the problem here?
I tried to run airflow scheduler but it does not work.
It seems to me that the warning "Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1." is not connected with the errors. In my case the problem was caused by processes using port :8793. To fix it, you could try the following two steps, as sketched below:
Run lsof -i :8793 (install lsof beforehand if needed)
Kill the processes printed by that command
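A minimal sketch of those two steps (<PID> is a placeholder for whatever lsof prints):
# show what is bound to port 8793
lsof -i :8793
# terminate the offending processes
kill <PID>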

Cstack_info() output different between RStudio Server and RStudio Desktop on Ubuntu 20.04 LTS

I am having trouble getting rid of the CStack limit when running my code.
I managed to get rid of the error by appending
* hard stack unlimited
* soft stack unlimited
* soft memlock unlimited
* hard memlock unlimited
root soft stack unlimited
root hard stack unlimited
root soft memlock unlimited
root hard memlock unlimited
to /etc/security/limits.conf, which fixes the problem on RStudio Desktop.
I get the following output from running Cstack_info()
> Cstack_info()
      size    current  direction eval_depth
        NA         NA          1          2
This is the output from ulimit -s on the desktop terminal
coolshades@coolshades-ws:~$ ulimit -s
unlimited
Code runs perfectly on RStudio Desktop.
On the same machine, I am also running RStudio Server (free) to run code remotely. It seems that these settings do not stick when running under RStudio Server.
This is the output from Cstack_info() on the RStudio Server
> Cstack_info()
      size    current  direction eval_depth
   7969177      26336          1          2
This is the ulimit output from terminal on the RStudio Server
coolshades@coolshades-ws:~$ ulimit -s
8192
I am able to change the limit back to unlimited with ulimit -s unlimited, but that only takes effect after the R session is restarted, and when I restart the R session the output of ulimit -s reverts to 8192.
I am out of ideas as to how best to tackle this problem and hope a more experienced RStudio Server user will be able to advise on this matter.
I have solved this problem.
I had to make the following changes to the following files:
sudo nano /etc/systemd/user.conf and add DefaultLimitSTACK=134217728
sudo nano /etc/systemd/system.conf and add DefaultLimitSTACK=134217728
Make sure the number you define is a power of 2, otherwise Ubuntu fails to log in for some reason.
I have 128 GB of RAM, so I have set my limit to 2^27 bytes (a 128 MB stack).
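A minimal sketch of the procedure; the reload step and the verification at the end are assumptions added on top of the answer above:
# append to both files: DefaultLimitSTACK=134217728
sudo nano /etc/systemd/user.conf
sudo nano /etc/systemd/system.conf
# re-execute systemd (or simply reboot) so new sessions inherit the new default
sudo systemctl daemon-reexec
# in a fresh session, the soft stack limit should now read 131072 (kB), i.e. 128 MB
ulimit -s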
Hope this helps someone with the same problem.

Am I properly forcing RocksDB to use fsync? Neither fsync() nor msync() shows in strace

I'm using RocksDB via the C API.
I have a test program that opens a database, does 1,000 writes (gathering timing data between initiation of write and callback), does 1,000 reads, and shuts down.
This works. Average time to do a write is about 1ms.
I modified the test program to turn on write syncing via this call:
rocksdb_writeoptions_set_sync(wri_u, 1);
and ran it again. Average time to do a write is about 8ms.
So far, so good.
HOWEVER, I then ran strace on both versions of the program to verify that fsync() or fdatasync() or msync() is getting called.
The no-sync program shows 4 invocations of fsync(), 2 of fdatasync() and 0 of msync(). Reasonable.
...but the sync version of the program shows the same 4, 2, and 0. Odd! Surprising! Worrying!
The sync version DOES show 2 interesting deltas from the no-sync version: (i) 2 calls to nanosleep() per write, (ii) an 80% increase in the time spent in mmap().
One wild theory is that msync() [or a stand-in for it] is actually implemented in terms of nanosleep()?
This is on a desktop running Ubuntu 16.04:
uname -a
Linux mithril 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Anyway, my question is, as per the subject line:
Am I properly forcing RocksDB to use fsync? ... because neither fsync() nor msync() shows in strace
Thanks.
Yes, this is the correct way to turn fsync() on.
The issue is that strace must be used with the -f flag to trace system calls in new threads ... and RocksDB was doing all syncs in other threads.
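A quick sketch of how to confirm this (./rocksdb_test stands in for the actual test binary):
# -f follows child threads, -e limits the trace to the sync-related calls
strace -f -e trace=fsync,fdatasync,msync -o sync.trace ./rocksdb_test
# with -f, the fsync()/fdatasync() calls made in RocksDB's background threads now appear
grep -c -E 'fsync|fdatasync' sync.trace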

GridGain Out of Memory Exception: Unable to create new native thread

I'm trying to create more than 2 instances of GridGain (just by running the shell script) on Red Hat release 6.5 (Santiago), but I get the following error when I try to run the shell script a third time:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1604)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start0(GridGainEx.java:1507)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start(GridGainEx.java:1289)
at org.gridgain.grid.kernal.GridGainEx.start0(GridGainEx.java:832)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:759)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:677)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:524)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:494)
at org.gridgain.grid.GridGain.start(GridGain.java:314)
at org.gridgain.grid.startup.cmdline.GridCommandLineStartup.main(GridCommandLineStartup.java:293)
I have set ulimit -n 4096 but still no joy.
The box has 64 GB of memory - an ample amount to run more than 2 instances of GridGain.
Can anyone help with this error? Are there any configuration changes I can make in Red Hat?
Thanks
Most likely you are running out of the allowed number of user processes. We encountered the same issue on our CentOS servers, and setting ulimit -u 10240 helped.
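A minimal sketch of checking and raising that limit (the limits.conf entries and the gridgain user name are illustrative assumptions):
# show the current max user processes; threads count against this limit too
ulimit -u
# raise it for the current shell before starting the extra nodes
ulimit -u 10240
# to make it persistent, add entries like these to /etc/security/limits.conf:
#   gridgain  soft  nproc  10240
#   gridgain  hard  nproc  10240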

What do programs see when ZFS can't deliver uncorrupted data?

Say my program attempts a read of a byte in a file on a ZFS filesystem. ZFS can locate a copy of the necessary block, but cannot locate any copy with a valid checksum (they're all corrupted, or the only disks present have corrupted copies). What does my program see, in terms of the return value from the read, and the byte it tried to read? And is there a way to influence the behavior (under Solaris, or any other ZFS-implementing OS), that is, force failure, or force success, with potentially corrupt data?
EIO is indeed the only answer with current ZFS implementations.
An open ZFS "bug" asks for some way to read corrupted data:
http://bugs.opensolaris.org/bugdatabase/printableBug.do?bug_id=6186106
I believe this is already doable using the undocumented but open source zdb utility.
Have a look at http://www.cuddletech.com/blog/pivot/entry.php?id=980 for explanations about how to dump a file content using zdb -R option and "r" flag.
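For illustration only, an invocation along the lines of that post might look like the following; the pool name, vdev index, offset and size are placeholders that would have to come from zdb's own metadata dump, and the exact argument format varies between ZFS versions (see zdb(8)):
# dump a raw ("r" flag) copy of one block from pool prueba, vdev 0, at offset:size
zdb -R prueba 0:2000:20000:r > block.raw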
Solaris 10:
# Create a test pool
[root@tesalia z]# cd /tmp
[root@tesalia tmp]# mkfile 100M zz
[root@tesalia tmp]# zpool create prueba /tmp/zz
# Fill the pool
[root@tesalia /]# dd if=/dev/zero of=/prueba/dummy_file
dd: writing to `/prueba/dummy_file': No space left on device
129537+0 records in
129536+0 records out
66322432 bytes (66 MB) copied, 1.6093 s, 41.2 MB/s
# Unmount the pool
[root@tesalia /]# zpool export prueba
# Corrupt the pool on purpose
[root@tesalia /]# dd if=/dev/urandom of=/tmp/zz seek=100000 count=1 conv=notrunc
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0715209 s, 7.2 kB/s
# Mount the pool again
zpool import -d /tmp prueba
# Try to read the corrupted data
[root@tesalia tmp]# md5sum /prueba/dummy_file
md5sum: /prueba/dummy_file: I/O error
# Read the manual
[root@tesalia tmp]# man -s2 read
[...]
RETURN VALUES
     Upon successful completion, read() and readv() return a
     non-negative integer indicating the number of bytes actually
     read. Otherwise, the functions return -1 and set errno to
     indicate the error.
ERRORS
     The read(), readv(), and pread() functions will fail if:
     [...]
     EIO    A physical I/O error has occurred, [...]
You must export/import the test pool because otherwise the direct overwrite (pool corruption) will be missed, since the file will still be cached in OS memory.
And no, currently ZFS will refuse to give you corrupted data. As it should.
How would returning anything but an EIO error from read() make sense outside a filesystem-specific, low-level data rescue utility?
Such a data rescue utility would need to use an OS- and FS-specific API other than open/read/write/close to access the file. The semantics it would need are fundamentally different from reading normal files, so it would need a specialized API.
