Solaris 10 i386 vmstat giving more free than swap

How come, when running vmstat on Solaris 10 i386, I get more free space than swap space? Isn't free the portion of swap that is available?
$ vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 -- -- in sy cs us sy id
1 0 0 7727088 17137388 37 303 1 0 0 0 0 -0 4 0 0 7247 7414 8122 4 1 95

No. Free RAM represents the part of RAM that is immediately available for use, while free swap represents the part of virtual memory that is neither allocated nor reserved. Reserved memory doesn't physically use any storage (RAM or disk).
Have a look at swap -s output for details.
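For example, swap -s output looks like this (the figures here are invented for illustration; the 'available' figure is the free swap that vmstat reports):
$ swap -s
total: 4204928k bytes allocated + 1357344k reserved = 5562272k used, 7727088k available
Here 'reserved' is virtual memory that processes have reserved but not yet touched; it consumes neither RAM nor disk, which is why free RAM and free swap are independent numbers.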

Related

Extending a Logical Volume on RHEL7

I am using a RHEL7 box created by our in-house VM-provisioning system.
It creates logical volumes for the likes of /var, /home, swap etc. using two pools of space. I was attempting to follow the examples and descriptions at https://www.tecmint.com/extend-and-reduce-lvms-in-linux/ for adding some of that unallocated space to a volume, and am stuck getting resize2fs to operate as expected.
Using lvdisplay, I found the appropriate volume:
--- Logical volume ---
LV Path /dev/rootvg/lvvar
LV Name lvvar
VG Name rootvg
LV UUID WGkYI1-WG0S-uiXS-ziQQ-4Pbe-rv1H-0HyA2a
LV Write Access read/write
LV Creation host, time localhost, 2018-06-05 16:10:01 -0400
LV Status available
# open 1
LV Size 2.00 GiB
Current LE 512
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:5
I found the associated volume group with vgdisplay:
--- Volume group ---
VG Name rootvg
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 8
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 7
Open LV 7
Max PV 0
Cur PV 1
Act PV 1
VG Size <49.00 GiB
PE Size 4.00 MiB
Total PE 12543
Alloc PE / Size 5120 / 20.00 GiB
Free PE / Size 7423 / <29.00 GiB
VG UUID 5VkgVi-oZ56-KqMk-6vmf-ttNo-EMHG-quotwk
I decided to take 4 GiB from the free PEs and extended the volume with:
lvextend -l +1024 /dev/rootvg/lvvar
which answered as expected:
Size of logical volume rootvg/lvvar changed from 2.00 GiB (512 extents) to 6.00 GiB (1536 extents).
Logical volume rootvg/lvvar successfully resized.
But when I try to use resize2fs - I get this:
# resize2fs /dev/mapper/rootvg-lvvar
resize2fs 1.42.9 (28-Dec-2013)
resize2fs: Bad magic number in super-block while trying to open /dev/mapper/rootvg-lvvar
I'm sure it's something dumb I'm missing - can anyone push me in the right direction here?
The filesystem on that volume is XFS (the RHEL7 default), and resize2fs only understands ext2/3/4, which is why it complains about a bad magic number. Use xfs_growfs instead.
xfs_growfs /dev/mapper/rootvg-lvvar
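For what it's worth, you can check the filesystem type before picking a resize tool (the output below is what an XFS /var would look like, not taken from this box):
# df -Th /var
Filesystem                Type  Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-lvvar  xfs   2.0G  1.1G  1.0G  53% /var
Also, lvextend -r (--resizefs) grows the filesystem in the same step, so you don't need a separate xfs_growfs or resize2fs call at all.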

nMDS (non-metric multi-dimensional scaling): coding a data set

I have a data set of lizard retreat sites that I'd like to examine using an nMDS in R to determine which variables are likely important. I'm a novice with R and was told I need to code the data so R can read it. I'm using OS X 10.9.5 (13F1911) and R 3.3.3 (GUI 1.69 Mavericks build (7328)).
I'm not sure how to attach the data file, so I've copied the output of head(data) here:
data <- data.frame(newdataset)
head(data)
Hide.. PIT Year Species Alive.Partial.Dead Standing.half.fallen.fallen X..days.obs Total...of.day.occupied Height Diameter Angle Aspect
1 1 91A1 2004 Hog Doctor A S 6 6 4.2 ? . ?
2 2 91A1 2004 Mammie A S 4 4 1.8 5-10cm 90 SW
3 3 COFE 2004 Tabebuia riparia A S 17 16 3 5-10cm 0 ENE
4 4 COFE 2004 Columar cactus P Fallen 2 2 0 5-10cm 90 S
5 5 COFE 2004 ? D Fallen 4 3 0.2 5-10cm 60 ?
6 6 COFE 2004 Eugenia sp (check greeny fruit) P S 7 7 3.5 10-20cm 0 W
As you can see I managed to read the data into R, but I'm not sure what comes next. I know I need to convert my data.frame(newdataset) to a distance matrix, but I am unclear whether I have to code or create levels for some of the variables, e.g. whether the retreat site (selected by the lizard) was in a tree that was 1. alive, 2. partially dead, or 3. dead.
A little more about the variables:
1. Hide (retreat): identifies each retreat selected by lizards, i.e. one lizard may use a single retreat or multiple retreats.
2. PIT: a Passive Integrated Transponder identification number uniquely identifying each lizard.
3. Year: the year the data were collected.
4. Species: the tree species in which a retreat was located or, in the case of a single lizard, the substrate (rock) used.
5. Alive.Partial.Dead: whether the tree was alive, partially alive, or dead.
6. Standing.half.fallen.fallen: whether the tree was standing upright, leaning over, or lying on the ground.
7. X..days.obs: the number of days a lizard was observed using a particular retreat site.
8. Total...of.day.occupied: the total number of days a retreat site was known to be used.
9. Height: the height of the retreat site from the ground.
10. Diameter: the diameter of the section of tree containing the retreat site.
11. Angle: the angle of the retreat site relative to the ground.
12. Aspect: the compass aspect (orientation) of the retreat site.
Thank you to anyone that can give some advice with this problem.
Cheers
Rick
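One way to get started (a sketch only, not tested on the real data; the choice of variables is illustrative, and it assumes the '?' and '.' placeholders become NA): make the categorical variables factors, build a Gower dissimilarity matrix with cluster::daisy, which accepts mixed numeric and categorical data, and pass that to vegan::metaMDS.
library(cluster)
library(vegan)

# categorical variables become factors; Diameter is an ordered size class
data$Alive.Partial.Dead <- factor(data$Alive.Partial.Dead)
data$Standing.half.fallen.fallen <- factor(data$Standing.half.fallen.fallen)
data$Aspect <- factor(data$Aspect)
data$Diameter <- factor(data$Diameter, ordered = TRUE,
                        levels = c("5-10cm", "10-20cm"))  # add any other size classes, in increasing order

# make sure the numeric variables really are numeric ('?' and '.' become NA)
data$Height <- as.numeric(as.character(data$Height))
data$Angle  <- as.numeric(as.character(data$Angle))

# an illustrative subset of the habitat variables
vars <- data[, c("Alive.Partial.Dead", "Standing.half.fallen.fallen",
                 "Diameter", "Height", "Angle", "Aspect")]

d <- daisy(vars, metric = "gower")     # dissimilarities for mixed data
ord <- metaMDS(d, k = 2, trymax = 50)  # the nMDS itself
plot(ord)
daisy averages over the variables that are present for each pair of rows, so scattered NAs are tolerated.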

RServe - Scalability related

My requirement is to execute an R script through a Java web service. The web service needs to support a concurrency of 50.
We are using Rserve to execute the R script from the Java code. To achieve this, on the Linux server we created 50 Rserve instances, started on different ports. Inside the Java application we created a connection pool with 50 RConnection objects, each linked to one of the Rserve instances. For every execution we fetch an RConnection from the pool, execute the R script, get the response value and then return the RConnection to the pool.
When we execute the web service with a single user, the R execution completes in 1 second. However, if I run the same web service with a concurrency of 50, it takes around 30 seconds to execute the R script inside Rserve.
Since the actual R execution takes only 1 second for a single user, I'm thinking that I'm doing something wrong with Rserve. Any pointers would help.
Although I think it is best to use one Rserve instance on Linux and let it fork subprocesses for parallel processing, that alone may not speed up processing at all.
From your question it is not clear whether the application is used intensively, with many concurrent requests being processed continually. If that is the case, I assume that your R code is CPU-intensive and the different processes simply have to share CPU time, increasing the wall-clock time needed to complete.
I tested just that kind of scenario and found these results with top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
33839 ***** 20 0 269792 57104 3496 R 10.3 1.5 0:15.33 Rserve
33847 ***** 20 0 269776 57100 3496 R 10.3 1.5 0:09.86 Rserve
33849 ***** 20 0 269792 57104 3496 R 10.3 1.5 0:08.20 Rserve
33855 ***** 20 0 269528 56840 3496 R 10.3 1.5 0:04.92 Rserve
29725 ***** 20 0 268872 56836 4020 R 10.0 1.5 1360:13 Rserve
33841 ***** 20 0 269784 57100 3496 R 10.0 1.5 0:14.42 Rserve
33843 ***** 20 0 269796 57104 3496 R 10.0 1.5 0:12.50 Rserve
33844 ***** 20 0 269792 57104 3496 R 10.0 1.5 0:11.72 Rserve
33852 ***** 20 0 269512 56836 3496 R 10.0 1.5 0:06.38 Rserve
33856 ***** 20 0 269520 56836 3496 R 10.0 1.5 0:04.05 Rserve
33842 ***** 20 0 269776 57100 3496 R 9.3 1.5 0:13.20 Rserve
33851 ***** 20 0 269784 57100 3496 R 9.3 1.5 0:06.69 Rserve
33857 ***** 20 0 269512 56836 3496 R 9.3 1.5 0:03.15 Rserve
33834 ***** 20 0 269792 57112 3496 R 9.0 1.5 0:18.56 Rserve
33835 ***** 20 0 269784 57100 3496 R 9.0 1.5 0:17.33 Rserve
33837 ***** 20 0 269776 57100 3496 R 9.0 1.5 0:16.46 Rserve
33846 ***** 20 0 269784 57100 3496 R 9.0 1.5 0:10.17 Rserve
33848 ***** 20 0 269796 57104 3496 R 9.0 1.5 0:08.61 Rserve
33853 ***** 20 0 269532 56840 3496 R 9.0 1.5 0:05.34 Rserve
33858 ***** 20 0 269532 56840 3496 R 9.0 1.5 0:02.27 Rserve
33838 ***** 20 0 269796 57104 3496 R 8.6 1.5 0:15.74 Rserve
The %CPU sums up to 200%, corresponding to the two CPU cores available.
As you can see, the processes have the same priority (PR=20) and their shares of %CPU are almost equal, around 10%, so each of them gets only about 1/10th of the CPU time and will therefore take roughly 10 times longer to complete than a single Rserve instance would.
This is not 20 times longer, because a single Rserve process will only utilise one CPU core, leaving the other core 'sleeping'.
You simply need more CPUs if you want to speed up the calculations. Also, if you don't want the 51st (or 101st, or 1001st) concurrent user to be denied access, it is better to implement a message queue. You can create multiple workers for the queue, which can distribute the workload over many CPUs, on different machines.
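For what it's worth, the pooling pattern from the question reduces to a blocking queue, which also gives the 'wait instead of deny' behaviour for user 51: a caller simply blocks until a connection is free. This is only a sketch under assumptions (the class name and host/port details are made up; RConnection and REXP are the standard Rserve Java client classes), not the asker's actual code.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

import org.rosuda.REngine.REXP;
import org.rosuda.REngine.Rserve.RConnection;

public class RservePool {
    private final BlockingQueue<RConnection> pool;

    public RservePool(String host, int firstPort, int size) throws Exception {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            // one Rserve instance per port, as described in the question
            pool.add(new RConnection(host, firstPort + i));
        }
    }

    public REXP run(String rScript) throws Exception {
        RConnection c = pool.take();   // blocks until a connection is free
        try {
            return c.eval(rScript);    // the actual R execution
        } finally {
            pool.put(c);               // hand the connection back
        }
    }
}
Whether callers queue here or in an external message broker, total throughput is still bounded by the number of cores, as explained above.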

R memory issues for extremely large dataset

I need to perform regression analysis on a 3.5 GB mixed (numerical and categorical) dataset in CSV format, consisting of 1.8 million records and 1000 variables/columns, mainly containing 0s and 1s plus a few categorical and numeric values. (See the snapshot of the data below.)
I was initially supposed to perform clustering directly on this dataset, but I kept getting a lot of memory-related errors in spite of running it on a remote virtual machine (64-bit Windows Server 2012 R2) with 64 GB of RAM. So I thought of doing some factor analysis to find correlations between the variables, so that I can reduce the number of columns to 600-700 (as much as possible). Any other ideas are appreciated, as I am very naïve about data analysis.
I have tried various packages like ff, bigmemory, biganalytics, biglm, FactoMineR, Matrix etc., but with no success. I have always encountered "cannot allocate vector of size ...", "reached maximum allocation of size 65535MB" or some other error.
Can you guys let me know of a solution to this? I feel memory shouldn't be the problem, as 64 GB of RAM should suffice.
Snapshot of dataset:
SEX AGE Adm Adm LOS DRG DRG RW Total DC Disp Mortality AAADXRUP
M 17 PSY 291 887 0.8189 31185 PDFU 0 0
M 57 PSY ER 31 884 0.9529 54960.4 SNF 0 0
F 23 AC PH 3 775 0.5283 9497.7 HOM 0 0
F 74 AC PH 3 470 2.0866 23020.3 SNF 0 0
There are additional columns after Mortality mostly containing 0s or 1s
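One approach that avoids holding the whole file in RAM (a minimal sketch; the file name, formula, column names and chunk size are all hypothetical): biglm fits a linear regression incrementally, so you can read the CSV in chunks and update the fit, keeping only one chunk in memory at a time.
library(biglm)

f <- outcome ~ var1 + var2 + var3   # hypothetical formula
con <- file("data.csv", open = "r")
first <- read.csv(con, nrows = 100000)
cols <- names(first)
fit <- biglm(f, data = first)
repeat {
  # header = FALSE: the open connection continues where the last read stopped
  chunk <- tryCatch(read.csv(con, nrows = 100000, header = FALSE,
                             col.names = cols),
                    error = function(e) NULL)  # end of file
  if (is.null(chunk)) break
  # caveat: factor columns must have identical levels in every chunk
  fit <- update(fit, chunk)
}
close(con)
summary(fit)
As a sense check on the memory question: 1.8 million rows by 1000 numeric columns is already about 14 GB as a single R matrix, and model fitting typically needs several working copies, which is why even 64 GB can run out when everything is loaded at once.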

merge.ffdf incorrect result when multiple columns match in R

I have an ff dataframe outgoing_windows_ff:
edge ipaddr port protocol windowed_qd class
1 1182430570 41.2.194.42 1299 1 0 WEB
2 1182430570 41.2.194.42 1302 1 0 WEB
I want to find a mutual relation among its rows, so I decided to make a copy of that dataframe with renamed columns:
outgoing_windows_ff_1 <- ffdf(edge=outgoing_windows_ff$edge,
                              ipaddr=outgoing_windows_ff$ipaddr,
                              influencing_port=outgoing_windows_ff$port,
                              influencing_proto=outgoing_windows_ff$protocol,
                              influencing_class=outgoing_windows_ff$class)
and then merge the 2 dataframes:
merged <- merge(x=outgoing_windows_ff, y=outgoing_windows_ff_1,
by.x=c('edge','ipaddr'),by.y=c('edge','ipaddr') )
The result is:
edge ipaddr port protocol windowed_qd class influencing_port
1 1182430570 41.2.194.42 1299 1 0 WEB 1299
2 1182430570 41.2.194.42 1302 1 0 WEB 1299
but it is WRONG, because I would expect 4 rows in the result.
Doing the merge between normal dataframes:
merged <- merge(x=as.data.frame(outgoing_windows_ff),
y=as.data.frame(outgoing_windows_ff_1),
by.x=c('edge','ipaddr'),by.y=c('edge','ipaddr') )
I get the correct result:
edge ipaddr port protocol windowed_qd class influencing_port influencing_proto
1 1182430570 41.2.194.42 1299 1 0 WEB 1299 1
2 1182430570 41.2.194.42 1299 1 0 WEB 1302 1
3 1182430570 41.2.194.42 1302 1 0 WEB 1299 1
4 1182430570 41.2.194.42 1302 1 0 WEB 1302 1
I think it is really DANGEROUS that the same operation gives two different results depending on whether ff dataframes or normal dataframes are used. This can lead to poisoned results, and the experimenter cannot know about it. My doubt is: maybe other results that I obtained with the ff package are poisoned and I didn't realize it.
Have you read the documentation of merge.ffdf from package ffbase, which is the function you are using?
It says:
This method is similar as merge in the base package but only allows inner and left outer joins.
Mark that joining is done based on ffmatch or ffdfmatch, meaning that only the *first* element in y will be added to x, and ffdfmatch works on paste-ing together a key. So this might not be suited if your key contains columns of vmode double.
Note the emphasised word 'first': joins are done with ffdfmatch, so each row of x picks up only the first matching row in y. What you expect here is a join that keeps every matching combination of rows, which merge.ffdf does not support. Also note that it pastes together a key.
If you are in need of code which performs such a join on ff objects, feel free to push an implementation to the github repository of ffbase: https://github.com/edwindj/ffbase
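Until such a function exists, a chunk-wise workaround is possible when one of the two tables fits in RAM (a minimal sketch, untested against your data; it assumes outgoing_windows_ff_1 is small enough to materialise as a regular data.frame):
library(ffbase)  # also loads ff

y_df <- as.data.frame(outgoing_windows_ff_1)

merged <- NULL
for (i in chunk(outgoing_windows_ff)) {
  x_df <- as.data.frame(outgoing_windows_ff[i, ])
  # base-R merge keeps every matching combination of rows
  m <- merge(x_df, y_df, by = c("edge", "ipaddr"))
  if (nrow(m) > 0) {
    merged <- if (is.null(merged)) as.ffdf(m) else ffdfappend(merged, m)
  }
}
This trades speed for correctness: each chunk goes through the ordinary data.frame merge, so the many-to-many matches are preserved.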
