SAS creating data sets in the /home directory - Unix

Is there any process by which SAS creates data sets (.sas7bdat extension) in the /home directory? We have both 9.3 and 9.4 installed on the Unix server.
The programmers do not have access to the /home directory, yet some files are being generated at that location without their knowledge.
Please help.

The home directory is the default output location for any shell commands those users run on the server, so all sorts of files will typically end up there if they run (or mistype) the occasional command without a full path. One way to change this so that SAS writes these files to the WORK library location instead is to run the following in a session on the server:
x "cd %sysfunc(pathname(work))";
E.g. suppose you have a command like this:
echo Hello > myfile.txt
Unless you run a cd command first, myfile.txt will be created in your home directory by default.
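To see the same behaviour from the shell side, here is a minimal sketch; $SASWORK below is just a stand-in for whatever %sysfunc(pathname(work)) resolves to in your session:
cd "$HOME"                  # a fresh session starts in the home directory
echo Hello > myfile.txt     # created as /home/<user>/myfile.txt
cd "$SASWORK"               # the effect of x "cd %sysfunc(pathname(work))";
echo Hello > myfile.txt     # now created inside the SAS WORK directory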

Related

Submitting R jobs with Slurm on a cluster server and saving objects in a different directory than the Slurm working directory

I am submitting an R job on a cluster server that seems to run correctly as long as I save the results in the same working directory as my script files. What I actually want is to save results from R (via saveRDS()) in another directory (e.g. /var/tmp/results), but I think Slurm is redirecting my bash script output to its working directory, causing a "No such file or directory" error in R.
I've looked around but was not able to find a way to specify another path for the job's output. Maybe I have used the wrong keywords because of my language limitations (I am not a native English speaker), and I apologize in advance if this is something trivial.
Below is the script I am calling, i.e. my_script.sh:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --array=1-2
#SBATCH --mem-per-cpu=15000
#SBATCH -e hostname_%j.err
#SBATCH -o hostname_%j.out
module load R
FILES=($HOME/simulation_study_bnp/models/*.R)
FILE=${FILES[$SLURM_ARRAY_TASK_ID]}
echo ${FILE}
srun Rscript runModel.R ${FILES[$SLURM_ARRAY_TASK_ID]}
The R script runModel.R sources the code of each file in the models folder, loads some data, and runs some analyses. All the necessary files are located in $HOME/simulation_study_bnp/models/, with $HOME being my home directory on the server (e.g. /users/my_dir).
runModel.R should save part of the results in another path, e.g. /var/tmp/results/, outside my $HOME. When I run something like Rscript runModel.R models/filename.R everything works fine. After some trials, my impression is that when submitting my_script.sh via sbatch the process looks for /var/tmp/results/ under my $HOME instead of treating it as an absolute path, but I have no idea how to solve this.
As was pointed out by @carles-fenoy, you have to make sure that whatever directory the jobs are writing to is accessible by all compute nodes. Large clusters usually have a dedicated storage space that is accessible by all nodes (see for example USC's scratch dir). Ask your cluster's IT managers what the equivalent is on your cluster.
Another thing worth mentioning is symbolic links. You can have a symbolic link in your home directory pointing to another folder located on a node with more space; I do that for my R packages as shown here. That way I don't need to worry about running out of space!
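A minimal sketch of both suggestions combined, assuming /scratch/$USER is the shared storage your IT managers point you to (that path is a placeholder):
# create the results directory on the shared storage before submitting
mkdir -p /scratch/$USER/results
# optional: a symlink so the results also appear under $HOME
ln -s /scratch/$USER/results $HOME/results
# then have runModel.R write with an absolute path, e.g.
# saveRDS(results, file.path("/scratch", Sys.getenv("USER"), "results", "out.rds"))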

Creating multiple directories for users

I am using gatekeeper for access to pages on my server.
This is done by creating directories, each with an index file in it, which then directs whoever entered the password to a specific page.
I would like to be able to produce lots of directories, with either short random names or names assigned from, say, a database, as creating many of them manually is not practical.
Can someone tell me how to generate lots of directories on the fly?
It would be even better if users could create their own directory, but that's probably something else.
Thanks
If you have bash (shell) access on your server, you can execute a simple bash script to create directories with a file in each.
for f in foo/bar{00..50}; do mkdir -p "$f" && touch "$f/index.txt"; done
Replace:
foo/bar with your directory prefix
50 with the highest index (the range {00..50} creates bar00 through bar50)
index.txt with the name of the file
If you also want to write text to each file, then do this instead:
for f in foo/bar{00..50}; do mkdir -p "$f" && printf "text\n goes\n here" > "$f/index.txt"; done
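If the names come from a database rather than a numeric range, one variation is to export them to a plain file first (names.txt below is an assumed export, one directory name per line) and loop over that:
while IFS= read -r name; do
  mkdir -p "foo/$name" && touch "foo/$name/index.txt"
done < names.txt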

Filter command history by the folder commands were executed in?

I know shell history doesn't keep track of the folder commands were executed in, but I think it would be really useful to be able to output the history for a particular folder, for example via a flag like history --local.
I often jump from project to project; they use very similar commands but have different destination hosts for ssh, different environment variables, and so on.
Is there any way to achieve that, preferably using zsh?
In bash, you can set PROMPT_COMMAND to something like the following:
PROMPT_COMMAND='history | tail -n1 >> .$USER.history'
It will save each command to a file in the current directory.
For an alternative approach (replacing cd with a command that changes where history is saved), see http://www.compbiome.com/2010/07/bash-per-directory-bash-history.html.
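Since you asked about zsh specifically, a minimal sketch of the same idea there is a precmd hook (using the same per-directory file name as above):
# zsh: append the last command to a history file in the current directory
precmd() {
  fc -ln -1 >> ".$USER.history"
}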

Synchronise local directories over ssh

The following command works great for me for a single file:
scp your_username@remotehost.edu:foobar.txt /some/local/directory
What I want to do is do it recursively (i.e. for all subdirectories/files of a given path on the server), merge folders, overwrite files that already exist locally, and download only those files on the server that are smaller than a certain size (e.g. 10 MB).
How could I do that?
Use rsync.
Your command is likely to look like this:
rsync -az --max-size=10m your_username@remotehost.edu:foobar.txt /some/local/directory
-a (archive mode - the sync is recursive, transfers ownership, attributes, symlinks among other things)
-z (compresses transfer)
--max-size (only copies files up to a certain size)
There are many more flags which may be suitable. Check out the docs for more details - http://linux.die.net/man/1/rsync
First option: use rsync.
Second option, and it's not going to be a one-liner, but it can be done in three or four lines:
Create a tar archive on the remote system using ssh.
Copy the tar from remote system with scp.
Untar the archive locally.
If the creation of the archive gets a bit complicated and involves using find and/or tar with several options, it is quite practical to write the script locally, upload it to the server with scp, and only then execute it remotely over ssh.
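A hedged sketch of those steps, assuming GNU find and tar on both ends and reusing the 10 MB cap from the question (all paths are illustrative):
# 1. build a compressed archive of files under 10 MB on the remote host
ssh your_username@remotehost.edu \
  'cd /path/on/server && find . -type f -size -10M -print0 | tar czf /tmp/sync.tar.gz --null -T -'
# 2. copy the archive over
scp your_username@remotehost.edu:/tmp/sync.tar.gz /tmp/
# 3. unpack locally, overwriting files that already exist
tar xzf /tmp/sync.tar.gz -C /some/local/directory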

Which processes are using a shared library?

I have a shared library (.so file) on Unix.
I need to know which running processes are using it.
Does Unix provide any utility/command for this?
You can inspect the contents of /proc/<pid>/maps to see which files are mapped into each process. You'll have to inspect every process, but that's easier than it sounds:
$ grep -l /lib/libnss_files-2.11.1.so /proc/*/maps
/proc/15620/maps
/proc/22439/maps
/proc/22682/maps
/proc/32057/maps
This only works on the Linux /proc filesystem, AFAIK.
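If you want process names rather than the raw maps paths, a small wrapper around the same grep works (a sketch, Linux-only; the library path is just the example from above):
for m in $(grep -l /lib/libnss_files-2.11.1.so /proc/*/maps 2>/dev/null); do
  pid=${m#/proc/}; pid=${pid%/maps}
  printf '%s\t%s\n' "$pid" "$(cat /proc/$pid/comm)"
done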
A quick solution would be to use the lsof command
[root@host]# lsof /lib/libattr.so.1
COMMAND     PID USER  FD TYPE DEVICE  SIZE   NODE NAME
gdm-binar 11442 root mem  REG    8,6 30899 295010 /lib/libattr.so.1.1.0
gdm-binar 12195 root mem  REG    8,6 30899 295010 /lib/libattr.so.1.1.0
This should work not only for .so files but any other files, dirs, mount points, etc.
N.B. lsof displays all processes that have a file open, so there is a very remote possibility of a false positive: a process that opens the *.so file but does not actually use it. If this is an issue for you, then Marcelo's answer would be the way to go.
In all directories of interest, run:
ldd * > ldd_output
vi ldd_output
Then search for the library name, e.g. "aLib.so". This shows all modules linked against "aLib.so".
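A scripted variant of the same idea, in case you would rather scan a whole tree than run ldd by hand (assumes GNU find; the library name and path are placeholders):
find /path/of/interest -type f -executable -exec sh -c \
  'ldd "$1" 2>/dev/null | grep -q "aLib.so" && echo "$1"' _ {} \;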
