Which processes are using a shared library - unix

I have a shared library (.so file) on UNIX.
I need to know which running processes are using it.
Does Unix provide any such utility/command?

You can inspect the contents of /proc/<pid>/maps to see which files are mapped into each process. You'll have to inspect every process, but that's easier than it sounds:
$ grep -l /lib/libnss_files-2.11.1.so /proc/*/maps
/proc/15620/maps
/proc/22439/maps
/proc/22682/maps
/proc/32057/maps
This only works on the Linux /proc filesystem, AFAIK.
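If you also want to see which command each PID belongs to, a small loop over the matches can translate the paths (a sketch, assuming the Linux /proc layout and the example library path used above):
for m in $(grep -l /lib/libnss_files-2.11.1.so /proc/*/maps 2>/dev/null); do
    pid=${m#/proc/}; pid=${pid%/maps}
    # /proc/<pid>/comm holds the command name on reasonably recent kernels
    printf '%s\t%s\n' "$pid" "$(cat /proc/$pid/comm 2>/dev/null)"
done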

A quick solution would be to use the lsof command
[root@host]# lsof /lib/libattr.so.1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
gdm-binar 11442 root mem REG 8,6 30899 295010 /lib/libattr.so.1.1.0
gdm-binar 12195 root mem REG 8,6 30899 295010 /lib/libattr.so.1.1.0
This should work not only for .so files but also for any other files, dirs, mount points, etc.
N.B. lsof displays all processes that use a file, so there is a very remote possibility of a false positive if a process opens the *.so file but does not actually use it. If this is an issue for you, then Marcelo's answer would be the way to go.

In all directories of interest, do
ldd * > ldd_output
vi ldd_output
Then look for the library name, e.g. "aLib.so". This shows all modules linked against "aLib.so".
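Note that ldd inspects binaries on disk rather than running processes, so this tells you which executables are linked against the library. A sketch of the same idea as a single filtering pass (the directory and library name are placeholders):
for f in /usr/local/bin/*; do
    # print the binaries whose dynamic dependencies mention aLib.so
    ldd "$f" 2>/dev/null | grep -q 'aLib.so' && echo "$f"
done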

Related

Rsync copy "unsafe" symlinks but don't update modification time on the symlink targets

Is it possible to have rsync copy "unsafe" symlinks (that is, those that refer to files/dirs outside of the copied tree, see docs here) but not update the times on them?
I'm using rsync -a --delete --omit-dir-times to copy a bunch of files from /home/somebody/foo/bar to a destination machine, but running into the following error: rsync: failed to set times on "/home/somebody/foo/bar/symlink": Operation not permitted (1), where /home/somebody/foo/bar/symlink refers to something in /usr/lib/ owned by root at the destination and lacking proper permission for the rsync user to update it.
Essentially rsync tries to update the time on the symlink like all other files it copies, but gets blocked by permissions because it's not root at the destination.
What I'd like to do is copy the link, but not touch the symlink target at all during the copy. I just want the link. I could change permissions on the target file, but I'd like to avoid that.
Is this achievable? Is this a terrible idea and I'd be abusing rsync? Suggestions for alternative approaches in the latter case?
There is another rsync option, --omit-link-times, which will probably do what you are looking for. See the man page at:
http://manpages.ubuntu.com/manpages/bionic/man1/rsync.1.html
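Assuming a recent enough rsync on both ends, the command from the question would become something like this (the destination spec is a placeholder):
rsync -a --delete --omit-dir-times --omit-link-times /home/somebody/foo/bar/ user@destination:/home/somebody/foo/bar/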

synchronise local directories over ssh

The following command works great for me for a single file:
scp your_username@remotehost.edu:foobar.txt /some/local/directory
What I want to do is do it recursively (i.e. for all subdirectories/files of a given path on the server), merge folders and overwrite files that already exist locally, and finally download only those files on the server that are smaller than a certain value (e.g. 10 MB).
How could I do that?
Use rsync.
Your command is likely to look like this:
rsync -az --max-size=10m your_username@remotehost.edu:foobar.txt /some/local/directory
-a (archive mode - the sync is recursive, transfers ownership, attributes, symlinks among other things)
-z (compresses transfer)
--max-size (only copies files up to a certain size)
There are many more flags which may be suitable. Check out the docs for more details - http://linux.die.net/man/1/rsync
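For a whole directory tree rather than a single file, the same flags apply; a sketch (the remote path is a placeholder):
rsync -az --max-size=10m your_username@remotehost.edu:/path/on/server/ /some/local/directory/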
First option: use rsync.
Second option: it's not going to be a one-liner, but it can be done in three or four lines:
Create a tar archive on the remote system using ssh.
Copy the tar from remote system with scp.
Untar the archive locally.
If the creation of the archive gets a bit complicated and involves using find and/or tar with several options, it is quite practical to write the script locally, upload it to the server with scp, and only then execute it remotely with ssh.
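A rough sketch of those three steps, assuming GNU find and tar on the remote side and using the 10 MB cut-off from the question (paths are placeholders):
# 1. build the archive remotely, keeping only files under 10 MB
ssh your_username@remotehost.edu 'cd /path/on/server && find . -type f -size -10M -print0 | tar -czf /tmp/transfer.tar.gz --null -T -'
# 2. copy it over
scp your_username@remotehost.edu:/tmp/transfer.tar.gz /tmp/
# 3. unpack locally, merging into the existing directory
tar -xzf /tmp/transfer.tar.gz -C /some/local/directory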

Copy or rsync command

The following command is working as expected...
cp -ur /home/abc/* /mnt/windowsabc/
Does rsync have any advantage over it? Is there a better way to keep the backup folder in sync every 24 hours?
Rsync is better since it will copy only the updated parts of changed files, instead of whole files. It can also use compression and encryption if you want. Check out this tutorial.
rsync is not necessarily more efficient, due to the more detailed inventory of files and blocks it performs. The algorithm is fantastic at what it does, but you need to understand your problem to know if it is really going to be the best choice.
On a very large file system (say many thousands or millions of files) where files tend to be added but not updated, "cp -u" will likely be more efficient. cp makes the decision to copy solely on metadata and can simply get to the business of copying.
Note that you might want some buffering, e.g. by using tar rather than straight cp, depending on the size of the files, network performance, other disk activity, etc. I find the following idea very useful:
tar cf - . | tar xCf directory -
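With the paths from the question, that would look something like this (a sketch):
(cd /home/abc && tar cf - .) | tar -C /mnt/windowsabc -xf -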
Metadata itself may actually become a significant overhead on very large (cluster) file systems, but rsync and cp will share this problem.
rsync seems to frequently be the preferred tool (and in general purpose applications is my usual default choice), but there are probably many people who blindly use rsync without thinking it through.
The command as written will create new directories and files with the current date and time stamp, and yourself as the owner. If you are the only user on your system and you are doing this daily it may not matter much. But if preserving those attributes matters to you, you can modify your command with
cp -pur /home/abc/* /mnt/windowsabc/
The -p will preserve ownership, timestamps, and mode of the file. This can be pretty important depending on what you're backing up.
The alternative command with rsync would be
rsync -avh /home/abc/* /mnt/windowsabc
With rsync, -a indicates "archive" which preserves all those attributes mentioned above. -v indicates "verbose" which just lists what it's doing with each file as it runs. -z is left out here for local copies, but is for compression, which will help if you are backing up over a network. Finally, the -h tells rsync to report sizes in human-readable formats like MB,GB,etc.
Out of curiosity, I ran one copy to prime the system and avoid biasing against the first run, then I timed the following on a test run of 1GB of files from an internal SSD drive to a USB-connected HDD. These simply copied to empty target directories.
cp -pur : 19.5 seconds
rsync -ah : 19.6 seconds
rsync -azh : 61.5 seconds
Both commands seem to be about the same, although compressing and decompressing obviously tax the system when bandwidth is not a bottleneck.
Especially if you use a copy-on-write filesystem like BTRFS or ZFS, rsync is much better.
I use BTRFS, and I have this in my ~/.bashrc:
alias cp="rsync -ah --inplace --no-whole-file --info=progress2"
The important flag here for CoW FSs like BTRFS is --inplace because it only copies the changed part of the files, doesn't create new inodes for small changes between files, etc. See this.
It's not really a question of what's more efficient.
The commands 'rsync', and 'cp' are not equivalent and achieve different goals.
1- rsync can preserve the timestamps of existing files (using the -a option).
2- rsync runs as multiple processes and transfers using either local sockets or network sockets (i.e. it forks itself into multiple processes).
3- The multiple processes will increase your throughput when copying a large number of small files, and even with multiple larger files.
So the bottom line is: rsync is for large data, and cp is for smaller local copying (MB to small-GB range). When you start getting into multiple GB or the TB range, go with rsync. And of course for network copies, rsync all the way.
For a local copy, the only advantage of rsync is that it will avoid copying if the file already exists in the destination directory. The definition of "already exists" is (a) same file name (b) same size (c) same timestamp. (Maybe same owner/group; I am not sure...)
The "rsync algorithm" is great for incremental updates of a file over a slow network link, but it will not buy you much for a local copy, as it needs to read the existing (partial) file to run it's "diff" computation.
So if you are running this sort of command frequently, and the set of changed files is small relative to the total number of files, you should find that rsync is faster than cp. (Also rsync has a --delete option that you might find useful.)
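For the 24-hour backup in the question, that could look like the following sketch; note that --delete removes from the destination anything that was deleted from the source, so use it with care:
rsync -a --delete /home/abc/ /mnt/windowsabc/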
Keep in mind that when transferring files internally on a machine, i.e. not a network transfer, using the -z flag can make a massive difference in the time taken for the transfer.
Transfer within same machine
Case 1: With -z flag:
TAR took: 9.48345208168
Encryption took: 2.79352903366
CP took = 5.07273387909
Rsync took = 30.5113282204
Case 2: Without the -z flag:
TAR took: 10.7535531521
Encryption took: 3.0386879921
CP took = 4.85565590858
Rsync took = 4.94515299797
Keep in mind that cp doesn't preserve existing files when copying folders of the same name. Let's say you have these folders:
/myFolder
    someTextFile.txt
/someOtherFolder
    /myFolder
        wellHelloThere.txt
Then you copy one over the other:
cp /someOtherFolder/myFolder /myFolder
result:
/myFolder
    wellHelloThere.txt
This is at least what happens on macOS, and I wanted to preserve the differing files, so I used rsync.
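With rsync, a merge along these lines keeps the files that exist only in the destination and overwrites the same-named ones (a sketch using the example folders above):
rsync -a /someOtherFolder/myFolder/ /myFolder/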
I will prefer to use rsync with the following options
rsync -avhW --no-compress --progress --info=progress2 <src directory> <dst directory>
The above options are as follows:
-a for archive mode, which preserves ownership, permissions, etc.
-v for verbose
-h for human-readable
-W for copying whole files only
--no-compress as there's no lack of bandwidth between local devices
--progress to see the progress of large files
--info=progress2 to see the overall progress
source directory path
destination directory path
rsync is much better than cp because rsync copies whole files/directories only the first time. The next time you use the rsync command with the same files/directories, only new changes are copied to the destination folder, not the entire files.
I used rsync to transfer 330 GB of data from a local HD to an external HD via USB 3.0. It took me three days. The transfer rate went down to 800 KB/s and rose to 50 MB/s for a while only after pausing the job. It is a typical overbuffering issue. A bad experience for local file transfers: as the name indicates, (r)sync stands for remote sync (it is optimized for transfers over a network). As often happens, I discovered the "-z" flag only after I wondered about the issue and looked for an explanation.

shell built in pwd versus /bin/pwd

I would like to know the code implementation of built-in pwd and /bin/pwd especially when the directory path is a symbolic link.
Example:
hita@hita-laptop:/home$ ls -l
lrwxrwxrwx 1 root root 31 2010-06-13 15:35 my_shell -> /home/hita/shell_new/hita_shell
hita@hita-laptop:/home$ cd my_shell
hita@hita-laptop:/home/my_shell$ pwd <SHELL BUILT-IN PWD>
/home/my_shell
hita@hita-laptop:/home/my_shell$ /bin/pwd
/home/hita/shell_new/hita_shell
The output is different in both the cases. Any clue?
thanks
The kernel maintains a current directory (by inode) and when you need the current directory, it determines its name by walking up the directory tree (using ..) to find the names of all the path components. This is the 'real' or sometimes called 'physical' working directory. There is a library function getcwd(3) which does this for you; on more-recent Linux systems this is actually a system call, which helps with getting a consistent view should the parent directories be in the process of being renamed.
Some shells, notably bash, maintain an environment variable PWD to keep track of where you are, and if you changed directory through a symbolic link, this environment variable will show that. They call this the 'logical' path.
/bin/pwd shows the result of getcwd(3), i.e. the real path; if you give it -L it will tell you the value of PWD (unless it's rubbish, in which case you get the real path). (GNU's version of /bin/pwd does more work than this to deal with the complexities of parent directories without read permission and very long path names.)
Bash's built-in pwd shows you the 'logical' path with whatever symlinks you used to get there, even if it's now rubbish (i.e. deleted or renamed since you used it). The default of the built-in pwd can be changed with set -o physical (on) or set +o physical (off is plus!). The default prompt (containing the current directory) follows the option too.
# make a directory with a symlink alias
cd /tmp
mkdir real
ln -s real sym
cd sym
pwd # will say sym
pwd -L # will say sym
pwd -P # will say real
/bin/pwd # will say real
/bin/pwd -L # will say sym
/bin/pwd -P # will say real
rm /tmp/sym
pwd # says sym, though link no longer exists
/bin/pwd -L # will say real!
rmdir /tmp/real
pwd # says sym, though no directory exists
/bin/pwd # says error, as there isn't one
For what it's worth, my opinion is that all the 'logical' business just adds to the confusion; the old way was the better way. It's true that symbolic links can be confusing, but this makes it more confusing, because file operations which open .. don't do the same thing as directory changes which use .., for example in this rather nasty example:
mkdir -p /tmp/dir/subdir
ln -s /tmp/dir/subdir /tmp/a
cd /tmp/a
ls .. # shows contents of /tmp/dir
(cd .. ; ls) # shows contents of /tmp
To avoid all this, you can put the following in your ~/.bashrc
set -o physical
Hope that helps!
Kind regards,
J.
PS The above is pretty specific to Linux and Gnu bash; other shells and systems are similar but different.
The shell's builtin pwd has the advantage of being able to remember how you accessed the symlinked directory, so it shows you that information. The standalone utility just knows what your actual working directory is, not how you changed to that directory, so it reports the real path.
Personally, I dislike shells that do what you're describing because it shows a reality different than that which standalone tools will see. For example, how a builtin tool parses a relative path will differ from how a standalone tool parses a relative path.
The shell keeps track in its own memory of what your current directory is, by concatenating it with whatever you cd to (and eliminating . and .. entries). It does this so that symbolic links don't mess up cd ... The /bin/pwd implementation walks the directory tree upwards, trying to find inodes with the right names.
The built-in pwd shows symbolic links by default, but won't if you give it the -P option.
In contrast, the standalone /bin/pwd doesn't show symbolic links by default, but will if given the -L option.

Add last n lines of files to tar/zip

I need to regularly send a collection of log files that can grow quite large, so I would like to only send the last n lines of each of the files.
for example:
/usr/local/data_store1/file.txt (500 lines)
/usr/local/data_store2/file.txt (800 lines)
Given a file with a list of needed files named files.txt, I would like to create an archive (tar or zip) with the last 100 lines of each of those files.
I can do this by creating a separate directory structure with the tail-ed files, but that seems like a waste of resources when there's probably some piping magic that can accomplish it. The full directory structure must also be preserved, since files can have the same names in different directories.
I would like the solution to be a shell script if possible, but perl (without added modules) is also acceptable (this is for Solaris machines that don't have ruby/python/etc.. installed on them.)
You could try
tail -n 10 your_file.txt | while read -r line; do zip /tmp/a.zip "$line"; done
where a.zip is the zip file and 10 is n, or
tail -n 10 your_file.txt | xargs tar -czvf test.tar.gz --
for tar.gz
You are focusing on a specific implementation instead of looking at the bigger picture.
If the final goal is to have an exact copy of the files on the target machine while minimizing the amount of data transferred, what you should use is rsync, which automatically sends only the parts of the files that have changed, and can also compress while sending and decompress while receiving.
Running rsync doesn't need any more daemons on the target machine than the standard sshd, and to set up automatic transfers without passwords you just need to use public key authentication.
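A sketch of what that could look like with the files.txt from the question (the destination host and path are placeholders; --files-from preserves the listed paths relative to the source directory, here /):
rsync -az --files-from=files.txt / user@receiver:/var/log/mirror/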
There is no piping magic for that; you will have to create the folder structure you want and zip that.
mkdir tmp
for i in /usr/local/*/file.txt; do
    # recreate the original directory structure under tmp/ (strip the leading /)
    mkdir -p "`dirname tmp/${i:1}`"
    tail -n 100 "$i" > "tmp/${i:1}"
done
zip -r zipfile tmp/*
Use logrotate.
Have a look inside /etc/logrotate.d for examples.
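For example, a stanza along these lines would keep the logs from growing without bound; the config file name and the rotation policy here are illustrative, not from the question:
cat > /etc/logrotate.d/data_store <<'EOF'
/usr/local/data_store*/file.txt {
    daily
    rotate 7
    compress
    missingok
}
EOF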
Why not put your log files in SCM?
Your receiver creates a repository on his machine from where he retrieves the files by checking them out.
You send the files just by committing them. Only the diff will be transmitted.
