Error when trying to umount the Lustre file system - unix

When I umount the Lustre FS it displays:
[root@cn17663-ens4 mnt]# umount /mnt/lustre
umount: /mnt/lustre: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
and if I add the force option -f it gives the same result:
[root@cn17663-ens4 mnt]# umount /mnt/lustre -f
umount: /mnt/lustre: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
When I try to list the directory it gives me:
[root@cn17663-ens4 mnt]# ls
ls: cannot access lustre: Cannot send after transport endpoint shutdown
lustre
I cannot figure out the reason or how to solve it.

Did you actually try running lsof /mnt/lustre (as the error message recommends) to see what is using the filesystem? This problem is not unique to Lustre but is true of any local filesystem as well: if there is a process using the filesystem (current working directory or open file), it can't be unmounted until that process stops using it (cd out of /mnt/lustre or close the open file(s)).
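For example, a minimal sketch of tracking down whatever keeps the mount busy (the PID shown is hypothetical):
lsof /mnt/lustre        # list processes with open files or a working directory under the mount
fuser -vm /mnt/lustre   # alternative: show the PIDs and users holding the mount busy
kill 12345              # stop the offending process (or cd out / close its files), then:
umount /mnt/lustre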

I find I can use umount -l /mnt/xx (lazy unmount) to solve this problem!

Related

lftp mirror wrong return code

I am using lftp mirror -R to sync a local dir to a remote sftp dir.
Just to be completely clear, the script I am running with lftp -f is as follows:
open sftp://hostname port
user username password
mirror -R local_dir sftp_dir
exit
However, I keep getting exit code 1 from mirror -R, even though stdout suggests the files were uploaded successfully, and I can verify over sftp that they were indeed uploaded.
So I am just wondering why that is happening and how I can get the correct exit code.
A non-zero exit code without error messages means that something has silently failed. Most often it is the "chmod" operation. Try adding the --no-perms option. To be sure, enable debug and watch the interaction with the server.
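For example, a minimal sketch of the adjusted script (hostname, port, credentials and directories are placeholders, as in the question):
debug 3
open -p port -u username,password sftp://hostname
mirror -R --no-perms local_dir sftp_dir
exit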

aws configure command giving [Errno 5] Input/output error

I am configuring awscli.
I run the following command:
[bharthan@pchirmpc007 ~]$ aws configure
AWS Access Key ID [None]: adfasdfadfasdfasdfasdf
AWS Secret Access Key [None]: adfasdfasdfasdfasdfasdfasd
Default region name [None]: us-east-1
Default output format [None]: json
It is giving me the following error:
[Errno 5] Input/output error
Any suggestions as to what the reason may be?
You may have some bad sectors on the target HDD.
To check the sda1 volume for bad sectors on Linux, run fsck -c /dev/sda1. For drive C: on Windows it would be chkdsk c: /f /r.
IMHO the chkdsk way is more suitable, as it remaps bad blocks on the HDD, while the Linux fsck simply marks such blocks as unusable in the current file system.
Quote from man fsck.ext2
-c This option causes e2fsck to use badblocks(8) program to do a read-only scan of the device in order to find any bad blocks. If any bad blocks are found, they are added to the bad block inode to prevent them from being allocated to a file or directory. If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test
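For example, a minimal sketch of such a scan on an unmounted ext filesystem (the device name is only an assumption; adjust it to your disk):
umount /dev/sda1      # e2fsck should not be run on a mounted filesystem
e2fsck -c /dev/sda1   # read-only badblocks scan; bad blocks get added to the bad block inode
e2fsck -cc /dev/sda1  # option given twice: non-destructive read-write test (slower)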

What causes the error "Couldn't canonicalise: No such file or directory" in SFTP?

I am trying to use SFTP to upload an entire directory to a remote host, but I get an error. (I know SCP works, but I really want to figure out the problem with SFTP.)
I used the command below:
(echo "put -r LargeFile/"; echo quit)|sftp -vb - username@remotehost:TEST/
But I got the errors "Couldn't canonicalise: No such file or directory" and "Unable to canonicalise path /home/s1238262/TEST/LargeFile".
I thought it was caused by access rights, so I opened an SFTP connection to the remote host in interactive mode and tried to create a new directory "LargeFile" in TEST/. I succeeded. Then I used the same command as above to upload the entire directory "LargeFile" and also succeeded: the subdirectories in LargeFile were created or copied automatically.
So I am confused. It seems only the LargeFile/ directory cannot be created in non-interactive mode. What's wrong with it or my command?
With SFTP you can only copy into a directory that already exists, so:
> mkdir LargeFile
> put -r path_to_large_file/LargeFile
Same as the advice in the link from @Vidhuran, but this should save you some reading.
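The same idea in non-interactive batch mode might look like this (host and paths taken from the question; whether put -r needs the remote directory to pre-exist depends on your OpenSSH version):
(echo "cd TEST"; echo "mkdir LargeFile"; echo "put -r LargeFile"; echo quit) | sftp -vb - username@remotehost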
This error could possibly occur because of the -r option. Refer to https://unix.stackexchange.com/questions/7004/uploading-directories-with-sftp
A better way is to use scp:
scp -r LargeFile/ username@remotehost:TEST/
The easiest way for me was to zip the folder locally into LargeFile.zip and simply put LargeFile.zip:
zip -r LargeFile.zip LargeFile
sftp www.mywebserver.com (or the IP of the web server)
put LargeFile.zip (it will end up in the remote server's current directory)
unzip LargeFile.zip (on the remote server, e.g. over SSH)
If you are using Ubuntu 14.04, its sftp has a bug: if a trailing '/' is added to the name, you will get the "Couldn't canonicalize: Failure" error.
For example:
sftp> cd my_inbox/ ##will give you an error
sftp> cd my_inbox ##will NOT give you the error
Notice how the forward slash is missing in the correct request. The forward slash appears when you use the TAB key to auto-complete names in the path.

Why does OpenMPI use a different server given a different -n setting?

I am testing out an OpenMPI installation provided and compiled by another user (I am using soft links to his directories for bin, include, etc. - all the mandatory directories), but I ran into this weird behaviour:
First of all, if I run mpirun with -n <= 10, the example below works. testrunmpi.py simply prints out "run." from each core.
# I am on serverA.
bash-3.2$ /home/karl/bin/mpirun -n 10 ./testrunmpi.py
run.
run.
run.
run.
run.
run.
run.
run.
run.
run.
However, when I try running with -n greater than 10, I run into this:
bash-3.2$ /home/karl/bin/mpirun -n 24 ./testrunmpi.py
karl@serverB's password: Could not chdir to home directory /home/karl: No such file or directory
bash: /home/karl/bin/orted: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 19203) died unexpectedly with status 127 while attempting
to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
bash-3.2$
bash-3.2$
Permission denied, please try again.
karl@serverB's password:
Permission denied, please try again.
karl@serverB's password:
I see that the work is dispatched to serverB while I am on serverA. I don't have an account on serverB. But if I invoke mpirun with -n <= 10, the work stays on serverA.
This is strange, so I checked /home/karl/etc/openmpi-default-hostfile and tried setting the following:
serverA slots=24 max_slots=24
serverB slots=0 max_slots=32
But the problem persists and I still get the same error message as above. What must I do to have my program run on serverA only?
The default hostfile in Open MPI is system-wide, i.e. its location is determined while the library is being built and installed and there is no user-specific version of it. The actual location can be obtained by running the ompi_info command like this:
$ ompi_info --param orte orte | grep orte_default_hostfile
MCA orte: parameter "orte_default_hostfile" (current value: <LOOK HERE>, data source: default value)
You can override the list of hosts in several different ways. First, you can provide your own hostfile via the -hostfile option to mpirun. If so, you don't have to put hosts with zero slots inside it - simply omit machines that you have no access to. For example:
localhost slots=10 max_slots=10
serverA slots=24 max_slots=24
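Then point mpirun at that file, for example (the hostfile path here is just a placeholder):
$ mpirun -hostfile ~/my_hostfile -n 10 executable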
You can also change the path to the default hostfile by setting the orte_default_hostfile MCA parameter:
$ mpirun --mca orte_default_hostfile /path/to/your/hostfile -n 10 executable
Instead of passing the --mca option each time, you can set the value in an exported environment variable called OMPI_MCA_orte_default_hostfile. This could be set in your shell's dot-rc file, e.g. in .bashrc if using Bash.
You can also specify the list of nodes directly via the -H (or -host) option.
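For example, either of the following should keep the job on serverA (paths as in the question; with -H, depending on the Open MPI version you may need a host:slots form such as serverA:24 to get more than one slot):
$ export OMPI_MCA_orte_default_hostfile=$HOME/my_hostfile
$ /home/karl/bin/mpirun -n 24 ./testrunmpi.py
$ /home/karl/bin/mpirun -H serverA -n 10 ./testrunmpi.py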

mount error(6): No such device or address when sharing a Windows folder with Ubuntu

I have a Windows shared folder named \\mymachine\sf and I want to mount it on Ubuntu. I use the smbmount command as below:
smbmount //mymachine/sf /mnt/sf -o <username>
The output is:
retrying with upper case share name
mount error(6): No such device or address
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
I'm sure the share exists and mymachine responds to ping.
Any idea?
Double check that the share exists and is the name you expect with:
smbclient -L //mymachine -U <username>
Also double check that the directory your share points to (as mentioned in smb.conf) actually exists on the server/host. This is one situation where you will receive that error, despite smbclient -L //hostname giving reasonable output.
Make sure that the directory the Samba share points to exists on the server side as well (it might have been deleted, or a mount might have failed at boot). smbclient -L //mymachine -U <username> lists shares as available even when they are not!
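If the share does show up there, a minimal mount sketch using mount.cifs (smbmount is deprecated on current distributions; the username and paths are those from the question) would be:
sudo mkdir -p /mnt/sf
sudo mount -t cifs //mymachine/sf /mnt/sf -o username=<username>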
