Unix egrep looks into folder information?

I have a simple egrep command that searches through all the files in the current directory for lines that contain the word "error":
egrep -i "error" *
This command will go through the sub-directories as well. Here is a sample of what the whole folder looks like:
/Logfile_20120630_030000_ID1.log
/Logfile_20120630_030001_ID2.log
/Logfile_20120630_030005_ID3.log
/subfolder/Logfile_20120630_031000_Errors_A3.log
/subfolder/Logfile_20120630_031001_Errors_A3.log
/subfolder/Logfile_20120630_031002_Errors_A3.log
/subfolder/Logfile_20120630_031003_Errors_A3.log
The logfiles in the top directory contain "error" lines, but the logfiles in the "subfolder" directory do not contain any lines with "error" (it appears only in their filenames).
So the problem I am seeing is that the egrep command seems to be looking at information inside "subfolder" itself. My result shows a chunk of what looks like a binary block, followed by the text lines containing the word "error" from the top-folder logfiles.
If I delete all the files underneath "subfolder" but leave the folder itself, I get exactly the same results.
So does Unix keep file history information inside a folder??
The problem was corrected by running:
find . -type f | egrep -i "error" *
But I still don't understand why it was a problem. I'm running the C shell on SunOS.

egrep -i error *
The * metacharacter matches ANY file name, and directories are files, too. * is expanded by the shell into any and all names in the current directory; this is traditionally called globbing. So "subfolder" itself was passed to egrep, and on an older system such as SunOS, grep will happily read a directory as if it were an ordinary file; those raw directory entries are the binary-looking block you saw. Entries for deleted files can linger in that raw data until the space is reused, which is why emptying the directory did not change the output; it is not Unix keeping a file history.
set noglob
turns off that behavior in the C shell. With globbing disabled, though, egrep would be handed the literal name *, and since it is unlikely there is a file actually named * in your directory, the command would find no files at all. By the way, do not create a file named * to test this, because files named * can cause all kinds of interesting and unwanted things to happen. Think about what happens when you try to delete it: rm '*' would be the right command, but if you or someone else ran rm * unthinkingly, then you have problems...
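If the goal is to search only regular files and skip directories entirely, something along these lines should work (a sketch; the /dev/null argument just forces egrep to print file names even when only one file ends up on the command line, and -maxdepth is a GNU/modern find extension that an older SunOS find may not have):
# Recurse, but hand egrep regular files only (no directories)
find . -type f -print | xargs egrep -i error /dev/null
# Stay in the current directory only, where find supports -maxdepth
find . -maxdepth 1 -type f -print | xargs egrep -i error /dev/null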

Related

rsync will not read from include file

I am trying to use rsync to do backups. I have an include file called /etc/daily.rsync and it contains the following:
+ /home/demo
- *
Then I run the command below:
$ sudo rsync -acvv --delete --include-from=/etc/daily.rsync /mnt/offsite_backup/home/
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
drwxrwxr-x 6 2021/02/22 14:09:13 .
total: matches=0 hash_hits=0 false_alarms=0 data=0
sent 52 bytes received 131 bytes 366.00 bytes/sec
total size is 0 speedup is 0.00
When I go look in the directory, I see nothing. What I think is happening is that it is trying to rsync from the current directory, which happens to be empty. This leads me to believe that it is not getting the data from the include file.
This command runs as expected:
sudo rsync -acvv --delete /home/demo /mnt/offsite_backup/home/
The different posts made many suggestions, and I have tried them. I am just stuck. Any thoughts would be very welcome.
I think you're misunderstanding what a filter file (like the one you specified with --include-from) does. It does not specify where to sync files from; it specifies which files within the source directory to sync.
You need to specify both the source and destination as part of the command line. In the command:
sudo rsync -acvv --delete --include-from=/etc/daily.rsync /mnt/offsite_backup/home/
You only specified one directory, /mnt/offsite_backup/home/, so rsync has assumed it's the source, and there is no destination. According to the rsync man page:
As a special case, if a single source arg is specified without a
destination, the files are listed in an output format similar to "ls -l".
So, basically, it's listing the contents of /mnt/offsite_backup/home/, and apparently that's empty.
The second command you gave specifies both the source and destination, which is why it works correctly. If you want to add a filter file to it, be aware that the paths in the filter will be relative to the source. So if you used
sudo rsync -acvv --delete --include-from=/etc/daily.rsync /home/demo /mnt/offsite_backup/home/
...it's going to try to include the file/directory /home/demo/home/demo, which probably doesn't exist. Except it actually won't even do that, because the - * line will exclude /home/demo/home, so even if it did exist, it and its contents would be excluded. You need to include the parent directories of anything you want to include in the sync operation. Again, from the man page:
The concept of path exclusion is particularly important when using a
trailing '*' rule. For instance, this won't work:
+ /some/path/this-file-will-not-be-found
+ /file-is-included
- *
This fails because the parent directory "some" is excluded by the '*' rule, so rsync never visits any of the files in the "some" or "some/path" directories. One solution is to ask for all directories in the hierarchy to be included by using a single rule: "+ */" (put it somewhere before the "- *" rule), and perhaps use the --prune-empty-dirs option. Another solution is to add specific include rules for all the parent dirs that need to be visited. For instance, this set of rules works fine:
+ /some/
+ /some/path/
+ /some/path/this-file-is-found
+ /file-also-included
- *
OK, so after walking away from the problem I realized that I never specified what actual directory I wanted to sync. The include file can't work from thin air. So the command is:
sudo rsync -acv --delete /home/ --include-from=/etc/weekly.rsync /mnt/offline_backup/home/
The include file had to change as well.
+ demo/***
+ truenorth/***
- *
To have it descend into the directory structure, I needed the ***. I hope this can help someone else out.
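If you want to preview what a filter file will actually transfer before running the real sync, rsync's --dry-run flag (-n) is useful; for example, with the command above:
sudo rsync -acvn --delete --include-from=/etc/weekly.rsync /home/ /mnt/offline_backup/home/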

Grep command stopped working

Suddenly the grep command stopped working. When I run ls -l ~/grep, it shows a file named grep in my home directory, but this file has been present for ages. If I run which grep, it points to /bin/grep, and /bin/grep works fine. Can anyone please suggest what is wrong?
Thanks,
Regards,
Shiv
You can delete the zero-byte file in your home directory. It's not doing anything. (I don't know how it got there.) The problem is that the first entry in PATH, ".", points to whatever directory you're in. So when you're in your home directory, the shell (bash, I assume) looks for grep in the current directory, and finds the file that's there, which can't do anything.
I consider it a bad idea to have "." in your path. It's convenient, and natural if you're coming from the Windows world, but it means that what gets executed can change depending on what directory you're in (as you have now seen). It also means that if you're on a multiuser system, someone can put an executable in one of their directories, and then when you cd into their directory, all of a sudden you're executing their code, which might not be what you want, and could be dangerous.
Instead, remove ".:" (dot colon) from your PATH. When you need to run a script in the current directory, prefix its name with "./" to execute it. "/bin" and "/usr/bin" should usually be near the front of the list; some people prefer to put "/usr/local/bin" first.
You can change your PATH by editing .profile or .bash_profile or .bashrc. It depends on how you have your shell set up. Be careful to separate each directory path in PATH with one ":" character.
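For example, a PATH setting like the following (a sketch; adjust the directory list to what your system actually needs) drops "." and puts the system directories first:
# In ~/.profile, ~/.bash_profile or ~/.bashrc; $HOME/bin here is just an
# example of a personal script directory
export PATH=/usr/local/bin:/usr/bin:/bin:$HOME/bin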

How to delete Excel files from a Unix machine?

I have to delete a number of files with names like "test excel-27-03-2016.xls" from a directory on a Unix machine. Can you please suggest how? I tried using the command
rm -f test excel-27-03-2016.xls
but it is not deleting the file.
Does the name of the file contain a space? It seems so.
If this is the case, rm -f "test excel-27-03-2016.xls" (note double quotes around the file name) ought to do it.
Running rm -f test excel-27-03-2016.xls means trying to erase two files, one named test and the other excel-27-03-2016.xls.
So if 'test excel-27-03-2016.xls' is one filename, you have to escape the space in the rm command.
rm test\ excel-27-03-2016.xls
or
rm 'test excel-27-03-2016.xls'
otherwise rm will think 'test' and 'excel-27-03-2016.xls' are two different files.
(Also you shouldn't need to use -f.)
For a single file, if the file name contains spaces, you have to protect those spaces. By default, the shell splits file names (arguments in general) at spaces. So, enclose the name in double quotes or single quotes:
rm -f "test excel-27-03-2016.xls"
or use a backslash if you prefer (but I don't prefer backslashes; I normally use quotes):
rm -f test\ excel-27-03-2016.xls
When a delete operation doesn't work, the -f option to rm becomes your enemy; it suppresses the error messages that rm would otherwise give. For example, without the -f, you might see:
$ rm test excel-27-03-2016.xls
rm: test: No such file or directory
rm: excel-27-03-2016.xls: No such file or directory
$
That tells you that rm was given two names, not one as you intended.
From a comment:
I have 20-30 files; do I have to give rm 'test excel-27-03-2016.xls' each time and provide "Yes" permission to delete each file?
Time to learn wild-cards. First thing to learn — Be Careful! Do not destroy files carelessly.
Run a command such as:
ls -ld *.xls
Does that list the files you want deleted — all the files you want deleted and nothing but the files you want deleted? If it doesn't contain any extra file names (and no directory names), then you can run:
rm -f *.xls
If it doesn't contain all the file names you need deleted, but it does contain only names that you need deleted, then run the rm to reduce the size of the problem, then devise an alternative pattern to delete the others:
ls -ld *.xlsx # Semi-plausible
If it contains too many names, you have a couple of options. One is to use rm interactively:
rm -i *.xls
and respond yes to those that should be deleted and no to those that should be kept. Another is to work out a more precise wildcard, perhaps *-27-03-2016.xls.
When using wild-cards, the shell keeps file names as single arguments, so the fact that the generated names have spaces in them isn't a problem. Be aware that many shell techniques, such as capturing that list of file names in a variable, do not preserve the spaces properly — a cause of much confusion.
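A quick illustration of that pitfall, assuming a POSIX-style shell such as bash and the file names discussed above:
# Unsafe: the command substitution is split at spaces, so rm ends up
# being given "test" and "excel-27-03-2016.xls" as two separate arguments
files=`ls *.xls`
rm $files
# Safe: let the shell expand the wildcard directly, one argument per file
rm *.xls
# Safe loop form, quoting the variable so each name stays intact
for f in *.xls
do
    rm "$f"
done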
And, with any mass file removal, be very careful. The Unix system will not stop you from doing immense damage to your files. It will take you at your word: if you say 'remove everything', it will try to do so.
From another comment:
I have taken root access so I will have all permissions.
Don't run as root when you have problems working out what you are doing. Running as root means that any mistake has the potential to be dramatically more devastating than if you run as yourself.
If you are running as root, the -f option to rm really isn't needed (unless someone has attempted to protect you by creating an alias for the rm command).
When you're root, the system does what you tell it to do. root: Remove the kernel. system: Yes, sir! Right away, sir! root: Remove the complete root file system. system: Yes, sir! Right away, sir!
Be very, very careful when running as root. It is a bad idea to experiment when running as root. It is very important to know exactly what you plan to do as root, to gain root privileges and do what you plan to do, and then lose the root privileges as soon as possible. Use sudo (or su) to temporarily gain root privileges.

Correct Wildcard Notation for UNIX systems?

I am currently trying to remove a number of files from my root directory. There are about 110 files with almost the exact same file name.
The file name appears as wp-cron.php?doing_wp_cron=1.93 where 93 is any integer from 1-110.
However, when I try to run the command sudo rm /root/wp-cron.php?doing_wp_cron=1.* it actually tries to find a file with the asterisk * in the filename, leaving me with a file-not-found error.
What is the correct notation for removing a series of files using wildcard notation?
NOTE: I have already tried quoting the file path with both single quotes ' and double quotes ". This did not help.
Any thoughts on the matter?
Take a look at the permissions on the /root directory with ls -ld /root; typically a non-root user will not have r-x permissions, which means they cannot read the directory listing.
In your command sudo rm /root/wp-cron.php?doing_wp_cron=1.* the filename expansion attempt happens in the shell running under your non-root user. That fails to expand to the individual filenames as you do not have permissions to read /root.
The shell then execs sudo\0rm\0/root/wp-cron.php?doing_wp_cron=1.*\0. (Three separate, explicit arguments).
sudo, after satisfying its conditions, execs rm\0/root/wp-cron.php?doing_wp_cron=1.*\0.
rm runs and attempts to unlink the literal path /root/wp-cron.php?doing_wp_cron=1.*, failing as you've seen.
The solution for removing the files depends on your sudo permissions. If permitted, you may run a bash sub-process to do the file-name expansion as root:
sudo bash -c "rm /root/a*"
If not permitted, do the sudo rm with explicit filenames.
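Another option, if your sudo rules allow running find, is to let find do the matching as root instead of your own shell (a sketch; the pattern is single-quoted so your non-root shell does not try to expand it, and the literal ? in these names is matched by find's ? wildcard):
# find, running as root, can read /root and expand the pattern itself
sudo find /root -type f -name 'wp-cron.php?doing_wp_cron=1.*' -exec rm {} +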
Brandon,
I agree with #arkascha. That glob should match, so something is amiss here. Do you get the appropriate list of files if you use a different binary, say ls? Try this:
ls /root/wp-cron.php?doing_wp_cron=1.*
If that returns the full list of files, then you know there's something funny with your environment regarding rm. This could be an alias as suggested.
If you cannot determine what is different or wrong with your environment, you could run the list of files through a for loop and remove each one as a work-around:
for file in `ls /root/wp-cron.php?doing_wp_cron=1.*`
do
rm "$file"
done

Tar creating a file that is unexpectedly large

Figured maybe someone here might know what's going on. Essentially, what I have to do is take a directory and make a tar file, omitting a subdirectory two levels down (root/1/2). Given that it needs to work on a bunch of platforms, the easiest way I could think of was to do a find and egrep that directory out, which works well, giving me the list of files.
But then I pipe that file list into an xargs tar rvf command, and the resulting file comes out at something like 33 GB. I've tried outputting the find results to a file and using tar -T with that file as input; it still comes out to about 33 GB, whereas if I do a straight tar of the whole directory (not omitting anything) it comes in where I'd expect it, at 6-ish GB.
Any thoughts on what is going on, or how to remedy this? I really need to get this figured out. I'm guessing it has to do with feeding it a list of files vs. having it just tar a directory, but I'm not sure how to fix that.
Your find command will return directories as well as files.
Consider using find to look for directories and to exclude some:
tar cvf /path/to/archive.tar $(find suite -type d ! -name 'suite/tmp/Shared/*')
When you specify a directory in the file list, tar packages the directory and all the files in it. If you then list the files in the directory separately, it packages the files (again). If you list the sub-directories, it packages the contents of each subdirectory again. And so on.
If you're going to do a files list, make sure it truly is a list of files and that no directories are included.
find . -type f ...
The ellipsis might be find options to eliminate the files in the sub-directory, or it might be a grep -v that eliminates them. Note that -name normally only matches the last component of the name. GNU find has ! -path '*/subdir/*' or variants that will allow you to eliminate the file based on path, rather than just name:
find . -type f ! -path './root/1/2/*' -print
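Putting that together for the original problem, one way to build the archive from a clean list of regular files (a sketch, assuming GNU find's ! -path test and a tar that supports -T):
# List only regular files, excluding everything under root/1/2
find . -type f ! -path './root/1/2/*' -print > /tmp/filelist.txt
# Build the archive in one pass from that list; with no directories in
# the list, nothing gets packaged twice
tar cvf /path/to/archive.tar -T /tmp/filelist.txt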
