I need to display the files on my server along with their sizes. Which command do I need to use?
Are there any variants of the ls command for this?
ls -lah should do the job. Also, if you are new to the Unix environment, please see http://www.tutorialspoint.com/unix/unix-useful-commands.htm
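For illustration, the output looks something like this (the file names and sizes here are made up):
$ ls -lah
total 2.3M
drwxr-xr-x 2 user user 4.0K Jan  5 10:00 .
drwxr-xr-x 9 user user 4.0K Jan  5 09:58 ..
-rw-r--r-- 1 user user 1.2M Jan  5 10:00 access.log
-rw-r--r-- 1 user user 980K Jan  5 09:59 data.csv
The -l gives the long listing, -a includes hidden files, and -h prints sizes in human-readable units.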
stat -c %s file.txt
This command gives you the size of the file in bytes. You can read more about why you should avoid parsing the output of ls here: http://mywiki.wooledge.org/ParsingLs
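For example, to capture the size in a shell variable for use in a script (this is GNU coreutils stat; on BSD/macOS the equivalent is stat -f %z file.txt):
size=$(stat -c %s file.txt)
echo "file.txt is $size bytes"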
ls -l --block-size=M
will give you a long format listing (needed to actually see the file size) and round file sizes up to the nearest MiB.
If you want MB (10^6 bytes) rather than MiB (2^20 bytes) units, use --block-size=MB instead.
Or
ls -lah
-h
When used with the -l option, use unit suffixes: Byte, Kilobyte, Megabyte, Gigabyte, Terabyte and Petabyte in order to reduce the number of digits to three or less using base 2 for sizes.
man ls
http://unixhelp.ed.ac.uk/CGI/man-cgi?ls
Here's yet another option to add to the mix:
$ du -b file.txt
That is: estimate file space usage of file.txt in bytes.
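Note that -b is a GNU extension (shorthand for --apparent-size --block-size=1), so a rough sketch of the alternatives would be:
du -b file.txt        # GNU du: apparent size in bytes
stat -f %z file.txt   # BSD/macOS: same number via stat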
A bit late, but here, in my opinion, is a better solution than Big McLargeHuge's answer:
$ du -h *
You can use ls -lh; then you will get a list of file information with human-readable sizes.
ls -lh file.txt | awk '{ print $5 }'
I have a directory full of files with names such as:
file_name_is_001
file_name_001
file_name_is_002
file_name_002
file_name_is_003
file_name_003
I want to copy only the files whose names don't contain 'is'. I'm not sure how to do this. I have tried to search for it, but I can't seem to find the right phrase to get results.
Details depend on operating system, shell, etc.
For a Unix system, a quite verbose but easy-to-understand approach could look like this (please note that I didn't test it):
mkdir some_temporary_directory
mv *_is_* some_temporary_directory
cp * where_ever_you_want_to_copy_it
mv some_temporary_directory/* .
rmdir some_temporary_directory
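A shorter bash-only sketch that avoids the temporary directory entirely, assuming the extglob shell option is available:
shopt -s extglob
cp !(*_is_*) where_ever_you_want_to_copy_it   # copies everything whose name lacks _is_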
You can do this using bash. First, here's a command to get you a list of files that don't contain the text _is_:
ls | grep -v "_is_"
This takes the output of ls and uses grep -v to keep only the values that do NOT contain _is_.
In order to then copy these files, we need to turn the lines output by grep into arguments of cp. We can do this using xargs:
ls | grep -v "_is_" | xargs -J % cp % new_folder
From the xargs man page, it is a tool to "build and execute command lines from standard input". Note that the -J flag is specific to BSD xargs; with GNU xargs, use -I % in its place.
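If the file names might contain spaces or other special characters, a more robust sketch avoids parsing ls entirely (assuming a find that supports -maxdepth, which GNU and BSD find both do):
find . -maxdepth 1 -type f ! -name '*_is_*' -exec cp {} new_folder/ \;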
I want to find a string pattern in a file on Unix. I use the command below:
$ grep 2005057488 filename
But the file contains millions of lines, and I have many such files. What is the fastest way to find the pattern, other than grep?
grep is generally as fast as it gets. It's designed to do one thing and one thing only, and it does what it does very well. You can read why here.
However, to speed things up there are a couple of things you could try. Firstly, it looks like the pattern you're looking for is a fixed string. Fortunately, grep has a 'fixed-strings' option:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by POSIX.)
Secondly, because grep is generally pretty slow on UTF-8 locales, you could try disabling national language support (NLS) by setting the environment variable LANG=C. So you could try this concoction:
LANG=C grep -F "2005057488" file
Thirdly, it wasn't clear in your question, but if you're only trying to find out whether something exists in your file at all, you could also cap the number of matches. With -m 1, grep will quit immediately after the first occurrence is found. Your command could now look like this:
LANG=C grep -m 1 -F "2005057488" file
Finally, if you have a multicore CPU, you could give GNU parallel a go. It even comes with an explanation of how to use it with grep. To run 1.5 jobs per core and give 1000 arguments to grep:
find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
To grep a big file in parallel use --pipe:
< bigfile parallel --pipe grep STRING
Depending on your disks and CPUs it may be faster to read larger blocks:
< bigfile parallel --pipe --block 10M grep STRING
grep generally works faster than sed:
$ grep 2005057488 filename
$ sed -n '/2005057488/p' filename
Still, both work to find that particular string in a file.
sed -n '/2005057488/p' filename
Not sure if this is faster than grep, though.
I've got a script that calls grep to process a text file. Currently I am doing something like this.
$ grep 'SomeRegEx' myfile.txt > myfile.txt.temp
$ mv myfile.txt.temp myfile.txt
I'm wondering if there is any way to do in-place processing, as in store the results to the same original file without having to create a temporary file and then replace the original with the temp file when processing is done.
Of course I welcome comments as to why this should or should not be done, but I'm mainly interested in whether it can be done. In this example I'm using grep, but I'm interested in Unix tools in general. Thanks!
sponge (in the moreutils package in Debian/Ubuntu) reads its input until EOF and only then writes it out to the file, so you can grep a file and write the result back to itself.
Like this:
grep 'pattern' file | sponge file
Perl has the -i switch, and so do sed and Ruby:
sed -i.bak -n '/SomeRegex/p' file
ruby -i.bak -ne 'print if /SomeRegex/' file
But note that all these ever do is create "temp" files behind the scenes for you; you just don't see them.
Other ways, besides grep
awk
awk '/someRegex/' file > t && mv t file
bash
while read -r line;do case "$line" in *someregex*) echo "$line";;esac;done <file > t && mv t file
No, in general it can't be done in Unix like this. You can only create/truncate (with >) or append to a file (with >>). Once truncated, the old contents would be lost.
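For example, this naive attempt loses the data, because the shell truncates the file for the redirection before grep ever reads it:
grep 'SomeRegEx' myfile.txt > myfile.txt   # DON'T: myfile.txt is emptied first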
In general, this can't be done. But Perl has the -i switch:
perl -i -ne 'print if /SomeRegEx/' myfile.txt
Writing -i.bak will cause the original to be saved in myfile.txt.bak.
(Of course internally, Perl just does basically what you're already doing -- there's no special magic involved.)
To edit file in-place using vim-way, try:
$ ex -s +'%!grep foo' -cxa myfile.txt
Alternatively use sed or gawk.
Most installations of sed can do in-place editing, check the man page, you probably want the -i flag.
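For example, to keep only the matching lines in place with GNU sed (BSD/macOS sed requires an explicit backup-suffix argument, e.g. -i ''):
sed -i -n '/SomeRegEx/p' myfile.txt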
Store the output in a variable, then write it back to the original file:
A=$(grep 'Something' aux.log) && echo "${A}" > aux.log
Take a look at my slides "Field Guide To the Perl Command-Line Options" at http://petdance.com/perl/command-line-options.pdf for more ideas on what you can do in place with Perl.
cat myfile.txt | grep 'sometext' > myfile.txt
Warning: this does not actually work. The shell opens and truncates myfile.txt for the redirection before cat has read it, so you will almost always end up with an empty file. Use sponge or a temporary file as described in the other answers.
I am looking for a Unix command to get a single line out of a big file (around 5 million records) by passing it the line number. For example, to get the 10th line, I want to do something like:
command file-name 10
Is there any such command available? We could do this by looping through each record, but that would be time-consuming.
This forum entry suggests:
sed -n '52p' (file)
for printing the 52nd line of a file.
There are a lot of ways to do it, along with other related tasks.
If you want multiple lines to be printed,
sed -n -e 'Np' -e 'Mp'
where N and M are the only lines that will be printed. Also refer to 10 Awesome Examples for Viewing Huge Log Files in Unix.
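For example, to print only lines 10 and 52 of a file:
sed -n -e '10p' -e '52p' file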
command | sed -n '10p'
or
sed -n '10p' file
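On a big file it is worth telling sed to quit as soon as the line has been printed, so it doesn't read all the way to end-of-file:
sed -n '10{p;q;}' file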
You could do something like:
head -n<lineno> <file> | tail -n1
That gives you the first <lineno> lines of the file, then prints only the last line of that output (your line).
Edit: It seems all the solutions here are pretty slow. However, by definition you'll have to iterate through the records, since the operating system stores files as byte streams and keeps no index of line boundaries. (In some sense, all these programs are going to do is count \n or \r characters.) In lieu of a great answer, I'll also present the timings of several of these commands on my system!
[mjschultz@mawdryn ~]$ time sed -n '145430980p' br.txt
0b10010011111111010001101111010111
real 0m25.871s
user 0m17.315s
sys 0m2.360s
[mjschultz@mawdryn ~]$ time head -n 145430980 br.txt | tail -n1
0b10010011111111010001101111010111
real 0m41.112s
user 0m39.385s
sys 0m4.291s
[mjschultz@mawdryn ~]$ time awk 'NR==145430980{print;exit}' br.txt
0b10010011111111010001101111010111
real 2m8.835s
user 1m38.076s
sys 0m3.337s
So, on my system, it looks like the sed -n '<lineno>p' <file> solution is fastest!
You can use awk:
awk 'NR==10{print;exit}' file
The exit after printing the 10th line stops awk from processing the rest of the 5-million-record file.
I need to rename file names like this:
transform.php?dappName=Test&transformer=YAML&v_id=XXXXX
to just this
XXXXX.txt
How can I do it?
I understand that I need more than one mv command because there are at least 25,000 files.
The easiest solution is to use mmv.
You can write:
mmv "long_name*.txt" "short_#1.txt"
Where the "#1" is replaced by whatever is matched by the first wildcard.
Similarly #2 is replaced by the second, etc.
So you do something like
mmv "index*_type*.txt" "t#2_i#1.txt"
To rename index1_type9.txt to t9_i1.txt
mmv is not standard in many Linux distributions but is easily found on the net.
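Applied to the files in the question, something like this should work (an untested sketch; note that the literal ? in the name is itself matched by an mmv ? wildcard, making it wildcard #1, so the trailing * is referenced as #2):
mmv 'transform.php?dappName=Test&transformer=YAML&v_id=*' '#2.txt'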
If you are using zsh you can also do this:
autoload zmv
zmv 'transform.php?dappName=Test&transformer=YAML&v_id=(*)' '$1.txt'
You write a fairly simple shell script in which the trickiest part is munging the name.
The outline of the script is easy (bash syntax here):
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv $i <modified name>
done
Modifying the name has many options. I think the easiest is probably an awk one-liner like
`echo $i | awk -F'=' '{print $4}'`
so...
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv "$i" `echo "$i" | awk -F'=' '{print $4}'`.txt
done
update
Okay, as pointed out below, this won't necessarily work for a large enough list of files; the * will overrun the command-line length limit. So then you use:
$ find . -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' -prune -print |
while read -r reply
do
mv "$reply" `echo "$reply" | awk -F'=' '{print $4}'`.txt
done
Try the rename command.
Or you could pipe the results of ls into a Perl regex.
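For example, with the Perl flavour of rename (sometimes installed as prename or file-rename; the util-linux rename takes a different syntax), an untested sketch would be:
rename 's/.*v_id=(.+)$/$1.txt/' transform.php*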
You may use whatever you want to transform the name (perl, sed, awk, etc.). I'll use a Python one-liner (this is Python 2 syntax; with Python 3 the equivalent would be python3 -c "print(input().split('=')[-1])"):
for file in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
mv "$file" `echo "$file" | python -c "print raw_input().split('=')[-1]"`.txt;
done
Here's the same script entirely in Python (updated to Python 3 print syntax):
import glob, os

PATTERN = "transform.php?dappName=Test&transformer=YAML&v_id=*"
for filename in glob.iglob(PATTERN):
    newname = filename.split('=')[-1] + ".txt"
    print(filename, '==>', newname)
    os.rename(filename, newname)
Side note: you would have had an easier life if you had saved the pages with the right names while grabbing them...
find -name '*v_id=*' | perl -lne'rename($_, qq($1.txt)) if /v_id=(\S+)/'
vimv lets you rename multiple files using Vim's text editing capabilities.
Entering vimv opens a Vim window which lists down all files and you can do pattern matching, visual select, etc to edit the names. After you exit Vim, the files will be renamed.
[Disclaimer: I'm the author of the tool]
I'd use ren-regexp, which is a Perl script that lets you mass-rename files very easily.
21:25:11 $ ls
transform.php?dappName=Test&transformer=YAML&v_id=12345
21:25:12 $ ren-regexp 's/transform.php.*v_id=(\d+)/$1.txt/' transform.php*
transform.php?dappName=Test&transformer=YAML&v_id=12345
1 12345.txt
21:26:33 $ ls
12345.txt
This should also work:
prfx='transform.php?dappName=Test&transformer=YAML&v_id='
ls "$prfx"* | sed "s/$prfx//" | xargs -Ipsx mv "$prfx"psx psx
This renamer command would do it:
$ renamer --regex --find 'transform.php?dappName=Test&transformer=YAML&v_id=(\w+)' --replace '$1.txt' *
OK, you need to be able to run a Windows binary for this.
But if you can run Total Commander, do this:
Select all files with *, and hit Ctrl-M
In the Search field, paste "transform.php?dappName=Test&transformer=YAML&v_id="
(Leave Replace empty)
Press Start
It doesn't get much simpler than that.
You can also rename using regular expressions via this dialog, and you see a realtime preview of how your files are going to be renamed.