So I have a list of values
123.45
7739.6
7398
777777.0
1.2333333
3.3.3
Shdkv
.0
0.
.
I would like to grep the lines that don't fit a given type, for example a nullable decimal[6,2].
Those would be:
777777.0
1.2333333
3.3.3
Shdkv
.
I can select each criterion individually (such as a value that is too long or one that contains letters), but I can't combine them all properly. Additionally, a decimal point by itself, ".", should not be considered a number.
I didn't know you can specify the length before and after the decimal point in one pattern, so I tried to use sed to select the parts before and after the decimal point and check them separately:
sed -e 's/^[^\.]*\.//' | grep -E -v "^-? ?[0-9]{0,$dec_length}"
sed -e 's/\.//' | grep -E -v "^-? ?[0-9]{0,$int_length}"
However, it doesn't work for values that don't have a decimal point.
I also tried awk instead, on a friend's recommendation, but I ran into a similar problem.
Currently, thanks to tripleeee's suggestion, I am here:
sed -e 's/\+//' -e 's/\-//'| grep -E -v '^-?[0-9]{0,$int_length}(\.?[0-9]{0,dec_length}?)$'
Probably try a regex tutorial before asking for trivial regular expressions.
grep -E '^-?[0-9]{1,6}(\.[0-9]{0,2})?$'
This requires a number before the point and makes the fractional part optional; it matches the values that fit, so add -v to list the lines that do not. Leave out the -? to only allow non-negative numbers. Search for existing questions if this is not precisely what you need (your requirements are unclear with regard to some corner cases, such as whether . is a valid float).
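For what it's worth, here is a minimal sketch that reproduces the expected output listed in the question. It rests on several assumptions: that decimal[6,2] means six digits in total with two of them after the point (so at most four before it), that a leading sign is allowed, that .0 and 0. count as numbers, and that an empty line stands for NULL. The variable names int_length and dec_length are borrowed from the question; the function name invalid_decimals is made up for illustration.

```shell
#!/bin/sh
# Assumption: decimal[6,2] = 6 digits in total, 2 of them after the point.
int_length=4
dec_length=2

# Print only the lines that do NOT fit the type (hypothetical helper name).
invalid_decimals() {
  # Accepted forms: empty line (NULL), 12, 12., 12.3, 12.34 and .5,
  # each with an optional sign. A lone "." matches neither alternative,
  # so it is reported as invalid.
  grep -Ev "^([+-]?([0-9]{1,${int_length}}(\.[0-9]{0,${dec_length}})?|\.[0-9]{1,${dec_length}}))?\$"
}

printf '%s\n' 123.45 7739.6 7398 777777.0 1.2333333 3.3.3 Shdkv .0 0. . | invalid_decimals
```

The double quotes let the shell expand the two length variables, which is why the final $ is written as \$. If your definition of the type differs, only the two variables and the list of accepted forms should need to change.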
How do I replace double consonants with a single letter using the sed command on Linux? Example: WILLIAM -> WILIAM. The command grep -E '(.)\1+' finds words that contain the same character two or more times in a row, but how do I replace each such run with only one occurrence of the letter?
I tried
cat test.txt | head | tr -s '[^AEUIO\n]' '?'
tr is all or nothing; it will replace all occurrences of the selected characters, regardless of context. For regex replacement, look at sed - you even included this in your question's tags, but you don't seem to have explored how it might be useful?
sed 's/\(.\)\1/\1/g' test.txt
The dot matches any character; to restrict to only consonants, change it to [b-df-hj-np-tv-xz] or whatever makes sense (maybe extend to include upper case; perhaps include accented characters?)
The regex dialect understood by sed is more like the one understood by grep without -E (hence all the backslashes); though some sed implementations also support this option to select the POSIX extended regular expression dialect.
Neither sed nor tr needs cat to read standard input for it (though tr, obscurely, does not accept a file name argument). See tangentially also Useless use of cat?
Match one consonant and remember it in \( \), then match it again with \1 and replace the pair with the single remembered letter. The g flag repeats this for every pair on the line.
sed 's/\([bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ]\)\1/\1/g'
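As a rough sketch of the approach described in these answers, the substitution can be wrapped in a small function and tried on a few words. The function name squeeze_double_consonants is hypothetical, and treating y as a consonant is an assumption; adjust the bracket expression to taste.

```shell
#!/bin/sh
# Hypothetical helper: collapse each doubled consonant into a single letter.
squeeze_double_consonants() {
  # LC_ALL=C keeps the bracket ranges limited to plain ASCII letters.
  # The ranges skip a, e, i, o, u; counting y as a consonant is an assumption.
  LC_ALL=C sed 's/\([b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]\)\1/\1/g' "$@"
}

printf '%s\n' WILLIAM Mississippi BOOK | squeeze_double_consonants
```

Doubled vowels, as in BOOK, are left alone because the captured character must come from the consonant ranges. The function passes any arguments on to sed, so it can also be given file names instead of reading standard input.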
How do I tell sed to repeat substitution until no match was replaced?
If doing echo x | sed 's/x/xx/g' I'm really glad sed doesn't restart on the output.
But if I have, say, echo 'x,a,b,x,x,c,x,d,e,x,x,x,f,x' | sed 's/,x,/,y,/g'
it does not substitute every x for y, for an obvious reason: the prior substitution has already consumed the surrounding delimiters.
And I'm aware that I have a tiny problem with the first and last x as well, but I ignore this for simplicity of the question.
Edit: To clarify the question, as already mentioned in the comments: I want every x replaced by y, but only where it is a word by itself, enclosed by delimiters (commas in this example). A way to cope with more complex delimiters would be welcome.
(No way to fall into the y2k trap, replacing Monday by Mondak, just joking.)
Use \b to match a word boundary (a GNU sed extension, so it may not be available in other implementations).
$ echo 'x,xx,x,x' | sed 's/\bx\b/y/g'
y,xx,y,y
\b denotes word boundaries, but even used within a capture group it's not going to cause replacement of the characters outside the word, if any.
Try this:
$ echo 'x,a,b,x,x,c,x,d,e,x,x,x,f,x' | sed 's/x/y/g'
y,a,b,y,y,c,y,d,e,y,y,y,f,y
What about this?
$ echo 'x,a,b,x,x,c,x,d,e,x,x,x,f,x' | sed ':label; s/,x,/,y,/g; t label;'
x,a,b,y,y,c,y,d,e,y,y,y,f,x
This will not replace the x at the edges, but that was not requested explicitly.
It will also not replace x in words:
$ echo 'x,a,b,fix,x,c,x,d,e,x,x,x,f,x' | sed ':label; s/,x,/,y,/g; t label;'
x,a,b,fix,y,c,y,d,e,y,y,y,f,x
Explanation: the t command jumps to the label if a substitution took place. sed then applies the same substitution to the line again, and this repeats until nothing more is replaced.
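Building on the loop idea, here is a hedged sketch that also handles the x at the start and end of the line, which the question set aside. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the function name replace_field_x is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: replace x by y only where x is a whole comma-separated field.
replace_field_x() {
  # (^|,) and (,|$) accept a line edge as well as a comma.
  # Without the g flag each pass changes one field; "ta" loops until no match is left.
  sed -E -e ':a' -e 's/(^|,)x(,|$)/\1y\2/' -e 'ta'
}

printf '%s\n' 'x,a,b,x,x,c,x,d,e,x,x,x,f,x' | replace_field_x
printf '%s\n' 'x,a,b,fix,x,xx' | replace_field_x
```

The label and the branch are passed as separate -e expressions because some sed implementations read everything after the colon, semicolon included, as part of the label name. The loop ends because the replacement y can never match the pattern again.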
I can't get the ([^/]+) sed regex to work properly.
Instead of returning all non-forward slash characters, it only returns one.
Command:
echo '/test/path/file.log' | sed -r 's|^.*([^/]+)/(.*)$|\1.\2|g'
Expected:
path.file.log
Result:
h.file.log
I also tried this, but got the same result:
echo '/test/path/file.log' | sed -r 's|^.*([^/]{1,})/(.*)$|\1.\2|g'
The problem is not with [^/]+, but with the preceding .*. .* is greedy, and will consume a maximal amount of input. My usual suggestion would be to use .*? to make it non-greedy, but POSIX regexes don't support that syntax.
If there will always be a slash, you could add one to the regex to stop it from consuming too much.
$ echo '/test/path/file.log' | sed -r 's|^.*/([^/]+)/(.*)$|\1.\2|g'
path.file.log
Different operating systems ship different versions of sed. sed uses basic regular expression syntax by default; if you need extended syntax (+ is one of those features), you have to enable it with an option: -E, or -r as used above with GNU sed.
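To see the effect of the greedy .* side by side, the two substitutions from this thread can be run against the same input. This is only an illustration; the function names greedy and anchored are made up, and -E is used instead of -r on the assumption that your sed accepts it.

```shell
#!/bin/sh
path='/test/path/file.log'

# The leading .* swallows as much as it can and leaves a single
# character for ([^/]+).
greedy() { sed -E 's|^.*([^/]+)/(.*)$|\1.\2|'; }

# An explicit slash before the group forces .* to stop at a directory boundary.
anchored() { sed -E 's|^.*/([^/]+)/(.*)$|\1.\2|'; }

printf '%s\n' "$path" | greedy     # h.file.log
printf '%s\n' "$path" | anchored   # path.file.log
```

The g flag from the question is dropped here because the pattern is anchored at both ends and can match at most once per line.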
I have a string "r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"
but I only want "hash-r1.r5218.tbz".
So I tried this:
unix$ a="r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"
unix$ echo $a | sed 's/.*\/\([^\/]*\)\.tbz/\1/'    # [1]
hash-r1.r5218    # this works as I expect
unix$ echo $a | sed 's/.*\/\([^\/]+\)\.tbz/\1/'    # [2]
r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz    # unchanged, and I do not know why
As far as I remember, + in a regexp means one or more repetitions of the preceding expression, and * means zero or more.
Could anyone explain why [2] fails? Thanks a lot.
a="r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"
echo "$a" | sed 's:.*/::; s:\.tbz$::'
hash-r1.r5218
You don't need to use '/' as the pattern/replacement delimiter; you can use other characters. The ':' is very popular.
Also, you don't have to use capture groups when you know the exact text on both sides of your target data.
I have removed all characters up to the last '/', relying on .* to match everything and on '/' to end sed's standard greedy match. Then the second command replaces the trailing \.tbz with nothing.
IHTH.
Not all versions of sed support + in the regex. Some that do support it require -r to be specified. But why use sed instead of basename or echo ${a##*/}?
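A short sketch of the two sed-free alternatives mentioned here, using the string from the question. The variable names file and stem are just illustrative.

```shell
#!/bin/sh
a="r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"

# Parameter expansion: drop the longest prefix ending in "/",
# then drop the ".tbz" suffix.
file=${a##*/}
stem=${file%.tbz}
printf '%s\n' "$file" "$stem"

# basename does the same; its optional second argument is a suffix to remove.
basename "$a"
basename "$a" .tbz
```

Both forms avoid regular expressions entirely, so the differences between sed versions do not come into play.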
Using this submatch via parentheses will grab everything after the last slash to the end of your line.
str="r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"
echo $str | sed -n -E -e 's/.+\/(.+)$/\1/p'
returns hash-r1.r5218.tbz
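Oh, and your #2 prints the line unchanged because, in sed's default basic regex syntax, + is a literal character, so the pattern never matches; sed prints every line by default, whether or not the substitution succeeded. Using the -n flag suppresses that, and the trailing p on the substitution prints only the lines where a replacement was made.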
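For comparison, here is a rough sketch of three spellings that should behave the same way across sed versions, assuming -E is available for the extended ones. In basic syntax a plain + is an ordinary character, so "one or more" is written as the interval \{1,\}. The function names are hypothetical.

```shell
#!/bin/sh
a="r1/pkg/amd64/misc/hash/hash-r1.r5218.tbz"

# Basic syntax: \{1,\} is the interval form of "one or more".
strip_bre() { sed 's:.*/\([^/]\{1,\}\)\.tbz$:\1:'; }

# Extended syntax (-E): + can be used directly.
strip_ere() { sed -E 's:.*/([^/]+)\.tbz$:\1:'; }

# With -n and the p flag, only lines where a substitution happened are printed.
strip_or_nothing() { sed -n -E 's:.*/([^/]+)\.tbz$:\1:p'; }

printf '%s\n' "$a" | strip_bre
printf '%s\n' "$a" | strip_ere
printf '%s\n' 'no slash here' | strip_or_nothing
```

The first two commands print hash-r1.r5218; the last prints nothing, because the input does not match and -n suppresses the default output.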