KSH script checks for alphanumeric in case statement - unix

Below is a simplified model of what I'm trying to achieve:
#!/bin/ksh
string=AUS00
case $string in
[[:alnum:]] ) echo "alphanumeric" ;;
*) echo "nope" ;;
esac
I'm unable to validate an alphanumeric code.
Constraints:
The validation needs to happen inside the case statement.
The alnum function is not supported.
Positive check only: I can't check for the absence of alphanumeric characters.
Thank you very much

The pattern [[:alnum:]] will match a single alphanumeric character. Your string is longer than one character, so it won't match.
If you want to check that your string contains an alnum character, you want *[[:alnum:]]*
If you want to check that your string only contains alnum characters, I'd flip the check to see if the string contains a non-alnum character:
for string in alnumOnly 'not all alnum'; do
case "$string" in
*[^[:alnum:]]*) echo "$string -> nope" ;;
*) echo "$string -> alphanumeric" ;;
esac
done
alnumOnly -> alphanumeric
not all alnum -> nope
I realized that ksh (even ksh88) implements what bash describes as "extended patterns":
A pattern-list is a list of one or more patterns separated
from each other with a |. Composite patterns can be formed
with one or more of the following:
?(pattern-list)
Optionally matches any one of the given patterns.
*(pattern-list)
Matches zero or more occurrences of the given
patterns.
+(pattern-list)
Matches one or more occurrences of the given patterns.
@(pattern-list)
Matches exactly one of the given patterns.
!(pattern-list)
Matches anything, except one of the given patterns.
So we can do:
case "$string" in
+([[:alnum:]]) ) echo "$string -> alphanumeric" ;;
* ) echo "$string -> nope" ;;
esac
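The three pattern shapes discussed above (a single alnum character, a string containing one, a string made only of them) can be compared side by side. What follows is a rough Python sketch, not ksh: it assumes the C locale, so [[:alnum:]] is approximated by [A-Za-z0-9], and the helper name classify is made up for illustration.

```python
import re

# Assumption: C locale, where [[:alnum:]] means ASCII letters and digits.
ALNUM = "[A-Za-z0-9]"

def classify(string):
    """Report which of the three ksh-style patterns the string would match."""
    return {
        # ksh: [[:alnum:]]      exactly one alnum character
        "single": re.fullmatch(ALNUM, string) is not None,
        # ksh: *[[:alnum:]]*    at least one alnum character somewhere
        "contains": re.search(ALNUM, string) is not None,
        # ksh: +([[:alnum:]])   one or more alnum characters and nothing else
        "only": re.fullmatch(ALNUM + "+", string) is not None,
    }

for s in ("AUS00", "A", "not all alnum", ""):
    print(repr(s), classify(s))
```

For AUS00 only the "single" check fails, which is why the original case branch never fired.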

Related

Using SPARQL's URI( ) fn w/PREFIX & suffix containing slashes?

I want to use a PREFIX to simplify URI creation. I will welcome any advice that helps me build a mental model of what PREFIX is doing in a SPARQL query; it doesn't seem to be a simple key/value replacement.
Here are some examples of what I have tried.
working
This works as expected and does what I want, except for not using a PREFIX.
SELECT * WHERE {
BIND ( URI("http:://www.foo.com/bar/01/grik/234") as ?s ) # (a) works fine
?s a ?o .
# Here (a) works as expected. I'm binding ?s to a specific URI
# for testing because otherwise it runs too long to debug my query logic.
}
LIMIT 10
My failed PREFIX attempts
My actual prefix URI fragment is longer but this example shows the idea.
I want to put the first part of the above URI, http:://www.foo.com/bar/, in a PREFIX and use 01/grik/234 as a suffix.
Variations of this return nothing or error out on URI composition:
PREFIX foo: <http:://www.foo.com/bar/>
SELECT * WHERE {
# I'm just running one of these BIND statements
# at a time; listing all of them here for easier visual comparison.
# BIND ( URI(foo:01/grik/234) as ?s ) # (b) Lexical error. Encountered "/" after "grik"
# BIND ( URI(foo:"01/grik/234") as ?s ) # (c) Encountered " <STRING_LITERAL2> "\01/grik/234"\""
# BIND ( URI(foo:URI("01/grik/234")) as ?s ) # (d) Encountered "/" after "01"
# BIND ( URI(foo:ENCODE_FOR_URI("01/grik/234")) as ?s ) # (e) Encountered "/" after "01"
# BIND( URI(foo:ENCODE_FOR_URI("01/grik/234")) as ?s ) # (f) WARN URI <http:://www.foo.com/bar/ENCODE_FOR_URI> has no registered function factory
?s a ?o .
}
LIMIT 10
You're trying to use an IRI in its prefixed name form. The W3C SPARQL recommendation contains the following section:
4.1.1.1 Prefixed Names
The PREFIX keyword associates a prefix label with an IRI. A prefixed name is a prefix label and a local part, separated by a colon
":". A prefixed name is mapped to an IRI by concatenating the IRI
associated with the prefix and the local part. The prefix label or the
local part may be empty. Note that SPARQL local names allow leading
digits while XML local names do not. SPARQL local names also allow the
non-alphanumeric characters allowed in IRIs via backslash character
escapes (e.g. ns:id\=123). SPARQL local names have more syntactic
restrictions than CURIEs.
Given that / is a non-alphanumeric character, the most important part here is
SPARQL local names also allow the non-alphanumeric characters allowed
in IRIs via backslash character escapes (e.g. ns:id\=123).
Long story short, your query should be
PREFIX foo: <http:://www.foo.com/bar/>
SELECT * WHERE {
BIND ( URI(foo:01\/grik\/234) as ?s )
?s a ?o .
}
LIMIT 10
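As a mental model for what the quoted section describes, the expansion of a prefixed name can be sketched as plain concatenation after the backslash escapes are removed. This is a simplified Python illustration, not a SPARQL parser: the function name expand_prefixed_name and the prefixes dictionary are hypothetical, and the sketch does not check which characters the grammar actually allows to be escaped.

```python
import re

def expand_prefixed_name(pname, prefixes):
    """Expand 'label:local' into a full IRI string, dropping backslash escapes."""
    label, sep, local = pname.partition(":")
    if not sep or label not in prefixes:
        raise ValueError("unknown or missing prefix in %r" % pname)
    # An escape such as \/ stands for the character that follows the backslash.
    unescaped = re.sub(r"\\(.)", r"\1", local)
    # The IRI is the prefix IRI followed by the unescaped local part.
    return prefixes[label] + unescaped

# Prefix taken verbatim from the question.
prefixes = {"foo": "http:://www.foo.com/bar/"}
print(expand_prefixed_name(r"foo:01\/grik\/234", prefixes))
```

The escapes exist only so the parser can tell where the local name ends; they are not part of the resulting IRI.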

Erlang: How to create a function that returns a string containing the date in YYMMDD format?

I am trying to learn Erlang and I am working on the practice problems Erlang has on the site. One of them is:
Write the function time:swedish_date() which returns a string containing the date in swedish YYMMDD format:
time:swedish_date()
"080901"
My function:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
string:substr((integer_to_list(YYYY, 3,4)++pad_string(integer_to_list(MM))++pad_string(integer_to_list(DD)).
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
I'm getting the following errors when compiled.
demo.erl:6: syntax error before: '.'
demo.erl:2: function swedish_date/0 undefined
demo.erl:9: Warning: function pad_string/1 is unused
error
How do I fix this?
After fixing your compilation errors, you're still facing runtime errors. Since you're trying to learn Erlang, it's instructive to look at your approach and see if it can be improved, and fix those runtime errors along the way.
First let's look at swedish_date/0:
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
Why convert the tuple to a list? Since you use the list elements individually and never use the list as a whole, the conversion serves no purpose. You can instead just pattern-match the returned tuple:
{YYYY,MM,DD} = date(),
Next, you're calling string:substr/1, which doesn't exist:
string:substr((integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD))).
The string:substr/2,3 functions both take a starting position, and the 3-arity version also takes a length. You don't need either, and can avoid string:substr entirely and instead just return the assembled string:
integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Whoops, this is still not right: there is no such function integer_to_list/3, so just replace that first call with integer_to_list/1:
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Next, let's look at pad_string/1:
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
There's a runtime error here because '0' is an atom and you're attempting to append String, which is a list, to it. The error looks like this:
** exception error: bad argument
in operator ++/2
called as '0' ++ "8"
Instead of just fixing that directly, let's consider what pad_string/1 does: it adds a leading 0 character if the string is a single digit. Instead of using if to check for this condition — if isn't used that often in Erlang code — use pattern matching:
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
The first clause matches a single-element list, and returns a new list with the element D preceded with $0, which is the character constant for the character 0. The second clause matches all other arguments and just returns whatever is passed in.
Here's the full version with all changes:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
{YYYY,MM,DD} = date(),
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
But a simpler approach would be to use the io_lib:format/2 function to just format the desired string directly:
swedish_date() ->
io_lib:format("~w~2..0w~2..0w", tuple_to_list(date())).
First, note that we're back to calling tuple_to_list(date()). This is because the second argument for io_lib:format/2 must be a list. Its first argument is a format string, which in our case says to expect three arguments, formatting each as an Erlang term, and formatting the 2nd and 3rd arguments with a width of 2 and 0-padded.
But there's still one more step to address, because if we run the io_lib:format/2 version we get:
1> demo:swedish_date().
["2015",["0",56],"29"]
Whoa, what's that? It's simply a deep list, where each element of the list is itself a list. To get the format we want, we can flatten that list:
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
Executing this version gives us what we want:
2> demo:swedish_date().
"20150829"
Find the final full version of the code below.
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
UPDATE: @Pascal comments that the year should be printed as 2 digits rather than 4. We can achieve this by passing the date list through a list comprehension:
swedish_date() ->
DateVals = [D rem 100 || D <- tuple_to_list(date())],
lists:flatten(io_lib:format("~2..0w~2..0w~2..0w", DateVals)). % pad the year too: 2008 rem 100 is 8
This applies the rem remainder operator to each of the list elements returned by tuple_to_list(date()). The operation is needless for month and day but I think it's cleaner than extracting the year and processing it individually. The result:
3> demo:swedish_date().
"150829"
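One detail is easy to miss with a two-digit year: the year needs the same zero padding as the month and day, because 2008 rem 100 is 8, not 08. The rule can be checked against fixed dates with a small Python sketch; Python is used here only as an analogy to the Erlang format string, and the three-argument function signature is an assumption made so the result does not depend on today's date.

```python
def swedish_date(year, month, day):
    """Format a date as YYMMDD, zero-padding every field to two digits."""
    # year % 100 plays the role of Erlang's "YYYY rem 100".
    return "%02d%02d%02d" % (year % 100, month, day)

print(swedish_date(2015, 8, 29))
print(swedish_date(2008, 9, 1))
```

The second call reproduces the "080901" example from the exercise text.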
There are a few issues here:
You are missing a parenthesis at the end of line 6.
You are trying to call integer_to_list/3 when Erlang only defines integer_to_list/1,2.
This will work:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
pad_string(String) ->
if
length(String) == 1 -> "0" ++ String; % "0" is a string; '0' would be an atom
true -> String
end.
In addition to the parenthesis error on line 6, you also have an error on line 10, where you use the form '0' instead of "0", so you define an atom rather than a string.
I understand you are doing this for educational purposes, but I encourage you to dig into the Erlang libraries; it is something you will have to do anyway. For a common problem like this, functions already exist that can help you:
swedish_date() ->
{YYYY,MM,DD} = date(), % not useful to transform into list
lists:flatten(io_lib:format("~2.10.0B~2.10.0B~2.10.0B",[YYYY rem 100,MM,DD])).
% ~X.Y.ZB means: format an integer in base Y, in a field X characters wide, padded with the character Z

Correct usage of ~ parameter expansion flag?

According to man zshexpn (5.0.2):
~ Force string arguments to any of the flags below that follow within the parentheses to be treated as
patterns.
For example, using the s flag to perform field splitting requires a string argument:
% print -l ${(s:9:):-"foo893bar923baz"}
foo8
3bar
23baz
My reading of the ~ flag suggests that I should be able to specify a pattern in place of a literal string to split on, so that the following
% print -l ${(~s:<->:):-"foo893bar923baz"}
should produce
foo
bar
baz
Instead, it behaves the same as if I omit the ~, performing no splitting at all.
% print -l ${(s:<->:):-"foo893bar923baz"}
foo893bar923baz
% print -l ${(~s:<->:):-"foo893bar923baz"}
foo893bar923baz
Ok, rereading the question, it's the difference between this:
$ val="foo???bar???baz"
$ print -l ${(s.?.)val}
foo
bar
baz
And this:
$ val="foo???bar???baz"
$ print -l ${(~s.?.)val}
foo???bar???baz
It operates on the variable, i.e. the "argument" to the split (from your documentation quote). In the first example, we substitute literal ?, and in the second, we treat the variable as a glob, and there are no literal ?, so nothing gets substituted.
Still, though, split works on characters and not on globs in the substitution itself. From the documentation:
s:string:
Force field splitting (see the option SH_WORD_SPLIT) at the separator string.
So, it doesn't look like you can split on a pattern. The ~ character modifies the interpretation of the string to be split.
Also, the same parameter expansion documentation you reference continues:
Compare with a ~ outside parentheses, which forces the entire
substituted string to be treated as a pattern. [[ "?" = ${(~j.|.)array} ]]
with the EXTENDED_GLOB option set succeeds if and
only if $array contains the string ‘?’ as an element. The argument may
be repeated to toggle the behaviour; its effect only lasts to the end
of the parenthesised group.
The difference between ${(~j.|.)array} and ${(j.|.)~array} is that the former treats only the flag's string argument (the | separator) as a pattern and leaves the values in array literal, while the latter treats the whole result as a glob.
See also:
${~spec} Turn on the GLOB_SUBST option for the evaluation of spec; if
the ‘~’ is doubled, turn it off. When this option is set, the string
resulting from the expansion will be interpreted as a pattern anywhere
that is possible, such as in filename expansion and filename
generation and pattern-matching contexts like the right hand side of
the ‘=’ and ‘!=’ operators in conditions.
Here is a demo that shows the differences:
$ array=("foo???bar???baz" "foo???bar???buz")
$ [[ "foo___bar___baz" = ${(~j.|.)array} ]] && echo true || echo false
false
$ [[ "foo___bar___baz" = ${(j.|.)~array} ]] && echo true || echo false
true
And for completeness:
$ [[ "foo___bar___baz" = ${(~j.|.)~array} ]] && echo true || echo false
true
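The first two demo results can be modeled roughly as follows, going by the quoted manual text: with ~ inside the parentheses only the | separator acts as a pattern operator and each element is compared literally, while with ~ outside the whole joined result is a pattern. This Python sketch is an analogy rather than zsh; fnmatch stands in for zsh globbing, and both function names are made up for illustration.

```python
import fnmatch

array = ["foo???bar???baz", "foo???bar???buz"]

def matches_tilde_inside(value, elements):
    # Models [[ value = ${(~j.|.)array} ]]: "|" separates the alternatives,
    # but each element is compared as a literal string.
    return value in elements

def matches_tilde_outside(value, elements):
    # Models [[ value = ${(j.|.)~array} ]]: the joined result is a pattern,
    # so "?" inside an element matches any single character.
    return any(fnmatch.fnmatchcase(value, e) for e in elements)

print(matches_tilde_inside("foo___bar___baz", array))   # False
print(matches_tilde_outside("foo___bar___baz", array))  # True
```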

How do I create a function to check if a string only consists of A-Z , 0-9

Is there any way to check in XQuery (with an XQuery function) whether an input string has only the characters A-Z or the digits 0-9 and no other characters?
For example, if the string is ABZ10 the function should return true, and if the input string is 5& 123x it should return false.
I am able to do it in Java simply by using the following:
inputstring.matches("^[0-9A-Z]+$")
Use:
matches($vYourString, '^[A-Z0-9]+$')
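The ^ and $ anchors matter here because XQuery's matches() returns true as soon as any substring matches the pattern. The same logic can be tried out in a short Python sketch, used purely to illustrate the matching behaviour; the function name only_upper_alnum is hypothetical.

```python
import re

def only_upper_alnum(value):
    """True if value consists solely of the characters A-Z and 0-9."""
    # fullmatch plays the role of the ^ and $ anchors in the XQuery pattern.
    return re.fullmatch(r"[A-Z0-9]+", value) is not None

print(only_upper_alnum("ABZ10"))
print(only_upper_alnum("5& 123x"))
# Without anchoring, a substring match is enough, so this check would pass:
print(re.search(r"[A-Z0-9]+", "5& 123x") is not None)
```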

Regex: Split X length words

I'm new to regular expressions. I have a gigantic text. In the application, I need to keep the words of 4 characters and delete the rest. The text is in Spanish. So far, I can select 4-character words, but I still need to delete the rest.
This is my regular expression:
\s(\w{3,3}[a-zA-ZáéíóúäëïöüñÑ])\s
How can I get all words with 4 letters in ASP.NET VB?
/(?:\A|(?<=\P{L}))(\p{L}{4})(?:(?=\P{L})|\z)/g
Explanation:
The /g switch makes the search repeat (global matching)
\A is start of the string (not start of line)
\p{L} matches a single code point in the category letter
\P{L} matches a single code point not in the category letter
{n} specify a specific amount of repetition [n is number]
\z is end of string (not end of line)
| is logic OR operator
(?<=) is lookbehind
(?=) is lookahead
(?:) is a non-capturing group (it creates no backreference)
() is a capturing group (it creates a backreference)
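The idea behind the pattern explained above (exactly four letters, with no letter on either side) can be tried out in a short Python sketch. Two substitutions are assumptions of the sketch, not part of the answer: Python's built-in re module has no \p{L}, so [^\W\d_] stands in for "any letter", and negative lookarounds replace the \A and \z alternations. The function name four_letter_words is made up for illustration.

```python
import re
import unicodedata

# Assumption: [^\W\d_] (a word character that is neither a digit nor an
# underscore) approximates \p{L}, which Python's re module does not support.
LETTER = r"[^\W\d_]"

# Four letters with no letter immediately before or after them.
FOUR_LETTER_WORD = re.compile(rf"(?<!{LETTER}){LETTER}{{4}}(?!{LETTER})")

def four_letter_words(text):
    """Return every word in text that is exactly four letters long."""
    # Normalize first so that accented letters are single code points.
    return FOUR_LETTER_WORD.findall(unicodedata.normalize("NFC", text))

print(four_letter_words("La casa está muy lejos de aquí, niño"))
# prints ['casa', 'está', 'aquí', 'niño']
```

Collecting the matches and joining them with spaces gives the "keep only the 4-letter words" result without needing a replace.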
Using the character class provided above in another answer (\w does NOT match spanish word characters unfortunately).
You can use this for a match (it matches the reverse, basically matches everything that is NOT a 4-character word, so you can replace with " ", leaving only the 4-character words):
/(^|(?<=(?<=\W)[a-zA-ZáéíóúäëïöüñÑ]{4,4}(?=\W)))(.*?)((?=(?<=\W)[a-zA-ZáéíóúäëïöüñÑ]{4,4}(?=\W))|$)/gis
Approximated code in VB (not tested):
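' Requires: Imports System.Text.RegularExpressions
Dim input As String = "This is your text"
' .NET patterns take no /.../ delimiters: Replace is already global, and the i and s switches become RegexOptions
Dim pattern As String = "(^|(?<=(?<=\W)[a-zA-ZáéíóúäëïöüñÑ]{4,4}(?=\W)))(.*?)((?=(?<=\W)[a-zA-ZáéíóúäëïöüñÑ]{4,4}(?=\W))|$)"
Dim replacement As String = " "
Dim rgx As New Regex(pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
Dim result As String = rgx.Replace(input, replacement)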
Console.WriteLine("Original String: {0}", input)
Console.WriteLine("Replacement String: {0}", result)
You can see the result of the regex in action here:
http://regexr.com?30n29
/[^a-zA-ZáéíóúäëïöüñÑ][a-zA-ZáéíóúäëïöüñÑ]{4}[^a-zA-ZáéíóúäëïöüñÑ]/g
Translated:
A non-letter, followed by 4 letters, followed by a non-letter. The 'g' flag makes the pattern match globally, that is, more than once.
Check out this link to find out more info on looping over your matches:
http://osherove.com/blog/2003/5/12/practical-parsing-using-groups-in-regular-expressions.html

Resources