Interesting reverse of coding interview question - decode

This is a popular interview question:
Given an encoded string, return its decoded string.
The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times. Note that k is guaranteed to be a positive integer.
Input: s = "3[a]2[bc]"
Output: "aaabcbc"
This can be solved with careful use of a stack.
I was thinking about the reverse: given a decoded string, can you find the optimally encoded string?
For example:
Input: s = "aaabcbc"
Output: "3[a]2[bc]"
I have a feeling that this may be much harder than the original problem, because you will have to keep track of each running substring.
Or is there an easy way of doing it?

You could get creative with regular expressions and delegate the heavy lifting to the regular expression engine.
Here's a JavaScript example:
const d = "aaabcbc";
const e = d.replaceAll(/(.+)\1+/g, (a, b) => `${a.length / b.length}[${b}]`);
console.log(e);
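One way to sanity-check this approach is to round-trip it through a decoder. The sketch below wraps the regex from the answer in a function called encode and adds a stack-based decode; both function names are assumptions made for illustration, and it uses replace with the g flag, which behaves the same as replaceAll here. It also suggests that the regex is greedy rather than optimal: it never nests, and it settles for the longest repeating unit it finds first.

```javascript
// Regex-based encoder from the answer above, wrapped in a hypothetical function.
function encode(s) {
  return s.replace(/(.+)\1+/g, (match, unit) => `${match.length / unit.length}[${unit}]`);
}

// Hypothetical stack-based decoder for k[...] strings (the original interview question).
// Assumes the decoded text itself contains no digits or square brackets.
function decode(s) {
  const stack = [];
  let current = "";
  let count = 0;
  for (const ch of s) {
    if (/[0-9]/.test(ch)) {
      count = count * 10 + Number(ch);      // build up a multi-digit repeat count
    } else if (ch === "[") {
      stack.push([current, count]);         // remember the prefix and its repeat count
      current = "";
      count = 0;
    } else if (ch === "]") {
      const [prefix, k] = stack.pop();
      current = prefix + current.repeat(k); // expand the group that just closed
    } else {
      current += ch;
    }
  }
  return current;
}

console.log(encode("aaabcbc"));         // 3[a]2[bc]
console.log(decode(encode("aaabcbc"))); // aaabcbc
console.log(encode("aaaa"));            // 2[aa], although 4[a] is shorter
```

Single-character runs such as "aa" become "2[a]", which is longer than the input, so a real encoder would probably also need a length check before replacing.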

Is it the same to use MATCHES (* + "" + *) and no parameters in a FOR EACH in Progress 4GL?

I wrote the following FOR EACH:
FOR EACH insp_cd
WHERE insp_cd.status_ = 1
AND insp_cd.item MATCHES('*' + pc-itemPost + '*')
AND insp_cd.update_at < NOW:
So, when pc-itemPost is "", should I avoid using MATCHES? Like this:
IF pc-itemPost = "" THEN DO:
FOR EACH insp_cd
WHERE insp_cd.status_ = 1
AND insp_cd.update_at < NOW:
...
END.
ELSE DO:
FOR EACH insp_cd
WHERE insp_cd.status_ = 1
AND insp_cd.item MATCHES('*' + pc-itemPost + '*')
AND insp_cd.update_at < NOW:
I know it's very slow because of the table scan, but I'd like to know if there is any difference. Thanks.

Any time that you can avoid MATCHES you should do so.
Using an IF statement to choose branches that execute different static FOR EACH statements is one way to do it. Building dynamic queries based on similar logic would be another approach.
Are your two queries "different"? Sure, they are. They have different WHERE clauses, so their specific behavior (and performance) will depend on the index structure (which we don't know).
insp_cd.item MATCHES "*" + pc-itemPost + "*"
can be very different from:
insp_cd.item = ""
And logically it is not the same as omitting a check of insp_cd.item altogether. Logically, maybe you're attempting to exclude empty values? I'm not sure what the requirement is here.
If insp_cd.item is the first component of an index, or the second component after insp_cd.status_, then a variation of this query using = "" will be much more efficient than one using MATCHES.
Back to avoiding MATCHES, at a high level:
If there is no need for wild cards, use "=". Equality matches are always preferred.
If the wild card is at the end of the string, use BEGINS.
If the wild card is being used to signify a known list, use a series of OR clauses or a LOOKUP(), or build a temp-table to join in the query.
There are probably more ways to avoid MATCHES but these are the ones that spring to mind.

Unable to extract (US) Zipcode from doc file

I need to get the ZIP code from the Resume.doc file, but I have not succeeded.
It works with a static string, I mean it validates the static string, but it is unable to parse the ZIP code from the doc file.
I am sharing my code:
protected void zipcodeGetter()
{
    var path = "C:\\Users\\Jatinder\\Desktop\\LUCENE\\Resume\\Jeffrey.doc";
    Document doc = new Document();
    string html = File.ReadAllText(path);
    using (StreamReader sr = new StreamReader(path, System.Text.Encoding.Default))
    {
        html = sr.ReadToEnd();
    }
    const string MatchPhondePattern = @"^\d{5}(?:[-\s]\d{4})?$";
    Regex rx = new Regex(MatchPhondePattern, RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);
    MatchCollection matches = rx.Matches(html);
    // Report the number of matches found.
    int noOfMatches = matches.Count;
    //Do something with the matches
    foreach (Match match in matches)
    {
        //Do something with the matches
        string tempPhoneNumber = match.Value.ToString();
    }
}
Can anyone help me with this?

Your code just won't work with that regular expression.
This problem is complicated and your best option is to use a service from a company that does this. They will have a robust system.
Here is a quote from an article on regex and addresses:
We get a lot of questions from programmers about parsing addresses. We see a lot of people trying to use regular expressions for street addresses, and as the address user experience experts, we cringe whenever another programmer falls prey to this trap. We hope that this information will save you some trouble, and if your searching is in vain, please feel free to ask us any questions you have about addresses. ...
Should you use regular expressions to parse street addresses? The short answer is, "Probably not." Because of the wide variance in address content and formatting, addresses aren't "regular"—an indispensable factor in using regular expressions to process information.
Now, some notes and hints about your regular expressions.
I used RegExr to make an example of the regular expression you used. As you can see, there are no highlighted regions, meaning your regular expression won't work.
If you just want to match five consecutive digits, the regular expression is: [0-9]{5}. Here is an example.
You can't just use ^ and $ because they anchor the match to the beginning and end of the input (or of each line, if RegexOptions.Multiline is set), and a zip code inside a document will usually have other text, a space, or a period before or after it.
The problem with not having any other qualifiers, however, is you will match long numbers, too. In other words, with a string like 1234567890, you will match [0-9]{5} because there are five consecutive digits in that string.
It's hard to qualify the regular expression with possible punctuation or spaces before or after the match, because what if the match is at the beginning or end of the line? It will miss some.
Here is a regex that might be useful to you. It seems to work in a lot of cases. You can see the example here, with more explanation.
(?<=\W|^)\d{5}(-?\d{4})?(?=\W|$)
(Full disclosure: I work for SmartyStreets and we have an API that does this. Check out the API docs if you're interested.)
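To make the behavior of the lookaround pattern above concrete, here is a minimal JavaScript sketch that applies it to free text. The function name findZipCodes and the sample text are assumptions made for illustration, and it relies on a JavaScript engine that supports lookbehind. As the answer warns, this only finds candidates that look like ZIP codes; it does not validate them.

```javascript
// Hypothetical helper: returns every ZIP or ZIP+4 candidate found in free text.
function findZipCodes(text) {
  // Same pattern as in the answer: five digits, an optional four-digit
  // extension, and a non-word character (or the text boundary) on both sides.
  const zipPattern = /(?<=\W|^)\d{5}(-?\d{4})?(?=\W|$)/g;
  return text.match(zipPattern) || [];
}

// Made-up sample text, not taken from the original resume.
const sample = "12 Main St, Springfield, IL 62704-1234. Phone 1234567890, old zip 90210";
console.log(findZipCodes(sample)); // [ '62704-1234', '90210' ]
```

The ten-digit phone number is skipped because neither the five-digit nor the nine-digit alternative is followed by a non-word character.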

MarkLogic collate sequence in XQuery

Is there a way to modify the elements of a sequence so only collated versions of the items are returned?
let $currencies := ('dollar', 'Dollar', 'dollar ')
return fn:collated-only($currencies, "http://marklogic.com/collation/en/S1/T00BB/AS")
=> ('dollar', 'dollar', 'dollar')

The values that are stored in the range index (that feeds the facets) are literally the first value that was encountered that compared equal to the others. (Because, the collation says you don't care...)
You can get a long way by calling
fn:replace(fn:lower-case(xdmp:diacritic-less(fn:normalize-unicode($str,"NFKC"))),"\p{P}","")
This won't be exactly the same in that it overfolds some things and underfolds others, but it may be good for your purposes.

Is this the expected output? There is no fn:collated-only function, so I'm assuming you're asking how to write such a function or whether there is such a function.
The thing is, there isn't a mapping from one string to another in collation comparisons, there is only a comparison algorithm (the Unicode Collation Algorithm) so there really is no canonical kind of string to return to you, and therefore no API to do so.
Stepping back, what is the problem you are actually trying to solve? By the rules of that collation, "dollar" and "Dollar" are equivalent, and by using it you declare you don't care which form you use, so you could use either one.

If these values are in XML elements and you have a range index using http://marklogic.com/collation/en/S1/T00BB/AS, you can do something like this:
let $ref := cts:element-reference(xs:QName("currency"), "collation=http://marklogic.com/collation/en/S1/T00BB/AS")
for $curr in cts:values($ref, (), "frequency-order")
return $curr || ": " || cts:frequency($curr)
This will produce results like:
"dollar: 15",
"euro: 12"
... and so on. The collation will disregard the differences among your sample inputs. These results could be formatted however you want. Is that what you're looking to do?

Regular expression for date with starting and ending date

I am using a regular expression for dates in the format "MM/DD/YYYY", like this:
"^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$"
It works fine. Now I want to limit the year to between 1950 and 2050. How can I do this? Can anyone help me?

So the answer depends on how you want to accomplish the task.
Your current Regex search pattern is going to match on most dates in the format "MM/DD/YYYY" in the 20th and 21st century. So one approach is to loop through the resulting matches, which are represented as string values at this point, and parse each string into a DateTime. Then you can do some range validation checking.
(Note: I removed the beginning ^ and ending $ from your original to make my example work)
string input = "This is one date 07/04/1776 and this is another 12/07/1941. Today is 08/10/2019.";
string pattern = "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\\d\\d";
List<DateTime> list = new List<DateTime>();
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine(match.Value);
DateTime result;
if (DateTime.TryParse(match.Value, out result))
{
if (result.Year >= 1950 && result.Year <= 2050)
{
list.Add(result);
}
}
}
Console.WriteLine("Number of valid dates: {0}", list.Count);
This code outputs the following. Note that 1776 is not matched; the other two dates are matched, but only the last one is added to the list.
12/07/1941
08/10/2019
Number of valid dates: 1
Although this approach has some drawbacks, such as looping over the results a second time to try and do the range validation, there are some advantages as well.
The built-in DateTime methods in the framework are easier to deal with than constantly adjusting the Regex search pattern as your acceptable range moves over time.
By range checking afterward, you could also simplify your Regex search pattern to be more inclusive, perhaps even getting all dates.
A simpler Regex search pattern is easier to maintain, and also makes the intent of the code clear. Regex can be confusing and tricky to decipher, especially for less experienced coders.
Complex Regex search patterns can introduce subtle bugs. Make sure you have good unit tests wrapped around your code.
Of course, the other approach is to adjust the Regex search pattern so that you don't have to parse and check afterwards. In most cases this is going to be the best option. The search pattern then does not return any values that are outside the range, so you don't have to loop or do any additional checking at that point. Just remember those unit tests!
As @skywalker pointed out in his answer, this pattern should work for you.
string pattern = "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19[5-9][0-9]|20[0-4][0-9]|2050)";

Years from 1950 to 2050, both inclusive, can be matched using 19[5-9][0-9]|20[0-4][0-9]|2050
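A quick boundary check of that year alternation, combined with the month and day parts from the question and anchored with ^ and $ as in the original pattern. This is only a sketch in JavaScript; the function name isDateInRange is an assumption made for illustration.

```javascript
// Hypothetical helper: true when the string is MM/DD/YYYY with a year in 1950-2050.
function isDateInRange(s) {
  const pattern = /^(0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.](19[5-9][0-9]|20[0-4][0-9]|2050)$/;
  return pattern.test(s);
}

// Check the years just outside and just inside each end of the range.
for (const sample of ["12/31/1949", "01/01/1950", "08/10/2019", "12/31/2050", "01/01/2051"]) {
  console.log(sample, isDateInRange(sample));
}
```

Like the original pattern, this checks only the shape of the date, so an impossible date such as 02/31/2000 still passes.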

Printing the original string as well as vowels in the string with recursion

I'm trying to write a recursive function that takes a string and the length of that string as its parameters, and then prints out the original string followed by the vowels in that string in reverse order. For example, if the string is 'Horse', then the output would be 'Horse eo'.
What I'm having trouble with is how to get the original string printed while still getting the vowels out in reverse order. I'm writing this function in pseudocode, and this is how I'd print out only the reversed vowels:
MODULE VowelRecursion(String, n)
IF n != 0 THEN
letter := first letter of String
vowel := ""
IF letter is a vowel THEN
vowel := letter
ENDIF
VowelRecursion(remainder of String, n-1)
Print(vowel)
ENDIF
ENDMODULE
Like I mentioned, the problem I have is that I can't figure out how to get the original string printed after the vowel finding has been done, as the original string needs to be printed first, and to do that wouldn't it have to be returned first after n gets to 0? The problem with that is that since we're calling the function with remainder of string, that would be just an empty string when n == 0, right?
As this is a problem I need to solve for school, I'm not looking for any ready made solutions, but I'd like to hear where my thought process is going wrong and what sort of methods I could use to achieve what's needed.
Thank you.

You may print letter before descending into the next recursion level, i.e. right before the call to VowelRecursion(remainder of String, n-1).
Print(letter)
VowelRecursion(remainder of String, n-1)
Print(vowel)
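For readers who want to trace the order of the prints, here is a minimal JavaScript sketch of the idea above. Two things in it are assumptions that are not in the original pseudocode: the output is collected in an array instead of being printed character by character, and a space is emitted in the base case to separate the original string from the vowels.

```javascript
// Emit each letter on the way down the recursion and each vowel on the
// way back up, so the vowels come out in reverse order.
function vowelRecursion(str, n, out = []) {
  if (n === 0) {
    out.push(" ");                 // assumed separator, added at the deepest level
    return out;
  }
  const letter = str[0];
  out.push(letter);                // Print(letter) before recursing
  vowelRecursion(str.slice(1), n - 1, out);
  if ("aeiouAEIOU".includes(letter)) {
    out.push(letter);              // Print(vowel) after the recursive call returns
  }
  return out;
}

console.log(vowelRecursion("Horse", 5).join("")); // Horse eo
```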

You can pass the original string along during the recursion. You don't modify that string but you simply use it when the recursion is done. Also, you can't print a vowel when you find it. You'll need to store them somewhere and only print them when you're done.
This means you should add two more parameters: one that contains the original string and one (computed) string with the vowels found so far (initially empty). As a hint, you can solve this problem with a recursive function that is called as VowelRecursion("Horse", "Horse", "", 5). When n = 0, you'll have all the values you need to print the desired result.
