Tokenizing a letter as an operator - ply

I need to make a language that has variables in it, but it also needs the letter 'd' to be an operator that has a number on the right and possibly a number on the left. I thought that having the lexer check for the letter first would give it precedence, but that doesn't happen and I don't know why.
from ply import lex, yacc
tokens = ['INT', 'D', 'PLUS', 'MINUS', 'LPAR', 'RPAR', 'BIGGEST', 'SMALLEST', 'EQ', 'NAME']
t_PLUS = r'\+'
t_MINUS = r'\-'
t_LPAR = r'\('
t_RPAR = r'\)'
t_BIGGEST = r'\!'
t_SMALLEST = r'\#'
t_D = r'[dD]'
t_EQ = r'\='
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
def t_INT(t):
    r'[0-9]\d*'
    t.value = int(t.value)
    return t
def t_newline(t):
    r'\n+'
    t.lexer.lineno += 1
t_ignore = ' \t'
def t_error(t):
    print("Not recognized by the lexer:", t.value)
    t.lexer.skip(1)
lexer = lex.lex()
while True:
    try: s = input(">> ")
    except EOFError: break
    lexer.input(s)
    while True:
        t = lexer.token()
        if not t: break
        print(t)
If I write:
3d4
it outputs:
LexToken(INT,3,1,0)
LexToken(NAME,'d4',1,1)
and I don't know how to work around it.

Ply does not prioritize tokens defined as string variables by order of appearance; rather, it sorts them by decreasing regular-expression length (longest first). So your t_NAME pattern is tried before t_D. (Tokens defined as functions, such as t_INT, are added before the string tokens, in definition order.) This is explained in the Ply manual, along with a concrete example of how to handle reserved words (which may not apply in your case).
If I understand correctly, the letter d cannot be an identifier, and neither can d followed by a number. It is not entirely clear to me whether you expect d2e to be a plausible identifier, but for simplicity I'm assuming that the answer is "No", in which case you can easily restrict the t_NAME regular expression by requiring an initial d to be followed by another letter:
t_NAME = '([a-ce-zA-CE-Z_]|[dD][a-zA-Z_])[a-zA-Z0-9_]*'
If you wanted to allow d2e to be a name, then you could go with:
t_NAME = '([a-ce-zA-CE-Z_]|[dD][0-9]*[a-zA-Z_])[a-zA-Z0-9_]*'
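The effect of the first restricted pattern can be checked without Ply. The following is only a sketch using the standard `re` module: `TOKEN_SPEC`, `MASTER`, and `tokenize` are hypothetical names, the NAME pattern is the one above rewritten with a non-capturing group, and NAME is deliberately listed ahead of D to imitate Ply trying the longer regular expression first.

```python
import re

# Hypothetical stand-in for the lexer. Patterns are tried in this order,
# imitating Ply placing the longer NAME regex ahead of the shorter D regex.
TOKEN_SPEC = [
    ("INT", r"[0-9]+"),
    ("NAME", r"(?:[a-ce-zA-CE-Z_]|[dD][a-zA-Z_])[a-zA-Z0-9_]*"),
    ("D", r"[dD]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (type, value) pairs, skipping spaces and tabs."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in " \t":
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"Not recognized by the lexer: {text[pos:]!r}")
        kind = match.lastgroup
        value = int(match.group()) if kind == "INT" else match.group()
        tokens.append((kind, value))
        pos = match.end()
    return tokens


print(tokenize("3d4"))   # [('INT', 3), ('D', 'd'), ('INT', 4)]
print(tokenize("dice"))  # [('NAME', 'dice')]
```

Because NAME can no longer start with a d that is followed by a digit, the lone d falls through to the D rule even though NAME is tried first.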

How do I create a function that would take a textfile, two logical operators for comment and blank lines?

I need to write a function named read_text_file.
It takes an argument textFilePath, which is a single character string, and two optional parameters, withBlanks and withComments, which are both single logicals;
textFilePath is the path to the text file (or R script);
if withBlanks and withComments are set to FALSE, then read_text_file() returns the text file without blank lines (i.e. lines that contain nothing or only whitespace) and without comment lines (i.e. lines that start with "#"), respectively;
it outputs a character vector of length n, where each element corresponds to its respective line of text/code.
I came up with the function below:
read_text_file <- function(textFilePath, withBlanks = TRUE, withComments = TRUE){
  # check that `textFilePath`: character(1)
  if(!is.character(textFilePath) | length(textFilePath) != 1){
    stop("`textFilePath` must be a character of length 1.")
  }
  if(withComments==FALSE){
    return(grep('^$', readLines(textFilePath),invert = TRUE, value = TRUE))
  }
  if(withBlanks==FALSE){
    return(grep('^#', readLines(textFilePath),invert = TRUE, value = TRUE))
  }
  return(readLines(textFilePath))
}
The second if-statement will always be executed leaving the third if-statement unexecuted.
I'd recommend processing an imported object instead of returning it immediately:
read_text_file <- function(textFilePath, withBlanks = TRUE, withComments = TRUE){
  # check that `textFilePath`: character(1)
  if(!is.character(textFilePath) | length(textFilePath) != 1){
    stop("`textFilePath` must be a character of length 1.")
  }
  result = readLines(textFilePath)
  if(!withComments){
    result = grep('^\\s*#\\s*', result, invert = TRUE, value = TRUE)
  }
  if(!withBlanks){
    result = grep('^\\s*$', result, invert = TRUE, value = TRUE)
  }
  result
}
The big change is defining the result object that we modify as needed and then return at the end. This is good because (a) it is more concise, not repeating the readLines command multiple times, and (b) it lets you easily do 0, 1, or more data cleaning steps on result before returning it.
I also made some minor changes:
I don't use return() - it is only needed if you are returning something before the end of the function code, which with these modifications is not necessary.
You had your "comment" and "blank" regex patterns switched, I corrected that.
I changed == FALSE to !, which is a little safer and good practice. You could use isFALSE() if you want more readability.
I added \\s* into your regex patterns in a couple places which will match any amount of whitespace (including none)
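For comparison, the same read-once-then-filter structure can be sketched in Python. This is a loose translation rather than part of the R answer: the snake_case names are assumptions, and the type check is simplified to a plain string test.

```python
import re
from pathlib import Path


def read_text_file(text_file_path, with_blanks=True, with_comments=True):
    """Return the file's lines as a list, optionally dropping comment and blank lines."""
    if not isinstance(text_file_path, str):
        raise TypeError("`text_file_path` must be a single string.")
    # Read once, then apply zero, one, or both filters to the same object.
    result = Path(text_file_path).read_text().splitlines()
    if not with_comments:
        # Drop lines whose first non-whitespace character is '#'.
        result = [line for line in result if not re.match(r"\s*#", line)]
    if not with_blanks:
        # Drop lines that are empty or contain only whitespace.
        result = [line for line in result if not re.fullmatch(r"\s*", line)]
    return result
```

Each filter works on whatever the previous step left in result, so setting both flags to False removes comments and blanks in a single call.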

iterating 2D array in Elixir

I am new to Elixir language and I am having some issues while writing a piece of code.
What I am given is a 2D array like
list1 = [
  [1, 2, 3, 4, "nil"],
  [6, 7, 8, 9, 10],
  [11, "nil", 13, "nil", 15],
  [16, 17, "nil", 19, 20]
]
Now, what I've to do is to get all the elements that have values between 10 and 20, so what I'm doing is:
final_list = []
Enum.each(list1, fn row ->
  Enum.each(row, &(if (&1 >= 10 and &1 <= 99) do final_list = final_list ++ &1 end))
end)
Doing this, I'm expecting that I'll get my list of numbers in final_list but I'm getting blank final list with a warning like:
warning: variable "final_list" is unused (there is a variable with the same name in the context, use the pin operator (^) to match on it or prefix this variable with underscore if it is not meant to be used)
iex:5
:ok
and upon printing final_list, it is not updated.
When I try to check whether my code is working properly or not, using IO.puts as:
iex(5)> Enum.each(list1, fn row ->
...(5)>   Enum.each(row, &(if (&1 >= 10 and &1 <= 99) do IO.puts(final_list ++ &1) end))
...(5)> end
...(5)> )
The Output is:
10
11
13
15
16
17
19
20
:ok
What could I possibly be doing wrong here? Shouldn't it add the elements to the final_list?
If this is wrong (it probably is), what would be a possible solution?
Any kind of help will be appreciated.
As mentioned in Adam's comments, this is a FAQ, and the important thing is the message "warning: variable "final_list" is unused (there is a variable with the same name in the context, use the pin operator (^) to match on it or prefix this variable with underscore if it is not meant to be used)". This message actually indicates a very serious problem.
It tells you that the assignment "final_list = final_list ++ &1" is useless, since it just creates a local variable that hides the outer one. Elixir variables are not mutable, so you need to seriously reorganize your code.
The simplest way is
final_list =
  for sublist <- list1,
      n <- sublist,
      is_number(n),
      n in 10..20,
      do: n
Note that every time you write final_list = ..., you actually declare a new variable with the same name, so the final_list you declared inside your anonymous function is not the final_list outside the anonymous function.
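For readers more familiar with Python, the comprehension above corresponds roughly to the following list comprehension. This is only an analogy: the result is built as a new value in one expression instead of being appended to from inside a callback, and `isinstance(n, int)` stands in for `is_number/1`.

```python
list1 = [
    [1, 2, 3, 4, "nil"],
    [6, 7, 8, 9, 10],
    [11, "nil", 13, "nil", 15],
    [16, 17, "nil", 19, 20],
]

# Two generators (rows, then elements) followed by the filters.
final_list = [
    n
    for sublist in list1
    for n in sublist
    if isinstance(n, int) and 10 <= n <= 20
]

print(final_list)  # [10, 11, 13, 15, 16, 17, 19, 20]
```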

Searching in database with scrambled words in SQLite

I am wondering if it's possible to search the database given scrambled words.
I have a mobs table in the database that holds the names of the monsters.
If the given monster name is A Golden Dregon, A Golden Dfigon, or A Gelden Dragon, I want it to find A Golden Dragon, or the closest matches to it, in the database. Usually one or two letters at most are scrambled like this.
Is that possible with just SQL queries? Or should I build the query by parsing the given monster name?
I am using Lua for the code side.
I have come to know this type of search as a fuzzy search. I mainly program in JS and use fuse.js all the time for this kind of problem.
Fuzzy searches are often based on the Levenshtein algorithm, which rates the distance between two strings. Once you have this distance value, you can sort or drop elements from a list based on the score.
I found the algorithm in Lua here.
function levenshtein(s, t)
  local s, t = tostring(s), tostring(t)
  if type(s) == 'string' and type(t) == 'string' then
    local m, n, d = #s, #t, {}
    for i = 0, m do d[i] = { [0] = i } end
    for j = 1, n do d[0][j] = j end
    for i = 1, m do
      for j = 1, n do
        local cost = s:sub(i,i) == t:sub(j,j) and 0 or 1
        d[i][j] = math.min(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
      end
    end
    return d[m][n]
  end
end
As explained on the site, you compare two strings like so and get a score based on the distance between them, then sort or drop the items being searched based on the scores. As this is CPU-expensive, I would suggest caching or using a memoize function to store common mistakes.
levenshtein('referrer', 'referrer') -- zero distance
>>> 0
levenshtein('referrer', 'referer') -- distance of one character
>>> 1
levenshtein('random', 'strings') -- random big distance
>>> 6
I got a simple version of it working in Lua here. I must say Lua is an easy language to pick up and start coding with.
local monsters = {'A Golden Dragon', 'Goblins', 'Bunny', 'Dragoon'}
function levenshtein(s, t)
  local s, t = tostring(s), tostring(t)
  if type(s) == 'string' and type(t) == 'string' then
    local m, n, d = #s, #t, {}
    for i = 0, m do d[i] = { [0] = i } end
    for j = 1, n do d[0][j] = j end
    for i = 1, m do
      for j = 1, n do
        local cost = s:sub(i,i) == t:sub(j,j) and 0 or 1
        d[i][j] = math.min(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
      end
    end
    return d[m][n]
  end
end
-- Fuzzy search: returns the best match in a list
function fuzzySearch(list, searchText)
  local bestMatch = nil
  local lowestScore = nil
  for i = 1, #list do
    local score = levenshtein(list[i], searchText)
    if lowestScore == nil or score < lowestScore then
      bestMatch = list[i]
      lowestScore = score
    end
  end
  return bestMatch
end
print ( fuzzySearch(monsters, 'golen dragggon') )
print ( fuzzySearch(monsters, 'A Golden Dfigon') )
print ( fuzzySearch(monsters, 'A Gelden Dragon') )
print ( fuzzySearch(monsters, 'Dragooon') ) --should be Dragoon
print ( fuzzySearch(monsters, 'Funny') ) --should be Bunny
print ( fuzzySearch(monsters, 'Gob') ) --should be Goblins
Output
A Golden Dragon
A Golden Dragon
A Golden Dragon
Dragoon
Bunny
Goblins
For SQL
You can try to implement this same algorithm in T-SQL, as discussed here.
In SQLite there is an extension function called editdist3 that also computes an edit distance; the docs are here.
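Another way to keep the ranking inside the query is to register the distance function with SQLite from the host language. The sketch below uses Python's standard `sqlite3` module purely for illustration; whether the Lua SQLite binding in use offers an equivalent way to register functions is not covered here. The in-memory table, its `name` column, and the `closest_mob` helper are assumptions.

```python
import sqlite3


def levenshtein(s, t):
    """Edit distance between two strings, keeping only two rows of the table."""
    previous = list(range(len(t) + 1))
    for i, s_char in enumerate(s, start=1):
        current = [i]
        for j, t_char in enumerate(t, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_mob(conn, search_text):
    """Return the stored name with the smallest distance to search_text."""
    row = conn.execute(
        "SELECT name FROM mobs ORDER BY levenshtein(name, ?) LIMIT 1",
        (search_text,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical in-memory table standing in for the real mobs table.
conn = sqlite3.connect(":memory:")
conn.create_function("levenshtein", 2, levenshtein)
conn.execute("CREATE TABLE mobs (name TEXT)")
conn.executemany(
    "INSERT INTO mobs (name) VALUES (?)",
    [("A Golden Dragon",), ("Goblins",), ("Bunny",), ("Dragoon",)],
)

print(closest_mob(conn, "A Golden Dregon"))  # A Golden Dragon
```

The function is evaluated for every row, so this scales linearly with the table size; for a large table it may be worth narrowing the candidates first.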
It would be hard to compensate for all the different one- and two-letter scrambled combinations, but you could create a Lua table of common misspellings of "A Golden Dragon" and check whether the given name is in the table. I have never used Lua before, but here is my best try at some sample code:
local mob_name = "A Golden Dregon" -- you could do something like input("Enter mob name:")
local scrambled_dragon_names = {"A Golden Dregon", "A Golden Dfigon", "A Gelden Dragon"}
for _, v in pairs(scrambled_dragon_names) do
  if v == mob_name then
    mob_name = "A Golden Dragon"
    break
  end
end
I really hope I have helped!
P.S. If you have any more questions, go ahead and comment and I will try to answer ASAP.
You will have to parse the given monster name to some extent, by making assumptions about how badly it is misspelled. For example, if the user supplied the name
b fulden gorgon
there is no way in hell you can get to "A Golden Dragon". However, if you assume that the user will always get the first and last letters of every word correct, then you could parse the words in the given name to get the first and last letters of each word, which would give you
"A", "G" "n", "D" "n"
Then you could use the LIKE operator in your query, like so:
SELECT * FROM mobs WHERE monster_name LIKE 'A G%n D%n';
The main point here is what assumptions you make about the misspelling. The closer you can narrow it down, the better your query results will be.
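Building that pattern from the user's input might look like the sketch below. It is written in Python with the standard `sqlite3` module only for illustration; `like_pattern` is a hypothetical helper, the in-memory table is a stand-in, and input containing a literal % or _ is not escaped here.

```python
import sqlite3


def like_pattern(misspelled_name):
    """Keep each word's first and last letter and replace the middle with %."""
    parts = []
    for word in misspelled_name.split():
        if len(word) <= 2:
            parts.append(word)  # nothing in the middle to wildcard
        else:
            parts.append(f"{word[0]}%{word[-1]}")
    return " ".join(parts)


# Hypothetical in-memory table standing in for the real mobs table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mobs (monster_name TEXT)")
conn.executemany(
    "INSERT INTO mobs (monster_name) VALUES (?)",
    [("A Golden Dragon",), ("Goblins",)],
)

pattern = like_pattern("A Gelden Dregon")
rows = conn.execute(
    "SELECT monster_name FROM mobs WHERE monster_name LIKE ?", (pattern,)
).fetchall()

print(pattern)  # A G%n D%n
print(rows)     # [('A Golden Dragon',)]
```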

nginx string.match non posix

I have a string (str1) and I want to extract everything after the pattern "mycode=":
local str1 = "ServerName/codebase/?mycode=ABC123";
local tmp1 = string.match(str1, "mycode=%w+");
local tmp2 = string.gsub(tmp1,"mycode=", "");
From the logs,
tmp1 => mycode=ABC123
tmp2 => ABC123
Is there a better/more efficient way to do this? I believe Lua string patterns do not follow the POSIX standard (due to the size of the code base).
Yes, use a capture in your pattern to control what you get back from string.match.
From the Lua reference manual (emphasis mine):
Looks for the first match of pattern in the string s. If it finds one, then match returns the captures from the pattern; otherwise it returns nil. If pattern specifies no captures, then the whole match is returned. A third, optional numerical argument init specifies where to start the search; its default value is 1 and can be negative.
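It works like this (at the interactive prompt each line is a separate chunk, so the variables are left global to stay visible on the following lines):
> str1 = "ServerName/codebase/?mycode=ABC123"
> tmp1 = string.match(str1, "mycode=%w+")
> print(tmp1)
mycode=ABC123
> tmp2 = string.match(str1, "mycode=(%w+)")
> print(tmp2)
ABC123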
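The same distinction between the whole match and a capture exists in most pattern libraries. As a point of comparison only, here is the equivalent with Python's `re` module; note that `\w` also matches an underscore, which Lua's `%w` does not.

```python
import re

str1 = "ServerName/codebase/?mycode=ABC123"

# Without a group, only the whole match is available.
whole = re.search(r"mycode=\w+", str1)
# With a group, the part inside the parentheses can be read on its own.
captured = re.search(r"mycode=(\w+)", str1)

print(whole.group(0))     # mycode=ABC123
print(captured.group(1))  # ABC123
```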

Python 3.4 help - using slicing to replace characters in a string

Say I have a string.
"poop"
I want to change "poop" to "peep".
In fact, I also want all of the o's in poop to change to e's for any word I put in.
Here's my attempt to do the above.
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    for i in range (len(y)):
        if y[i] == "o":
            y = y[:i] + "e"
    print (y)
main()
As you can see, when you run it, it doesn't amount to what I want. Here is my expected output.
Enter a word.
>>> brother
brether
Something like this. I need to do it using slicing. I just don't know how.
Please keep your answer simple, since I'm somewhat new to Python. Thanks!
This uses slicing (but keep in mind that slicing is not the best way to do it):
def f(s):
    for x in range(len(s)):
        if s[x] == 'o':
            s = s[:x]+'e'+s[x+1:]
    return s
Strings in Python are immutable, which means that you can't just swap out letters in a string; you need to create a whole new string, concatenating the letters one by one:
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    output = ''
    for i in range(len(y)):
        if y[i] == "o":
            output = output + 'e'
        else:
            output = output + y[i]
    print(output)
main()
I'll help you this once, but you should know that Stack Overflow is not a homework help site. You should be figuring these things out on your own to get the full educational experience.
EDIT
Using slicing, I suppose you could do:
def getword():
    x = (input("Please enter a word."))
    return x
def main():
    y = getword()
    output = '' # String variable to hold the output string. Starts empty
    slice_start = 0 # Keeps track of what we have already added to the output. Starts at 0
    for i in range(len(y) - 1): # Scan through all but the last character
        if y[i] == "o": # If character is 'o'
            output = output + y[slice_start:i] + 'e' # then add all the previous characters to the output string, and an e character to replace the o
            slice_start = i + 1 # Increment the index to start the slice at to be the letter immediately after the 'o'
    output = output + y[slice_start:-1] # Add the rest of the characters to output string from the last occurrence of an 'o' to the end of the string
    if y[-1] == 'o': # We still haven't checked the last character, so check if its an 'o'
        output = output + 'e' # If it is, add an 'e' instead to output
    else:
        output = output + y[-1] # Otherwise just add the character as-is
    print(output)
main()
Comments should explain what is going on. I'm not sure if this is the most efficient or best way to do it (which really shouldn't matter, since slicing is a terribly inefficient way to do this anyways), just the first thing I hacked together that uses slicing.
EDIT Yeah... Ourous's solution is much more elegant
Can slicing even be used in this situation??
The only probable solution I think would work, as MirekE stated, is y.replace("o","e").
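A quick side-by-side shows that both routes give the same result. This is a small sketch: `replace_with_slicing` is a hypothetical helper, and it relies on the old and new values each being a single character so that the string length never changes while looping.

```python
def replace_with_slicing(word, old="o", new="e"):
    """Replace every single-character `old` with single-character `new` using slices."""
    for i in range(len(word)):
        if word[i] == old:
            # Everything before i, the replacement, then everything after i.
            word = word[:i] + new + word[i + 1:]
    return word


print(replace_with_slicing("brother"))  # brether
print("poop".replace("o", "e"))         # peep
```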
