Need help understanding how gsub and tonumber are used to encode Lua source code? - encryption

I'm new to Lua but figured out that gsub is a global substitution function and tonumber is a converter function. What I don't understand is how the two functions are used together to produce an encoded string.
I've already tried reading parts of PIL (Programming in Lua) and the reference manual but still, am a bit confused.
local L0_0, L1_1
function L0_0(A0_2)
  return (A0_2:gsub("..", function(A0_3)
    return string.char((tonumber(A0_3, 16) + 256 - 13 + 255999744) % 256)
  end))
end
encodes = L0_0
L0_0 = gg
L0_0 = L0_0.toast
L1_1 = "__loading__\226\128\166"
L0_0(L1_1)
L0_0 = encodes
L1_1 = --"The Encoded String"
L0_0 = L0_0(L1_1)
L1_1 = load
L1_1 = L1_1(L0_0)
pcall(L1_1)
I removed the encoded string where I put the comment because of how long it was. If needed I can upload the encoded string as well.

gsub is being used to grab two-character sections of A0_2. This means the string A0_3 is a two-digit hexadecimal number, but it is a string rather than a number, so we cannot perform math on the value directly. That A0_3 is a hex number can be inferred from how tonumber is used.
tonumber from Lua 5.1 Reference Manual:
Tries to convert its argument to a number. If the argument is already a number or a string convertible to a number, then tonumber returns this number; otherwise, it returns nil.
An optional argument specifies the base to interpret the numeral. The base may be any integer between 2 and 36, inclusive. In bases above 10, the letter 'A' (in either upper or lower case) represents 10, 'B' represents 11, and so forth, with 'Z' representing 35. In base 10 (the default), the number can have a decimal part, as well as an optional exponent part (see §2.1). In other bases, only unsigned integers are accepted.
So tonumber(A0_3, 16) means we are expecting for A0_3 to be a base 16 number (hexadecimal).
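For example, here is how the two pieces behave on their own (a quick sketch, runnable in a standard Lua interpreter):
print(tonumber("48", 16))  --> 72 (the byte value of 'H')
print(tonumber("zz", 16))  --> nil (not valid hex)
-- the pattern ".." makes gsub visit the string two characters at a time:
("48656c6c6f"):gsub("..", function(pair) print(pair) end)  -- prints 48, 65, 6c, 6c, 6f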
Once we have the number value of A0_3 we do some math and finally convert it back to a character. Note that 256 and 255999744 (which is 999999 × 256) are both multiples of 256, so the whole expression reduces to (n - 13) % 256: each byte is simply shifted down by 13.
function L0_0(A0_2)
  return (A0_2:gsub("..", function(A0_3)
    return string.char((tonumber(A0_3, 16) + 256 - 13 + 255999744) % 256)
  end))
end
This block of code takes a string of hex digits and converts them into chars. tonumber is being used to allow for the manipulation of the values.
Here is an example of how this works with Hello World:
local str = "Hello World"
local hex_str = ''
for i = 1, #str do
  -- %02x keeps a leading zero for byte values below 16
  hex_str = hex_str .. string.format("%02x", str:byte(i, i))
end

function L0_0(A0_2)
  return (A0_2:gsub("..", function(A0_3)
    return string.char((tonumber(A0_3, 16) + 256 - 13 + 255999744) % 256)
  end))
end

local encoded = L0_0(hex_str)
print(encoded)
Output
;X__bJbe_W
(The space in "Hello World" maps to byte 19, a non-printable control character, so only ten of the eleven characters are visible above.)
And taking it back to the original string:
function decode(A0_2)
  return (A0_2:gsub("..", function(A0_3)
    return string.char((tonumber(A0_3, 16) + 13) % 256)
  end))
end

hex_string = ''
for i = 1, #encoded do
  -- again, %02x so single-digit byte values keep their leading zero
  hex_string = hex_string .. string.format("%02x", encoded:byte(i, i))
end
print(decode(hex_string))
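Putting the two halves together as a round-trip check (reusing L0_0 and decode from above; to_hex is just a small helper for this example):
local function to_hex(s)
  return (s:gsub(".", function(c)
    return string.format("%02x", c:byte())
  end))
end

local original = "Hello World"
local encoded = L0_0(to_hex(original))   -- every byte shifted down by 13
local decoded = decode(to_hex(encoded))  -- every byte shifted back up by 13
print(decoded == original)               --> true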

Related

pyparsing: parse C/C++ enums with values as user-defined macros

I have a use case where I need to match enums whose values can be user-defined macros.
Example enum
typedef enum
{
  VAL_1 = -1
  VAL_2 = 0,
  VAL_3 = 0x10,
  VAL_4 = TEST_ENUM_CUSTOM(1,2),
} MyENUM;
I am using the code below; if I don't use the format as in VAL_4 it works. I need to match the format in VAL_4 as well. I am new to pyparsing, any help is appreciated.
My code:
LBRACE, RBRACE, EQ, COMMA = map(Suppress, "{}=,")
_enum = Suppress("enum")
identifier = Word(alphas, alphanums + "_")
integer = Word("-"+alphanums)  # I have tried adding "_(,)" to this but it is not matching.
enumValue = Group(identifier("name") + Optional(EQ + integer("value")))
enumList = Group(enumValue + ZeroOrMore(COMMA + enumValue) + Optional(COMMA))
enum = _enum + Optional(identifier("enum")) + LBRACE + enumList("names") + RBRACE + Optional(identifier("typedef"))
enum.ignore(cppStyleComment)
enum.ignore(cStyleComment)
Thanks
-Purna
Just adding more characters to integer is the wrong way to go. Even this expression:
integer = Word("-"+alphanums)
isn't super-great, since it would match "---", "xyz", "q--10-", and many other non-integer strings.
Better to define integer properly. You could do:
integer = Combine(Optional('-') + Word(nums))
but I've found that for these low-level expressions that occur many places in your parse string, a Regex is best:
integer = Regex(r"-?\d+") # Regex(r"-?[0-9]+") if you like more readable re's
Then define one for hex_integer also. Then, to add macros, we need a recursive expression to handle the possibility of macros having arguments that are themselves macros.
So at this point, we should just stop writing code for a bit, and do some design. In parser development, this design usually looks like a BNF, where you describe your parser in a sort of pseudocode:
enum_expr ::= "typedef" "enum" [identifier]
                  "{"
                  enum_item_list
                  "}" [identifier] ";"
enum_item_list ::= enum_item ["," enum_item]... [","]
enum_item ::= identifier ["=" enum_value]
enum_value ::= integer | hex_integer | macro_expression
macro_expression ::= identifier "(" enum_value ["," enum_value]... ")"
Note the recursion of macro_expression: it is used in defining enum_value, but it includes enum_value as part of its own definition. In pyparsing, we use a Forward to set up this kind of recursion.
See how that BNF is implemented in the code below. I build on some of the items you posted, but the macro expression required some rework. The bottom line is "don't just keep adding characters to integer trying to get something to work."
from pyparsing import (Suppress, Keyword, Word, alphas, alphanums, Regex,
                       Forward, Group, Optional, delimitedList, cppStyleComment)

LBRACE, RBRACE, EQ, COMMA, LPAR, RPAR, SEMI = map(Suppress, "{}=,();")
_typedef = Keyword("typedef").suppress()
_enum = Keyword("enum").suppress()
identifier = Word(alphas, alphanums + "_")
# define an enumValue expression that is recursive, so that enumValues
# that are macros can take parameters that are enumValues
enumValue = Forward()
# add more types as needed - parse action on hex_integer will do parse-time
# conversion to int
integer = Regex(r"-?\d+").addParseAction(lambda t: int(t[0]))
# or just use the signed_integer expression found in pyparsing_common
# integer = pyparsing_common.signed_integer
hex_integer = Regex(r"0x[0-9a-fA-F]+").addParseAction(lambda t: int(t[0], 16))
# a macro defined using enumValue for parameters
macro_expr = Group(identifier + LPAR + Group(delimitedList(enumValue)) + RPAR)
# use '<<=' operator to attach recursive definition to enumValue
enumValue <<= hex_integer | integer | macro_expr
# remaining enum expressions
enumItem = Group(identifier("name") + Optional(EQ + enumValue("value")))
enumList = Group(delimitedList(enumItem) + Optional(COMMA))
enum = (_typedef
        + _enum
        + Optional(identifier("enum"))
        + LBRACE
        + enumList("names")
        + RBRACE
        + Optional(identifier("typedef"))
        + SEMI
        )
# this comment style includes cStyleComment too, so no need to
# ignore both
enum.ignore(cppStyleComment)
Try it out:
enum.runTests([
    """
    typedef enum
    {
        VAL_1 = -1,
        VAL_2 = 0,
        VAL_3 = 0x10,
        VAL_4 = TEST_ENUM_CUSTOM(1,2)
    }MyENUM;
    """,
])
runTests is for testing and debugging your parser during development. Use enum.parseString(some_enum_expression) or enum.searchString(some_c_header_file_text) to get the actual parse results.
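For instance, here is a minimal sketch of pulling fields out of the parse results (the names/name/value labels are the ones attached in the code above; the exact layout of nested results can vary between pyparsing versions):
result = enum.parseString("""
    typedef enum { VAL_2 = 0, VAL_4 = TEST_ENUM_CUSTOM(1,2) } MyENUM;
""")
for item in result.names:
    print(item.name, item.value if 'value' in item else '(no value)')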
Using the new railroad diagram feature in the upcoming pyparsing 3.0 release, you can also generate a visual representation of this parser (diagram not reproduced here).

How can I change an ASCII string to hex and vice versa in Python 3.7?

I looked at some solutions on this site, but they don't work in Python 3.7, so I'm asking a new question.
The hex string of "the" is "746865".
I want a solution to convert "the" to "746865" and "746865" back to "the".
Given that your string contains ASCII only (each char is in the range 0-0x7F), you can use the following snippet:
In [28]: s = '746865'
In [29]: import math
In [30]: int(s, base=16).to_bytes(math.ceil(len(s) / 2), byteorder='big').decode('ascii')
Out[30]: 'the'
First you convert the string into an integer with base 16, then convert that to bytes (two hex chars per byte), and then decode the bytes back into a string.
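For the reverse direction (string to hex), the same idea in one line, using bytes.hex (available since Python 3.5):
In [31]: 'the'.encode('ascii').hex()
Out[31]: '746865'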
#!/usr/bin/python3
"""
Program name: txt_to_ASC.py
The program transfers
a string of letters -> the corresponding
string of hexadecimal ASCII-codes,
eg. the -> 746865
Only letters in [abc...xyzABC...XYZ] should be input.
"""
print("Transfer letters to hex ASCII-codes")
print("Input range is [abc...xyzABC...XYZ].")
print()

string = input("Input set of letters, eg. the: ")
print("hex ASCII-code: " + " "*15, end="")

def str_to_hasc(x):
    byt = bytes(x, 'utf-8')   # encode the letters as bytes
    bythex = byt.hex()        # hex digits of those bytes, e.g. 'the' -> '746865'
    for b1 in bythex:
        print(b1, end="")     # print the hex digits one at a time
    print()
    return bythex

str_to_hasc(string)
If you have a byte string, then:
>>> import binascii
>>> binascii.hexlify(b'the')
b'746865'
If you have a Unicode string, you can encode it:
>>> s = 'the'
>>> binascii.hexlify(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
>>> binascii.hexlify(s.encode())
b'746865'
The result is a byte string, you can decode it to get a Unicode string:
>>> binascii.hexlify(s.encode()).decode()
'746865'
The reverse, of course, is:
>>> binascii.unhexlify(b'746865')
b'the'
#!/usr/bin/python3
"""
Program name: ASC_to_txt.py
The program's input is a string of hexadecimal digits.
The string is a bytes object, and each byte is supposed to be
the hex ASCII-code of a (capital or small) letter.
The program's output is the string of the corresponding letters.
Example
Input: 746865
First subresult: ['7','4','6','8','6','5']
Second subresult: ['0x74', '0x68', '0x65']
Third subresult: [116, 104, 101]
Final result: the
References
Contribution by alhelal to stackoverflow.com (20180901)
Contribution by QintenG to stackoverflow.com (20170104)
Mark Pilgrim, Dive into Python 3, section 4.6
"""
print("The program converts a string of hex ASCII-codes")
print("into the corresponding string of letters.")
print("Input range is [41, 42, ..., 5a] U [61, 62, ..., 7a]. \n")

x = input("Input the hex ASCII-codes, eg. 746865: ")

# First subresult: the individual hex digits
result_1 = []
for i in range(0, len(x)//2):
    for j in range(0, 2):
        result_1.extend(x[2*i + j])

# Second subresult: '0x'-prefixed two-digit pairs
result_2 = []
for i in range(0, len(result_1) - 1, 2):
    temp = "0x" + result_1[i] + result_1[i + 1]
    result_2.append(temp)

# Third subresult: the integer value of each pair
result_3 = []
for i in range(0, len(result_2)):
    result_3.append(int(result_2[i], 16))

# Final result: decode the bytes into letters
by = bytes(result_3)
result_4 = by.decode('utf-8')
print("Corresponding string of letters:" + " "*6, result_4, end="\n")
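For comparison, the whole pipeline above collapses into a single expression with bytes.fromhex:
>>> bytes.fromhex('746865').decode('ascii')
'the'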

How do you access name of a ProtoField after declaration?

How can I access the name property of a ProtoField after I declare it?
For example, something along the lines of:
myproto = Proto("myproto", "My Proto")
myproto.fields.foo = ProtoField.int8("myproto.foo", "Foo", base.DEC)
print(myproto.fields.foo.name)
Where I get the output:
Foo
An alternate method that's a bit more terse:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* myproto")
print(string.sub(fieldString, i + 2, j - (1 + string.len("myproto")))
EDIT: Or an even simpler solution that works for any protocol:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* ")
print(string.sub(fieldString, i + 2, j - 1))
Of course the 2nd method only works as long as there are no spaces in the field name. Since that's not necessarily always going to be the case, the 1st method is more robust. Here is the 1st method wrapped up in a function that ought to be usable by any dissector:
-- The field is the field whose name you want to print.
-- The proto is the name of the relevant protocol
function printFieldName(field, protoStr)
  local fieldString = tostring(field)
  local i, j = string.find(fieldString, ": .* " .. protoStr)
  print(string.sub(fieldString, i + 2, j - (1 + string.len(protoStr))))
end
... and here it is in use:
printFieldName(myproto.fields.foo, "myproto")
printFieldName(someproto.fields.bar, "someproto")
Ok, this is janky, and certainly not the 'right' way to do it, but it seems to work.
I discovered this after looking at the output of
print(tostring(myproto.fields.foo))
This seems to spit out the value of each of the members of ProtoField, but I couldn't figure out the correct way to access them. So, instead, I decided to parse the string. This function will return 'Foo', but could be adapted to return the other fields as well.
function getname(field)
  -- First, convert the field into a string.
  -- This results in a long string with a bunch of info we don't need.
  local fieldString = tostring(field)
  -- fieldString looks like:
  -- ProtoField(188403): Foo myproto.foo base.DEC 0000000000000000 00000000 (null)
  -- Split the string on '.' characters
  a, b = fieldString:match("([^.]*).(.*)")
  -- Split the first half of the previous result (a) on ':' characters
  a, b = a:match("([^.]*):(.*)")
  -- At this point, b will equal " Foo myproto"
  -- and we want to strip out the abbreviation ("abvr") part.
  -- Count the number of times spaces occur in the string
  local spaceCount = select(2, string.gsub(b, " ", ""))
  -- Declare a counter
  local counter = 0
  -- Declare the name we are going to return
  local constructedName = ''
  -- Step through each word in (b) separated by spaces
  for word in b:gmatch("%w+") do
    -- If we have reached the last space, go ahead and return
    if counter == spaceCount - 1 then
      return constructedName
    end
    -- Add the current word to our name
    constructedName = constructedName .. word .. " "
    -- Increment the counter
    counter = counter + 1
  end
end
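For completeness, here it is in use (assuming the myproto example from the question; note that the function as written returns the name with a trailing space):
print(getname(myproto.fields.foo))  --> Foo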

Maximum input number to URL shortener

Given the following code, which encodes a number: how can I calculate the maximum number if I want to limit the length of my generated keys, e.g. setting the max length of the result of encode(num) to some fixed value, say 10?
var alphabet = <SOME SET OF KEYS>,
    base = alphabet.length;

this.encode = function(num) {
    var str = '';
    while (num > 0) {
        str = alphabet.charAt(num % base) + str;
        num = Math.floor(num / base);
    }
    return str;
};
You are constructing num's representation in base base, with some arbitrary set of characters as numerals (alphabet).
For n characters we can represent numbers 0 through base^n - 1, so the answer to your question is base^10 - 1. For example, using the decimal system, with 5 digits we can represent numbers from 0 to 99999 (10^5 - 1).
It's worth noting that you will never use certain sub-n-length strings such as '001' or '0405' (using the decimal system numerals): any string starting with the equivalent of 0, except '0' itself.
I imagine that, for the purpose of a URL shortener that is allowed variable length, this might be considered a waste. By using all strings of length 1 through n you could represent numbers 0 through (base^(n+1) - base)/(base - 1) - 1, but it wouldn't be as straightforward as your scheme.
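As a concrete sketch of both the limit and its inverse (same alphabet and base as the question's code; maxNum and decode are illustrative names):
// largest num whose encoding fits in 10 characters
var maxNum = Math.pow(base, 10) - 1;

this.decode = function(str) {
    var num = 0;
    for (var i = 0; i < str.length; i++) {
        num = num * base + alphabet.indexOf(str.charAt(i));
    }
    return num;
};
Note that for larger bases, base^10 - 1 can exceed Number.MAX_SAFE_INTEGER, so exact arithmetic may require BigInt.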

How to convert a group of Hexadecimal to Decimal (Visual Studio )

I want to retrieve the values in decimal, as in Pic2 (hardcoded for visual understanding).
This is the code to convert hex to decimal for 16 bits:
string H;
int D;
H = txtHex.Text;
D = Convert.ToInt16(H, 16);
txtDec.Text = Convert.ToString(D);
However, it doesn't work for a whole group of hex values.
So the hex you are looking at does not refer to a decimal number. If it did refer to a single number that number would be far too large to store in any integral type. It might actually be too large to store in floating point types.
That hex you are looking at represents the binary data of a file. Each set of two characters represents one byte (because 16^2 = 2^8).
Take each pair of hex characters and convert it to a value between 0 and 255. You can accomplish this easily by converting each character to its numerical value. In case you don't have a complete understanding of what hex is, here's a map.
'0' = 0
'1' = 1
'2' = 2
'3' = 3
'4' = 4
'5' = 5
'6' = 6
'7' = 7
'8' = 8
'9' = 9
'A' = 10
'B' = 11
'C' = 12
'D' = 13
'E' = 14
'F' = 15
If the character on the left evaluates to n and the character on the right evaluates to m then the decimal value of the hex pair is (n x 16) + m.
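For example, the pair "B9" gives n = 11 and m = 9, so its byte value is (11 x 16) + 9 = 185.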
You can use this method to get your values between 0 and 255. You then need to store each value in an unsigned char (this is a C/C++/ObjC term; the C# equivalent is byte). You then concatenate these unsigned chars to create the binary of the file. It is very important that you use an 8 bit type to store these values. You should not store these values in 16 bit integers, as you do above, or you will get corrupted data.
I don't know what you're meant to output in your program but this is how you get the data. If you provide a little more information I can probably help you use this binary.
You will need to split the contents into separate hex-number pairs ("B9", "D1" and so on). Then you can convert each into their "byte" value and add it to a result list.
Something like this, although you may need to adjust the "Split" (currently it uses single spaces, returns, newlines and tabs as separators):
var byteList = new List<byte>();
foreach(var bytestring in txtHex.Text.Split(new[] {' ', '\r', '\n', '\t'},
                                            StringSplitOptions.RemoveEmptyEntries))
{
    byteList.Add(Convert.ToByte(bytestring, 16));
}
byte[] bytes = byteList.ToArray(); // further processing usually needs a byte-array instead of a List<byte>
What you then do with those "bytes" is up to you.
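For instance, to show the result as decimal values in the textbox from the question (a sketch; txtDec is the textbox name from the original code):
// one decimal value per hex pair, space-separated, e.g. "185 209 ..."
txtDec.Text = string.Join(" ", byteList);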
