how to change 3 concatenated strings to bytes in python? - serial-port

I am getting an error when running the following code in python 3, I look all over but could not find a right way to do it. any help will be appreciated.
raise TypeError('unicode strings are not supported, please encode to bytes: {!r}'.format(seq))
TypeError: unicode strings are not supported, please encode to bytes: 'relay read 7\n\r'
I need to send the following string via serial port: relay read #of relay.
import sys
import serial
if (len(sys.argv) < 2):
print ("Usage: relayread.py <PORT> <RELAYNUM>\nEg: relayread.py COM1 0")
sys.exit(0)
else:
portName = sys.argv[1];
relayNum = sys.argv[2];
#Open port for communication
serPort = serial.Serial(portName, 19200, timeout=1)
if (int(relayNum) < 10):
relayIndex = str(relayNum)
else:
relayIndex = chr(55 + int(relayNum))
serPort.write("relay read "+ relayIndex + "\n\r")
response = serPort.read(25)
if(response.find("on") > 0):
print ("Relay " + str(relayNum) +" is ON")
elif(response.find("off") > 0):
print ("Relay " + str(relayNum) +" is OFF")
#Close the port
serPort.close()

Use the string's encode method to construct the corresponding byte sequence.
In this case all of the characters in the string are in the ASCII range so it doesn't really matter which encoding scheme you use. (Differences between encoding schemes generally only matter when you're dealing with non-ASCII characters, ones whose ord() value is greater than 127.) So in this case you don't even need to specify a particular encoding scheme, you can simply use the encode method with no argument and let Python apply the platform's default encoding.
To do that, change this:
serPort.write("relay read "+ relayIndex + "\n\r")
to this:
serPort.write(("relay read "+ relayIndex + "\n\r").encode())
You'll probably have to do the reverse operation to get a string from the byte sequence returned by serPort.read. Change this:
response = serPort.read(25)
to:
response = serPort.read(25).decode()
BTW, it's typical for line endings in transmitted data to be represented by a Carriage Return followed by a Line Feed, or "\r\n". In your serPort.write call you're using the reverse of that, "\n\r". That's unusual but if that's what your device needs then so be it.

Related

Nginx: How to convert post data encoding from UTF-8 to TIS-620 before pass them through proxy

I would like to convert POST data received from request and convert it from UTF-8 to TIS-620 before pass it to backend via proxy_pass using code below but I am not sure which way to do it
location / {
proxy_pass http://targetwebsite;
}
If I am not wrong, I believe I have to use Lua module to manipulate the request but I don't know if they support any character conversion.
Could anybody help me with sample code to convert POST data from UTF-8 toTIS-620 using LUA and how to validate if POST data is UTF-8 before convert it or if there is other better way to manipulate/convert POST data in nginx ?
This solution works on Lua 5.1/5.2/5.3
local function utf8_to_unicode(utf8str, pos)
-- pos = starting byte position inside input string
local code, size = utf8str:byte(pos), 1
if code >= 0xC0 and code < 0xFE then
local mask = 64
code = code - 128
repeat
local next_byte = utf8str:byte(pos + size)
if next_byte and next_byte >= 0x80 and next_byte < 0xC0 then
code, size = (code - mask - 2) * 64 + next_byte, size + 1
else
return
end
mask = mask * 32
until code < mask
elseif code >= 0x80 then
return
end
-- returns code, number of bytes in this utf8 char
return code, size
end
function utf8to620(utf8str)
local pos, result_620 = 1, {}
while pos <= #utf8str do
local code, size = utf8_to_unicode(utf8str, pos)
if code then
pos = pos + size
code =
(code < 128 or code == 0xA0) and code
or (code >= 0x0E01 and code <= 0x0E3A or code >= 0x0E3F and code <= 0x0E5B) and code - 0x0E5B + 0xFB
end
if not code then
return utf8str -- wrong UTF-8 symbol, this is not a UTF-8 string, return original string
end
table.insert(result_620, string.char(code))
end
return table.concat(result_620) -- return converted string
end
Usage:
local utf8string = "UTF-8 Thai text here"
local tis620string = utf8to620(utf8string)
I looked up the encoding on Wikipedia and came up the following solution for converting from UTF-8 to TIS-620. It assumes that all the codepoints in the UTF-8 string have an encoding in TIS-620. It will work if the UTF-8 string only contains ASCII printable characters (codepoints " " to "~") or Thai characters (codepoints "ก" to "๛"). Otherwise, it will give wrong and possibly very strange results.
This assumes you have Lua 5.3's utf8 library or an equivalent. If you're using an earlier version of Lua, one possibility is the pure-Lua version of the ustring library from MediaWiki (used by Wikipedia and Wiktionary, for instance). It provides a function to validate UTF-8, and many of the other functions will validate strings automatically. (That is, they throw an error if the string is invalid UTF-8.) If you use that library, you just have to replace utf8.codepoint with ustring.codepoint in the code below.
-- Add this number to TIS-620 values above 0x80 to get the Unicode codepoint.
-- 0xE00 is the first codepoint of Thai block, 0xA0 is the corresponding byte used in TIS-620.
local difference = 0xE00 - 0xA0
function UTF8_to_TIS620(UTF8_string)
local TIS620_string = UTF8_string:gsub(
'[\194-\244][\128-\191]+',
function (non_ASCII)
return string.char(utf8.codepoint(non_ASCII) - difference)
end)
return TIS620_string
end

Concat 2 strings erlang and send with http

I'm trying to concat 2 variables Address and Payload. After that I want to send them with http to a server but I have 2 problems. When i try to concat the 2 variables with a delimiter ';' it doesn't work. Also sending the data of Payload or Address doesn't work. This is my code:
handle_rx(Gateway, #link{devaddr=DevAddr}=Link, #rxdata{port=Port, data= RxData }, RxQ)->
Data = base64:encode(RxData),
Devaddr = base64:encode(DevAddr),
TextAddr="Device address: ",
TextPayload="Payload: ",
Address = string:concat(TextAddr, Devaddr),
Payload = string:concat(TextPayload, Data),
Json=string:join([Address,Payload], "; "),
file:write_file("/tmp/foo.txt", io_lib:fwrite("~s.\n", [Json] )),
inets:start(),
ssl:start(),
httpc:request(post, {"http://192.168.0.121/apiv1/lorapacket/rx", [], "application/x-www-form-urlencoded", Address },[],[]),
ok;
handle_rx(_Gateway, _Link, RxData, _RxQ) ->
{error, {unexpected_data, RxData}}.
I have no errors that I can show you. When I write Address or Payload individually to the file it works but sending doesn't work...
Thank you for your help!
When i try to concat the 2 variables with a delimiter ';' it doesn't work.
5> string:join(["hello", <<"world">>], ";").
[104,101,108,108,111,59|<<"world">>]
6> string:join(["hello", "world"], ";").
"hello;world"
base64:encode() returns a binary, yet string:join() requires string arguments. You can do this:
7> string:join(["hello", binary_to_list(<<"world">>)], ";").
"hello;world"
Response to comment:
In erlang the string "abc" is equivalent to the list [97,98,99]. However, the binary syntax <<"abc">> is not equivalent to <<[97,98,99]>>, rather the binary syntax <<"abc">> is special short hand notation for the binary <<97, 98, 99>>.
Therefore, if you write:
Address = [97,98,99].
then the code:
Bin = <<Address>>.
after variable substitution becomes:
Bin = <<[97,98,99]>>.
and that isn't legal binary syntax.
If you need to convert a string/list contained in a variable, like Address, to a binary, you use list_to_binary(Address)--not <<Address>>.
In your code here:
Json = string:join([binary_to_list(<<Address>>),
binary_to_list(<<Pa‌​yload>>)],
";").
Address and Payload were previously assigned the return value of string:concat(), which returns a string, so there is no reason to (attempt) to convert Address to a binary with <<Address>>, then immediately convert the binary back to a string with binary_to_list(). Instead, you would just write:
Json = string:join(Address, Payload, ";")
The problem with your original code is that you called string:concat() with a string as the first argument and a binary as the second argument--yet string:concat() takes two string arguments. You can use binary_to_list() to convert a binary to the string that you need for the second argument.
Sorry I'm new to Erlang
As with any language, you have to study the basics and write numerous toy examples before you can start writing code that actually does something.
You don't have to concatenate strings. It is called iolist and is one of best things in Erlang:
1> RxData = "Hello World!", DevAddr = "Earth",
1> Data = base64:encode(RxData), Devaddr = base64:encode(DevAddr),
1> TextAddr="Device address", TextPayload="Payload",
1> Json=["{'", TextAddr, "': '", Devaddr, "', '", TextPayload, "': '", Data, "'}"].
["{'","Device address","': '",<<"RWFydGg=">>,"', '",
"Payload","': '",<<"SGVsbG8gV29ybGQh">>,"'}"]
2> file:write_file("/tmp/foo.txt", Json).
ok
3> file:read_file("/tmp/foo.txt").
{ok,<<"{'Device address': 'RWFydGg=', 'Payload': 'SGVsbG8gV29ybGQh'}">>}

python creation json with plone object

I've a json file with plone objects and there is one field of the objects giving me an error:
UnicodeDecodeError('ascii', '{"id":"aluminio-prata", "nome":"ALUM\xc3\x8dNIO PRATA", "num_demaos":0, "rendimento": 0.0, "unidade":"litros", "url":"", "particular":[], "profissional":[], "unidades":[]},', 36, 37, 'ordinal not in range(128)') (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: 'NoneType' object has no attribute 'getMethodAliases')
I already know witch field is, is the "title" from title = obj.pretty_title_or_id(), when I remove it from here its ok:
json += '{"id":"' + str(id) + '", "nome":"' + title + '", "num_demaos":' + str(num_demaos) + ', "rendimento": ' + str(rendimento) + ', "unidade":"' + str(unidade) + '", "url":"' + link_produto + '", "particular":' + arr_area_particular + ', "profissional":' + arr_area_profissional + ', "unidades":' + json_qtd + '},
but when I leave it I've got this error.
UnicodeDecodeError('ascii', '{"id":"aluminio-prata", "nome":"ALUM\xc3\x8dNIO PRATA", "num_demaos":0, "rendimento": 0.0, "unidade":"litros", "url":"", "particular":[], "profissional":[], "unidades":[]},', 36, 37, 'ordinal not in range(128)') (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: 'NoneType' object has no attribute 'getMethodAliases')
I'm going to assume that the error occurs when you're reading the JSON file.
Internally, Plone uses Python Unicode strings for nearly everything. If you read a string from a file, it will need to be decoded into Unicode before Plone can use it. If you give no instructions otherwise, Python will assume that the string was encoded as ASCII, and will attempt its Unicode conversion on that basis. It would be similar to writing:
unicode("ALUM\xc3\x8dNIO PRATA")
which will produce the same kind of error.
In fact, the string you're using was evidently encoded with the UTF-8 character set. That's evident from the "\xc3", and it also makes sense, because that's the character set Plone uses when it sends data to the outside world.
So, how do you fix this? You have to specify the character set that you wish to use when you convert to Unicode:
"ALUM\xc3\x8dNIO PRATA".decode('UTF8')
This gives you a Python Unicode string with no error.
So, after you've read your JSON file into a string (let's call it mystring), you will need to explicitly decode it by using mystring.decode('UTF8'). unicode(mystring, 'UTF8') is another form of the same operation.
As Steve already wrote do title.decode('utf8')
An Example illustrate the facts:
>>> u"Ä" == u"\xc4"
True # the native unicode char and escaped versions are the same
>>> "Ä" == u"\xc4"
False # the native unicode char is '\xc3\x84' in latin1
>>> "Ä".decode('utf8') == u"\xc4"
True # one can decode the string to get unicode
>>> "Ä" == "\xc4"
False # the native character and the escaped string are
# of course not equal ('\xc3\x84' != '\xc4').
I find this Thread very helpfull for Problems and Understanding with Encode/Decode of UTF-8.

Can LLDB data formatters call methods?

I'm debugging a Qt application using LLDB. At a breakpoint I can write
(lldb) p myQString.toUtf8().data()
and see the string contained within myQString, as data() returns char*. I would like to be able to write
(lldb) p myQString
and get the same output. This didn't work for me:
(lldb) type summary add --summary-string "${var.toUtf8().data()}" QString
Is it possible to write a simple formatter like this, or do I need to know the internals of QString and write a python script?
Alternatively, is there another way I should be using LLDB to view QStrings this way?
The following does work.
First, register your summary command:
debugger.HandleCommand('type summary add -F set_sblldbbp.qstring_summary "QString"')
Here is an implementation
def make_string_from_pointer_with_offset(F,OFFS,L):
strval = 'u"'
try:
data_array = F.GetPointeeData(0, L).uint16
for X in range(OFFS, L):
V = data_array[X]
if V == 0:
break
strval += unichr(V)
except:
pass
strval = strval + '"'
return strval.encode('utf-8')
#qt5
def qstring_summary(value, unused):
try:
d = value.GetChildMemberWithName('d')
#have to divide by 2 (size of unsigned short = 2)
offset = d.GetChildMemberWithName('offset').GetValueAsUnsigned() / 2
size = get_max_size(value)
return make_string_from_pointer_with_offset(d, offset, size)
except:
print '?????????????????????????'
return value
def get_max_size(value):
_max_size_ = None
try:
debugger = value.GetTarget().GetDebugger()
_max_size_ = int(lldb.SBDebugger.GetInternalVariableValue('target.max-string-summary-length', debugger.GetInstanceName()).GetStringAtIndex(0))
except:
_max_size_ = 512
return _max_size_
It is expected that what you tried to do won't work. The summary strings feature does not allow calling expressions.
Calling expressions in a debugger is always interesting, in a data formatter more so (if you're in an IDE - say Xcode - formatters run automatically). Every time you stop somewhere, even if you just stepped over one line, all these little expressions would all automatically run over and over again, at a large performance cost - and this is not even taking into account the fact that your data might be in a funny state already and running expressions has the potential to alter it even more, making your debugging sessions trickier than needed.
If the above wall of text still hasn't discouraged you ( :-) ), you want to write a Python formatter, and use the SB API to run your expression. Your value is an SBValue object, which has access to an SBFrame and an SBTarget. The combination of these two allows you to run EvaluateExpression("blah") and get back another SBValue, probably a char* to which you can then ask GetSummary() to get your c-string back.
If, on the other hand, you are now persuaded that running expressions in formatters is suboptimal, the good news is that QString most certainly has to store its data pointer somewhere.. if you find out where that is, you can just write a formatter as ${var.member1.member2.member3.theDataPointer} and obtain the same result!
this is my trial-and-error adaptation of a UTF16 string interpretation lldb script I found online (I apologise that I don't remember the source - and that I can't credit the author)
Note that this is for Qt 4.3.2 and versions close to it - as the handling of the 'data' pointer has since changed between then and Qt 5.x
def QString_SummaryProvider(valobj, internal_dict):
data = valobj.GetChildMemberWithName('d')#.GetPointeeData()
strSize = data.GetChildMemberWithName('size').GetValueAsUnsigned()
newchar = -1
i = 0
s = u'"'
while newchar != 0:
# read next wchar character out of memory
data_val = data.GetChildMemberWithName('data').GetPointeeData(i, 1)
size = data_val.GetByteSize()
e = lldb.SBError()
if size == 1:
newchar = data_val.GetUnsignedInt8(e, 0) # utf-8
elif size == 2:
newchar = data_val.GetUnsignedInt16(e, 0) # utf-16
elif size == 4:
newchar = data_val.GetUnsignedInt32(e, 0) # utf-32
else:
s = s + '<unexpected char size - error parsing QString>'
break
if e.fail:
s = s + '<parse error:' + e.why() + '>'
break
i = i + 1
if i > strSize:
break
# add the character to our string 's'
# print "char2 = %s" % newchar
if newchar != 0:
s = s + unichr(newchar)
s = s + u'"'
return s.encode('utf-8')

pyparsing multiple lines optional missing data in result set

I am quite new pyparsing user and have missing match i don't understand:
Here is the text i would like to parse:
polraw="""
set policy id 800 from "Untrust" to "Trust" "IP_10.124.10.6" "MIP(10.0.2.175)" "TCP_1002" permit
set policy id 800
set dst-address "MIP(10.0.2.188)"
set service "TCP_1002-1005"
set log session-init
exit
set policy id 724 from "Trust" to "Untrust" "IP_10.16.14.28" "IP_10.24.10.6" "TCP_1002" permit
set policy id 724
set src-address "IP_10.162.14.38"
set dst-address "IP_10.3.28.38"
set service "TCP_1002-1005"
set log session-init
exit
set policy id 233 name "THE NAME is 527 ;" from "Untrust" to "Trust" "IP_10.24.108.6" "MIP(10.0.2.149)" "TCP_1002" permit
set policy id 233
set service "TCP_1002-1005"
set service "TCP_1006-1008"
set service "TCP_1786"
set log session-init
exit
"""
I setup grammar this way:
KPOL = Suppress(Keyword('set policy id'))
NUM = Regex(r'\d+')
KSVC = Suppress(Keyword('set service'))
KSRC = Suppress(Keyword('set src-address'))
KDST = Suppress(Keyword('set dst-address'))
SVC = dblQuotedString.setParseAction(lambda t: t[0].replace('"',''))
ADDR = dblQuotedString.setParseAction(lambda t: t[0].replace('"',''))
EXIT = Suppress(Keyword('exit'))
EOL = LineEnd().suppress()
P_SVC = KSVC + SVC + EOL
P_SRC = KSRC + ADDR + EOL
P_DST = KDST + ADDR + EOL
x = KPOL + NUM('PId') + EOL + Optional(ZeroOrMore(P_SVC)) + Optional(ZeroOrMore(P_SRC)) + Optional(ZeroOrMore(P_DST))
for z in x.searchString(polraw):
print z
Result set is such as
['800', 'MIP(10.0.2.188)']
['724', 'IP_10.162.14.38', 'IP_10.3.28.38']
['233', 'TCP_1002-1005', 'TCP_1006-1008', 'TCP_1786']
The 800 is missing service tag ???
What's wrong here.
Thanks by advance
Laurent
The problem you are seeing is that in your expression, DST's are only looked for after having skipped over optional SVC's and SRC's. You have a couple of options, I'll go through each so you can get a sense of what all is going on here.
(But first, there is no point in writing "Optional(ZeroOrMore(anything))" - ZeroOrMore already implies Optional, so I'm going to drop the Optional part in any of these choices.)
If you are going to get SVC's, SRC's, and DST's in any order, you could refactor your ZeroOrMore to accept any of the three data types, like this:
x = KPOL + NUM('PId') + EOL + ZeroOrMore(P_SVC|P_SRC|P_DST)
This will allow you to intermix different types of statements, and they will all get collected as part of the ZeroOrMore repetition.
If you want to keep these different types of statements in groups, then you can add a results name to each:
x = KPOL + NUM('PId') + EOL + ZeroOrMore(P_SVC("svc*")|
P_SRC("src*")|
P_DST("dst*"))
Note the trailing '*' on each name - this is equivalent to calling setResultsName with the listAllMatches argument equal to True. As each different expression is matched, the results for the different types will get collected into the "svc", "src", or "dst" results name. Calling z.dump() will list the tokens and the results names and their values, so you can see how this works.
set policy id 233
set service "TCP_1002-1005"
set dst-address "IP_10.3.28.38"
set service "TCP_1006-1008"
set service "TCP_1786"
set log session-init
exit
shows this for z.dump():
['233', 'TCP_1002-1005', 'IP_10.3.28.38', 'TCP_1006-1008', 'TCP_1786']
- PId: 233
- dst: [['IP_10.3.28.38']]
- svc: [['TCP_1002-1005'], ['TCP_1006-1008'], ['TCP_1786']]
If you wrap ungroup on the P_xxx expressions, maybe like this:
P_SVC,P_SRC,P_DST = (ungroup(expr) for expr in (P_SVC,P_SRC,P_DST))
then the output is even cleaner-looking:
['233', 'TCP_1002-1005', 'IP_10.3.28.38', 'TCP_1006-1008', 'TCP_1786']
- PId: 233
- dst: ['IP_10.3.28.38']
- svc: ['TCP_1002-1005', 'TCP_1006-1008', 'TCP_1786']
This is actually looking pretty good, but let me pass on one other option. There are a number of cases where parsers have to look for several sub-expressions in any order. Let's say they are A,B,C, and D. To accept these in any order, you could write something like OneOrMore(A|B|C|D), but this would accept multiple A's, or A, B, and C, but not D. The exhaustive/exhausting combinatorial explosion of (A+B+C+D) | (A+B+D+C) | etc. could be written, or you could maybe automate it with something like
from itertools import permutations
mixNmatch = MatchFirst(And(p) for p in permutations((A,B,C,D),4))
But there is a class in pyparsing called Each that allows to write the same kind of thing:
Each([A,B,C,D])
meaning "must have one each of A, B, C, and D, in any order". And like And, Or, NotAny, etc., there is an operator shortcut too:
A & B & C & D
which means the same thing.
If you want "must have A, B, and C, and optionally D", then write:
A & B & C & Optional(D)
and this will parse with the same kind of behavior, looking for A, B, C, and D, regardless of the incoming order, and whether D is last or mixed in with A, B, and C. You can also use OneOrMore and ZeroOrMore to indicate optional repetition of any of the expressions.
So you could write your expression as:
x = KPOL + NUM('PId') + EOL + (ZeroOrMore(P_SVC) &
ZeroOrMore(P_SRC) &
ZeroOrMore(P_DST))
I looked at using results names with this expression, and the ZeroOrMore's seem to be confusing things, maybe still a bug in how this is done. So you may have to reserve using Each for more basic cases like my A,B,C,D example. But I wanted to make you aware of it.
Some other notes on your parser:
dblQuotedString.setParseAction(lambda t: t[0].replace('"','')) is probably better written
dblQuotedString.setParseAction(removeQuotes). You don't have any embedded quotes in your examples, but it's good to be aware of where your assumptions might not translate to a future application. Here are a couple of ways of removing the defining quotes:
dblQuotedString.setParseAction(lambda t: t[0].replace('"',''))
print dblQuotedString.parseString(r'"This is an embedded quote \" and an ending quote \""')[0]
# prints 'This is an embedded quote \ and an ending quote \'
# removed leading and trailing "s, but also internal ones too, which are
# really part of the quoted string
dblQuotedString.setParseAction(lambda t: t[0].strip('"'))
print dblQuotedString.parseString(r'"This is an embedded quote \" and an ending quote \""')[0]
# prints 'This is an embedded quote \" and an ending quote \'
# removed leading and trailing "s, and leaves the one internal ones but strips off
# the escaped ending quote
dblQuotedString.setParseAction(removeQuotes)
print dblQuotedString.parseString(r'"This is an embedded quote \" and an ending quote \""')[0]
# prints 'This is an embedded quote \" and an ending quote \"'
# just removes leading and trailing " characters, leaves escaped "s in place
KPOL = Suppress(Keyword('set policy id')) is a bit fragile, as it will break if there are any extra spaces between 'set' and 'policy', or between 'policy' and 'id'. I usually define these kind of expressions by first defining all the keywords individually:
SET,POLICY,ID,SERVICE,SRC_ADDRESS,DST_ADDRESS,EXIT = map(Keyword,
"set policy id service src-address dst-address exit".split())
and then define the separate expressions using:
KSVC = Suppress(SET + SERVICE)
KSRC = Suppress(SET + SRC_ADDRESS)
KDST = Suppress(SET + DST_ADDRESS)
Now your parser will cleanly handle extra whitespace (or even comments!) between individual keywords in your expressions.

Resources