(PyQt6) How can I write bytes to memory using sip.voidptr?

I want to fill the pixel data of an empty QVideoFrame instance. In PySide6, the bits(0) method returns a memory view, and I can modify it directly by slicing and assigning. But in PyQt6, a sip.voidptr is returned instead, and I can't figure out how to deal with it!
I am not really familiar with Python C bindings. Is there any way to easily access the memory using voidptr? Thanks in advance!
Edit 1
Here is a sample code.
from PyQt6 import QtCore, QtMultimedia
pfmt = QtMultimedia.QVideoFrameFormat.PixelFormat.Format_Y8
ffmt = QtMultimedia.QVideoFrameFormat(QtCore.QSize(1280, 1024), pfmt)
vf = QtMultimedia.QVideoFrame(ffmt)
vf.map(QtMultimedia.QVideoFrame.MapMode.ReadWrite)
# Data manipulation should come here.
vf.unmap()
Edit 2
Ok, I found out that voidptr can be converted to an array by passing the size of the video frame.
In the comment location of the code above,
>>> size = vf.bytesPerLine(0)*vf.height()
>>> vf.bits(0).asarray(size)
PyQt6.sip.array(unsigned char, 1310720)
Modifying a single pixel works fine.
>>> arr = vf.bits(0).asarray(vf.bytesPerLine(0)*vf.height())
>>> arr[0]
0
>>> arr[0] = 255
>>> arr[0]
255
But assigning bytes to the sip.array raised an error.
>>> data = bytes(255 for _ in range(size))
>>> arr[:] = data
TypeError: can only assign another array of unsigned char to the slice
The answers to another question raised the same error.

Thanks to @musicamante I managed to find the solution. It seems that a voidptr can be written to directly, provided that its size is known. QVideoFrame.bits() returns a voidptr of unknown size (perhaps this is a bug in PyQt), so I had to set the size manually from QVideoFrame.mappedBytes.
Here is the full code:
from PyQt6 import QtCore, QtMultimedia
pfmt = QtMultimedia.QVideoFrameFormat.PixelFormat.Format_Y8
ffmt = QtMultimedia.QVideoFrameFormat(QtCore.QSize(1280, 1024), pfmt)
vf = QtMultimedia.QVideoFrame(ffmt)
vf.map(QtMultimedia.QVideoFrame.MapMode.ReadWrite)
ptr = vf.bits(0)
size = vf.mappedBytes(0)  # number of bytes mapped for plane 0
ptr.setsize(size)  # bits() returns a voidptr whose size is unknown
ptr[:] = bytes(255 for _ in range(size))
vf.unmap()
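For readers without PyQt6 at hand, the slice-assignment step can be tried on a plain writable buffer. The sketch below is only a stand-in: a bytearray plays the role of the mapped plane, and fill_plane is a hypothetical helper, not part of PyQt6 or sip. It assumes an 8-bit format such as Y8 and addresses each row through the stride (what bytesPerLine reports), since rows may be padded.

```python
def fill_plane(buf, width, height, stride, value):
    """Fill the visible pixels of an 8-bit plane, leaving row padding untouched.

    buf    -- writable buffer of at least stride * height bytes
    stride -- bytes per row, which may exceed width if rows are padded
    """
    row = bytes([value]) * width          # one row of pixel data
    view = memoryview(buf)
    for y in range(height):
        start = y * stride
        # Slice assignment needs source and target of equal length.
        view[start:start + width] = row


# Stand-in for a mapped 4x2 Y8 plane with 2 bytes of padding per row.
plane = bytearray(6 * 2)
fill_plane(plane, width=4, height=2, stride=6, value=255)
print(list(plane))
```

Building the fill data with bytes([value]) * n is also noticeably faster than a generator expression for large frames.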


BertModel transformers outputs string instead of tensor

I'm following this tutorial that codes a sentiment analysis classifier using BERT with the huggingface library and I'm having a very odd behavior. When trying the BERT model with a sample text I get a string instead of the hidden state. This is the code I'm using:
import transformers
from transformers import BertModel, BertTokenizer
print(transformers.__version__)
PRE_TRAINED_MODEL_NAME = 'bert-base-cased'
PATH_OF_CACHE = "/home/mwon/data-mwon/paperChega/src_classificador/data/hugingface"
tokenizer = BertTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
sample_txt = 'When was I last outside? I am stuck at home for 2 weeks.'
encoding_sample = tokenizer.encode_plus(
    sample_txt,
    max_length=32,
    add_special_tokens=True, # Add '[CLS]' and '[SEP]'
    return_token_type_ids=False,
    padding=True,
    truncation = True,
    return_attention_mask=True,
    return_tensors='pt', # Return PyTorch tensors
)
bert_model = BertModel.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
last_hidden_state, pooled_output = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
print([last_hidden_state,pooled_output])
that outputs:
4.0.0
['last_hidden_state', 'pooler_output']
While the answer from Aakash provides a solution to the problem, it does not explain the issue. Since one of the 3.X releases of the transformers library, the models do not return tuples anymore but specific output objects:
o = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
print(type(o))
print(o.keys())
Output:
transformers.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions
odict_keys(['last_hidden_state', 'pooler_output'])
You can return to the previous behavior by adding return_dict=False to get a tuple:
o = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask'],
    return_dict=False
)
print(type(o))
Output:
<class 'tuple'>
I do not recommend that, because with the output object you can select a specific part of the output unambiguously, without consulting the documentation, as the example below shows:
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], return_dict=False, output_attentions=True, output_hidden_states=True)
print('I am a tuple with {} elements. You do not know what each element represents without checking the documentation'.format(len(o)))
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], output_attentions=True, output_hidden_states=True)
print('I am a cool object and you can access my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are: {} '.format(o.keys()))
Output:
I am a tuple with 4 elements. You do not know what each element represents without checking the documentation
I am a cool object and you can access my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are: odict_keys(['last_hidden_state', 'pooler_output', 'hidden_states', 'attentions'])
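Why does unpacking give strings in the first place? As far as I can tell, the output object behaves like an ordered dictionary, and iterating over a dictionary yields its keys. The toy class below is written purely for illustration (ToyOutput is hypothetical and is not the transformers implementation); it mimics the three access styles and reproduces the surprising result of tuple unpacking.

```python
class ToyOutput(dict):
    """Hypothetical stand-in for a model output object (illustration only)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: fall back to keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        # Integer keys index into the values in insertion order.
        if isinstance(key, int):
            return list(self.values())[key]
        return super().__getitem__(key)


out = ToyOutput(last_hidden_state="<tensor A>", pooler_output="<tensor B>")
a, b = out                      # unpacking iterates over the keys
print([a, b])
print(out.last_hidden_state, out["pooler_output"], out[0])
```

The first print shows the key names, which matches the symptom in the question; the second shows that attribute, key, and index access all reach the values.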
I faced the same issue while learning how to implement BERT. I noticed that unpacking the result directly, as in
last_hidden_state, pooled_output = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
is the issue. Use:
outputs = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
and extract the last hidden state using
outputs[0]
You can refer to the BertModel documentation, which tells you what is returned.

DM Script to import a 2D image in text (CSV) format

Using the built-in "Import Data..." functionality we can import a properly formatted text file (like CSV and/or tab-delimited) as an image. It is rather straightforward to write a script to do so. However, my scripting approach is not efficient: it requires looping through each row (using the "StreamReadTextLine" function), so it takes a while to import a 512x512 image.
Is there a better way or an "undocumented" script function that I can tap into?
DigitalMicrograph offers an import functionality via the File/Import Data... menu entry, which will give you this dialog:
The functionality invoked by this dialog can also be accessed from a script, with the command
BasicImage ImageImportTextData( String img_name, ScriptObject stream, Number data_type_enum, ScriptObject img_size, Boolean lines_are_rows, Boolean size_by_counting )
As with the dialog, one has to pre-specify a few things.
The data type of the image.
This is a number. You can find out which number belongs to which image data type by, for example, creating an image and outputting its data type:
image img := Realimage( "", 4, 100 )
Result("\n" + img.ImageGetDataType() )
The file stream object
This object describes where the data is stored. The F1 help documentation explains how to create a file stream from an existing file, but essentially you need to specify a path to the file, open the file for reading (which gives you a handle), and then use that file handle to create the stream object.
string path = "C:\\test.txt"
number fRef = OpenFileForReading( path )
object fStream = NewStreamFromFileReference( fRef, 1 )
The image size object
This is a specific script object you need to allocate. It wraps image size information. In case of auto-detecting the size from the text, you don't need to specify the actual size, but you still need the object.
object imgSizeObj = Alloc("ImageData_ImageDataSize")
imgSizeObj.SetNumDimensions(2) // Not needed for counting!
imgSizeObj.SetDimensionSize(0,10) // Not used for counting
imgSizeObj.SetDimensionSize(1,10) // Not used for counting
Boolean checks
As with the checkboxes in the UI, you specify two conditions:
Lines are Rows
Get Size By Counting
Note that the "counting" flag is only used if "Lines are Rows" is also true, the same as in the dialog.
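To make the two checkboxes concrete, here is a rough Python sketch (not DM script, and not how DigitalMicrograph implements the import) of what "Lines are Rows" combined with "Get Size By Counting" amounts to: every text line becomes one image row, and the width and height are derived from the text itself. The function name parse_text_image is made up for this illustration.

```python
import csv
import io


def parse_text_image(text, delimiter=","):
    """Parse delimited text into rows of floats; size is found by counting."""
    rows = [
        [float(cell) for cell in row]
        for row in csv.reader(io.StringIO(text), delimiter=delimiter)
        if row                                   # skip blank lines
    ]
    height = len(rows)                           # one text line per image row
    width = len(rows[0]) if rows else 0
    if any(len(r) != width for r in rows):
        raise ValueError("rows have differing lengths")
    return rows, (width, height)


pixels, size = parse_text_image("1,2,3\n4,5,6\n")
print(size)
```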
The following script imports a text file with counting:
image ImportTextByCounting( string path, number DataType )
{
    number fRef = OpenFileForReading( path )
    object fStream = NewStreamFromFileReference( fRef, 1 )
    number bLinesAreRows = 1
    number bSizeByCount = 1
    bSizeByCount *= bLinesAreRows // Only valid together!
    object imgSizeObj = Alloc("ImageData_ImageDataSize")
    image img := ImageImportTextData( "Imag Name ", fStream, DataType, imgSizeObj, bLinesAreRows, bSizeByCount )
    return img
}
string path = "C:\\test.txt"
number kREAL4_DATA = 2
image img := ImportTextByCounting( path, kREAL4_DATA )
img.ShowImage()

How to define empty IndexedTables in Julia?

I am unable to define empty IndexedTables, e.g.
using IndexedTables, IndexedTables.Table
t = Table(Columns(a=Int64[],b=String[]),Int64[])
t[1,"a"] = 1
t[1,"b"] = 2
t[1,"c"] = t[1,"a"] + t[1,"b"]
BoundsError: attempt to access 0-element Array{Int64,1} at index [0]
I am aware that creating the IndexedTable with the data already in place is more efficient than creating an empty one and then inserting values, but sometimes you have no choice but to do it this way.
Is this a bug? If so, is there a workaround?
(I already posted this on the Julia forum, but so far I have had no replies there.)
This is probably a bug in IndexedTables.
Inserting into an IndexedTable requires reindexing to access the data. Reindexing is done with flush!.
But flush!(t) fails in the example in the question with the empty t.
Fixing flush! which calls _merge! can be done by:
julia> function IndexedTables._merge!(dst::IndexedTable, src::IndexedTable, f)
           if length(dst.index)==0 || isless(dst.index[end], src.index[1])
               append!(dst.index, src.index)
               append!(dst.data, src.data)
           else
               # merge to a new copy
               new = _merge(dst, src, f)
               ln = length(new)
               # resize and copy data into dst
               resize!(dst.index, ln)
               copy!(dst.index, new.index)
               resize!(dst.data, ln)
               copy!(dst.data, new.data)
           end
           return dst
       end
julia> t[1,"c"] = t[1,"a"] + t[1,"b"]
3
The change is the addition of the length(...) check in the first if.
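The same guard can be illustrated outside Julia. The Python sketch below is a simplified analogue that I wrote for illustration only: it merges plain sorted index lists, ignores the data column and the combining function f, and merge_sorted is a hypothetical name. The point is just that the emptiness test must come before the last element is read.

```python
def merge_sorted(dst_index, src_index):
    """Append src to dst when dst is empty or all of src sorts after dst."""
    if not src_index:
        return dst_index
    # Checking emptiness first avoids reading dst_index[-1] on an empty list.
    if not dst_index or dst_index[-1] < src_index[0]:
        dst_index.extend(src_index)
    else:
        # General case: rebuild the merged, de-duplicated index in place.
        dst_index[:] = sorted(set(dst_index) | set(src_index))
    return dst_index


print(merge_sorted([], [1, 2]))
print(merge_sorted([1, 5], [3]))
```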
Of course, a pull request or issue should be opened with IndexedTables.jl. Antonello, will you do this (or shall I)?

Can LLDB data formatters call methods?

I'm debugging a Qt application using LLDB. At a breakpoint I can write
(lldb) p myQString.toUtf8().data()
and see the string contained within myQString, as data() returns char*. I would like to be able to write
(lldb) p myQString
and get the same output. This didn't work for me:
(lldb) type summary add --summary-string "${var.toUtf8().data()}" QString
Is it possible to write a simple formatter like this, or do I need to know the internals of QString and write a python script?
Alternatively, is there another way I should be using LLDB to view QStrings this way?
The following does work.
First, register your summary command:
debugger.HandleCommand('type summary add -F set_sblldbbp.qstring_summary "QString"')
Here is an implementation
def make_string_from_pointer_with_offset(F,OFFS,L):
    strval = 'u"'
    try:
        data_array = F.GetPointeeData(0, L).uint16
        for X in range(OFFS, L):
            V = data_array[X]
            if V == 0:
                break
            strval += unichr(V)
    except:
        pass
    strval = strval + '"'
    return strval.encode('utf-8')
#qt5
def qstring_summary(value, unused):
    try:
        d = value.GetChildMemberWithName('d')
        #have to divide by 2 (size of unsigned short = 2)
        offset = d.GetChildMemberWithName('offset').GetValueAsUnsigned() / 2
        size = get_max_size(value)
        return make_string_from_pointer_with_offset(d, offset, size)
    except:
        print '?????????????????????????'
        return value
def get_max_size(value):
    _max_size_ = None
    try:
        debugger = value.GetTarget().GetDebugger()
        _max_size_ = int(lldb.SBDebugger.GetInternalVariableValue('target.max-string-summary-length', debugger.GetInstanceName()).GetStringAtIndex(0))
    except:
        _max_size_ = 512
    return _max_size_
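The code above targets Python 2 (unichr, print statement). On a Python 3 build of LLDB, unichr no longer exists, and converting one code unit at a time would split characters outside the Basic Multilingual Plane into surrogate halves. One possible alternative is to collect the 16-bit units and decode them in one go. The helper below is a hypothetical sketch; feeding it the uint16 values read from the debugger is an assumption I have not verified in a live session.

```python
import struct


def decode_utf16_units(units):
    """Decode a sequence of 16-bit code units, stopping at the first NUL."""
    trimmed = []
    for unit in units:
        if unit == 0:
            break
        trimmed.append(unit)
    # Pack the units as little-endian unsigned shorts, then decode as UTF-16.
    raw = struct.pack("<%dH" % len(trimmed), *trimmed)
    # errors="replace" keeps a lone surrogate from raising.
    return raw.decode("utf-16-le", errors="replace")


print(decode_utf16_units([0x48, 0x69, 0, 0x41]))
```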
It is expected that what you tried to do won't work. The summary strings feature does not allow calling expressions.
Calling expressions in a debugger is always interesting, in a data formatter more so (if you're in an IDE - say Xcode - formatters run automatically). Every time you stop somewhere, even if you just stepped over one line, all these little expressions would all automatically run over and over again, at a large performance cost - and this is not even taking into account the fact that your data might be in a funny state already and running expressions has the potential to alter it even more, making your debugging sessions trickier than needed.
If the above wall of text still hasn't discouraged you ( :-) ), you want to write a Python formatter and use the SB API to run your expression. Your value is an SBValue object, which has access to an SBFrame and an SBTarget. The combination of these two allows you to run EvaluateExpression("blah") and get back another SBValue, probably a char*, on which you can then call GetSummary() to get your C string back.
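A minimal sketch of that approach might look like the following. It is untested against a real Qt target, the function name qstring_expr_summary is made up, and building the expression from the value's load address with a cast is my assumption about how to refer back to the object; the QByteArray temporary returned by toUtf8() may also not outlive the expression, so treat the result with care.

```python
def qstring_expr_summary(valobj, internal_dict):
    """Summary provider that calls toUtf8().data() on the inspected QString."""
    # Refer to the object by address so the expression works for any variable.
    addr = valobj.GetLoadAddress()
    expr = "((QString *)%d)->toUtf8().data()" % addr
    result = valobj.GetFrame().EvaluateExpression(expr)
    if not result.GetError().Success():
        return "<expression failed>"
    # For a char* result, GetSummary() gives the quoted C string.
    return result.GetSummary()
```

Assuming the function lives in an imported module named mymodule (a placeholder), it would be registered with: type summary add -F mymodule.qstring_expr_summary QString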
If, on the other hand, you are now persuaded that running expressions in formatters is suboptimal, the good news is that QString most certainly has to store its data pointer somewhere. If you find out where that is, you can just write a formatter as ${var.member1.member2.member3.theDataPointer} and obtain the same result!
This is my trial-and-error adaptation of a UTF-16 string interpretation LLDB script I found online (I apologise that I don't remember the source and can't credit the author).
Note that this is for Qt 4.3.2 and versions close to it, as the handling of the 'data' pointer changed between then and Qt 5.x.
def QString_SummaryProvider(valobj, internal_dict):
    data = valobj.GetChildMemberWithName('d')#.GetPointeeData()
    strSize = data.GetChildMemberWithName('size').GetValueAsUnsigned()
    newchar = -1
    i = 0
    s = u'"'
    while newchar != 0:
        # read next wchar character out of memory
        data_val = data.GetChildMemberWithName('data').GetPointeeData(i, 1)
        size = data_val.GetByteSize()
        e = lldb.SBError()
        if size == 1:
            newchar = data_val.GetUnsignedInt8(e, 0) # utf-8
        elif size == 2:
            newchar = data_val.GetUnsignedInt16(e, 0) # utf-16
        elif size == 4:
            newchar = data_val.GetUnsignedInt32(e, 0) # utf-32
        else:
            s = s + '<unexpected char size - error parsing QString>'
            break
        if e.fail:
            s = s + '<parse error:' + e.why() + '>'
            break
        i = i + 1
        if i > strSize:
            break
        # add the character to our string 's'
        # print "char2 = %s" % newchar
        if newchar != 0:
            s = s + unichr(newchar)
    s = s + u'"'
    return s.encode('utf-8')

How to save QUndoCommand to a file and reload it?

I am using Qt's undo framework, which uses QUndoCommand to give an application undo support.
Is there an easy way to save those QUndoCommands to a file and reload them?
There's no built-in way. I don't think it's very common to save the undo stack between sessions. You'll have to serialize the commands yourself by iterating through the commands on the stack, and saving each one's unique data using QDataStream. It might look something like this:
...
dataStream << undoStack->count(); // store number of commands
for (int i = 0; i < undoStack->count(); i++)
{
    // store each command's unique information
    dataStream << undoStack->command(i)->someMemberVariable;
}
...
Then you would use QDataStream again to deserialize the data back into QUndoCommands.
You can use QFile to handle the file management.
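The count-then-records layout described above can be tried without Qt. The following Python sketch uses struct and an in-memory stream in place of QDataStream and QFile; the function names and the choice to store each command as a text description are assumptions made for illustration, and a real command would store whatever data it needs to redo itself.

```python
import io
import struct


def save_commands(stream, commands):
    """Write a command count, then each command's text as length-prefixed UTF-8."""
    stream.write(struct.pack("<I", len(commands)))
    for text in commands:
        data = text.encode("utf-8")
        stream.write(struct.pack("<I", len(data)))
        stream.write(data)


def load_commands(stream):
    """Read back what save_commands wrote."""
    (count,) = struct.unpack("<I", stream.read(4))
    commands = []
    for _ in range(count):
        (length,) = struct.unpack("<I", stream.read(4))
        commands.append(stream.read(length).decode("utf-8"))
    return commands


buffer = io.BytesIO()
save_commands(buffer, ["insert 'a' at 0", "delete 3..5"])
buffer.seek(0)
print(load_commands(buffer))
```

On load, each record would then be turned back into a command object and pushed onto a fresh undo stack in the same order.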
Use Qt's serialization as described here:
Serialization with Qt
Then within your QUndoCommands you can use a temp file to write the data to it:
http://qt-project.org/doc/qt-4.8/qtemporaryfile.html
However, this might cause a problem: each file is kept open, so on some platforms (Linux) you may run out of open file handles.
To combat this you'd have to create some other factory-type object which handles your commands, and which could then pass in a reference to a QTemporaryFile automatically. This factory/QUndoCommand caretaker object must have the same lifetime as the QUndoCommands. If not, the temp file will be removed from disk and your QUndoCommands will break.
The other thing you can do is use QUndoCommand as a proxy to your real undo command. This can save quite a bit of memory, since once your undo command is saved to file you can delete the internal pointer or set it to null, and restore it later.
Here's a PyQt solution for serializing/pickling QUndoCommands. The tricky part was getting the parent to call __init__ first, then the children. This method relies on all the children's __setstate__ to be called before the parent's, which happens upon pickling as children are returned in the parent's __getstate__.
class UndoCommand(QUndoCommand):
    """
    For pickling
    """
    def __init__(self, text, parent=None):
        QUndoCommand.__init__(self, text, parent)
        self.__parent = parent
        self.__initialized = True
        # defined and initialized in __setstate__
        # self.__child_states = {}
    def __getstate__(self):
        return {
            **{k: v for k, v in self.__dict__.items()},
            '_UndoCommand__initialized': False,
            '_UndoCommand__text': self.text(),
            '_UndoCommand__children':
                [self.child(i) for i in range(self.childCount())]
        }
    def __setstate__(self, state):
        if hasattr(self, '_UndoCommand__initialized') and \
                self.__initialized:
            return
        text = state['_UndoCommand__text']
        parent = state['_UndoCommand__parent'] # type: UndoCommand
        if parent is not None and \
                (not hasattr(parent, '_UndoCommand__initialized') or
                 not parent.__initialized):
            # will be initialized in parent's __setstate__
            if not hasattr(parent, '_UndoCommand__child_states'):
                setattr(parent, '_UndoCommand__child_states', {})
            parent.__child_states[self] = state
            return
        # init must be called on unpickle-time to recreate Qt object
        UndoCommand.__init__(self, text, parent)
        for child in state['_UndoCommand__children']:
            child.__setstate__(self.__child_states[child])
        self.__dict__ = {k: v for k, v in state.items()}
    @staticmethod
    def from_QUndoCommand(qc: QUndoCommand, parent=None):
        if type(qc) == QUndoCommand:
            qc.__class__ = UndoCommand
        qc.__initialized = True
        qc.__parent = parent
        children = [qc.child(i) for i in range(qc.childCount())]
        for child in children:
            UndoCommand.from_QUndoCommand(child, parent=qc)
        return qc
