I am experimenting with Julia's text mining module.
When I feed the Corpus function a DataArray{TextAnalysis.StringDocument,1}, I get a convert error (note that I am using the Lazy package to pipeline commands):
using Lazy, TextAnalysis, DataArrays
@>> @data(["hello","bro"]) map(StringDocument) Corpus
->LoadError: MethodError: `convert` has no method matching convert(::Type{TextAnalysis.Corpus}, ::DataArrays.DataArray{TextAnalysis.StringDocument,1})
This may have arisen from a call to the constructor TextAnalysis.Corpus(...),
since type constructors fall back to convert methods.
WARNING: Error showing method candidates, aborted
I need to apply convert(Vector{GenericDocument}) to have this piece of code work:
@>> @data(["hello","bro"]) map(StringDocument) convert(Vector{GenericDocument}) Corpus
Here's the Corpus function:
type Corpus
    documents::Vector{GenericDocument}
    total_terms::Int
    lexicon::Dict{Compat.UTF8String, Int}
    inverse_index::Dict{Compat.UTF8String, Vector{Int}}
    h::TextHashFunction
end
function Corpus(docs::Vector{GenericDocument})
    Corpus(
        docs,
        0,
        Dict{Compat.UTF8String, Int}(),
        Dict{Compat.UTF8String, Vector{Int}}(),
        TextHashFunction()
    )
end
Corpus(docs::Vector{Any}) = Corpus(convert(Array{GenericDocument,1}, docs))
What am I missing here?
There are two issues with the code.
First, @data creates a DataArray, which the TextAnalysis code does not know how to handle. Writing an explicit convert, as you do, is a perfectly acceptable way to deal with this. Is there a particular reason you are using DataArrays rather than plain arrays?
Second, even if you remove @data, the code currently fails. That is because the Corpus constructor only accepts a Vector{GenericDocument} or a Vector{Any}, and Julia arrays are not covariant, so a Vector{StringDocument} matches neither signature.
One workaround, again, is to convert explicitly. A proper fix is to define a new constructor as below. This should probably be added to the package, but you can use the following code in your REPL directly:
TextAnalysis.Corpus{T<:AbstractDocument}(docs::Vector{T}) = TextAnalysis.Corpus(convert(Array{GenericDocument,1}, docs))
With this, you can do:
julia> @>> ["hello","bro"] map(StringDocument) Corpus
A Corpus
I have found similar questions but none that worked for my situation, so I am asking my own.
I want to use a library function that takes a pointer to a std::vector, and fills it with data.
I already have a C++/CLI Wrapper set up.
I am currently trying to instantiate the vector in the wrapper,
private:
    std::vector<int>* outputVector;
and in the constructor, I instantiate it :
outputVector = new std::vector<int>();
Now, in the wrapper method that calls the c++ library function :
m_pUnmanagedTPRTreeClass->GetInRegion(..., &outputVector)
I omitted the other parameters because they don't matter for this case. I can already use other functions of the library and they work without a problem. I just can't manage to pass a pointer to a std::vector.
With the code like this, I get the error message :
error C2664: 'TPSimpleRTree<CT,T>::GetInRegion' : cannot convert parameter 3 from 'cli::interior_ptr<Type>' to 'std::vector<_Ty> &'
I have tried removing the "&", as I am not great at C++ and am unsure how to correctly use pointers. Then, the error becomes :
error C2664: 'TPSimpleRTree<CT,T>::GetInRegion' : cannot convert parameter 3 from 'std::vector<_Ty> *' to 'std::vector<_Ty> &'
EDIT: I have tried replacing "&" by "*", it does not work, I get the error :
cannot convert from 'std::vector<_Ty>' to 'std::vector<_Ty> &'
The signature of the C++ function that takes the vector is:
GetInRegion(..., std::vector<T*>& a_objects)
Given the signature:
GetInRegion(..., std::vector<T*>& a_objects)
You would call this (in C++ or C++/CLI) like:
std::vector<int*> v;
m_pUnmanagedTPRTreeClass->GetInRegion(..., v);
Then you can manipulate the data as needed or marshal it into a .NET container.
'std::vector<_Ty> *' to 'std::vector<_Ty> &'
is self-explanatory: you need to dereference the pointer instead of taking its address, so instead of:
m_pUnmanagedTPRTreeClass->GetInRegion(..., &outputVector)
use:
m_pUnmanagedTPRTreeClass->GetInRegion(..., *outputVector)
^~~~~~~!!
After your edit, I see that the GetInRegion signature is:
GetInRegion(..., std::vector<T*>& a_objects)
So it accepts a std::vector<T*>, a vector whose elements are pointers, while you are trying to pass a std::vector<int>, whose element type int is not a pointer.
I have a dictionary that maps a key to a function object. Then, using Spark 1.4.1 (Spark may not even be relevant for this question), I try to map each object in the RDD using a function object retrieved from the dictionary, which acts as a look-up table. Here is a small snippet of my code:
fnCall = groupFnList[0].fn
pagesRDD = pagesRDD.map(lambda x: [x, fnCall(x[0])]).map(shapeToTuple)
Here the function object has been fetched from a namedtuple and temporarily stored in fnCall (that is, fnCall points to the fn object). Then, using the map operation, I want the x[0] element of each tuple to be processed by that function.
All works fine and good in that there indeed IS a fn object, but it behaves in a weird way.
Each time I call an action method on the RDD, even without having used a fn obj in between, the RDD values have changed! To visualize this I have created dummy functions for the fn objects that just output a random integer. After calling the fn obj on the RDD, I can inspect it with .take() or .first() and get the following:
pagesRDD.first()
>>> [(u'myPDF1.pdf', u'34', u'930', u'30')]
pagesRDD.first()
>>> [(u'myPDF1.pdf', u'23', u'472', u'11')]
pagesRDD.first()
>>> [(u'myPDF1.pdf', u'4', u'69', u'25')]
So it seems to me that the RDD's elements have the functions bound to them in some way, and each time I do an action operation (like .first(), very simple) it 'updates' the RDD's contents.
I don't want this to happen! I just want the function to process the RDD ONLY when I call it with a map operation. How can I 'unbind' this function after the map operation?
Any ideas?
Thanks!
####### UPDATE:
So apparently rewriting my code to call it like pagesRDD.map(fnCall) should do the trick, but why should this even matter? If I call
rdd = rdd.map(lambda x: (x,1))
rdd.first()
>>> # some output
rdd.first()
>>> # same output as before!
So in this case, the lambda function does not seem to get bound to the RDD and is not called each time I do a .take()-like action. Why is it different when I use a fn object INSIDE the lambda? Logically it just does not make sense to me. Is there an explanation for this?
If you redefine your functions so that their parameter is an iterable, your code should look like this:
pagesRDD = pagesRDD.map(fnCall).map(shapeToTuple)
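A likely explanation for the changing values, offered as a sketch rather than a confirmed diagnosis of your job: Spark transformations such as map are lazy, and an RDD that has not been cached is recomputed from its lineage every time an action like .first() or .take() runs. A function that returns random integers therefore produces new values on each action, while a deterministic lambda such as lambda x: (x, 1) is recomputed too but happens to give the same output. The plain-Python model below imitates this without Spark. LazyDataset, random_label and the sample file names are hypothetical and not part of any Spark API, and the model recomputes every row on each action, which is a simplification.

```python
import random


class LazyDataset:
    """Hypothetical stand-in for an RDD: it records transformations
    and only runs them when an action such as first() is called."""

    def __init__(self, source, transforms=(), cached=False):
        self._source = list(source)
        self._transforms = tuple(transforms)
        self._cached = cached
        self._materialized = None

    def map(self, fn):
        # A transformation: nothing is computed here, the chain just grows.
        return LazyDataset(self._source, self._transforms + (fn,))

    def cache(self):
        # Ask for the first computed result to be kept and reused.
        self._cached = True
        return self

    def _compute(self):
        if self._materialized is not None:
            return self._materialized
        rows = self._source
        for fn in self._transforms:
            rows = [fn(row) for row in rows]
        if self._cached:
            self._materialized = rows
        return rows

    def first(self):
        # An action: replays the whole chain unless a cached result exists.
        return self._compute()[0]


calls = []


def random_label(name):
    # Dummy function like the one in the question: non-deterministic output.
    calls.append(name)
    return random.randint(0, 10**9)


pages = LazyDataset([("myPDF1.pdf",), ("myPDF2.pdf",)])

# Without caching, every action runs random_label again.
uncached = pages.map(lambda x: [x, random_label(x[0])])
uncached.first()
uncached.first()
uncached_calls = len(calls)

# With caching, the second action reuses the rows from the first one.
cached = pages.map(lambda x: [x, random_label(x[0])]).cache()
first_result = cached.first()
second_result = cached.first()
cached_calls = len(calls) - uncached_calls
```

In real Spark the corresponding step would be calling pagesRDD.cache() (or persist()) after the map, before running several actions. Cached partitions can still be evicted and recomputed, so if the values must never change, a deterministic function is the safer choice.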
I have a function expecting array pointers in Cython, e.g. with the signature
cdef void foo(DTYPE_t* x)
and a function which receives a typed memoryview from which I would like to call the first function, e.g.:
def bar(DTYPE_t[:,::1] X not None):
    foo(X[0])
This naturally does not even compile. I've been trying for some hours now to figure out a way to access the data pointer underlying the memoryview, i.e. something like X.data.
Is there a way to achieve this? Sadly, I cannot adapt foo to accept memoryviews.
You want this:
foo(&X[0,0])
The solution is so simple that it's quite embarrassing:
&X[i,j]
i.e. the call will become
foo(&X[i,0])
Which, by the way, also works with the old style numpy arrays, which are initialized like
object[int, ndim=2, mode='strided'] X
PS: If you would like to pass a C array, X[i][j] indexing would be required, which works equally well for typed memoryviews.
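As a rough analogy in plain Python with ctypes (not Cython, and not how Cython implements memoryviews), the sketch below shows what "the address of the first element of row i" means for a C-contiguous 2D block: the pointer for row i sits i * COLS elements past the start of the buffer, and writes through it land in the original data. Row, Matrix, row_pointer and the sizes are hypothetical names chosen for the illustration.

```python
import ctypes

ROWS, COLS = 3, 4
Row = ctypes.c_double * COLS      # one contiguous row of doubles
Matrix = Row * ROWS               # ROWS rows laid out back to back (C order)


def row_pointer(matrix, i):
    """Return a double* pointing at matrix[i][0], similar in spirit to &X[i,0]."""
    return ctypes.cast(matrix[i], ctypes.POINTER(ctypes.c_double))


matrix = Matrix()
for r in range(ROWS):
    for c in range(COLS):
        matrix[r][c] = r * 10 + c

ptr = row_pointer(matrix, 1)

# Byte distance between the start of row 1 and the start of the buffer.
offset = ctypes.addressof(matrix[1]) - ctypes.addressof(matrix)

# Reading and writing through the pointer touches the original memory.
first_in_row = ptr[0]
ptr[2] = 99.0
```

The point of the comparison is only the memory layout: with a C-contiguous memoryview declared as DTYPE_t[:,::1], the elements of a row are adjacent, so a function that takes a DTYPE_t* can walk along the row starting from that one address.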
I am very new to IDL so forgive me if this seems dumb. I am trying to simply read a .tif image and let IDL show the image. My commands were:
IDL> a=read_image('frame_1.tif')
IDL> help, a
then I receive
A BYTE = Array[3, 560, 420]
IDL> plotimage ,bytscl(a)
But after I execute the last command, I receive "Keyword parameters not allowed in call." I don't understand what I did wrong. Any ideas?
Thank you in advance.
I'm not sure what is going on, but one thing that seems to generate that error message is that IDL gets confused between arrays (which can use parens to index) and function calls. Try using strictarr before the call:
compile_opt strictarr
This will mean that you must use square brackets to index arrays and parens for function calls.
Note that you have to put this into every routine you are having trouble with, and at the command line as well.