Global word embedding for local word embeddings - vector

Imagine that, based on some criteria, we have three vectors (vec1, vec2, vec3) for the word king, and we call these the local vectors for king. What is an adequate way to generate a global (single, unique) vector for the word king from these three local vectors (vec1, vec2, vec3) that can be used in a downstream task?
There are three possible options:
Concat(vec1, vec2, vec3)
average(vec1, vec2, vec3)
sum(vec1, vec2, vec3)
Are they sufficient, and why?
Any references?

You haven't stated how those 3 vectors were created, and that matters. If the method by which they were created means they all share, in some important sense, the "same coordinate system", then it might be appropriate to add or average them.
But, if they're derived in unrelated ways, and thus their individual coordinates aren't part of the same self-consistent/comparable system, then concatenation makes more sense, preserving their individual information – forwarding all that info to downstream steps, without any assumptions about what's more important, nor allowing any 'cancelling-out' of position info from the random/arbitrary interaction of unrelated coordinate-systems.
Also, if vec1, vec2, and vec3 have different dimensionalities, concatenation always works, but sum/average won't.
(I could possibly give more reasoning if you added more concrete information about the different sources of vec1, vec2, vec3.)
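To make the trade-off concrete, here is a minimal sketch in plain Python. The vector values and the helper names (concat, vec_sum, vec_average) are made up for illustration; nothing here comes from a particular embedding library.

```python
# Hypothetical local vectors for "king"; the values are invented for illustration.
vec1 = [0.2, -0.1, 0.4]
vec2 = [0.0, 0.3, 0.1]
vec3 = [-0.2, 0.1, 0.5]


def concat(*vecs):
    """Concatenate vectors end to end; works for any mix of dimensionalities."""
    return [x for v in vecs for x in v]


def vec_sum(*vecs):
    """Element-wise sum; only defined when all vectors have the same length."""
    if len({len(v) for v in vecs}) != 1:
        raise ValueError("sum/average need vectors of equal dimensionality")
    return [sum(xs) for xs in zip(*vecs)]


def vec_average(*vecs):
    """Element-wise mean; same length requirement as vec_sum."""
    return [s / len(vecs) for s in vec_sum(*vecs)]


print(len(concat(vec1, vec2, vec3)))      # 9: every coordinate is preserved
print(len(vec_average(vec1, vec2, vec3)))  # 3: coordinates are mixed together

# Cancelling-out: two opposite vectors average to zero,
# while concatenation keeps both of them intact.
print(vec_average([1.0, -1.0], [-1.0, 1.0]))  # [0.0, 0.0]
print(concat([1.0, -1.0], [-1.0, 1.0]))       # [1.0, -1.0, -1.0, 1.0]
```

The last two lines show the 'cancelling-out' risk: averaging vectors from unrelated coordinate systems can erase information that concatenation forwards unchanged.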

What is the difference between a list and a pairlist in R?

In reading the documentation for lists, I found references to pairlists, but it wasn't clear to me how they were different from lists.
Pairlists in day to day R
There are two places that pair lists will show up commonly in day to day R. One is as function formals:
str(formals(var))
The other is as language objects. For example:
quote(1 + 1)
produces a pairlist of type language (LANGSXP internally). The principal reason you would care about this is that operations such as length(<language object>) or language_object[[x]] can be slow because of how pairlists are stored internally (though long pairlist language objects are somewhat rare; note that expressions are not pairlists).
Note that empty elements are just zero length symbols, and you can actually store them in lists if you cheat a bit (though you probably shouldn't do this):
list(x=substitute(x, alist(x=))) # hack alert
All that said, for the most part, OP is correct that you don't need to worry about pairlists too much unless you are writing C code for use in R.
Internal differences between lists and pairlists
Pairlists and lists differ principally in their storage structure. Pairlists are stored as a chain of nodes, where each node points to the location of the next node in addition to the node's contents and the node's "name" (see the CAR/CDR wiki article for a generic discussion). Among other things, this means you can't know how many elements there are in a pairlist unless you know which element is the first one and then traverse the entire list.
Pairlists are used extensively in the R internals, and do exist in normal R use, but most of the time are disguised by the print or access methods and/or coerced to lists when accessed.
Lists are also a list of addresses, but unlike pairlists, all the addresses are stored in one contiguous memory location and the total length is tracked. This makes it easy to access any arbitrary member of the list by location since you can just look up the address in the memory table. With a pairlist, you would have to jump from node to node until you eventually got to the desired node. Names are also stored as attributes of the list proper, instead of being attached to each node of a pairlist.
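The storage difference described above can be sketched in Python (not R, and not how R implements it internally). The Node class and the chain_length and chain_get helpers are hypothetical names used only for this illustration; CPython's built-in list stands in for the contiguous table of addresses.

```python
class Node:
    """One pairlist-style cell: a value, an optional tag (name), and a link to the next cell."""

    def __init__(self, value, tag=None, next_node=None):
        self.value = value
        self.tag = tag
        self.next = next_node


def chain_length(head):
    """The length is not stored anywhere, so count by walking the whole chain."""
    n = 0
    while head is not None:
        n += 1
        head = head.next
    return n


def chain_get(head, index):
    """Reaching element `index` (0-based) means hopping through `index` links."""
    for _ in range(index):
        head = head.next
    return head.value


# A chain of three tagged cells: first -> second -> third.
chain = Node("A", "first", Node("B", "second", Node("C", "third")))

# An array-backed list: references stored together, length tracked by the object.
array = ["A", "B", "C"]

print(chain_length(chain), len(array))  # 3 3
print(chain_get(chain, 2), array[2])    # C C
```

Both structures give the same answers; the difference is that the chain needs a traversal for length and for positional access, while the array-backed list can answer both directly.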
Benefits of pairlists
One (generally small) benefit of pairlists is that you can add to them with minimal overhead since you only need modify at most two nodes (the node ahead of the new node, and the new node itself), whereas with a list you may need to re-allocate the entire address table with an increase in size (this is typically not much of an issue since the address table is usually very small compared to the size of the data the table points to). There are also many algorithms that specialize in pairlist manipulation (e.g. sorting, indexing, etc.), but those can be ported to normal lists as well.
Less relevant for day-to-day use since you can only do this in internals, it is very easy to modify list from a programming perspective by changing what any arbitrary element points to.
Loosely related to the above, pairlists are likely to be more efficient when you have highly nested objects. Lists can easily replicate this structure, but each list and nested list will be saddled with the extra memory address table. This is likely the reason pairlists are used for language objects, which very likely have a high nesting / element ratio.
For more details see R Internals (look for LISTSXP and VECSXP, pairlists and lists respectively, in the linked location).
edit: interestingly an experiment to compare the memory footprint of a list to a pairlist shows the pairlist to be larger, so the storage efficiency argument may be incorrect (not sure if object.size can be trusted here):
> plist_to_list <- function(x) {
+ if(is.call(x)) x <- as.list(x)
+ if(length(x) > 1) for(i in 2:length(x)) x[[i]] <- Recall(x[[i]])
+ x
+ }
> add_quote <- function(x, y) call("+", x, y)
> x <- Reduce(add_quote, lapply(letters, as.name))
> object.size(x)
7056 bytes
> y <- plist_to_list(x)
> object.size(y)
4656 bytes
First of all, pairlists are deprecated
pairlists are deprecated for normal use because "generic vectors" are typically more efficient. You won't ever need to worry about them unless you are working on R internals.
lists can contain named elements
Each element in a list in R can have a name. You can access each element in a list either by name or by its numerical index.
Here is an example of a list in which the second element is named 'second':
> my.list <- list('A',second='B','C')
> my.list
[[1]]
[1] "A"
$second
[1] "B"
[[3]]
[1] "C"
Every element can be indexed by its position in the list. Named elements can additionally be accessed by name:
> my.list[[2]]
[1] "B"
> my.list$second
[1] "B"
Also, each element in a list is a vector, even if it is only a vector containing a single element. For more about lists, see How to Correctly Use Lists in R?.
pairlists can contain empty named elements
A pairlist is basically the same as a list, except that a pairlist can contain an empty named element, but a list cannot. Also, a pairlist is constructed using the alist function.
> list('A',second=,'C')
Error in as.pairlist(list(...)) : argument is missing, with no default
> alist('A',second=,'C')
[[1]]
[1] "A"
$second
[[3]]
[1] "C"
But, as mentioned earlier, they are deprecated. They do not have any benefit or advantage over lists that I know of.

What is the difference between permutations and derangements?

I have been asked to write a program that produces the different combinations of a set of numbers entered by the user, and when I researched this I found examples that use the terms permutations and derangements.
I cannot find a clear explanation of the difference between them. A further term that comes up is combinations. Could anyone provide a simple one-liner that clarifies the difference?
Thanks in advance.
http://en.wikipedia.org/wiki/Permutation
The notion of permutation relates to the act of rearranging, or permuting, all the members of a set into some sequence or order (unlike combinations, which are selections of some members of the set where order is disregarded). For example, written as tuples, there are six permutations of the set {1,2,3}, namely: (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1). As another example, an anagram of a word, all of whose letters are different, is a permutation of its letters.
http://en.wikipedia.org/wiki/Derangement
In combinatorial mathematics, a derangement is a permutation of the elements of a set such that none of the elements appear in their original position.
The number of derangements of a set of size n, usually written D_n, d_n, or !n, is called the "derangement number" or "de Montmort number". (These numbers are generalized to rencontres numbers.) The subfactorial function (not to be confused with the factorial n!) maps n to !n. No standard notation for subfactorials is agreed upon; n¡ is sometimes used instead of !n.
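As a small illustration of the three terms, here is a sketch in Python using the standard itertools module. The helper names derangements and subfactorial are made up for this example, and the derangement filter assumes the elements are distinct.

```python
from itertools import combinations, permutations


def derangements(items):
    """All permutations in which no element stays in its original position.

    Assumes the elements of `items` are distinct.
    """
    items = list(items)
    return [p for p in permutations(items)
            if all(p[i] != items[i] for i in range(len(items)))]


def subfactorial(n):
    """!n via the recurrence !n = (n - 1) * (!(n - 1) + !(n - 2)), with !0 = 1 and !1 = 0."""
    a, b = 1, 0  # !0 and !1
    for k in range(2, n + 1):
        a, b = b, (k - 1) * (a + b)
    return a if n == 0 else b


s = [1, 2, 3]
print(list(permutations(s)))     # all 6 orderings of the whole set
print(derangements(s))           # [(2, 3, 1), (3, 1, 2)]
print(list(combinations(s, 2)))  # [(1, 2), (1, 3), (2, 3)]: selections, order ignored
print(subfactorial(4))           # 9
```

In short: a permutation is any reordering of all elements, a derangement is a permutation that leaves nothing in place, and a combination is a selection of some elements where order does not matter.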

Count negative numbers in list using list comprehension

Working through the first edition of "Introduction to Functional Programming", by Bird & Wadler, which uses a theoretical lazy language with Haskell-ish syntax.
Exercise 3.2.3 asks:
Using a list comprehension, define a function for counting the number
of negative numbers in a list
Now, at this point we're still scratching the surface of lists. I would assume the intention is that only concepts that have been introduced at that point should be used, and the following have not been introduced yet:
A function for computing list length
List indexing
Pattern matching i.e. f (x:xs) = ...
Infinite lists
All the functions and operators that act on lists - with one exception - e.g. ++, head, tail, map, filter, zip, foldr, etc
What tools are available?
A maximum function that returns the maximal element of a numeric list
List comprehensions, with possibly multiple generator expressions and predicates
The notion that the output of the comprehension need not depend on the generator expression, implying the generator expression can be used for controlling the size of the generated list
Finite arithmetic sequence lists i.e. [a..b] or [a, a + step..b]
I'll admit, I'm stumped. Obviously one can extract the negative numbers from the original list fairly easily with a comprehension, but how does one then count them, with no notion of length or indexing?
The availability of the maximum function would suggest the end game is to construct a list whose maximal element is the number of negative numbers, with the final result of the function being the application of maximum to said list.
I'm either missing something blindingly obvious, or a smart trick, with a horrible feeling it may be the former. Tell me SO, how do you solve this?
My old -- and very yellowed copy of the first edition has a note attached to Exercise 3.2.3: "This question needs # (length), which appears only later". The moral of the story is to be more careful when setting exercises. I am currently finishing a third edition, which contains answers to every question.
By the way, did you answer Exercise 1.2.1, which asks you to write down all the ways that
square (square (3 + 7)) can be reduced to normal form? It turns out that there are 547 ways!
I think you may be assuming too many restrictions - taking the length of the filtered list seems like the blindingly obvious solution to me.
A couple of alternatives, though both involve some other function that you say hasn't been introduced yet:
sum [1 | x <- xs, x < 0]
maximum (0:[index | (index, ()) <- zip [1..] [() | x <- xs, x < 0]])
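For readers who want to run the two ideas, here is a rough transliteration into Python (an illustration only, not the book's language). The function names are made up, and itertools.count(1) stands in for the infinite list [1..].

```python
from itertools import count


def count_negatives_sum(xs):
    """Emit a 1 for every negative element and add the ones up."""
    return sum([1 for x in xs if x < 0])


def count_negatives_max(xs):
    """Number the negative elements 1, 2, 3, ... and take the largest number.

    The leading 0 covers the case where there are no negatives at all.
    """
    negatives = [() for x in xs if x < 0]
    return max([0] + [index for index, _ in zip(count(1), negatives)])


sample = [3, -1, 0, -7, -2]
print(count_negatives_sum(sample))  # 3
print(count_negatives_max(sample))  # 3
```

The second version mirrors the maximum-based trick: the comprehension's output depends only on how many negative elements were generated, not on their values.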

Fortran: multiplication with matrices only containing +1 and -1 as entries

What would be an efficient way (in terms of CPU-time and/or memory requirements) of multiplying, in fortran9x, an arbitrary M x N matrix, say A, only containing +1 and -1 as its entries (and fully populated!), with an arbitrary (dense) N-dimensional vector, v?
Many thanks,
Osmo
P.S. The size of A (i.e., M and N) is not known at the compilation time.
My guess is that it would be faster to just do the multiplication instead of trying to avoid the multiplication by checking the sign of the matrix element and adding/subtracting accordingly. Hence, just use a general optimized matrix-vector multiply routine. E.g. xGEMV from BLAS.
Depending on the usage scenario, if you have to apply the same matrix multiple times, you might separate it into two parts, one with the positive entries and one with the negatives.
With this you can avoid the need for multiplications; however, it would introduce an indirection, which might be more expensive than the multiplications.
Thus janneb's solution might be the most suitable.
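The two approaches discussed above can be compared for correctness with a small sketch. It is written in Python rather than Fortran, purely to show the logic; it says nothing about relative speed, and the function names are hypothetical.

```python
def matvec_multiply(A, v):
    """Plain matrix-vector product: one multiply-add per entry."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def split_signs(A):
    """For each row, precompute the column indices holding +1 and those holding -1."""
    return [([j for j, a in enumerate(row) if a == 1],
             [j for j, a in enumerate(row) if a == -1])
            for row in A]


def matvec_split(index_sets, v):
    """Multiplication-free product: add entries at +1 columns, subtract those at -1 columns.

    Note the indirect access v[j], which is the extra cost mentioned above.
    """
    return [sum(v[j] for j in pos) - sum(v[j] for j in neg)
            for pos, neg in index_sets]


A = [[1, -1, 1],
     [-1, -1, 1]]
v = [2.0, 3.0, 5.0]

print(matvec_multiply(A, v))            # [4.0, 0.0]
print(matvec_split(split_signs(A), v))  # [4.0, 0.0]
```

The index sets only need to be built once if the same matrix is applied to many vectors, which is the usage scenario where the split form might be worth measuring.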

Difference between a vector in maths and programming

Maybe this question is better suited in the math section of the site but I guess stackoverflow is suited too. In mathematics, a vector has a position and a direction, but in programming, a vector is usually defined as:
Vector v (3, 1, 5);
Where is the direction and magnitude? For me, this is a point, not a vector... So what gives? Probably I am not getting something so if anybody can explain this to me it would be very appreciated.
If we are working in cartesian coordinates, and assume (0,0,0) to be the origin, then a point p=(3,1,5) can be written as
p = 3i + 1j + 5k
where i, j and k are the unit vectors in the x, y and z directions. For convenience's sake, the unit vectors are dropped from programming constructs.
The magnitude of the vector is
|p| = sqrt(3^2 + 1^2 + 5^2) = sqrt(35)
and its direction cosines are
3/sqrt(35), 1/sqrt(35) and 5/sqrt(35)
respectively, both of which can be computed programmatically. You can also take dot products and cross-products, which I'm sure you know about. So the usage is consistent between programming and mathematics. The difference in notations is mostly because of convenience.
However as Tomas pointed out, in programming, it is also common to define a vector of strings or objects, which really have no mathematical meaning. You can consider such vectors to be a one dimensional array or a list of items that can be accessed or manipulated easily by indexing.
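For the numeric case, the magnitude and direction cosines of (3, 1, 5) discussed in this answer can be computed with a few lines of Python. This is a minimal sketch; the variable names are chosen only for this example.

```python
import math

# The components of p = 3i + 1j + 5k, with the origin as the implied base point.
p = (3.0, 1.0, 5.0)

# Magnitude: square root of the sum of squared components.
magnitude = math.sqrt(sum(c * c for c in p))

# Direction cosines: each component divided by the magnitude.
direction_cosines = tuple(c / magnitude for c in p)

print(magnitude)          # sqrt(35), roughly 5.92
print(direction_cosines)  # roughly (0.51, 0.17, 0.85)
```

The same three numbers therefore carry both the magnitude and the direction; they just are not stored separately.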
In mathematics, it is easy to represent a vector by a point - just say that the "base" of the vector is implied to be the origin. Thus, a mathematical point for all practical purposes is also a representation of a mathematical vector, and the vector in your example has the magnitude sqrt(3^2 + 1^2 + 5^2) = sqrt(35) ≈ 5.92 and the direction (3, 1, 5)/sqrt(35) ≈ (0.51, 0.17, 0.85) (a normalized vector from the origin).
However, a vector in programming usually has no geometrical use, which means you really aren't interested in things like magnitude or direction. A vector in programming is rather just an ordered list of items. Important here is that the items need not be numbers - it can be anything handled by the language in question! Thus, ("Hello", "little", "world") is also a vector in programming, although it (obviously) has no vector interpretation in the mathematical sense.
Practically speaking (!):
A vector in mathematics is only a direction without a position (actually something more general, but to stay in your terminology). In programming you often use vectors for points. You can think of your vector as the vector pointing from the origin (0,0,0) to the point (3,1,5), called the location vector of the point. Consult texts on analytical and affine geometry for more insight.
A vector in computer science is a "one dimensional" data structure (array) (which can be thought of as a direction) with a usually dynamic size (length/magnitude). That is why it is called a vector, but in the end it is an array.
A vector also means a set of coordinates. This is how it is used in programming. Just as a set of numbers. You might want to represent position vectors, velocity vectors, momentum vectors, force vectors with a vector object, or you may wish to represent it any way that suits you.
Many times vector quantities may be represented by 4 coordinates instead of 3 (see homogeneous coordinates in computer graphics) so a physical vector is represented by a computer vector with 4 elements. Alternatively you can store direction and magnitude separately, or encode them with 3, 4 or more coordinates.
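The homogeneous-coordinate idea can be illustrated with a short Python sketch. It assumes the common graphics convention that a fourth component of 1 marks a position and 0 marks a pure direction; the helper names translation and apply are made up for this example.

```python
def translation(tx, ty, tz):
    """4x4 matrix that shifts positions by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def apply(matrix, vec):
    """Multiply a 4x4 matrix by a 4-element vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


point = [3, 1, 5, 1]      # w = 1: a position, so translation moves it
direction = [3, 1, 5, 0]  # w = 0: a direction, so translation leaves it alone

T = translation(10, 0, 0)
print(apply(T, point))      # [13, 1, 5, 1]
print(apply(T, direction))  # [3, 1, 5, 0]
```

Here the same container type holds both a point and a direction, and only the extra coordinate tells the two apart.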
I guess what I am getting at is that computer languages are not designed to represent physical models, but to provide abstract data containers that the programmer uses as tools for his or her modeling.
A vector in math is an element of an n-dimensional space over some field (e.g. real/complex numbers, functions, strings). It may have infinite dimension, e.g. the function space L^2. I don't recall infinite-dimensional vectors being used in programming (infinite vectors are not vectors with unlimited length, but vectors with an infinite number of elements).
The most rigorous statement is that a mathematical vector is a first-order tensor that transforms from one coordinate system to another according to tensor transformation rules. The physical idea to keep in mind is that vectors have both magnitude and direction.
Programming vectors are data structures that need not transform according to any rules and may or may not have a notion of a coordinate system as reference. If you happen to use a vector data structure to hold numbers, they may conform to the mathematical definition. But if you have a vector of objects, it's unlikely that they have anything to do with coordinate transformations.
