Composite Vectors

Instead of storing all the vector components in a flat way, as most computing libraries do, Nieme (since v0.6) uses Composite Vectors: a vector is either a flat vector or the union of multiple sub-vectors. This original data structure makes it possible to introduce structure into vectors, which is advantageous in numerous situations such as the ones below:

Parameters of a Multi Layer Network. Such learning machines have two sets of parameters: one set for the first layer, another set for the second layer. Composite Vectors make it possible to view all the parameters as a single vector while keeping layer-specific parameters in separate sub-vectors.
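
As a rough illustration of the idea above, here is a minimal Python sketch. It is not Nieme's actual API: the class names FlatVector and CompositeVector, the flatten method, and the sub-vector names "layer1" and "layer2" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Union


@dataclass
class FlatVector:
    """Hypothetical leaf: a plain list of components."""
    values: List[float]

    def __len__(self) -> int:
        return len(self.values)

    def flatten(self) -> Iterator[float]:
        yield from self.values


@dataclass
class CompositeVector:
    """Hypothetical node: a union of named sub-vectors."""
    parts: Dict[str, Union["FlatVector", "CompositeVector"]]

    def __len__(self) -> int:
        return sum(len(part) for part in self.parts.values())

    def flatten(self) -> Iterator[float]:
        # Visit sub-vectors in insertion order and expose them as one vector.
        for part in self.parts.values():
            yield from part.flatten()


# One sub-vector per layer, seen from outside as a single parameter vector.
params = CompositeVector({
    "layer1": FlatVector([0.1, -0.2, 0.3, 0.4]),
    "layer2": FlatVector([0.5, -0.6]),
})
all_params = list(params.flatten())
```

A training procedure could then treat params as one vector, while layer-specific code reads only params.parts["layer1"] or params.parts["layer2"].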

Feature Descriptions. When using Feature Generators, objects are described with features coming from different families (e.g. bigrams, trigrams, input features, ...). With Composite Vectors, there is one sub-vector per feature family. More generally, Composite Vectors allow a very efficient implementation of the Feature Generators.

Sub-vector sharing. It is often the case that multiple objects share a set of common features. Thanks to Composite Vectors, it is easy to share the corresponding common sub-vectors. This is a natural way to save memory in many cases.
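
Sharing can be pictured with the following small Python sketch. It only assumes that a composite vector holds references to its sub-vectors; the CompositeVector class and the names "common" and "own" are hypothetical and do not come from Nieme.

```python
class CompositeVector:
    """Hypothetical composite: maps a name to a sub-vector (here, a list)."""

    def __init__(self, **parts):
        self.parts = parts


# Features common to both objects are stored once...
common = [1.0, 0.0, 2.0]

# ...and referenced, not copied, by each object's composite vector.
obj_a = CompositeVector(common=common, own=[0.5])
obj_b = CompositeVector(common=common, own=[0.7, 0.1])

shared = obj_a.parts["common"] is obj_b.parts["common"]
```

Because both objects point to the same sub-vector, the common features occupy memory only once, however many objects use them.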

Sub-linear dot products. Last but not least: when performing dot products between two Composite Vectors, computations can be pruned each time a sub-vector appears on one side but not on the other, since such a sub-vector contributes nothing to the result.
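
The pruning described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Nieme's implementation: the Flat and Composite classes, the dot function, the sparse index-to-value storage, and the choice to raise an error on mismatched structures are all hypothetical.

```python
class Flat:
    """Hypothetical flat vector, stored sparsely as index -> value."""

    def __init__(self, values):
        self.values = dict(values)


class Composite:
    """Hypothetical composite vector: name -> sub-vector."""

    def __init__(self, **parts):
        self.parts = parts


def dot(a, b):
    if isinstance(a, Flat) and isinstance(b, Flat):
        # Iterate over the smaller side and look indices up in the larger one.
        small, large = sorted((a.values, b.values), key=len)
        return sum(v * large[i] for i, v in small.items() if i in large)
    if isinstance(a, Composite) and isinstance(b, Composite):
        # Pruning: only sub-vectors present on both sides are visited.
        shared = a.parts.keys() & b.parts.keys()
        return sum(dot(a.parts[name], b.parts[name]) for name in shared)
    raise TypeError("cannot combine a flat vector with a composite vector")


x = Composite(bigrams=Flat({0: 1.0, 3: 2.0}), trigrams=Flat({1: 4.0}))
y = Composite(bigrams=Flat({3: 0.5, 7: 1.0}), inputs=Flat({0: 9.0}))
result = dot(x, y)  # only the "bigrams" sub-vectors are compared
```

Here the "trigrams" and "inputs" sub-vectors are never traversed, so the cost depends on the shared sub-vectors only and not on the total number of components.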
