Fluid Construction Grammar

Syntax and Semantics

Introduction

Fluid Construction Grammar (FCG) follows the general line of cognitive linguistics and construction grammar and, like many other contemporary theories, it is feature-structure- and unification-based. It is currently the only computational construction grammar formalism that can handle both parsing and production using the same set of constructions, rather than using separate generation and parsing procedures as is done in other formalisms. So far, FCG has mainly been applied in research on the emergence and evolution of grammatical phenomena. This document details the syntax and semantics required for writing FCG constructions of varying complexity.

Syntax and Semantics of FCG

The core data structure in FCG is a Coupled Feature Structure (CFS). As the name implies, this is a coupling of two feature structures separated by <-->. These two feature structures are referred to as the left-pole and the right-pole. In general (but not necessarily) the left-pole contains the semantics of the structure and the right-pole the syntax.

A Feature Structure (FS) is an unordered list of units, and a unit is a list starting with a name (which has to be unique in the feature structure) followed by the actual features. A list encloses its elements within parentheses, so a list of the elements a, b and c is written as (a b c). A list can also contain sub-lists: replacing the element c with the list (e f) results in the list (a b (e f)).

A feature, in turn, is a list starting with a name (which has to be unique in the unit, but not in the feature structure) followed by its value, which can be any sort of (nested) list structure. (There is one exception: the value of the referent feature need not be a list and can be a single symbol.) The template for a coupled feature structure looks like this:

    ((unit-1-name
      (feature-name-1 values)   // values should be a list
      (feature-name-2 values))  // unique feature-names in the unit
     (unit-2-name               // unique unit-names in the FS
      (feature-name-2 values)   // non unique feature-names in the FS
      (feature-name-3 values)))
    <-->
    right-pole (similar to the left-pole)
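
The naming rules in the template can be made concrete with a small sketch. The Python below is only an illustration and not part of FCG itself: it assumes a feature structure is encoded as nested lists, and the helper name check_feature_structure is hypothetical.

```python
def check_feature_structure(fs):
    """Return a list of naming-rule violations found in a feature structure.

    Assumed encoding (not FCG's own): fs is a list of units, each unit is
    [unit_name, feature, ...] and each feature is [feature_name, value].
    """
    problems = []
    # Unit names must be unique within the feature structure.
    unit_names = [unit[0] for unit in fs]
    for name in sorted(set(unit_names)):
        if unit_names.count(name) > 1:
            problems.append(f"duplicate unit name: {name}")
    # Feature names must be unique within a unit, but may recur across units.
    for unit in fs:
        feature_names = [feature[0] for feature in unit[1:]]
        for name in sorted(set(feature_names)):
            if feature_names.count(name) > 1:
                problems.append(f"duplicate feature {name} in unit {unit[0]}")
    return problems


left_pole = [
    ["unit-1-name", ["feature-name-1", ["values"]], ["feature-name-2", ["values"]]],
    ["unit-2-name", ["feature-name-2", ["values"]], ["feature-name-3", ["values"]]],
]
# A coupled feature structure pairs a left pole with a right pole.
cfs = {"left-pole": left_pole, "right-pole": []}
print(check_feature_structure(cfs["left-pole"]))  # prints []
```

The feature name feature-name-2 occurs in both units without complaint, because uniqueness of feature names is only required inside a single unit.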

Note that a unit cannot contain another unit (i.e. units cannot be nested), so a tree-like feature structure cannot be built by nesting. Instead, a tree structure is built in FCG by using a special subunits feature, whose value is a list of unit-names, as shown below.

[Figure: Subunits structure in list notation]
[Figure: Subunits structure as shown in the web interface]

Language processing in FCG always starts from an initial coupled feature structure consisting of one unit on each side, containing either only meaning (in production) or only form (in parsing). This CFS is then gradually modified by applying a sequence of FCG constructions. These constructions are also coupled feature structures, but they can contain variables and special FCG operators that guide the unification process. An FCG variable is a symbol that starts with a question mark. During unification it can be bound to a symbol, a list or another variable, but only to one value, as the examples below show.

    (unify '(a b (c)) '(a b (c))) // unifies (both lists are equal)
    (unify '(a ?x c) '(a b c))    // unifies and binds variable ?x to b
    (unify '((a (?z)) ?z)         // won't unify because ?z
           '((a (b)) c))          // should be bound both to b and c
    (unify '(a b) '(b a))         // won't unify because the order differs
    (unify '(a b c) '(a b))       // won't unify because of c
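
The behaviour of these five calls can be reproduced with a minimal sketch of plain unification. It is written in Python for illustration only and is not FCG's implementation: lists stand in for Lisp lists, strings starting with "?" stand in for variables, the function names are hypothetical, and there is no occurs check and no support for the special operators.

```python
def is_variable(x):
    """A variable is a symbol (here: a string) starting with a question mark."""
    return isinstance(x, str) and x.startswith("?")


def walk(x, bindings):
    """Follow variable bindings until reaching an unbound variable or a value."""
    while is_variable(x) and x in bindings:
        x = bindings[x]
    return x


def unify(pattern, source, bindings=None):
    """Return a dict of variable bindings, or None when unification fails."""
    if bindings is None:
        bindings = {}
    pattern = walk(pattern, bindings)
    source = walk(source, bindings)
    if pattern == source:
        return bindings
    if is_variable(pattern):
        return {**bindings, pattern: source}
    if is_variable(source):
        return {**bindings, source: pattern}
    if isinstance(pattern, list) and isinstance(source, list):
        # Plain lists must have the same length and unify element by element.
        if len(pattern) != len(source):
            return None
        for p, s in zip(pattern, source):
            bindings = unify(p, s, bindings)
            if bindings is None:
                return None
        return bindings
    return None


print(unify(["a", "b", ["c"]], ["a", "b", ["c"]]))           # {}
print(unify(["a", "?x", "c"], ["a", "b", "c"]))              # {'?x': 'b'}
print(unify([["a", ["?z"]], "?z"], [["a", ["b"]], "c"]))     # None
print(unify(["a", "b"], ["b", "a"]))                         # None
print(unify(["a", "b", "c"], ["a", "b"]))                    # None
```

Note that a successful match without variables returns an empty dict, so callers should test the result against None rather than rely on truthiness.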

FCG Special Operators

The FCG special operators guide the unification process by making it either more flexible or stricter. This section gives an overview of the most important ones. These operators are normally put at the beginning of a list and affect the values of that list.

Includes Operator (==)

Functionality: The includes operator only requires that the elements of the list occur in the other list, in any order. The other list may contain additional elements.

Example: The last two examples from above unify once the includes operator is added. In the second one, the shorter list becomes the list carrying the operator.

    (unify '(== a b) '(b a))
    (unify '(== b a) '(a b c))

Permutation Operator (==p)

Functionality: The permutation operator requires the other list to be a permutation of the list: it must contain exactly the same elements, but the order doesn't matter.

Example:

    (unify '(==p ?x b) '(b a))
    // unifies and binds variable ?x to a
    (unify '(==p c a) '(c b a))
    // won't unify, although == would

Includes Uniquely Operator (==1)

Functionality: The includes uniquely operator is like the includes operator, but doesn't allow elements from the list to appear more than once in the other list. If an element is itself a list, only the first element of that list is checked.

Example:

    (unify '(==1 a b) '(a a b))
    // won't unify although == would
    (unify '(==1 (a)) '((a) (a b)))
    // won't unify

Includes Not Operator (==0)

Functionality: The includes not operator disallows the elements that follow it from appearing in the other list. If even one of them appears, unification is blocked. As with the other operators, the ordering doesn't matter.

Example:

    (unify '(==0 a b c) '(x))
    // unifies
    (unify '(==0 b a c) '(a))
    // does not unify
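
The four operators above can be compared side by side in a small sketch. This is a simplified reading of the descriptions in this section, written in Python for illustration and not taken from FCG: it handles only one flat pattern list against one source list, variables may match any single element, and all function names are hypothetical.

```python
def is_variable(x):
    return isinstance(x, str) and x.startswith("?")


def bind(p, s, bindings):
    """Match one pattern element against one source element."""
    if is_variable(p):
        if p in bindings:
            return bindings if bindings[p] == s else None
        return {**bindings, p: s}
    return bindings if p == s else None


def assignments(pattern, source, bindings):
    """Yield (bindings, leftover source elements) for every way of matching
    each pattern element to a distinct source element, in any order."""
    if not pattern:
        yield bindings, source
        return
    for i, s in enumerate(source):
        b = bind(pattern[0], s, bindings)
        if b is not None:
            yield from assignments(pattern[1:], source[:i] + source[i + 1:], b)


def key(element):
    """For ==1, a list element is identified by its first element only."""
    return element[0] if isinstance(element, list) and element else element


def unify_special(pattern, source):
    """Return a list of binding dicts; an empty list means no unification."""
    op, elements = pattern[0], pattern[1:]
    if op == "==":    # includes: leftover source elements are allowed
        return [b for b, _ in assignments(elements, source, {})]
    if op == "==p":   # permutation: nothing may be left over
        return [b for b, rest in assignments(elements, source, {}) if not rest]
    if op == "==1":   # includes uniquely: no repeated keys in the source
        keys = [key(s) for s in source]
        if any(keys.count(key(e)) > 1 for e in elements):
            return []
        return [b for b, _ in assignments(elements, source, {})]
    if op == "==0":   # includes not: none of the elements may occur
        return [] if any(e in source for e in elements) else [{}]
    raise ValueError(f"unknown operator: {op}")


print(unify_special(["==", "b", "a"], ["a", "b", "c"]))      # [{}]
print(unify_special(["==p", "?x", "b"], ["b", "a"]))         # [{'?x': 'a'}]
print(unify_special(["==p", "c", "a"], ["c", "b", "a"]))     # []
print(unify_special(["==1", "a", "b"], ["a", "a", "b"]))     # []
print(unify_special(["==0", "b", "a", "c"], ["a"]))          # []
```

Returning a list of bindings rather than a single result is a design choice of the sketch: with operators that ignore order, a pattern containing variables may match in more than one way.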

The above extensions allow us to write FCG constructions such as the following one.

[Figure: FCG construction in list representation]
[Figure: FCG construction in graphical representation]

Note that unification in FCG never adds elements, except when binding variables, and thus works differently from HPSG unification. Adding elements is done by another operation called merge. Just like unification, merging requires two feature structures, of which only one can contain special operators. We call the feature structure containing the special operators the pattern and the other one the source. FCG constructions are thus patterns, and the feature structures they apply to are the sources. The merger looks for any extension of the source that would unify with the pattern. In the examples below the pattern is the first parameter and the source the second.

    (fcg-merge 'a 'a)          // returns 'a
    (fcg-merge '(a) '(a))      // returns '(a)
    (fcg-merge '(a) '(a b))    // does not merge
    (fcg-merge '(== a) '(a b)) // returns '(a b)
    (fcg-merge '(a b) '(a))    // returns '(a b)
    (fcg-merge '(==0 a) '(b))  // returns '(b)

Merging can also return multiple hypotheses, for example (fcg-merge '(== ?x a) '(a b c)) returns (a b c) with ?x bound to either b or c.
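
The six variable-free examples above can be reproduced with a deliberately small sketch. It is written in Python for illustration and makes assumptions that go beyond the text: patterns and sources are flat and contain no variables, only the == and ==0 operators are handled, and a source is assumed to be extended only by appending elements at its end. The name fcg_merge_sketch is hypothetical, and multiple hypotheses are not modelled.

```python
def fcg_merge_sketch(pattern, source):
    """Return the merged structure, or None when merging fails.

    Simplified: no variables, no nesting, a single result at most.
    """
    if not isinstance(pattern, list):
        # Two atoms merge only when they are equal.
        return source if pattern == source else None
    if not isinstance(source, list):
        return None
    if pattern and pattern[0] == "==":
        # Includes: add whatever the source is still missing.
        missing = [e for e in pattern[1:] if e not in source]
        return source + missing
    if pattern and pattern[0] == "==0":
        # Includes not: nothing is added; the source must avoid the elements.
        return None if any(e in source for e in pattern[1:]) else source
    # Plain list (assumption): the source must match the start of the
    # pattern, and the remaining pattern elements are appended to it.
    if len(source) <= len(pattern) and pattern[:len(source)] == source:
        return source + pattern[len(source):]
    return None


print(fcg_merge_sketch("a", "a"))                 # a
print(fcg_merge_sketch(["a"], ["a"]))             # ['a']
print(fcg_merge_sketch(["a"], ["a", "b"]))        # None
print(fcg_merge_sketch(["==", "a"], ["a", "b"]))  # ['a', 'b']
print(fcg_merge_sketch(["a", "b"], ["a"]))        # ['a', 'b']
print(fcg_merge_sketch(["==0", "a"], ["b"]))      # ['b']
```

The third call fails because the source is longer than the plain pattern: no extension of (a b) can unify with (a), since extending only ever adds elements.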

Before we continue with more advanced ways to alter the feature structure, there is one last key idea that is crucial to understanding grammatical constructions in FCG: linking through variable equalities. As noted earlier, one variable cannot be bound to multiple values, but multiple variables can be bound to the same value (which can itself be a variable). We call this a variable equality.

Modification of Units and Moving Information between Units

Although we can now modify structures by merging in new information, this is not powerful enough to build the complex constituent structures needed for processing natural language. In fact, there are three important operations we cannot yet perform:

  1. We are unable to create new units. This could be done by merging, but not for complex trees, and it would also cause problems with the bi-directionality of FCG.
  2. We cannot relocate existing (or new) units in the tree.
  3. We cannot move features from one unit to another.

In what follows we will show how we have solved these problems in FCG through a special tree manipulation operator called the J-operator.

The J-operator is specified inside the feature structure itself and resides at the same level as the units, which is why we refer to the declarations of these operations as J-units. A J-unit, however, does not specify a unit at all; it specifies operations on a unit. To distinguish it clearly from "normal" units, a J-unit does not start with a symbol (the unit-name) but with a list starting with the symbol J.

A J-unit specifies operations for only one unit, called the focus unit. Of course a feature structure can contain multiple J-units allowing operations on multiple units. The focus unit therefore is the only parameter you are required to supply.

    ((?top
      (form ((string ?top "big"))))
     ((J ?new-unit)
      (syn-cat ((pos adjective)))))

The above feature structure consists of only one "real" unit, ?top, and one J-unit. When merged, it creates a new unit ?new-unit containing the syntactic category adjective. Creating new units is that easy and, as the example shows, the body of a J-unit (i.e. the part after the initial list) resembles that of a regular unit in that it can contain feature-value pairs.

Although we can now create new units, we would still like to specify where they should be located in the feature structure tree. That is, we would like to specify a new unit's parent unit and optionally even its child units. This can be done with two optional parameters following the focus unit, the first specifying the parent and the second a list of children, as shown below.

[Figure: Syntactic Feature Structure]
[Figure: Syntactic pole of an FCG construction]
[Figure: Transformation of a feature structure by a J-unit]

In the examples so far the focus unit has always been a reference to a new unit. The focus unit, however, can refer to an existing unit as well. In that case no new unit is created and the J-unit operates on the unit referred to.

Of the three missing operations presented above we have now addressed the first two. All that remains is moving features from one unit to another (existing or new) unit. To move something from A to B you need a way to mark both the thing you wish to move and where it should go. Marking which feature-value pair you wish to move is done with the tag-operator, which allows you to bind a feature-value pair to a variable. It has the following syntax.

    (tag ?tag-name (feature value))

You can then simply put the tag variable (i.e. ?tag-name) in the body of a J-unit to mark where the feature-value pair should be moved to. It works like cut and paste: you cut with the tag-operator and paste by placing the tag-variable at the desired location in a J-unit. The body of a J-unit can therefore contain tag-variables next to feature-value pairs. You cannot refer to a tag-variable in regular units. An example is shown below, where a very small initial feature structure containing only one unit with some meaning is transformed into a new feature structure containing two units, and where the tagged meaning is moved from one unit to the other.

To conclude the syntax of a J-unit looks as follows:

    ((J focus parent children)
     body)

Here focus is the only required parameter and is either a new variable or a variable referring to an existing unit. Parent should be a variable referring to an existing unit, and children a list of variables referring to existing units. Body can contain tag-variables next to regular feature-value pairs.

[Figure: Transformation of a feature structure by a J-unit including tags]
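
The effect of a J-unit on a feature structure can be sketched in a few lines. The Python below is only an illustration under several assumptions that are not stated in the text: variables are taken to be already bound to concrete unit names, a feature structure is encoded as a dict of units, a tag is simplified to moving a whole feature out of a named unit, and children are assumed to be detached from their previous parent. The function apply_j_unit and its parameters are hypothetical.

```python
def apply_j_unit(structure, focus, body=None, parent=None, children=(), moved=()):
    """Apply one J-unit and return the new feature structure.

    structure: dict mapping unit name -> dict of feature name -> value
    moved:     (unit name, feature name) pairs standing in for tag-variables
    """
    result = {name: dict(features) for name, features in structure.items()}
    # The focus unit is created when it does not exist yet.
    unit = result.setdefault(focus, {})
    unit.update(body or {})
    # Tags work like cut and paste: remove the feature from its unit and
    # place it in the focus unit.
    for source_unit, feature in moved:
        unit[feature] = result[source_unit].pop(feature)
    if children:
        # Assumption: children leave their old parent and hang under the focus.
        for features in result.values():
            if "subunits" in features:
                features["subunits"] = [
                    u for u in features["subunits"] if u not in children
                ]
        unit["subunits"] = unit.get("subunits", []) + list(children)
    if parent is not None:
        subunits = result[parent].get("subunits", [])
        if focus not in subunits:
            result[parent]["subunits"] = subunits + [focus]
    return result


before = {"top": {"meaning": [["big", "?x"]]}}
after = apply_j_unit(
    before,
    "new-unit",
    body={"syn-cat": [["pos", "adjective"]]},
    parent="top",
    moved=[("top", "meaning")],
)
print(after["top"])       # {'subunits': ['new-unit']}
print(after["new-unit"])  # {'syn-cat': [['pos', 'adjective']], 'meaning': [['big', '?x']]}
```

The sketch returns a new structure instead of modifying the input, which keeps the original available if a construction turns out not to apply.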

Language Processing in FCG

So far we have been concentrating on unification and merging of single feature structures. We will now focus on coupled feature structures and how they are processed in bi-directional language processing.

Fluid Construction Grammar supports both production (generation in HPSG terminology) and parsing using the same set of constructions. Both start with an initial coupled feature structure that contains either only meaning (in production) or only form (in parsing). This coupled feature structure is the key data structure in language processing. It is modified by the applicable constructions, finally resulting in a much larger feature structure containing the inferred form and meaning. A high-level view of such processing is shown below.

[Figure: A schematic high-level depiction of language processing in FCG]

As is clear from the figure, applying a construction consists of at least two phases: a unification phase and a merge phase. As explained earlier, unification is quite strict and can thus be seen as a condition for the construction to apply. We do not unify both the left and right pole of the coupled feature structure, but only the left pole in production and the right pole in parsing. When unification of the required pole is successful, both poles of the construction are merged with the central coupled feature structure.
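
The two-phase control flow described above can be sketched as follows. This Python illustration is heavily simplified and is not how FCG implements it: each pole is reduced to a set of facts, set containment stands in for unification and set union stands in for merging. The function name apply_construction and the example construction are hypothetical.

```python
def apply_construction(construction, cfs, direction):
    """Apply a construction to a coupled feature structure.

    direction is "production" or "parsing". Returns the extended coupled
    feature structure, or None when the construction does not apply.
    """
    # Only one pole is checked: the left pole in production, the right pole
    # in parsing.
    pole = "left" if direction == "production" else "right"
    # Phase 1: unification acts as a condition (simplified to containment).
    if not construction[pole] <= cfs[pole]:
        return None
    # Phase 2: both poles of the construction are merged in (simplified to
    # set union).
    return {side: cfs[side] | construction[side] for side in ("left", "right")}


# One construction, used in both directions.
big_cxn = {
    "left": frozenset({("meaning", "big")}),
    "right": frozenset({("string", "big")}),
}

to_produce = {"left": frozenset({("meaning", "big")}), "right": frozenset()}
to_parse = {"left": frozenset(), "right": frozenset({("string", "big")})}

produced = apply_construction(big_cxn, to_produce, "production")
parsed = apply_construction(big_cxn, to_parse, "parsing")
print(produced == parsed)  # True: both end with meaning and form
```

The same construction object is used for both calls, which is the point of the sketch: the direction only decides which pole serves as the condition.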