The MINSE syntax necessary to convey a structured expression is completely described by the page on syntax, but to augment that syntax you can use notation definitions specific to your application. Right now, all a notation definition contains is a list of operator declarations, allowing you to define your own infix operators. (More notational power might be available in future.)
There is currently just one defined notation.
Although the tree-like structure we described on the syntax page is sufficient to express just about anything, the use of infix operators is just too common to ignore. The sum of three and five could be explicitly written thus:
'sum(3,5)
However, such notation would quickly grow tiresome, since we're all
used to just writing "3 + 5". Thus, an extra layer of processing is
added to support a number of infix operators such as addition,
subtraction, and so on, as well as symbols that must themselves be
entered as compounds, such as "intersect" for sets. Operators may or
may not have an infix short form, but like all
compounds, they have names. For instance, the addition operator has
the name "sum
" and the short form "+
".
The name for an operator may be used as an operator when
the name is preceded by a period ("."), or as an ordinary compound when
given after a single-quote ("'") with its sub-elements in parentheses.
Thus, for example, the following three expressions are taken as equivalent:
and the following two expressions are taken as equivalent:3 + 5 3 .sum 5 'sum(3,5)
'a .noteq 'b 'noteq(a,b)
Expressions involving operators are converted to a canonical
tree structure such as 'sum(3,5)
before compounds are
transformed by the style definition. The notation definition specifies,
for all operators, the compounds they represent, their precedence,
and their associativity. For binary operators, associativity may be
"left", "right", or "nonassoc". A left-associative operator, like
subtraction, groups the leftmost elements first; a right-associative
operator, like exponentiation, groups the rightmost elements first.
An operator is non-associative when it doesn't make sense to use in
succession (for example, the associativity of the operator "is an
element of" is "nonassoc").
Sometimes, though, we need operators with more than two elements. While it is okay to associate "a + b + c" as "(a+b) + c", for instance, it is certainly wrong to associate "P implies Q implies R" and turn it into "(P implies Q) implies R"! So a special associativity type called "serial" is also defined, which converts a series of several of the same operator into a compound with possibly many elements. Another special associativity type called "chain" is defined to deal with a series of several operators that may be different (but are all of a similar nature), like "X is less than Y is less than or equal to Z".
The result of conversion is a semantic tree with all operators replaced by compounds and each compound explicitly grouped with its arguments. (A MINSE expression that contains no operators and whose syntax can be completely described by the syntax page without an additional notation definition is called a canonical structured expression or CSE). Here is an example of how conversion might take place:
a * _ c / (_ 3 * x - y) + 2 * b
when used with the "math" notation definition becomes equivalent to the CSE
'sum('quot('prod(a, 'neg(c) ), 'diff('prod('neg(3), x ), y ) ), 'prod(2, b ) )
Happily, you should never have to worry about doing this sort of expansion manually. It's just nice to understand how everything works and to see the consistent structure underneath, in case you later want to add new operators or define your own notations. This method of separately performing conversion (based on the notation) and transformation (based on the style) easily lends itself to extending the language to include new operators or constructions, and gives us the flexibility to define operators to mean different things in different situations.
There is one important point to note: common written mathematical notation simply places two expressions side-by-side to indicate that they should be multiplied. If we used this convention in MINSE syntax, this would make something like:
ambiguous, depending on whether f is a variable or a relation: is thatf(x + y)
'prod(f, x + y)
,
or the function f with argument
(x + y)
?
Because we are already adopting the convention of "name followed by
elements in parentheses" to mean a MINSE compound, we choose the
latter meaning for f(x + y)
in the math context.
If anything which is immediately followed by a parenthesized
expression is treated as a relation, it becomes possible to also
express relations of relations and to use operators (like
composition) on relations.
To accomplish this, we treat such a case as another infix operator defined in the notation -- in this example, almost as though the left-parenthesis itself were acting as an infix operator. It also has a name and has precedence among the other operators, and is converted to canonical structure as part of the same process. A left-associative parenthesis operator groups to the left first, and thus it joins a parenthesized element with the element on its right. A right-associative parenthesis operator (like our example) groups to the right first, and joins a parenthesized element with the element on its left. This behaviour is triggered by declaring an operator with the special symbol "(".
Here is another example of conversion:
a * 'compose(f,g)(b / c)
when used with the "math" notation definition becomes equivalent to the CSE
'times(a, 'apply('compose(f, g ), 'quot(b, c ) ) )
Notation files are written in Python, and consist of a number
of calls to the makeop
function which is part of the
minse
module. Here's an example:
This fragment defines operators on five precedence levels. The first argument toimport minse # algebra makeop(1, "right", "exp ^ **") makeop(2, "left", "prod *", "quot /") makeop(3, "unright", "neg _") makeop(4, "left", "sum +", "diff -") makeop(5, "chain", "comparison", "eq =", "lt <", "gt >") makeop(2, "left", "divby")
makeop
is a (possibly floating-point)
number representing precedence, with the lower numbers denoting higher
precedence. The second argument is the associativity of the precedence
level, which is either "unleft" or "unright" for unary operators;
"left", "right", or "nonassoc" for binary operators; or "serial" or
"chain" for operators with many arguments.
For all except "chain" operators, each of the rest of the arguments is a string defining one operator on that precedence level, which consists of its compound name possibly followed by some short forms. (In the case of "serial" associativity, there must only be one operator defined for that precedence level.) All of these operators will get parsed into trees with the compound name at the parent node and the sub-elements in child nodes. For instance, with the notation given above, the expression
is parsed into a tree with_a^2
'neg
at the root node,
with the single child 'exp
, which has two children
a
and 2
.
For a group of operators with "chain" associativity, the third argument is the name of the compound to form, which will be placed at the parent node of the resulting tree. After parsing a chain of operators, the child nodes will contain all of the operands and operator names in the order seen. For example, with the notation given above, the expression
is parsed into a tree with1 < i = j
'comparison
at the root node,
which has the five children 1
, 'lt
,
i
, 'eq
, and j
.
As the example shows, you don't have to declare all of the operators
for a given precedence level at the same time. The last line adds the
divby
operator to the group at level 2. If you do this,
though, the associativity type must be consistent across all the calls
to makeop
for a given level.
It is safe to declare an operator more than once. The last declaration of any particular operator is the one that will be used. Thus you can use an existing notation definition and then change it to suit your needs.