Expressions and Operators

The TADS 3 language uses an algebraic style of notation for expressions. Constants (strings, integers, floating point values), variables, and object references can be combined with operators to form expressions. Most operators are written with punctuation marks, and many of these come from ordinary arithmetic, such as "+" to add two values and "*" to multiply. C/C++ and Java programmers will find that the operators are almost entirely the same as in those languages.

The way to think about expressions is that every expression is a miniature program. An expression specifies a precise, step-by-step procedure for carrying out a calculation. Every expression has a well-defined, single-threaded order of operations: to carry out the formula, you just do one thing at a time, from the first step to the last step, until you've carried out the whole procedure expressed in the formula.

The steps in the procedure embodied in an expression are the operators. To carry out an expression's procedure, you figure out which operator to apply first, based on the precedence order and the associativity (left-to-right or right-to-left) of the operators; the operator with the highest precedence goes first. You apply this operator to its operand or operands. This yields a value - for example, if our expression is 3+4*5, we'd first carry out the multiplication, computing 4 times 5, which yields 20. You replace the sub-expression in the overall formula with its yielded value, and then continue to the operator with the next highest precedence: so we'd rewrite the formula as 3+20 and proceed to the addition. This would yield 23, so we'd use this to replace the 3+20. Our whole formula would now be down to 23 - there are no more operators left, so we're done.

The main reason that it's important to look at expressions this way, and important to understand the order of evaluation, is that parts of expressions can trigger "side effects." For example, the function-call operator invokes a function, which could display something in the output window, play some music, create a disk file, or any manner of thing that a function can do. Other side effects are more direct: some operators assign values to variables, so evaluating such an operator has the side effect of changing the value in a variable. The important thing to understand is that if any part of an expression does have a side effect, the effect will occur precisely at the step in the expression-procedure where that operator is reached.

Operator placement: prefix, postfix, binary, ternary

The operators fall into four positioning categories that determine how an operator is placed relative to the value or values it acts upon.

A unary prefix operator acts on a single value, and the operator comes just before the value. For example, logical negation: !a

A unary postfix operator acts on a single value, which the operator immediately follows. For example, list indexing: a[7]

Note that most unary operators are either prefix or postfix operators, but two can be used both ways: the increment and decrement operators, ++ and --.

A binary operator acts on two values. The operator is placed between the two values it acts upon. For example, arithmetic addition: a + b

There's one ternary operator, the conditional operator, which acts on three values. This operator comes in two pieces: a "?", which goes between the first and second operands, and a ":", which goes between the second and third operands: x ? 'yes' : 'no'

Operator precedence

The operators are arranged in a hierarchy of precedence. The precedence order of the arithmetic operators is generally the same as in ordinary algebra; for example, 3+4*2 evaluates to 11, because the multiplication operator has higher precedence than the addition operator.

You can override the standard order of evaluation by explicitly grouping part of an expression in parentheses. For example, (3+4)*2 evaluates to 14, since the parentheses tell the compiler to evaluate the addition first, even though the multiplication would normally have higher precedence.

The order of precedence is shown below. The operators are listed in order of decreasing precedence. When multiple operators are listed together on a line, those operators are all at the same precedence level.

Primary               new  inherited  delegated  defined  __objref  &x
Postfix               x++  x--  []  .  x()
Unary                 ++x  --x  !  ~  +x  -x
Multiplicative        *  /  %
Additive              +  -
Bit-shift             <<  >>  >>>
Comparison            <  <=  >  >=
Equality              ==  !=  is in  not in
Bitwise AND           &
Bitwise XOR           ^
Bitwise OR            |
Conditional AND       &&
Conditional OR        ||
If-nil                ??
Conditional           ? :
Expression chaining   ,
Assignment            =  +=  -=  *=  /=  %=  &=  |=  ^=  <<=  >>=  >>>=

Operator associativity

Each operator has a standard "associativity," which controls the default order of evaluation when two operators at the same precedence level are used consecutively. In almost all cases, the standard associativity will produce the results you'd expect from ordinary arithmetic, so most of the time you won't even notice it; but it's worth going over the precise rules, to help explain any odd cases that might come up.

When binary operators at the same precedence level are combined, evaluation proceeds in left-to-right order, except for the assignment operators (=, +=, *=, etc). For example, 6-3-2 is evaluated as (6-3)-2. This is true of all binary operators except the assignment operators, which work right to left: a=b=3 is evaluated as a=(b=3) - that is, the value 3 is first assigned to b, yielding the assigned value (3) as the result of the overall sub-expression, and then this result is assigned to a.

The conditional operator is right-associative. This comes into play in an expression like this: a ? b : c ? d : e. Because the conditional operator groups right to left, the expression is evaluated as a ? b : (c ? d : e). This might look backwards at first, but it yields the grouping that most people would intuitively assume by thinking about it as a series of if-then-else branches: if a then b, else if c then d, else e. Don't let the parentheses confuse you into thinking that the parenthesized part has to be evaluated first, by the way - it doesn't. In fact, a is evaluated first here, because we always evaluate the controlling expression of a condition before evaluating either of the two result operands. Associativity and order of evaluation are different things.
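
To make the grouping concrete, here's a small sketch (the variable names n, desc, and desc2 are made up for illustration). Given the right-to-left grouping described above, the chained conditional and the if-else ladder should pick the same value:

local desc = (n == 0 ? 'none' : n == 1 ? 'one' : 'many');

// the same choice written as an if-else ladder
local desc2;
if (n == 0)
    desc2 = 'none';
else if (n == 1)
    desc2 = 'one';
else
    desc2 = 'many';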

You can always override the default associativity by using parentheses. For example:

local a = b - (c - d);

Without the parentheses, the calculation would have proceeded from left to right, so we would first have calculated (b-c) and then subtracted d from the result. The parentheses override this, ensuring that the calculation begins by calculating (c-d), then subtracting this from b.

Operators in detail

This section describes each operator's syntax and usage.

new

The new operator is used to create a new instance of a class dynamically. The syntax is:

new className [ argumentList ] 

If the argument list is omitted entirely, it's equivalent to using an empty argument list - so the following two lines are equivalent:

x = new MyClass;
x = new MyClass();

When this operator is evaluated, the VM creates an instance of the given class, and immediately invokes the method named "construct" of the new object, passing the list of arguments specified. A run-time error results if the "construct" method's parameter list doesn't have the same number of parameters as the argument list in the "new" expression.

The class must be specified by name. It's not legal to use a variable or other expression here; you can only use the literal name of a class. (If you need to create an instance of a class determined by a variable or other expression, you can use the createInstance() method of the Object intrinsic class.)

The result value of the expression is the new object reference.
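
When the class isn't known until run time, the createInstance() approach mentioned above might look something like this. This is only a sketch: MyClass and its one-argument constructor are assumed for the sake of the example.

local cls = MyClass;                  // class chosen at run time
local obj = cls.createInstance(42);   // like writing new MyClass(42)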

inherited

The inherited operator invokes the method that the currently executing method overrides.

inherited [ superclass ]  [ . propertySpec ]  [ argumentList ] 

(See below for the syntax of the propertySpec and argumentList elements.)

If the superclass name is specified, it must be a literal class name; it can't be a variable or other expression. If the superclass name is specified, the method of the given superclass is invoked, regardless of which method the current method actually overrides.

If superclass is omitted, the VM automatically determines which method the current method overrides. This determination is made dynamically, because a given method in a given object might override different base class methods depending on the superclass composition of the actual instance. The overridden method is the one that would have been called instead of the current method, when the current method was called, if the current method had never been defined.

If the propertySpec is included, the VM invokes the specified property, regardless of the property under which the current method is defined. If propertySpec is omitted, the VM automatically uses the same property under which the current method is defined.

If the argument list is omitted entirely, it's equivalent to an empty argument list.

The result of this expression is the return value of the inherited method. If the target property isn't defined or inherited by the target superclass, the return value is nil.

This operator can only be used within a method.

Example:

myObject: Thing
  test(x)
  {
    return inherited(x) + 1;
  }
;

Refer to the inheritance model section for information on inheritance order.

delegated

This operator is similar to inherited, but it lets you name the target object explicitly, and it lets you delegate to any object, regardless of any inheritance relationship with the current method's defining object.

The syntax is:

delegated objectExpr [ .propertySpec ]  [ argumentList ] 

This operator is useful when you want to circumvent the normal inheritance relationships between objects, and call a method in an unrelated object as though it were inherited from a base class of the current object. For example, you might want to create an object that sometimes acts as though it were derived from one base class, and sometimes acts as though it were derived from another class, based on some dynamic state in the object. Or, you might wish to create a specialized set of inheritance relationships that don't fit into the usual class tree model.

delegated invokes a method in another object while retaining the same "self" object as the caller.

For example:

myObj: MyClass
  handler = myOtherObj
  test(x) { return delegated handler.test(x); }
;

In this example, the test() method delegates its processing to the test() method of the object given by the handler property of the self object, which in this case is the myOtherObj object. When myOtherObj.test() executes, its self object will be the same as it was in myObj.test(), because delegated preserves the self object in the delegatee.

In the delegatee, the targetobj pseudo-variable contains the object that was the target of the delegated expression.

defined(sym)

This operator tests whether or not a symbol is defined. If sym is defined in the program's global symbol table (as a function, property, or object name), defined(sym) yields true, otherwise it yields nil.

Note that defined has a separate meaning within an #if preprocessor directive. In an #if directive, defined(x) determines if x is defined among the preprocessor (#define) symbols. Outside of #if expressions, defined tests to see if a symbol is defined in the compiler's global symbol table, among the objects, functions, and properties.

defined has a constant value at compile-time. This means that if it's used as the controlling expression of a condition (such as in an if statement or the ?: operator), the compiler applies the condition at compile time, not at run time. In particular, if x isn't defined, defined(x) is nil, so any code that's conditional on defined(x) is simply factored out - the compiler effectively strips that code out of the final program. This property is extremely useful, in that you can "hide" a reference to a symbol behind defined(), and that reference will be stripped out of the program if the symbol isn't defined. For example:

if (defined(foo) && foo.isOpen)
   ...

If foo is defined, this turns into simply if (foo.isOpen). Otherwise, it turns into if (nil), because of the short-circuit feature of the && operator (when the first operand of && is a constant nil value, the compiler skips the second operand entirely). This means that the foo.isOpen part of the expression won't cause an "unknown symbol" error when foo isn't defined, since that part of the expression will be skipped entirely. In this example, the compiler will also skip the entire body of the if, since the controlling expression is nil.

This operator is especially useful in libraries where you want to make a module optional, but still want to be able to reference that optional module from other modules when it's present. For example, suppose we create a library with a "score" module that some games wish to include and some games wish to omit. Suppose further that we want to reference an object in the score module from the status line module, so that the status line can display the current score. The tricky thing about this situation is that a direct reference to the object in the status line module will effectively make the score module required, since omitting the score module would also omit that referenced object, causing a link error. The defined() operator solves the problem: rather than referencing the score object directly, we can make the reference conditional on the object's existence:

local score = (defined(libScore) ? libScore.totalScore : nil);

This solves the problem because defined(libScore) has a constant value during compilation. If the score module is included, the libScore object is present, so defined(libScore) turns into true; if the score module is omitted, defined(libScore) becomes nil. In either case, when the control expression of the ?: operator is a constant, the compiler knows it only has to compile one branch or the other; so when defined(libScore) is nil, the compiler entirely omits the first branch, eliminating the problematic reference to libScore. But when libScore is included in the build, the expression turns into libScore.totalScore, allowing the status line module to get access to the score information as we wanted. We've thus succeeded in making the score module optional, but without giving up extra features in other parts of the library that make use of the score module when it's available.

__objref(sym [, mode])

This operator is similar to defined, but is specifically for object references. If sym is the name of a defined object, the result of the operator is the object reference value, as though you had written simply sym in the first place. If the symbol isn't defined, or refers to something other than an object (such as a property or function name), the result of the expression is nil.

The mode element is optional. If specified, this must be the literal text warn or error. If it's warn, the compiler will display a warning message if the symbol isn't a defined object; if it's error, the compiler displays an error message. (The message is the same in either case; the difference is that warnings allow the build to proceed to completion, while errors stop it at the end of the current source file.) If mode isn't specified at all, there's no message of any kind if the symbol isn't a defined object; the expression simply yields nil as its value.

local x = __objref(Action, warn);
if (x == nil)
   "Action isn't defined in this build!\n";

&x

The unary & operator yields a property ID value for a given property name, or a pointer to a function, or a pointer to a built-in function. The syntax is:

& propertyName
& functionName
& intrinsicFunctionName

When applied to a property name, this operator simply yields the property ID value for the named property. It does not invoke the property. This operator has no side effects.

Similarly, when applied to the name of a function or a built-in function, it yields a pointer to the function, without invoking the function.

Property ID values and function pointers are useful because they let you decide what you're going to call at one point in the code, but actually perform the call in some other part of the program. The part that performs the call doesn't have to know exactly what it's invoking, since that's determined by the pointer value. This is a powerful tool, especially for writing reusable utility code. For example, this approach is often used to iterate through complex data structures: a single piece of code that knows how to do the iteration can be reused for all sorts of different tasks, because the actual task to perform for each item in the collection is specified through a function pointer. What's more, the task-specific functions can be reused for iterating through completely different structures, since they don't have to know anything about how the iteration part works. It makes for a very clean division of labor.
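
Here's a rough sketch of that division of labor. The function callOnEach, the objects lamp and door, and the property describe are all hypothetical names invented for this example; the obj.(prop) form invokes whichever property the pointer value names.

// call the given property on each object in the list
callOnEach(lst, prop)
{
    foreach (local obj in lst)
        obj.(prop)();
}

A caller decides which property to use and leaves the iteration to the utility function:

callOnEach([lamp, door], &describe);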

++, --

The ++ and -- operators increment or decrement the contents of their operand. The operand must be an lvalue, which is any expression that you can assign a value to.

When these operators are used as prefix operators, preceding the operand, they increment/decrement the operand first, then they yield as the result value the modified value of the lvalue.

When these operators are used as postfix operators, following the operand, they start by temporarily saving the current value of the operand. Then they increment/decrement the operand, and finally they yield the saved value as the result of the expression.

Whether used as prefix or postfix operators, these operators have the side effect of changing the target lvalue at the time at which the operator is evaluated.

local x = 5;
local a = x++;
local b = ++x;

In the example above, we first assign a value of 5 to the local variable x. Then we evaluate x++: since this is the postfix form, it first saves the old value of x, then increments the contents of x, then yields the old value as the result of the expression - so x is changed to 6, but the value yielded by the expression is 5, so 5 is assigned to a. Next, we evaluate ++x: this first increments x, then yields the new value of x as the result - so x is changed to 7, and the value yielded is 7, so 7 is stored in b. So, when we're done, x is 7, a is 5, and b is 7.

Here's a summary of each combination:

initial a   expression   final a   final b
15          b = ++a;     16        16
22          b = a++;     23        22
17          b = --a;     16        16
99          b = a--;     98        99

[ ]

This operator is used to index a list or lookup table value. The syntax is:

expression [ expression ]

The first expression is evaluated first. This expression must yield a value that is valid for indexing: a List, a Vector, or a LookupTable. The second expression must yield a valid index value for the value to be indexed. In the case of a List or a Vector, this must be an integer value, and must be within range (from 1 to the length of the list or vector). In the case of a LookupTable, this can be any value.

The operator yields as its result the element of the list, vector, or lookup table at the given index. In the case of a lookup table, if the given index has never been assigned a value, the result is nil (there's no error in this case - it's perfectly legal).

Example:

local x = ['a', 'b', 'c', 'd'];
local y = x[3];  // stores 'c' in y
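
Indexing a lookup table works the same way, except that the index can be any value. A minimal sketch, with made-up keys, following the rule above that a key with no assigned value yields nil:

local t = new LookupTable();
t['lamp'] = 10;
local a = t['lamp'];    // stores 10 in a
local b = t['sword'];   // never assigned: stores nil in b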

.

This operator evaluates a property or method of an object. The syntax is:

expression . propertySpec [ argumentList ] 

(For details on the propertySpec and argumentList elements, see below.)

The expression must evaluate to an object reference. This gives the target object, whose property or method will be evaluated.

If the argument list is omitted, it's equivalent to specifying an empty argument list.

Evaluating this type of expression invokes the given property or method of the given object, passing the given arguments. The arguments must match in number the parameter list defined in the method that's being invoked; if not, a run-time error occurs.

If a method is invoked, the result of the expression is the return value of the method. If it's a simple value property rather than a method, the result of the expression is the property value.

Example:

local x = myObject.test(3);

( )

This operator invokes a function. It can be used to invoke a function by name, or through a function pointer expression. The syntax is:

expression ( [ argument [ , argument ... ]  ]  )

The expression can be simply the literal name of a function, or it can be any expression that yields a pointer to a function.

A pointer to a named function is obtained simply by using the function's name without the function call operator (i.e., with no argument list).

For example, this code stores a pointer to the function MyFunc in a local variable, then invokes the function through the local variable:

local x = MyFunc;
local y = x(1, 2, 3);

!

The ! operator yields the logical negation of an expression. This is a unary prefix operator: it goes immediately before its operand value.

The logical negation of a value depends upon its type:

Example:

local x = true;
local y = !x;  // stores nil in y

~

The ~ operator yields the bitwise inverse of an integer value. The bitwise inverse is the value that results from reversing each bit of the value's binary representation (i.e., changing each 0 to 1 and each 1 to 0).

For example, 17 has the binary representation 10001, so its bitwise negation is, in binary, 11111111111111111111111111101110, or FFFFFFEE in hex, or, as a signed decimal value, -18.

This operator is particularly useful for manipulating bit-mask values, where a set of bit flags are combined with the | (bitwise-OR) operator.

Example:

#define FLAG_A    0x0001
#define FLAG_B    0x0002
#define FLAG_C    0x0004

local x = FLAG_A | FLAG_B;
x = ~x;
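
A common use of the inverse is to clear one flag while leaving the others alone. This sketch reuses the FLAG_ macros defined above; the variable name flags is just for illustration:

local flags = FLAG_A | FLAG_B | FLAG_C;   // 0x0007
flags = flags & ~FLAG_B;                  // clears FLAG_B, leaving 0x0005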

+x

The + operator, when used as a unary prefix operator (that is, immediately preceding a single operand expression), simply evaluates and yields its operand value. It has no other effect.

The compiler generates an error if this operator is applied to a constant expression other than an integer or BigNumber value. However, this restriction doesn't apply at run-time; at run-time, the operator simply has no effect other than to evaluate its operand.

Example:

local x = 37;
local y = +x;

-x

The - operator, when used as a unary prefix operator (that is, immediately preceding a single operand expression), yields the arithmetic negative of a numeric value. It can be applied to integers and BigNumber values; applying it to any other type causes a run-time error.

The result is of the same type as the operand. In the case of a BigNumber, the result has the same precision as the operand.

Example:

local x = 37;
local y = -x;

*

This operator multiplies two numeric values, yielding the arithmetic product. The operands can be integers or BigNumbers.

See below for details on how integers and BigNumber operands are handled.

Example:

local x = 37;
local y = 1.7;
local z = x * y;

/

This operator divides one numeric value by another, yielding the quotient.

If both inputs are integers, the calculation performs an integer division. This means that the result is the quotient with any fractional part discarded. Note that the result is not rounded to the nearest integer - the fractional part is simply discarded. For example, 8/3 yields 2, and (-8)/3 yields -2.

See below for details on how BigNumber operands are handled.

If the right-hand operand is zero, a run-time error results.

Example:

local x = 37;
local y = 12;
local z = x / y;

%

This is the modulo operator. It calculates the remainder that results from dividing one integer value by another. The operands must both be integers.

The result of this operation produces a value such that, for any integers a and b, (a/b)*b + a%b equals a. This relationship holds for both positive and negative values.

If the right-hand operand is zero, a run-time error results.

local x = 37;
local y = 12;
local z = x % y;
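
To see how the relationship plays out with a negative operand, here's a worked sketch that follows from the rules stated above (integer division discards the fraction, and (a/b)*b + a%b equals a):

local a = -7;
local b = 2;
local q = a / b;          // -3: the fractional part is discarded
local r = a % b;          // -1
local check = q*b + r;    // -6 + -1 = -7, the original value of a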

+

This operator calculates the arithmetic sum of two numbers, concatenates strings, and appends values to lists.

Example:

local x = 37;
local y = 12;
local z = x + y;

local lst = [1, 2, 3];
lst = lst + x;

local str = 'testing';
str = str + y;

-

This operator calculates the arithmetic difference of two numbers, or removes elements from a list or Vector.

Example:

local x = 37;
local y = 12;
local z = x - y;

local lst = [12, 37, 42, 54];
lst = lst - x;

<<

Left shift. Both operands must be integers. a << n shifts the bits of the binary representation of a left (towards the most significant bit) by n places. The high-order n bits are simply discarded; the low-order n bits are filled with zeros. This is equivalent to multiplying a by 2ⁿ. If the result overflows the 32-bit integer type, there's no error; the overflowing high-order bits are simply discarded.

Example:

local x = 37;
local y = 2;
local z = x << y;

TADS doesn't have separate operators for arithmetic and logical left shifts, because both always yield the same results. This is in contrast to the right shift, where we have separate operators for arithmetic right shift (>>) and logical right shift (>>>) because of the difference in results for negative values.

>>

Arithmetic right shift. Both operands must be integers. a >> n shifts the bits of the binary representation of a right (toward the least significant bit position) by n bits. The least significant n bits of a are discarded, and the n vacated high-order bits are filled in with the original high-order bit of a. If a is positive, this is equivalent to dividing a by 2ⁿ. If a is negative, the result is equivalent to dividing a by 2ⁿ and then rounding towards negative infinity (note that this differs from the / operator, which rounds towards zero: -3/2 == -1, whereas -3>>1 == -2).

Example:

local x = 37;
local y = 2;
local z = x >> y;

The difference between the arithmetic right shift and the logical right shift is the treatment of the vacated high-order bits. The arithmetic shift preserves the sign of the original value by filling the vacated bits with the original value's most significant bit; the logical shift always fills the vacated bits with zeros. If a is positive, a >> n and a >>> n have the same result.

>>>

Logical right shift. Both operands must be integers. a >>> n shifts the bits of the binary representation of a right (toward the least significant bit position) by n bits. The least significant n bits of a are discarded, and the n vacated high-order bits are filled with zeros. If a is positive, this is equivalent to dividing a by 2ⁿ. If a is negative, the result is equivalent to dividing (2³²+a) by 2ⁿ.

Example:

local x = 37;
local y = 2;
local z = x >>> y;

The difference between the arithmetic right shift and the logical right shift is the treatment of the vacated high-order bits. The arithmetic shift preserves the sign of the original value by filling the vacated bits with the original value's most significant bit; the logical shift always fills the vacated bits with zeros. For positive values of a, a >> n and a >>> n have the same result.
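
A short sketch with a negative value shows the two shifts diverging (the variable names are arbitrary):

local a = -8;         // 0xFFFFFFF8 as a 32-bit pattern
local b = a >> 1;     // -4: the vacated bit is filled with the sign bit
local c = a >>> 1;    // 0x7FFFFFFC, or 2147483644: the vacated bit is zero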

>   <   >=   <=

These operators each compare two values. Each operator yields true if its comparison holds for the two values, nil if not.

The meaning of the comparison depends upon the types of the values being compared:

Example:

local x = 37;
local y = 2;
local z = x > y; // stores true in z

is in   not in

These operators compare one value to each value in a set of values. is in yields true if the first value is equal to any of the values in the set, and not in yields true if the first value is not equal to any of the values in the set.

The syntax of these operators is unusual:

expr is in ( expr1 [ , expr2 ... ]  )
expr not in ( expr1 [ , expr2 ... ]  )

The first expression, expr, is the value to find in the set. The values in the parentheses - expr1 and so on - are the set of values to search.

Any of the values can be of any type.

Note that within the set, the comma has special meaning as the expression separator. This special meaning supersedes the normal "comma operator" within the set, so if you want to use the comma operator within one of the set expressions, you must enclose that expression in parentheses.

This type of operator proceeds as follows. First, it evaluates the left-hand expression. It then evaluates the first expression in the set, and compares it to the left-hand value. If the two values are equal, the operator immediately stops and yields its value (true in the case of is in, nil in the case of not in). If the two values are unequal, the operator evaluates the second expression in the set, and repeats the comparison.

The comparisons are performed according to the same rules used by the == and != operators.

It's important to note that these operators only evaluate the expressions in the set until they find a match. The set expressions are evaluated one at a time in left-to-right order, and the operator stops evaluating the expressions as soon as it finds a match. Also, note that the left-hand expression is evaluated only once, no matter how many set expressions must be compared.

Example:

local x = 17;
local y = 5;
local z = (x + 3) is in (y*1, y*2, y*3, y*4, y*5);
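
The stop-at-first-match rule matters when a set expression has a side effect. In this sketch (the names n and found are made up), the rules above imply that found is true and n is still 0 afterwards, because the match on the second set element ends the evaluation before n++ is reached:

local n = 0;
local found = 2 is in (1, 2, n++);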

==   !=

These operators test for equality and inequality, respectively. == yields true if the two values being compared are equal, nil if not. != yields nil if the two values are equal, true if not.

The meaning of the comparison varies according to the types of the values being compared:

Example:

local a = 17;
local b = 34.0 / 2.0;
local c = (a == b);  // assigns true to c, since 17 == 17.0

&

This operator calculates the bitwise AND of two integers. Both operand values must be integers, otherwise a run-time error ("integer value required") is generated.

The "bitwise AND" of two values is the result of applying the boolean AND operator to each pair of bits from the binary representations of the operands. That is, the lowest-order bit of the result is the result of ANDing the lowest-order bit of the first operand with the lowest-order bit of the second operand; the second bit of the result is the result of ANDing the second bits of the two operands together; and so on for all 32 bits of the integer values.

The "truth table" for the boolean AND operator is as follows:

a   b   a & b
0   0   0
0   1   0
1   0   0
1   1   1

Example:

local a = 0x00FF; // all 1's in the low-order 8 bits
local b = 123456; // in hex, this is 0x1E240
local c = a & b;  // yields 0x40, or decimal 64

^

This operator calculates the exclusive OR ("XOR") of its operands. The result depends on the types of the operands:

The "truth table" for the boolean XOR operator is as follows:

a   b   a ^ b
0   0   0
0   1   1
1   0   1
1   1   0

Example:

local a = 0x00FF; // all 1's in the low-order 8 bits
local b = 123456; // in hex, this is 0x1E240
local c = a ^ b;  // yields 0x1E2BF, or 123583 decimal

|

This operator calculates the bitwise OR of two integers. Both operand values must be integers, otherwise a run-time error ("integer value required") is generated.

The "bitwise OR" of two values is the result of applying the boolean OR operator to each pair of bits from the binary representations of the operands. That is, the lowest-order bit of the result is the result of ORing the lowest-order bit of the first operand with the lowest-order bit of the second operand; the second bit of the result is the result of ORing the second bits of the two operands together; and so on for all 32 bits of the integer values.

The "truth table" for the boolean OR operator is as follows:

a   b   a | b
0   0   0
0   1   1
1   0   1
1   1   1

Example:

local a = 0x00FF; // all 1's in the low-order 8 bits
local b = 123456; // in hex, this is 0x1E240
local c = a | b;  // yields 0x1E2FF, or 123647 decimal
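
The three bitwise operators are often used together to keep a set of on/off flags packed into a single integer. The following is only a sketch: the FLAG_xxx names and the flagDemo function are invented for illustration and aren't part of the system library.

```tads3
// Hypothetical flag values for illustration; each one uses a distinct bit.
#define FLAG_LIT     0x01
#define FLAG_OPEN    0x02
#define FLAG_LOCKED  0x04

flagDemo()
{
    local flags = 0;

    flags = flags | FLAG_LIT;       // set a bit: flags is now 0x01
    flags = flags | FLAG_LOCKED;    // set another bit: flags is now 0x05
    flags = flags ^ FLAG_LIT;       // toggle a bit: flags is now 0x04

    // test a bit: the AND result is non-zero only if the bit is set
    if ((flags & FLAG_LOCKED) != 0)
        "locked\n";
}
```

The explicit parentheses around the & test are a defensive habit: they keep the grouping obvious when bitwise and comparison operators appear in the same expression.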

&&

This operator computes the logical AND of its operands.

For the purposes of this operator, an operand value is considered logically "false" if it's nil or 0, or logically "true" if it's any other value or type. Given this, the truth table for the operator is as follows:

a          b          a && b
"false"    "false"    nil
"false"    "true"     nil
"true"     "false"    nil
"true"     "true"     true

This is a "short-circuit" operator. This means that it stops evaluating its operands as soon as it knows the outcome. The operator always evaluates its left operand first. If the left operand is "false" (as defined above), the operator immediately knows that its overall result will be nil, without even looking at the second operand - if the first operand is "false," it doesn't matter what the second operand is, because the result will be nil in any case. This is where the short-circuit behavior comes in: since the operator already knows the result will be nil, it simply stops there and yields its value, without ever having evaluated the second operand. This is important if the second operand has side effects, because it means that the side effects will never be triggered if the first operand evaluates to "false." On the other hand, if the first operand evaluates to "true," then the operator must proceed to evaluate the second operand - thereby triggering its side effects - in order to determine the outcome.

Example:

local a = 0;
local b = 1;
local c = (a != 0 && b++ == 17);

After running the code above, the value of b will be 1. Look at that carefully: that b++ is never executed, because the && operator short-circuits that part of the expression - it never bothers to evaluate the b++, because it can see that the overall AND expression will be nil just by evaluating the a != 0 part.

||

This operator computes the logical OR of its operands.

For the purposes of this operator, an operand value is considered logically "false" if it's nil or 0, or logically "true" if it's any other value or type. Given this, the truth table for the operator is as follows:

a          b          a || b
"false"    "false"    nil
"false"    "true"     true
"true"     "false"    true
"true"     "true"     true

As with &&, this is a "short-circuit" operator - see the description of && for details. In the case of ||, the operator stops after evaluating the first operand if the first operand value is "true" - since the result will necessarily be true in this case, regardless of the value of the second operand, the operator bypasses the evaluation of the second operand entirely.

Example:

local a = 0;
local b = 1;
local c = (a == 0 || b++ == 17);

After running the code above, the value of b will be 1 - the b++ evaluation is skipped because the || operator can tell that its result will be true as soon as it evaluates the first operand, a == 0.
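
A common practical use of the short-circuit behavior is to guard an operation that would fail if it were evaluated unconditionally. The sketch below assumes a hypothetical function, describeLight, whose argument may be nil or an object with an isLit property; neither name comes from the library.

```tads3
describeLight(obj)
{
    // obj.isLit is evaluated only when obj is not nil, so there's
    // no attempt to evaluate a property of nil
    if (obj != nil && obj.isLit)
        "It's lit.\n";

    // The mirror-image test with ||: again, obj.isLit is evaluated
    // only when the first operand hasn't already settled the result
    if (obj == nil || !obj.isLit)
        "It's dark, or there's nothing there.\n";
}
```

In both tests the order of the operands matters: putting the nil check second would evaluate the property first and defeat the guard.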

??

This operator tests a value to see if it's nil, and yields a second value if so. It takes two operands:

a ?? b

If a is any value other than nil, the result is a. If a is nil, the result is b. The operator evaluates a exactly once, and it evaluates b only if a is nil.

One way to read this operator verbally is "a else b".

?? is concise and efficient for the common situation where you want to substitute a suitable default value if another value is nil. This comes up a lot with function and method arguments, return values from functions you call, and property values you're using that were originally assigned by other parts of the code.

For example, suppose the property location gives an object's container, and nil means that the object isn't currently in any location. Now suppose we have some code that wants to check if an object's location is lit. If we just wrote obj.location.isLit, we'd trigger a run-time error when the location is nil, since it's an error to get a property of nil. The traditional way to handle this is an if test:

local lit;
if (obj.location == nil)
   lit = nil;
else
   lit = obj.location.isLit;

We can simplify this with the ?? operator:

local lit = (obj.location ?? limbo).isLit;

If obj.location is a valid object, the ?? operator yields that object as the result, so we take its isLit property. If the location is nil, though, the ?? operator returns the right operand, limbo, which we've defined separately as an object representing where objects go when they're not in play. Since limbo is an ordinary object, the property evaluation succeeds without triggering an error.

The ?? approach is not only shorter to write than the if test, but it's also more efficient. The if test evaluates obj.location twice - once to test if the location is nil, and again to get the isLit property if it isn't. The ?? operator skips this extra step.

Note that (a ?? b) isn't quite the same as (a != nil ? a : b), which might appear to be equivalent at first glance. The ?? operator only evaluates a once, regardless of the outcome. The version using ? : evaluates a twice, because that operator always evaluates the condition, and then always evaluates either the "true" or "false" operand. If a has side effects (such as a function call, assignment, ++ operator, etc.), the side effects will only be triggered once when using the ?? operator.
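
The difference in evaluation counts can be made visible with an operand that has a side effect. This is a sketch only; the counter object and its next method are hypothetical names made up for the demonstration.

```tads3
counter: object
    calls = 0
    next()
    {
        // side effect: count each evaluation, and return a non-nil value
        calls++;
        return calls;
    }
;

evalCountDemo()
{
    // ?? evaluates its left operand once
    local x = counter.next() ?? 'default';
    "x = <<x>>, calls = <<counter.calls>>\n";   // x = 1, calls = 1

    // the ?: version evaluates next() in the condition, and then
    // again in the "then" part, because the value isn't nil
    counter.calls = 0;
    local y = (counter.next() != nil ? counter.next() : 'default');
    "y = <<y>>, calls = <<counter.calls>>\n";   // y = 2, calls = 2
}
```

Note that the second evaluation in the ?: version happens only when the value is not nil; when it's nil, the condition is the only place where the operand is evaluated.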

? :

The "conditional" operator is unusual in that it takes three operand values. The syntax is:

condExpr ? thenExpr : elseExpr

One way to read this operator is like this: "if condition then then-part else else-part." The operator first evaluates the condition expression, condExpr. If this evaluates to anything other than 0 or nil (that is, true, a non-zero integer, or a value of any other type), the operator then evaluates the "then part," thenExpr, and yields its value as the overall result. If the condition expression evaluates to 0 or nil, the operator instead next evaluates the "else part," elseExpr, and yields its value as the overall result.

Note that no matter what happens, only one of thenExpr or elseExpr is ever evaluated. The condition expression is always evaluated in any case, and is always evaluated first.

This operator has another unusual feature: it associates right-to-left. Because this operator has so many parts, this can be confusing. Some people mistakenly take it to mean that a nested conditional is executed first:

local x = a ? b ? c : d : e;  // which do we evaluate first, a or b???

At first glance, you might look at this and think that the right-to-left association means that we'd have to evaluate b first. After all, we have two ? operators in a row, and those operators associate right-to-left, so we have to do the one on the right first, right? Actually, that's wrong. Here, we do not have a case of associativity at all - we have a simple case of nesting. If you look at this carefully, you'll see that associativity doesn't even apply here, simply because there's absolutely no ambiguity about how to interpret the expression. Try putting parentheses into the expression to control the order of evaluation - you'll find that there's only one way you can do it:

local x = a ? (b ? c : d) : e;

There's simply no other distinct and valid way to parenthesize this expression. There's no question of associativity, so instead we simply rely on the basic rule of the ?: operator: the condition expression is always evaluated first, before either of the other parts. So the answer to the question posed above is that we evaluate a first.

So where does the right-to-left associativity even matter? The answer is that it comes into play when the second ?: operator occurs in the "else" part of the expression:

local x = a ? b : c ? d : e;

In this case, there really is some ambiguity in how to parenthesize this. Here are the two possibilities:

local x = (a ? b : c) ? d : e;
local x = a ? b : (c ? d : e);

See the difference? In the first case, we treat the whole first conditional a ? b : c as the condition expression of the second conditional. In the second case, we treat the second conditional c ? d : e as the "else" part of the first conditional.

So which is it? Since we know that this operator has right-to-left associativity, it's easy to see that the second grouping is the right one - right-to-left associativity simply means that you add parentheses starting at the right end when you need to resolve ambiguity. And it's fortunate that the second grouping is the one that most people would intuitively assume just by reading the original expression - naively, the original looks like it should read "if a then b, else if c then d, else e." This is no coincidence, of course; the whole point of making this operator associate right-to-left is that it produces this intuitive result.
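
As a small illustration of the "else if" reading, here's a sketch of a chained conditional. The function name describeSize and the size thresholds are arbitrary choices for the example.

```tads3
describeSize(n)
{
    // groups as: n < 10 ? 'small' : (n < 100 ? 'medium' : 'large')
    return n < 10 ? 'small' : n < 100 ? 'medium' : 'large';
}

sizeDemo()
{
    "<<describeSize(5)>>\n";     // displays "small"
    "<<describeSize(50)>>\n";    // displays "medium"
    "<<describeSize(500)>>\n";   // displays "large"
}
```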

,

The "comma" operator simply evaluates its two operands in sequence, first the left operand, then the right operand, and yields the result of the right operand.

This operator might seem strangely pointless, but it comes in handy in a number of situations. For one, this operator is useful in for statements, since it lets you write a whole series of initializers or re-initializers in a slot that's nominally designed for a single expression. Another place where the comma operator is often used is in macros, since it allows you to write a macro that evaluates a whole series of expressions, but as a unit acts as though it were a single function call that you can drop into an arbitrary expression.

Example:

local a = 7;
local b = (a++, a++, a++, a/2);  // parenthesized, since a bare comma in a local statement separates declarations

When this code is done, a has the value 10, since the second line incremented it three separate times; and b has the value 5, since the comma operator yields the result of the right-hand operand. (In this case, since we have several comma operators in a row, we rely on the left-to-right associativity of the operator: we execute the subexpressions from left to right, and yield the value of the last subexpression.)
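
The for statement usage mentioned above might look like the following sketch, which walks a list from both ends at once. The function name and the list contents are made up for the example.

```tads3
pairDemo()
{
    local lst = ['a', 'b', 'c', 'd'];
    local i, j;

    // two initializers and two updates, each pair joined by a comma
    for (i = 1, j = lst.length() ; i < j ; ++i, --j)
        "<<lst[i]>> pairs with <<lst[j]>>\n";

    // displays "a pairs with d", then "b pairs with c"
}
```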

=   op=

The simple assignment operator, =, evaluates its right-hand operand first, then assigns the resulting value to the "lvalue" on the left. (See below for an explanation of lvalues.)

The op= operators combine a calculation and an assignment. An expression of the form a op= b is equivalent to a = a op b. These operators evaluate the left operand first, then the right operand; they then perform the implied calculation exactly as though it were written as a separate calculation, and finally assign the result to the lvalue.

The valid op= operators are:

+=  -=  *=  /=  %=  &=  |=  ^=  >>=  >>>=  <<=

In addition to performing an assignment, an assignment operator yields a result value. The result is simply the value assigned. For example:

local a = 10, b = 20;
local c = (a = 7) + (b += 5);

The subexpression (a = 7) yields the value assigned, in this case 7. The subexpression (b += 5) yields 25, because that's the result of adding 5 to b. So, after this code finishes, c has the value 32.

The assignment operators are right-to-left associative. For example:

local a = b = 7;

This is equivalent to a = (b = 7): first, the b = 7 sub-expression is evaluated, which assigns the value 7 to b and yields 7 as the result; then the result is assigned to a.

Common expression syntax elements

This section explains the syntax elements that several of the operators above refer to.

lvalue

An lvalue is a "left-hand side value," so named because it can be used on the left-hand side of an assignment operator. This type of expression is something that you can assign a value to.

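There are several kinds of lvalues. You can assign to a local variable, to an indexed element of a list or other indexable value, or to a property, either of self or of an explicitly specified object.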

The syntax of an lvalue is:

localName
expression [ expression ]
propertySpec
expression . propertySpec

When an indexed list value is used as an lvalue, it has some special behavior. Lists are immutable, so assigning a new value to an element of a list requires creating a new list that's a copy of the original, but with the assigned element replaced with its new value. Now, the new list has to be referenced somewhere, otherwise its creation would have been a pointless exercise. Therefore, when an indexed list value is used as an lvalue, and the indexed value is also an lvalue, the newly-created list is assigned to the indexed-value lvalue. If the indexed value isn't itself an lvalue, the new list is still created, but its value is never assigned anywhere, so it will simply be discarded by the garbage collector.

For example:

local l1 = [1, 2, 3];
local l2 = l1;
l1[2] = 10;

The first line assigns a list to local variable l1, and the second line sets l2 to refer to the same list. The two variables contain the same list reference at this point. The third line assigns a value to an indexed element of the list in l1. Since lists are immutable, this must create a new list, [1, 10, 3] - the original list is left unchanged, and a new list object is created. The reference to the new list is then assigned to l1. This won't affect l2: the original list is still there, unchanged, and l2 still contains a reference to the original. So when the code is finished, l1 and l2 refer to different lists: l1 refers to the new list [1, 10, 3], and l2 refers to the original list [1, 2, 3].

Note that none of this applies to Vector or LookupTable objects, because those types are mutable (i.e., their contents can be changed dynamically).
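
For contrast with the list example, here's a sketch of the same sequence of steps using a Vector. The function name is hypothetical, and the two #include lines are assumed to be enough to declare the Vector class in a stand-alone source file.

```tads3
#include <tads.h>
#include <vector.h>

vectorDemo()
{
    // a Vector initialized with the elements 1, 2, 3
    local v1 = new Vector(3, [1, 2, 3]);
    local v2 = v1;

    // this changes the existing Vector in place; no copy is made
    v1[2] = 10;

    // v1 and v2 still refer to the same Vector, so both see the change
    "<<v2[2]>>\n";   // displays 10
}
```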

propertySpec

The propertySpec element is a property name or expression that specifies a property. When a propertySpec is required, you can supply either of these forms:

propertyName
( expression )

The first form simply specifies the literal name of a property.

The second form lets you use any expression to calculate the property; the expression must yield a property pointer value. Note that the expression must be enclosed in parentheses.
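
The following sketch shows the second form in use. The lamp object, its properties, and the showProp function are hypothetical names chosen for the example.

```tads3
lamp: object
    name = 'brass lamp'
    weight = 3
;

showProp(obj, prop)
{
    // prop holds a property pointer; the parentheses tell the compiler
    // to evaluate the variable, rather than to look for a property
    // that's literally named "prop"
    "<<obj.(prop)>>\n";
}

propDemo()
{
    showProp(lamp, &name);     // displays "brass lamp"
    showProp(lamp, &weight);   // displays 3
}
```

The & prefix in the calls yields a property pointer value, which is what the parenthesized form expects.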

argumentList

An argumentList element lets you specify the arguments to a function or method. The syntax is:

( [ expression [ , expression ... ] ] )

Each expression can be any valid expression. Note, though, that the comma has a special meaning in this context: it separates successive argument expressions. This means that if you want to use the general-purpose "comma operator" within one of these expressions, you must enclose the expression in parentheses, so that the compiler can tell that it's a comma operator rather than an argument separator.

Note that an empty argument list - just an empty pair of parentheses - is valid. This signifies an argument list with zero arguments.
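
To illustrate the difference between the two meanings of the comma, here's a sketch in which one argument contains a comma operator. The function names show and argDemo are invented for the example.

```tads3
show(x, y)
{
    "x = <<x>>, y = <<y>>\n";
}

argDemo()
{
    local n = 1;

    // Two arguments. The inner parentheses make the second comma an
    // operator, so the second argument is the value of n after the
    // increment.
    show(5, (n++, n));    // displays "x = 5, y = 2"
}
```

Without the inner parentheses, the call would be read as passing three arguments.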

Arithmetic type conversions

Most of the arithmetic operators can accept any combination of numeric operands. This means that you can perform arithmetic on integers, BigNumbers, or combinations of the two types.

The one-operand ("unary") operators generally yield a value of the same type and precision as the operand. That is, if the operand is an integer, the result is an integer; if the operand is a BigNumber, the result is a BigNumber of the same precision as the operand.

The two-operand ("binary") arithmetic operators generally yield a value of the same type and precision as the operand with the greater precision. Specifically:

Any exceptions to these rules are mentioned in the descriptions of the individual arithmetic operators.

Integer overflow and automatic promotion

When you use the basic arithmetic operators (+, -, *, /, and the corresponding compound operators such as ++ and +=) with integer operands, the result is normally an integer as well. However, the integer type has a limited range, from -2,147,483,648 to +2,147,483,647. If you perform a computation with a result outside of this range, it's called an integer overflow, because the result is too large (or too small) to be represented with the integer type. For example, adding 1,000,000,000 to 2,000,000,000 yields 3,000,000,000, which is too large to represent as an integer.

When an integer overflow occurs with one of the basic operators, TADS automatically changes the result to a BigNumber. This is called a "promotion", because BigNumber is a superior type in the sense that it's capable of storing a wider range of values than the integer type. (Superior in this case just means bigger, not better. BigNumbers and integers each have their own advantages. BigNumbers have a wider range, but integers are much faster and use less memory. That's why TADS doesn't just do everything with BigNumbers to start with.) The promotion ensures that results are arithmetically correct even when they're out of bounds for the integer type. For the most part, TADS lets you use integer and BigNumber values interchangeably, so you probably won't even notice in most cases.
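
A minimal sketch of the promotion, using the figures from the example above. The function name is hypothetical; dataType() and the TypeObject constant are assumed to be available through tads.h.

```tads3
#include <tads.h>

overflowDemo()
{
    local a = 2000000000;
    local b = 1000000000;

    // The sum is out of range for the integer type, so in 3.1.1 and
    // later the result is promoted to a BigNumber.
    local sum = a + b;

    // A BigNumber is an object, so dataType() reports TypeObject for
    // the promoted value; an ordinary integer would report TypeInt.
    if (dataType(sum) == TypeObject)
        "promoted to BigNumber\n";
}
```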

The bit-oriented operators, such as &, |, ~, <<, and >>, don't perform any promotions. These operators are specifically intended for manipulating bit patterns stored as integers, so promotions don't make sense for them.

(The automatic promotion is a new feature starting in version 3.1.1. Before 3.1.1, TADS behaved like more system-oriented languages such as C and C++, and simply ignored overflows. Results were truncated to fit the 32-bit integer type, by discarding overflowing bits. It's difficult to do anything sensible with such an overflowing result, because information is lost in the truncation, so you can't determine the actual arithmetically correct result from the truncated value. The only really good approach was to avoid overflows in the first place, which is difficult if you're working with external data or anything entered by a user. The new treatment with automatic promotions is more in keeping with the general philosophy of TADS as a high-level environment where you don't have to worry about hardware-level details like this.)

The compiler similarly promotes constant integer expressions to BigNumber when they overflow the integer type. The compiler shows a warning message for each integer expression it promotes; since integers and BigNumbers can't always be used interchangeably, the warning ensures that you know about the conversion, in case it wasn't what you intended. You can remove the warning on a case-by-case basis by explicitly using floating-point notation, by adding a decimal point to the number in question.

The compiler doesn't promote integers stated in hex or octal notation (e.g., 0x80000000 or 040000000000) as long as they fit within an unsigned 32-bit integer, which allows values up to 4294967295 (0xFFFFFFFF in hex). Hex and octal are often used for things like bit masks or binary file format parsing, where you want to specify a bit pattern rather than an arithmetic value, so the compiler assumes that's your intention when you use these formats. However, the compiler will promote even a hex or octal number if it exceeds the 32-bit unsigned limit, since there's simply no way to fit such a value into a 32-bit integer, no matter how you interpret its signedness. For example, 0x100000000 will be promoted to BigNumber even though it's stated in hex.

Entering a value in hex also won't stop the compiler from promoting the result of a constant expression if the result value overflows a signed integer, since once you start performing arithmetic, everything is back in the signed integer domain. For example, (0x7FFFFFFF + 1) will result in a promotion, even though the seemingly equivalent 0x80000000 won't. Hex values over 0x7FFFFFFF are treated as negative integers for the purposes of arithmetic evaluations, so (0x80000000 - 1) results in an overflow and yields -2147483649, not 0x7FFFFFFF.

Pseudo-variables

In addition to constant values and ordinary variables, TADS 3 has several "pseudo-variables" that you can use within expressions. We call these pseudo-variables because they look like variables, syntactically, but they don't behave like ordinary variables. For one thing, you don't have to define their names anywhere, because they're built into the VM. Another difference is that you can't assign new values to these variables - they're "read-only" from the program's perspective.

The pseudo-variables give you access to information within the VM about the current execution context. The VM automatically keeps these up-to-date as the execution context changes, so at any given time you can use these variables to get information about the code that's currently executing.

self

The self pseudo-variable provides a reference to the object whose method was originally invoked to reach the current method. Because of inheritance, this is not necessarily the object or class where the current method is actually defined. For example:

class Base: object
  name = 'Base'
  test()
  {
    "Base.test: self = <<self.name>>\n";
  }
;

class Sub: Base
  name = 'Sub'
;

main(args)
{
  local obj = new Sub();
  obj.name = 'my new object';
  obj.test();
}

In this example, when we invoke obj.test(), the VM will see that the object inherits the method from the class Base - there are no overriding definitions of the method, so we invoke this inherited definition. Even though the method is defined in class Base, self will still be the object that was in the variable obj, so the name displayed will be "my new object".

self remains unchanged when you use inherited or delegated. For example, suppose we change the class Sub in the example above as follows:

class Sub: Base
  name = 'Sub'
  test()
  {
    "Sub.test: self = <<self.name>>\n";
    inherited();
  }
;

Now when we run this code, calling obj.test() will invoke the method in class Sub, since this overrides the one defined in class Base. This method will display the name of the object, and as before, this will be "my new object", since self is the original target of the method invocation. After displaying the message, the Sub method will inherit the base class method, so we'll now proceed to the original one in class Base. This will display the object name a second time, and it will still be the same name - "my new object" - because self is not changed by an inherited call. The same would apply if we used delegated.

The "self" object is implied any time you call a method or evaluate a property without explicitly specifying which object is to be targeted. For example, we could rewrite the "test" method in class Base above as follows:

class Base: object
  name = 'Base'
  test()
  {
    "Base.test: self = <<name>>\n";
  }
;

Notice how we've removed the "self." prefix from the "name" property evaluation. Even though we've removed the explicit mention of "self" as the target object, the new version works exactly like the original, because "self" is implied any time there's a method or property call with no target object specified.

self is valid only in method contexts - that is, within methods defined in objects or classes. It's not valid within functions; a function isn't associated with any object, and thus a call to a function doesn't involve targeting any object.

targetprop

The pseudo-variable targetprop provides access at run-time to the current target property, which is the property that was invoked to reach the current method. This complements self, which gives the object whose property was invoked.

You can use this variable only in contexts where self is valid.

targetobj

The pseudo-variable targetobj provides access at run-time to the original target object of the current method. This is the object that was specified in the method call that reached the current method. The target object remains unchanged when you use inherited to inherit a superclass method, because the method is still executing in the context of the original call to the inheriting method.

The targetobj value is the same as self in normal method calls, but not in calls initiated with the delegated keyword. When delegated is used, the value of self stays the same as it was in the delegating method, and targetobj gives the target of the delegated call.

You can use this variable only in contexts where self is valid.
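
The difference between self and targetobj under delegation can be sketched as follows. The helper and widget objects are hypothetical, and the example assumes the usual form of the delegated syntax, where the keyword is followed by the target object and the method call.

```tads3
helper: object
    name = 'helper'
    describe()
    {
        "self is <<self.name>>, targetobj is <<targetobj.name>>\n";
    }
;

widget: object
    name = 'widget'
    describe()
    {
        // run helper's version of the method, keeping self as it is
        delegated helper.describe();
    }
;

delegateDemo()
{
    // By the rules described above, this should display
    // "self is widget, targetobj is helper"
    widget.describe();
}
```

Calling helper.describe() directly, with no delegation, would instead report helper for both values.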

definingobj

This pseudo-variable provides access at run-time to the current method definer. This is the object that actually defines the method currently executing; in most cases, this is the object that defined the current method code in the source code of the program.

You can use this variable only in contexts where self is valid.

argcount

This pseudo-variable contains an integer value giving the number of arguments that the caller supplied to the current function or method. This value is valid whether or not the current method or function takes varying arguments (although it's probably not particularly useful otherwise). The argcount value is always the total number of arguments - for a varying-parameter function or method, this means that any named arguments are included in the total.
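
Here's a sketch of argcount in a function that takes varying arguments. The function names are made up; the [rest] parameter collects any extra arguments into a list.

```tads3
countArgs(first, [rest])
{
    // argcount counts 'first' as well as everything collected in 'rest'
    "argcount = <<argcount>>, rest has <<rest.length()>> element(s)\n";
}

argcountDemo()
{
    countArgs(1);          // argcount = 1, rest has 0 element(s)
    countArgs(1, 2, 3);    // argcount = 3, rest has 2 element(s)
}
```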

invokee

invokee provides a pointer to the currently executing function.

For a regular function, invokee is a pointer to the function. The same value can be obtained by using the name of the function without an argument list.

myfunc(x)
{
   // f and g will have the same values
   local f = invokee, g = myfunc;
}

For an anonymous function, invokee is the anonymous function object. This lets an anonymous function invoke itself recursively.

local factorial = new function(x) {
   if (x <= 0)
       return 1;
   else
       return x * invokee(x-1);
};

For a dynamic function, invokee is the dynamic function object. As with anonymous functions, this can be used for recursive invocation.

For an ordinary method, defined as part of an object definition in the source code, invokee is a function pointer to the method's code. Unlike regular function pointers, it's not usually a good idea to invoke a method function pointer directly, since doing so would call the method code with a nil value for self. This will cause a run-time error if the method tries to evaluate a property of self, inherit, or otherwise reference self. The main way to use this kind of function pointer is with setMethod.

For a dynamic method defined with setMethod, invokee yields the original value passed to setMethod when the method was created. As with ordinary method pointers, it's not always safe to invoke this value directly as a function pointer, since doing so passes a nil value for self, and the underlying code of a method usually assumes that there's a valid self in effect.

Notes for TADS 2 users

TADS 2 users will notice some changes to the expression syntax. Most of these are simply additions, but there are a few changes to constructs you're familiar with from the old system.

No more Pascal-style assignments

The Pascal-style assignment (:=) and equality (=) operators are no longer allowed. TADS 3 allows only the Java/C-style operators. There's no compiler option for changing this. Although some people prefer the Pascal style of these operators, it was too confusing to have different, switchable syntax options, so the new language uses the Java/C style exclusively.

No more "delete" operator

The "delete" operator has been deleted from the language. It's no longer necessary, since the T3 VM has automatic garbage collection: the system deletes objects automatically when they're no longer reachable through any references anywhere in the program. Not only does this eliminate a lot of tedious coding work, but more importantly eliminates several classes of bugs that plague programs written in languages like C and C++, where memory must be managed explicitly. With automatic garbage collection, it's impossible to create a dangling pointer, for example, or free the same memory twice.

"self." is always implied

In TADS 3, you can almost always omit the "self." prefix when calling a method or evaluating a property of the "self" object.

Essentially the only time you need to write "self." explicitly is when invoking a method through a property ID variable. In this case, the "self." is required, since otherwise there's no way for the compiler to know that you want to invoke the method rather than just evaluate the variable.
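
As a sketch of that one exception, consider a method that receives a property pointer in a local variable. The object and method names here are hypothetical.

```tads3
thing: object
    name = 'thing'
    getProp(prop)
    {
        // prop is a variable holding a property pointer. Writing just
        // prop would yield the pointer itself; self.(prop) evaluates
        // the property that it points to.
        return self.(prop);
    }
;

selfDemo()
{
    "<<thing.getProp(&name)>>\n";   // displays "thing"
}
```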

All other method and property invocations implicitly target "self" if no other "obj." prefix is used. This was often true in TADS 2 as well, but with the important caveat that it only worked when the property to be invoked was already defined as a property name, earlier in the source file. Because of this snag, TADS 2 programmers usually found themselves writing "self." explicitly every time, to avoid the uncertainty.

TADS 3 compiles in two passes, so it recognizes every property name everywhere, regardless of the order of the definitions in the source files. This means that you can safely and reliably drop the "self." prefixes. This makes for more concise and readable code, and saves a lot of typing. Of course, you can still write "self." explicitly if you want to, and on occasion it's clearer to do so. But most of the time you can just leave it out.