T3 Metaclasses

T3 VM Metaclasses

The T3 VM design specifies several metaclasses that will be present in most implementations of the VM. The metaclass system is modular, so these metaclasses are not special in any way, except that they're specifically described in this VM specification. A particular implementation may choose not to include any of these metaclasses, or may include only a subset. However, because of their general utility, we expect that most VM implementations will include most or all of these metaclasses.

TADS Object
Weak Reference
String
List
Array
Weak Reference List
Weak Reference Array
LookupTable

The "TADS Object" Metaclass

The "TADS object" metaclass implements what the TADS language programmer thinks of as an object. This type of object stores a list of base classes, and a list of property/value pairs. The property list is mutable throughout the object's lifetime; the values of existing properties can be altered, and new properties can be added. Properties cannot be deleted, however.

We have several goals in implementing the TADS object metaclass. First, property access should be fast, since getting the value of a property is a very common operation. Second, setting the value of a property should be fast. Third, because an object can have many properties, it would be desirable to be able to save an object's state at run-time by saving only the changes made to the load image; objects can have quite large load images, so we could save substantial space in saved state files by storing only the changes between the dynamic state and the initial load image.

The object extension of a TADS object contains a header, the superclass list, the load image property list, and the changed property list. The header contains the number of superclasses, the number of properties in the load image list, the number of properties in the changed list, and the number of allocated property slots. The superclass list, load image property list, and altered property list are kept in memory in portable byte format, to allow very rapid reading and writing of the information.

The load image property list contains the original property list from the load image, if any. This list is never altered throughout the object's lifetime. We keep the load image property list sorted in ascending order of property ID. We require the compiler to sort the property list when creating the load image; we simply assume the list is in sorted order when loaded. Since this list never changes, we do not incur any overhead keeping the list in sorted order.

The modifiable property list is an array of properties that have been changed since the object was loaded. This list is not kept in sorted order; although sorting the list would allow faster access, it would create substantial overhead when setting property values, because it would be necessary to reorder the list to accomodate each insertion.

To find a property, we first scan the modifiable property list. If the property is found in this list, this value is returned; otherwise, we look in the load image list. The load image list is sorted, so we can perform a binary search of this list.

Saving and restoring state: To save state, we save only the changed property list. The load image property list never changes, so we do not need to save this in a saved state file. To restore state, we discard the entire current changed property list, and load the changed property list from the saved state file.

Dynamic construction from stack arguments: The stack contains one or more arguments. The first argument is the superclass object; if this value is nil, the object has no superclass. The object is initialized with the given superclass and no property values. Then, the object's "construct" method is invoked, passing all additional arguments as parameters to the "construct" method.

Garbage collection: Note that we never have to mark as referenced any objects referenced from the load image property list. This list is always identical to the list in the load image, so must by its nature refer only to objects in the root set, which are always reachable and thus never need to be explicitly marked as such during garbage collection. Hence, we only need to mark the objects in the changed property list.

Inheritance: TADS Objects can inherit from other TADS objects. A TADS object that inherits from another TADS object is said to be a subclass of the second object, or to be derived from the second object. If a TADS object inherits from another TADS object, it implicitly defines all of the properties defined by the second object, and has the same value for each property in the second object that doesn't also appear in the first object. Any properties defined in the subclass object override the same properties in the subclass.

For example, suppose we have two objects, A and B, with B defined as inheriting from A. Suppose also that A defines the properties P and Q (i.e., A's property table contains entries for the property ID's of properties P and Q), and that B defines the properties Q and R.

Inheritance is implemented as a search up the superclass tree. When we want to get a property of an object, we start with the object's property table; if the property is defined in the object's property table, the value from that definition is used. Otherwise, we start with the first superclass in the list of superclasses (the order in the table is significant, in that it defines the search order). For each superclass, we recursively try to get the property from the superclass; if it's found, we note the value and the actual object that supplied the value (because the superclass may in turn inherit the value from one of its superclasses, the value may not come directly from the superclass). In general, if the same property is found in two superclasses, the one that came earlier in the list takes precedence over the one that came later. However, there is a complex situation where this is not the case: if the two superclasses share a common base class, and the first superclass inherits the property from the common base class, and the second superclass overrides the property inherited from the common base class, then we use the value from the second superclass, because the value from the second superclass is more specific than one from the first.

To illustrate, consider the following class tree:

    Base defines properties P, Q, R
    A inherits from Base, defines Q
    B inherits from Base, defines R
    C inherits from A, B, defines P

If we get property P from C, then we retrieve the value C.P, because the property is defined directly in C.
If we get property Q from C, we find that C does not have this property in its table, so we must look at superclasses. We start with A, and find that Q is defined in this property table. We continue to superclass B; we do not find Q here, but we do find it in B's superclass Base. However, since Base does not derive from A, where we originally found the property, we ignore this, and return A.Q.
If we get property R from C, we again find that C does not have the property in its table, so we look at superclasses. We start with A; we find that R is not in its table, so we look at its superclass Base, where we find the property. So, we note Base.R and continue looking. We look next at superclass B, where we find R defined. Now, since B derives from Base, where we originally found R, this value overrides the original value we found, so we use this value. Since we have no further superclasses to look at, we return B.R.

Inheritance comes into play in two separate cases. In the first case, we are searching for a given method of a given object; the search order is as described above. In the second case, we are explicitly inheriting from a currently executing method; in this case, the search occurs exactly as in the first case, except that we pretend that the currently executing method did not exist, so we continue the same inheritance search to determine what would come next in the search order if the executing method were not defined. To support this second case, each executing method maintains in its activation frame all of the information needed to continue a previous search: the original target object (so that we can go back to the exact same starting point in the inheritance search that we used to find the executing method in the firs tplace), and the current defining object (which is the point we must reach, and then pass, to find the next object in the search order).

Image File Data

The image file data for an object of metaclass TADS-Object is arranged as follows:

Generic Metaclass Header
UINT2 superclass count
UINT2 load image property count
UINT2 object flags
UINT4 superclass 1 object ID
...
UINT 4 superclass N object ID
Property value 1
...
Property value N

Each property value is arranged as follows:

UINT2 property ID
DATAHOLDER property value

The property table is arranged in order of ascending property ID, with the property ID values ordered as unsigned 16-bit integers. The sorted order is required so that VM implementations can use a binary search or other algorithm that depends upon the table being pre-sorted.

The superclass table is arranged in order of definition of the superclasses, so the first superclass of an object is first in the table; this ordering determines the search order for inherited properties, in that we search the first superclass in the list first.

The "object flags" value is a set of bit-field flags stored as a portable UINT2 value. The valid flag values are shown below. All other bits in the value should be set to zero, to ensure compatibility with any future versions that define additional flag values.

0x0001 The object represents a class, not an instance.

The "Weak Reference" Metaclass

This metaclass works exactly like the regular TADS object metaclass, except that any objects referenced by properties of this object are only "weakly" referenced. This means that a reference from this object to another object does not prevent the garbage collector from deleting the object; instead, if the garbage collector deletes the object, we forget our reference and replace it with nil.

Weak references are useful for certain types of caching and indexing mechanisms. In many cases, when we construct a cache or index, we store references to objects in order to find them more quickly, but we don't want their presence in the cache or index to keep the objects in memory if they are not reachable through any other means. Weak references accomplish this.

The weak reference metaclass is a simple subclass of the TADS object metaclass. The differences are in the garbage collection methods in the generic object interface:

Mark referenced objects: does nothing at all. Our references don't count for garbage collection purposes, since if we are the only object referencing another object, the object can be deleted.
Remove deleted weak references: we scan each object that we reference via a property. For each object whose object descriptor flags indicate that the object is no longer reachable, we delete our reference by replacing the reference with nil.

Weak reference objects cannot be subclasses of regular TADS objects.

Undo interaction: we store undo as usual for explicit property changes made by user code. However, we do not store undo when we forget a reference to an object that is being deleted. In the first place, such an undo record would be useless, because applying undo can never re-create a deleted object (refer to the section on undo for more information). In the second place, such undo information should never be necessary anyway: if an object is being deleted by the garbage collector, it means that the object was not reachable by any means, either from a direct reference from some object or through an undo record. This means that we could undo to the oldest existing savepoint, and the object would still not be reachable. Hence, even after applying all available undo, there would still be no need to restore the weak reference to the object, so there is no need to save the loss of the weak reference in the undo.

Due to interactions with the garbage collector and the undo mechanism, weak references will sometimes continue to refer to an object for some time after the object is not reachable through any other means. Reaching an object through a weak reference does not imply that the object is otherwise reachable, since the garbage collector may simply not have run since the object became unreachable or because the object is reachable only through undo records. User code that makes use of weak references must be designed with this in mind, and must make any appropriate checks, according to its own data model, to determine if the object is still meaningful in the context of the user code's processing. Of course, the object is always still valid as long as it is referenced even through the weak link, so accessing the object will never cause a VM error or result in any data corruption.

Note that if an object is reachable only through a weak link, and some other code accesses the object through the weak link and creates a new strong reference to the object (for example, by putting a reference to it on the stack or in a property of a regular TADS object), there is no ambiguity created with respect to garbage collection and no danger of a dangling reference (a reference to a deleted object). In this situation, if the garbage collector runs after the new strong reference is created, the GC will know it cannot delete the object. So, even though the object could have been deleted if the GC had been invoked during the period of the time the object was reachable only through the weak reference, the fact that it did not run during that time means that the object will never have been deleted. There is no danger of a dangling reference.

The "String" Metaclass

Dynamically-constructed strings are represented as objects of metaclass String. Because the same operations that are possible on dynamic strings are also possible on constant strings, and because a constant string has no associated object, the methods implemented in the String metaclass are all accessible to the byte-code interpreter through static methods that take a constant string's byte array pointer as an argument. The metaclass member functions can all be implemented as calls to the corresponding static methods with the instance's string byte array as an argument.

The variable-size object extension for a String object contains a two-byte length prefix, followed by the bytes of the string. The length prefix gives the length of the string in bytes, not counting the length prefix bytes themselves.

A String object always has a constant text string associated with it, even though the String object itself was dynamically constructed. In other words, an operation such as concatenation applied to a String object does not affect the String object itself, but instead creates a new String object containing the result of the operation. This is important, because it means that a given String object never changes, so an operation on a String object will not affect any other references to the same String object. Consider the following TADS code:

    s1 := substr('hello there', 1, 5);
    s2 := s1;
    s1 += '!!!';

The first line creates a dynamic string containing the text 'hello' and assigns a reference to the dynamic string to the local variable s1. The second line assigns a reference to the same string to the local variable s2. The third line changes s1 by appending the text '!!!'. This does not change the underlying String object to which s1 and s2 refer just before this statement, but instead creates a new String object that contains the result of the concatenation, and assigns a reference to the new object to s1. After the concatenation, s2 still refers to the original String object, which still constains 'hello'.

The String metaclass performs the following operations for the generic object interface:

Get property value. The String metaclass implements several methods through the get-property mechanism, to provide access to these operations to user code or built-in function implementations. Methods include "substring", to extract a substring of the string; "upper case" and "lower case", to convert the case of the string; "search", to find a substring within the string.
Add. The argument (the right operand of the "+" operator) must be a string constant or an object or metaclass String. Retrieves the byte string from the argument, appends it to the byte string of this object, and create and returns a new object of metaclass String containing the concatenation.
Index. Retrieves the character at the given character position, returning a primitive Integer value giving the character code.
Assign to index. This takes an integer index value and an integer value or a string constant or String object value. This sets the character at the given character position to the given new value (in the case of an integer value, the integer is interpreted as a character value; in the case of a string constant or object, the first character of the string is used), creating and returning a new object reflecting the substituted character. Note that this does not change the text of the string stored in this object, since a String object's text value is immutable.

Dynamic construction from stack arguments: Strings cannot currently be constructed dynamically using the NEW instruction. (There'd be little point in constructing a new string from an existing string, since it would just be an unnecessary copy of the same string. It may be desirable to add some form of this in the future, though, perhaps for certain types of type conversions.)

The "List" Metaclass

Dynamically-constructed lists are represented as objects of metaclass List. Because the same operations that can be performed on List objects can also be performed on constant lists, and because a constant list has no associated object, the methods implemented in the List metaclass are all accessible to the byte-code interpreter through static methods that take a constant list's byte array pointer as an argument. The metaclass member functions can all be implemented as calls to the corresponding statis methods with the instance's byte array as an argument.

A List object's variable-size extension contains a two-byte element count, followed by the elements of the list. The elements are stored as a packed array of value holders.

As with String objects, a List object's value is immutable, even though the object itself was dynamically constructed. So, an operation on a list that yields a different value for the list also yields a new List object, rather than changing the original List object's data. This means that a given List object never changes, so an operation on a List object will not affect any other references to the same List object. Consider the following TADS code:

    s1 := ['x' 'y'] + 'z';
    s2 := s1;
    s1[1] := 'w';

The first line creates a dynamic list containing the list ['x' 'y' 'z'], and assigns a reference to the object to the local variable s1. The second line assigns a reference to the same object to the local variables2. The third line changes s1 by assigning the string 'w' to the first element of its list. This does not change the underlying List object to which s1 and s2 refer just before this statement, but instead creates a new List object that contains the result of the index assignment, and assigns a reference to the new object to the original List object, which still contains the original list ['x' 'y' 'z'].

The List metaclass performs the following operations for the generic object interface:

Enumerate references. Calls the enumerator callback function for each element of the list that is an object. Note that this method does not need to be implemented for constant lists, because a constant list can only refer to objects that are part of the root set and are thus inherently reachable at all times.
Add. Appends the value of the argument (the right operand of the "+" operator) to the end of the list. If the argument is of primitive type Integer, Float, Object ID for an object of any metaclass other than List, Property ID, or String constant, this simply adds the value of the argument as a new element. If the argument. If the argument is a List object or list constant, this adds the elements of the list to the end of this list; in other words, concatenating two lists creates a new list whose number of elements is the sum of the numbers of elements of the two original lists. This behavior is different than for other types, in that the argument list is not added to the original list as a sublist at the last element, but rather the argument list's elements are each added one by one to the original list.
Index. Retrieves the list element at the given index value.
Assign to index. Creates a new list that contains the original list with the element at the given index replaced by the given new value. This does not change the original list, but rather creates a new list with the value replaced.

Dynamic construction from stack arguments: Constructing a list from stack arguments results in a list that contains each of the arguments, in the same order in which they appear on the stack (the first item popped becomes the first element of the list, and so on, so the list elements should be pushed in the reverse of the order that they should have in the resulting list). Constructing a list with zero arguments creates an empty list.

The "Array" Metaclass

In addition to the List metaclass, the VM provides an Array metaclass. This metaclass behaves very similarly to the List metaclass, with a few important distinctions:

An array is fixed in size. The number of elements in an array can never changes.
An array's contents are mutable. Assigning a new value to an element of an array does not create a new Array object, but rather changes the original Array object.
There is no such thing as a constant array. The only way that an array can be constructed is by using the new Array operator. However, an array can be constructed based on a list (either a list constant or a List object) by passing the list as the argument to the new Array operator. Alternatively, an array can be constructed simply by specifying the number of elements in the array as the argument to the new Array operator.

The second point is important because it means that, if there are several references to an array, changing an array through one reference will also change the array visible to the other references. This is not true of lists. Consider this TADS code:

    s1 := new Array(10);
    for (i := 1 ; i < 10 ; ++i)
        s1[i] := i;
    s2 := s1;
    s1[5] := 100;

This code first creates an array with ten elements and assigns it to the local variable s1. Next, the code initializes the array by storing a number in each element equal to the element's index. Next, the code assigns a reference to the same array to s2. Finally, the code assigns the number 100 to element 5 of the array in s1. After this final line, the value of s1[5] is 100; in addition, the value of s2[5] is also 100. Since both variables refer to the same array object, and the underlying object is changed by the assignment, both variables see the change. The Array metaclass performs the following operations for the methods in the generic object interface:

Enumerate references. Calls the enumerator callback function for each element of the array that is an object.
Index. Retrieves the list element at the given index value.
Assign to index. Creates a new list that contains the original list with the element at the given index replaced by the given new value. This does not change the original list, but rather creates a new list with the value replaced.

The "Weak Reference List"

Metaclass

This metaclass is exactly like the ordinary list metaclass, except that any object references contained in the list are treated as "weak" references. The presence of a weak reference does not prevent an object from being deleted by the garbage collector in the absence of any other references.

The weak reference list metaclass is implemented as a subclass of the list metaclass. The garbage collection methods (mark references, delete unreachable weak references) are overridden in the subclass to provide weak reference rather than strong reference behavior.

When a referenced object is deleted, the element of the list containing the element is deleted from the list. Note: this behavior is somewhat inconsistent with regular lists, which are immutable.

The "Weak Reference Array"

Metaclass

Much as the weak reference list metaclass provides a subclass of the list metaclass using weak references, the weak reference array metaclass provides a subclass of the array metaclass for arrays of weak references. This metaclass behaves as (and is a suclass of) the regular array metaclass; the only difference is that any object references stored within elements of a weak reference array are treated as weak references.

When a referenced object is deleted, the element of the array containing the element is set to nil.

The "LookupTable" Metaclass

The T3 VM provides a "lookup table" metaclass that provides an efficient mechanism for looking up objects by key value. While the same functionality could be implemented from simpler objects in user code, this native-code metaclass is provided for better performance.

LookupTable objects can be created explicitly by user code via the new LookupTable operator. Once created, a lookup table can be populated using a set of method calls.

A lookup table contains a set of associations; each assocation relates a key and an object. The table can be efficiently searched given a key to yield a list of the objects associated with the key.

A key is simply a string value. The associated value can be any object.

The lookup table object extension stores a hash table. The hash table starts with a count, giving the number of elements in the table, then an array with the given number of elements. Each element of the array consists of a value holder, which contains the key value for the entry, or nil if there is no entry at that element; and a second value holder, which contains a constant list or the ID of a Weak Reference List object, or nil if the element is unused. Each entry of the list is an object that has been associated with the key.

Undo interaction: the lookup table metaclass must store undo information for any changes it makes to the table structure. Each time a table entry is changed, we must store an undo record reflecting the index of the change, the original key value, and the original value. Because an undo record can only contain a single value, we must record a change to the key value and a change to the list value separately.