Copyright © 1999, 2002 by Michael J. Roberts. Preliminary version 15 March 1999.
The "TADS object" metaclass implements what the TADS language programmer thinks of as an object. This type of object stores a list of base classes, and a list of property/value pairs. The property list is mutable throughout the object's lifetime; the values of existing properties can be altered, and new properties can be added. Properties cannot be deleted, however.
We have several goals in implementing the TADS object metaclass. First, property access should be fast, since getting the value of a property is a very common operation. Second, setting the value of a property should be fast. Third, because an object can have many properties, it would be desirable to be able to save an object's state at run-time by saving only the changes made to the load image; objects can have quite large load images, so we could save substantial space in saved state files by storing only the changes between the dynamic state and the initial load image.
The object extension of a TADS object contains a header, the superclass list, the load image property list, and the changed property list. The header contains the number of superclasses, the number of properties in the load image list, the number of properties in the changed list, and the number of allocated property slots. The superclass list, load image property list, and altered property list are kept in memory in portable byte format, to allow very rapid reading and writing of the information.
The load image property list contains the original property list from the load image, if any. This list is never altered throughout the object's lifetime. We keep the load image property list sorted in ascending order of property ID. We require the compiler to sort the property list when creating the load image; we simply assume the list is in sorted order when loaded. Since this list never changes, we do not incur any overhead keeping the list in sorted order.
The modifiable property list is an array of properties that have been changed since the object was loaded. This list is not kept in sorted order; although sorting the list would allow faster access, it would create substantial overhead when setting property values, because it would be necessary to reorder the list to accomodate each insertion.
To find a property, we first scan the modifiable property list. If the property is found in this list, this value is returned; otherwise, we look in the load image list. The load image list is sorted, so we can perform a binary search of this list.
Saving and restoring state: To save state, we save only the changed property list. The load image property list never changes, so we do not need to save this in a saved state file. To restore state, we discard the entire current changed property list, and load the changed property list from the saved state file.
Dynamic construction from stack arguments: The stack contains one or more arguments. The first argument is the superclass object; if this value is nil, the object has no superclass. The object is initialized with the given superclass and no property values. Then, the object's "construct" method is invoked, passing all additional arguments as parameters to the "construct" method.
Garbage collection: Note that we never have to mark as referenced any objects referenced from the load image property list. This list is always identical to the list in the load image, so must by its nature refer only to objects in the root set, which are always reachable and thus never need to be explicitly marked as such during garbage collection. Hence, we only need to mark the objects in the changed property list.
Inheritance: TADS Objects can inherit from other TADS objects. A TADS object that inherits from another TADS object is said to be a subclass of the second object, or to be derived from the second object. If a TADS object inherits from another TADS object, it implicitly defines all of the properties defined by the second object, and has the same value for each property in the second object that doesn't also appear in the first object. Any properties defined in the subclass object override the same properties in the subclass.
For example, suppose we have two objects, A and B, with B defined as inheriting from A. Suppose also that A defines the properties P and Q (i.e., A's property table contains entries for the property ID's of properties P and Q), and that B defines the properties Q and R.
Inheritance is implemented as a search up the superclass tree. When we want to get a property of an object, we start with the object's property table; if the property is defined in the object's property table, the value from that definition is used. Otherwise, we start with the first superclass in the list of superclasses (the order in the table is significant, in that it defines the search order). For each superclass, we recursively try to get the property from the superclass; if it's found, we note the value and the actual object that supplied the value (because the superclass may in turn inherit the value from one of its superclasses, the value may not come directly from the superclass). In general, if the same property is found in two superclasses, the one that came earlier in the list takes precedence over the one that came later. However, there is a complex situation where this is not the case: if the two superclasses share a common base class, and the first superclass inherits the property from the common base class, and the second superclass overrides the property inherited from the common base class, then we use the value from the second superclass, because the value from the second superclass is more specific than one from the first.
To illustrate, consider the following class tree:
Base defines properties P, Q, R
A inherits from Base, defines Q
B inherits from Base, defines R
C inherits from A, B, defines P
Inheritance comes into play in two separate cases. In the first case, we are searching for a given method of a given object; the search order is as described above. In the second case, we are explicitly inheriting from a currently executing method; in this case, the search occurs exactly as in the first case, except that we pretend that the currently executing method did not exist, so we continue the same inheritance search to determine what would come next in the search order if the executing method were not defined. To support this second case, each executing method maintains in its activation frame all of the information needed to continue a previous search: the original target object (so that we can go back to the exact same starting point in the inheritance search that we used to find the executing method in the firs tplace), and the current defining object (which is the point we must reach, and then pass, to find the next object in the search order).
|Generic Metaclass Header|
|UINT2 superclass count|
|UINT2 load image property count|
|UINT2 object flags|
|UINT4 superclass 1 object ID
UINT 4 superclass N object ID
|Property value 1
Property value N
Each property value is arranged as follows:
|UINT2 property ID|
|DATAHOLDER property value|
The property table is arranged in order of ascending property ID, with the property ID values ordered as unsigned 16-bit integers. The sorted order is required so that VM implementations can use a binary search or other algorithm that depends upon the table being pre-sorted.
The superclass table is arranged in order of definition of the superclasses, so the first superclass of an object is first in the table; this ordering determines the search order for inherited properties, in that we search the first superclass in the list first.
The "object flags" value is a set of bit-field flags stored as a portable UINT2 value. The valid flag values are shown below. All other bits in the value should be set to zero, to ensure compatibility with any future versions that define additional flag values.
|0x0001||The object represents a class, not an instance.|
Weak references are useful for certain types of caching and indexing mechanisms. In many cases, when we construct a cache or index, we store references to objects in order to find them more quickly, but we don't want their presence in the cache or index to keep the objects in memory if they are not reachable through any other means. Weak references accomplish this.
The weak reference metaclass is a simple subclass of the TADS object metaclass. The differences are in the garbage collection methods in the generic object interface:
Weak reference objects cannot be subclasses of regular TADS objects.
Undo interaction: we store undo as usual for explicit property changes made by user code. However, we do not store undo when we forget a reference to an object that is being deleted. In the first place, such an undo record would be useless, because applying undo can never re-create a deleted object (refer to the section on undo for more information). In the second place, such undo information should never be necessary anyway: if an object is being deleted by the garbage collector, it means that the object was not reachable by any means, either from a direct reference from some object or through an undo record. This means that we could undo to the oldest existing savepoint, and the object would still not be reachable. Hence, even after applying all available undo, there would still be no need to restore the weak reference to the object, so there is no need to save the loss of the weak reference in the undo.
Due to interactions with the garbage collector and the undo mechanism, weak references will sometimes continue to refer to an object for some time after the object is not reachable through any other means. Reaching an object through a weak reference does not imply that the object is otherwise reachable, since the garbage collector may simply not have run since the object became unreachable or because the object is reachable only through undo records. User code that makes use of weak references must be designed with this in mind, and must make any appropriate checks, according to its own data model, to determine if the object is still meaningful in the context of the user code's processing. Of course, the object is always still valid as long as it is referenced even through the weak link, so accessing the object will never cause a VM error or result in any data corruption.
Note that if an object is reachable only through a weak link, and some other code accesses the object through the weak link and creates a new strong reference to the object (for example, by putting a reference to it on the stack or in a property of a regular TADS object), there is no ambiguity created with respect to garbage collection and no danger of a dangling reference (a reference to a deleted object). In this situation, if the garbage collector runs after the new strong reference is created, the GC will know it cannot delete the object. So, even though the object could have been deleted if the GC had been invoked during the period of the time the object was reachable only through the weak reference, the fact that it did not run during that time means that the object will never have been deleted. There is no danger of a dangling reference.
The variable-size object extension for a String object contains a two-byte length prefix, followed by the bytes of the string. The length prefix gives the length of the string in bytes, not counting the length prefix bytes themselves.
A String object always has a constant text string associated with it, even though the String object itself was dynamically constructed. In other words, an operation such as concatenation applied to a String object does not affect the String object itself, but instead creates a new String object containing the result of the operation. This is important, because it means that a given String object never changes, so an operation on a String object will not affect any other references to the same String object. Consider the following TADS code:
s1 := substr('hello there', 1, 5); s2 := s1; s1 += '!!!';
The first line creates a dynamic string containing the text 'hello' and assigns a reference to the dynamic string to the local variable s1. The second line assigns a reference to the same string to the local variable s2. The third line changes s1 by appending the text '!!!'. This does not change the underlying String object to which s1 and s2 refer just before this statement, but instead creates a new String object that contains the result of the concatenation, and assigns a reference to the new object to s1. After the concatenation, s2 still refers to the original String object, which still constains 'hello'.
The String metaclass performs the following operations for the generic object interface:
Dynamic construction from stack arguments: Strings cannot currently be constructed dynamically using the NEW instruction. (There'd be little point in constructing a new string from an existing string, since it would just be an unnecessary copy of the same string. It may be desirable to add some form of this in the future, though, perhaps for certain types of type conversions.)
A List object's variable-size extension contains a two-byte element count, followed by the elements of the list. The elements are stored as a packed array of value holders.
As with String objects, a List object's value is immutable, even though the object itself was dynamically constructed. So, an operation on a list that yields a different value for the list also yields a new List object, rather than changing the original List object's data. This means that a given List object never changes, so an operation on a List object will not affect any other references to the same List object. Consider the following TADS code:
s1 := ['x' 'y'] + 'z'; s2 := s1; s1 := 'w';
The first line creates a dynamic list containing the list ['x' 'y' 'z'], and assigns a reference to the object to the local variable s1. The second line assigns a reference to the same object to the local variables2. The third line changes s1 by assigning the string 'w' to the first element of its list. This does not change the underlying List object to which s1 and s2 refer just before this statement, but instead creates a new List object that contains the result of the index assignment, and assigns a reference to the new object to the original List object, which still contains the original list ['x' 'y' 'z'].
The List metaclass performs the following operations for the generic object interface:
Dynamic construction from stack arguments: Constructing a list from stack arguments results in a list that contains each of the arguments, in the same order in which they appear on the stack (the first item popped becomes the first element of the list, and so on, so the list elements should be pushed in the reverse of the order that they should have in the resulting list). Constructing a list with zero arguments creates an empty list.
s1 := new Array(10); for (i := 1 ; i < 10 ; ++i) s1[i] := i; s2 := s1; s1 := 100;
This code first creates an array with ten elements and assigns it to the local variable s1. Next, the code initializes the array by storing a number in each element equal to the element's index. Next, the code assigns a reference to the same array to s2. Finally, the code assigns the number 100 to element 5 of the array in s1. After this final line, the value of s1 is 100; in addition, the value of s2 is also 100. Since both variables refer to the same array object, and the underlying object is changed by the assignment, both variables see the change. The Array metaclass performs the following operations for the methods in the generic object interface:
This metaclass is exactly like the ordinary list metaclass, except that any object references contained in the list are treated as "weak" references. The presence of a weak reference does not prevent an object from being deleted by the garbage collector in the absence of any other references.
The weak reference list metaclass is implemented as a subclass of the list metaclass. The garbage collection methods (mark references, delete unreachable weak references) are overridden in the subclass to provide weak reference rather than strong reference behavior.
When a referenced object is deleted, the element of the list containing the element is deleted from the list. Note: this behavior is somewhat inconsistent with regular lists, which are immutable.
Much as the weak reference list metaclass provides a subclass of the list metaclass using weak references, the weak reference array metaclass provides a subclass of the array metaclass for arrays of weak references. This metaclass behaves as (and is a suclass of) the regular array metaclass; the only difference is that any object references stored within elements of a weak reference array are treated as weak references.
When a referenced object is deleted, the element of the array containing the element is set to nil.
The T3 VM provides a "lookup table" metaclass that provides an efficient mechanism for looking up objects by key value. While the same functionality could be implemented from simpler objects in user code, this native-code metaclass is provided for better performance.
LookupTable objects can be created explicitly by user code via the new LookupTable operator. Once created, a lookup table can be populated using a set of method calls.
A lookup table contains a set of associations; each assocation relates a key and an object. The table can be efficiently searched given a key to yield a list of the objects associated with the key.
A key is simply a string value. The associated value can be any object.
The lookup table object extension stores a hash table. The hash table starts with a count, giving the number of elements in the table, then an array with the given number of elements. Each element of the array consists of a value holder, which contains the key value for the entry, or nil if there is no entry at that element; and a second value holder, which contains a constant list or the ID of a Weak Reference List object, or nil if the element is unused. Each entry of the list is an object that has been associated with the key.
Undo interaction: the lookup table metaclass must store undo
information for any changes it makes to the table structure. Each
time a table entry is changed, we must store an undo record
reflecting the index of the change, the original key value, and the
original value. Because an undo record can only contain a single
value, we must record a change to the key value and a change to the
list value separately.