tads-gen Function Set
The tads-gen function set provides general utility and data manipulation functions. These functions have no user interface component.
To use the tads-gen function set in a program, you should #include either <tadsgen.h> or <tads.h> (the latter includes both <tadsio.h> and <tadsgen.h>, for the full set of TADS intrinsics). If you're using the adv3 library, you can simply #include <adv3.h>, since that automatically includes the basic system headers.
tads-gen functions
dataType(val)
firstObj(cls?, flags?)
If the flags argument is specified, it can be a combination (with the | operator) of the following bit flags:
- ObjInstances - the function returns only instances, not "class" objects. This is the default.
- ObjClasses - the function returns only objects originally defined as "class" objects.
- ObjAll - this is defined for convenience as (ObjInstances | ObjClasses).
If the flags argument is omitted, only instances are enumerated, as though ObjInstances had been specified.
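The flags are easiest to see in an enumeration loop. The following is a minimal sketch, assuming the usual pattern of starting with firstObj() and continuing with nextObj() until it returns nil; countObjects is a hypothetical helper name, not part of the function set.

```tads3
#include <tads.h>

/* Hypothetical helper: count the objects descended from cls.
 * Pass ObjInstances, ObjClasses, or ObjAll as flags. */
countObjects(cls, flags)
{
    local n = 0;

    /* firstObj() starts the enumeration; nextObj() continues it,
     * returning nil when there are no more objects */
    for (local obj = firstObj(cls, flags) ; obj != nil ;
         obj = nextObj(obj, cls, flags))
        ++n;

    return n;
}
```

Passing ObjAll here would count "class" objects as well as instances, while ObjInstances matches the default behavior.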
getArg(idx)
getFuncParams(funcptr)
- returnValue[1] is the minimum number of arguments taken by the function;
- returnValue[2] is the number of additional optional arguments taken by the function;
- returnValue[3] is true if the function accepts any number of additional arguments (i.e., it's a "varargs" function), nil if not.
The second element gives the number of optional arguments; this element is always zero, because there's no way for an ordinary function (non-intrinsic) to specify optional arguments. This element is included in the list specifically so that the list uses the same format as the Object.getPropParams() method.
If the third element is true, it indicates that the function was defined with the ... varying argument list notation.
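As an illustration of reading the three-element result, here is a small sketch. The names describeParams and logAll are hypothetical and exist only for this example.

```tads3
#include <tads.h>

/* Hypothetical helper: summarize a function's parameter list as a string */
describeParams(func)
{
    /* info is [minimum args, optional args, varargs flag] */
    local info = getFuncParams(func);

    return 'minimum ' + info[1]
        + ', optional ' + info[2]
        + (info[3] ? ', plus varargs' : '');
}

/* example function declared with the "..." notation */
logAll(prefix, ...) { }
```

Following the rules above, describeParams(logAll) should report a minimum of one argument, zero optional arguments, and a varying argument list.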
getTime(timeType?)
- GetTimeDateAndTime - returns the current system date and time as a list with the following elements: [year, month, monthDay, weekDay, yearDay, hour, minute, second, timer].
- year is the AD year number (the full four-digit number)
- month is the month number (January = 1, February = 2, etc.)
- monthDay is the day of the month (on March 13, this returns 13)
- weekDay is the day of the week (Sunday = 1, Monday = 2, etc.)
- yearDay is the day of the year (January 1 = 1, January 2 = 2, etc.)
- hour is the hour of the day on a 24-hour clock (2:00 PM returns 14)
- minute is the minute of the hour (2:35 PM returns 35)
- second is the second of the minute (2:35:12 PM returns 12)
- timer is the number of seconds since January 1, 1970.
- GetTimeTicks - returns the number of milliseconds since some arbitrary zero point. The precision of the timer varies by system, so the fact that the return value is represented with millisecond precision doesn't necessarily mean that the system is actually capable of measuring time differences that precisely. The zero point is chosen as the time of the first call during a VM session, which reduces the likelihood that the program will ever encounter a timer rollover (i.e., the point at which the timer exceeds the 31-bit precision of the integer return value and thus resets back to zero). A rollover occurs after 2^31 milliseconds, or about 24.8 days, of continuous execution.
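The two time types suit different jobs: GetTimeDateAndTime for calendar values, GetTimeTicks for measuring intervals. The sketch below shows one use of each; isoDate and timeIt are hypothetical helper names, and the date layout is just one possible choice.

```tads3
#include <tads.h>

/* Hypothetical helper: current date as 'YYYY-MM-DD' */
isoDate()
{
    /* elements 1..3 of the list are year, month, and day of month */
    local t = getTime(GetTimeDateAndTime);
    return sprintf('%04d-%02d-%02d', t[1], t[2], t[3]);
}

/* Hypothetical helper: run func and return the elapsed milliseconds */
timeIt(func)
{
    local start = getTime(GetTimeTicks);
    func();
    return getTime(GetTimeTicks) - start;
}
```

Because only the difference between two tick values is used, the arbitrary zero point doesn't matter in timeIt.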
makeString(val, repeatCount?)
- If val is a string, the return value is the given number of copies of the string appended one after the other. For example, makeString('abc', 3) yields 'abcabcabc'.
- If val is a list (or a list-like object), the list must contain integers. Each integer in the list gives a Unicode character value. The function constructs a string with the same number of characters as the list has elements, and with each character of the string having the Unicode code point of the corresponding integer in the list. This string is then repeated the given number of times. For example, makeString([65,66,67]) yields 'ABC'.
- If val is an integer, the function returns a string consisting of the single Unicode character whose code point is given by the integer, repeated the specified number of times. For example, makeString(65, 5) yields 'AAAAA'.
- Other types are invalid.
If repeatCount is not specified, the default is 1. If the value is less than zero, an error is thrown.
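A typical use of the string form is building padding or rules of a computed length. This is a minimal sketch; underlined is a hypothetical helper name.

```tads3
#include <tads.h>

/* Hypothetical helper: return the title followed by a newline and a
 * row of '=' characters of the same length */
underlined(title)
{
    return title + '\n' + makeString('=', title.length());
}
```

The list form works the same way with code points: makeString([72, 105]) builds the two-character string 'Hi'.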
max(val1, ...)
min(val1, ...)
nextObj(obj, cls?, flags?)
rand(x, ...)
- With a single integer argument, rand() generates a random integer from 0 to one less than the given value, inclusive, and returns the integer value; for example, rand(10) returns a value from 0 to 9.
- With two or more arguments, rand() randomly selects one of the arguments, evaluates it, and returns the result. Unlike an ordinary function call, rand() only evaluates the single selected argument, which means that only the selected argument's side effects are triggered. This is useful because it means you can intentionally use rand() to randomly select a set of side effects to trigger. For example, rand(f(x), g(x), h(x)) picks one of the functions f(), g(), or h() at random and calls only that selected function.
- With a single argument that's a list, Vector, or list-like object, rand() randomly selects one of the elements of the list and returns it. For example, rand(['red', 'green', 'blue']) will pick one of the color names at random and return it.
- With a single string argument, rand() returns a string of random characters based on a template given by the argument string. The template is a series of codes specifying classes of characters to select randomly. For example, 'd' specifies a random digit, so if you want to create a string of five random digits, you can write 'ddddd'. The template syntax also has some extended features that allow repetition and grouping, using syntax that's similar to the regular expression language. Here are all of the template codes:
- a - randomly selects a lower-case letter, 'a' to 'z'
- A - randomly selects an upper-case letter, 'A' to 'Z'
- c - a mixed-case letter, 'a' to 'z' or 'A' to 'Z'
- z - a digit or mixed-case letter, '0' to '9', 'a' to 'z', or 'A' to 'Z'
- d - a digit, '0' to '9'
- x - a lower-case hexadecimal digit, '0' to '9' or 'a' to 'f'
- X - an upper-case hex digit, '0' to '9' or 'A' to 'F'
- i - any printable ASCII character (U+0020 to U+007E)
- l - (lower-case L) any printable Latin-1 character (U+0020 to U+007E, U+00A0 to U+00FF)
- u - any printable Unicode character; this includes all Unicode characters except undefined characters, the private use area, and the control characters.
- b - a random "byte" - that is, a character whose Unicode character code fits in eight bits (U+0000 to U+00FF)
- %x - literally 'x' (copies the character following the % sign exactly as given). This removes the special meaning from the character. To include a literal '%', write '%%'.
- "abc" - literally "abc"; removes the special meanings of all characters within the quotes, and groups the characters together as though they were enclosed in parentheses. For example, "abc"{3} turns into 'abcabcabc'. To include a quote, write it twice in a row: "abc""def".
- [abcw-z] - one character chosen randomly from the listed characters. "w-z" includes the range of characters from "w" to "z" (so it includes w, x, y, and z). To include ']' or '-' in the set, use '%]' or '%-', respectively (and to include '%', use '%%').
- a|b - select one of a or b at random. Each of these can be any string of character codes. Any number of alternatives can be strung together with additional '|' symbols. For example, 'ddd|aaa|AAA' randomly selects a string of three digits or three lower-case letters or three upper-case letters. Each alternative is equally likely, so if there are four alternatives, each one has a 25% chance of being chosen.
- {5} - repeat the preceding item 5 times. Each repeat is a new random selection; for example, d{9} produces nine random digits. You can use parentheses to repeat a group of items: (ddd%-){2} produces three random digits and a hyphen, then three more random digits and another hyphen. ((d{3}%-){2} is another way to write the same thing.)
- {5,8} - repeat the preceding item a random number of times from 5 to 8. A uniformly distributed random integer is chosen within the given range to determine the number of repeats.
- {5,} - repeat the preceding item at least five times, plus a random number of times more than that. Each additional item is added with a 2/3 probability, so the odds of no more items being added are 1/3, the odds of exactly one extra item are 2/9, and so on.
- ? - repeat the preceding item zero or one times, at random. Same as {0,1}.
- * - repeat the preceding item zero or more times, at random. Same as {0,}.
- + - repeat the preceding item one or more times, at random. Same as {1,}.
- (a) - parentheses can be used to group items for the symbols |, { }, *, ?, and +. For example, ab{3} is the same as abbb, while (ab){3} is equivalent to ababab.
- All other characters are ignored (and produce no output at all).
In all cases, rand() chooses numbers that are uniformly distributed over the relevant range, which means that each value in the range has equal probability.
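The sketch below exercises three of the forms described above: an integer range, a list, and a template string. The function names are hypothetical, chosen only for the example.

```tads3
#include <tads.h>

/* roll one six-sided die: rand(6) yields 0 to 5, so add 1 */
rollDie() { return rand(6) + 1; }

/* pick one element of a list at random */
randomColor() { return rand(['red', 'green', 'blue']); }

/* Hypothetical helper: three upper-case letters, a literal hyphen
 * (written %- in the template), then three digits */
randomPlate() { return rand('A{3}%-d{3}'); }
```

Note the %- in the template: a bare hyphen isn't a template code, so it would be ignored and produce no output.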
rand() uses a cryptographic pseudo-random number generator called ISAAC. "Pseudo-random" means that the numbers aren't truly random; rather, they come from a mathematical formula that generates numbers that look random. They look random in two senses: mathematically, the distribution of the numbers satisfies various statistical tests of randomness; and practically, it's impossible to predict the next value in the sequence, even if you've collected a lot of past values, because the generator has hidden internal state that you can't infer from observing the output sequence.
The formula that produces the numbers is deterministic, so you'll always get the same series of output values for a given starting state for the generator. This makes it important to randomize the initial state, so that the sequence changes on each run, making it truly unpredictable. As of TADS 3.1, the interpreter automatically randomizes the initial state by default when the program starts. TADS gets the random initial "seed" values from the operating system; most modern systems have sources of true entropy for this purpose. In some cases you might actually want the same sequence of numbers on every run; for example, when running regression tests, it's useful to have a reproducible sequence of events that plays out exactly the same way every time. To prevent TADS from randomizing the initial state, you use the -norand option when you start the TADS interpreter.
randomize()
See rand() for more details on the random number generator.
Starting in version 3.1, TADS automatically seeds the rand() generator at startup (except when running with the debugger, such as Workbench on Windows). However, the user can override this using the -norand option when running the interpreter. If you're writing a game where it's important for the random sequence to be unpredictable, you might still want to call randomize() to prevent the user from trying to rig the game by using -norand to get a repeatable rand() sequence.
On the other hand, if you're running regression tests, it's important to have rand() return a repeatable sequence of numbers, so that the program produces exactly the same results on every run even if it calls rand(). For this situation, you should use the -norand option, and skip any calls to randomize(). This will make the interpreter use a fixed initial seed value so that rand() returns the same sequence on each run.
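For the first case, a game that wants an unpredictable sequence even under -norand, a minimal sketch might look like the following. It assumes a program that uses the standard main(args) entry point rather than the adv3 library's startup code.

```tads3
#include <tads.h>

main(args)
{
    /* reseed explicitly, so that -norand can't make the sequence repeat */
    randomize();

    "Today's lucky number is <<rand(100) + 1>>.\n";
}
```

A regression-test build would leave out the randomize() call and run with -norand instead.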
restartGame()
restoreGame(filename)
All objects, except transient objects, are restored to the state they had when the state was saved to the given file.
If an error occurs, the function throws a run-time error. The errno_ property of the RuntimeError exception object gives a VM error code describing the problem; the possible errors are:
- 1201 - the file does not contain a saved state (it has some other type of data)
- 1202 - the state was saved by a different program, or by a different version of the same program
- 1207 - the file is corrupted
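One way to handle these errors is to catch the RuntimeError and branch on errno_. The following is a sketch, not a complete RESTORE command; tryRestore is a hypothetical helper and the message wording is illustrative.

```tads3
#include <tads.h>

/* Hypothetical helper: restore from fname, returning true on success */
tryRestore(fname)
{
    try
    {
        restoreGame(fname);
        return true;
    }
    catch (RuntimeError err)
    {
        /* errno_ holds the VM error code listed above */
        switch (err.errno_)
        {
        case 1201:
            "That file isn't a saved game.\n";
            break;

        case 1202:
            "That file was saved by a different game or version.\n";
            break;

        case 1207:
            "That file is corrupted.\n";
            break;

        default:
            "The restore failed (VM error <<err.errno_>>).\n";
            break;
        }
        return nil;
    }
}
```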
rexGroup(groupNum)
Only ordinary "capturing" groups are counted in the numbering scheme. Assertions and non-capturing groups aren't counted.
The return value is nil if groupNum is higher than the number of groups in the regular expression, or if there was no match for the group. If there's a match for the group, the return value is a three-element list: the first element (at index [1]) is the character index of the group match within the original source string; the second element is the length in characters of the group match; and the third element is a string giving the matching text.
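To show how the three-element list is typically used, here is a sketch that pulls two captured groups out of a match. parseSetting is a hypothetical helper, and the 'name=123' input format is an assumption made for the example.

```tads3
#include <tads.h>

/* Hypothetical helper: parse 'name=123' into [name, number], or nil */
parseSetting(str)
{
    if (rexMatch('(<alpha>+)=(<digit>+)', str) == nil)
        return nil;

    /* each rexGroup() result is [index, length, text] */
    local name = rexGroup(1)[3];
    local val = toInteger(rexGroup(2)[3]);
    return [name, val];
}
```

Under these assumptions, parseSetting('width=120') returns ['width', 120].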
rexMatch(pat, str, index?)
If the leading substring of str matches the regular expression, the function returns the number of characters of the matching substring; if there is no match, the function returns nil. This does not search for a match, but merely determines if str matches the expression in its leading substring. Note that a regular expression can successfully match zero characters, so a return value of zero is distinct from a return value of nil: zero indicates a successful match that's zero characters long, and nil indicates no match.
If index is given, it indicates the starting index for the match; index 1 indicates the first character in the string, and is the default if index is omitted. If index is negative, it's an index from the end of the string (-1 for the last character, -2 for the second to last, etc). This can be used to match a substring of str to the pattern without actually creating a separate substring value.
Refer to the regular expressions section for details on how to construct a pattern string.
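The distinction between nil and zero matters when testing the result. The sketch below uses two hypothetical helpers to show both cases.

```tads3
#include <tads.h>

/* Hypothetical helper: true if str begins with one or more digits */
startsWithNumber(str)
{
    /* compare against nil explicitly; 0 would also be a successful match */
    return rexMatch('<digit>+', str) != nil;
}

/* Hypothetical helper: length of the run of digits starting at idx */
digitsAt(str, idx)
{
    /* '<digit>*' can match zero characters, so this returns 0, not nil,
     * when there's no digit at idx */
    return rexMatch('<digit>*', str, idx);
}
```

For example, digitsAt('abc123', 4) returns 3, while digitsAt('abc123', 1) returns 0 because the pattern matches an empty string there.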
rexReplace(pat, str, replacement, flags?, index?)
The return value is the resulting string with the substitutions applied.
pat can be a string that uses the regular expression syntax to specify the search pattern, or it can be a RexPattern object. (The latter is more efficient if you'll be performing the same search repeatedly, since it saves the work of re-parsing the regular expression each time.) pat can also be a list (or a Vector or other list-like object) containing multiple search patterns; if it is, replacement can similarly be a list of replacements. More on this shortly.
Refer to the regular expressions section for details on how to construct a pattern string.
The flags value is optional. It controls variations on the replacement process. If it's not provided, the default is ReplaceAll. If flags is specified, it's a bitwise combination (with '|') of the following values:
- ReplaceOnce: replace only the first match for the pattern(s).
- ReplaceAll: replace all matches for the pattern(s). This is the default if ReplaceOnce isn't specified, and supersedes ReplaceOnce if both are included.
- ReplaceIgnoreCase: ignore case (that is, capitalization) when searching for the pattern. If this flag isn't included, the default is to search for the pattern exactly, matching capitals only to capitals and lower-case only to lower-case. However, any <case> or <nocase> directive in the regular expression itself supersedes the presence or absence of this flag.
- ReplaceFollowCase: capitalize lower-case letters in the replacement text to follow the capitalization pattern in the matched text for each match. Specifically, if all of the letters in the matched text are upper-case, every letter in the replacement text is capitalized; if all of the letters in the match are lower-case, the replacement text is unchanged; if the match has a mix of capitals and minuscules, the first lower-case letter in the replacement text is capitalized, and the rest are unchanged. This only affects lower-case letters in the replacement string. "%" sequences aren't affected. If the replacement is a callback function instead of a string, it's not affected either; we assume that the function returns the exact replacement text it intends.
- ReplaceSerial: if a list of patterns is provided, this flag scans for the patterns in the list serially: first, we replace all occurrences of the first pattern only (or just the first occurrence, in ReplaceOnce mode); then we start over with the updated string, and replace occurrences of the second pattern only; and so on for each pattern in the list. The default mode is "parallel" mode, which scans the string for all of the patterns at once, replacing the leftmost match for any of them, then repeating this process on the remainder of the string after the first match. See below for more details.
Note that you should never use 0 as the flags value. For compatibility with older versions, 0 has a special meaning equivalent to ReplaceOnce. If you have no other flags to specify, always use either ReplaceOnce or ReplaceAll, or simply omit the flags argument entirely.
If index is given, replacements start with the first instance of the pattern at or after the character index position. The first character is at index 1. If index is omitted, the search starts at the first character. If index is negative, it's an index from the end of the string (-1 for the last character, -2 for the second to last, and so on). Note that a negative index doesn't change the left-to-right order of the replacement; it's simply a convenience for specifying the starting point.
replacement is a string to be substituted for each occurrence of a match to the regular expression pattern pat (or for just the first match, when ReplaceOnce is specified). Each match is deleted from the string, and replacement is inserted in its place.
The replacement text can include the special sequences %1 through %9 to substitute the original text that matches the corresponding parenthesized group in the regular expression. %1 is replaced by the original matching text of the first parenthesized group expression, %2 by the second group's matching text, and so on. In addition, %* is replaced by the match for the entire regular expression. Because of the special meaning of the percent sign, you have to use the special code %% if you want to include a literal percent sign in the replacement text.
For example, this would replace negative numbers in a string with accountant's notation, by putting each negative number within parentheses and coloring it red:
str = rexReplace('-(<digit>+)', str, '<font color=red>(%1)</font>', ReplaceAll);
Note that we've used a parenthesized group in the pattern to group the digits together. This grouped part of the match is available as %1 in the replacement text. This is how we manage to specify a replacement that includes the original numeric value that we matched. Note also that the minus sign is outside of the group, because we don't want to include it in the substitution - we want to change a string like "-120" to "(120)".
Using a pattern list: pat can be specified as a list of regular expressions (as strings, RexPattern objects, or a mix of the two). This lets you make substitutions for several different patterns at one time, without making successive calls to rexReplace().
When you supply a list of patterns, you can optionally supply a list of replacements (as strings, callback functions, or a mix). Each item in the pattern list is matched up with the corresponding item - the item at the same list index - in the replacement list. That is, pat[1] will be replaced with replacement[1], pat[2] will be replaced with replacement[2], and so on. If there are more patterns than replacements, the excess patterns are replaced with empty strings. Any excess replacements are simply ignored.
If pat is a list but replacement isn't, rexReplace() simply uses the same replacement for every pattern. Note that this is different from passing replacement as a list containing one element: when replacement is a single-item list, all patterns after the first are replaced by empty strings, because of the rule for when the pattern list is longer than the replacement list.
There are two ways that rexReplace() can apply a list of replacements. The default is "parallel" mode. In this mode, rexReplace() scans the string for all of the patterns at once, and replaces the first (leftmost) occurrence of any pattern. (If two of the patterns match at the same position in the string, the one with the lower pat list index takes precedence.) If the ReplaceOnce flag is specified, the whole operation is done after that first replacement; otherwise, rexReplace() scans the remainder of the string, to the right of the first replacement, again looking for the leftmost occurrence of any of the patterns. It replaces that second occurrence, then repeats the process until there are no more matches for any of the patterns.
Parallel mode is similar to combining all of the patterns in the list using "|" to make a single pattern. There's a key difference, though: using a list of patterns allows you to specify a separate replacement for each pattern.
The other mode is "serial" mode, which is used when you specify the ReplaceSerial flag. In serial mode, rexReplace() starts by scanning only for the first pattern, replacing each occurrence of that pattern throughout the string (or, if the ReplaceOnce flag is used, replacing just the first occurrence). If ReplaceOnce is specified, and we replaced a match for the first pattern, we're done. Otherwise, rexReplace() starts over with the updated string - the result of applying the replacements for the first pattern - and scans this updated string for the second pattern. As with the first pass, we scan only for the second pattern on this pass, and we replace all occurrences (or just the first, if ReplaceOnce is used). We repeat this process for each additional pattern.
The ReplaceSerial mode is almost equivalent to calling rexReplace() iteratively, once for each pattern in the search list, using the result of the first call as the subject string on the second call, the result of the second as the subject string for the third, and so on. The difference arises when ReplaceOnce is specified: a serial mode list will then result in only one replacement overall, whereas calling the function iteratively could make another replacement on each iteration.
You should note an important feature of the serial mode: the replacement text from one pattern is subject to further replacement on the next pattern. This is because the entire result from each pass is used as the new subject string on the next pass. In contrast, in parallel mode the replaced text is never rescanned.
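The difference between the two modes shows up clearly when the patterns and replacements overlap. The following sketch swaps two words; swapDemo and its sample strings are hypothetical.

```tads3
#include <tads.h>

swapDemo()
{
    local str = 'cat chases dog';
    local pats = ['cat', 'dog'];
    local reps = ['dog', 'cat'];

    /* parallel (the default): replaced text is never rescanned */
    local par = rexReplace(pats, str, reps, ReplaceAll);

    /* serial: the second pass sees the output of the first */
    local ser = rexReplace(pats, str, reps, ReplaceAll | ReplaceSerial);

    return [par, ser];
}
```

Following the rules described above, the parallel call yields 'dog chases cat'. The serial call first turns the string into 'dog chases dog' and then replaces both occurrences of "dog", yielding 'cat chases cat'.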
Using a callback function to generate the replacement: You can supply a function for replacement, instead of a string. This can be a regular named function or an anonymous function. When a function is specified, rexReplace() invokes it for each match to determine the replacement text. This is very powerful because it lets you apply virtually any transformation to each replacement, rather than just substituting a fixed string.
A replacement function is invoked once for each matching string, as follows:
func(matchString, matchIndex, originalString);
The matchString parameter receives a string containing the text that the regular expression matched, and which is to be replaced. matchIndex is the character index within the original string where this match starts. originalString is the full original string that's being searched. The function should return a string giving the replacement text. It can alternatively return nil to replace the match with nothing, which is equivalent to returning an empty string. Within the function, you can use rexGroup() to retrieve the match text for any parenthesized groups within the search pattern.
You can omit one or more of the parameters when you define the callback function, because rexReplace will only supply as many arguments as the function actually wants. The arguments are always in the same order, though - the names don't matter, just the order. This means that if you provide a callback that only takes one argument, it gets the match string value; with two arguments, they'll be assigned the match string and match index, respectively.
Here's an example that uses a replacement function to perform "title case" capitalization on a string. This capitalizes the first letter of each word in the string, except that it leaves a few small words (such as "of" and "the") unchanged, but only when they occur in the middle of the text. This takes advantage of a callback function's ability to vary the replacement based on the matched text and its position in the subject text. Note that this function omits the third parameter, since it doesn't need the original string to carry out its task.
titleCase(str)
{
    local r = function(s, idx)
    {
        /* don't capitalize certain small words, except at the beginning */
        if (idx > 1 && ['a', 'an', 'of', 'the', 'to'].indexOf(s.toLower()) != nil)
            return s;

        /* capitalize the first letter */
        return s.substr(1, 1).toUpper() + s.substr(2);
    };
    return rexReplace('%<(<alphanum>+)%>', str, r, ReplaceAll);
}
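A callback can be shorter still when it needs only the matched text. This sketch uses a one-parameter short-form anonymous function; doubleNumbers is a hypothetical helper.

```tads3
#include <tads.h>

/* Hypothetical helper: double every integer that appears in str */
doubleNumbers(str)
{
    /* the callback takes only the match string, so rexReplace()
     * passes it just that one argument */
    return rexReplace('<digit>+', str,
                      {m: toString(toInteger(m) * 2)}, ReplaceAll);
}
```

For example, doubleNumbers('3 apples and 12 pears') returns '6 apples and 24 pears'.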
rexSearch(pat, str, index?)
If index is given, it gives the starting character position in str for the search. The first character is at index 1. If index is omitted, the search starts with the first character. A negative value is an index from the end of the string: -1 for the last character, -2 for the second to last, etc. Note that a negative index doesn't change the left-to-right order of the search; it's simply a convenience for specifying the starting point. The index value can be used to search for repeated instances of the pattern, by telling the function to ignore matches before the given point in the string.
If the function finds a match, it returns a list with three elements: the character index within str of the first character of the matching substring (the first character in the string is at index 1); the length in characters of the matching substring; and a string giving the matching substring. If there is no match, the function returns nil.
Refer to the regular expressions section for details on how to construct a pattern string.
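The index argument makes it possible to collect repeated matches in a loop. Here is one way to sketch that; findAll is a hypothetical helper, and the handling of zero-length matches is a design choice made for this example.

```tads3
#include <tads.h>

/* Hypothetical helper: return a list of every match for pat in str */
findAll(pat, str)
{
    local found = [];
    local idx = 1;

    while (idx <= str.length())
    {
        /* m is [index, length, text], or nil if there's no match */
        local m = rexSearch(pat, str, idx);
        if (m == nil)
            break;

        found += m[3];

        /* resume after this match; step at least one character so a
         * zero-length match can't loop forever */
        idx = m[1] + (m[2] > 0 ? m[2] : 1);
    }
    return found;
}
```

With this sketch, findAll('<digit>+', 'a1b22c333') returns ['1', '22', '333'].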
saveGame(filename, metaTable?)
filename specifies the file to save to; this can be a string giving the name of a file in the local file system, or a TemporaryFile object.
If an error occurs, the function throws a run-time error to indicate the problem. The saved state can later be restored using restoreGame().
metaTable is an optional LookupTable object containing "metadata" information to store in the file. This is a collection of game-specific descriptive information; this could include things like the current room name, score, number of turns, chapter number, etc. The interpreter and other tools can extract this information and display it to the user when browsing saved game files. For example, the file selector dialog for a RESTORE command could display the metadata for each available file.
The metaTable LookupTable must consist of string key/value pairs. saveGame() simply ignores any non-string keys or non-string values found in the table. Both the keys and the values are meant to be displayed to the user, so the keys should be descriptive titles for their respective values.
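Since non-string entries are ignored, numeric values need converting before they go into the table. The sketch below assumes hypothetical parameter names and key titles; saveWithInfo is not part of the function set.

```tads3
#include <tads.h>

/* Hypothetical helper: save to fname with descriptive metadata,
 * returning true on success and nil on failure */
saveWithInfo(fname, roomName, score, turns)
{
    local meta = new LookupTable(16, 32);

    /* keys and values must both be strings */
    meta['Location'] = roomName;
    meta['Score'] = toString(score);
    meta['Turns'] = toString(turns);

    try
    {
        saveGame(fname, meta);
        return true;
    }
    catch (RuntimeError err)
    {
        return nil;
    }
}
```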
savepoint()
sprintf(format, ...)
format is a string that can contain a mix of plain text and "format codes". The additional arguments after format are data values that are substituted into the result string according to the format codes. Each format code in the format string corresponds to an item in the argument list, and is replaced by the string-formatted value of that argument.
The return value is a new string, consisting of the text of the format string, with each format code replaced by the corresponding argument value, formatted according to the format code.
Here's a simple example:
local str = sprintf('i=%d, j=%d, k=%d', 99, 23, 145);
This produces the result string 'i=99, j=23, k=145'.
A format code is a special sequence of characters within the format string, consisting of the following, in order: % flags width.precision type-spec. The % and type-spec are required, and everything else is optional.
The flags, if present, consist of one or more of the following, in any order:
[n] | Argument number. n is a number from 1 to the number of arguments after the format string. The value for this item will be taken from the given argument number rather than the default positional argument. For example, sprintf('i = %[2]d, j = %[1]d', 100, 200) produces 'i = 200, j = 100'. |
- | Left alignment. If the formatted value is shorter than the width value, padding will be added after the value. By default, the value is right-aligned (padding is added before the value). For example, sprintf('i=%-4d', 123) produces 'i=123 '. |
+ | Always show the sign. A "+" sign is shown before positive numbers and 0. By default, only negative numbers are shown with a sign. For example, sprintf('i = %+d', 123) produces 'i = +123'. |
(space) | Show a space character before positive numbers and 0. This can be used to make positive and negative values use the same number of characters, without forcing a "+" sign before positive values. |
, | Digit grouping. For integer and floating point types (b, d, e, E, f, g, G, o, u, x, X), adds a comma between each group of three digits (only before any decimal point). For example, sprintf('i = %,d', 1234567) produces 'i = 1,234,567'. |
_x | Padding character: changes the padding character from the default (a space) to x, which is any single character. E.g., %_*8d formats 123 as '*****123'. |
# | For integer types (b, d, o, u, x, X), if a width is specified, adds leading zeros as needed to display at least width digits. For the floating point types (e, E, f, g, G), displays a decimal point even if there are no digits after the decimal. For floating point types g and G, keeps all trailing zeros after the decimal point (trailing zeros are normally removed) so that precision digits are always displayed. |
width is an optional number giving the minimum number of characters to use for the item. If the formatted value is shorter than width, padding will be added before or after the item to fill out the specified field width. Spaces are used for padding by default, except that if width starts with a zero (e.g., "%08d"), leading zeros are used instead, provided that left alignment isn't also specified. The "_" flag (see above) lets you specify a custom padding character. width is only a minimum; if the displayed value is longer than width, the value isn't truncated.
precision is another optional number, preceded by a period ".". For example, %.8d specifies a precision of 8. The meaning of precision varies by type:
Integer types (b, d, u, o, x, X) | The minimum number of digits to display. Leading zeros will be added as needed. For example, '%.8d' displays 1234 as '00001234'. By default, no leading zeros are added. |
Basic floating point types (e, E, f) | The number of digits to display after the decimal point. The default is 6 digits. If the argument value has more digits than can be displayed, the value is rounded. For example, %.3f formats 123.456789 as '123.457'. |
Variant floating point types (g, G) | The maximum number of significant digits to display (including before and after the decimal point). If the argument value has more significant digits than can be displayed, the value is rounded. For example, %.3g formats 12.789 as '12.8'. |
Other types | Ignored |
type-spec is the type specifier, which determines how the argument value is interpreted and formatted. This is a single character, from the following list:
% | a literal % sign. This type doesn't use an argument value. |
b | binary integer. The argument is interpreted as a number, and its unsigned integer value is rendered in binary (base 2, using 1s and 0s to represent the bits). |
c | character. If the argument is a string, the first character of the string is used; otherwise the argument is interpreted as a number giving a Unicode character code, and that character is used. |
d | decimal integer. The argument is interpreted as a number, and its integer value is rendered in decimal. |
e | number in scientific notation ("exponent" format, such as 1.23e+010). The argument is interpreted as a number, and its value is rendered in scientific notation. By default, the value is displayed with exactly 6 digits after the decimal point, but you can change this by specifying a precision. For example, %.8e uses 8 digits after the decimal point. A precision of zero, %.0e or %.e, omits the decimal point, unless the # flag is specified (e.g., %#0.e). |
E | same as e, but displays the exponent with a capital "E". |
f | floating point number. The argument is interpreted as a number, and its value is rendered in floating point format. By default, the number is displayed with no limit on the digits before the decimal point, and exactly 6 digits after the decimal point. You can change the number of digits after the decimal point by specifying a precision value: %.8f displays 8 digits after the decimal point. A precision of zero, %.0f or %.f, omits the decimal point, unless the # flag is specified (e.g., %#0.f). |
g | uses the shorter of e or f format. Interprets the argument as a number, and displays it in f format if the decimal exponent is in the range from -4 to the precision value, otherwise uses e format. The precision option for this format specifies the total number of significant digits to display; the default is 6. By default, trailing zeros after the decimal point are removed, and the decimal point itself is removed if there are no digits to display after it. The # flag keeps any trailing zeros, and keeps the decimal point even if there aren't any digits to display. |
G | same as g, but displays a capital "E" if scientific notation is used. |
o | octal integer. The argument is interpreted as a number, and its unsigned integer value is rendered in octal (base 8). |
s | string. The argument value is rendered as a string. By default, the entire string is shown, but if there's a precision setting, it specifies the maximum number of characters to show from the string; if the string is longer, it's truncated to that number of characters. |
u | decimal integer, unsigned. The argument is interpreted as a number, and its unsigned integer value is rendered in decimal. |
x | hexadecimal integer. The argument is interpreted as a number, and its unsigned integer value is rendered in hexadecimal (base 16) using lower-case letters (abcdef). |
X | same as x, but uses upper-case letters (ABCDEF). |
Other characters are not valid as type specifiers. If you use an invalid type code, the whole % sequence will be retained in the result string without any substitutions.
The first (leftmost) % item in the format string is matched up with the first argument in the argument list, and each subsequent % item is matched up with the next argument. (You can also use the [ ] flag to select a particular argument, instead of automatically using the next argument in the list.)
Each type-spec code expects a particular datatype for its argument value. If the value isn't of the correct type to begin with, sprintf will automatically try to convert it to the correct type, as follows:
Floating point types (e, E, f, g, G) |
|
Integer types (b, d, o, u, x, X) |
|
Character (c) |
|
String (s) |
|
Unsigned integers: several of the integer types (b, o, u, x, X) display "unsigned" integer values. This means that if the argument is a regular 32-bit integer value, and it's negative, the value is interpreted as an unsigned quantity in the native hardware format of the machine the program is running on. Almost all modern computers use two's complement format, which represents negative numbers as though they were very large positive numbers. For example, %x formats -1 as 'ffffffff'. See toString for more discussion on unsigned integers.
There's no such thing as an unsigned BigNumber, and no way to interpret a BigNumber as unsigned. If you format a negative BigNumber value with an unsigned integer type spec, the "unsigned" aspect of the format code is ignored, and the value will be shown as negative, with a minus sign. For example, %x formats -255.0 as '-ff'. If you really want the two's complement version of a BigNumber value, use toInteger() to explicitly convert the argument to an integer (but if you do this, note that the value must be in the valid range for a 32-bit integer, -2147483648 to +2147483647).
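As a rough model of the two cases just described, the following Python sketch formats a value in hex the way the text says %x does. The function name sprintf_hex is hypothetical, and Python's Decimal type is used purely as a stand-in for a whole-number BigNumber; none of this is the TADS implementation.

```python
from decimal import Decimal

def sprintf_hex(value):
    """Model of '%x' for a 32-bit integer or a whole-number BigNumber stand-in."""
    if isinstance(value, Decimal):
        # BigNumber stand-in: there is no unsigned view, so the sign is kept.
        return format(int(value), 'x')
    # Ordinary integer: reinterpret the 32 two's complement bits as unsigned.
    return format(value & 0xFFFFFFFF, 'x')

print(sprintf_hex(-1))                 # ffffffff
print(sprintf_hex(Decimal('-255.0')))  # -ff
```

Masking with 0xFFFFFFFF is what turns the negative integer into the "very large positive number" mentioned above.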
toInteger(val, radix?)
If the radix value is specified, the conversion uses the given radix as the numeric base for the conversion; this value can be any integer from 2 to 36. If radix is omitted, the default is 10 (decimal).
The interpretation of val depends on its type:
- If val is an integer, the return value is simply val.
- If val is a BigNumber, the value is converted to an integer by rounding to the nearest whole number. If the number is too large for the integer type to hold (that is, outside the valid integer range, -2147483648 to +2147483647), a run-time error occurs ("numeric overflow").
- If val is nil or the string 'nil' (ignoring any leading or trailing spaces), the return value is 0.
- If val is true or the string 'true' (ignoring any leading or trailing spaces), the return value is 1.
- If val is any other string value, the function skips any leading spaces in the string, then parses the text as an integer in the given radix. If the first character after any leading spaces is a "+" sign or hyphen "-", the function notes the sign, and skips the sign symbol and any spaces after it. The function then scans all following consecutive numerals in the given radix and returns the resulting integer value. If the radix is greater than 10, the letters A through Z (in upper or lower case) represent the "digits" 10 through 35, in analogy to hexadecimal notation. Parsing stops at the first character that isn't a valid digit in the given radix. For example, if radix is 12, the string '-A1C' returns the value -121: parsing stops at the 'C' because it's not a valid digit in base 12.
If the value is outside the bounds of the 32-bit integer type, a numeric overflow error is thrown. However, there's special treatment for "programmer" bases - binary, octal, and hexadecimal, base 2, 8, and 16, respectively. As long as there's no '-' sign, a hex, octal, or binary number will be treated as an "unsigned" 32-bit integer, which can hold values up to 4294967295. The result will still be returned as an ordinary (signed) integer, so any value above 2147483647 will be returned as a negative number. For example, parsing 'ffffffff' in radix 16 returns the integer value -1. This special handling is used because it's traditional in C-style languages to use hex constants (and sometimes octal, and occasionally binary) to represent bit vectors rather than arithmetic values. It's convenient in bit vectors to be able to use all 32 bits of the integer type directly, rather than having to worry about how the positive/negative sign is represented. Remember, this only applies to hex, octal, and binary inputs; ordinary decimal numbers, or numbers in unusual bases like base 12, will overflow if they go outside the normal signed integer boundaries of -2147483648 to +2147483647.
- If val is of any other type, an error is generated ("invalid type for built-in").
See also the toNumber function, which can parse strings containing floating point values and whole numbers too large for the ordinary integer type.
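The string-parsing rules above can be summarized in a short Python model. This is only a sketch of the documented behavior, not the actual implementation: the name to_integer is hypothetical, Python's OverflowError stands in for the "numeric overflow" run-time error, and returning 0 when no digits are found is an assumption, since the text doesn't cover that case.

```python
DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'
INT_MIN, INT_MAX = -2147483648, 2147483647
UINT_MAX = 4294967295

def to_integer(text, radix=10):
    """Model of the string case of toInteger(val, radix?)."""
    stripped = text.strip(' ')
    if stripped == 'nil':
        return 0
    if stripped == 'true':
        return 1
    s = text.lstrip(' ')                 # skip leading spaces
    negative = False
    if s[:1] in ('+', '-'):              # note the sign, skip it and following spaces
        negative = s[0] == '-'
        s = s[1:].lstrip(' ')
    magnitude = 0
    for ch in s:
        digit = DIGITS.find(ch.lower())
        if digit < 0 or digit >= radix:  # first invalid digit ends parsing
            break
        magnitude = magnitude * radix + digit
    # Assumption: with no digits at all, magnitude stays 0 and 0 is returned.
    if not negative and radix in (2, 8, 16):
        # "Programmer" bases: accept the full unsigned 32-bit range.
        if magnitude > UINT_MAX:
            raise OverflowError('numeric overflow')
        return magnitude - 2**32 if magnitude > INT_MAX else magnitude
    value = -magnitude if negative else magnitude
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError('numeric overflow')
    return value

print(to_integer('-A1C', 12))      # -121
print(to_integer('ffffffff', 16))  # -1
```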
toNumber(val, radix?)
If the radix value is specified, the conversion uses the given radix as the numeric base for the conversion; this value can be any integer from 2 to 36. If radix is omitted, the default is 10 (decimal).
The interpretation of val depends on its type:
- If val is an integer, the return value is simply val.
- If val is a BigNumber, the return value is simply val.
- If val is nil or the string 'nil' (ignoring any leading or trailing spaces), the return value is 0.
- If val is true or the string 'true' (ignoring any leading or trailing spaces), the return value is 1.
- If val is any other string value, the function parses it as an integer value, using the same syntax rules as toInteger().
If the string represents a whole number that fits within the bounds of a 32-bit integer, the value is returned as an ordinary integer. If the value is outside the bounds of the 32-bit integer type, it's returned as a BigNumber value. This makes it possible to parse integer values of effectively unlimited size.
If the radix is 10, and the string contains a decimal point (a period, '.') or a scientific notation exponent (the letter 'e' or 'E', followed by an optional '+' or '-' sign, followed by at least one digit), the value is parsed as a floating point number, and the result is returned as a BigNumber. Note that a decimal point always causes the value to be returned as a BigNumber, even when the actual value turns out to be a whole number that would have fit in the ordinary integer type, such as '1234.000' or '1234.'.
Floating point values can only be represented in decimal. For any radix other than 10, decimal points are considered non-digit characters and terminate parsing. Similarly, scientific notation isn't recognized in non-decimal bases: if the radix is 15 or higher, 'E' and 'e' represent the digit value 14, and for radix 14 or lower they're simply non-digit characters that terminate parsing.
- If val is of any other type, an error is generated ("invalid type for built-in").
See also the toInteger function, which explicitly converts values to integers.
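To make the choice of return type concrete, here's a small Python model of the string case. It's a sketch under several assumptions: to_number is a hypothetical name, Python's Decimal stands in for BigNumber, the regular expression only approximates the floating point syntax described above, and the sketch applies the signed 32-bit bounds in every radix.

```python
import re
from decimal import Decimal

DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'
# Approximation of the decimal floating point syntax (an assumption).
FLOAT_RE = re.compile(r'([+-]?) *([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?')

def to_number(text, radix=10):
    """Model of the string case of toNumber(val, radix?)."""
    s = text.strip(' ')
    if s == 'nil':
        return 0
    if s == 'true':
        return 1
    if radix == 10:
        m = FLOAT_RE.match(s)
        # A decimal point or an exponent always yields the BigNumber stand-in.
        if m and ('.' in m.group(2) or m.group(3)):
            return Decimal(m.group(1) + m.group(2) + (m.group(3) or ''))
    negative = s.startswith('-')
    if s[:1] in ('+', '-'):
        s = s[1:].lstrip(' ')
    magnitude = 0
    for ch in s:
        digit = DIGITS.find(ch.lower())
        if digit < 0 or digit >= radix:  # '.' and, below radix 15, 'e' stop here
            break
        magnitude = magnitude * radix + digit
    value = -magnitude if negative else magnitude
    if -2147483648 <= value <= 2147483647:
        return value                     # fits the ordinary integer type
    return Decimal(value)                # too large: BigNumber stand-in

print(type(to_number('1234')).__name__)   # int
print(type(to_number('1234.')).__name__)  # Decimal
print(to_number('1E', 16))                # 30
```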
toString(val, radix?, isSigned?)
- If val is an integer, the value is converted to a textual representation of the number. If radix is specified, the conversion is performed in that numeric base. If it's omitted, the default is decimal (base 10). The integer's numerical value is interpreted as "signed" if isSigned is true, or if isSigned is omitted and the radix is 10; otherwise, the value is interpreted as "unsigned". See below for more details on the radix and signed/unsigned treatment.
- If val is a BigNumber, and the radix value is 10 (or omitted) or the number has a fractional component, it's converted to a decimal (base 10) floating point representation, using the default BigNumber formatting. This is the same as calling the formatString() method on the value with all of the options set to defaults.
If radix is a value other than 10, and the value is a whole number with no fractional part, the value is represented in the given numeric base instead of decimal. Scientific notation (with an exponent, as in '1.0e+2') will never be used for a non-decimal base, no matter how large the number is, and no decimal point will appear in the result, since only whole numbers can be converted to a non-decimal base.
The isSigned value is ignored for BigNumbers.
- If val is true or nil, the result is the string 'true' or the string 'nil', respectively.
- If val is a ByteArray object, it's converted to a string by treating each byte as a Unicode character value.
- If val is a string, the return value is simply val.
- For any other type, an error occurs ("no string conversion").
isSigned is meaningful only with integer values. It's ignored for other types (including BigNumber values, even when they're whole numbers with no fractional part). true means that an integer value is represented in the result as its ordinary positive or negative arithmetic value; negative numbers are represented with a dash '-' followed by the absolute value, and positive numbers simply with the digits of the absolute value. If isSigned is omitted, the default is true if the radix is 10 (or omitted, in which case the default value is 10), nil for any other radix value - so the default is "unsigned" for hex, octal, and all other non-decimal bases.
"Unsigned" means that the value is interpreted according to its native hardware storage format on the computer the game is running on, instead of its ordinary arithmetic value. For positive values this makes no difference, because the "unsigned" interpretation of a positive value is always the same as the "signed" interpretation. But for negative values, the unsigned interpretation is quite different. All computers store negative binary integers by reserving one bit of the standard binary integer type to carry the sign information. This means that this reserved bit isn't included in the arithmetic value, which reduces the effective range of the type by a factor of two. That's why a 32-bit signed integer can only hold values up to +2,147,483,647, even though 2^32 is 4,294,967,296. The "unsigned" interpretation means that we ignore the special meaning of the sign bit and consider it to be just part of the numeric value. This doubles the effective range of the type, but the price is that there's no such thing as a negative number in this interpretation (thus the name "unsigned").
The actual storage format for negative numbers is usually a little more complicated than just a "sign bit". Nearly all modern hardware uses two's complement notation. Some older hardware used one's complement notation or sign-and-magnitude notation. All of these formats do have one thing in common, though, which is that one bit is set aside to carry the sign information.
The main value of an unsigned interpretation is when you're using an integer as a combination of bit flags (using the bitwise operators | and &), rather than for its arithmetic value. In this case, an unsigned view lets you use all of the bits directly, including the bit normally reserved as the sign bit, without regard to how the machine encodes negative numbers.
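The signed and unsigned defaults for integers can be modeled in a few lines of Python. This is an illustrative sketch, not the real function: to_string is a hypothetical name, the sketch assumes a two's complement machine, and the use of upper-case letters for digits above 9 is an assumption, since the letter case isn't specified above.

```python
DIGITS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'  # letter case is an assumption

def to_string(val, radix=10, is_signed=None):
    """Model of toString(val, radix?, isSigned?) for 32-bit integer values."""
    if is_signed is None:
        is_signed = (radix == 10)      # signed by default only for decimal
    negative = is_signed and val < 0
    # Signed: render the absolute value. Unsigned: reinterpret the 32 bits.
    magnitude = -val if negative else val & 0xFFFFFFFF
    digits = ''
    while True:
        magnitude, d = divmod(magnitude, radix)
        digits = DIGITS[d] + digits
        if magnitude == 0:
            break
    return ('-' if negative else '') + digits

print(to_string(-1))            # -1
print(to_string(-1, 16))        # FFFFFFFF
print(to_string(-255, 16, True))  # -FF
```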
undo()
When the function returns nil, it will have made no changes to the system state. The function never makes any changes unless it has a complete set of undo information back to a savepoint, so the function will never leave the system in an inconsistent state. The VM has an internal limit on the total amount of undo information retained in memory at any given time, to keep memory consumption under control during a long-running session; as new undo information is added, the VM discards the oldest undo information as needed to keep within the memory limits. This maintains a rolling window of the most recent undo information.
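The rolling window described above can be pictured with a small Python model. Everything here is hypothetical and greatly simplified: the UndoLog class, its method names, and the record-count limit (standing in for the VM's memory limit) are inventions for illustration, and Python's False plays the role of nil. The point is only to show why a failed undo changes nothing.

```python
from collections import deque

_MISSING = object()  # marks "key did not exist before the change"

class UndoLog:
    """Toy rolling undo log with savepoint markers (illustrative only)."""

    def __init__(self, max_records):
        self.max_records = max_records
        self.records = deque()
        self.state = {}

    def _push(self, record):
        self.records.append(record)
        while len(self.records) > self.max_records:
            self.records.popleft()       # discard the oldest undo information

    def savepoint(self):
        self._push(('savepoint', None, None))

    def set(self, key, value):
        self._push(('change', key, self.state.get(key, _MISSING)))
        self.state[key] = value

    def undo(self):
        # Refuse unless the log still reaches back to a savepoint,
        # so a failed undo makes no changes at all.
        if not any(kind == 'savepoint' for kind, _, _ in self.records):
            return False
        while True:
            kind, key, old = self.records.pop()
            if kind == 'savepoint':
                return True
            if old is _MISSING:
                del self.state[key]
            else:
                self.state[key] = old

log = UndoLog(max_records=4)
log.savepoint()
log.set('x', 1)
log.savepoint()
log.set('x', 2)
log.set('x', 3)            # pushes the oldest savepoint out of the window
first = log.undo()         # back to the second savepoint
second = log.undo()        # no savepoint left: nothing changes
print(first, second, log.state)
```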