From: Andreas Kupries
To: Joe English
Date: Sat, 06 Mar 2004 19:20:04 -0800
Subject: Notes on variables

...

Variables in Tcl
================

Main concepts
-------------

(1) Contexts

    Each "context" contains a mapping from _names_ to _data
    elements_ or _links_. For the purposes of this discussion a
    "context" can actually be considered to _be_ that mapping.

    The combination of name and data-element/link is called a
    "variable". While this term is used to document the various
    commands handling them, we have to look deeper to gain a fuller
    understanding of its workings, thus the split into four terms
    (name, data element, link, mapping [relationship]).

    Each command is executed within a context and may modify its
    behavior based on that context.

    Tcl has two types of contexts.

    (a) _Procedure context_. These are arranged in a stack, with
        the context of the currently executing procedure at the
        top. This stack expands and shrinks dynamically as
        procedures are called and return.

        In addition each procedure context knows the _namespace
        context_ it belongs to.

    (b) _Namespace context_. These contexts are arranged in a tree.
        The root of this tree is the _global context_. Namespace
        contexts are created and destroyed explicitly through
        command invocations. The only exception is the global
        context, which is predefined, i.e. exists always and
        forever.

    The global context is special in another way. It is not only
    the root of the tree of namespace contexts, but is also
    considered to be at the bottom of the stack of procedure
    contexts.

    From now on G is a shorthand for the global context.

(2) Names

    For the purposes of this discussion a name is simply a sequence
    of characters, i.e. a string.
    However, the exact properties of the name are relevant later on,
    most importantly the type of the name:

        N matches   Type of N                    Term
        ---------   ---------                    ----
        ^::         Fully-qualified name (FQN)   absolute
        .*::.*      Qualified name (QN)          relative
        !(::)       Simple name                  local
        ---------   ---------                    ----

    The last element of a (F)QN is called the "local part"
        => local(N)

    The leading part of a FQN is called the "context part"
        => ctx(N)

    The 'x' operator joins a fully qualified context part with a QN,
    or with another context part, to form a FQN, or another fully
    qualified context part, respectively.

(3) Links

    Links are one of the entities a variable _name_ can be mapped
    to. A link simply refers to another data element in some scope
    and can be expressed as a tuple containing a context and a name
    in that context.

    Please note that the above automatically excludes chains of
    links from consideration. This is no practical limitation at
    all. If the core detects an attempt to create a link to a link F
    it will simply follow F to the data element it refers to and
    link directly to that.

    Another important fact about links is that most operations on
    variables and their values automatically follow the link and
    operate on the referenced data element, and not on the link
    itself.

(4) Data elements

    These contain the values actually stored, which appear to be
    associated with a variable. A data element may contain no value
    at all.

Variable states
---------------

Combining all of the above, a variable (name) in a context can be in
one of the following states:

                                                  Short name
                                                  ----------
(a) The name is not contained in the mapping.     out-of-scope

(b) The name is contained in the mapping, and     in-scope-undefined
    maps to a data element which does not
    contain a value.

(c) The name is contained in the mapping, and     in-scope-valued
    maps to a data element which has a value.

(d) The name is contained in the mapping, and     in-scope-link-undefined
    maps to a link to a data element which
    does not contain a value.

(e) The name is contained in the mapping, and     in-scope-link-valued
    maps to a link to a data element which
    has a value.

From the point of view of a Tcl script a number of the states are
indistinguishable. However, some commands make very fine distinctions
based on the detailed state, which can trip up the unwary. Together
with the fact that the type of the context the variable name is part
of plays a role too, we can get quite a mess.

Locating a variable
-------------------

Whenever a variable is accessed, the command performing that access
has only the

-->     name N

of the variable in question. To get at the value it first has to
determine the context the variable belongs to, then obtain the
mapping of the name, i.e. the data element or link, and from that at
last the value.

The context determined by the algorithm below is called the
"V context".

The elements going into the algorithm are the type of the execution
context (procedure, or namespace), and the properties of the name
itself.

And now the algorithm, actually a decision table, based on the

-->     execution context E

of the command, and the properties of N.

        type(N)   type(E)    Other      V context
        -------   -------    -----      ---------
        absolute                        => ctx (N)
        relative  Procedure             => ctx (G x N)
[=]               Namespace  N in E     => ctx (E x N)
[=]                          !(N in E)  => ctx (G x N)
        local     Procedure             E
[=]               Namespace  N in E     E                [x]
[=]                          !(N in E)  G                [*] [x]
        -------   -------    -----      ---------

[*] Recap: G is the global context.
[x] Actually ctx (C x N), but == C, as N has only a local part.

Reading from a variable N
-------------------------

When reading from a variable N the system first determines V (see
above) and then looks at the data element mapped directly or
indirectly (through a link). If a value is present it will be
delivered as the result of the operation, else the operation fails.

In other words, reading the value of a variable is successful if and
only if the variable's state is a member of the set {c, e}. In that
case the variable is said to "exist".
Otherwise, i.e. if the variable's state is a member of the set
{a, b, d}, the variable "does not exist" and the read operation
fails.

This same criterion is also used by [info exists].

Writing to a variable
---------------------

When writing to a variable N the system first determines V (see
above).

If V does not contain a mapping for N, such a mapping is created
__in V__ [+], referencing a new data element.

Now that a mapping for local(N) is present in V, newly created or
not, the system will go to the data element directly or indirectly
reachable through that mapping and set the new value into it.

If the data element did not contain a value before, then the variable
"comes into existence" for reading.

*** DANGER ***

Note carefully the interaction of [+] and [=] above, when determining
the V context. If the command writing the variable is executed in a
namespace context, the name is not absolute, and no mapping for the
name is known at all, then the new variable/mapping is always created
under the global context. Only if a mapping preexists will the write
happen in the execution context or below.

In other words, the location of the newly created variable is highly
dependent on the exact commands executed before the write operation
itself, i.e. on whether there were scope commands (see later) before
the write or not.

This is an extreme trap for the unwary. __Always__ use scope commands
to ensure the presence of a mapping for the name in the desired
context before performing the write.

This is one of the situations where we need the detailed model for
the state of a variable and where simple "existence" is not enough.
The behaviour of the writing command is different for out-of-scope
and in-scope-*undefined. I.e. we have two types of non-existing
variables: those truly unknown, and those for which we have a mapping
to an undefined value. And the latter can be created only through a
scope command.
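
The dependence of a write on earlier scope commands can be sketched
as follows. This is only an illustration, assuming Tcl 8.x name
resolution (newer releases may resolve the first case differently, so
no output is claimed for it). All names (::counter, ::demo1, ::demo2,
pending) are hypothetical and not taken from the notes above.

```tcl
# Hypothetical names throughout: ::counter, ::demo1, ::demo2, pending.
set ::counter 0   ;# a global that happens to share the simple name

# No scope command before the write: where the value lands depends
# on which mappings for "counter" already exist.
namespace eval ::demo1 {
    set counter 1
}

# Scope command first: [variable] creates the mapping in ::demo2,
# so the following write is guaranteed to land there.
namespace eval ::demo2 {
    variable counter
    set counter 2
    variable pending   ;# mapping only, no value: in-scope-undefined
}

puts "::counter                = $::counter"
puts "::demo1::counter exists? [info exists ::demo1::counter]"
puts "::demo2::counter         = $::demo2::counter"
puts "::demo2::pending exists? [info exists ::demo2::pending]"
puts "::demo2::pending listed? [llength [info vars ::demo2::pending]]"
```

The last two lines are meant to show the two kinds of non-existing
variable: [info exists] reports 0 for ::demo2::pending, while
[info vars] is expected to still list the name because the mapping is
there.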

RMW
---

Commands which read from, modify and then write to a variable are not
handled specially, in terms of semantics. First the variable is
located and read from, and finally it is written to, both times
following the rules laid down above.

This means that a variable which does not exist causes an error in
the read phase and will not come into existence. The variable has to
exist beforehand, and will exist afterward.

Scope commands
--------------

Tcl has three scope commands. These are "global", "upvar", and
"variable".

They are the only commands which can create variables in the
*-undefined states, and also variables in the *-link-* states.

The important part is that the V context as described above does not
apply to them.

Note:   The "variable" command can also write to a variable. This is
        modeled here as a combination of a plain scope operation
        followed by a regular write.

Note 2: In most cases we need only the local part of the variable
        name during import into a scope. In other words, even if the
        context part is dynamic we might still be able to compute a
        fixed local name.

X/N := [ctx(X x N), local(N)]   (link tuple, context and local parts)

global
~~~~~~

/=/     name including :'s, inaccessible variable

        type(N)   type(E)    Result
        -------   -------    ------
        absolute  Procedure  local(N) in E, links to G/N
                             May create local(N) in G/N.
                  Namespace  N in E (/=/, ERROR)
        -------   -------    ------
        relative  Procedure  local(G x N) in E, links to G/N
                             May create local(N) in G/N.
                  Namespace  N in E (/=/, ERROR)
        -------   -------    ------
        local     Procedure  N in E, links to G/N
                             May create local(N) in G/N.
                  Namespace  N in E, links to E/N
                             May create local(N) in E/N.
        -------   -------    ------

Split the table above for local results, and results in the origin
namespace.

Local results
/////////////

        type(N)   type(E)    Result
        -------   -------    ------
        local     *          N in E, links to G/N
        -------   -------    ------
        *         Procedure  local(G x N) in E, links to G/N
        *         Namespace  N in E (ERROR)
        -------   -------    ------

Origin results
//////////////

        type(N)   type(E)    Result
        -------   -------    ------
        *         Procedure  May create local(N) in G/N.
        -------   -------    ------
        local     Namespace  May create local(N) in E/N.
        -------   -------    ------
        absolute  Namespace  ERROR
        relative  Namespace  ERROR
        -------   -------    ------

variable
~~~~~~~~

        type(N)   type(E)    Result
        -------   -------    ------
        absolute             local(N) in E, links to G/N
                             May create local(N) in G/N.
        relative  Procedure  local(G x N) in E, links to G/N
                             May create local(N) in G/N.
                  Namespace  local(E x N) in E, links to E/N
                             May create local(N) in E/N.
        local     Procedure  N in E, links to G/N
                             May create local(N) in G/N.
                  Namespace  N in E, links to E/N
                             May create local(N) in E/N.
        -------   -------    ------

upvar
~~~~~

Upvar is special. Normally it is used to create links between
variables in two procedure contexts (which may be the same), but it
can also be used to link variables in namespace contexts into a
procedure context.

        type(N)   type(E)    Level         Result
        -------   -------    -----         ------
        absolute  Proc       #0
        absolute  Proc       #n
        absolute  Proc       n
        absolute  Namespace  #0
        absolute  Namespace  #n <= depth
        absolute  Namespace  n <= depth
        absolute  Namespace  #n > depth
        absolute  Namespace  n > depth
        -------   -------    -----         ------
        relative  Proc       #0
        relative  Proc       #n
        relative  Proc       n
        relative  Namespace  #0
        relative  Namespace  #n <= depth
        relative  Namespace  n <= depth
        relative  Namespace  #n > depth
        relative  Namespace  n > depth
        -------   -------    -----         ------
        local     Proc       #0
        local     Proc       #n
        local     Proc       n
        local     Namespace  #0
        local     Namespace  #n <= depth
        local     Namespace  n <= depth
        local     Namespace  #n > depth
        local     Namespace  n > depth
        -------   -------    -----         ------

Incomplete ...
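
As a rough illustration of the three scope commands used from
procedure contexts, the sketch below links local names to data
elements elsewhere and then operates through those links. It does not
fill in the unfinished upvar table. All names (::total, ::shop,
stock, fresh, and the procedures) are hypothetical.

```tcl
# Hypothetical names throughout; none of them come from the notes.
set ::total 10

namespace eval ::shop {
    variable stock 5      ;# scope operation followed by a regular write
}

proc bump_total {} {
    global total          ;# local "total" links to ::total
    incr total
}

proc ::shop::sell {} {
    variable stock        ;# local "stock" links to ::shop::stock
    incr stock -1
}

proc add_to {varName amount} {
    upvar 1 $varName v    ;# local "v" links to the caller's variable
    incr v $amount
}

proc bump_via_upvar {} {
    upvar #0 ::total t    ;# level #0 links into the global context
    incr t
}

proc make_fresh {} {
    global fresh          ;# link to an element without a value
    set before [info exists fresh]
    set fresh 1           ;# the write follows the link into G
    return $before
}

bump_total
::shop::sell
add_to total 5            ;# called from the global level
bump_via_upvar
set seen_before [make_fresh]

puts "total = $::total"               ;# total = 17
puts "stock = $::shop::stock"         ;# stock = 4
puts "fresh existed before the write? $seen_before"
puts "fresh exists afterwards?        [info exists ::fresh]"
```

Inside make_fresh the name starts out in the in-scope-link-undefined
state: the link is present, but [info exists] reports 0 until the
write through the link gives the global data element a value.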