www.want-tool.org
Execution Runtime
   

From: "Andrew Wozniewicz"
Newsgroups: sdforum.want
Sent: Wednesday, November 07, 2007 3:04 AM
Subject: Execution Runtime [LONG]

It's been pretty quiet here for a while, so here is what I am currently working on, albeit not as much as I would like to...

I am working to design and implement a Runtime Execution Engine, hereinafter referred to as the WANT 2.0 Runtime (W2R), or simply the "Runtime". I am making steady progress there, but I could certainly use some feedback. So here is an initial description of how I envision the Runtime working. The details are a mix of my currently half-cooked implementation and the remaining "vision" I have for the Runtime. Please bear with me as I cover some fundamental ground - I needed to write this to clarify some things for myself anyway. Sorry, I didn't have the time to make it any shorter.

INTRODUCTION

The Runtime executes XML or, more precisely, an in-memory DOM-like structure constructed out of nodes - implementors of the INode interface - with XML being just one of the possible representations of the DOM (DML being another, for example). Here is the essence of INode and its associated INodeCollection interface:

  INodeCollection = interface
    ...
    function GetItem(AIndex: Integer): INode;
    function GetCount: Integer;
    procedure SetItem(AIndex: Integer; AValue: INode);
    //
    function Add(ANode: INode): Integer;
    procedure Insert(AIndex: Integer; ANode: INode);
    procedure Delete(AIndex: Integer);
    procedure Clear;
    function FindByAttr(const AAttrName, AAttrValue: String;
      CaseSensitive: Boolean = False): INode;
    //
    property Item[Index: Integer]: INode
      read GetItem write SetItem; default;
    property Count: Integer read GetCount;
    ...
  end;

  INode = interface
    ...
    function GetNodeName: String;
    procedure SetNodeName(const AValue: String);
    function GetAttrValue(AName: String): String;
    procedure SetAttrValue(AName: String; const Value: String);
    function GetChildren: INodeCollection;
    function GetXML: String;
    //
    property NodeName: String read GetNodeName write SetNodeName;
    property Children: INodeCollection read GetChildren;
    property AttrValue[Name: String]: String
      read GetAttrValue write SetAttrValue; default;
    property XML: String read GetXML;
  ...
  end;

So, essentially, a Runtime DOM node - an INode - has (String) attributes, and (optionally) other INode children presented as INodeCollection. With a nod to the needs of XML generation, it also has a "NodeName" property that corresponds to the tag of its XML representation (which is not to be confused with the "Name" attribute, which it may, or may not have).

Notice that - unlike most other DOM nodes I've seen - the INode does NOT define a unique "parent", which actually makes it quite powerful and universal (not only is it possible to construct "trees", but also more generic DAGs - Directed Acyclic Graphs). Also note that this DOM is a different implementation of a DOM from that in WANT 1.0 - the INode interface, my TNode implementation instead of Juanca's, and an XML parser/generator, are just a few of the most obvious differences.
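To make the no-parent point concrete, here is a minimal sketch of two nodes sharing one child. ISketchNode and TSketchNode are hypothetical, cut-down stand-ins written only for this illustration (they are not the actual INode/TNode), and the child list is simply a TInterfaceList from the Classes unit.

```pascal
program DagSketch;
{$IFDEF FPC}{$mode delphi}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes;

type
  // Hypothetical, cut-down node interface: a name and children, but no parent.
  ISketchNode = interface
    ['{6F0C2A51-3B1E-4C7A-9D52-1A2B3C4D5E6F}']
    function GetNodeName: String;
    function GetChildCount: Integer;
    function GetChild(AIndex: Integer): ISketchNode;
    function Add(const ANode: ISketchNode): Integer;
  end;

  TSketchNode = class(TInterfacedObject, ISketchNode)
  private
    FNodeName: String;
    FChildren: IInterfaceList;
  public
    constructor Create(const ANodeName: String);
    function GetNodeName: String;
    function GetChildCount: Integer;
    function GetChild(AIndex: Integer): ISketchNode;
    function Add(const ANode: ISketchNode): Integer;
  end;

constructor TSketchNode.Create(const ANodeName: String);
begin
  inherited Create;
  FNodeName := ANodeName;
  FChildren := TInterfaceList.Create;
end;

function TSketchNode.GetNodeName: String;
begin
  Result := FNodeName;
end;

function TSketchNode.GetChildCount: Integer;
begin
  Result := FChildren.Count;
end;

function TSketchNode.GetChild(AIndex: Integer): ISketchNode;
begin
  Result := FChildren[AIndex] as ISketchNode;
end;

function TSketchNode.Add(const ANode: ISketchNode): Integer;
begin
  // The child keeps no back-reference, so it can be added to many owners.
  Result := FChildren.Add(ANode);
end;

var
  Shared, A, B: ISketchNode;
begin
  Shared := TSketchNode.Create('shared');
  A := TSketchNode.Create('a');
  B := TSketchNode.Create('b');
  A.Add(Shared);
  B.Add(Shared);
  // Both owners see the very same node instance.
  if A.GetChild(0) = B.GetChild(0) then
    WriteLn(Shared.GetNodeName, ' is a child of both ',
      A.GetNodeName, ' and ', B.GetNodeName);
end.
```

Because a node never records who owns it, nothing in this sketch prevents cycles either; keeping the graph acyclic would be up to whoever builds it.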

The actual executable-DOM (or the "Abstract Syntax Tree" that the Runtime understands, which in this case is not a tree but a DAG) is structured very differently from the current "executable" DOM of WANT: it's much more detailed (verbose!) and is thus not very suitable for manual human input at all (not that the original WANT XML script *was* suitable, but still, it was much more amenable).

The WANT 2.0 Runtime (W2R) is more of a generic-script-executable-DOM, designed explicitly to support all features I wanted to see in Modula7, yet it is a language-agnostic DOM, much like the CLR of .NET. It is much more than WANT by itself potentially needs, but I am deliberately building it in such a way that WANT features will just fall into place naturally - and then the power will be there to extend it into as yet unknown directions. The fact that the Runtime is language-agnostic allows me to sidestep - for now - the issue of the scripting language design altogether, and to concentrate on the under-the-hood run-time functionality common to all possible scripting languages that could potentially run on top of it.

In short, the new DOM, when represented as XML, does not resemble the old WANT XML script at all. I call this new XML format "XML eXecutable", or XMLX. I am currently forced to use XMLX files created by hand to exercise my Runtime as I am building it, which is a bit of a pain given XMLX's verbosity, but of course, in the future the DOM may and will be generated directly by any capable script parser, and XMLX files will not be necessary at all. For now, one can think of them as precompiled "object" files, or "intermediate code" that can be loaded into the Runtime engine to execute.

To pre-empt protests from performance buffs: the implementation is String-based, i.e. all literal values are ultimately (huge) Delphi Strings. This means that yes, integers are represented as strings, and that yes, there needs to be a StrToInt + IntToStr conversion whenever an actual computation takes place. This is deliberate and by design. I'll explain the rationale at some other time, but be advised that if you are implementing the avionics for a supersonic aircraft, you should select something other than a W2R-based script for your implementation. I still think it's good enough for WANT.
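As a rough illustration of what that conversion cost looks like, here is a tiny sketch of an addition over string-encoded integers. AddStringInts is a hypothetical helper made up for this example, not a function of the Runtime; only StrToInt and IntToStr (SysUtils) are real.

```pascal
program StringValueSketch;
{$IFDEF FPC}{$mode delphi}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: both operands and the result are stored as strings,
// so every computation pays for two StrToInt calls and one IntToStr call.
function AddStringInts(const A, B: String): String;
begin
  Result := IntToStr(StrToInt(A) + StrToInt(B));
end;

begin
  WriteLn(AddStringInts('40', '2'));
end.
```

StrToInt raises EConvertError on malformed input, so a real implementation would presumably need to decide where such errors surface in the script.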

MODULE = DATA + CODE

Now, the fundamental concept in the Runtime is that of a MODULE. A module, at its simplest, is a combination of data and code, i.e. DATA + CODE = MODULE (Niklaus Wirth's shadow looming large here). A module is also an embodiment of an Abstract Data Type, with DATA + OPERATIONS on that data. When you "invoke" a module, you run its "code" (a designated method) which operates on its "data". Examples of modules include classes, and methods.

By analogy, the Delphi concept of a unit is just an example of a static module, while a class is a module that can (typically) be instantiated (non-static). Less obvious, perhaps, is a realization that a subroutine (a method, a procedure, a function) is also a module. Unlike Delphi (or most other programming environments) W2R does NOT make the distinction among the different kinds of modules and treats them pretty much uniformly: class, "unit" (actually called "module" in M7), method, procedure, function, etc., are all essentially the same thing to the Runtime.

The fundamental characteristic of a module is that it is a recursive concept: a module definition may contain other module definitions inside it, i.e. Module = (Data + Code) + Other_Modules. Thus a class definition contains methods, and a method definition may contain classes, ad infinitum. As far as the Runtime is concerned there are only classes and their methods, nested in one another to an arbitrary depth, and the two are just specialized kinds of modules.

Looking a bit closer, the following are the constituent components/sections of a module (these are immediate child nodes of the module, each potentially containing other nodes):

Module:

  • static data
  • static initialization (class constructor)
  • static finalization (class destructor)
  • static methods
  • parameters
  • local data
  • initialization (default constructor)
  • finalization (default destructor)
  • non-static methods (embedded/contained method modules)
  • module types (embedded/contained non-method modules)
  • non-static code (for direct invocation)

Each of these components is optional, and different kinds of modules (for example a method versus a class) will typically have a different mix of them, but all are allowed in every module.

The code sections of a module - which include both static and non-static code, initialization, and finalization sections - when non-empty, contain code statements, such as assignments, and function calls (they are, you guessed it, also nodes of the DOM). These code sections are invoked via the IExecutable interface's Execute call:

  IExecutable = interface
    ...
    function Execute: INode;
  end;

This means that the "executable" code sections within the DOM must implement IExecutable interface, in addition to the INode interface that makes them part of the DOM in the first place.

Alternatively, a method can be implemented externally to the Runtime, by being mapped to a Delphi class that implements the IExecutable interface. Note that a Runtime method implementation is thus a Delphi class instance that can be dynamically registered with the Runtime.

An externally-implemented method (externally with respect to the Runtime) has a reference to the external class instance instead of executable script nodes as its code section. The external class is instantiated upon the loading of the code block, and is available throughout the Runtime's execution thereafter.

EXTERNAL METHODS

So, the short of it is that in order to implement an (external) method, one has to define (in Delphi) a class that implements IExecutable. This also applies if one needs to make a native (e.g. Windows API) function available to the script - a wrapper class must be implemented that exposes the parameters of the external function as script parameters, and enables the Runtime to marshall data between the two. Since there are potentially lots of such methods that one might eventually want to implement, there are numerous classes that need to be defined.

This is why I want to make it (relatively) easy to implement such externally defined function-wrapper-classes. These classes must be able to provide some metadata to the Runtime for it to know how to invoke the method correctly, but I don't want the task of generating the metadata to become tedious for the method-implementation-class writer, i.e. the programmer such as you. Enter the RTTI.

An external method implementation would be defined along the following lines (I'll use a wrapper for the system Copy function in this example):

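type
  TSystem_Copy_Method = class(TMethodParent, IExecutable)
  private
    // Backing fields for the published parameters.
    FResult: String;
    FS: String;
    FIndex: Integer;
    FCount: Integer;
  protected
    function GetResult: String;
    procedure SetS(AValue: String);
    procedure SetIndex(AValue: Integer);
    procedure SetCount(AValue: Integer);
  published
    property Result: String read GetResult;
    property S: String write SetS;
    property Index: Integer write SetIndex;
    property Count: Integer write SetCount;
  public
    procedure Execute;
  end;

The implementation of Execute could then be as easy as this simple wrapper. It works on the backing fields that the accessors are assumed to read and write, because S, Index, and Count are write-only properties and Result is read-only:

  procedure TSystem_Copy_Method.Execute;
  begin
    FResult := System.Copy(FS, FIndex, FCount);
  end;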

The reason I would like to do it this way is that it makes it possible to use the RTTI to gather the metadata about the method call from the properties of the implementing class. The Runtime could use this info to detect the parameters, and it makes it (relatively) easy to implement the actual code of the method.

In the example above, the wrapper declares three input parameters to the method (write-only properties of the implementing class), and one output-only parameter (the read-only property called the Result).

Given the above declaration of TSystem_Copy_Method, it would just take a call to

  RegisterStaticMethod('System.Copy',TSystem_Copy_Method);

for the Runtime to be able to gather all the information it needs about the method to be able to call it with appropriate parameters. The net effect of these declarations corresponds to the following script function header, if it were to be represented in Pascal:

  function Copy(S: String; Index, Count: Integer): String;

The RTTI embedded in the TSystem_Copy_Method class declaration is sufficient for the Runtime to determine the signature of the method in question: three input parameters (one String and two Integers), and a String result. The read and write access of each property tells the Runtime whether the parameter is IN (write-only), OUT (read-only), or both (read+write).
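Here is a minimal sketch of how such a signature could be read back with the standard TypInfo unit. TCopySketch is a stand-in for the wrapper class (TMethodParent and IExecutable are left out, and the properties are field-backed to keep it short), and DirectionOf and TypeNameOf are hypothetical helpers. The only assumption about RTTI itself is that a missing read or write specifier shows up as a nil GetProc or SetProc in the property info.

```pascal
program RttiSignatureSketch;
{$IFDEF FPC}{$mode delphi}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, TypInfo;

type
  // Stand-in for the wrapper class; TPersistent supplies published RTTI ($M+).
  TCopySketch = class(TPersistent)
  private
    FResult: String;
    FS: String;
    FIndex: Integer;
    FCount: Integer;
  published
    property Result: String read FResult;
    property S: String write FS;
    property Index: Integer write FIndex;
    property Count: Integer write FCount;
  end;

// Hypothetical helper: maps property access to a parameter direction.
function DirectionOf(APropInfo: PPropInfo): String;
var
  CanRead, CanWrite: Boolean;
begin
  CanRead := APropInfo^.GetProc <> nil;
  CanWrite := APropInfo^.SetProc <> nil;
  if CanRead and CanWrite then
    Result := 'IN-OUT'
  else if CanWrite then
    Result := 'IN'
  else if CanRead then
    Result := 'OUT'
  else
    Result := 'none';
end;

// Hypothetical helper: the two compilers declare TPropInfo.PropType differently.
function TypeNameOf(APropInfo: PPropInfo): String;
begin
  {$IFDEF FPC}
  Result := APropInfo^.PropType^.Name;
  {$ELSE}
  Result := String(APropInfo^.PropType^^.Name);
  {$ENDIF}
end;

var
  Props: PPropList;
  Count, I: Integer;
begin
  // GetPropList allocates the list; the caller is responsible for freeing it.
  Count := GetPropList(TypeInfo(TCopySketch), Props);
  if Count > 0 then
  try
    for I := 0 to Count - 1 do
      WriteLn(Props^[I]^.Name, ': ', TypeNameOf(Props^[I]),
        ' (', DirectionOf(Props^[I]), ')');
  finally
    FreeMem(Props);
  end;
end.
```

The sketch makes no promise about the order in which properties are listed, so a real implementation would presumably need some other way to fix the positional order of parameters.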

It is also worth noting that a script (or external) "method" is more like a stored procedure in a relational database than a function or procedure in a high-level language: it may have an arbitrary number of IN parameters, an arbitrary number of OUT parameters, and an arbitrary number of IN-OUT parameters, each of which can be optional (with a default value).

There is also support for variable argument lists within the Runtime. This feature allows, for example, the implementation of wrappers for the Delphi System.Write() and System.WriteLn() procedures. An argument within the variable list is nameless. To mark the start of a variable argument list in an external method implementation, define a published property "___" (three underscores) of type INode. These parameters will be assigned unmarshalled, so the implementor of the external Execute method will have to access them via the INode interface.

Here is an example of an external method that supports a variable argument list:

type
  TSystem_String_Format_Method = class(TMethodParent,IExecutable)
  protected
    function GetResult: String;
    procedure SetFormatStr(AValue: String);
    procedure Set___(AValue: INode);
  published
    property Result: String read GetResult;
    property FormatStr: String write SetFormatStr;
    property ___: INode write Set___;
  public
    procedure Execute;
  end;

The equivalent Modula7 header would look like:

  function Format(FormatStr: String; ... ): String;

As an example, it can be called (in M7 script), like this:

  S := System.String.Format("%d %s %n", IntVar, StrVar, NumericVar);

IEXECUTABLE VERSUS INODE

A method implementation external to the Runtime is exposed as an instance of IExecutable, which in itself is also an INode and is placed in the method's code section.

Each code section implements IExecutable, potentially in its own way. When an external method is invoked, the Runtime marshalls the parameters into the external implementation by assigning the implementation class's published, write-only properties, calls the implementation's Execute method, and then marshalls any results (OUT parameters) back into the node representation (I am very tempted to call it the "managed" representation here).
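The marshalling round trip described above can be sketched with TypInfo's by-name accessors. TCopySketch again stands in for the wrapper class, MarshalIn is a hypothetical helper, and the string-encoded arguments mimic the String-based values of the Runtime; none of this is the Runtime's actual code.

```pascal
program MarshalSketch;
{$IFDEF FPC}{$mode delphi}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, SysUtils, TypInfo;

type
  // Stand-in for an external method implementation.
  TCopySketch = class(TPersistent)
  private
    FResult: String;
    FS: String;
    FIndex: Integer;
    FCount: Integer;
  public
    procedure Execute;
  published
    property Result: String read FResult;
    property S: String write FS;
    property Index: Integer write FIndex;
    property Count: Integer write FCount;
  end;

procedure TCopySketch.Execute;
begin
  FResult := Copy(FS, FIndex, FCount);
end;

// Hypothetical helper: assigns a string-encoded argument to a published
// property, converting it first when the property is an Integer.
procedure MarshalIn(AMethod: TObject; const AName, AValue: String);
begin
  case PropType(AMethod, AName) of
    tkInteger: SetOrdProp(AMethod, AName, StrToInt(AValue));
  else
    SetStrProp(AMethod, AName, AValue);
  end;
end;

var
  Method: TCopySketch;
begin
  Method := TCopySketch.Create;
  try
    // 1. Marshal the IN parameters through the write-only properties.
    MarshalIn(Method, 'S', 'Execution Runtime');
    MarshalIn(Method, 'Index', '11');
    MarshalIn(Method, 'Count', '7');
    // 2. Run the external code.
    Method.Execute;
    // 3. Marshal the OUT parameter back as a string.
    WriteLn(GetStrProp(Method, 'Result'));
  finally
    Method.Free;
  end;
end.
```

Only Integer and String properties are handled here; other property kinds would need their own conversion branches.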

When an external method is registered with the Runtime via a call to RegisterXXXMethod(), the method-implementation class is instantiated as the "code" object.
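A registration call of this kind could be sketched with a class reference and a name-to-instance table. Everything named below (TMethodSketch, RegisterMethodSketch, FindMethodSketch, the 'Demo.Hello' name) is hypothetical and only illustrates the "instantiate at registration time" idea, not how the RegisterXXXMethod() calls are actually implemented.

```pascal
program RegistrySketch;
{$IFDEF FPC}{$mode delphi}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, SysUtils;

type
  // Hypothetical base class; the virtual constructor lets a class reference
  // create the right descendant.
  TMethodSketch = class(TPersistent)
  public
    constructor Create; virtual;
    procedure Execute; virtual; abstract;
  end;
  TMethodSketchClass = class of TMethodSketch;

  THelloMethod = class(TMethodSketch)
  public
    procedure Execute; override;
  end;

var
  Registry: TStringList; // method name -> "code" object instance

constructor TMethodSketch.Create;
begin
  inherited Create;
end;

procedure THelloMethod.Execute;
begin
  WriteLn('hello from an external method');
end;

// The implementation class is instantiated once, at registration time.
procedure RegisterMethodSketch(const AName: String; AClass: TMethodSketchClass);
begin
  Registry.AddObject(AName, AClass.Create);
end;

function FindMethodSketch(const AName: String): TMethodSketch;
var
  Idx: Integer;
begin
  Idx := Registry.IndexOf(AName);
  if Idx < 0 then
    raise Exception.CreateFmt('Method "%s" is not registered', [AName]);
  Result := TMethodSketch(Registry.Objects[Idx]);
end;

var
  I: Integer;
begin
  Registry := TStringList.Create;
  try
    RegisterMethodSketch('Demo.Hello', THelloMethod);
    FindMethodSketch('Demo.Hello').Execute;
  finally
    // The registry owns the instances, so it frees them on shutdown.
    for I := 0 to Registry.Count - 1 do
      Registry.Objects[I].Free;
    Registry.Free;
  end;
end.
```

Keeping one long-lived instance per registered name matches the description above, but it also means the instance's property values persist between calls unless they are reset.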

RUNTIME FOR WANT

There is a natural mapping between the concept of an external method of the Runtime and the old concept of a WANT task, namely, a WANT task IS simply an external method.

So, a WANT task writer implements the task as an external method (that is, a Delphi class), and implements the Execute method of that class to use the published properties as parameters.

This is mostly how it currently works in WANT.1, anyway. WANT.2 just clarifies and formalizes the usage of published properties of the task as being the parameters of a (script)method call, making it generic and universally extensible.

Note that all published properties of an external implementation of a script method are treated as parameter definitions, so only those that are intended as parameters should be published. The existing WANT tasks will require some cleanup of their published sections.

-Andrew