Difference between revisions of "Adam Evolution"

From Adobe Open Source Wiki
Jump to: navigation, search
(initial population)
 
 
(2 intermediate revisions by one user not shown)
Line 1: Line 1:
During the 1.0.13 development I added better diagnostic support for Adam (driven by Mat Marcus who was struggling to learn how Adam worked). This led to a couple of surprises in what the Adam solving engine was actually doing, which let to a significant increase in my understanding of the problem and solution. Here I'm going to describe how Adam _should_ solve, pointing out differences in how it solves in the 1.0.13 release, and how that is different than the prior releases. Because of the significant increasing in understanding, I will be rewriting Adam soon - this will have few visible side effects (except performance, and an easier to read engine).
+
This page is intended to document how the property model library (aka Adam) works, or in some cases is supposed to work.
  
Before I jump into the Adam solver, a note on terminology -
+
A few notes on terminology:
  
Adam and Eve are at the end of their lives for useful "code names" for these libraries. Few people can remember the difference between what Adam and Eve are, and there is much confusion about what comprises these libraries (I often get questions like, "can an Eve button do ____?" These are non-sensical questions - as there is no knowledge of buttons (or any other view class) built into Eve.
+
property model library (aka Adam)
 +
layout library (aka Eve)
  
For Eve, I will be replacing the name Eve with "layout solver", "layout parser" - etc. Collectively known as the layout libraries (or ASL layout libraries to be more specific). For Adam, I haven't settled on a term yet. Something like "function model library" would be good - or "sheet library" although sheet conveys little meaning. Ideas and suggestions for this are welcome.
+
The property model library is a system for modeling relationships between properties (simple key/values). These properties often represent parameters and related attributes of a function (this is the case when the library is used for a dialog box in an application) but can also be document properties (the case when used for a palette). The data structure known as a sheet (the name is borrowed from spreadsheet) is created to describe the properties and their relationships. The resulting model is used to assist the user in providing a valid set of parameters to a function or setting valid properties in a document. The model is independent of any specific UI, and serves equally well as the interface to a scripting system as to a graphical UI.
  
At the top level, Adam is a system for modeling functions. The data structure known as a sheet (name is borrowed from spread sheet - might use something like "function sheet" instead) is created to describe a function in your system. The resulting model is used to assist the user in providing a valid set of parameters. The model itself is independent of any specific UI, and serves equally well as the interface to a scripting system as to a graphical UI.
+
It is important to point out that the property model library doesn't solve all UI issues. The purpose of a user interface is to assist the user in selecting a function and providing a valid set of parameters to that function. The property model library is concerned with the later part of this, what it means to assist in providing a valid set of parameters to a function. The library only deals with a limited but common subset of the possible models.
 
+
It is important to point out here that Adam doesn't solve all UI issues. The purpose of a user interface is to assist the user in selecting a function and providing a valid set of parameters to that function. Adam is concerned with the later part of this - what does it mean to assist in providing a valid set of parameters?
+
  
 
A simple example to illustrate. Given a function declared like this:
 
A simple example to illustrate. Given a function declared like this:
Line 17: Line 16:
 
; Result : This function will split the source_file into ceil(size(source_file)/size) pieces, placing them into the destination_directory.
 
; Result : This function will split the source_file into ceil(size(source_file)/size) pieces, placing them into the destination_directory.
  
Given this description of the function we can begin to construct a sheet. We start with the basic parameters - I'll use the Adam language to illustrate although this could be driven programmatically.
+
Given this description of the function we can begin to construct a sheet. We start with the basic parameters - I'll use the property model language to illustrate although this could be driven programmatically.
  
 
<pre>
 
<pre>
Line 31: Line 30:
 
</pre>
 
</pre>
  
At this point all we've done is collected the parameters to the function. Here we assume the source_file and destination_file is being provided from someplace. But what if they aren't valid. We can extend our model to ensure the preconditions for our function:
+
At this point all we've done is collected the parameters to the function. Here we assume the source_file and destination_file is being provided from someplace. But what if they aren't valid? We can extend our model to ensure the preconditions for our function:
  
 
<pre>
 
<pre>
Line 50: Line 49:
 
Here are invariant checks are very simple, but Adam allows us to easily plug-in C++ functions so we could add a valid_path() function if we wanted to be more sophisticated in our model (Adam also has an easily extended type system).
 
Here are invariant checks are very simple, but Adam allows us to easily plug-in C++ functions so we could add a valid_path() function if we wanted to be more sophisticated in our model (Adam also has an easily extended type system).
  
If either invariant is violated, than any cells which contribute the the invariant are marked as "poison" and any cell derived from poisoned cells are marked as invalid. In this way if we had a GUI connected to this sheet, and an OK button bound to result then the OK button would disable unless we had a valid source and destination.
+
If either invariant is violated, than any cells which contribute to the invariant are marked as "poison" and any cell derived from poisoned cells are marked as invalid. In this way if we had a GUI connected to this sheet, and an OK button bound to result then the OK button would disable unless we had a valid source and destination.
  
 
Invariants in our model are one way of expressing the functions pre-conditions.
 
Invariants in our model are one way of expressing the functions pre-conditions.
Line 81: Line 80:
 
We could connect a UI to this and the user could edit either size or count and the other value would be properly reflected. When the user clicked OK (which is bound to result) the application would receive a valid set of parameters for split_file. The application also could retrieve the values in the sheet which currently contribute to this for script recording. So if the user opts to create 10 files and let the system calculate the size, then you could record the count - and play it back on a new file to break that file into 10 pieces. Likewise, someone scripting this system could provide either count or size, and the system would behave correctly.
 
We could connect a UI to this and the user could edit either size or count and the other value would be properly reflected. When the user clicked OK (which is bound to result) the application would receive a valid set of parameters for split_file. The application also could retrieve the values in the sheet which currently contribute to this for script recording. So if the user opts to create 10 files and let the system calculate the size, then you could record the count - and play it back on a new file to break that file into 10 pieces. Likewise, someone scripting this system could provide either count or size, and the system would behave correctly.
  
Enough with the short introduction - how does it work? The solver performs the following tasks:
+
How does it work? The solver performs the following tasks:
  
# initialization
+
# (re)initialization
 
# evaluate predicates for conditional relations
 
# evaluate predicates for conditional relations
 
# determine flow for active relations
 
# determine flow for active relations
Line 90: Line 89:
 
# evaluate any unresolved interface cells.
 
# evaluate any unresolved interface cells.
  
At least, that is the planned order - I'll talk a little about each -
+
I'll talk a little about each -
  
; Initialization : Currently Adam defers evaluating all initializers until the first update() call on the sheet. This means that initializers can refer to cells after them. This adds significant complexity which I don't believe is necessary. For basic_sheet_t, which is used for layouts, initializer are evaluated as they are parsed, the most significant impact of this is that when a function is connected to monitor a cell value it is notified immediately of the current cell value. If I made this change to the Adam sheet today I'd also have to change the order that widgets are connected to the sheet and layout. Currently they connect to the sheet first, then the layout, but this means they would be notified of their initial state (which could effect visibility - which effects layout) before being connected to the layout. The order of connections will be reversed in the future.
+
; (Re)initialization : Initializers are evaluated at the time cells are added to the sheet (when used with the language parser, this is in the order they appear in the sheet. Initializers for interface cells are kept, and contributing values to the initializer are tracked. When input cells are set, those cells are marked as dirty and when reinitialize() is invoked on the sheet, any dependent initializers are evaluated. Note that initializers on non-interface cells are discarded. Also, an interface cell may have a value which is not calculated by it's initializer (such as a cell set by the user) after reinitialize if it wasn't dependent on a dirty input cell. When an interface cells value is set by an initializer, it's priority is also updated. There is no difference between using an initializer and programmatically setting the cell. Initializers provide a simple "spreadsheet like" declarative way to populate a model from a set of external properties.
  
; Evaluate Predicates : Currently Adam will attempt to resolve any cells that aren't connected to a relate clause prior to resolving predicates (more on this later). Then it attempts to evaluate predicates and if it gets stuck (because a predicate goes through an unresolved relate clause) then it will back out and defer resolving the predicate until later. In the future, it will be an error to have a predicate involved in a relate clause.
+
; Evaluate Predicates : All predicates for conditional relations ("when" clauses) are evaluated. Cells contributing to a conditional relation contribute to all outputs. This is a conservative rule. A future version will track potentially connected components of relationships and only predicates of relate clauses which are potentially connected (would be connected if the predicate evaluated to true) to a cell which contributes to the output will contribute to the output. When clauses will also be decoupled from applying to a single relate clause, and will apply to a group of relate clauses. An "otherwise" construct will also be added. For example:
 +
 
 +
<pre>
 +
when (p) {
 +
    relate {
 +
        x <== y;
 +
        y <== x;
 +
    }
 +
    relate {
 +
        z <== y;
 +
        y <== z;
 +
    }
 +
} otherwise {
 +
    relate {
 +
        x <== z;
 +
        z <== x;
 +
    }
 +
}
 +
</pre>
  
; Determine Flow for Active Relations : For every active relate clause, exactly one of the terms must be executed in any state. To determine which term, we select the interface cell with the highest priority [oops, I haven't mentioned yet, but every interface cell has a unique priority, setting a cell raises the priority to the highest in the sheet. There is also an interface to touch cells which raises priority on a set of cells, without changing their relative order or values.] That cell becomes an anchor - it is currently evaluated immediately (and if it can't be fully evaluated, it is deferred and we tray again later). In the future it will not be evaluated immediately, only the flow will be marked.
+
; Determine Flow for Active Relations : For every active relate clause, exactly one of the terms must be executed in any state. To determine which term, we select the interface cell with the highest priority. Every interface cell has a unique priority, setting a cell raises the priority to the highest in the sheet. There is also an interface to touch cells which raises priority on a set of cells, without changing their relative order or values.] That cell becomes an anchor - it is currently evaluated immediately (and if it can't be fully evaluated, it is deferred and we tray again later). In the future it will not be evaluated immediately, only the flow will be marked.
  
 
Here there is a change in 1.0.13 from previous version, in previous versions determining the flow had no effect on existing priorities. In 1.0.13 the priorities are "rewritten" in the order the cells are evaluated in with all "roots" having the same relative order and all higher in priority then any derived cells. How this will happen will change in the future but the results should be the same. This change was done to prevent "back flow" of values when a conditional relation toggles (which could cause a cell to be derived from itself). This currently breaks the mark() function a bit, which remembers the current priority to filter out changes in contributing values for scripting. How mark() will work in the future is still a somewhat open issues (probably by capturing a complete list of contributing names).
 
Here there is a change in 1.0.13 from previous version, in previous versions determining the flow had no effect on existing priorities. In 1.0.13 the priorities are "rewritten" in the order the cells are evaluated in with all "roots" having the same relative order and all higher in priority then any derived cells. How this will happen will change in the future but the results should be the same. This change was done to prevent "back flow" of values when a conditional relation toggles (which could cause a cell to be derived from itself). This currently breaks the mark() function a bit, which remembers the current priority to filter out changes in contributing values for scripting. How mark() will work in the future is still a somewhat open issues (probably by capturing a complete list of contributing names).

Latest revision as of 15:23, 24 March 2009

This page is intended to document how the property model library (aka Adam) works, or in some cases is supposed to work.

A few notes on terminology:

property model library (aka Adam) layout library (aka Eve)

The property model library is a system for modeling relationships between properties (simple key/values). These properties often represent parameters and related attributes of a function (this is the case when the library is used for a dialog box in an application) but can also be document properties (the case when used for a palette). The data structure known as a sheet (the name is borrowed from spreadsheet) is created to describe the properties and their relationships. The resulting model is used to assist the user in providing a valid set of parameters to a function or setting valid properties in a document. The model is independent of any specific UI, and serves equally well as the interface to a scripting system as to a graphical UI.

It is important to point out that the property model library doesn't solve all UI issues. The purpose of a user interface is to assist the user in selecting a function and providing a valid set of parameters to that function. The property model library is concerned with the later part of this, what it means to assist in providing a valid set of parameters to a function. The library only deals with a limited but common subset of the possible models.

A simple example to illustrate. Given a function declared like this:

void split_file(path source_file, path destination_directory, size_t size);
Result 
This function will split the source_file into ceil(size(source_file)/size) pieces, placing them into the destination_directory.

Given this description of the function we can begin to construct a sheet. We start with the basic parameters - I'll use the property model language to illustrate although this could be driven programmatically.

sheet split_file
{
 input:
    source;
    destination;
    size: 4 * 1024; // default to 4K pieces
 output:
    result <== { source: source, destination: destination, size: size };
}

At this point all we've done is collected the parameters to the function. Here we assume the source_file and destination_file is being provided from someplace. But what if they aren't valid? We can extend our model to ensure the preconditions for our function:

sheet split_file
{
 input:
    source;
    destination;
    size: 4 * 1024; // default to 4K pieces
 output:
    result <== { source: source, destination: destination, size: size };
 invariant:
    valid_source <== source != empty;
    valid_destination <== destination != empty;
}

Here are invariant checks are very simple, but Adam allows us to easily plug-in C++ functions so we could add a valid_path() function if we wanted to be more sophisticated in our model (Adam also has an easily extended type system).

If either invariant is violated, than any cells which contribute to the invariant are marked as "poison" and any cell derived from poisoned cells are marked as invalid. In this way if we had a GUI connected to this sheet, and an OK button bound to result then the OK button would disable unless we had a valid source and destination.

Invariants in our model are one way of expressing the functions pre-conditions.

At this point we are assuming that we are simply being handed all of the parameters at the start - parameters that can be adjusted are stored in interface cells. We'll move size to an interface cell. But we can do better than just letting the user adjust the size - in our description of our function there is a relationship between size, the size of the source file, and the number of files (count) in our result. This relationship is reversible between the size and count. We can express that in Adam as well:

sheet split_file
{
 input:
    source;
    destination;
 interface:
    size: 4 * 1024; // default to 4K pieces
    count;
 logic:
    relate
    {
        count <== ceil(file_size(source)/size);
        size  <== ceil(file_size(source)/count);
    }
 output:
    result <== { source: source, destination: destination, size: size };
 invariant:
    valid_source <== source != empty;
    valid_destination <== destination != empty;
}

We could connect a UI to this and the user could edit either size or count and the other value would be properly reflected. When the user clicked OK (which is bound to result) the application would receive a valid set of parameters for split_file. The application also could retrieve the values in the sheet which currently contribute to this for script recording. So if the user opts to create 10 files and let the system calculate the size, then you could record the count - and play it back on a new file to break that file into 10 pieces. Likewise, someone scripting this system could provide either count or size, and the system would behave correctly.

How does it work? The solver performs the following tasks:

  1. (re)initialization
  2. evaluate predicates for conditional relations
  3. determine flow for active relations
  4. evaluate invariants - marking any poison cells
  5. evaluate output cells
  6. evaluate any unresolved interface cells.

I'll talk a little about each -

(Re)initialization 
Initializers are evaluated at the time cells are added to the sheet (when used with the language parser, this is in the order they appear in the sheet. Initializers for interface cells are kept, and contributing values to the initializer are tracked. When input cells are set, those cells are marked as dirty and when reinitialize() is invoked on the sheet, any dependent initializers are evaluated. Note that initializers on non-interface cells are discarded. Also, an interface cell may have a value which is not calculated by it's initializer (such as a cell set by the user) after reinitialize if it wasn't dependent on a dirty input cell. When an interface cells value is set by an initializer, it's priority is also updated. There is no difference between using an initializer and programmatically setting the cell. Initializers provide a simple "spreadsheet like" declarative way to populate a model from a set of external properties.
Evaluate Predicates 
All predicates for conditional relations ("when" clauses) are evaluated. Cells contributing to a conditional relation contribute to all outputs. This is a conservative rule. A future version will track potentially connected components of relationships and only predicates of relate clauses which are potentially connected (would be connected if the predicate evaluated to true) to a cell which contributes to the output will contribute to the output. When clauses will also be decoupled from applying to a single relate clause, and will apply to a group of relate clauses. An "otherwise" construct will also be added. For example:
when (p) {
    relate {
        x <== y;
        y <== x;
    }
    relate {
        z <== y;
        y <== z;
    }
} otherwise {
    relate {
        x <== z;
        z <== x;
    }
}
Determine Flow for Active Relations 
For every active relate clause, exactly one of the terms must be executed in any state. To determine which term, we select the interface cell with the highest priority. Every interface cell has a unique priority, setting a cell raises the priority to the highest in the sheet. There is also an interface to touch cells which raises priority on a set of cells, without changing their relative order or values.] That cell becomes an anchor - it is currently evaluated immediately (and if it can't be fully evaluated, it is deferred and we tray again later). In the future it will not be evaluated immediately, only the flow will be marked.

Here there is a change in 1.0.13 from previous version, in previous versions determining the flow had no effect on existing priorities. In 1.0.13 the priorities are "rewritten" in the order the cells are evaluated in with all "roots" having the same relative order and all higher in priority then any derived cells. How this will happen will change in the future but the results should be the same. This change was done to prevent "back flow" of values when a conditional relation toggles (which could cause a cell to be derived from itself). This currently breaks the mark() function a bit, which remembers the current priority to filter out changes in contributing values for scripting. How mark() will work in the future is still a somewhat open issues (probably by capturing a complete list of contributing names).

Relate clauses can cause cycles in the system - if all cells involved in a relate clause are resolved, without evaluating the relate clause, then that relate clause is ignored and a diagnosed is issued to cerr on debug builds pointing out that this relationship wasn't necessary in the current state. It is possible to construct a system where the relation in the cycle which is being dropped is implied under certain circumstance - here it was determined that there needed to be a way to specify which relation in the cycle is dropped. The keyword "weak" was added to allow you to mark a weak relate clause - any weak relate clause is resolved after all non-weak relate clauses, and if all cells in it are resolved first, then it is ignored with no diagnostic.

Other Subtopics

Path Stability

placeholder