Structured Assembly
Created Summer 1997 - Last Updated August 9, 1998 - Jack Harich - Document Map
With the summary motto:
Configure systems, code reusables.
Introduction
Identification of core principles is crucial for proper infrastructure design. With the right infrastructure, building systems from reusable parts is a "snap", pun intended. In the small, the infrastructure is an assembly tool. In the large, it is a suite of tools, practices, reusables and evolving systems, all based on the same unifying principles.
The first three principles are the crucial, uncommon technical abstractions. The last two play more of a support role. Then again, a better perspective is that Continuous Evolution is the kingpin and the rest are supporters. In that case you could say the goal of Structured Assembly is to support Continuous Change, and summarize by saying that Configurability enhances Continuous Change.
Just as Structured Programming greatly improved individual programs, Structured Assembly (SA) improves systems assembled from parts. One goal of SA is "good system structure". A system having good structure will exhibit these traits:
- Usefulness, to all types of users, including developers
- High understandability
- Consistency, including no runtime dependency failures and automatic enforcement
- Ease of assembly and evolution
It follows that a good SA tool will provide "good system structure" at all times, plus it should make the process efficient and enjoyable.
It also follows that SA principles can be derived from the pursuit of the above goals. To put this in perspective, let's review the principles of Structured Programming, originated by Edsger Dijkstra in 1969, which stated that programs should consist only of:
- Sequence - A set of statements executed in order
- Selection - A control structure causing statements to be executed selectively
- Iteration - A control structure causing statements to be executed multiple times
Another similar contribution was Structured Design, first described in 1974, and made up of these practices:
- System organization - Systems are organized into black boxes, routines that have well-defined, narrow interfaces and whose implementation details are hidden. This has come to be associated with a preference for encapsulation, high cohesion and low coupling.
- Strategies for developing designs, such as top-down decomposition and bottom-up composition.
- Criteria for evaluating designs.
- A clear statement of the problem to guide the solution.
- Graphical and verbal tools for expressing designs, including using certain diagram types.
As you can see, Structured Design was not nearly as focused on principles as Structured Programming, but attempted to provide a process or smorgasbord of best practices. It made valuable contributions but failed to take root, since no one process has yet proved to be superior. Thus as we develop the theory behind Structured Assembly we should be careful to stick to principles, and eschew sneaking in process or best practices. If these are needed then that should be a separate topic, since they are highly subject to change and appropriateness. Principles, on the other hand, tend to be stable and highly appropriate.
"System" here means a discrete collection of objects that perform a particular mission. We exclude large, complex systems such as the WWW, operating systems, etc. We are more concerned with applications and collections of applications.
1 - Hierarchical Composition
Complex systems are best internally organized and visually presented as a hierarchical tree of containers and workers. This is wonderfully understandable, consistent, navigable and scalable. It makes composing larger systems from smaller systems routine, a huge benefit.
Gone is the basic question of "How do I organize this system?" Experienced designers will have noticed that as their design shorthand gets smaller and more refined, it often slips into a hierarchy of subsystems rather than class models and such. Structured Assembly standardizes this tendency.
We will spare the reader a tedious justification of why Hierarchical Composition is best.
The simplest form of this would be the Bean Assembler System Tree.
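To make the container/worker tree concrete, here is a minimal sketch in Java. All names here (SystemNode, Container, Worker, outline) are illustrative assumptions, not the Bean Assembler's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a system is a tree of containers and workers.
interface SystemNode {
    String getName();
}

// A worker is a leaf node that does actual work.
class Worker implements SystemNode {
    private final String name;
    Worker(String name) { this.name = name; }
    public String getName() { return name; }
}

// A container holds child containers and workers, forming the hierarchy.
class Container implements SystemNode {
    private final String name;
    private final List<SystemNode> children = new ArrayList<>();
    Container(String name) { this.name = name; }
    public String getName() { return name; }
    void add(SystemNode child) { children.add(child); }

    // Render the subtree as an indented outline, one node per line,
    // which is essentially how a system tree tool would present it.
    String outline(String indent) {
        StringBuilder sb = new StringBuilder(indent + name + "\n");
        for (SystemNode child : children) {
            if (child instanceof Container) {
                sb.append(((Container) child).outline(indent + "  "));
            } else {
                sb.append(indent + "  " + child.getName() + "\n");
            }
        }
        return sb.toString();
    }
}
```

Because composition is uniform (a container can hold other containers), composing a larger system from smaller systems is just adding one tree as a child of another.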
2 - Anonymous Collaboration
A large problem with designing reusable components is that their interactions with other components can introduce so many dependencies that a component is hard to use outside of those dependencies.
In Java we have Java Beans as components in the small. They expose properties, events and methods. We thus have these forms of collaboration:
- Normal method calls
- Method calls using polymorphism via inheritance or interfaces
- Method calls using the Bean event mechanism
- Public fields, usually only used as constants
As you can see, interfaces are the key reusable collaboration mechanism. The problem is the interface must be known by each collaborator and many interfaces are needed. Suppose A, B, C, D and E are collaborating in a reusable manner and so each implements an interface. This introduces 5 interfaces into the system. With careful design this might be reduced to 3 or so, but you see the trend. A large reuse repository would have many interfaces and the reusing class would have to know what they are in advance. This would make very high reuse extraordinarily difficult.
What's needed is a very reusable way for classes to collaborate without knowing what other classes they are working with. This would make their relationship anonymous, giving the loose coupling so necessary for widespread reuse. It can be achieved with a single reusable event class and event listener interface, designed to handle all collaborations:
- Each event has a name. Listeners respond only to event names they are interested in.
- Each event has named properties. This allows sending and receiving state of any kind.
- Event properties can be set by the listener. This allows returning state of any kind.
- Collaborators use interfaces to be an event source or listener.
- Only mediators establish event source/listener links. We let the containers do this.
The simplest form of this would be:
public interface MessageListener {
    public void processMessage(Message message);
}

public interface MessageSource {
    public void addMessageListener(String name, MessageListener listener);
    public void removeMessageListener(String name, MessageListener listener);
}

public class Message implements java.io.Serializable {
    private String name;
    private java.util.Hashtable properties = new java.util.Hashtable();

    public Message(String name) { this.name = name.intern(); }
    public String getEventName() { return name; }
    public void setProperty(String key, Object value) { properties.put(key, value); }
    public Object getProperty(String key) { return properties.get(key); }
}
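The interfaces above define the contract but not the firing mechanics. Here is a minimal, self-contained sketch of how a source might manage listeners and fire messages. MessageSupport is a hypothetical helper, not part of the original design, and Message is re-sketched with a HashMap for brevity:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal re-sketch of Message: a named event carrying named properties.
class Message {
    private final String name;
    private final Map<String, Object> properties = new HashMap<>();
    Message(String name) { this.name = name; }
    String getName() { return name; }
    void setProperty(String key, Object value) { properties.put(key, value); }
    Object getProperty(String key) { return properties.get(key); }
}

interface MessageListener {
    void processMessage(Message message);
}

// Hypothetical helper a source can delegate to: listeners register per
// event name, and fireMessage notifies only those interested in that name.
class MessageSupport {
    private final Map<String, List<MessageListener>> listeners = new HashMap<>();
    void addMessageListener(String name, MessageListener l) {
        listeners.computeIfAbsent(name, k -> new ArrayList<>()).add(l);
    }
    void fireMessage(Message m) {
        for (MessageListener l : listeners.getOrDefault(m.getName(), List.of())) {
            l.processMessage(m);
        }
    }
}
```

Note the anonymity: the source never learns the listener's class, only that something registered interest in an event name.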
While this breaks Java's strong typing, it opens the door to loose coupling, resulting in highly reusable components. In practice the tradeoff has proven to be no problem, especially since we have extended the simple example above so that sources publish their event property types. We have to enforce strong typing with means other than the compiler, such as DTDs in XML.
Note we are not saying all classes should collaborate with anonymous events. Most components are a Facade for a subsystem of classes, which may use direct method calls for collaboration. Thus most high speed work is done efficiently with methods, not events.
An indication of the powerful flexibility of Messages: suppose a FileChooser component emits a Message named "FileSelected" with a String property named "FileName", and a TextEditor component is listening for a Message named "EditText" with a String property named "TextFileName". We'd like the two to collaborate without hard coding anything. This is easily done by configuring (with parameters) a MessageTranslator to translate the event. Poof! We have high reuse due to Anonymous Collaboration, which makes such tricks easy. The FileChooser sends a Message to the MessageTranslator, who translates it and sends it on to the TextEditor. The benefit here is that we do this without any custom code, since MessageTranslator is a parameter driven reusable. (thanks Curt)
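The translation step itself is simple. Here is a hypothetical sketch of a MessageTranslator; the real Bean Assembler class may differ, and for self-containment a Message is modeled here as just an event name plus a property map:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: renames an event and remaps its property keys,
// driven entirely by configuration data, with no custom code.
class MessageTranslator {
    private final String fromName, toName;
    private final Map<String, String> keyMap; // incoming key -> outgoing key

    MessageTranslator(String fromName, String toName, Map<String, String> keyMap) {
        this.fromName = fromName;
        this.toName = toName;
        this.keyMap = keyMap;
    }

    // Returns the translated {name, properties} pair, or null if the
    // event name is not the one this translator listens for.
    Object[] translate(String name, Map<String, Object> props) {
        if (!fromName.equals(name)) return null;
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, String> e : keyMap.entrySet()) {
            out.put(e.getValue(), props.get(e.getKey()));
        }
        return new Object[] { toName, out };
    }
}
```

The three constructor arguments are exactly the kind of thing the next principle, Parameter Driven, would supply as text parameters.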
3 - Parameter Driven
Structured Assembly is primarily a reusable approach to component composition, collaboration and configuration. Configuration is setting initial state so something can proceed to do its work. A component is not very reusable unless it can be configured to behave in a certain way. Configuration can be accomplished a variety of ways.
First there's the question of how best to describe and store the initial state that will be passed to a component or container for initialization. There are really only two choices: code or text parameters. Code is a terribly complex approach, introducing more problems than it solves. A gigantic benefit of text parameters is separation of business rules from behavior, i.e. separation of "what to do" from "how to do it". Examples of successful complex text parameters are XML, HTML and Visual Basic FRM files. We therefore choose text parameters.
Then there's the question of how best to pass parameters from text to an object. We could:
- Pass the text and let the object do the rest.
- Turn the text into a list of properties and set individual properties.
- Turn the text into a data structure and pass just that. Call this a Param.
Option 1 greatly expands a class's responsibilities and is not very object oriented.
Considering 2 and 3, perhaps the text properties themselves have a clue about what's best. From experience we see that complex components require complex initialization parameters, which means lots of parameters and deep nesting. Turning nested properties into a flat list of properties is difficult and increases tight coupling, and so we are left with 3 as the best technique. This could be called Bulk Parameter Driven. Note that a Param expresses a component's goal clearly and completely. It also has the benefit of an atomic mutator that can modify an object's or subsystem's behavior in a single transaction, which can be crucial if the system is running.
The simplest form of this for parameter driven components (containers or workers) would be:
public interface ParamDriven { public void setParam(Param param); }
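To make the Param idea concrete, here is a hypothetical sketch of such a data structure: a named node holding simple text properties plus nested child Params, mirroring the nesting of the text parameters it was parsed from. The class shape is an assumption; a real Param may differ:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Param is a named node of text properties with
// nested child Params, so deep parameter nesting maps directly onto it.
class Param {
    private final String name;
    private final Map<String, String> properties = new LinkedHashMap<>();
    private final List<Param> children = new ArrayList<>();

    Param(String name) { this.name = name; }
    String getName() { return name; }
    void set(String key, String value) { properties.put(key, value); }
    String get(String key) { return properties.get(key); }
    void addChild(Param child) { children.add(child); }
    List<Param> getChildren() { return children; }
}
```

Passing one such tree through setParam() is the "bulk" in Bulk Parameter Driven: the whole configuration arrives in a single call.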
How do we store an object's state for later re-initialization? Options include:
- Serialization.
- Mapping properties to columns, storing in a database.
- Letting the component do it any way it wants.
- Reverse translating the state into a Param and converting that to text parameters.
Serialization has problems with class migration, speed, language dependency and ability to hand edit stored state. Mapping to a database is wonderfully awkward and not very lightweight. Letting the component do it is anarchic and lacks a reusable, centralized approach. The 4th option has none of these problems and has the infinite advantage of circular reuse, since we end up with what we need to initialize the component the next time. Note we store only that state needed for re-initialization, not all state.
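A minimal sketch of the reverse-translation idea, with a Param simplified to a name plus a flat property map, and the text format an assumed key=value style (the real parameter syntax may differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: write a component's re-initialization state out
// as plain text parameters. The same text can later be parsed back into
// a Param and passed to setParam(), giving the circular reuse described
// above. Only state needed for re-initialization is written, not all state.
class ParamText {
    static String toText(String name, Map<String, String> props) {
        StringBuilder sb = new StringBuilder("[" + name + "]\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(e.getKey() + "=" + e.getValue() + "\n");
        }
        return sb.toString();
    }
}
```

Because the result is ordinary text, it can be hand edited, diffed and versioned, which serialization and database mapping make awkward.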
It seems that Bulk Parameter Driven is best for reusing subsystems, while setting properties individually is best for reusing individual classes. We have found that systems are assembled far more easily from subsystems, not tiny individual classes. Here each subsystem is a well designed mega component, which is a facade to a multitude of classes and smaller subsystems.
An interesting benefit of Parameter Driven is parameters are easily generated or modified on the fly, opening up new design vistas. Parameters are much easier and more logical to generate than code. For example in our first framework built on top of the Bean Assembler, the Data Framework, we frequently take 8 lines of "meta parameters" and expand them into 140 lines of normal parameters using the system schema, and then pass them to the ParamDriven component, a DataEditor. This is a huge productivity savings.
In practice a parameter driven mindset leads to different and much better designs. One no longer struggles with complicated model partitioning - instead the most important partition is already there: What and How. Bulk Text Parameters are also language independent, giving one the long term freedom to migrate from the old to the hot new language of choice far more easily. Parameters and the classes they drive can also be varied independently. For example you can swap out an entire parameter driven framework. Parameters can also be the end result of the requirements stage, particularly at the task level. As you can see, parameters open up a whole new and very rich set of possibilities.
Parameters are also known as Declarative Knowledge.
4 - Continuous Evolution
There is no such thing as writing new software. After the first compile and run it's all maintenance. What we really do is evolve toward a goal, just like Mother Nature. Therefore system infrastructure must be designed to support continuous change as the norm. If it cannot do this Hyper Change Overload will occur.
The compile and run cycle is a stone age throwback to the code centric mindset we have been in far too long. Why not eliminate it? Replace code with parameters, which require no compilation, and the compile and run cycle disappears.
The design and implement cycle is also unnecessary. Why not use infrastructure that allows system modification anytime, whether the system or subsystem is running or not? This becomes easy with parameters, which unlike classes can be modified and redeployed with ease.
Thus if a system is defined as a hierarchy of containers and workers, and these are both initialized with text parameters, the system can be continuously evolved at runtime.
The simplest form allowing Bulk Parameter Initialization and Continuous Evolution is:
public interface ParamDriven {
    public void setParam(Param param);
    public Param getParam();
    public void applyNewParam(Param param);
}

setParam() is used for initialization, getParam() is used to get the Param to modify, and applyNewParam() is used for later re-initialization using the modified Param.
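A hypothetical sketch of the evolve-while-running cycle these three methods enable, with Param simplified here to a flat Map for brevity (the component name and its "level" property are illustrative assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameter driven component. Its behavior is wholly
// determined by its Param, so changing the Param changes the component.
class LogWorker {
    private Map<String, String> param = new HashMap<>();

    // Initialization: the entire configuration arrives in one call.
    void setParam(Map<String, String> p) { param = new HashMap<>(p); }

    // Hand back a copy of the current Param for modification.
    Map<String, String> getParam() { return new HashMap<>(param); }

    // Atomic re-initialization: the whole configuration changes at once,
    // with no window where the component holds a half-updated state.
    void applyNewParam(Map<String, String> p) { setParam(p); }

    String getLevel() { return param.get("level"); }
}
```

The cycle is fetch, modify, reapply: no compile, no restart, just a new Param applied to a running component.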
5 - High Quality Components
Without good components it's garbage in, garbage out. We really don't need to do much original work here, since Object Oriented principles have long identified what a good object (aka component) should do. Candidate components for a reuse repository should pass a checklist with items such as the following:
First and foremost are:
- A component is designed to work well with other components in a system.
- A component does one thing and does it well.
- Has a small, simple API hiding powerful abilities.
There are a few High Quality traits that apply to SA components and not objects.
- Container centric. Uses container services and pretends the rest of the world doesn't exist.
- Has a minimum of event interests.
- Fires a minimum of events.
- Supports having its initialization parameters reapplied while running.
These High Quality traits are similar to good objects:
- A component does too little rather than too much. Avoid the kitchen sink complex.
- Smaller is better. (Less is More) Keep your components small (when possible) and nimble.
- Collaboration needs are published.
- How to use it is published, with code examples as necessary.
- A unit test is available if complex.
- Reliable.
- Very flexible, handles a variety of needs.
- Failure impact is localized.
- Simple to understand and use.
- Uses a minimum of other components.
- Black box reuse. The user should not have to examine code to reuse a class.
- Has been reused enough to optimize and stabilize the component's design.
The five principles of Structured Assembly yield ultra configurable systems. A system built using Structured Assembly is very configurable, a huge productivity gain. Now configuring different copies of the same system for different customers or markets is as easy as falling off a log. So is creating completely different systems from skeleton systems. So is modifying a system to meet new user needs. So is making trial changes to test behavior or show a customer. So is shipping systems that are mostly configured by customers. So is meeting our goal of supporting a Continuous Change Process. So is....
"The ideal of beauty is simplicity and tranquility."
"We cannot possess what we do not understand."
Johann Wolfgang von Goethe
For an exploration of trying to do all this in a configurable manner, see the Configurability document. In fact, with a large amount of magic, experience, failure, more experience and mumbo jumbo, we can rephrase most of the above verbiage as the Principles of Configurable Systems.
Comments
"I, as a software architect, do not consider a design a failure or success until it has had to adapt to change (either functionality or interaction wise). Smarter, smaller sub-systems was the answer which is working for us. They are obviously easier to manage, modify and reuse, but the big benefit comes when a situation requires a major change. The smaller systems tend to minimize the impact of the changes. To me, a failed design is one that does not adapt."
"Another aspect is that a big obstacle to actual reuse often is the fact that classes tend to depend on other classes much more than anticipated, e.g. classes draw in clusters which may draw in other clusters and then you suddenly find you have a whole lot of unwanted stuff in your project. Then you can either accept the bloat or forget about the reuse and code from scratch. Neither is a good solution.
"All this seems a lot easier to avoid with Beans since they don't have a lot of formal or infrastructural requirements. And with SA's idea of anonymous cooperation via untyped events, the goal of actual reusability will definitely come even closer.
"Another advantage of the single setParam() that I cannot remeber seeing explicitly mentioned in your paper is the fact that it allows an object's or subsystem's configuration to be altered in one atomic transaction. With discrete setPropertyX() methods, one would have to provide (and require the use of) methods to lock/suspend/reserve an object or subsystem to be able to change a set of interrelated operating parameters in a safe, race-condition free manner." (We have changed the document to mention the 'benefit of an atomic mutator')