192 81 17MB
English-German Pages 502 [504] Year 1975
ON THE DEVELOPMENT OF SYSTEMS OF MEN AND MACHINES H. D. Mills,
IBM Corporation
and The Johns Hopkins University
With Appendix by R. C. Linger,
IBM Corporation
ABSTRACT
We formulate the development programming
of systems of men and machines
problem for multiprocessing
are men and some are machines. courses,
etc., are determined
as machine specifications, of the total operation. miniature
in which some processors
In this way, users guides,
training
from processing requirements,
just
which are consistent with the objectives
An Appendix illustrates
for a supermarket
as a
this idea in
checkout operation.
Systems of Men and Machines
Our topic is the architecture, large s y s t e ~
implementation,
of men and machines
enterprise -- managing a business, tion system,
women)-tenders;
operating an airline reserva-
running a government agency,
and back, etc.
and operation of
in some definite and coherent
getting men to the moon
In such systems there are many kinds of men (and
managers,
clerks, specialists
of various kinds, and machine
there are also many kinds of machines -- computers,
minals, sensors,
Ordinarily
actuators,
communications
computer programming
of the system, and programmers
equipment,
that modern principles necessity
etc.
is regarded as part of the machine side as part of the machine tenders.
We bring a different view -- that the architecture tions of the enterprise
ter-
is programming, of programming
as well.
of the operaOur thesis is
-- forced on us out of
in dealing with machines of much logical capability but no common
sense -- can play as vital a role in bringing systematic
discipline and
standards into all phases of large systems development.
For this point of view we escalate the concept of programming providing comprehensive The operations
instructions
to that of
for either man or machine activities.
of an enterprise becomes a multiprocesssing
operation of men
and machines;
then the architecture
tions and types of men and machines direct each type of man and machine. one part of a cooperating
of the operations in the system,
defines the configura-
and the programs which
For example, a users guide becomes
system of programs operating in separate processors
(the user and a machine).
Of course~
the characteristics
of man and machine are quite different
in
such a system, just as machines are different among themselves.
And yet
their architectural
For example,
properties
can be treated in a uniform way.
in deciding on a particular machine requirement,
various considerations
of
physical or logical capability will arise, as well as whether the requirements can be met with off-the-shelf apply to a man requirement,
equipment;
these same considerations
in common terms -- e.g., how many letters can
a postal clerk sort in an hour, and can this be done with minimal
training,
etc.
It is well understood things.
that men and machines do well at quite different
Machines are good at doing what they are told to do, very rapidly
and accurately. instructions
Men are good at using common sense -- even disobeying
when they are obviously misconceived;
-- discovering
at pattern recognition
information by no special or dependable process;
invention -- creating a new idea for the enterprise to act on. but very important
form of pattern recognition
speech to and from machine readable text. activities
and at One simple,
is the translation of human
This is routine for clerical
that interface with persons outside the enterprise -- e.g., in
an airline reservation system.
It may be asked how the concep~ of programming so differently and unpredictably,
applies to men, which operate
compared to machines.
noting that programs are used in a local way by machines moment,
under these conditions,
do this next.
It applies by -- i,e., at this
A good deal of programming
on the man side is already subsumed under general instructions
and common
sense -- if the telephone rings, answer it; if you want to execute a program, keypunch a job deck and submit it to the operator. of explicitly programming
human activities
We have no intention
now done by general instructions
and common sense; the typical level of human programming which is found in users guides, operating with machine operations.
instructions,
envisioned
is that
ere., associated
Principles of Pro~rammin~ for Men and Machines
The recent, twenty-five year, history of computer programming has seen an explosive and traumatic growth of human activity and frustration in attempting to realize the promise and potential of electronic and electromechanlcal devices for processing d~ta in science, industry and government.
Out of
this history has come the stored program computer; programming languages, compilers, and libraries; and new technical foundations for programming computers, e.g., as propounded by McCarthy [ 4], Dijkstra [ i], Hoare [ 3], and Wirth [I0].
In this period the computer has been the center of attention.
In the beginning, the numerical computing power was so great, compared to manual methods, and the availability so limited, that men readily adapted to this new tool -- from decimal to binary, from equations to algorithms. But in a short time, the remarkable possibilities for more general data processing (nonnumerical) were realized, and a new industry was horn in just a few years.
In the later part of this period (up to now) the large
data processing systems appeared -- management information systems, airline reservation systems, space tracking systems, etc.
Even then, although
human factors were considered, these systems were conceived primarily as data processing systems, which responded to users.
But in our proposed
perspective, the users are as much a part of the multiprocesslng enterprise as the machines and their programs.
Thus, whereas the computer has forced us to find more effective programming principles than we would otherwise have, it has also warped our sense of perspective. focus.
By this time, simple human factors questions are in better
There is enough data processing power available to invest part of
it in creating more human-like interfaces than binary and machine code. But in extending programming to instructing men, with their entirely different characteristics, additional ideas and principles are needed, such as
1.
Languages.
The programming languages used in the architecture of
systems of men and machines need be near natural languages.
The difficulties
of processing natural languages are well known and that is not proposed. Rather, what is proposed is a "naturalization" of processable programming languages which is close enough for use as a dialect of natural language by nonprogrammers of the enterprise. both sides of this.
Sammet [ 8] and Zemanek [ii] discuss
2.
Procedures.
The concept of procedure should be extended to include
indefinite procedures where it is not possible or desirable to define them. In simple terms, "find values for these variables so that these equations hold" (Wilkes discusses this in [ 9]), or more complex, "make sure no one's feelings are hurt".
3.
Interactions.
The principal subject of the architecture is multi-
processing -- the conduct of the operation of the enterprise through programs distributed to men and machines.
The creation of such programs in an orderly,
systematic way will require a new and fundamental development beyond programming principles for synchronous operations.
The idea of the Petri net [ 7]
may be an embryonic step in such a development.
Dennis illustrates
Petri nets as multiprocessing control mechanisms, in [ 2].
Architecture P r i n c i l ~
A system depends on its components.
The architecture of a system specifies
the types of men and machines required~ as well as how they are to interact as a system operation, i.e.~ the selection and arrangement of the system components, as well as detailed instructions for their behavior.
In the
case of machines, the usual considerations apply -~ define feasible requirements, either already embodied in existing maehines~ or possible with special development where justified; tradeoffs and comparisons with alternative approaches, even with manual approaches where feasible~ etc. men, the system architect is frequently shortsighted.
In the case of
There is a reason; it
is much more difficult to predict a human performance than a machine performance.
Sometimes the human fails to live up to a requirement.
But often
the human will exceed a requirement in a totally unexpected way, by acquiring a skill not imagined possible beforehand.
This is happening with computer
programmers right today, who are beginning to program with a precision not believed possible five years ago.
It happened with typing when touch typing
was introduced early in this century.
In retrospect, it is easy to identify a pitfall in overestimating machine possibilities, leading to a common scenario in many large systems in recent history, which put a "man in the loop" at the last moment, with marginal operational results.
In such a case, the operation was originally planned
as completely automatic, depending on some key algorithm (often involving some form of pattern recognition); as a result of the planning, the data processing functions surrounding the algorithm were developed in parallel
according to a general system design; then at the last moment, the performance of the key algorithm proves inadequate, and the man is brought into the loop, with two costs: 1.
The human factors are bad (e.g., interacting with programs which
require long, fixed argument lists); these can be fixed up.
2. the loop.
There is a lost opportunity in not having a better trained man in The effort on algorithm development is frequently different than
that required for insight development for a human executing an indefinite procedure, and the time is lost, anyway.
These operational experiences lead to the following principles of systems architecture:
i.
Components.
Regard men and machines as equal status components
for system operations with equal requirements for development, state-of-art projections, and improvement, according to their own characteristics.
2.
Evolution.
Plan on the unexpected, by well-structured interfaces,
that permit the replacement of components by improved versions which perform identical functions more effectively.
Parnas [ 6] deals with this inter-
changability by axiomatizing such interfaces.
3.
Intesrity.
Value system integrity above all else, by requiring
that the multiprocessing operation of men and machines be described and scrutinized according to the best principles of programming -- particularly with respect to methods of specification and validation of programs.
Note
especially the technique defined by Wirth [I0] and Mills [ 5].
In an Appendix dealing with a minature problem, R. C. Linger illustrates the idea of programming an operation through a set of abstract processes which only later are specialized to either men or machines, depending on their requirements.
It is an easy transition from man as a user to man as
a processor and yet it seems a critical one in providing system coherence and integrity with respect to a given operation.
Literatur
[i]
O. ~J. Dahl, E. W. Dijkstra and C. A. R. Hoare, Structured Programming, Academic Press, London. 1972.
[2]
J. B. Dennis~ ~Concurrency in Software Systems", Lecture Notes in Economics and Mathematical Systems, 81 Advanced Course on Software Engineering (Ed. F. L. Bauer), Springer-Verlag, Berlin, Heidelberg~ New York.
[3]
C. A. R. Hoare~ ~An axiomatic basis for computer programmlng"" CACM 12.
[4]
1973.
i969.
pp. 576-580, 583.
J~ McCarthy, "A basis for a mathematical theory of computation", Co___mputer pro~rammin$ and Forma! Systems, (Eds. P. Braffort and D. Hirschberg)~ North-Holland Publishing Company.
[5]
(To appear).
D. L0 Parnas, "A technique for software module specification with examples", CACM 15, 5.
[7]
pp. 33-70.
H. D. Mills, "The new math of computer programming", CACM 17, 1975.
[6]
1963.
May 1972.
pp. 330-336
C. Ao Petri, Communications with Automata.
Supplement i to
Technical Report KADC-TR-65-377, Vol. i, Griffiss Air Force Base, New York.
1966.
(Originally published in German:
Kommunikation mit Automaten,
[8]
J. E. Sammet, "The use of English as a programming language", CACM 9, 3.
[9]
March 1966.
pp. 228-230.
M. Wilkes, "Constraint-type statements in programming languages", CACM 7.
[i0]
University of Bonn, 1962).
1964~
pp. 587-588.
N. Wirth, Systematic Programming: Hall, Englewood Cliffs, New Jersey.
[Ii]
An Introduction, Prentice1973.
H. Zemanek, "Semiotics and programming languages '~, CACM 9, 3.
March 1966.
pp. 139-143.
Appendix
SUPERMARKET
CHECKOUT AS A MAN-MACHINE MULTIPROCESSING
ACTIVITY
R. C. Linger, IBM Corporation
Consider the problem of specifying a design for a super market checkout operation.
Several possibilities
and cashbox,
come to mind; a man with an adding machine
a man with a cash register,
encoded prices,
etc.
Whatever
design with a procedural
a man with an OCR device which reads
the final configuration,
we can begin our
description of the checkout process itself.
view, the dynamics of the process are of central interest.
In this
That is, we do
not begin with a static description of components which must somehow fit together in a presumed system~ but rather begin with a process which can be shown to work, and derive required
(and possibly alternate)
urations of men and machines from it.
component config-
Our tentative first refinement
is:
start checkout open checkout station at 9 a.m. do while before 5 p.m. checkout next customer,
if any
od close checkout station stop
Here, the actions of men and machine are abstracted;
we cannot determine
which does what, but can agree that the description seems reasonable so far. Our second refinement might be: start checkout establish man on duty at 9 a.m. power up machines at 9 a.m. do while before 5 p.m. accept next customer,
if any
total cost of all items inform customer of total accept payment present change, if any, to customer bag all items od balance cash total with sales total power down machines stop
Here the functions of man and machine begin to emerge; presumably
they will
cooperate to "total cost of all items" through manual keystroke entry or OCR input, and the man will "bag all items," without mechanical identify
"before 5 p.mo" as a man predicate,
for clockwatching,
We
which hopefully yields to common sense to complete checkout
of customers waiting in line at quitting timel processing"
assistance.
in view of the human propensity
arises in this procedure,
coffee break is desired~
The possibility
of "exception
as when the machines break down, or a
These situations
are analogous
to the interrupt
handling facilites of modern computers.
An aspect of the activity distribution between man and machine can be explored through elaboration
of the process to "total cost of all items:"
start total cost of all items initialize subtotals
to zero
do while items remain if item type is produce then add price to produce subtotal else if item type is meat then add price to m e a t
subtotal
else add price to grocery subtotal fi fi od add subtotals
to find total cost
stop
The cooperating process here~ expressions
give and take actions of man and machine appear as an abstract
"items remain" is likely a man predicate,
appear to be machine predicates
set by man.
lations are best handled as machine functions.
while the "item type" The subtotal accumu-
We can establish concrete procedures for man and machine by defining an interface between them.
In illustration, consider an electromechanlcal
cash register with control keys as follows:
0, 1,...,9
price digits decimal point
I
initialize (for new customer)
P
produce
M
meat
G
grocery
T
total
For this interface, the man procedure becomes
start total cost of all items -- man part push I do while items remain get next item push digit and decimal keys for price i_~fitem type is produce then push P else i f item type is meat then push M else push G fi fi od push T stop
and the corresponding machine procedure is:
10
start total cost of all items -- machine part (I key depressed) set produce subtotal to zero set meat subtotal to zero set grocery subtotal to zero do until T key wait for TIPIMIG key i f ~ T key then read and clear price register I f P key then add price to produce subtotal else i f M key then add price to meat subtotal else add price to grocery subtotal fi fi fi od add produce, meat, grocery subtotals display result
A different set of procedures derive from an alternate interface, as with a cash register using OCR input, and equipped with only I and T keys.
In
this case the man procedure simplifies to
start total cost of all items -- man part push I do while items remain pass item label over OCR window od push T
and the matching machine procedure must extract both item type and price from encoded labels.
We observe that the man procedures above can evolve naturally into users guides and training courses, possibly containing instructions no machine would understand, as, "Don't be distracted while pushing price keys."
The
machine procedures give a explicit basis for electromechanical component design, or executable procedures in the case of programmable devices.
A New Look at the Program Development Process P. Hiemann,
IBM Boeblingen
This paper is a contribution to the discussion of defining a program development process frame with distinct validation points for analysing and measuring process results providing methods,
techniques and tools to support
the program development process.
O.
INTRODUCTION
In the 25 years history of professional system programming,
system
programmers have developed systems of increasing complexity in function and size. Their experience shows that it is increasingly more difficult to meet specified system integrity/quality objectives without complying to disciplined programming methods based upon a firm definition of the program development process. System programmers have identified and developed a series of programming support techniques which are mandatory for producing and validating a high quality of specifications and code. In the first section of this paper, a process frame with its validation points and some basic process measurements are described. The second section describes the methods supporting the program development process. The final section of the paper provides some proof regarding the quality of succeeding OS releases that have been improved by applying new process support techniques.
12
I.
THE PROGP~d~9,1ING PROCESS AND SOME MEASURES
Most people
think of programming
four activities Designing:
(Figure
Where
Testing_! Where
While
during
is written°
that
testing
the pieces
an application, testing
composed of
do and how it w~ll work.
it is verified
uncovered
I n t e g r a t i n g : Where create
will
Where the program
problems
as being
it is determined what is needed and defined
what the p r o g r a m m i n g Coding:
cycles
I):
the program will work and are fixed.
or modules
a subsystem,
and integration
are put together
or an entire
are closely
to
system.
related
in that inte-
gration has to take place before
larger units of code can be
tested,
if one has to go back to design
it is resource
during coding The results programs
or even testing.
of coding,
of growing
testing
to the smallest
It may be a module,
- A function a command, - A component
is composed
a macro,
is composed of functions.
- The components
self-contained
piece
of
A function may be
routine. A component may be an
or the supervisor.
are integrated
I, OS Release
2)
or a subroutine.
or a recovery
a compiler,
are operational
(Figure
of program units.
a parameter,
access method,
Release
and integration
size and complexity
- Program Unit relates code.
consuming
to form a system~
21, and so on.
as VS/2
13
About
two or three years
and statistical
ago,
information
customer was experiencing
IBM started to analyse
relating
to quality of systems.
an APAR rate that was too high.
shows the average number of all APARs in other words, customers. APARs
financial
all submitted APARs,
The
Figure
3
for OS submitted per customer, divided by the number of
The area labeled Valid is the average number of valid
submitted per customer.
A valid APAR is one that requires
a fix either in code or in documentation. valid APARs
and the total,
user errors.
largely represents
The average number of APARs
not changed significantly
The difference duplicates
between and
submitted per user has
over the past four releases,
nor has
the average number of valid APARs. Corresponding
to the high error rates the analysis
76 % of the cost for Release
only 24 % was spent on development cludes
cost related to designing
and printing publications, integration. laboratory
Maintenance
(Figure 4). Development
and coding programs,
and to performing includes
in-
to writing
all testing
and
FE cost to find problems
also indicated that the cost of finding
increased as the programming
process progressed.
and
thirty times the cost of fixing it in unit test, that the code goes through
The cost of finding As we progress
a problem during
coding
at each testing
stage. More
with the code as it moves
(Figure
5).
is virtually nothing.
field usage,
and more people
through the cycle,
(an APAR) was
the first set
in development
±rom unit test through
a problem
The cost of finding
and fixing a problem after the system was released
creases
and
costs to fix the problems.
The analysis
of tests
showed that
]8 was spent on m a i n t e n a n c e
the cost inare involved
and more people
and their work are affected when a problem occurs. The net result of the analysis was to establish that will help in: reducing
errors
finding problems
as early as possible.
an improved process
14
The base of the new process was a frame of seven distinct phases with distinct Validation
validation points at the end of each phase
regards
to
Quality of the results produced Progress
in terms of goals
Usefulness
determined
during one phase
and schedules
and Profitability
to be developed The process
of the system/component/function
further.
should be continued validation
criteria
to its next phase only~
about the functions
data processing
capability.
if pre-
are met.
Phase 0 is a planning phase to develop statements
(Figure 6).
requirements.
These are
needed to provide new or improved
People study,
analyze,
and perhaps
survey users to determine what they require. Phase I determines package° formance~ Phase
the requirements
It addresses
configuration,
storage
programming
requirements,
per-
etc.
II is the external
commands,
for a particular
parameters,
design phase.
output
major system components
the module
and interfaces
and the interfaces
design phase~
structure
are developed.
that covers all functional Phase 0 to Ill are strictly the implementation
(for example with
are developed.
Phase !II is the internal the program,
formats)
User interfaces
The internal
logic of
as well as internal
During this phase,
variations sequential
and resting'phases,
data areas
a test plan
is also developed. steps; phase however,
IV and V
interact with
each other~ Nevertheless,
Phase
IV is the main coding phase
and function
testing.
including
unit
15
Each phase has its own unique output or results: Documentation
in the first 4 phases
Code in Phase IV A system in Phase V Fixes in the maintenance Besides
the planning
directly Market
Phase VI
documentation,
the following
documents
relate
to the program being developed:
requirements
define a need,
considerations.Objectives programming
package.
of the program,
independent
of any programming
state the requirements
External
specifications
its user interfaces,
other parts of the system.
and its interfaces
Internal specifications
logic and the method of operation of the program interfaces.
Code is also considered
field engineer
receives
of a particular
define the purpose the
and its internal
as documentation
source code documented
with
describe
since the
in the form of
microfiche.
2.
THE METHODS
According
SUPPORTING THE PROGRAMMING
to the phase structure
PROCESS
of programming
projects,
there
are -
methods
supporting
-
methods
and criteria
results
of a phase
In describing -
the programm
the methods,
Principles Tools and Techniques Management
Aspects
development
process
to check the completeness
this paper distinguishes
of the
between
16
2.1
Principles
2.1.1
Top-Down Design
Top-down design is used today for most new development projects° In top-down design, we take a functional viewpoint.
We define
the basic functions of the program and design them with a high level of detail.
Then each of those basic functions
into more detailed subfunctions. of detail,
is broken
We move down through levels
creating a tree structure like the one shown in Figure
We continue this process until all subfunctions a consistent
are defined to
level of detail. When we are finished with the top-
down design process,
we will know about all of our interfaces,
all of our logic decisions, box in the tree structure a code segment,
and how the data is structured.
represents
Each
a function and, in most cases,
an inline block of code, or a subroutine
that
does not exceed a listing page. Top-down design avoids simultaneous, finition.
inconsistent
interface de-
Top-down design has reduced the complexity of the design
and of the programs
that result from the design.
for orderly logic development.
It also provides
When design is done from the bottom
up, decisions may be assumed to be in the upper-level
logic,
when in fact they should happen in the bottom levels.
2.1o2
Structured Programs
(Figure 8)
To reflect the program structure
that has been developed during
design, we emphasize on development of structured code. Structured code is composed of one-page
segments.
Control always enters
a segment at its top and leaves at the bottom. All segments referenced by name. GO TOs are not permitted. returns
7.
to its caller or moves
are
Each segment either
inline to the next segment.
17 Some advantages
-
of structured
code are:
It is easy to read. A listing of a program can be read and understood if the program
- Structured
can learn structured
and practice;
Elements
Structured
coding methods.
and to extend.
coding in about a week of
and once he has learned it, he generally
will not go back to traditional 2.1.3
of Structured
methods. Code
(Figure 9)
coding uses basic program elements
to code any program.
By using simple building
blocks
THEN-ELSE,
we can simplify programming,
and DO-WHILE,
code
This is often impossible
is written using traditional
code is easier to maintain
A programmer coursework
very quickly.
in structured
for sequential
statements,
IF-
reducing
complexity. 2.1.4
Continuous
As mentioned
earlier,
of code pieces
(Figure
code development
of growing
We use a process integrated
Integration
10)
implies
size and complexity
called continuous
the integration into the system.
Integration.
Functions
into the system as they become available
to a disciplined
plan.
be added in a logical
"Disciplined" sequence,
voted to the right functions
means that functions must
and that resources must be de-
at the right time. We start inte-
gration early,
for large efforts
fore shipment.
A driver is a subsystem used for subsequent
Drivers
are
according
are built by continually
as much as eighteen months beadding
functions.
system followed top-down design and implementation, can take the integration plan directly
testing.
If the entire then one
from the top-down plan.
18
Here
are some of the advantages
of continuous
integration:
Code is entered
into the system when it is complete.
sets of modules
are added at one time,
to put all the modules gration process
Small
instead of trying
together at once~
i.e.
the inte-
is less complex.
A real system is always
running,
and it is easier to keep
it running. Detailed planning to think through
becomes
their dependencies
faces will become more to become more
2.1~5
Tests
We d i s t i n g u i s h gration
necessary.
apparent.
realistic
are forced
so that their inter-
It also forces people
in their scheduling.
& Test Criteria between
Programmers
(Figure
11)
four types of tests
according
to the inte-
levels:
. Unit Test o Function Test Component
Test
System Test Unit testing It tests
is done by the developer
the smallest
a module~ completion
a macro,
self-contained
or a subroutine.
after his code is completed. piece
of code,
The m i n i m u m
is that all the branches
for example
criterion
for
in the code are executed
both ways. Function test occurs test, we put units a command,
after unit
test is completed.
together to form discrete
a recovery procedure,
~or c o m p l e t i o n
is 1OO % execution
hich must be successful.
In function
functions,
or a parameter.
for example
The criterion
of all test cases,
90 % of
19
Component
test takes place after function test is completed.
Functions
are put together to form a component.
be an access method, for completion
the supervisor,
A component may
or a compiler.
The criterion
is 100 % of all test cases were executed success-
fully. The final test is system test. The components to form the system, Criteria
for completion
and all problems groups,
I.
are all test cases executed successfully
fixed. Whatever
it is important
throughout
are put together
for example OS 20.1 or VS/2 Release
criteria
that predetermined
the whole process
(including
is used by other development criteria are used
specification
validation)
and should never be compromised.
2.2
Tools and Techniques
This section describes
techniques
to perform their tasks according 2.2.1
HIPO
and tools that support programmers to established principles.
(Figure 12)
We are supporting
the design task by a technique
called HIPO
that in turn is supported by a tool that provides
of updating
and print facilities. HIPO stands for Hierarchy, means proceeding following
plus Input, Process,
from very generalized
top-down design practices.
sively lower-level
to very detailed
The diagrams
drawing,
refer to succes-
diagrams.
Each diagram shows the inputs to a function, within the function,
and the outputs.
volved in the process, replacement
Output. Hierarchy
therefore,
for flowcharts.
the processing
steps
The data and function in-
becomes visible.
HIPOs is a
20
2.2.2
Cause / Effect Graph
Due to the absence of formal design languages, gram functions racteristic
in prose~
of burying
less visible.
Unfortunately
relationships,
A technique
we describe pro-
prose text has the chai.e. to make relationships
of analyzing
external
specifications
is to draw a cause and effect graph. A cause and effect graph is a Boolean representation mented in specifications, representing expression. omissions
a valid such as a measure
review of a specification.
The cause/effect
test of all variations graphs
design of such test case buckets. and a tool produces all causes
Figure
the cause/effect
The test bucket
requires
Integration
at different
to each other°
codes the graph
14 shows what the tool would prograph shown in Figure 13: for having
all causes in-
observed.
integrating
levels and possibly
needs
The programmer
& Build Support
of continuously
Figure
of program
can be used to derive the
6 test cases
voked and all related effects
The prQcess
test case
the list of test cases needed for testing
and effects.
duce assuming
projects
They provide
a large effort is needed to develop
for a comprehensive
functions~
manner°
13 shows an example of a graph of how to construct
inconsistencies.
for a thorough
docu-
Test Case Design
Traditionally~
2.2.4
relationships
These graphs help to find errors early,
and a structure
buckets
Figure
some specifications
and logical
2.2.3
of the logical
code and building
drivers
for a series of programming
a complex tool to perform all steps in a controlled 15 shows the support
functions
and how they relate
21
The programmer stores new code in a development library. He tests his code under a specific test system called driver. Upon test completion
(successful) his code is integrated and stored in
a master library. At that point, this code becomes available to other development groups. In particular,
test system or drivers are built from the master
library and are then used by all development groups. Finally, the system to be released is build from the master library. Errors found during testing are fixed in the development library and, after successful testing the modified code is integrated into the master library. The whole integration and build process is fully controlled in that all actions are recorded and problems as for example unresolved dependencies are reported. Preferrably,
the described library system should be capable of
supporting specification development as well. 2.2.5
VM/370 (Figure 16)
Testing needs a high amount of computing resources; testing system programs in particular requires different hardware configurations. VM/370 provides for the capability to have a variety of test systems running on the same installation. This makes the testing effort more flexible and economical. 2.3
Management Aspects
The previous sections dealt with the principals of the programming development process and the tools supporting it. I want to add some comments about managing this process. All techniques and tools, phase plans, and controls will not work unless programmers are involved in the implementation of the process. The programmers must do the detailed planning upon which the manager can base his overall plans.
22
People are evaluated on the basis of the results whereby tity,
results
are stated in terms of quality,
and cost.
cording
It should become
and analyzing
Programming
Programmers Generally
professional
quan-
that reare part
responsibility.
Teams
of different
skills are best organized
in teams.
one to three teams report to a first-level
Each first-level
manager has a librarian
The roles of the team members Figure
timeliness,
common understanding,
the above process measurements
of a system programmers 2.3.1
they achieve
manager.
to support his teams.
and the librarien are shown in
17.
The team leader is responsible cification
and has technical
specifications,
for preparing
responsibility
and all code. He makes
for the project.
For most design,
his job is to review and analyse programmers.
him.
(Evaluation
and employee
addition,
the team leader designs,
The co-team
writes
and code rather
for specifications
logic specs,
In
and codes
that his team is producing.
of the product.
responsibility
logic
by other
are done by the manager.)
leader helps the team leader
also codes key elements leadership
appraisals
of the product
the programmer,
of end results
spe-
decisions
logic specifications,
and code,
the key elements
for all design,
the technical
the results produced
In this role, he supports
than evaluates
the functional
if necessary.
in all his duties
and
He will assume full team He is a technical
peer
of the team leader and his back up. The programmers and code,
in the team are responsible
their own detailed planning,
for lower-level
and testing.
design
23 The librarian is an important part of the team. His job is to create, maintain,
and own the library for the project. The project
library includes both documentation and code. The librarian schedules and receives the runs from the computing center, and provides clerical services to the rest fo the group. The librarian is a full-fledged team member. There are many advantages in programming teams. Senior people are directly and actively involved in the project.
In the past
these people would design the function and sometimes leave the project. Keeping senior people involved provides continuity. Since they generally produce work of higher quality, they can educate the junior members of the team. The team approach provides for more detailed and realistic plans. 2.3.2
Formal Reviews
During the last two years, there have been several projects which applied very formal review techniques to specifications,
code,
test cases, and publications. One review technique has been called Walkthru,
another Inspection which word stems from a common in-
spection practice in the world of hardware engineering.
Both
techniques are common in that a very detailed structured review, attended by the most knowledgeable and affected persons regarding system technology and dvelopment process,
is performed. However,
there is a difference between a Walkthru and an Inspection, in that only the latter emphasizes the recording and analysis of not only defects discovered but also development process measurments like number of defects by type. The formal review is used by a developer as a resource to help him produce error-free products. The developer's attitude must be that other people are there to help, not to kill him with personal comments. The other participants
(a moderator,
the designer,
24
other developer(s),
a tester) must also realize that they are
there to help. Managers feelings.
Without
will not work, evaluating
this psychological
Formal
number of problems
Formal
reviews
to reinforce
atmosphere,
of existing
just that they are found
or projected
are performed
study the external
quality exposures.
during phases
questions.
specification
They probe
II, III, and IV. Fi-
by all present.
the material
The issues
walkthru
concentrates
material
like HIPOs
a code walkthru.
The participants
and related material
it for errors
In the meeting
about the
should use the number of problems
18 shows the way a Phase II walkthru works.
effect graphs.
tool for
is not concerned
found at this stage;
these
formal reviews
are not a management
The manager
The manager
as an indicator
gure
reviews
programmers.
and corrected.
have consciously
a list of
is thoroughly
analyzed
are only recorded;
on internal
like cause/
and develop
the phase
specifications
and Test Cases. The phase
III
and related
IV walkthru
is
The length and the number of participants
may
vary. A review of an entire spec could involve eight people, a couple of days notice,
and an offsite meeting.
few changes have been made to an existing may involve only three people: and the independent We have analyzed The results a problem
If relatively
spec, the walkthru
the developer,
his team leader,
reviewer.
the effect
of walkthrus
on costs
(Figure
19).
showed that it was 14 to 15 times cheaper to find
during a walkthru
test is the point
than it was in unit test. And unit
in the testing
cycle at which it is cheap to
find and fix an error.
3.
CLOSING
We believe
to have proof that the analysis
the programming quality,
development
and improvement
process has resulted
i.e. fewer APARs relative
of
in improved
to the released
code
(Figure 20)~
25
We are observing
even better results with newly developed
code.
Figure 2] shows the APAR rate for the TSO scheduler code the rate of which is about one third the rate of code that consists in a conglomerate The same methods
of old, changed and added code. that increase quality have also a favorable
effect on costs accounted
over the whole process
from inception
until shipment of a system/component. We hope that more and more system programming contribute veloping
new ideas on how to improve
systems.
will
of de-
This will be needed if new systems of growing
size and increasing tude in managing
professionals
the total process
functional
capability
the development
system programmers,
and hardware
require
another magni-
process which involves engineers.
users,
26 REFERENCES_
(I)
W.B° Cammack and H.J. Rodgers, Jr. Improving the Programming Process, IBM Technical Report 00.2483 (Oct. 73)
(2)
W.R° Elmendorf Cause-Effect
Graphs in Functional Testing
IBM Technical Report 00.2487
(3)
(Nov.
73)
F.T. Baker System Quality through Structured Programming 1972 Fall Joint Computer Conference
(4)
F.T° Baker Chief Programmer Team Management
of Production Programming
IBM Systems Journal, Vol. If, Nov. I, 1972 (5)
H.D. Hills Top-Down Programming
in Large Systems Courant Computer Sciences
Symposium I New York Univ., June 1971
(6)
(7)
P.M. Metzger Managing a Programming Prentice-Hall 1973
Project
IBM HIPO Audio Education Package SR20-9413
27
Fig.
I
System build up
Fig.
2
28
Experience Average APARs Per User OS
4
Total APARs/User
Valid 18
19
20
21
Release Fig.
3
Release 18 G FCS ÷ 2yrs.
Fig.
4 Ill D ' V
29
of
0 L-.-~
C~ling
Fig.
Unit Test Time
APAR's
5
Fig. 6 see page 3 o - ~
Top- Down Design ....
Fig.
7
30
8t
Structured Programs
Fig.
Structured Program Elements Sequence rNm
Lm
m.m
n
w,m m ~
n
i
mmm ~
I
i
~
n
~
~
mmm ~
m
m
~
~
~
mmm w,m
m ~ e m m m
~
i
um.
m
w.n
~
~
~
~ *
uw
~
,..~
u--.i
u
If T h e n Else
r
~
m
,
~
I i
~
I
.............. L
~
mmm ~m
wmmP I n t o
~m
~lmm uwm
~
~
J
L. . . . . . . . . . . . . . . . . . . . . . Do While
Fig. 9
~
~
~
~
~
1
I I
I
_1
32
I n t e g r a t i o n - As it Is FUNCTION
A--B---" C---D-'E---F--G---Driver 1 Driver 2 Driver 3 Fig.
Fig°
Io
11
33
HIPO Hierarchy
Input
CVT
Process
Fig. 12
Draw the Graph
Nodes 1. "OPI" is fixed binary 2. "OP2" is fixed decimal 3. "OPERATOR" is + 4. "OPERATOR" is 5. "OPI' is invllid 6. "OP2" is invalid 7. e x p m l l i o n is valid 8. OPERATOR is valid 9. OPERATOR is invalid
Fig. 13
Outpu,
34
The Test Case Library Design lnvocable causes 1 2 3 4
T E S T S
I 2 3 4 5
| | ! 1 | 61
,I
,
,
Observable effects I
SSSS ~lSI I l l S l l S S i S i S S l i IS I lrllllll I 1234
i l 12 g3 t4 15 16
5 I6 7 9 T E S T S
1 2 3 4 5 6
I ! i I I i
P PAP AAPA AAPA AAA P APAA PAAA
1 2 3 4 5 6
5679 | = Involved S = Suppressed Fig.
P = Present A = Absent
14
Integration and Build New Code ~rror ;orrection
Integration Source Code
I Fig.
15
T
i
System Build
I I
Test
eQ ee e
J.
J.l~
i
i
i ~iii i~i¸
36
Phase 1I Walkthru * Probe spec for mistak.es and omissions
Probb
uj
• Record problems Problem I
* Resolve problems
Fig. 18
Find Programming Errors Early
Cost errors
Fig. 19
Walkthru
Unit Test
37
Results Average APARs Against Base Declining
OS
Valid APARs
Per I n s t ~
18 Fig.
19
20
21
Release
2o
Results New Code is Higher Quality TSO Scheduler Valid APARs
(.o..dzed)
Base Plus Fig.
2~
C~,,ged
New
Organizing
for Structured Programming
F. T. Baker;
IBM Federal Systems Division,
Maryland,
Gaithersburg,
USA
ABSTRACT A new type of programming methodology,
built around structured
programming ideas, has been gaining widespread acceptance production programming.
for
This paper discusses how this method-
ology has been introduced into a large production programming organization.
Finally it analyzes the advantages and disad-
vantages of each component of the methodology and recommends ways it can be introduced in a conventional programming environment.
INTRODUCTION At this point in timer the ideas of structured programming have gained widespread acceptance,
not only in academic
circles, but also in organizations doing production programming.
An issue [1] of Datamation,
one of the leading business
data processing oriented magazines in the U.S., several articles on the topic.
featured
The meetings of SHARE and
GUIDE, two prominent computer user groups, have had an increasing number of sessions on subjects related to structured programming.
The IBM Systems Science Institutes are offering
courses and holding seminars,
and several books on the topic
are in print° What is perhaps not so widely appreciated, the organizations,
however,
is that
procedures and tools associated with the
implementation of structured programming are critical to its success.
This is particularly true in production programming
environments, grams)
where program systems
are developed,
reliable, maintainable
(rather than single pro-
people come and go, and the attainment of software on time and within cost esti-
mates is a prime management objective.
In this environment,
89
module level coding and debugging activities typically account for about 20~ of the effort spent on software development [2] . Thus, narrow applications of structured programming ideas limited only to these activities have correspondingly It is therefore desirable to adopt a broad,
limited effects.
integrated approach
incorporating the ideas into every aspect of the project from concept development to program maintenance to achieve as many quality improvements and cost savings as possible.
BACKGROUND
The IBM Federal Systems Division
(FSD) is an organization in-
volved in production programming on a large scale.
Although
much of its software work is performed for federal,
state and
local governmental agencies,
the division also contracts with
private business enterprises for complex systems development work.
Work scope ranges from less than a man-year of effort
on small projects to thousands of man-years spent on the development and maintenance of large, evolutionary,
long-term
systems such as the Apollo/Skylab ground support software. Varying customer requirements cause the use of a wide variety of hardware,
programmihg languages,
software tools, documenta-
tion procedures, management techniques,
etc.
Problems range
from software maintenance through pure applications programming using commercially available operating systems and program products to the concurrent development of central processors, ~eripherals,
firmware,
support software and applications soft-
~are for avionics requirements.
Thus, within this single or-
ganization can be found a wide range of software development efforts.
FSD has always been concerned with the development of improved software tools, techniques and management methods.
Most re-
cently, FSD has been active in the development of structured programming techniques
[3]
This has led to organizations,
procedures and tools for applying them to production programming projects,
particularly with a new organization called a
Chief Programmer Team. [4]
The Team, a functional organization
40
based on standard support tools and disciplined application of structured programming principles,
had its first trial on
a major software development effort in 1969-71. [5],[6]
In the
three years since the completion of that experimental project, FSD has been incorporating structured programming techniques into most of its software development projects. scope and diversity of these projects,
Because of the
it was impossible to
adopt any single set of tools and procedures or any rigid type of organization to all or even to a majority of them. cause of the ongoing nature of many of these systems,
And beit was
necessary to introduce these techniques gradually over a period of many years.
The approach which was adopted,
the problems
which have been encountered and the results which were achieved, are the subject of this paper.
It is believed that any software
development organization can improve the quality and reduce the costs of its software projects in a similar way.
PLAN
To introduce the ideas into FSD work practices and to evaluate their use, a plan with four major components was implemented. First, a set of guidelines was established to define the terminology associated with the ideas with sufficient precision to permit the introduction and measurement of individual components of the overall methodology. and directives Second,
These guidelines were published,
regarding their implementation were issued.
support tools and methodologies were developed,
par-
ticularly for projects using commercial hardware and operating systems.
For those projects where these were not employed,
standards based on the developed tools enabled them to provide their own support.
Third, documentation of the techniques and
tools, and education in their user were both carried out.
These
were done on a broad scale covering management techniques,
pro-
41
gramming methodologies and clerical procedures.
Fourth,
a
measurement program was established to provide data for technology evaluation and improvement.
This program included both
broad measurements which were introduced immediately,
and de-
tailed measurements which required substantial development work and were introduced later.
The next four sections cover
the components of this plan and their implementation in detail.
GUIDELINES
A number of important considerations
influenced the establish-
ment of a set of guidelines for the application of structured programming technology within FSD.
First and most important,
they had to permit adaptation to the wide variety of project environments described above.
This required that they be use-
ful in program maintenance situations where unstructured program systems were already in being,
as well as in those where com-
pletely new systems were to be developed.
Second, they had to
allow for the range of processors and operating systems in use. This necessitated the description of functions to be provided instead of specific tools to be used.
Third, they had to allow
for differences in organizations and methodology fications,
documentation,
(e.g., speci-
configuration management)
required
or in use on various projects.
The guidelines resulting from these considerations are a hierarchical set of four components, in Figure I.
graphically illustrated
Use of the component at any level presupposes use
of those below it.
Thus, by beginning at a level which a
project's environment and status permit,
and then progressing
upward, projects can evolve gradually toward full use of the technology.
42
1.
Development
The introductory
Support Libraries level is the Development
Support Library,
which is a tool designed with two key principles a°
Keep current project status organized all times.
in mind:
and visible at
In this way, any programmer,
manager or
user can find out the status or study an approach directly without depending b.
Make it possible
on anyone else.
for a trained secretary to do as
much library maintenance clerical
from intellectual
A DSL is normally Librarian~
as possible, activity.
the primary responsibility
Programmers
thus separating
of a Programming
interface with the computer primarily
through the library and the Programming
Librarian.
This allows
better control of computer activity and ensures that the library is always complete with up-to-date standings
and current.
versions
of programs
and inconsistencies
and modification
Programmers
are always working
and data,
so that misunder-
are greatly reduced.
A version
level are associated with all material
in the
library to permit change control and assist in configuration nanagement.
In general,
in increasing
the library system is the prime factor
the visibility
reducing risk and increasing The guidelines
reliability.
provide that a Development
being used if the following
ao
of a developing project and thus
conditions
A library system providing the PPL
(see TOOLS below)
Support Library is
prevail:
the functional is being used.
equivalent of
43
b.
The library system is being used throughout the development process, ject code,
c.
not just to store debugged source or ob-
for example.
Visibility of the current status of the entire project, as well as past history of source code activities and run executions,
d.
is provided by the external library.
Filing procedures are faithfully adhered to for all runs, whether or not setup was performed by a librarian.
e.
The visibility of the code is such that the code itself serves as the prime reference for questions of data formats, program operation,
etc.
f.
Use of a trained librarian is recommended.
2.
Structured Pro~rammin~
In order to provide for use of structured programming techniques on maintenance as well as development projects, necessary to depart from t h e c l a s s i c a l and adopt a narrower definition.
In FSD, then, we distinguish
between those practices used in system development Development)
it was
use of the terminology (Top-Down
and those used in coding individual program modules
(Structured Programming).
Our use of the term "structured pro-
gramming" in the guidelines thus refers primarily to coding standards governing control flow, and module organization and construction.
They require three basic control flow figures and
permit two optional ones, as shown in Figure 2. a Guide [7]
They refer to
(see DOCUMENTATION below) which contains general
information and standards for structured programming,
as well as
44
d e t a i l e d standards
for use of various p r o g r a m m i n g
languages,
They also require that code be r e v i e w e d by someone other than the developer~
The d e t a i l e d g u i d e l i n e s for s t r u c t u r e d pro-
g r a m m i n g are as follows:
a.
The c o n v e n t i o n s e s t a b l i s h e d in the S t r u c t u r e d P r o g r a m m i n g Guide are being followed. documented.
E x c e p t i o n s to c o n v e n t i o n s are
If a language is being used for w h i c h con-
v e n t i o n s have not been p u b l i s h e d in the Guide,
then use
of a locally g e n e r a t e d set of conventions c o n s i s t e n t w i t h the rules of s t r u c t u r e d p r o g r a m m i n g is acceptable.
b.
The code is being r e v i e w e d for functional i n t e g r i t y and for a d h e r e n c e to the s t r u c t u r e d p r o g r a m m i n g conventions.
c.
A Development
Support L i b r a r y is being used.
3.
Top-Down Development
T o p - d o w n d e v e l o p m e n t refers to the process of c o n c u r r e n t design and d e v e l o p m e n t of p r o g r a m systems c o n t a i n i n g more than a single c o m p i l a b l e unit. which minimizes
It r e q u i r e s d e v e l o p m e n t to proceed in a way interface p r o b l e m s n o r m a l l y e n c o u n t e r e d during
the i n t e g r a t i o n p r o c e s s typical of "bottom-up development" by i n t e g r a t i n g and testing m o d u l e s as soon as they are developed. Other a d v a n t a g e s are that:
a.
It permits a p r o j e c t to man up more g r a d u a l l y and should reduce the total m a n p o w e r required.
bo
C o m p u t e r time r e q u i r e m e n t s tend to be spread more evenly over the d e v e l o p m e n t period.
45
c.
The user gets to work with major portions of the system much earlier and can identify gross errors before acceptance testing.
d.
Most of the system has been in use long enough by the time it is delivered that both the user and the developer have confidence in its reliability.
e.
The really critical interfaces between control and function code are the first ones to be coded and tested and are in operation the longest.
The term "top-down" may be somewhat misleading if taken too literally.
What top-down development really implies in every-
day production programming is that one builds the system in a way which ideally eliminates
(or more practically,
minimizes)
writing any code whose testing is dependent on other code not yet written,
or on data which is not yet available.
This re-
quires careful planning of the development sequence for a large system consisting of many programs and data sets, since some programs will have to be partially completed before other programs can be begun.
In practice,
it also recognizes that exi-
gencies of customer requirements or schedule may force deviations from what would otherwise be an ideal development sequence. The guidelines for top-down development are as follows:
a.
Code currently being developed depends only on code already operational,
except in those portions where devi-
ations from this procedure are justified by special circumstances.
46
b.
The project schedule reflects a continuing integrationr as part of the development process,
leading directly to
system test, as opposed to a development, integration,
followed by
followed by system test, cycle.
c.
Structured Programming
is being used.
d.
A Development Support Library system is being used. (While ongoing projects may not be able to meet this criterion,
an implementation of structured coding practice
is acceptable
e.
in these cases.)
The managers of the effort have attended a structured programming orientation course
4.
(see EDUCATION below).
Chief Progrgmmer Teams
A Chief Programmer Team
(CPT) is a functional programming or-
ganization built around a nucleus of three experienced professionals doing well-defined parts of the programming development process using the techniques and tools described above. It is an organization uniquely oriented around them and is a logical outgrowth of their introduction and use. detail in
[4], [5], and
Described in
[6], it has been used extensively in
FSD on projects ranging up to approximately i00,000 lines of source code and is being experimented with on larger projects. The guidelines
as
for CPT's are as follows:
A single person;
the chief programmer,
technical responsibility
has complete
for the effort.
He will
ordinarily be the manager of the other people.
47
b.
There is a backup programmer prepared to assume the role of chief programmer.
c.
Top-D~wn Development, velopment
d.
Structured Programming
and a De-
Support Library are all being used.
Top level code segments and the critical control paths of lower level segments
are being coded by the chief and
backup programmers.
e.
The chief and backup programmers
are reviewing the code
produced by other members of the team. f.
Other programmers
are added to the team only to code
specific well defined functions within a framework established by the chief and backup programmers.
TOOLS
Tools are necessary
in order to permit effective
of, and achieve m a x i m u m benefits programming.
Development
implementation
from the ideas of structured
Support Libraries,
introduced above,
are a recognized and required component of the methodology ployed in FSD.
em-
Standards are necessary to ensure a consistent
approach and to help realize benefits of improved project communications effective
and manageability.
Procedures
are required for
use of the tools and to permit functional breakup and
improved overall efficiency ly, other techniques ment can be helpful
in the programming
of design,
programming,
in a structured programming
well as in a conventional
one.
process.
Final-
testing and manageenvironment
as
48
1.
Development Su~ort
Libraries
The n e e d for and value of D e v e l o p m e n t Support L i b r a r y support,
(DSL)
both as a n e c e s s i t y for s t r u c t u r e d p r o g r a m m i n g and
as a vehicle
for p r o j e c t c o m m u n i c a t i o n and control, has been
t h o r o u g h l y covered in
[4],
[5],
[6],
[7], and
[8].
Early work
on DSL's c e n t e r e d on the p r o v i s i o n of libraries for p r o j e c t s using IBM's S y s t e m / 3 6 0 O p e r a t i n g System and Disk O p e r a t i n g System.
The OS/360 P r o g r a m m i n g P r o d u c t i o n L i b r a r y
(PPL) is
typical of those we are using in b a t c h p r o g r a m m i n g d e v e l o p m e n t situations. external
It consists of internal
(human-readable)
procedures.
libraries,
(computer-readable)
and
and office and m a c h i n e
Similar concepts and approaches apply to our other
library systems,
i n c l u d i n g those w o r k i n g in an online environ-
ment.
The PPL keeps all m a c h i n e a b l e data on a p r o j e c t - source code, object code, test data,
linkage e d i t o r language,
job control language~
and so on - in a series of data sets w h i c h c o m p r i s e
the internal l i b r a r ~
(see Figure 3).
Since all data is kept
i n t e r n a l l y and is fully b a c k e d up, there is no need for programmers to g e n e r a t e or m a i n t a i n their own p e r s o n a l copies.
Cor-
r e s p o n d i n g to each type of data in the internal library there is a set of c u r r e n t status binders w h i c h c o m p r i s e the external librar[
(see F i g u r e
~).
These are filed c e n t r a l l y and used by
all as a s t a n d a r d m e a n s of communication.
There is also a set
of a r c h i v e s of s u p e r s e d e d status pages w h i c h are r e t a i n e d to assist in d i s a s t e r recovery, run results.
Together,
and a set of run books c o n t a i n i n g
these record the a c t i v i t i e s - current
and h i s t o r i c a l - of an entire p r o j e c t and keep it c o m p l e t e l y organized°
49
The m a c h i n e p r o c e d u r e s , as the name implies,
are c a t a l o g e d
p r o c e d u r e s w h i c h p e r f o r m internal library maintenance, up, e x p a n s i o n and so on.
back-
Most of them are used by Program-
ming L i b r a r i a n s by means of simple control cards they have been trained to prepare.
A c o m p l e t e list is given in Table i.
The 0ffice p r o c e d u r e s are a set of "clerical algorithms" used by the P r o g r a m m i n g L i b r a r i a n to invoke the m a c h i n e procedures, to prepare the input and file the output.
Once new code has
been created and placed in the library initially,
a programmer
makes c o r r e c t i o n s to it by m a r k i n g up pages in the external library and giving them to the P r o g r a m m i n g L i b r a r i a n to make up control and data cards to cause the c o r r e s p o n d i n g changes or additions to be made to the internal library.
As a result,
clerical effort and w a s t e d time on the part of the p r o g r a m m e r s are s i g n i f i c a n t l y reduced.
Figure 5 shows the w o r k flow and the
central role of the P r o g r a m m i n g L i b r a r i a n in the process.
Be-
cause p r o g r a m m e r s are served by the L i b r a r i a n and the PPL, they are freed from i n t e r r u p t i o n s and can work on more routines in parallel than they p r e v i o u s l y did. ~ The PPL m a c h i n e and office p r o c e d u r e s are d o c u m e n t e d for programmers librarians in
in
[7] and for
[9].
S u b s e q u e n t work on DSL's in FSD has e x t e n d e d the support to some of the n o n - S y s t e m / 3 6 0
equipment in use and also introduced
interactive DSL's for use both by librarians and programmers. Furthermore,
a study of general requirements
for DSL's has
been p e r f o r m e d under c o n t r a c t to the U. S. Air Force and has been p u b l i s h e d in
[i0].
DSL's are now available for and in
use on m o s t p r o g r a m m i n g projects in FSD.
50
2.
Standards
To support s t r u c t u r e d p r o g r a m m i n g in the v a r i o u s languages used, p r o g r a m m i n g standards were required.
These covered
both the i m p l e m e n t a t i o n of the control flow figures in each language as w e l l as the c o n v e n t i o n s
for f o r m a t t i n g and in-
d e n t i n g p r o g r a m s in that language.
There are four a p p r o a c h e s w h i c h can be taken to provide the basic and o p t i o n a l control flow figures in a p r o g r a m m i n g language,
a~
and each was used in certain situations
The figures may be d i r e c t l y a v a i l a b l e as s t a t e m e n t s in the language.
In the case of PL/I, all of the basic
figures were of this variety. (with slight restrictions) PERFORM)
b.
in FSD.
In COBOL,
the IFTHENELSE
and the D O U N T I L
(as a
were present'.
The figures may be e a s i l y s i m u l a t e d using a few s t a n d a r d statements.
The CASE s t a t e m e n t may be readily s i m u l a t e d
in PL/I using an indexed GOTO and a LABEL array, w i t h e a c h case i m p l e m e n t e d via a D O - g r o u p e n d i n g in a GOTO to a c o m m o n null s t a t e m e n t f o l l o w i n g all cases.
c.
A s t a n d a r d p r e - p r o c e s s o r may be used to augment the basic language s t a t e m e n t s to p r o v i d e n e c e s s a r y features.
The
m a c r o a s s e m b l e r has been used in FSD to add s t r u c t u r i n g features to System/360,
System/370
and System/7 A s s e m b l e r
Languages.
d.
A special p r e - p r o c e s s o r may be w r i t t e n to compile augmented language statements
into standard ones, w h i c h may
then be p r o c e s s e d by the normal computer.
This was done
51
for the FORTRAN language, which directly contains almost none of the needed features.
The result of using these four approaches was a complete set of figures for PL/I, COBOL, FORTRAN and Assembler.
Using these
as a base, similar work was also done for several specialpurpose languages used in FSD. To assist in making programs readable and in standardizing communications and librarian procedures,
it was desirable that
programs in a given language should be organized, indented in the same way.
Link Editor Languages as well as of the procedural mentioned above.)
formatted and
(This was true of the Job Control and languages
Coding conventions were developed for each
covering the permitted control structures, naming, use of comments,
segment formatting,
labels, and indentation and formatting
for all control flow and special
(e.g., OPEN, CLOSE, DECLARE)
statements. 3.
Procedures
An essential aspect of the use of DSL's is the standardization of the procedures associated with them.
The machine procedures
used in setting up, maintaining and terminating the libraries were mentioned above in that connection.
However,
the office
procedures used by librarians in preparing runs, executing them and filing the results are also quite extensive.
These were
developed and documented [9] in a form readily usable by nonprogramming oriented librarians. ~.
Other
While the above constitute the bulk of the work originated by FSD, certain other techniques and procedures have been assimi-
52
lated into the methodology in varying degrees. management techniques,
These include
HIPO diagrams and structured walkthroughs.
FSD has been a leader in the development of management techniques for programming projects.
A book [II] resulting from
a management course and guide used in FSD has become a classic in the field.
As top-down development and structured program-
ming came into use, it became apparent that traditional management practices would have to be substantially revised IMPLEMENTATION EXPERIENCE below).
(see
An initial examination was
done, and a report [12] was issued which has been very valuable in guiding managers
into using the new methodology.
This
material is now being added to a revised edition of the FSC Programming Project Management Guide,
from which the book [II]
mentioned above was drawn. A documentation technique called HIPO Process-Output)
diagrams
[8,13]
(Hierarchy plus Input-
developed elsewhere in IBM has
proved valuable in supporting top-down development.
HIPO con-
sists of a set of operational diagrams which graphically describe the functions of a program system from the general to the detail level.
Not to be confused with flowcharts, which de-
scribe procedural
flow, HIPO diagrams provide a convenient means
of documenting the functions identified in the design phase of a top-down development effort.
They also serve as a useful intro-
duction to the code contained in a DSL and as a valuable maintenance tool following delivery of the program system. Structured walk-throughs [8] were developed on the second CPT project as a formal means for design and code reviews during the development process.
Using HIPO diagrams and eventually the
53
code itself, reviewers. grammer
the d e v e l o p e r
"walks through" his efforts for the
These latter may consist of the Chief or Backup Pro-
(or lead p r o g r a m m e r if a CPT is not being employed),
other p r o g r a m m e r s and a r e p r e s e n t a t i v e formally test the programs. detection,
not correction,
from the group w h i c h will
Emphasis is on error avoidance and and the attitude is open and non-
d e f e n s i v e on the part of all participants be tomorrow's reviewee).
(today's reviewer will
The reviewers prepare for the walk-
through by studying the diagrams or code before the meeting, and followup is the r e s p o n s i b i l i t y of the reviewee, who m u s t notify the reviewers of corrective actions taken.
D O C U M E N T A T I O N AND E D U C A T I O N
Once the fundamental tools and guidelines were established, was n e c e s s a r y to begin d i s s e m i n a t i n g them throughout FSD.
it Much
e x p e r i m e n t a l work had already been done in d e v e l o p i n g the tools and guidelines themselves,
so that a cadre of people familiar
w i t h them was already in being.
Most of the d o c u m e n t a t i o n has been referred to above.
The pri-
mary reference for p r o g r a m m e r s was the FSC S t r u c t u r e d Pro@ramm i n @ Guide [7]
In addition to the standards for each language
and for use of the PPL, it contained general i n f o r m a t i o n on the use of top-down d e v e l o p m e n t and structured programming, as the p r o c e d u r e s
as well
for m a k i n g exceptions to them when necessary.
It also contained provisions
for sections to be added locally
when s p e c i a l - p u r p o s e languages or libraries were in use. tributed t h r o u g h o u t FSD,
Dis-
the Guide has been updated and is still
the standard reference for programmers.
The FSC P r o ~ r a m m i n ~ Li-
b r a r i a n ' s Guide [9] serves a similar purpose for librarians and also has provisions for local sections where necessary.
While
54
the use of the macros
for System/360 A s s e m b l e r L a n g u a g e was in-
cluded in the Pr_ogrammin~ Guide, was a v a i l a b l e on them if desired. m e n t a t i o n in the form of
[ii] and
a d d i t i o n a l d o c u m e n t a t i o n [14] Finally,
m a n a g e m e n t docu-
[12] was also available.
It was r e c o g n i z e d that p r o v i d i n g d o c u m e n t a t i o n alone was not s u f f i c i e n t to p e r m i t m o s t p e r s o n n e l to b e g i n a p p l y i n g the techniqueso
S t r u c t u r e d p r o g r a m m i n g r e q u i r e s s u b s t a n t i a l changes
in the p a t t e r n s and p r o c e d u r e s of programming,
and a s i g n i f i c a n t
m e n t a l e f f o r t and amount of p r a c t i c e is n e e d e d to o v e r c o m e old habits and instill new ones. m a j o r language)
A series of courses
in s t r u c t u r e d p r o g r a m m i n g and DSL techniques. five hours, portantly,
L a s t i n g twenty-
these courses p r o v i d e d i n s t r u c t i o n and, m o r e imp r a c t i c e p r o b l e m s w h i c h forced the p r o g r a m m e r s to
begin the t r a n s i t i o n process. retrained,
(one for each
was set up to train e x p e r i e n c e d FSD p r o g r a m m e r s
Once all p r o g r a m m e r s had been
these courses w e r e discontinued,
and s t r u c t u r e d pro-
g r a m m i n g is now i n c l u d e d as p a r t of the basic p r o g r a m m e r training courses given to newly hired personnel.
The same s i t u a t i o n held true for m a n a g e r s as well as programmers.
Because FSD w i s h e d to apply the m e t h o d o l o g y as rapidly as
possiblea
it was d e s i r a b l e to a c q u a i n t m a n a g e r s w i t h it and its
p o t e n t i a l immediately.
Thus, one of the first actions taken
was to give a h a l f - d a y o r i e n t a t i o n course to all FSD managers. This p e r m i t t e d them to e v a l u a t e the depth to w h i c h they could begin to use it on current projects, its use on p r o p o s e d projects.
and to begin to plan for
This was then followed up by a
t w e l v e - h o u r course for e x p e r i e n c e d p r o g r a m m i n g managers,
ac-
q u a i n t i n g t h e m w i t h m a n a g e m e n t and control t e c h n i q u e s p e c u l i a r to t o p - d o w n d e v e l o p m e n t and s t r u c t u r e d programming.
(It was ex-
55
pected that most of these managers would also attend one of the structured programming courses described above to acquire the fundamentals.)
Again, now that most programming managers
have received this form of update, the material has now been included in the normal programming management co~rse given to all new programming managers.
MEASUREMENT
One of the problems of the production programming world is that it has not developed good measures of its activities.
Various
past efforts, most notably the System Development Corporation studies [15] have attempted to develop measurement and prediction techniques for production programming projects.
The
general results have been that a number of variables must be accurately estimated to yield even rough cost and schedule predictions,
and that the biggest factors are the experience and
abilities of the programmers involved.
Nevertheless,
it was
felt in FSD that some measures of activity were needed, not so much for prediction as for evaluation of the degree to which the methodology was being applied and the problems which were experienced in its use.
To these ends, two types of measure-
ments were put into effect.
The first type of measurement,
implemented immediately, was a
monthly report required from each programming project. programming manager was required to state:
I.
The total number of programmers on the project.
2.
The number currently programming.
3.
The number using structured programming.
Each
56
4.
The number of p r o g r a m m i n g groups on the project.
5.
The number of CPT~s.
6.
W h e t h e r a DSL was in use.
7.
W h e t h e r top-down d e v e l o p m e n t was in use.
These figures w e r e s - ~ m a r i z e d m o n t h l y for various
levels of
FSD m a n a g e m e n t and were a v a l u a b l e tool in e n s u r i n g that the m e t h o d o l o g y was indeed being introduced.
The second type of m e a s u r e m e n t was a m u c h more c o m p r e h e n s i v e one.
It r e q u i r e d a great deal of r e s e a r c h in its preparation,
and e v e n t u a l l y took the form of a q u e s t i o n n a i r e from w h i c h data was e x t r a c t e d to b u i l d a m e a s u r e m e n t data base. naire contains
The q u e s t i o n -
105 q u e s t i o n s o r g a n i z e d into the f o l l o w i n g eight
sections:
1.
I d e n t i f i c a t i o n of the project.
2.
D e s c r i p t i o n of the c o n t r a c t u a l environment.
3.
D e s c r i p t i o n of the p e r s o n n e l environment.
4.
D e s c r i p t i o n of the p e r s o n n e l themselves.
5.
D e s c r i p t i o n of the technical environment.
6.
D e f i n i t i o n of the size, type and q u a l i t y of the p r o g r a m s produced.
7.
I t e m i z a t i o n of the financial~
c o m p u t e r and m a n p o w e r re-
sources used in their development.
57
8.
Definition of the schedule.
The questionnaire
is administered at four points during the
lifetime of every project. ginning,
The first point is at the be-
in which all questions are answered with estimates.
The next administration is at the end of the design phase, when the initial estimates are updated as necessary.
It is
again filled out halfway through development, when actual figures begin to be known.
And it is completed for the last
time after the system has been tested and delivered, results are in.
and all
The four points provide for meaningful com-
parisons of estimates to actuals,
and allow subsequent projects
to draw useful guidance for their own planning.
The data base
permits reports to be prepared automatically and statistical comparisons to be made.
IMPLEMENTATION EXPERIENCE
Each of the four components of the methodology which FSD has introduced has resulted in substantial benefits.
However,
experience has also revealed that their application is neither trivial nor trouble-free.
This section presents a qualitative
analysis of the experience to date, describing both the advantages and the problems.
i.
Development Support Libraries
Most projects of any size have historically gravitated toward use of a program library system of some type.
This was cer-
58
tainly true in FSD~ w h i c h had some h i g h l y d e v e l o p e d systems already in place w h e n the m e t h o d o l o g y was introduced. p r i m a r i l y used as m e c h a n i s m s to control the code,
These w e r e
so that dif-
fering v e r s i o n s of c o m p l e x systems could be segregated.
In
some cases they p r o v i d e d p r o g r a m d e v e l o p m e n t services such as compilation,
t e s t i n g and so forth.
However, none were being
used p r i m a r i l y to achieve the goals of i m p r o v e d c o m m u n i c a t i o n s or w o r k f u n c t i o n a l i z a t i o n w h i c h are the p r i m a r y b e n e f i t s of a DSL.
In fact, the general a t t i t u d e toward the services they
p r o v i d e d was that they were there to be used w h e n and if the p r o g r a m m e r s wished.
M o s t code in them was p r e s u m e d private,
with
the usual e x c e p t i o n s of m a c r o and s u b r o u t i n e libraries.
One of the m o s t d i f f i c u l t p r o b l e m s
in the i n t r o d u c t i o n of the
DSL a p p r o a c h was to c o n v i n c e o n g o i n g p r o j e c t s that their p r e s e n t library systems f u l f i l l e d n e i t h e r the r e q u i r e m e n t s nor the intents of a DSL.
A DSL is as m u c h a m a n a g e m e n t tool as a pro-
grammer convenience.
A P r o g r a m m i n g L i b r a r i a n ' s p r i m a r y respon-
s i b i l i t y is to management,
in the sense of s u p p o r t i n g control
of the p r o j e c t ' s assets of code and data -- a n a l o g o u s to a controller's r e s p o n s i b i l i t y to m a n a g e m e n t of s u p p o r t i n g control of financial assets.
The p r o j e c t as a w h o l e should be e n t i r e l y
d e p e n d e n t on the DSL for its operation, other criterion,
and this, more than any
is the d e t e r m i n i n g factor in w h e t h e r a library
s y s t e m m e e t s the g u i d e l i n e s as a DSL.
When all functions are provided,
and a p r o j e c t implements a DSL,
then a h i g h degree of v i s i b i l i t y is available.
P r o g r a m m e r s use
the source code as a b a s i c means of c o m m u n i c a t i o n and rely on it to answer q u e s t i o n s on i n t e r f a c e s or suggest approaches to their problemso
M a n a g e r s use the code itself
(or the summary
59
features of more sophisticated progress
of the work.
basis.
even at an early
basis.
in itself is valuable,
even on a laissez-faire
But when it is coupled with w e l l - m a n a g e d above,
that the specifications
the planned test coverage, and last but not least,
ensure
reviews the code,
have been addressed,
assisting
in standards
constructively
While the review procedure concomitant
The walk-
or equivalent procedures,
that someone in addition to the developer verifying
code-reading
it also provides quality improvements.
throughs described
of the
of beginning to use the de-
system on an experimental
procedures,
the
from the ready availability
test data and the feasibility
The visibility
to determine
Users also benefit,
stage of implementation, veloping
DSL's)
criticizing
checking
compliance the content.
is obviously greatly facilitated by
use of structured programming,
it is possible with-
out it and was included with the DSL guidelines
to encourage
its adoption.
The archives which are an integral part of a DSL provide an ability to refer to earlier versions of a routine - sometimes useful in tracing intent when a program is passed from hand to hand.
More importantly,
cover from a disaster stroyed.
they give a project the ability to re-
in which part of its resources
(It is perhaps obvious but worth m e n t i o n i n g
will not be complete
from the working versions.)
sees
separate
There was an initial tendency
and to over-formalize
It appears unnecessary object code,
that this
insurance unless project management
to it that the backup data sets are stored physically FSD to over-collect
are de-
in
the archiving process.
to retain more than a few generations
run results and so forth.
of
The source code and test
60
data g e n e r a l l y w a r r a n t longer retentionl
but even here it
rapidly b e c o m e s i m p r a c t i c a l to save all versions. s u f f i c i e n t archives
In general,
should be r e t a i n e d to provide c o m p l e t e
r e c o v e r y c a p a b i l i t y w h e n used in c o n j u n c t i o n w i t h the b a c k u p data sets, plus enough a d d i t i o n a l to provide back references.
The s e p a r a t i o n of f u n c t i o n i n t r o d u c e d by the DSL office procedures has two main benefits.
The obvious one is of lowered
cost t h r o u g h the use of c l e r i c a l p e r s o n n e l instead of p r o g r a m mers for p r o g r a m maintenance,
run setup and filing activities.
A s i g n i f i c a n t a d d i t i o n a l b e n e f i t comes about t h r o u g h the resulting m o r e c o n c e n t r a t e d use of programmers. terruptions~
By r e d u c i n g in-
librarians afford the p r o g r a m m e r s a w o r k environ-
m e n t in w h i c h errors are less likely to occur.
Furthermore,
they p e r m i t p r o g r a m m e r s to work on more routines in p a r a l l e l than t y p i c a l l y is the case.
The last major b e n e f i t d e r i v e d from a DSL rests in its support of a p r o g r a m m i n g m e a s u r e m e n t activity.
By a u t o m a t i c a l l y col-
lecting s t a t i s t i c s of the types d e s c r i b e d above,
they can en-
h a n c e our ability to m a n a g e and improve the p r o g r a m m i n g process. The e a r l y DSL's in FSD did not include m e a s u r e m e n t features, and the next g e n e r a t i o n is only b e g i n n i n g to come into use, so a full a s s e s s m e n t of this support is not yet possible.
It was d i f f i c u l t to c o n v i n c e FSD p r o j e c t s in some cases that a w e l l - q u a l i f i e d P r o g r a m m i n g L i b r a r i a n could b e n e f i t a p r o j e c t as m u c h as a n o t h e r programmer.
In fact, there was an initial
t e n d e n c y to use junior p r o g r a m m e r s or p r o g r a m m e r t e c h n i c i a n s to provide librarian support.
This had two d i s a d v a n t a g e s
is not recommended.
the use of p r o g r a m m i n g - q u a l i f i e d
First,
and hence
61
personnel is not necessary because of the well-defined procedures inherent in the DSL's. dividuals
Use of overqualified in-
in some cases led to boredom and sloppy work with
a resulting loss of quality.
Second,
such personnel cannot
perform other necessary functions when needed.
One of the
advantages of using secretaries as librarians is that they can use both skills effectively over the lifetime of a typical project.
During design and documentation phases,
they can
provide typing and transcription services; while during coding and testing phases,
they can perform the needed librarian
work.
Two problems remain in defining completely the role of librarians.
First,
the increasing use of interactive systems
for program development is forcing an evolution of librarian skills toward terminal operation and test support rather than coding of changes and extensive filing.
The most effective
division of labor between programmer and librarian in such an environment remains to be determined.
It also appears
possible to use librarians to assist in documentation, as in preparation of HIPO diagrams.
such
Second, FSD has a number
of small projects in locations remote from the major office complexes and support facilities - frequently on customer premises.
Here it is not always possible to use a librarian
cost-effectively.
In this situation,
the programmer/librarian
better definition of
relationship in the interactive system
development environment may permit development and librarian support to some extent from the central facility instead of requiring all personnel to be on-site.
62
2.
Structured
Pro~rammin@
Recall that the FSD use of the term "structured programming" is a narrow one, adopted to permit ongoing projects to use some of the methodology.
In this usage,
might be called "structured techniques
coding"
used in developing
a single compilable
Combined with usage of a DSL, of code, enforces modularity and maintainability, manageability properties on here.
programming
and thus encourages testing,
and accountability.
An additional,
(module).
changeability
and permits improved
These are all well-known
programming unplanned
and need not be elaborated
for, benefit of structured
is that it tends to encourage
~11ocality of reference",
unit
it provides enhanced readability
simplifies
of structured
it is more like what
and is limited to those
the property of
which improves performance
in a virtual
systems environment. Reflecting
on the advantages
ming and the use of DSL's, techniques
fundamentally
gramming discipline. individualistic, cipline
attributed
program-
are directed toward encouraging
Historically,
undisciplined
programming
activity.
Thus,
introducing
in-
plus those due to better
and control.
The introduction achieved
itself,
dis-
recog-
yields double rewards -- the advantages
herent in the methodology standardization
pro-
has been a very
in the form of practices which most programmers
nize as beneficial,
in FSD.
of structured programming
was not easily
The broad variety of projects,
support has already been mentioned, DSL's,
to structured
one is struck by the fact that the
languages
and the development
and
of
the Guides [7'9] and the education program were necessary
before widespread
application
of the methodology
could take
63
place.
Furthermore,
the ongoing nature of many of the systems
meant that structured programming could take place only as modules were rewritten or replaced. This gradual introduction created a problem of educational timing.
Practically,
it was most expedient to have programmers
attend the education courses between assignments.
The nature
of the courses was such that they introduced the techniques and provided some initial practice.
Yet they required sub-
stantial work experience using the techniques to be fully effective.
Structured programming requires the development of a
whole new set of personal patterns in programming.
Until old
habits are unlearned and replaced by new ones, it is difficult for programmers to fully appreciate the advantages of structured programming.
For best results, this work experience and
the overcoming of the natural reluctance to change habits should follow the training immediately.
This was not always feasible
and resulted in some loss of educational effectiveness. A second problem arose because of the real-time nature of a significant fraction of FSD's programming business.
Here the
difficulty was one of demonstrating that structured programming was not detrimental to either execution speed or core utilization.
While it is difficult to verify the advantages quantita-
tively, a working consensus has arisen.
Simply stated, it is
that the added time and thought required to structure a program pay off in better core utilization and improved efficiency which generally are comparable to the effects achieved in unstructured programs by closer attention to detail. note that even in "critical" programs,
It is also useful to a relatively small frac-
tion of the code is really time- or core-sensitive,
and this
64
fraction may not in fact be predictable a priori.
Hence it is
probably a better strategy to use structured programming throughout to begin with.
Then, if performance bottlenecks do appear
and cannot be resolved otherwise,
at most small units of
code must be hand-tailored to remedy the problems. way the visibility,
In this
manageability and maintainability
ad-
vantages of structured programming are largely retained. Perhaps the most difficult problem to overcome in applying structured programming is the purist syndrome,
in which the
goal is to write perfectly structured code in every situation. it must be emphasized that structured programming is not an end in itself, but is a means to achieving better, more reliable, more maintainable programs.
In some cases
a loop when a search is complete,
(e.g., exiting from
handling interrupt conditions),
religious application of the figures allowed by the Guide may produce code which is less readable than that which might contain a GO TO
(e.g., to the end of the loop block, or to return
from the interrupt handler to a point other than the point of interrupt).
Clearly the exceptions must be limited if discipline
is to be maintained,
but they must be permitted when desirable.
Our approach in FSD has been to require management approval and documentation
for each such deviation.
This ensures that only
cases which are clearly justified will be nominated as exceptionsr
3.
since otherwise the requirements are prohibitive.
Top-Down_Developme~
As defined above~
top-down development is the sequencing of pro-
gram system development to eliminate or avoid interface problems.
65
This permits development and integration to be carried out in parallel and provides additional advantages such as early availability discussed under GUIDELINES.
Top-down development is the most difficult of the four components to introduce, probably because it requires the heaviest involvement and changes of approach on the part of programming managers.
Top-down development has profound
effects on traditional programming management methodology. While the guidelines
sound simple, they require a great deal
of careful planning and supervision to carry out thoroughly in practice,
even on a small project.
top-down development,
The implementation of
unlike structured programming and DSL's,
thus is fundamentally a management and not a programming problem.
Let us distinguish at this point between what might be called "top-down programming"
and true top-down development.
While
they were originally used interchangeably and the guidelines do not distinguish between them, the two terms are valuable in delineating levels of scope and complexity as use of the methodology increases.
Top-down programming is primarily a single-program-oriented concept.
It applies to the development of a "program", typical-
ly consisting of one or a few load modules and a number of independently compilable units, which is developed by one or a few programmers.
At this level of complexity the problems are pri-
marily ones of program design, and the approaches used are those of classical structured programming definition)
(here not the narrower FSD
such as "levels of abstraction "[16] and the use of
Mills' Expansion Theorem [3] .
Within this scope of development
66
external problems and constraints are not as critical, while management involvement is needed, pervasive as in top-down development.
and
it need not be so Many of FSD's suc-
cessful projects have been of this nature,
and the experience
gained on them has been most valuable.
Top-down development,
on the other hand,
program oriented idea.
is a multiple-
It applies to the development of a
"program system ~', typically consisting of many load modules and perhaps a hundred or more independently compilable units, which is developed by one or more programming departments with five or more people in each.
Now the problems expand to those
of system architecture~
and external problems and constraints
become the major ones.
The programs in the system are usually
interdependent and have a large number of interfaces, perhaps directly but also frequently through shared data sets or communications
lines.
They may operate in more than one processor
concurrently - for example, System/370
in a System/7
"front end" and a
"host".
The complexity of such a system makes management involvement in its planning and development essential even when external constraints are minimal.
It involves all aspects of the project
from its inception to its termination.
For example,
a proposal
for a project to be implemented top-down should differ from one for a conventional
implementation
and usage of computer time.
in the proposed manning levels
Functions must be carefully analyzed
during the system design phase to ensure that the requirements of minimum code and data dependency are met, and a detailed implementation sequence must be planned in accordance with the overall proposed plan and schedule.
The design of the system
67
very probably should differ significantly from what it would have been if a bottom-up approach were to be used. implementation,
sure that this sequence is being followed, are being met.
During
progress must be monitored via the DSL to enand that schedules
The early availability of parts of the system
must be coordinated with the user if he intends to use these parts for experimentation or production. ferent type of test plan must be prepared, testing over the entire period. components,
An entirely diffor incremental
Rather than tracking individual
the manager is more concerned with the progress of
the system as a whole, which is a more complicated matter to assess.
This is normally not determinable until the integration
phase in a bottom-up development,
when it suddenly become a
critical item; in top-down work it is a continuing requirement, but one which enables the manager to identify problems earlier and to correct them while there is still time to do so. In a typical system development environment such as those in FSD, however, exception. be met.
external constraints are the rule rather than the
A user will have schedule requirements which must
A particular data set must be designed to interface
with an existing system.
Special hardware may arrive late in
a development cycle and may vary from that desired.
These are
typical of situations not directly under the developers'
control
which have profound effects on the sequence in which the system is produced.
Now the manager's job becomes still more complex
in planning and controlling development.
Each of these external
constraints may force a deviation from what would otherwise be a classical,
no-dependency development sequence.
Provision may
have to be made for testing, documentation and delivery of
88
products
at intermediate
will typically will p r o b a b l y
points
in the overall
change the schedule
This and
increase the complexity of the m a n a g e m e n t
This is especially hundred thousand
true on a very large project
lines of source code or more),
realistic
schedule may well require
developed
in parallel
fashion
cycle.
from the ideal one,
job.
(several since any
that major subsystems be
and integrated
in a nearly conventional
(hopefully at an earlier point in time than the end of
the project). of 400,000
This was carried out successfully
lines of source code,
on a project
the largest known to the
author to date. When carried to its fullest extent~
top-down development
large system p r o b a b l y has greater effects on quality indirectly,
on productivity)
methodology.
than any other component
Even when competent m a n a g e m e n t
its implementation,
of the
is fully devoted
to
there are two other problems which potential-
ly can arise and must be planned for. overlapping
of a
(and thus,
nature of design,
These both relate to the
development
and integration
in a
top-down environment. The first of these concerns the system to be delivered ly, a user receives
the nature of materials
documenting
to and reviewed by the user.
Typical-
a p r o g r a m design document at the end of the
design phase and must express his concurrence before d e v e l o p m e n t proceeds.
This is impractical
in top-down development
because
d e v e l o p m e n t must proceed in some areas before design
is complete
in others.
a detailed
functional
To give a user a comparable specification
all external
is desirable
aspects of a system,
gorithms of concern to a user,
opportunity,
instead.
This describes
as well as any processing
but does not address
al-
its internal
69
design.
This type of specification is probably more readily
assimilated by typical users,
is more meaningful than a de-
sign document and should pose no problems in most situations. Where standardized procurement regulations
(such as U.S.
Government Armed Services Procurement Regulations) effect,
are in
then efforts must be made to seek exceptions.
top-down development becomes more prevalent,
(As
then it is hoped
that changes to such procedures will directly permit submission of this type of specification.)
The second problem is one of the most severe to be encountered in any of the components and is one of the most difficult to deal with.
It has to do with the depth to which a design should
be carried before implementation is begun.
If a complete,
de-
tailed design of an entire system is done, and implementation of key code in all areas is carried out by the programmers who begin the project,
then the work remaining for programmers added
later is relatively trivial.
In some environments this may be
perfectly appropriate and perhaps even desirable; may lead to dissatisfaction latecomers.
in others it
and poor morale on the part of the
It can be avoided by recognizing that design to
the same depth in all areas of most systems is totally unnecessary.
The initial system design work
"architecture"
(the overworked term
still seems to be appropriate here)
should con-
centrate on specifying all modules to be developed and all inter-module interfaces.
Those modules which pose significant
schedule, development or performance problems should be identified, and detailed design work and key code writing done only on these.
This leaves scope for creativity and originality on
the part of the newer programmers,
subject obviously to review
and concurrence through normal project design control procedures.
On some projects,
the design of entire support sub-
70
systems w i t h interfaces to a m a i n s u b s y s t e m only through standard,
s t r a i g h t f o r w a r d data sets has been left to late in the
project.
Note that while this may solve the p r o b l e m s of chal-
lenge and morale,
it also poses a risk that the d i f f i c u l t y has
been u n d e r e s t i m a t e d .
Thus, here again m a n a g e m e n t is c o n f r o n t e d
w i t h a d i f f i c u l t d e c i s i o n w h e r e an incorrect a s s e s s m e n t may be n e a r l y i m p o s s i b l e to r e c o v e r from.
4o
Chief P r o g r a m m e r Teams
The i n t r o d u c t i o n of CPT~s should be a natural o u t g r o w t h of t o p - d o w n development.
The use of a smaller group b a s e d on a
nucleus of e x p e r i e n c e d people tends to reduce the c o m m u n i c a t i o n s and control p r o b l e m s e n c o u n t e r e d on a typical project.
Use of
the other three c o m p o n e n t s of the m e t h o d o l o g y e n h a n c e s these a d v a n t a g e s t h r o u g h s t a n d a r d i z a t i o n and visibility.
In order for a CPT to function effectively, m u s t be given the time,
r e s p o n s i b i l i t y and a u t h o r i t y to p e r f o r m
the t e c h n i c a l d i r e c t i o n of the project. this poses no problems;
the Chief P r o g r a m m e r
In some e n v i r o n m e n t s
in FSD it is sometimes d i f f i c u l t to
achieve b e c a u s e of other demands w h i c h may be levied upon the chief.
In a c o n t r a c t p r o g r a m m i n g e n v i r o n m e n t he may be called
upon to p e r f o r m three d i s t i n c t types of activities:
technical
m a n a g e m e n t - the s u p e r v i s i o n of the d e v e l o p m e n t p r o c e s s itself, p e r s o n n e l m a n a g e m e n t - the s u p e r v i s i o n of the p e o p l e r e p o r t i n g to him,
and c o n t r a c t m a n a g e m e n t
t i o n s h i p s w i t h the customer.
- the s u p e r v i s i o n of the rela-
This latter in p a r t i c u l a r can be a
very t i m e - c o n s u m i n g f u n c t i o n and also is the s i m p l e s t to secure a s s i s t a n c e on.
Hence m a n y FSD CPT's have a p r o g r a m m a n a g e r who
has the p r i m a r y c u s t o m e r i n t e r f a c e r e s p o n s i b i l i t y in all non-
71
technical matters.
The Chief remains r e s p o n s i b l e for technical
c u s t o m e r i n t e r f a c e as well as the other two types of m a n a g e ment;
in m o s t cases this makes the situation manageable,
and
if not then a d d i t i o n a l support can be p r o v i d e d where needed.
The Backup P r o g r a m m e r role is one that seems to cause people a great deal of d i f f i c u l t y in accepting, p r o b a b l y because there are o v e r t o n e s of "second-best"
in the name.
Perhaps the name
could be improved, but the functions the Backup performs are e s s e n t i a l and cannot be d i s p e n s e d with.
One of the p r i m a r y
rules of m a n a g e m e n t is that every m a n a g e r should identify and train his successor.
This is no less true on a CPT and is a
m a j o r reason for the e x i s t e n c e of the Backup position.
It is
also h i g h l y d e s i r a b l e for the Chief to have a peer w i t h w h o m he can freely and openly interact, e s p e c i a l l y in the critical stages of system design. and balance on the Chief.
The Backup is thus an e s s e n t i a l check Because of this,
it is important that
the C h i e f have the right of refusal on a p r o p o s e d Backup;
if
he feels that an open r e l a t i o n s h i p of mutual trust and respect cannot be achieved,
then it is useless to proceed.
The require-
m e n t that the Backup be a peer of the Chief also should not be waived,
since it is always p o s s i b l e that a Backup w i l l be
called on to take over the project and must be fully q u a l i f i e d to do so.
One of the limits on a CPT is the scope of a p r o j e c t it can r e a s o n a b l y undertake.
It is d i f f i c u l t for a single CPT to get
much larger than eight people and still permit the Chief and Backup to e x e r c i s e the e s s e n t i a l amount of control and supervision.
Thus, even at the h i g h e r - t h a n - n o r m a l p r o d u c t i v i t y rates
achievable by CPT's it is d i f f i c u l t for a single Team to produce
72
much more than perhaps 20,000 lines of code in its first year and 30-~0q000 lines thereafter.
Larger projects must there-
fore look to multiple CPT'sr which can be implemented in two ways.
First, as mentioned above under Top-Down Development,
interfaces may be established and independent subsystems may be developed concurrently by several CPT's and then integrated. Second,
a single CPT may be established to do architecture
and nucleus development for the entire system.
It then can
spin off subordinate CPT's to complete the development of these subsystems.
The latter approach is inherently more appealing,
since it carries the precepts of top-down development through intact.
It is also more difficult to implement;
the experiment
under way by the author ran into problems because equipment being developed concurrently ran into definition problems and prevented true top-down development. It is difficult to identify problems unique to CPT~s which differ from those of top-down development discussed above.
Perhaps the
most significant one is the claim frequently heard that,
"We've
had Chief Programmer Teams in place for years - there's nothing new there for us. ~' While it is certainly true that many of the elements of CPT~s are not new, the identification of the CPT as a particular ciplined,
form of functional organization using a dis-
precise methodology
particular,
suffices to make it unique.
In
the emphasis on visibility and control through
management code-reading,
formal structured programming tech-
niques and DSL's differentiate true CPT's from other forms of programming teams [17]
And it is this same set of features
which make the CPT approach so valuable in a production programming environment where close control is essential if cost and schedule targets are to be met.
73
MEASUREMENT RESULTS
It is not possible, because it would reveal valuable business information,
to present significant amounts of quantitative
information in this paper.
At this time, the results of the
measurement program do show substantial improvements in program~ing productivity where the new technology has been used. A graph has been prepared where each point represents an FSD project.
The horizontal axis records the percentage of struc-
tured code in the delivered product, records the productivity. the project,
and the vertical axis
(The latter includes all effort on
including analysis,
design,
testing, management,
support and documentation as well as coding and debugging.
It
also is based only on delivered code, so that effort used to produce drivers,
code written but replaced, etc., tends to
reduce the measured productivity.)
A weighted least squares
fit to the points on the graph shows a better than 1.5 to 1 improvement in the coding rate from projects which use no structured programming to those employing it fully.
It is also possible, leased elsewhere,
because the data has already been re-
to make one quantitative comparison between
productivity rates experienced using various components of the technology on some of the programming support work which FSD has performed for the National Aeronautics and Space Administration's Apollo and Skylab projects.
This comparison
is especially significant because the only major change in approach was the degree to which the new methodology was used; the people, experience
level, management and support were all
substantially the same in each area.
Figure 6 shows the productivity rates and the components of the technology used.
In the Apollo project,
a rate of 1161
74
bytes of new code per m a n - m o n t h was e x p e r i e n c e d on the Ground Support S i m u l a t i o n work. all p r o j e c t effort.)
(Again, all numbers are based on over-
This work used none of the components
d e s c r i b e d in this paper. the Skylab project,
In the d i r e c t l y c o m p a r a b l e effort on
a DSL,
s t r u c t u r e d p r o g r a m m i n g and t o p - d o w n
d e v e l o p m e n t were all employed,
and a rate of 3756 bytes of
new code per m a n - m o n t h was a c h i e v e d -- almost twice as m u c h new code was p r o d u c e d w i t h s l i g h t l y m o r e than half the effort. It is i n t e r e s t i n g also to remark that this was a c h i e v e d on the p l a n n e d s c h e d u l e in spite of over Ii00 formal changes made d u r i n g the d e v e l o p m e n t of that product, m a n p o w e r and c o m p u t e r time.
along w i t h cuts in b o t h
Finally, w h i l e the i m p r o v e m e n t
may rest to some extent on the similar w o r k done previously, this was not d e m o n s t r a t e d C o n t r o l work.
in the p a r a l l e l M i s s i o n O p e r a t i o n s
There p r o d u c t i v i t y d r o p p e d from 15~7 to 841 bytes
per m a n - m o n t h on c o m p a r a b l e w o r k which in neither case used anything other than a DSL.
In a d d i t i o n to m a k i n g q u a l i t y m e a s u r e m e n t s
and d e t e r m i n i n g
p r o d u c t i v i t y rates, the m e a s u r e m e n t a c t i v i t y has served a number of other useful purposes.
First,
it has built up a substantial
data base of i n f o r m a t i o n about FSD projects. added,
As new data is
checks are made to ensure its validity,
data is r e v i e w e d before being added.
and q u e s t i o n a b l e
The result is an in-
c r e a s i n g l y c o n s i s t e n t and useful set of data.
Second,
it has
enabled FSD to begin studies on the value of the components of the methodology. of other factors p r o j e c t activity. ongoing projects~
Third,
and related,
(e.g.~ environment, Fourth,
it also permits the study personnel)
affecting
it is used to assist in r e v i e w i n g
w h e r e the o b j e c t i v e data it contains has
p r o v e d quite valuable. for p r o p o s e d projects,
And fifth,
it is used in e s t i m a t i n g
w h e r e it affords an o p p o r t u n i t y to com-
pare the new w o r k a g a i n s t similar w o r k done in the past, to i d e n t i f y risks w h i c h may exist.
and
75
CONCLUSIONS
It should be clear at this point that FSD's experience has been a very positive one.
Work remains to be done, particularly
in the management of top-down development and the formalization and application of CPT's.
Nevertheless,
FSD is fully committed
to application of the methodology and is continuing to require its use.
In retrospect,
the plan appears to have been a success and
could serve as a model for other organizations interested in applying the ideas.
The FSD experience shows that this is
neither easy nor rapid. and, most important,
It takes substantial time and effort
commitments and support from management,
to equip an organization to apply the methodology.
To summarize,
it appears that once a base of tools,
and education exists,
standards
it is most appropriate to begin with use
of structured and top-down programming and DSL's.
Where the
people, knowhow and opportunity exist, then top-down development should be applied on a few large, complex projects to yield an experienced group of people and the required management techniques.
It is likely that one or more of these may
also present the opportunity to introduce a CPT.
This is es-
sentially the approach that FSD has taken, and it appears to be an excellent way to organize for structured programming.
76 REFERENCES
[ 1]
Datamation,
[2]
B. W. Boehm,
"Software
Assessment",
Datamation,
H. Do Mills,
Mathematical
Programming,
Report No. FSC 72-6012,
[3]
Vol~
Gaithersburg,
[4]
H. D. Mills, Procedures,
Production
Vol.
A Quantitative
19, No. 5, May,
Foundations
Chief Programmer
Teams:
Maryland,
USA, June,
~'Chief Programmer
Programming",
1973, p. 52
for Structured IBM Corporation,
USA, February,
Report No. FSC 71-5108,
F° T. Baker,
1973, pp. 50-63
and its Impact:
Maryland,
Gaithersburg,
[ 5]
19, No. 12~ December,
1972 Principles
and
IBM Corporation, 1971
Team Management
IBM Systems Journal,
of Vol.
ii.~
Noo I, 1972, pp. 56-73
[ 6]
Fo T. Baker;
~'System Quality Through Structured
ming ~', AFIPS Conference
Proceedings,
Vol.
Program-
41, Part I,
1972r pp~ 339-343 [7]
Federal
Systems Center Structured
Report No. FSC 72-5075, Maryland,
[8]
USA, July,
Improved Technology ment Overview, August,
[ 9]
Programming
IBM Corporation,
1973
Guide,
Gaithersburg,
(revised)
for Application
IBM Corporation,
Development:
Bethesda,
Manage-
Maryland,
USA,
1973
Federal Systems Center Programm.ing Librarian's Report No~ FSC 72-5074, Maryland,
USA, April,
IBM Corporation,
1972
Guide,
Gaithersburg,
77
[io]
F. M. Luppino and R. L. Smith, Programming (PSL) Functional poration, Contract
Requirements:
Gaithersburg,
[ii]
Final Report,
Maryland,
#F30602-74-C-0186
HQ Rome Air Development
IBM Cor-
USA, prepared under
with the U. S. Air Force
Center,
Griffiss Air Force Base,
New York, USA, July,
197~
Contracting
Mr. Paul DeLorenzo)
officer,
Support Library
(Release subject to approval of
P. W. Metzger,
Managing a Programming
Prentice-Hall,
Englewood Cliffs,
Project,
New Jersey,
USA,
1973 [12]
R. C. McHenry, Structured Maryland,
[13]
Management Concepts
Programming, USA, November,
HIPO - Hierarchical
Gaithersburg,
1972
Input - Process - Output Docu-
mentation Technique: Corporation,
for Top Down
IBM Corporation,
Audio Education Package,
Form No. SR20-9~13
IBM
(Available through
any IBM Branch Office) [14]
M. M° Kessler, ming Macros, USA,
[15]
Assembly Language Structured
IBM Corporation,
September,
Gaithersburg,
ProgramMaryland,
1972
G. F. Weinwurm et al, Research into the Management Computer Programming:
A Transitional
Estimation
System Development
Techniques,
Santa Monica,
California,
from the Clearinghouse nical Information
Analysis
USA, November,
Corporation, 1965
for Federal Scientific
as AD 631 259)
of
of Cost (available and Tech-
78
[16]
E. W. Dijkstra~ ming System", No~ 5, May,
[17]
~'The Structure
Communications
ii.,
1968, pp. 341-346
G. M. Weinbergr
The Psychology
ming, Van Nostrand 1971
of the THE Multiprogram-
of the ACM, Vol.
Reinhold,
of Computer Program-
New York, New York, USA,
79 J ro~trelation. On the other hand, they need not be checked at any reference but just at the interface. This leads to the second reason: It is the designers responsibility to minimize the information passed across interfaces.
7.
SUMMARY
The preceding sections presented an attempt to handle errors systematically. Error isolation a n d ~ecovery were treated under one aspect~ Error isolation led to the realization of the p a r t i a l fanction concept and the provision of resource manager~. Error recovery in addition necessitated the i n t r o d u c t i o n Of a c o m m i t m e n t discipline in conjunction with a h i e r a r c h i c a l structure of processes.
REFERENCES
I.
J.J. Ho~ning and B. Eandell: Process Computing Surveyse Vol. 5, No I, March 1973
2.
N. Wirth: Informatica
.
The Programmin 9 I, 35-63 (1971)
N. Habermann: Language Pascai,
Language
Structuring,
Pascal,
Acta
3ritical Zomments on the Programming Acta Informatica 3, ~7-57 (1973)
P. Brinch Hansell: Concurrent Programming Computing Surveys, Vol. 5, No 4, December 1973 5~
Ch. T. Dawies, Jr.: Recovery Semantics System, 1973 P r o c e e d i n g s of the ACM
6.
L.A. Bjork: Proceedings
recovery Scenario of the ACM
for
a DB/DC
for
Conceptsr
a
DB/DC
System,
1973
7. D. L. Parnas: Software Engineering or Methods for the Multi-Person Construction of Multi-Version Programs, published in these proceedings 8. N. Wirth: The Design of 1, 309-333 (1971)
a P A S C A L Compiler~
Software,
Vol
97
Precedence Relation between Processes
Fig.
I
98
Mapping of Input States to Output States defined by a Process
l~p = {X+,X2} d
X2 Xl
~ig. 2
99
Distinction between Functional and Exceptional Actions of a Process I
~ig. 3
IIII
I
II
I
100
Use of Rangesand Error Descriptions
Example 1: 1 X(IXl '
The result form.
It
is an A P L G O L is left
source p r o g r a m in
in a global
expanded,
c h a r a c t e r array
indented
v a r i a b l e named
"OUT".
The
APLGOL
System
was d e s i g n e d
requires d i f f e r e n t human or editing its
a p r o g r a m than when
structure.
d e s i g n e d to
accept text with
the A P L G O L
the
user
to display
compiler
c o m b i n a t i o n s of
has
to the
p r o d u c e s source p r o g r a m s w i t h
in statements,
i n d e n t i n g for
been
a b b r e v i a t e d and
and m a n y source statements
line, w h i l e the REVERSE c o m p i l e r fully spelled k e y w o r d s
that
he lists a p r o g r a m
Consequently,
completely spelled keywords
with t w o - s p a c e
recognizing
factors c a p a b i l i t i e s w h e n he is typing
one s t a t e m e n t
each layer
per line,
of nesting.
Thus,
a
p r o c e d u r e can be entered as:
SAMPL; ~ A~B I~B A+C; ~ J÷I ~ L(N-1 +I)÷2 L2+L3÷L4L-J-I; L4÷ -L-l+pY÷7 DYADF L; E; E E A÷D; 2:ppZ T X C[2]÷(I+pZ)÷N~ Z+N,C,N,P,Q,R; E while
it is
p r o d u c e d by
the
reverse c o m p i l e r
s u b s e q u e n t editing as: ~ROCEDURE SAMPL~ ZF A~B ~HEN BEGIN A+C; ~OR J÷I ~NTIL L(N-I+I)~2 ~0 L2÷LS+L4L-J-I; BEGIN Lq÷-L-I+pY÷7 DYADF L; END~ END ELSE A÷D; ~Z i : p p Z ~HE~ [XIT C[2]÷(I+pZ)÷N~ Z+N,C,N,P,Q,R~ END PROCEDURE
and used
for
173
3.1
The A P L G O L Editor
A single c o n t e x t - o r i e n t e d editor is used to enter new programs, edit
old
ones,
and
p a t t e r n e d after with typical APL not to
of
a
lines
of text
b o t t o m of
relative to
A P L G O L statements lines
may
an implied
or down
a few
The text
to
cursor,
has no special
enter
an
one
replacing,
lines or
been
to context and
changing
deleting,
and p h y s i c a l lines;
be used
has
4). C o m p a r e d
functions include locating
inserting,
the text.
editor (Ref.
this editor is keyed
Its
cursor up
This
VM/370
c h a r a c t e r pattern,
pattern to another,
m o v i n g the
programs.
editor in
editors,
line numbers.
occurrence
list
the CMS
the next character
or p r i n t i n g
and,
finally,
to the
top or
r e l a t i o n between
an a r b i t r a r y
A P L G O L statement,
number of or
many
statements can be entered on a line.
APLGOL
source
Keywords all recognition,
programs are begin w i t h
actually
an u n d e r l i n e d
APL
character
first letter
arrays. for easy
while other letters are not u n d e r l i n e d in order to
reduce keystrokes.
Also, w h e n programs
the first u n d e r l i n e d
are being entered,
letter need be used.
facilitate the e n t e r i n g
of programs and to
This
only
is intended to
reduce m i s s p e l l i n g
of keywords.
To create an entirely new source
p r o g r a m the editor is invoked
by entering:
NEWPGM+EDIT
(O,N)p
~
t
974
where
NEWPGM
specifies and may lines
is the
the accept
commands
of source
is i n v o k e d
name
line width.
text.
according
of
for
new
source
is then
inserting,
To edit to the
the
The e d i t o r
and
N
C o m m a n d Mode
deleting,
a text array,
following
text~
in
or c h a n g i n g
the edit p r o c e d u r e
example:
ARRAYs-EDIT A R R A Y
where
ARRAY
the newly
is the
edited
and new texts
typing
may
INSERT
to abort
status
The E D I T O R ....
typing
this,
text
an
any other
DELETE
Mode
by
the
....
the
text
editing
following
after
"I" f o l l o w e d lines
the other
Insertion entered
editing
source
edited
and
for the old
either into
by
typing
internal
process
"FILE"
APL,
without
or by
affecting
commands:
the c u r r e n t
is in C o m m a n d Mode,
successive
one after
being
can be used
or I ....
W h e n the e d i t o r by
the old text names
of the procedure.
accepts
Insert
terminate
the A P L G O L
"QUIT"
the prior
of both
Different
if desired.
The p r o g r a m m e r to t r a n s l a t e
name
text.
and
pressing
by a
line.
Insertion carriage
Mode
is e n t e r e d
return.
Following
of text m a y be inserted
as they return
are to appear to
Command
the c a r r i a g e
return
by t y p i n g
in the text. Mode,
a
null
on a n e w
them
To leave line
is
line b e f o r e
character.
n or D n
Delete
n
lines b e g i n n i n g
with
the
current
line.
If
n is
175
omitted,
then
1 is assumed.
PRINT n or P n Print n
lines b e g i n n i n g
omitted,
1 is assumed.
NEXT n
w i t h the c u r r e n t
line.
If
n is
or N n
Step
forward
assume
n lines
in
the Text.
If n is
omitted
then
i.
UP n or U n Step up n lines.
If n is o m i t t e d
then assume
i.
TOP or T Position
line p o i n t e r
at first line
line p o i n t e r
at last line
in text.
B O T T O M or B Position
L O C A T E / .... / Search
or
L/ .... /
the lines
containing
the
following text s t r i n g
are used to
delimit
part of the
string being
be used
in text.
....
the argument
as a delimiter.
of the function,
the c u r r e n t
line
for the line
The c h a r a c t e r s of locate;
searched
for.
If the p o i n t e r
the search will b e g i n
/
and
/
they are not
Any c h a r a c t e r is on the last
may line
at the
top of the
and replace
it by text2
function.
CHANGE
/textl/text2/
Search
or C / t e x t l / t e x t 2 /
the c u r r e n t
if it occurs. delimiter
line for textl,
As noted,
and may
above,
be replaced
the c h a r a c t e r by any
/ serves
character.
as a
If textl
176
is null,
text2
If text2
is nu!ll
REPLACE
textl
at the b e g i n n i n g
of the
line.
is deleted.
.... or R ....
Replace given,
FILE
is i n s e r t e d
the
current
this
line by the
is the same
as DELETE,
text
....
If no
UP,
INSERT
the newly
edited
text
is
combined.
or F Leave target
the editor array
and a s s i g n
text
to the
status
of the
specified.
QUIT or Q Leave
the editor
function. it
and make
will
still
function,
remain
If
the a r g u m e n t
DELETE
is such
that
last
line.
point
3.2
The
The A P L G O L
compiler
syntax text
to the
is
scanner, generator.
initialize invoking
the
then
the
line,
line
to editing,
it was
a
defined
be unaltered.
NEXT,
Similarly,
the top first
to
If
prior
PRINT,
LOCATE,
the line p o i n t e r w o u l d
end of the function,
point b e y o n d
will
to the
undefined
undefined.
its d e f i n i t i o n
Note:
to the
no change
If the f u n c t i o n was
line p o i n t e r if the
move past
or the
is set to point
the UP c o m m a n d line p o i n t e r
tries is set
to to
in the text.
Compiler
organized
into three
(2) a lexical One of tables
the syntax
two for
scanner.
scanner, driving the
main and
sections: (3) an
procedures
appropriate
(i)
a
APL o b j e c t is
syntax
used
to
before
177 The syntax scanner is the c o n t r o l l i n g procedure, symbols
from
rules, and
the
lexical scanner,
then i n v o k i n g
the text
obtaining meta
identifying g e n e r a t o r as
the
grammar
n e c e s s a r y to
produce the a p p r o p r i a t e A P L object text.
On each invocation, symbol
the lexical scanner returns
number r e p r e s e n t i n g
character,
or
an APL
scanned for one
a
expression.
e.g.,
the first
Source
special
text characters
are
search for a specific meta symbol;
c h a r a c t e r in each reserved
w o r d is u n d e r l i n e d m e t a symbol, both
scanner and for the reader.
listed in
a single m e t a label,
characters, which is
d i s t i n g u i s h i n g it as a p a r t i c u l a r
for the lexical symbols
word,
of a very limited set of
then used to key a d e t a i l e d
to aid in
reserved
Appendix
II are
The terminal meta
detected
in the
lexical
scanner.
When a
grammar rule is i d e n t i f i e d
text g e n e r a t o r is
by the syntax
invoked with a number
scanner,
the
c o r r e s p o n d i n g to that
grammar rule to locate the applicable p o r t i o n of the generator. Information a c c u m u l a t e d in
the compile stack and
then used to g e n e r a t e labels, branches,
elsewhere is
or to produce a single
APL text line from a b u f f e r e d APL expression.
An output
p r o c e d u r e is
text,
is invoked from the
and
line is created. workspace,
so
employed to
form an
array of
object
text g e n e r a t o r each time
a new
This object text array remains in the c o m p i l e r that
extraneous b r a n c h e s
and
labels
can
be
removed after all the object text has been produced.
The semantics
of the control
structures are indicated
in the
e n c l o s e d figures.
In
the
APL
object
compiler, many of
text
produced
by
the labels and branches
the
simple
one-pass
may be unnecessary,
178
particularly absolute
the o b j e c t After by
those
branch. code
all the
an
target
of the
appear
on a
removed,
3.3
It
label
absolute single
been
internal
APLGOL
possible
to
produced
by
the single
branch.
one
and
APL
APLGOL
implementation
entity
from
with
if m u l t i p l e
labels
all
but one
are
accordingly.
systems
to r e t a i n
and to
rather
heavily
the other there
(Re,.
dissimilar,
this
to
requires
and
not m u l t i p l e
from
forms
of
of the
In the o r i g i n a l
not
only
a separate
guarantee
level.
by h a v i n g
been
advantage
source was
one could
reduced
it has
by the p a t t e r n s
The m a i n
same m a i n t e n a n c e
are
This
single
translation
conditioned way.
are
The
5) the A P L G O L
the A P L object, at the
a
translate
as necessary.
get out of s y n c h r o n i z a t i o n .
requirements
removed
to the
text,
translators.
is
form is that to
followed
for each direction.
both
however,
program
as
references.
a label and
and an
tables
to r e f e r e n c e s
Further,
are c h a n g e d
APL are
translation
were
revised
of o b j e c t
label
builds
and their
detected
representation,
source
forms
be
each APL p r o c e d u r e
construct
A P L to APLGOL,
storage
can
in
a pair of t r a n s l a t o r s ,
Although
of a
Back-Compiler
customary
form for
labels
suitably
line
The A P L to A P L G O L
or from a c h a r a c t e r
both
showing
and the r e f e r e n c e s
has
consist
the c o m p i l e r
code has been produced,
branch
to that
which
them,
is formed object
absolute
references
statements
To remove
that
Additionally, a single
copy
of a procedure.
The
reverse
stylized
the o r i g i n a l duplicate
translation
canonical
form,
APLGOL
program
the original,
an i n d e n t e d graphically.
format w h i c h In this
from
APL to
w h i c h may as
entered.
the t r a n s l a t o r displays
form,
keywords
APLGOL
appear
will p r o d u c e
quite
Rather was
different
than a t t e m p t
designed
the s t r u c t u r e are fully
a
from to
to produce
of the p r o g r a m
spelled,
and each
179
control
level is
statement
per
consistently
line.
This
indented,
graphical
typically with
representation
p r o g r a m is very useful to depict the p r o g r a m structure. this form is
a v a i l a b l e for listing an A P L G O L
been s u c c e s s f u l l y
t r a n s l a t e d to
always
in a
be listed
APL, the
standard format,
of
one the
Because
p r o g r a m that has
APLGOL program thereby p r o m o t i n g
can a
c o n s i s t e n t style.
In general the of the
reverse c o m p i l e r does not
A P L G O L program.
number of APL statements edited
to have
present in
two
however, keep
in each block.
or m o r e
the o r i g i n a l
will add a BEGIN
It does,
change the structure
statements
A P L G O L source,
...END pair
count of
the
If the A P L f u n c t i o n is where
only one
the reverse
was
compiler
around the statements during back
compilation.
4.
ACKNOWLEDGEMENTS
The author wishes to a c k n o w l e d g e the S c i e n t i f i c Center staff
work of the IBM Palo Alto
for their i n s p i r e d work
i m p l e m e n t i n g and testing the A P L G O L
system.
in designing,
The m a j o r o r i g i n a l
c o n t r i b u t i o n s were made by Robert A. Kelly, John R. Walters and Dan M c N a b b Myers
(See Refs.
for the
5-7).
Special thanks are due to H. Joseph
more recent
modifications
and testing
APLGOL
under m a n y conditions.
The
author is
making
p a r t i c u l a r l y grateful
available to
us
functions w r i t t e n in APL.
his
to John
m o s t recent
(Ref.8)
R. Walters
copyrighted
for
APLGOL
180
APPENDIX I:
References
io
APL/360 User's Manual, GH20-0683,
2.
Dijkstra,
E.
Lecture,
Communications
Wo~
"The
Humble of
IBM Corporation.
Programmer",
the
ACM,
Vol.
1972 15,
Turing No.
I0,
October 1972. 3.
Dijkstra,
E.W.,
Dahl and C.
"Structured Programming",
A. R. Hoare) Academic
(with O.
Press, London,
J.
October
1972. 4.
IBM
Virtual Machine Facility/370:
5.
Kelley,
Edit
Guide, GC20-1805,
IBM Corporation. for
R. A.,
APL",
"APLGOL,
IBM
Palo
a Structured Programming Language
Alto
Scientific Center
Report
No.
320-3299, August 1972. 6.
7.
Kelley,
R.
A.,
"APLGOL,
Programming
Language",
Development,
VOlo
Kelley, R. Programming
an
IBM
Experimental Journal
of
Structured Research
and
17, No. I, January 1973.
A. and Walters, System for
J. R.,
APL"r
IBM
"APLGOL-2 A Structured Palo Alto
Scientific
Center Report No. 320-3318, August 1973. 8.
Kelley, R. Programming
A. and Walters, Language
J. R.,
System
APL-VI Conference, May 14-17, 9.
Knuth~ Stanford
D.
E.,
"A
University
STAN-CS-73-371,
Review
for
Proceedings
of
1973. of
Computer
June 1973.
"APLGOL-2 A Structured APL".
Structured Science
Programming" Department,
181
APPENDIX .
.
.
.
.
.
.
.
II: .
.
.
APLGOL
.
.
.
.
.
.
.
.
.
SYNTAX .
.
.
[I]
[2]
[3] [4]
< S T A T E M E N T LIST> ::= < S T A T E M E N T > ; I
[5] [6]
< S T A T E M E N T > ::: ::= < C O M M E N T S T A T E M E N T > I
[9] [10] [11]
< S T A T E M E N T - A > ::= < E X P R E S S I O N > :
NULL :
~12]
~13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23]
P R O C E D U R E _I_
::=
_I_
~:=
PROCEDURE
LIST>
;
;
EZZT [XIT < B E G I N > < S T A T E M E N T L I S T > ~NTIL < T R U E P A R T > < S T A T E M E N T > ~ 0 < S T A T E M E N T > < W H I L E H E A D > DO < S T A T E M E N T > DO < C A S E E X P R > B E G I N < S U B C A S E L I S T >
[24]
CASE>
::=
HEAD>
~F
EASE
182
A. P. P. E. N. D.I.X. . II: . . .
A.P .L G. O. L.
.
.
[40]
[41]
B ~0 [3] BEGIN [4] REPEAT [5] B+C[I]; [6] WHILE I z in WORLD5° We may attempt to achieve this relation directly, by inserting the instruction sort(x z) in the middle of the program, yielding
251
the p r o g r a m sort(x y) sort(x z) sort(y z). However, this action, though ultimately
correct, risks disturb-
ing the relation x(y p r o t e c t e d in WORLD5.
A more prudent course
is to pass the predicate back still further to WORLDI, r e l a t i o n is p r o t e c t e d at all.
where no
Passing the r e l a t i o n back to WORLDI
gives the two predicates x ( z if y > z and x ~ y in WORLDI and y ( z if x > z and x > y in WORLDI. We must achieve both of these relations.
If we achieve the first
r e l a t i o n first, we insert the i n s t r u c t i o n sort(x z) at the beginning of the program, vially satisfied.
then beeuase x(z, the second relation is tri-
The p r o g r a m o b t a i n e d is the
sort(x z) sort(x y) s o r t ( y z). If, on the other hand we try to achieve the second r e l a t i o n first, by i n s e r t i n g the i n s t r u c t i o n sort(y z) at the b e g i n n i n g of the program,
the first r e l a t i o n will also be satisfied automatically,
and the p r o g r a m w i l l then be sort(y z) sort(x y) sort(y z). The simultaneous goal p r o b l e m is essential to all p r o g r a m synthesis:
what we have said in this section only begins to explore
the subject. This concludes the p r e s e n t a t i o n of our basic p r o g r a m synthesis techniques.
In the next part we will show how the same techni-
ques work together in the synthesis of some more complex examples.
III. PROGRAM SYNTHESIS:
THE P A T T E R N - M A T C H E R
We w i l l present the synthesis of a simple p a t t e r n - m a t c h e r to show how the concepts discussed in the pre%ious n o n - t r i v i a l problem.
section can be applied to a
Later, in Part IV, we shall show how we can con-
252
struct an even more complex program~ the u n i f i c a t i o n a l g o r i t h m of R o b i n s o n thesize.
[1965], by m o d i f y i n g the p r o g r a m we are about to syn-
We must first describe the data structures
tive operations
i n v o l v e d in the p a t t e r n - m a t c h i n g
and primi-
and u n i f i c a t i o n
problems.
A.
Domain and N o t a t i o n s
The main objects in our domain are 9 x p r e s s i 0 n ~ and substitutions. i.
Expressions
Expressions
are atoms or nested lists of atoms{ cog.,
is an expression.
An atom may be e i t h e r a v a r i a b l e or a constant.
(In our examples we will use A,B,C,... for variables.)
(A B (X C) D)
for constants
We have basic p r e d i c a t e s
and U,V,W,...
atom, var and eonst to
d i s t i n g u i s h these objects: atom(R) var(~) and
~ ~ is an atom, ~ Z is a variable,
const(Z)
~ ~ is a constant.
To decompose an expression, we will use the p r i m i t i v e functions head(~)
and tail(Z)~
defined w h e n ~ is not an atom.
head(~)
is the first element of Z~
tail(~)
is the list of all but the first element of Z.
Thus head(((A
(X) B) C (D X)))
: (A (X) B),
tail(((A
(X) B) C (D X)))
: (C (D X)).
We will a b b r e v i a t e head(~)
as £i and tail(~)
as ~2"
To construct e x p r e s s i o n s we have the c o n c a t e n a t i o n function: is any e x p r e s s i o n and m is a n o n a t o m i c expression,
if
~.m is the
e x p r e s s i o n w i t h £ i n s e r t e d before the first element of m.
For
example (A (X) B)
(C (D X)) = ((A (X) B) C (D X)).
The p r e d i c a t e o c c u r s i n ( x ~) is true if x is an atom that occurs in e x p r e s s i o n ~ at any level, e.g., oceursin(A
(C (B (A) B) C)) is true
but o c c u r s i n ( x Y) is false. Finally, we will introduce the p r e d i c a t e
constexp(~), w h i c h is
true if ~ is made up e n t i r e l y of constants.
Thus
253
constexp((A
(B) C (D E)))
constexp(X)
is false.
is true
but
Note that
constexp
differs
from const in that constexp may be
true on nonatomic expressions. 2.
Substitutions
A substitution
replaces
other expressions. of pairs.
certain variables
We will represent
of an expression
a substitution
by
as a list
Thus ( )
is a substitution. The instantiation to an expression
function inst(s £.
For example,
£) applies the substitution if s is the substitution
s
above
and £ is (X (A Y) X) then inst(s
~) is
((A B) (A (C Y)) Note that the substitution currences
(A B ) ) . is applied by first replacing
of X simultaneously
of Y simultaneously
by
(C Y).
all oc-
by (A B) and then all occurrences Thus,
if the substitution
s were
( ), then inst(s
~) would be
(C (A C) C). The empty substitution Thus,
A is represented by the empty list of pairs.
for any expression
~,
inst(A ~) = ~. We regard two substitutions
s I and s 2 as equal
(written Sl=S 2) if
and only if inst(s I ~) = inst(s 2 ~) for every expression
~.
Thus
( ) and ( ) are regarded
as the same substitution.
We can build up substitutions position):
If v is a variable
using the functions
p a i r and o (com-
and t an expression,
pair(v t) is
254
the s u b s t i t u t i o n that replaces v by t; i.e.~ a ~ i ~ ( v t) : (). If s I and s 2 are two substitutions~
Sl°S 2 is the s u b s t i t u t i o n
w i t h the same effect as a p p l y i n g s I followed by s 2.
Thus
inst(sl°s 2 ~) : inst(s 2 inst(s I ~)). For examp!e~
if
s I = ( ) and s then
2
= ( )
Sl°S 2 = ( ). Note that for the empty s u b s t i t u t i o n A A°s : s°A : s for any s u b s t i t u t i o n
B.
s.
The s p e c i f i c a t i o n s
The p r o b l e m of p a t t e r n - m a t c h i n g may be d e s c r i b e d as follows. are given two expressions,
but ar~ is assumed to contain no variables; is true.
We
p a t and ar___gg. Pat can be any expression, i.e.,
constexp(a_r~)
We want to find a s u b s t i t u t i o n z that transforms pat in-
to arg, such that inst(z pat)
= arg.
We will call such a s u b s t i t u t i o n a match.
If no m a t c h exists, we
want the p r o g r a m to return the d i s t i n g u i s h e d For examp!e~
constant NOMATCH.
if is
(X A
(Y B))
and is (C A
(D B)),
we want the p r o g r a m to find the m a t c h ( ). On the other hand, if pat is
(X A (X B))
arg is
(B A (D B)),
and then no s u b s t i t u t i o n will t r a n s f o r m pa t into arg, so the p r o g r a m will y i e l d NOMATCH. This version of the p a t t e r n ~ m a t c h e r
is simpler than the pattern-
m a t c h i n g a l g o r i t h m s u s u a l l y i m p l e m e n t e d in p r o g r a m m i n g languages because of the absence of "sequence"
or "fragment" variables.
Our variables must m a t c h exactly one expression, whereas ment v a r i a b l e may m a t c h any n u m b e r of expressions.
a frag-
Because of
255
the absence of fragment variables, be unique.
a match,
if it exists, w i l l
Thus if pat is
(X Y Z) and
at@ is
((A B) C (A B)),
X and Z must be bound to
(A B), and Y must be bound to C.
If
pat is (X Y) and arg is (A B C), no match is possible at all.
(If X and Y were fragment variables,
four matches would be possible.) In m a t h e m a t i c a l n o t a t i o n the specifications
for our p a t t e r n - m a t c h e r
are:
Goal i: I match(p0t a~g) : Find z such that inst(z pat) = arg
r
.......else ..... z = NOMATCH
where
......
"Find z such that P(z) else Q(z)" means find a z such that
P(z) if one exists; otherwise,
find a z such that Q(z).
The above s p e c i f i c a t i o n s do not c o m p l e t e l y capture our intentions; for instance,
if
pat is (X Y), and arg is (A B), then the s u b s t i t u t i o n z = ( ) will satisfy our s p e c i f i c a t i o n s as well as z
= ( ).
We have n e g l e c t e d to include in our s p e c i f i c a t i o n s that no substitutions
should be made for variables that do not occur in pat.
We will call a m a t c h that satisfies this additional condition a most seneral match. An i n t e r e s t i n g c h a r a c t e r i s t i c of the synthesis we present is that even if the user forgets to require that the match found be most general~
the system will be able to strengthen the specifications
a u t o m a t i c a l l y to imply this condition, in Section II-D.
using the method outlined
T h e r e f o r e we will begin the synthesis using the
w e a k e r specifications.
C.
T h e Synthesis:
The Base Cases
Rather than listing all the k n o w l e d g e we require in a special see-
256
tion
at the beginning~
about
to be used.
vial we will
we will m e n t i o n
Furthermore~
omit
a rule
if a rule
it entirely.
only w h e n
seems
The g e n e r a l
it is
excessively
strategy
tri~
is to first
work on Goal
2:
Find
If this
z such that
inst(z
pat)
is found to be i m p o s s i b l e
such
z exists)~
Goal
3:
we will
is seen to be t r i v i a l l y
Thus~
from now on we will be w o r k i n g
ever,
in w o r k i n g
cases
in w h i c h
return
Goal
NOMATCH~
We have
satisfied
is i m p o s s i b l e
2 is p r o v e n
which
base
Goal
z to be NOMATCH.
on Goal
a portion
to achieve.
impossible~
satisfies
in our k n o w l e d g e
by t a k i n g
primarily
on any goal we devote
the goal
that no
z = NOMATCH;
which
that
if it is proven
work on
Find a z such that
showing
= arg.
(i.e.,
2.
How-
of our time
to
When we find
we will
automatically
3.
a number
of rules
concerning
inst,
including Rule
i:
inst(s
x) = x for any s u b s t i t u t i o n
Rule
2:
inst(pair(v
s
if constexp(x) t) v) = t
if var(v) We assume
that
these
tion i n v o c a t i o n cons texp(pat) ditions;
on Goal
and pat
their
truth
to the program~ thetical
Rule
We will
are r e t r i e v e d
2.
Rule
= arg.
In the
i tells have
as they
to t i g h t e n
stand now~
The p o r t i o n
of the p r o g r a m we have
case that
on the p a r t i c u l a r
the
inputs
conditions
is a s a t i s f a c t o r y
simply
constructed
con-
for a hypo-
specifications
we will
func-
of these
as conditions
any s u b s t i t u t i o n
later;
arg)
in the
either
case that both of these
our p r o g r a m
match(pat
only
prove
depends
predicates
us that
occasion
by p a t t e r n - d i r e c t e d
i applies
We cannot
or f a l s e h o o d
We use these
world-split.
are true~ match.
rules
return
of z÷any.
so far reads
=
if c o n s t e x p ( p a t ) then
if pair = ar_~g then
z + any
else.o. On the other
hand~
in the
case
constex~(pat)
and pat
~ arig, Rule
I
257
tells
us that inst(z
for
any
z.
pat)
Hence
~ arg
we are
led to try
to s a t i s f y
Goal
3 and take
z÷NOMATCH. We n o w
consider
the
case
~ constexp(pat). Ruel
2 establishes
the
subgoal
var(pat). This
is a n o t h e r
var(pat)
occasion
is true,
g r a m we h a v e
the
for a h y p o t h e t i c a l
program
constructed
m a t c h ( p a t arg)
must
return
world-split.
~air(pat
arg);
When the p r o -
so far is =
if e o n s t e x ~ ( p a t ) then
if p a t
else
= arg
then
z ÷ any
else
z ÷ NOMATCH
if v a t ( p a t ) then
z ÷ pair(pat
arg)
else .... Heneefore
we
also
~ 9onstexp(pat).
ing
that
assume
additional
Rule
3:
This
rule
and y.
- var(pat).
knowledge
inst(s
x.y)
applies
We have
about
the
= inst(s
to o u r Goal
some
Recall
To p r o c e e d
that
function
x). inst(s 2 if p a t : x . y
additional
we h a v e
we m a k e
knowledge
been
use of the
assuming follow-
inst:
y)
for any
for
some
about
substitution expressions
expressions
s.
x
in ge-
neral: Rule
4:
u = uI
u2
if ~ atom(u) R e c a l l that u I is an a b b r e v i a t i o n t i o n for tail(u). Rule
5:
u ¢ v.w
for any u, v,
for h e a d ( u )
and
u 2 is an a b b r e v i a -
and w
if atom(u) Using
Rule
4, we
generate
a subgoal
- atom(pat). Since
we h a v e
can a c t u a l l y Therefore
already prove
assumed
~ atom(pat)
p a t = p a t l . p a t 2 and
~ constexp(pat) using
using
knowledge
Rule
and ~ v a t ( p a t ) , in the
3 our Goal
system.
2 is then
we
258
reduced Goal
to
~:
Find
We n o w m a k e Rule
6:
z such that
use
of some
To p r o v e
Applying
this
for
u a n d v.
arg some
x
rule~
u = a__~l a n d
= U
inst(z
general
• y = u we
patl)~inst(z
p a t 2 ) = ar_~.
list-processing
. v, p r o v e
generate
knowledge.
x = u a n d y = v.
a subgoal
to
show that
• V Applying
Rule
4, w e k n o w
this
is t r u e w i t h
v : arg 2 if
~ atom(arg). This
is a n o t h e r
Thus,
by R u l e
duces
to
Goal
for
6~ in t h e
Find
5:
occasion
a hypothetical
case
that
world-split.
~ atom(arg),
our
subgoal
re-
z such that
inst(z
p a t I)
= arg I
inst(z
p a t 2)
: a r g 2.
and We w i l l
portpone
sidered
the
treatment
other
case,
of ~his
goal
until
after we have
con-
in w h i c h
atom(ar___~g) holds.
In t h i s
for
any
z.
can
take
The
program
case
inst(z
Rule
p a t I)
Hence,
our
5 tells
- inst(z goal
us t h a t
p a t 2)
~ arg
is u n a c h i e v a b l e
in t h i s
case,
a n d we
z + NOMATCH. so f a r
match(pat
is ar~)
:
if constexp(~at) then
else
For the forth
as y e t
using
a r g l " arg 2 •
if pat then
z + any
else
z + NOMATCH
if var(pat) then
z ÷ pair(pat
else
if atom(arg)
untreated
Rule
= arg
4 we
arg)
then
z ÷ NOMATCH
else
.. .
case neither
assume
that
pat
pat
nor
a r ~ is
atomic.
is p a t l . p a t 2 a n d a r g
is
Hence-
259
D.
The Synthesis:
The Inductive Case
We will describe the remainder of the synthesis in less detail, because the reader has already seen the style of reasoning we have been using.
Recall that we had postponed our discussion of
Goal 5 in order to consider the ease in which arg is atomic. Now that we have completed our development of that case, we resume our work on Goal 5: Find z such that inst(z pat I) = argl , and inst(z pat 2) = ~Fg2" This is a conjunctive goal, and is treated analogously to the goal in the simultaneous treated separately.
linear equations example:
each conjunct is
The system will attempt to use a recursive
call to the pattern-matcher itself in solving each conjunct. The interaction between the two conjunets is part of the challenge of this synthesis.
It is quite possible to satisfy each conjunct
separately without being able to satisfy them both together.
For
example, if pat=(X X) and ~rg=(A B) then patl=X , pat2=(X) , argl=A and ar~2=(B).
Thus z=() satisfies the first conjunct,
z=() satisfies the second conjunct, but no substitution will satisfy both conjuncts because it cannot match X against both A and B.
Some mechanism is needed to ensure that the expression
assigned to a variable in solving the first conjunct is the same as the expression assigned to that variable in solving the second conjunct. There are several ways to approach this difficulty. programmer may satisfy the two conjuncts
For instance~ the
separately and then attempt
to combine the two substitutions thereby derived into a single substitution.
Or he may actually replace those variables
in pat 2
that also occur in a~9~l by whatever expressions they have been matched against~ before attempting to match pat 2 against arg 2. Or he may simply pass the substitution that satisfied the first conjunct as a third argument to the pattern-marcher in working on the second conjunct. The pattern-marcher must then check that the matches assigned to variables are consistent with the substitution given as the third argument.
260
We will examine in this
section how a system would discover the
second of these methods
and consider in a later section how the
third method
could be discovered.
method here because
We will not consider the first
it is not easily adapted to the unification
problem. Our strategy We will
for a p p r o a c h i n g
consider the first conjunct
Goal 6:
satisfies
z into the second conjunct, 7:
Prove inst(z
If we are successful
independently:
~at2)
find a broader
this goal~ we will substitute
z.
= ar___~g 2.
that satisfy Goal 6 and
select one that also satisfies
strategy,
rule that relates
if we fail,
In other words, we will try to
class of substitutions
method we introduced to solve conjunctive A p p l y i n g this
that
giving
in Goal 7~ we are done~ however,
we will try to generalize from these
goal is as follows.
Find z such that inst(z patl ) = argl.
If we find a z that
Goal
the conjunctive
Goal
7.
This
is the
goals in Section II-E.
we begin work on Goal 6.
We first use a
the construct
Find z such that P(z) to the construct Find z such that P(z) else Q(z). Rule 7:
To find z such that P(z), find z I such that P(z I) else Q(z I) for some predicate
Q,
and prove ~ Q(Zl). This rule~ Goal
8:
applied to Goal 6 causes the generation
of the subgoal
Find z I such that inst(z I pat I) = arg I else q(Zl)~ and prove ~ Q(Zl).
This subgoal matches NOMATCH.
the top-level
This suggests
Goal i, where Q(z I) is Zl=
establishing
a recursion
at this point,
taking z I = m a t c h ( p a t l argl). T e r m i n a t i o n is easily shown, because both pat I and arg I are proper subexpressions show,
of pat and @rg, respeetively,
according to Rule
7, that
zi~NOMATCH.
This
It remains
to
causes another
261
hypothetical NOMATCH
world-split:
in the case Zl=match(patl
(i.e., no substitution will cause p atl and arg I to match),
we can show that no substitution either,
and hence
match(~atl However,
ar~l)=
can cause pat and arg to match
can take z=NOMATCH.
argl)~NOMATCH,
On the other hand,
we try to show Goal
inst(z I pat 2) = arg 2. we fail in this attempt;
if
7, i.e.,
in fact we can find sample in-
puts pa t and arg that provide
a counter-example
to Goal
7 (e.g.,
pat=(A X), arg=(A B), Zl=A).
Thus we go back and try to genera-
lize our solution to Goal 6. We already have a solution to Goal 6:
we know inst(z I patl)=arg I.
We also can deduce that constexp(argl) , because we have assumed constexp(arg).
Hence Rule i tells us that
inst(w arg I) = arg I for any substitution w. Hence inst(w inst(z I p atl))
= argl,
i.e., inst(zl°w p a t I) = arg I for any substitution w.
Thus having one substitution
z I that sa-
tisfies Goal 6, we have an entire class of substitutions, Zl°W ~ that also satisfy Goal 6. sidered to be "extensions"
These substitutions
of Zl; although
sfy Goal "7, perhaps
some extension
The above reasoning
is straightforward
further work is needed to motivate
z I itself may not sati-
of z I will. enough to justify,
9:
but
a machine to pursue it.
It remains now to find a single w so that Zl°W satisfies Goal
of form
may be con-
Goal
7, i.e.,
Find w such that inst(zl°w pat.2)=ar__zg2,
or equivalently, Find w such that inst(w inst(z I P at2))=arg 2. applying Rule 7, we establish Goal i0:
a new goal
Find w such that inst(w inst(z I pat2))=arg 2 else Q(w), and prove
~ Q(w)
This goal is an instance inst(z I pat2),
of our top-level
goal, taking pat to be
a rg to be a rg2' and Q(w) to be w=NOMATCH.
attempt to insert the recursive
call
Thus we
262
z 2 * m a t c h ( $ n & t C z I pat2) into our p r o g r a m must
at this p o i n t
arg2 )
and take w to be z 2.
However,
we
first e s t a b l i s h Q(z2) , m a t c h ( i n s t ( z I pat2)
We c a n n o t p r o v e pat
this.
a r g 2) ~ N O M A T C H .
We have
= (A A)~ ~
eounter-examples,
e.g.,
if
= (A B) and z I = A,
then match(inst(A Therefore
we split
A)
B) = NOkIATCH.
on this
condition.
'In the case m a t e h ( i n s t ( z I pat2) Goal
i0 is s a t i s f i e d .
Z=Zl°Z 2 satisfies Our p r o g r a m
arg2 ) ~ N O M A T C H
Thus w=z 2 also
Goal
satisfies
Goal
9, and
7.
so far is
mateh(p_~
ar_~) =
if e o n s t e x ~ ( p a t ) t h e n if ~
= arg
then
z ÷ any
else
z ÷ NOMATCH
else if v a r ( p a t ) then
z ÷ pair(pat
ar$)
else if a t o m ( a < [ ) then
z + NOMATCH
else
z I ÷ m a t c h ( p a t I argl )
if z I = N O M A T C H t h e n z + N0~IATCH else
z 2 ÷ m a t c h ( i n s t ( z I p_a~2 ) arg2)
if z 2 = N O M A T C H
E.
The S y n t h e s i s :
We have
gone this
fications~ general.
...
else
z ÷ Zl°Z 2
The S < r e n ~ t h e n i n g
far t h r o u g h
i.e., w i t h o u t In fact~
then
the
requiring
the m a t c h
of the
synthesis
Specifications
u s i n g the w e a k
that the m a t c h
speci-
f o u n d be m o s t
f o u n d may or may not be m o s t g e n e r a l
263
d e p e n d i n g on the v a l u e taken for the u n s p e c i f i e d s u b s t i t u i o n "any" p r o d u c e d in the very first case.
The synthesis is nearly complete.
However, we will be unable to complete it without the s p e c i f i c a t i o n s
strengthening
and m o d i f y i n g the p r o g r a m accordingly.
have only one case left to consider.
We now
This is the case in w h i c h
z2 = NO~TCH i.e., m a t c h ( i n s t ( z i pat 2) = NOMATCH. This means that no s u b s t i t u t i o n w satisfies inst(w inst(z I pat2))
= arg2,
or, e q u i v a l e n t l y i n s t ( z l ° w pat 2) # arg 2 for every s u b s t i t u t i o n w. This means that no s u b s t i t u t i o n of form Zl°W could p o s s i b l y satisfy i n s t ( z l ° w pat) We here have a choice:
= a rg. we can try to find a s u b s t i t u t i o n s not of
form Zl°W that satisfies inst(s pat I) = arg 1 and repeat the process; stitution
or we could try to show that only a sub-
s of form Zl°W could p o s s i b l y satisfy inst(s pat I) = argl,
and t h e r e f o r e we can take z=NOMATCH. We have already e x t e n d e d the class of s u b s t i t u t i o n s that satisfy the condition once; t h e r e f o r e we pursue the latter course. try to show that the set of substitutions
We
S=Zl°W is the entire
set of solutions to inst(s pat l) = arg I. In o t h e r words, we show that for any s u b s t i t u t i o n s, if inst(s pat I) = arg I then s = Zl°W for some w. This condition is equivalent to saying that z I is a most general match.
We cannot prove this about z I itself; however,
since z 1
is m a t c h ( p a t I arg I) it suffices to add the condition to the specifications
for match, as d e s c r i b e d in Section II-D.
s p e c i f i c a t i o n s now read Find z such that {inst(z pat) = arg and for all s [if inst(s pat) = arg then s = z°w for some w]} else z = NOMATCH.
The s t r e n g t h e n e d
264
O n c e we h a v e go t h r o u g h
the
cifications In this
strengthened entire
are
case
the
program
satisfied,
no m a j o r
specifications and
see that
modifying
modifications
it is n e c e s s a r y the
new,
to
stronger
spe~
the p r o g r a m
if n e c e s s a r y .
are n e c e s s a r y ;
however,
the
assignment z + any that
occurs
tant
is
in the
further
case
in w h i c h
specified
pat
and
ar$
are
equal
and
to r e a d
z ÷ A. Our final
program
is t h e r e f o r e
match(pat
a_E~) :
if 9 o n s t e x z ( p a t ) then
if p a t = arg
then
z + A
else
z ÷ NOMATCH
else
if v a r ( p a t )
then
z + pair(pat
else
if a t o m ( a r g )
ar$ ,)
then
z ÷ NOMATCH
else
z I ÷ match(~atl
argl )
if z I = N O M A T C H then
z ÷ NOHATCH
else
z 2 ÷ m a t c h ( i n s t ( z I p a t 2) arg 2)
if z 2 = N O M A T C H then
z ÷ NOMATCH
else
z ÷ zl°z 2.
cons-
265
F.
Alternative
Programs
The above p a t t e r n - m a t c h e r
is only one of may pattern-matchers
that can be derived to satisfy the same specifications. suing the synthesis the alternative altogether,
the system has made many choices;
some of
paths result in a failure to solve the problem
whereas
better programs.
other paths result in different,
In this
might make a different sequently,
In pur-
possibly
section we will indicate how a system
decision
in the above synthesis
and, con-
drive a different program.
In the above synthesis we derived a goal Find w such that inst(zl°w
(Goal 9):
Pat2 ) = arg 2.
We chose at that point to transform the goal to the equivalent Find w such that inst(w
inst(z I Pat2))
and then apply Rule 7, which added an else-clause yielding Goal I0. goal, introducing
= arg2, to the goal,
We then matched Goal i0 against our top-level the second recursive
It would be equally plausible
call into our program.
that the system should apply Rule
7 directly to Goal 9, giving a new goal Goal I@': Find w such that inst(zl°w
pat 2) = ar__~2
else Q(w), and prove ~ Q(w). The system may now attempt to use a reeursive Goal I0'
However,
top-level
this goal is not a precise
call to satisfy instance of our
goal~ we are demanding not only that a match be found
between pat 2 and arg2, but also that the match be of form Zl°W , where z I is the output of the previous recursive therefore and alist. alist.
generalize
call.
our program to take three inputs:
We must pat,
ar___gg,
We insist that the match found be an extension of
In the case that alist is A, the new p r o g r a m will be iden-
tical to the old one. and arg:arg2,
On the other hand,
a recursive
if alist=zl~
pat=pat2
call to the new p r o g r a m suffices to
satisfy Goal i0' In mathematical
notation,
the stronger
I Find z such that
specifications
[inst(z pa t ) = arg and z = alist°w for some w]
[ else z : NOMATCH.
now read
266
We will call the g e n e r a l i z e d
program
m a t c h 2 ( R a ~ arg alist). The original p r o g r a m match will be written general
auxiliary program: match(pat ar_~) = m a t c h 2 ( p a t
The portion of the p r o g r a m already of match must be s y s t e m a t i c a l l y ed requirements First,
as a call to the more
ar e A).
constructed
m o d i f i e d to satisfy the strengthen-
of match2.
in the case that constexp(pat)
took z÷any.
in the synthesis
However~
where p at=arg~ we originally
we must now take
z ÷ alist°any, becuase the substitution
found must be an extension of alist.
In the case that vat(pat) however, matched
here we must against
we originally
took z+pair( a ~
check to see w h e t h e r ~
ar_~g);
has already been
something other than ar__~. The new p r o g r a m seg-
ment is if inst(alist
~at)
= pat
then z ÷ alis~°pair( ap~_ arg) else if i n s t ( a l i s t
pat)
= arg
then z ÷ alist else z ÷ NOMATCH. Of course.we derivation
are omitting the details
of synthesis;
the actual
is somewhat more lengthy.
The call z I ÷ m a t c h ( p a t I argl ) must be replaced by z I * mateh2(pat I arg I alist). Having m o d i f i e d the p o r t i o n of the p r o g r a m already constructed~ the system continues
the synthesis.
A recursive
satisfies
z 2 ÷ match2( a~_~2 arg 2 z I) Goal 9; the balance of the synthesis
velopment
of the first program,
ing of the specifications
including
call
parallels
the further
the de-
strengthen-
to ensure that the match found is the
most general possible. The value of the second recursive strengthened
top-level
The p r o g r a m derived
call is shown to satisfy the
goal.
from this alternative
synthesis
is
267
m a t c h ( p a t arg) = m a t c h 2 ( p a t arg A) where m a t e h 2 ( p a t arg alist)
=
if constexp(pat) then if ~ a t = arg then z ÷ a!ist else z ÷ N O ~ T C H else if var(pat) then if inst(a!ist pat)
= pat
then z ÷ a l i s t ° p a i r ( p a t arg) else if inst(alist pat)
= arg
then z + alist else z ÷ N O M A T C H else if atom(ar$) then z ÷ N O M A T C H else z I + m a t c h 2 ( p a t I ~ g l
alist)
if z I = N O M A T C H then z ÷ N O M A T C H else z + m a t c h 2 ( p a t 2 IV.
P R O G R A M MODIFICATION:
arg 2 Zl).
THE U N I F I C A T I O N A L G O R I T H M
In general, we cannot expect a system to synthesize an entire complex p r o g r a m from scratch, as in the p a t t e r n - m a t c h e r example.
We
w o u l d like the system to r e m e m b e r a large body of programs that have been s y n t h e s i z e d before and the m e t h o d by w h i c h they were constructed.
When p r e s e n t e d with a new problem, the system should
cheek to see if it has solved a similar p r o b l e m before.
If so, it
may be able to adapt the t e c h n i q u e to the old p r o g r a m to make it solve the new problem. There are several difficulties i n v o l v e d in this approach.
First,
we cannot expect the system to r e m e m b e r every detail of every synthesis in its history. and w h a t to forget.
Therefore,
it must decide what to r e m e m b e r
Second, the s y s t e m must decide w h i c h p r o b l e m s
are similar to the one being considered, larity is somewhat ill-defined.
and the concept of simi-
Third, having found a similar
program, the system must somehow modify the old synthesis to solve the new problem.
We will concentrate only on the latter of these
268
problems
in this discussion.
program modification RobinsonTs A.
We will illustrate
a technique
as applied to the synthesis
unification
for
of a version of
algorithm.
The Specifications
Unification matching
may be considered
in which variables
to be a generalization
of pattern
appear in both Pat and arg.
lem is to find a single substitution
(called a "unifier")
when applied to both pat and arg , will yield identical for instance,
The probthat,
expressions.
if
pat = (X A) and arg : (B Y)~ then a possible
unifier of pat and arg is
(). The close analogy between pattern-matehing clear.
and unification
If we assume that the system remembers
marcher we constructed goal structure unification
in Sections
The specifications cal notation,
the pattern-
III-2 through
involved in the synthesis,
problem is greatly
is
III-5 and the
the solution to the
facilitated.
for The unification
algorithm,
in mathemati-
are
unify(p_a~ arg) I Find z such that inst(z pat) = inst(z arg) else z : NOPLATCH B°
The Analogy with the Pattern-Matcher
For purposes
of comparison
match(pat
we rewrite the match specifications:
arg) =
Find z such that inst(z pat)
= arg
else z = NOMATCH. In formulating
the analgy, we identify unify with match, pat with
pat, the ar___ggin unify with arg~
(~at arg) with arg, and inst(z arg) also
In accordance
alter the goal structure example,
with this analogy,
we must systematically
of the pattern-matcher
Goal 5 becomes modified to read
synthesis.
For
269
Find z such that inst(z pat I) = inst(z argl ) and inst(z pat 2) : inst(z aFg2 ). In constructing the p a t t e r n - m a t c h e r , we had to break down the synthesis into various oases.
We will try to m a i n t a i n this case
structure in f o r m u l a t i n g our new program.
Much of the savings
derived from m o d i f y i n g the p a t t e r n - m a t c h e r instead of constructing the u n i f i c a t i o n a l g o r i t h m from scratch arises because we do not have to deduce the ease splitting all over again. A difficult step in the p a t t e r n - m a t e h e r synthesis i n v o l v e d the s t r e n g t h e n i n g of the specifications
for the entire program.
We
added the condition that the match found was to be "most general." In f o r m u l a t i n g the u n i f i c a t i o n synthesis, we will i m m e d i a t e l y s t r e n g t h e n the s p e c i f i c a t i o n s in the analogous way. thened s)ecifications
The streng-
read
u n i f y ( p a t argi =
.........
Find z such that {inst(z pat)
= inst(z erg) and
for all s [if inst(s pat)
: inst(s arg)
then s = z°w for some w]} else z = NOMATCH. Following Robinson
[1965], we will refer to a unifier s a t i s f y i n g
the new c o n d i t i o n as a "most general unifier." Note that this a l t e r a t i o n process is p u r e l y syntactic;
there is
no reason to assume that the altered goal structure corresponds to a valid line of reasoning.
For instance,
simply because a-
chieving Goal 2 in the p a t t e r n - m a t c h i n g p r o g r a m is useful in a c h i e v i n g Goal i does not n e c e s s a r i l y imply that a c h i e v i n g Goal 2' in the u n i f i c a t i o n a l g o r i t h m will have any bearing on Goal I' The extent to w h i c h the r e a s o n i n g carries over depends on the soundness of the analogy.
If a portion of the goal structure
proves to be valid, the c o r r e s p o n d i n g segment of the p r o g r a m ean still remain; otherwise, we must construct a new p r o g r a m segment.
C.
The M o d i f i c a t i o n
Let us examime the first two cases of the u n i f i c a t i o n synthesis in full detail,
so that we can see exactly how the m o d i f i c a t i o n
270
process works.
In the pattern-macher,
we generated
the subgoal
(Goal 2) Find z such that inst(z pat) The corresponding
unification
subgoal is
Find z such that inst(z pat) In the p a t t e r n - m a t c h e r where pat=arg.
= arg. = inst(z
arg).
we first considered the case constexp(pat)
In this case the corresponding
p r o g r a m segment
is
z ÷ A. This
segment also satisfies the m o d i f i e d ins t(A a~9~)
The system must also
: inst(A
goal in this case, because
arg).
check that i is a most general
for any s [if inst(s pat)
: inst(s
unifier,
i.e.,
ar__~g)
then s = A°w for some w]. This
condition
is easily
satisfied,
case, the p r o g r a m segment
is correct without
The next case does require marcher,
taking w=s.
Thus,
any modification.
some modification.
In the pattern-
when c o n s t e x p ( p ! ~) is true and p!tCarg,
be NOMATCHo
However,
in this
z is taken to
in this case in the u n i f i c a t i o n
algorithm
we must check that inst(s pat)
~ inst(s arg),
i.eo~ nat
~ inst(s ar$)
for any s, in order to take z=NOMATCH. may contain variables, must therefore way.
In this case
the u n i f i c a t i o n
this condition
try to achieve
Since for u n i f i c a t i o n cannot be satisfied.
the specifications
(where constexp(pat)),
a l g o r i t h m reduce
arg
We
in some other
the specifications
of
to
Find z such that {pat = inst(z
arg) and
for any s [if pat = inst(s
arg)
then s = z°w for some w]} else z = NOMATCH. These
specifications
pattern-marcher
are p r e c i s e l y
with ~
the specifications
and arg reversed;
of the
consequently,
we can
invoke m a t c h ( a r g ap_a~) at this point in the program. The balance manner.
of the m o d i f i c a t i o n
The derived
can be carried out in the same
unification
algorithm is
271
unify(pat
arg)
=
if ~constexp(pat) t h e n if pat
= arg
then
z ÷ A
else
z ÷ match(arg
pat)
else if v a t ( p a t ) t h e n if o c c u r s i n ( p a t
arg)
t h e n z + NO[%ATCH else
z + pair(pat
arg)
else if a t o m ( a r g ) then
z ÷ unify(arg
pat)
else
zI ÷ unify(patl
ar~l)
if z I = N O M A T C H then z ÷ N O M A T C H else z 2 ÷ u n i f y ( i n s t ( z I a ~ 2 )
i n s t ( z I arg2))
if z 2 = N O M A T C H then z + NOMATCH else z + Zl°Z 2. Recall that occursin(pat
arg) m e a n s that pat o c c u r s
in a r & as a
subexpression. The t e r m i n a t i o n
of this p r o g r a m
is c o n s i d e r a b l y
to p r o v e t h a n was the t e r m i n a t i o n ever,
the c o n s t r u c t i o n
of the u n i f i c a t i o n
algorithm
pattern-matcher
is m u c h e a s i e r t h a n the i n i t i a l
pattern-mateher
itself.
N o t e that the p r o g r a m we have branch.
more
difficult
of the p a t t e r n - m a t c h e r .
constructed
f r o m the
synthesis
contains
How-
of the
a redundant
The e x p r e s s i o n if pat = arg then z + A else z ÷ m a t e h ( a F ~
pat)
c o u l d be r e d u c e d to z ÷ mateh(arg Such i m p r o v e m e n t s phase.
pat).
w o u l d not be m a d e until
a later optimization
272
V.
DISCUSSION
A.
l~l~entat~on
Implementation way.
of the techniques
presehted
in this paper is under-
Some of them have already been implemented.
quire further development
before
Others will re-
an implementation
will be possible.
We imagine the rules,
used to represent
reasoning tactics,
expressed
as programs
in a PLANNER-type
language.
mentation
is in QLISP
(Reboh and Sacerdoti
to be
Our own imple-
[1973]).
Rules are
summoned by pattern-directed
function invocation.
Worlds have been implemented
using the context mechanism of QLISP,
which was introduced structure
necessary
actual splitting
in QA4 (Rulifson e t a l . for the hypothetical
[1973]) of INTERLISP
thetical world-splitting experiment
II, or the loop-free The generalization a difficult
strategies
Similarly,
[1974]).
(Bobrow The hypo-
but we have yet to
for controlling
it.
simple programs
the program to sort two variables
segments
of the pattern-matcher
of specifications
technique
develop heuristics tion.
(Teitelman
system is capable of producing
as the union function,
environments
has been implemented,
with the various
The existing
The control-
of the control path as well as the assertional
data base~ is expressed using the multiple and Wegbreit
[1972]).
worlds, which involve an
(Seotions
to apply without
to regulate our approach
such
from Part
from Part III.
II-4 and III-5) is
its going astray.
We will
it in the course of the implementato conjunctive
goals
(Section !I-5)
needs further explication. \
B.
H~storical
Early'work Waldinger bilities
Context
and Contemporary
in program synthesis
Research
(e.g. Simon [19631, Green [1969],
and Lee [1969]), was limited by the problem-solving of the respective
formalisms
involved
lem Solver in the case of Simon, resolution case of the others).
(the General Prob-
theorem proving in the
Our paper on loop formation
get [1971]) was set in a theorem-proging attention to the implementation
problems.
capa-
framework,
(Hanna and Waldinand paid little
273
It is typical of contemporary program synthesis work not to attempt to restrict itself to a formalism;
systems are more likely
to write programs the way a human programmer would write them. For example, the recent work of Sussman [1973] is modelled after the debugging process.
Rather than trying to produce a oorrect
program at once, Sussmants system rashly goes ahead and writes incorrect programs which it then proceeds to debug.
The work re-
ported in Green et al. [1974] attempts to model a very experienced programmer.
For example,
if asked to produce a sort program, the
system recalls a variety of sorting methods and asks the user which he would like best. The work reported here emphasizes reasonging more heavily than the papers of Sussman and Green.
For instance,
in our synthesis
of the pattern-marcher we assumed no knowledge about patternmatching itself.
Thus our system would be unlikely to ask the
user what kind of pattern-matcher he would like. do assume extensive knowledge of lists,
Of course we
substitutions,
and other
aspects of the subject domain. Although Sussman's debugging approach has influenced our treatment of program modification and the handling of simultaneous goals, we tend to rely more on logical methods than Sussman. Furthermore,
Sussman deals only with programs that manipulate
blocks on a table; therefore he has not been forced to deal with problems that are more crucial in conventional programming, as the formation of conditionals
such
and loops.
The work of Buchanan and Luckham [1974]
(see also Luckham and
Buchanan [1974]) is closest to ours in the problems it addresses. However, there are differences in detail between our approach and theirs: The Buchanan-Luckham
specification language is first-order pre-
dicate calculus; ours allows a variety of other notations. method of forming conditionals
uses contexts and the Bobrow-Wegbreit Buchanan-Luckham
Their
involves an auxiliary stack; ours control structures.
In the
system the loops in the program are iterative,
and are specified in advance by the user as "iterative rules," whereas in our system the
(recursive)
loops are introduced by
the system itself when it recognizes a relationship between the
274
top-level goal and a subgoal.
The treatment of programs with
side effects is also quite different in the Buchanan-Luckham system, in which a model of the world is maintained and updated, and assertions
are removed when they are found to contradict
other assertions
in the model.
Our use of contexts allows the
system to recall past states of the world and avoids the tricky problem of determining when a model is inconsistent.
I should
be added that the implementation of the Buchanan-Luekham system is considerably more advanced than ours. C.
Conclusions and Future Work
We hope we have managed to convey in this paper the promise of program synthesis, without giving the false impression that automatic synthesis is likely to be immediately practical.
A compu-
ter system that can replace the human programmer will very likely be able to pass the rest of the Turing test as well. Some of the approaches to program synthesis that we feel will be most fruitful in the future have been given little emphasis this paper because they are not yet fully developed.
in
For example,
the technique of program modifieation, which occupied only one small part of the current paper, we feel to be central to future program synthesis work.
The retention of previously
constructed
programs is a powerful way to acquire and store knowledge.
Further-
more program optimization and program debugging are just special cases of program modification. Another technique that we believe will be valuable is the use of more visual or graphic representations, properties
that convey more of the
of the object being discussed in a single structure.
For example, we have found that the synthesis of the pattern matchef could be made shorter and more intuitive by the introduction of the substitution notation of mathematical resent an expression P as P(Xl,...,Xn),
logic.
If we rep-
where Xl,...,x n is the
complete list of the variables that oeeur in P, then P(tl,...,t n) is the result of substituting variables x i by terms t i in P. can then formulate the problem of pattern matching as follows:
We
275
Let a ~
= pat Lxl,..,,xn)
Find z such that if ar$ = pat(tl,...,t n) for some tl,...,t n then z = { ,...,} else x = NOMATCH. Note that this specification
includes
implicityly
the restriction
that the match found be a most general match, because each of the variables
x i actually occurs in pat.
tions do not need to be strengthened
Therefore,
the specifica-
during the course of the
synthesis. We hope to experiment of applications.
with visual representations
Clearly,
in a variaty
while the reasoning required is simpli-
fied by the use of pictorial notation,
the handling of innovations
such as the ellipsis notation in an implementation
is correspond-
ingly more complex. ACKNOWLEDGEMENTS We wish to thank Robert Boyer, land for giving detailed
Bertram Raphael,
critical readings
would also like to thank Nachum Dershowitz, Fikes, Akira Fusaoka, Carl Hewitt,
Shmuel Katz, David Luckham,
We
Peter Deutsch~ Richard
Cordell Green and his students,
breit for conversations this paper.
and Georgia Suther-
of the manuscript.
Irene @reif,
Earl Saeerdoti,
that aided in formulating
and Ben Weg-
the ideas in
We would also like to thank Claire Collins
and Hanna
Z£es for typing many versions of this manuscript. This research was primarily dation under grants GJ-36146
sponsored by the National and GK-35493.
Science
Foun-
276
BIBLZOGRAPHY i. Balzer, R. M. (September 1972), "Automatic Programming," Institute Technical Memo, University of Southern California/Information Sciences Institute. 2. Biermann, A. W., R. Baum, R. Kirisknasw~my and F. E. Petry (October 1973)~ '~Automatic Program Synthesis Reports," Computer and Information Sciences Technical Report TR-73-6, Ohio State University. 3. Bobrow~ D. G. and B. Wegbreit
(August 1973), "A Model for Control
Structures for Artificial Intelligence Pr0grammin ~ Languages," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence, 253, Stanford University, 4. Boyer, R. S. and J
246-
Stanford, California.
S. Moore (1973), "Proving Theorems about
LISP Functions," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence. 5. Buchanan~ J. R. and D. C. Luckham (March 1974), "On Automating the Construction of Programs," Memo, Stanford Artificial Intelligence Project, Stanford~ California. 6. Bundy, A.
(August 1973), "Doing Arithmetic with Diagrams," Adv.
Papers 3d. Intl. Conf. on Artificial Intelligence, 130-138, Stanford University,
Stanford, California.
7. Floyd, R. W., (1967), "Assigning Meanings to Programs," Proc. of a Symposium in A~plied Mathematics, Vol. 19, (J. T. Schwartz, ed.), Am. Math. Sock, 19-32. 8. Green, C. C. (May 1969), "Application of Theorem Proving to Problem Solving~" Proc. Intl. Joint Conf. on Artificial Intelligenoe~ 219-239. 9. Green~ C. C., R. Waldinger, R. Elschlager, D. Lenat, B. McCune, and D. Shaw~
(1974), "Progress Report on Program-Understanding
Programs~" Memo, Stanford Artificial Intelligence Project, Stanford, California. i0. Hardy, S. (December 1973), "Automatic Induction of LISP Functions," Essex University.
277
ii. Hewitt, C. (1972)~ "Description and Theoretical Analysis Schemata) of PLANNER:
(Using
A Language for Proving Theorems and Mani-
pulating Models in a Robot," AI Memo No. 251, MIT, Project MAC, April 1972. 12. Hoare, C. A. R., (October 1969), "An Axiomatic Basis for Computer Programming," C. ACM 12, i0, 576-580, 583. 13. Kowalski, R. (March 1974), "Logic for Problem Solving," Memo No. 75, Department of Computational Logic, University of Edinburgh, Edinburgh. 14. Luckham, D. and J. R. Buchanan (March 1974), "Automatic Generation of Programs Containing Conditional Statements," Memo, Stanford Artificial Intelligence Project, Stanford, California. 15. Manna, Z. and R. Waldinger (March 1971), "Toward Automatic Program Synthesis," Comm. ACM, Vol. 14, No. 3, pp. 151-165. 16. McCarthy, J. (1962), "Towards a Mathematical Science of Computation," Prec. IFiP Congress 62, North Holland, Amsterdam, 17. Reboh, R. and E. Saeerdoti
21-28.
(August 1973), "A Preliminary QLISP
Manual," Tech. Note 81, Artificial Intelligence Center, Stanford Research Institute, Menlo Park, California. 18. Robinson, J. A., (January 1965), "A Machine-0riented Logic Based on the Resolution Principle," Jour. ACM, Vol. 12, No. I, 23-41. 19. Rulifson, J. F., J. A. Derksen, and R. J. Waldinger 1972), "QA4:
(November
A Procedural Calculus for Intuitive Reasoning,"
Tech. Note 73, Artificial Intelligence Group, Stanford Research Institute, Menlo Park, California. 20. Simon, H. A., (October 1963), "Experiments with a Heuristic Comuter," Jour. ACM, Vol. I0, No. 4, 493-506. 21. Sussman, G. J. (August 1973), "A Computational Model of Skill Acquisition," Ph.D. Thesis, Artificial Intelligence Laboratory, M.I.T., Cambridge, Mass. 22. Teitelman, W., (1974), INTERLISP Reference Manual, Xerox, Pale Alto, California. 23. Waldinger, R. J., and R. C. T. Lee
(May 1969), "PROW:
A Step To-
ward Automatic Program Writing," Prec. Intl. Joint Conf. on Artificial Intelli~enee,
241-252.
A NEW APPROACH TO PROGRAM TESTING
James C. King, IBM T. J. Watson Research Center, Yorktown Heights, New York, USA
ABSTRACT: The current approach for testing a program is, in principle, quite primitive. Some small sample of the data that a program is expected to handle is presented to the program. If the program produces correct results for the sample, it is assumed to be correct. Much current work focuses on the question of how to choose this sample. We propose that a program can be more effectively tested by executing it "symbolically". Instead of supplying specific constants as input values to a program being tested, one supplies symbols. The normal computational definitions for the basic operations performed by a program can be expanded to accept symbolic inputs and produce symbolic formulae as output.
If the flow of control in the program is completely independent of its input parameters, then all output values can be symbolically computed as formulae over the symbolic inputs and examined for correctness.
When the control flow of the program is input dependent, a case analysis can be performed
producing output formulae for each class of inputs determined by the control flow dependencies. Using these ideas, we have designed and implemented an interactive debugging/testing system called EFFIGY.
INTRODUCTION
As tools for realizing correct programs, program testing and program proving are the ends of a spectrum whose range is the number of times the program must be executed. To establish its correctness through testing, one must execute the program at least once for all possible unique inputs; usually an infinite number of times. To establish its correctness through a rigorous correctness proof, one need not execute the program at all; but he may be faced With a tedious, if not difficult, formal analysis. These two extreme points of the spectrum offer other contrasts as well. Correctness proofs usually ignore certain realities encountered in actual test runs, for example, machine dependent details like overflow and precision. (One notable effort to bring machine dependent issues into correctness proofs is the recent thesis by Sites [7]). On the other hand one may finish a proof of correctness, but seldom do we ever finish testing a program, Normal testing and correctness proofs also differ in the degree to which the user is required to supply a formal specification of "correct" program behavior. While a careful statement of correctness may be recommended for program testing, it is not required. A user
279
may choose an interesting input case and then decide a posteriori, in this specific case, if the output appears to be correct. In a formal proof of correctness one must have a careful program specification.
A testing tool is described in this paper which allows one to choose intermediate points on the spectrum between individual test runs and general correctness proofs. One can perform a single "symbolic execution" of the program that is equivalent to a large (usually infinite) number of normal test runs. Test run results can not only be checked by careful manual inspection but if a machine interpretable program specification is supplied with the program it can be used to automatically check the results. Furthermore, by varying the degree to which symbolic information is introduced into the symbolic execution one can move from normal execution (no symbolic data) to a symbolic execution which, in some cases, provides a proof of correctness.
SYMBOLIC EXECUTION
The notion of symbolically executing a program follows quite naturally from normal program execution. First assume that there is a given programming language and a normal definition of program execution for that language. This execution definition must be used for production executions but an alternative symbolic execution semantics for the language can be defined to great advantage for debugging and testing. The individual programs themselves are not to be altered for testing. The definition of the symbolic execution must be such that trivial cases involving no symbols should be equivalent to normal executions and any information learned in a symbolic execution should apply to the corresponding normal executions as well.
An execution of a procedure becomes symbolic by introducing symbols as input values in place of real data objects (e.g., in place of integers and floating point numbers). Here "inputs" is to be taken generally meaning any data external to the procedure, including that obtained through parameters, global variables, explicit READ statements, etc.
Choosing symbols to represent procedure inputs
should not be confused with the similar notion of using symbolic program variable names. A program variable may have many different specific values associated with it during a particular execution whereas a symbolic input symbol is used in the static mathematical sense to represent some unknown yet fixed value. Values of program variables may be symbols representing the non-specific procedure inputs.
Once a procedure has been initiated and given symbolic inputs, execution can proceed as in a normal execution except when the symbolic inputs are encountered. This occurs in two basic ways: computation of an expression involving procedure inputs, and conditional branching dependent on procedure inputs.
Computation of Expressions The programming language has a set of basic computational operators such as addition (+), multiplication (*), etc. which are defined over data objects such as integers. Each operator must be extended to
280
deal with symbolic data. For arithmetic data this can be done by making use of the usual relationship between arithmetic and algebra. The arithmetic computations specified by these operators can be "delayed" or generalized by the appropriate algebraic formula manipulations. For example, suppose the symbolic inputs a and /3 are supplied as argument values to a procedure with formal parameter variables A and B. Denote the value of a program variable X by v(X). Then initially, v(A) = a and v(B) =/1. If the assignment statement C := A + 2*B were symbolically executed in this context C would be assigned the symbolic formula (a + 2*/3). The statement D := C - A , if executed next, would result in v(D) = 2"/I.
Similar symbolic generalization can be done, at least in theory, for all computational operations in the programming language. In the most difficult case, one could at least record in some compact notation the sequence of computations which would have taken place had the arguments been non-symbolic. The success in doing this in practice depends upon how easily these recordings can be read and understood and how easily they can be subsequently manipulated and analyzed mechanically.
Conditional Branching Consider the typical decision-making program statement, the IF statement, taking the form: IF B THEN $I ELSE Sz, where B is some Boolean valued expression in the language and $1 and $2 are other statements. Normally, either v(B) = true and statement S~ is executed or v(B) = false and statement Sz is executed. However, during a symbolic execution v(B) could be true, false or some symbolic formula over the input symbols. Consider the latter case. The predicates v(B) and , v ( B ) represent complementary constraints on the input symbols that determine alternative control flow paths through the procedure. For now, this case is called an "unresolved" execution of a conditional statement. The notion will be refined as the presentation develops. Since both alternatives paths are possible the only complete approach is to explore both: the execution forks into two "parallel" executions, one assuming v(B), the other assuming ~ v(B),
Assume the execution has forked at an unresolved conditional statement and consider the further execution for the case where v(B),
The execution may arrive at another unresolved conditional
statement execution with associated boolean, say C. Expressions v(B) and v(C) are both over the procedure input symbols and it is possible that either v(B) = v(C) or v(B) ~ ~v(C). Either implication being true would show that the assumption made at the first unresolved execution, namely v(B), is strong enough to resolve the subsequent test, namely to show that either v(C) or , v ( C ) .
Because the assumptions made in the case analysis of one unresolved conditional statement execution may be effective in resolving subsequent unresolved statement executions they are preserved as part of the execution state, along with the variable values and the statement counter, and are called the "path condition*' (denoted pc). At the beginning of a program execution the pc is set to true. The revised
281
rule for symbolically executing a condition statement with associated Boolean expression B is to first form v(B ) as before and then form the expressions: ~
v(B)
~ ~v(B). If pc is not identically false then at most one of the above expressions is true. If the first is true the assumptions already made about the procedure inputs are sufficient to eompletely resolve this test and the exe,cution follows only the v(B) case. Similarly if the second expression is true it follows the ~v(B) case.
Both of these cases are considered "resolved" or non-forking executions of the conditional
statement.
The remaining case when neither expression is true is truly an unresolved (forking) execution of the conditional statement. Even given the earlier constraints on the procedure inputs (PC), v(B) and ~v(B) are both satisfiable by some non-symbolic procedure inputs. As discussed above, unresolved conditional statement executions fork into two parallel executions. One when v(B) is assumed, in which case the pc is revised to pc ^ v(B), the other w h e n ~ v ( B ) is assumed and then oe becomes pc ^ ~v(B). Note that the forking is a property of a conditional statement execution not the s t a t e m e n t itself.
One
execution of a particular statement may be resolved yet a later execution of the same statement may not.
The pc is the accumulator of conditions on the original procedure inputs which determine a unique control path through the program.
Each path, as forks are made, has its own pc.
No pe is ever
identically f a l s e since the original pc is true and the only changes are of the form pc := pc ^ q and those only in the case when pc ^ q is satisfiable ((pc ^ q) = , (PC = ~q) which is satisfiable if pc ~ - q is not a theorem). Each path caused by forking also has a unique pc since none are identically false and
they all differ in some term, one containing a q the other a ~ q.
S Y M B O L I C E X E C U T I O N TREE
One can characterize the symbolic execution of a procedure by an "execution tree".
Associate with
each program statement execution a node and with each transition between statements a directed arc connecting the associated statement nodes.
For each forking (unresolved) conditional statement the
associated execution node has more than one arc leaving it labeled by and corresponding to the path choices made in the statement.
In the previous discussion of I F statements there were two choices
corresponding to v(B) and , v ( B ) . The node associated with the first statement of the procedure would have no incoming arcs and the terminal statement of the procedure ( R E T U R N or END statement) is represented by a node with no outgoing arcs.
Also associate the complete current execution state, i.e., variable values, statement counter, and pc with each node.
In particular, each terminal node will have a set of program variable values given as
formulae over the procedure input symbols, and a pc which is a set of constraints over the input
282
symbols characterizing the conditions under which those variable values would be computed, A user can examine these symbolic results for correctness as he would normal test output or substitute them into a formal output specification which should then simplify to true.
The execution tree for a program will be infinite whenever the program contains a loop for which the number of interations is dependent, even indirectly, on some procedure inputs. It is this fact that prevents symbolic execution from directly providing a proof of correctness technique.
Symbolic
execution is indeed an execution and at least in this simplest form described here provides an advanced testing methodology. BurstaU [1] has independently developed the notion of symbolic execution and
added the required induction step needed to have a complete proof of correctness method. Deutsch [2], also independently, developed the notion of symbolic execution as an implementation technique for an interactive program prover based on Floyd's method [3]. In fact, one can see the basic elements of the notion of using symbolic execution as the basis for a correctness method in the earlier work of Good [4]. The author and his colleagues have been pursuing the idea of symbolic execution in its own right as a debugging/testing technique. A particular system we have built called EFFIGY is described briefly in the next section.
EFFIGY - - AN INTERACTIVE SYMBOLIC EXECUTOR
The author and his colleagues at IBM Research have been developing an interactive symbolic execution system for testing and debugging programs written in a simple P L / I style programming language. The language is restricted to integer valued variables and vectors (one dimensional arrays). It has many interactive debugging features including:
execution tracing, break-points, and state saving/restoring.
Of course, it provides symbolic execution and uses a formula manipulation package and theorem prover developed previously by the author [5, 6].
The generat facilities and capabilities available are all that is of real interest and these are perhaps simplest and most economically explained by a system demonstration. An APPENDIX is included which shows an actual script (annotated in italics) from such a demonstration. A method for exploring execution trees with their multitude of forks and parallel executions is up to the user. He is provided the ability to choose particular forks at unresolved conditional statement executions (via go true, go false,
and a s s u m e )
and also has the state save/restore ability so that he may return to
unexplored alternatives later. We are currently experimenting with various "test path-managers" which would embody some heuristics for automating this process, exhaustively exploring all the "interesting" paths, As with previous testing methods the crucial .issue is: if one cannot execute all cases, which ones should he do; which are the interesting ones.
We are also working on practical methods for dealing with more odvanced programming language
283
features such as pointer variables. While, as mentioned above, most such enhancements are straightforward "in theory" many offer fundamental problems in practice.
CONCLUSION
Interactive debugging/testing systems have shown themselves to be powerful, useful tools for program development. A symbolic execution capability added to such a system is a major improvement. The normal facilities are always available as a special case, In addition, the basic system components of a symbolic executor provide a convenient toolbox for other forms of program analysis, including program proving, test case generation, and program optimization. Since such a system does offer a natural growth from today's systems, an evolutionary approach for achieving the systems of tommorrow is available. Valuable user experience and support is also provided. While practical use of the EFFIGY system is still quite limited, considerable insight into and understanding of the general notion of symbolic execution has been gained during its construction.
ACKNOWLEDGMENTS
The colleagues at IBM Research collaborating with me in this work are: S. M. Chase, A. C. Chibib, J. A. Darringer, and S. L. Hantler. They have all contributed significantly to the ideas presented here and to the design and implementation of our EFFIGY system.
We also appreciate the support and
encouragement received from D. P. Rozenberg, P. C. Goldberg, and P. S. Dauber. The manuscript was typed by J. M. Hanisch.
REFERENCES
[1]
Burstall, R. M. Program proving as hand simulation with a little induction, IFIP Congress 74 Proc., Aug. 1974, pp. 308-312.
[2]
Deutsch, L.P. An interactive program verifier, Ph.D. dissertation, Dept. Comp. Sci., Univ. of Calif., Berkeley CA., May 1973.
[3]
Floyd, R.W. Assigning meanings to programs, Proc. Symp. Appl. Math., Amer. Math. Soc., vol. 19, pp. 19-32, 1967.
[4]
Good, D.I. Toward a man-machine system for proving program correctness, Ph.D. dissertation, Comp. Sci. Dept., Univ. of Wisc., Madison, Wisc., June 1970.
[5]
King, J.C. and Floyd, R.W. An interpretation oriented theorem prover over integers, Journal of Comp. and Sys. Sci., vol. 6, no. 4, August 1972, pp. 305-323.
[6]
King, J.C. A program verifier, IFIP Congress 71 Proc., Aug. 1971, pp. 235-249.
[7]
Sites, R.L. Proving that computer programs terminate cleanly, Ph.D. dissertation, Comp. Sei. Dept., Stanford Univ., Stanford, CA., May 1974.
284
APPENDIX
A script from an actual EFFIGY session is shown below. The user's inputs are in lowercase letters and the system responses are in uppercase letters. To prevent any possible confusion the symbol " ~ " is shown here to ~he left of the user inputs. Explanatory comments, in italic letters, have been added as a right hand column.
When EFFIGY is initially invoked it is in an "immediate" mode and will execute statements as they are typed.
Any statement executed in this context is considered part of a main initial procedure called
MAIN. The concept of the MAIN procedure and the concept of immediate execution are distinct since statements can also be executed in an immediate mode in the context of other procedures. MAIN is unique in that it has an immediate mode only and it is the onty procedure privileged to execute the managerial system commands. Programs are made available to EFFIGY for stored program execution by declaring them, in MAIN, with the PROC statement similar to the way that internal procedures are declared in PL/I. However, EFFIGY does consider all procedures as EXTERNAL and they must be declared in MAIN.
Procedures are tested by a CALL from MAIN. SymboIic inputs can be supplied by enclosing a symbol string in double quotes, e.g., "a", "Dog". These symbolic constants can be used in most places instead of integer constants.
The system responses drop the quotes since the context always makes the
distinctions between different uses of identifiers quite clear.
Values always involve the input symbols
and never program variable names. Formulae are stored internal to EFFIGY in a "normalized" form and some of the expressions may appear quite different from what one might expect (e.g., A < @ will be typed out as A - n > - q ). The formulae are also kept in a simplified form (e.g,, 2 * B = 4 is stored as
B-2=0). EFFIGY runs on CMS under V M / 3 7 0 on an IBM/370 model 168. The CMS filing system and context editor are used as an integral part of EFFIGY for creating, changing, and storing procedures and command files. The INPUT command directs EFFIGY to read its input from the designated file (files have two part names in CMS) instead of directly from the user's terminal. As procedures are entered into E FFIGY (by a PROC ,.. END declaration) the statements are sequentially numbered. These statement numbers are used to reference particular points in the procedure for inserting breakpoints, turning tracing on and off, etc.
~effigy EFFIGY READY ~ e d i t a b s o l u t e effigy; NEW FZLE: ~input ~absolute: proc(i,o); dcl (i,o) integer; if i-1)) TRUE BRANCH O=-A STOPPED BETWEEN 3 AND 5 ~display variables, assumption; IN A B S O L U T E I:A O:-A ((A~>-I))
~ r e s t o r e I; STATE I RESTORED. ~ g o false; ((A~>-I))
'
terminal input. Declare a variable in MAIN. Try a numeric execution. Result of display statement.
All tracing on in proc. "absolute'. Set back m MAIN. Try a symbolic input "a". Each statement execution is traced by printing it. Evaluated result of I 0 T H E N DO; ((i~ 0 T H E N DO; ((A~
3 12 -
12 . . . .
4 *---
13
-
14
I
t
5
I g (a) ;
f(b,c)
is sind
yon c der
G,
kosmt
man
in f(c,c) nach
.2
enthaltene
lnalcg ergibf
sich:
b - c}
schliesslich
Ton A angedeuted ,=
enthaltenen
ersetzt is%. Zur der
neuerliohen
A zur~ck, da dutch die Neudefinition in
b
gSnzlich
and
sin~)
ges~ss
smmtlicbe
(wie
dass sp~tere Workommnisse in terpretieren
darin,
h
gehen
c := g (a)
b - c
wird,
und
in
dutch c := b ersetzen
lquival~zinformaticn,
vo.
bet.its
welter
Anueisung
.,. B6 = .=* 6,
{,2-~ ~ #2-A].
~terpretation
a -> f(b,c)
in die
wobei darch die Unterstreichang
Zustands~eschreib~ng
mieder
wieder
•a.A7 = A_,i.¥ dass
is%
yon A in B e z u g auf
speichern.
wiederus:
,a.8 =
,o.~
(mine
indem sie
und dater b -> g(a)
Information
d~r Bedeatu~g,
Vorkcmmnisse
der
Zustandsheschreibung,
Formal:
sich,
San kann jedoch einen
~olgenden
dies
...? = ,,.B8 Wenn B in Bezug auf $,
Ferner
wird,
ist
~ ,o.¥}
Information die
darauf wird diese Eintragung
,~ = { b -> g(a)}.
1
"o
GemMss
Ausdr @cke
[$o-~
wird).
der
verfngbar)
die
Opti aisier un gsver f a hr en =
Interpretation
Zastandsheschreibung aQfgencmmen.
verf~gbar
das
ange.andt
,..A T.
a
Neudefinition
,o.~.
wird auf einen Modal angewandt,
|it
is
indmm
Kapitel
Boole 'sche
$o.e
zunSchst die Aufnahme
ist
unmittelhar die
in
Meiter
Alternativen
gleichbedmu%end
yon
We rden
nicht sich
Zustandsbeschreibung
gege~ene~
Zustandsbeschreibang
wird, in Zeichen
R ekursionsg leich ,ngss ystes mit
der
nims% das Verfahrmn
Information enthaltende)
gleichbedeutmnd
einbmzogen,
bei
welch, Ausdr~cke in welchen Variablen
auf den ~au~tsodal ~ "angewand%" des
wird der ~egrlff miner
der,
Information
won a
wiederum
330
~.~
Das
= ~.~7
= A~..7
I~%erpre%a%icnsverfahren
endet, sobald ein Modu! in ~ezug
auf eine Zus%andsbeschreibu~g auf
die
er
Verfahrers der
A~zahl
yon
I,formafion, Da
ei~e
bereits
ergibt
interprefier%
Mcdule~
s%eht es frei, daf~r eine,
[A ~I I B 5,)
System
entspricht
obeD
vollst~ndig Variahle~
in
blieben.
zu eliminieren
sich notwendig, zu ersefzen
das
vereinfachf,
unver~ndert
~bb.
Um
Idee,
die
is
~u tragen, dass in einem
gebrachf
al!erdi~gs
die Namen
yon
Ausdr~cke ist es
wesent!ichen wurde),
an yon vom
um dem
Programm verschiedene
Variable bezeichnen
f~r die M~glichkeit
werden
!~ferpretatic~sprezesses.
f~r redundante
Namen fragen kBnnen und dass umgekehrt
Versfandnis
zU gehen,
mehr.
yon /I/ @bernommen
Nalen die gleiche
Um ein h~sseres Verfahre~s
zu
wiedergegebene
(wie in /14/ beschrieben)
(eise
gleichen
Namen
Variablen dutch nquivalenzklassen
Ussfand Rec~nung den
6
redundante
Algorifh~us
verschiedene
Modul wieder
ergibt sich das
als darin
value-numbering Variablen
der ~@glichen
neuen
Eli~inationsverfahren
inscfer~e
des
als Besultatprogramm:
keine Redundanzen
¥iedergegebene wurde
Variab!en f~r
Es eufh@If
Bezug
~nd!ichkeit
angewandt auf einen
~o = {P)
in
eingeht.
Wird dies f~r cbiges Beispiel gemach%,
Ausdr~cke
und
der
und aus der Endlichkeit
Zustandsbeschreibung
Flussdiagra~eo Das
ist
~ie ~ndlichkeit
aus
die in die Zustandsbeschreib~ng
fo!gende R e k u r s i c n s g l e i c h u n g S S T S % e m
Diesem
wurde.
sich dabei aufoma%isch
einen Mcdu! ergib%, w~h!en.
xu inferpretieren
im
ohne Das
Folgenden Ausfahrung ers%e
des angegebenen
noch des
Beispiel
kBnnen.
2
~eispiele
eigentlichen betrifft
die
331
Opti~isierung
des
fclgenden
{einem
Programmteiles
unver~ffentlichten ~rtikel yon J.C. Heatt~ entnommen) :
DO I = 1 % 0
N;
IF K(I)
> 0 TH~N M (1)=~(J) ;
END ;
Mit
herr~mmliche~
OFtimisierungsverfahren
sich der nach der ersten Ausf~hrung nicht eliminiereD, Seiteneffekten o~iges ~bh.
8
dam
ResultatFrogramm.
Der
dass
der
in
wurde.
urspr~Dg]ic~
einfache
f~r
duplizierter
Code
ist
Sch!eife
Beis~ie]
und
darin
sell der
kann.
Interyretationsverfahren
der
dieser
Einschl~ss
Zustandsbeschreibung e~alich viele Divergenzgefahr
(do
Bocle,sche gegeben)
der
Ausdruck
wie
das
2
Variable
2 wie
angegebene
yon
ist
ist
nachtr~gliche
er!aubt es, in gevisse~ F~!len ~berflHssige
is
Wahrheitswert in
Wahrheitswerte gibt
yon
Wahrheitswerten
Information
nut
und
i(J)
Entscheidung
entsprechende
es
die in
~inschluss
Propagierung
~ekannt.
Duplizierung
den jeder
dass
werden kann.
zeigen,
Nach
Graphen wiederus
ersichtlich,
durch
dutch
und
werden
symholiscbe
benannte
Aus Abb. 8
Optimisierungsverfahren erweiter%
sei allerdings
das
gerich%ete,
gewHnscht h~chstens einmal ausgefHhrt
Wahrheitswerten
an
erhaltene
nech eine Reduktion durchgef~hrt wurde
Schleifen aufgelRst w~rde
zweite
halber
Ansch!uss
unn~tig
zusammengelegt
Dos
L(J)
Abb. 7 zeigt
Cptimisierung
Vel!s%~ndigkeit
dabei
(wie aus der Syntaxthecrie in
kommen kBnnte.
dutch
Interpre%ationsverfahren beka~nt),
~usdruck
in geeigneter For~ als ~lussdiagramm kodiert.
zeigt
hinzugef~gt,
redundan%e
l~sst
da es sonst 7u dos Programs falsifizierenden
(~nsafe code morion)
Beispiel
(code motion)
ja die
und
wiederum
nut keine
verwendung doyen ~ntscheidungen
zu
erke~nen und damit beis~ielsweise Schleifen zu diagnostizieren, die selbst ge~bten Prograssierern sanchsal Ahb.
9 9ibt ein solches Prog~ammfragsent.
in Abb.
10 zeigt, dass
SchleifeDdurchlSufe
das
~och
Pregramm sinnvoll
des drit%en Sch!eifendurchlauf ~m
Fermalismus
der
f~r
verborgen die
ersten
heiden
ist, und dass es erst nach
unbeding% zu schleifen
symbolischen
bleiben.
Dos Resultatprogra~s
Interpretation
ist
beginnt. dieser
332
Zustand
durch
dam
Auf%re%en
eines
unbedingten,
rekursiven
~oduls leicht z~ erkennen.
~. ~ X ~ A L Z
PARRLLELISIERUNG
Parallelisierung aus
eigent~ich
l~sendes)
ist
~hnlich
dynawisches
Preblem.
Im
der Op%imisierung ein yon Natur (nut
letz%en
~rograms!aufzeit
zu
Abschnitt wurde gezeigt,
zur
dass
sich f~r die Optisisierung ein Tell dieser eines
z~r
~berse%zungs~ei%
Prograsma~lauf werden,
vorwegnehmen
dass
sich,
Paral]e!itMt
zur
Verfahres beruht auf
der
in
~od~larisierungstechnik beschriebe~. gezeigfeD f(1},
Es
wird
her ist
wird) klar~
voneinander erfolgen
Parallelismus
~apitel ist
illustriert
f(100)
2
an
l~sen
Hand
der
und
in
tas
zweiten
/13/
des
Operator
der
l~sst,
angegehenen
/10/
das
~rkennung
Dam Programm berechnet (wobei
gezeigt
~lussdiagramme, -
in
wird
n~her
~bb.
11
die ~usdr~cke f
unhestismt
und summiert diese Ausdr@cke auf. Von der Logik dams
die
uDabh@ngig
kant.
f~r
in
symbolischen
Folgenden
itpliziert
und
Flussdiagramms.
f(2},...,
ge]asse,
Is
vollst~ndig - was eine
La~fzeif
bereits
durchgef~hrten
l~sst.
zumindestens
Para!lelisierungs~rcble,
~ynasik
Zu
Rus~ertung ist
~eigen
~nd ist,
aller
sithin wie
dieser
parallel dieser
~usdr@cke zueinander
"dynasische"
(der im allgemeinen nicht ~on der Programsgr@sse
scnder~ vc~ der L@nge des
Programmablaufs
abh@ng%)
ermittelt
werden kant. Gem~ss ergehen
der sich
in Abschni%t 2 angegebenen die
fclgenden
~ekursio~sgleichunges symbo]ischer
Programs
~ B C a G ~
beschreihenden
Anweisungen
@else als A, B, C, ... angef~hrt,
angede~%e~): ~=
das
(die einzelnen
Modularisierungstechnik werden nur in
wie in
A~h.
11
333
Der
gresse
Vcrteil
Parallelisierung
dieser
ist,
KontrollahhMngigkei%en
-
Rnweisungsausf~hrung aufgezeigt
durch
das
yon
werden.
Programmdarstellung
dass
sind
sie
die
Moduls der
cder
eines
darin
liegt, dann h~n9% die Ausf~hrnng
~usf~hrung
Ahh~ngigkeit
des
wird
(siehe
/7/),
wobei
(d~rch
Anweisung
Modulaufrufes yon
aS.
den
Des leichteren im
in
~olgenden
Er~ei%erung
(Zuweisung
Yariablen
und
lesen, bzw°
Expansionsoperaticnen hinzukcmmen,
dieser
weiteren yon
Um
zu
die ihrer
ist
~ine
Verst~ndnisses
in graphischer
~orm
basierend auf des Konzep% eines Datenflussgraphen
Basisopera%ionen yon
Analyse
~Iternativen
herauszuarbeiten
unumg~nglich.
diese
durchgef~hrf,
Dated
-
im Eereich eines
enthaltenen
Anweisungsausf~hrung
ben~tig%en
Datenflussanalyse halber
betreffenden
eider
~usf~hrung
die
Bereich
eider
~ntscheidungen
Warm immez eine Anweisung
im
die die
Abh~ngigkeiten
vcrangehenden
~oduls liegt, d.h. entweder direkt in einer seiner vorkommt
f@r
genau
(dutch
dieses
Konzeptes
zu
den
Eingabe/Ausgaheoperationen), nach Variablen
schreihen,
Doppelkreis
gekennzeichnet)
noch
die bei Ausf~hrung einen Knoten in einen weiteren
eine
Al~ernative
gegebenen)
Datenflussgraphen
e~pandieren. Ein
Dat~nflussgraph
des
betreffenden
wird gezeichnet,
Medals
nach
der
Schreihcperationen
in
jede Variable
K~sfchenform
beschrieben
(in
Pfeilfcrm gezeichne%
fortlaufende
Kopien
Numerier~ng
nut die let~te dieser d.h.
vcneinander
angefer%ig%en
in darauf folge~den
Ausl~s,ng
der
Entscheidungsvariable ~infachpfeile pote~%ielle
den
~U
oder
V
einen vcn
ergib%
(die
durch
verden)
verwendet
sich
und
iesevorgang allein
werden
yon
der
masgehenden)
Doppelpfeil
angegehen.
Expansionsoperationen
~bergehen
dargeste]Ite Bild. DeE Nebenmodul
einmal
"Aktualit~tswert"
die ers% bei Ausfffhrung
Datenli~ien
Hau~%~odul
Kopien der
and
~uftre%en ein
sind
unterschieden
Expansion
dutch
Datenlinien,
i~ tats~chliche ~nr
h@chstens
Operationen
ist
Lese-
uerden, wohei abet
angegeben) an2ufertigen
dart. F~r Expansionsoperaticnen (f~r
die
werden dart, sodass bei wiederholtes
and @erselben Variablen
besitzt,
indem f~r eine Operation anderen
der
sind
~xpansion
k@nnen. dabei
a hat 2
das
in
~Iternativen,
Ahh.
12
deren
334
entspreche~de wirk]ich
Datenflussgraphen
kommen
kann,
tragen
ist
wiedergegebene i~
yon Eingabe Eine
s ben~%igt einer
gew~hrleisten
ist.
Betrachter
hestimmt),
bereits
Ausf~hrung
erkennen.
werden.
Dies
vorgegeb~nen eigenes Bei
Graphen
deren Ausf~hrung geschrieben
gel~schf
ausgewMhlt
/2/)
de[
Ausf~hru~gs~glichkeiten VerschwindeD
beschreibf°
des Netzes
ausser~alb
dee
Betrachtungen).
ergibt
erweisen.
im
~r
zu
dabei
der
dass
Anweisung
momentan
Die
als
Netz jedem
Graphen" erlaubte, endet
mit
~uswirkungen ~edien,
maximal
der
steht
mBgliche
Programmansf~hrung
yon
oder Nichtvorhandensein
Programs
sich ~
hestimmt
Zu
"precedence
externen
bei
das gegebene
~herzeugen,
ihre
bestehende
Die ~usf~hrung
dutch das Vorha~deDsein
vo, Date~]inien.
das
(die eigentlichen
Parallelit~t
sich
dieser
eingesetzt.
s~mt]iche
das Schreiben
Ausf~hrungen
kann
sohald
~er Weft
wird in
Programmausf~hrung,
se!bst
Wet% an die
werden soll, und eine Eopie des
sfe!!t das Ne%z den
dar,
einfach
ein darin
vorhanden
selhst
ausf~hrbar,
Expansionsoperation
Ausf~hr~Dgszeitpu~k%
selbst,
sind
wird der entsprechende
Datenf]ussgra~hen
der
(vg!.
erzeugt mit der
~xpansion
ist,
scbald ihre Eingabewerte
definiert ist.
Alternative
eDtsFrechenden
parallelen
beginnend
~. Sobald dutch
werden
~tscheidu,gsvariable
anste!]e
einen
ausschliesslich der
und die Operation
zu
werden.
~xpansicnsopera%icnen welche
sie
f~r
m~ssen ~uerst einsal
hergestellt
ausf~hrbar,
jedoch durch
2umindestens
Expansion,
Expa~sio~soperaticn
Dafenf]ussnetz
~usgabevariable
des
dutch
nut
|s := s)
Kcntrollmechanismus
Operationen
nicht
Mod~lal%ernative
(und f~r den sind
geschieh%
Basisoperationen
(hier
~nalyse zeigt ferner,
Kopieroperation
lassen~
den
Zuge
f@r jede
wird, deren Hereitste!l~ng
Datenflussgraphen
auch
zum
serge zu
tiefergehende
eigenen
menschliche~
~amit
~xpansion "interface"
~alle eines Aufrufes der zweiten
~ie Variable
siDd.
zeigt.
und ~usgabevaria~len
abet in /13/ beschriebene)
Einf~hrung
Die
13
bei einer
f~r ein gleichf~rmiges
(gleich A ~ a h l
~odula]~ernafive). dass
~bb.
jede dieser ~]%ernativen
dabei
m@ge sich der die
voneinander
Leser
wiederholten una~hmngig
335
So
anschaulich
u~geeignet n~chster
schri%%
linearisiert gebracht. (nicht
graphische
sein
mag,
wird die graphische
Da~stellung
Lis%e)
einer
der
Operatienen
versehen
werden,
getren~ten
V =
als Menge Graphen
wobei Modulaufrufe die
die
dutch
Eingabe und Ausgabevariablen
Beispiel
ergibt
sich
damit
in
Form:
{So:=O; no:=1; write sz}
Po:=no
(!e_~t m k - X ( y , z )
= w:
f (y,z)) is...
mk...
At
the
the
manipulation
points
w h e r e i t is n e c e s s a r y of
functions,
use
to d i s c u s s is
made
more carefully of
the
Lamhda
notationf(x)
= ...x...
~owever,
these
(see ref.
[13])
let
x = e:
-
uses
will
o f t e n be
"sugared"
) )
g(x)
f = Xx . . . . x...
)
-
( x x . g ( x ) ) (e)
in L a n d i n ' s
style
394
Raps
are
used where %he graph of a function can be explicitly
co,structed -
[dl -> r~ J
explicit definition
[d->
implicit definition
r ] p(d,r)]
+
joining
are the c o ~ n t e r r a r t s of the set concepts.
To
come
now
to
the
problem
of
Semantic
le_~n_it_~o_~n°
d e f i n i t i o n s given beloN will be written in terms from
stated
se@~ntics VDL
domains
to
it is intended
style
models,
ranges.
[16],
in
c o m p o n e n t exists in the i n t e r p r e t i n g discussed
more
construct).
The
full~
below
re la ticn
mathematical
sema,tics
be reviewed
in c c D n e c t i o n
considered.
To begin
in
between
the d i s t i n c t i o n
from
the
which an ezplicit control machine connection functional
(this with
point the
semantics
is
set_e and
ref. [22] is of more i n t e r e s t
and will
with several of the l a n g u a g e
features
with i% is worth showing the definition of
a language which is itself f u n c t i o n a l and thus affords path tc functional
functions
In using the term f_u_n_ct_~onal
to e m p h a s i s e
ref.
of
The
an
easy
semantics.
1.2
Consider
the
language
given by the f o l l o w i n g
rules -
BI expr = inf-expr
B2 inf-expr
B3 var-ref
I var-ref
:: ex~r o F expr
:: id
B~ const =: I~TG
~ const
abstract syntax
395
The
class
op
is
not
existence of a functiom B5 apply-op
Now,
further
specified
than by the
-
: INTG op INTG -> INTG
for a given set of denotations
avoided
other
because
it
will
be
used
(the term "environment" below)
for
the
is
free
identifiers-
B6
DEN
the
: id->
INIG
denotation
of an expression,
which is also an integer,
is
given hy B7 eval-expr(e, den) = cases e: mk-inf-expr (el,op,e2)
->
(l_et v1=eval-expr(el,den) ; l_e_% v2=eval-expr (e2,den) ; r~su_!it_ is
{apply-op(~1,op,v2)))
mk-var-ref(id) mk-co~st(n) type:
This
expr DEN -> INTG
definition
composite
is
programming mathematics, at
from)
introduction
variable
a
has
expression
constructed The
-> den(id)
-> n
the depends
only
the denotations of
perhaps
on
that
the denotation
(and
can
of its component
].anguage. are is forced
most The
distinctive fact
that,
be
expressions.
feature in
of a
therefore
the concept of a dynamic assignment the
given point in time
semantics.
property
to a
of
a
contrast
to
to consider the value of a
variable
Foses problems for the definition
of
396
t.3
&~ais~a!_~a~£~
Consider
the
assignment
language
statements
whose programs consist of a s e q u e n c e of
(as-st)
which can be d e s c r i b e d -
CI as-st*
C2 as-st
The
:: s-lhs:id
effect
of
s-~hs:expr
such
a
sequence
of
statements
transfer~ so~e initial set of d e n o t a t i o n s step
by
step,
into
longer sufficient
their
final
to c o n s i d e r
for
the
denotations.
the DEN as
an
will
be to
variables,
Thus it is no
argument
to
the
interpretation:
the DEN reguired as the a r g u m e n t to the second
(and subsequent)
calls of
changed
by
the
the
interpretation
interpretation
may
have
been
of the first assignment. this proh!em
omitting
This is done because the
a]l
mention
of
the
DEN.
intention is to offer a number of
different
by
The
function given below a p p e a r s to ignore
simply
explanations.
should
be p o s s i b l e to see the intent of what is~ written
accepts
that ~ssign -changes"
assumes e v a l - e x p r
the DEN for
uses the c u r r e n t
the
given
It
if one
id,
and
DEN -
C3 int-st-l(st-!,!)
if i ~ lst-I !_he_n (let mk-as-st(lhs,rhs)
= st-l[i ];
l e t v:eval-expr (rhs) : assign(lhs,w) : int-st-I (st-l,i# I))
i type: It
is
formulae.
as-st ~- IN~g possible
to
The first
=> consider
three
ways
of
possibility is to read them
reading as
such
programs.
397
As
such,
each functicn
refers to one non-local course,
would ccrrespond variable
the same non-local
(i.e.
variable
eval-expr and all calls of int-st-l. trivially
defined
tc
modify
has its usual ordering are
written
in both cases to distinguish c~iven
this view and
the
DEN).
referenced
It
is,
of
by the modified
The sub-program
this variable.
implications.
with "=>", and
to a subrouti,e which
assign
is
The separator
"~"
Subroutine
type
clauses
their calls are marked with a ":", them from pure functions.
using the notation of ref. [ 12], the types
could be given in full by int-st-I
:: DEN:as-st~ INTG ->
eval-expr
:: DEN:expr
assign
:: DEN:id INTG ->
With
this simple, constructive,
discuss
one
of
definition.
the
dynamica~!y
functions. which te~t, there make
has
with
is no incentive
sequencing
language State" to
which
to
a
"Grand
State"
now
tc
all
part
of
the
state
this approach
is that it
the
discussion
The second
ref. [I ], would
in
as-st • INTG DEN -> DEN
eval-expr:
expr DEN -> INTG DEN
assign:
id INTG DEN -> DEN
not
The be
counter
eval-expr.
of alternative
views of the
interpretation
in each case a DEN. This,
int-st-l:
show
variable.
that the statement
possible
give-
and
would
functions as taking an extra argument
an extra result:
style
of program
Although in %his trivial language
further inspection,
int-st-l.
change
to do so, it would have been possible to counter
of taking
without
Returning regard
a
"Small
the possible exception
could net be affected h~, for erample,
function
of
by a side effect on this new non-local
disadvantage clear,
term
put into the state used by the defining
variables,
statement
the
in which only those things
are
are put into the state. the
properties
used
This is in contrast
all
view it is already possible to
desirable
McCarthy
describe a defi,itie, very
-> INTG
is
to
and yielding
which is the view of
398
(I,
fact
examplet
it
is
often
possible
since it can cause
nc
to
simplify;
changes,
in the above
eval-expr
need
not
return a DEN.)
But it is ne longer possible to rely on %he p r o g r a m m i n g view of ":".
It is n e c e s s a r y to d e s c r i b e it
functions.
as
a
combinator
The task of doing this is c o m p l i c a t e d
betwee,
by the various
a l t e r n a t i v e c o n t e x t s and it is e a s i e r to show the result would
come fros using
which
the c o m b i n a t o r -
int-st-I (st- l,itden)
=
if i -< ! s t - I
the__n (l_e~ mk-as-st(lhs,rhs) let
(v,denl)
= st-l[i ];
= eval-expr(rhs,den);
!_et den2 = assign(lhs,v,denl) ; result is
(int-st-l(st-l,i+1,den2)))
_e!s~e den
type:
Since
the
sugared
The
as-st* IN~G DEN -> DEN
"lets" are
now on pure functions,
a
fors cf lambda expressions.
third view one could
of ref.
they are simply
[22 ].
functiona~
The c o m m e n t
language,
take of the f u n c t i o n i n t - s t - i was made on the
definition
that the de,oration of its
was al! that was necessary to d e t e r m i n e
the
sub-components of
a
unit. By regarding the d e n o t a t i o n of an a s s i g n m e n t statement
as
a function this
is
it is again sc
~ossible
to enjoy
the
is that of
denotation
this property.
(That
is mcre c l e a r l y shown if the abstract syntax of a
cosposite statement
is given
recursively.)
would be-
int-st-l:
as-st ~ INTG ->
eval-expr:
expr ->
assign:
id->
(DEN ~> DE~)
(DEN -> INTG BEN)
(IN~G->
(DEN->
DEN})
The r e s u l t i n g
types
399
Again
in
this
combinator. (after
view
it
is
necessary
But now the fact that the
applying
the
functions
to
define
units
to
";"
be
as a
combined
to the static components)
basically f u n c t i o n s of the type ~EN -> ~E~ means that the
are very
pleasing model of functional c e m p o s i t i o n is adequate.
The
Oxford
group
(refs. [22, 23, 19]) have gone rather
far in
designing c c m b i n a f o r s which weuld permit f o r m u l a t i o n s like -
int-st-! (st- l,i) =
X~en.(!f
i
(DEN -> DEN)
w here -
CO~B(vf,uf)
=
kden.(kv,denl.(uf(v) (den1)) (vf(den)))
(It
should
be
made
clear
that if this had been written by a
genuine devotee of mathematical semantics, very
different.
it would have looked
It is the c u r r e n t author's view that excessive
zeal in shortening d e f i n i t i o n s makes them less rather than more readable,
Since
of. ref. [19],
ref. [1]).
this is a function one can look at its result
program!
Consider -
p = (X := I; y := x * 2)
for a test
400
Then after reduction-
i~t-st-I (ptl) = (kden.den + [y -> den(x)+2])
which
is
the result expected.
o (Xden.den + [x -> I])
(This e x e r c i s e is somewhat more
i l l u m i n a t i D g on larger examples.)
to
the earlier d i s c u s s i o n of grand
versus small state a p p r o a c h e s ,
(Reverting
it is worth noting that it would
have
for
been
a
moment
possible
tc make the
(undesirable)
step of putting
the s t a t e m e p t c o u n t e r in the state and still give a in
terms
cf a function
would ~ot,
howe~er, have been possible to provide
combinators. counter
to f u n c t i o n s from states
In
particular,
is required
definition
to states. such
~t
simple
the static role of the s t a t e m e n t
to provide the
required
decomposition
of
the semantics.)
If the a p p r o p r i a t e c o m b i n a t o r d e f i n i t i o n s bow provided three questio~
The
ways of reading
were written,
the f o r m u l a
we have
int-st-l.
positicD
advanced
in the next part of this paper is that
the i n t e r p r e t i v e view is useful during the development translator
mapping
thinkiDg
in
problems be resolved
~rograms
terms
~a~ipulated
definitions
notation
meta-~anguage.
written
in
reasoning
this
style
will
be
the m a t h e m a t i c a l view
or any
is
it
so far it would be easy
variant of the "fRr". the question
reasonable
to
add
In
doing
arises to
as the
Would it, for instance, be a c c e p t a b l e to have a
while c o n s t r u c t ? of
that certain
Thus the notation will develop
this for any r e a s o n a b l y c o m p l e x language m~ch
however,
this leads to r e t e n t i o n of
the a~ount of notation i n t r o d u c e d
how
the
to when d e c i s i o n s are otherwise unclear.
to define ~'if then else"
to
Observe,
of c e m b i n a t o r s ~ e ~ _ ~ d
as if they were oFerational;
say he a~pealed
With
to functions.
moze carefully:
also this view of the formulae. the style above~
of
and it is cn!y then that one need take the
view of mapping source %hat
The
of which should be used must now be discussed.
lhe answer
about
must a l w a y s be sought in
a construct.
the
ease
Thus in ref. [~] both while
401
and ~9~ constructs more
have been included,
restricted
kind
than
the
but they are of
FOR
a
much
of PL/I which was being
defined.
This
topic
definition
1.q
The
brings
in its banishment
standards
valuable
to
situations them
committees. have
and
some
an
problem
hop,
its ability
attempt
for
of
it
has not
languages
appears to he
abnormal
to provide formal
defining ~ 9
its
is
has
the
terminated
sequencing
definitions
for
as
both
technigue
this
the
stack
so
be
order.)
that
below
of the interFreting stack:
in of
if
with
a
can be shorter
The subject of
the
connection
state with
is itself
operation
obeyed and the next step of operation
of
the but
arbitrary
the recta-language
LAMBDA in ref.
removes the top
this
exits,
was more general,
describing
(called
function
block
in ref. [ 2,] was to
into
a VDL definition
function
compound
in the ne,t section.
the control
Instead
as functions,
interpreting
the
whose semantics
discussed.
component
[In fact
upsets the
same phrase structure
for modelling ~oto employed
point is discussed
control
can
is discussed
~acbine.
evaluation
section
this is simpler than
that
problems
introduce a control
~irectly
In
the phrase structure
Although
property
phrase structures
of a unit solely in terms of the
sub-units.
chosen
block terminations
abstract
is that, other than the local
and initiated abnormally
definition
the
serious,
tc leave or enter
should be pEovided.
an
be
mechanism
ability
with
of
statement
this
from descriptions
To
to state the semantics
semantics
The
a semantic
may be one of the tools for comparison.
The
it
the challenge of giving
debate on the morality of the Horn construct
yet resulted by
to
Lan~ua~
Goto
long
us
of ~o~o.
described [24]).
by
A step
instruction
from
is elementary,
it is
is performed;
in the case
402
of
a
"macro"
the centre]
operation~
the a p p r o p r i a t e o p e r a t i o n s are put on
stack so that the next step will e n c o u n t e r them.
So far this can be thought of as a way of d e s c r i b i n g functional application. component
The
an
its e x p l i c i t phrase
purpose,
explicit
~anipulation.
stzucture
structured delete
hoover,
was
Thus one
to
the
making
the
control
way to model ~ot_o out of
define
the
in line with the phrase
from
of
part of the state was to make p o s s i b l e
control
"obvious"
structure,
component
any
a
operations
but
to
simply
operations
which
c o r r e s p o n d e d to parts of the program being jumped over.
The effect of this was that, in general, present a r g u m e n t s structure
whose inductive
of the program.
structure f o l l o w e d the p h r a s e
It was of c o u r s e p o s s i b l e to present
proofs,
but they had to be by i n d ~ t i o n
states
generated
by
it was not p o s s i b l e to
LAMBDA.
One
over
could
the
sequence
argue
p r e c i s e l y the undesirable effect of the ~t__Qo, but in definition
had
the generality, instruction,
gone
too far.
in this case
forced
of
that this is fact
the
It was one of the places where
to
change
the
control
in
any
one to show that, in precisely the places
one did not require the power, it was not used.
In fact the deletion used more
of parts of the control
for exits from blocks: local
phrase
was sometimes only
the reason that it was not used for
structures
was
the
s o l u t i o n adopted for
h a n d l i n g abnormal entry into such phrase structures. some
definitions
was simply changed q_oto
into
and
to point tc the next
statement.
out of phrase structures
on exit from the phrase structure.
However,
unit
by
~anipulating
treatment of ~ND in ref. The
current author
action and
the
to be
into the
tended to cloud
the n e c e s s i t y to describe finding
made
performed
(This can, in fact he v i e w e d
part cf the iAMBDA function
such d e f i n i t i o n s
This
very easy to describe
providing there was no special e p i l o g u e action
as a b s o r b i n g
Hasical!y,
a d o p t e d a "current s t a t e m e n t s e l e c t o r " which
definition).
the normal action by
the successor to an
Fointer
(see,
for
embedded
example,
the
[25]).
became c o n v i n c e d
that setting up the normal
letting a ~oto "take the machine
by
surprise"
was
403
the
wrong
model.
The
proposal
could result in abnormal "abnormal"
result,
made
termination
which
was
was that any unit which
should
some
return
an
case. Any call of a function which could result in an return
must
(Together with W. Henhapl,
selector
treatment,
order
to
this was written
ref.
who provided the
statement
up in ref. [6]).
define Algol 60# in which it is possible to ~Qt__qo
into both "_if" and compound address
abnormal
test for this possibility and perform a p p r o p r i a t e
actions.
In
extra
null value in the normal
statements,
the other part of the problem.
[I ] is to provide functions
structure
without
to fo]]cw.
Since these functions
it
was
necessary
to
The approach employed in
which run through
the
phrase
executing but setting up all of the actions prompted the e x e c u t i o n
where to begin they became called
"cue-functions"
as
to
(as in acting
- Dam Stichwort).
Consider,
for example,
the following
-
~ot_a l: i_f P ~_h_e_n I: sl else s2 : s3
Not oD1y should
this,
rather odd, transfer of control
without evaluating p, it should also set up the performed
thereafter
so
thal
s2
is
get to sl
events
to
be
skipped and s3 is next
considered.
The c o m p l e t e l y functional definition of Algol given in ref. [ 1] became tedious because of the man~ places where the effect of a ~o~o
can
cause
a cha~ge of e v e n t s and therefore the abnormal
return value must be tested. most
common
action
was
It was, however,
next operation and pass back the abnormal level.
that
the
value
to
the
next
In fact %here are very few places where it is necessary
to describe any sFecial action. Lucas
clear
simply to refrain from execQting the
proposed
that
Based on
adopting
some
this unit
observation to
trap
P. the
404
interpretation
where the action
free %o drop the ,,test and
This
abbreviation
normal • ~!
is
for
ABN.
e x p l i c i t l y by the retur,ed
a
to
be
Non-~i~l
exi~
the
(implied)
values
statement.
for
ABN
Normal
return
of
All a
are returned
action
with the same value for
~i~
unit b r a c k e t e d
which it applies.
The
one
on
being
ABN.
An
explicit
performed for a non-hi I ~BN value is defined by
means of a ~ a ~
containing
leave
non-hi ! ABN value is to terminate also the calling
function abnormally action
would
the one used in ref. [ ~] and below.
~eturns are written omitting
walue
to
was required
return" case by convention°
together with the statement
C o m p l e t i o n of the trap unit completes the
function.
developmert of this idea has been d e s c r i b e d in terms of an
i~terpre%er partly
partly because
because
it
is
this is how it actually o c c u r e d
probably
easier
to
first
read
and the
following functions in this way. (In
fact
the
separation
i n t - n s - I and c u e - i n t - n s - i were more
taking
the
mathematical
of
the. largely similar,
would probably not
purely i n t e r p r e t i v e view
given
be
functions
made
view: it is only the
if
one
for the
below
that
functions
are
given
by the f o l l o w i n g abstract
written separately°)
The
language
syntax. is
considered
It is assumed
something
is
that among
the unlisted statement
types
like the a s s i g n m e n t of the previous section which
would force retention of the DEN component.
D1 st = goto-st
} cpd-st
I ---
D2 goto-st :: id
93 cpd-st :: nmd~st • D~ hind-st :: s - n m = [ i d ~ s-body:st
id net further defined
405 The defining "the unique
functions
can now be given
object satisfying")
(the "b" operator
-
D5 int-st(st): cases
st:
ink-gore-st(lab)
-> ~Ii~(lab)
mk-cpd-st(ns-l)
-> int-ns-l(ns-l,1)
type:
st =>
D6 int-ns-l(ns-l,i) :
if i _~ !ns-I
t~_n {(trap exii_t (lab) wit_~h if is-contained (lab,ns-l) then cue-int-ns-l(ns-l,lab) else exit (lab) ; int-st(s-body(ns-l[i~)) int- ns-I (ns-l,im I) )
_e!s~e ! type:
hind-st* INIG =>
;
means
406 D7 c u e - i n t ~ n s - l ( , s - ] , l a b ) : !et i =
(hi) (is-contained(lab,))
if lab = s-nm(ns-l[i]) thhen int-ns-l(ns-l,i)
e_is_e witl
(~ra~ exit(lab)
if is-co~tained |lab,ns-l) t_he~n c u e - i n t - n s - i (ns-l,lab) else exit (lab) ; c u e - i n t - ns-I (s-body (ns-l[i ]) ,lab) ) int- ns-I (ns-l,i+ I) )
type:
hind-st* id =>
D8 i s - c o n t a i n e d ( ] a b , n s - l ) (3i) (s-nm(ns-l[i])
= = lab)
v
(39) (is-cpd-st (s-body(ns-l[j ]) ) ^ i x - c o n t a i n e d (lab,x-body(ns-l[j 9))
type:
[%
id n m d - s t ~ - >
was
observed
obtained
terms.
constructs.
that
a deeper
This
view
is
~irst it is n e c e s s a r y
in the "=>"
int-st:
Tt
above
understanding
by viewing a m o r n - l a n g u a g e c o n s t r u c t
sesantics
hidden
{_t_rue,fa_!_se)
now to
in
a t t e m p t e d of the above uncover
what
st ->
n~d-st~ INTG -) (DEN -> EEN IBN) hind-st* id
to
define
s t a t e m e n t s separated
by
(DEN -> DEN A~N)
the
denotation
";"
in
terms
of two m e t a - l a n g u a g e of
their
denot atic,s. stl;st2
-
been
(DEN -> DEN ~%BN)
int-rs-l:
easy
has
of the type c l a u s e s -
cue-int-ns-l:
is
is often
mathematical
kden. (le_t (denl,abnl)
= st1{den) ;
if abnl = _nil then st2(denl) else
{denl,abn I)
individual
407
This
gives
us the way of creating
-> ~ ABN}
from two functions
test
dynamic
is
~qt_o will, !t
is
a function
depend on the state.
to write a very straightforward
for the t r!2 e_/xi~ but, if this results in the
fact
functions applied make
which
to the whole text
for
which
one
or free
although
the
labels
(in the sense they may the set of labels the
labels.
by the following
example -
1
I:$2 /*no contained
Then
that
can do something is known: it is precisely
~ f p ~he._,.nn~
=
is
within the unit),
This point can be illustrated
S
dynamic
(of the current unit) would
to the trap are unknown
he either contained set of contained
equally
to ascribe a semantics of the required type
The key observation
will come
an
combinator
that the t_ra_~ exit body again uses defining
it impossible
to a unit.
{~
unavoidable because the occurence of the
in general,
is also possible
action,
whose type is
of similar type: the fact that t h e
labels
*/
-
int- ns-l(S,1 ) = !_e_% (den1 ,abnl)
= int-st(if
p _th_e.~ng_o_to 1 e_!Is_~e~Lqto =);
(abnl = all -> int-st(s2) abnl = 1
-> cue-int-ns-l(S,l)
T
-> exit (abnl))
Wow s i n c e cue-int-ns-l{S,l)
it
= int-st(S2)
can be seen how tc construct the denotation
of course, statements
a trivial case. introduces
But eve~
where
looping, an equation
fixed point can be sought.
the
of S. This was, graph
of
~o
will be given whose
408
It should be conceded at ~rife " d e f i N i t i o n s " combination°
The
above
which
do
not
permit
static
This is a cause for further consideration.
mechanis~ is not the one usually e m p l o y e d
mathematical semantics: been
this point that it is also p o s s i b l e to
using exits
accepted
{cf.
the mechanism
ref.
[23])
which
is
in giving
appears
that of
to
have
"Continuations".
Basically,
the denotation of a label is the function
~)
r e p r e s e n t s starting at that label and running to the
which
(say ~
->
end of the program!
This is c e r t a i n l y a more powerful concept:
that
more g e n e r a l languages,
it
can define
found out to his cost possible
when
continuations
while
tried
to
the current author show
that
to e l i m i n a t e c o n t i n u a t i o n s in ref. [21].
maxim is to be sparing
general.
he
to
model Algol 60 labels
was
~owever,
c~ power in the m e t a - l a n g u a g e {ref. [ 1 9 ~
It seems unfortunate, for instance,
it
and
the
using
may be too
that in -
p do
!_f q !_b_e_~so__t_o I: I: $2
ea_d the
The
label I cannot be "treated
actual
locally".
choice between c o n t i n u a t i o n s and the model offered
here must be made in the c o n t e x t of the definition.
Since
use
be
decided.
%.he language
the Oxford group has an interest in proving
c o m p i l e r s c o r r e c t it will await a larger can
of
The
experience
with
arguments on e~i~t is, sc far, e n c o u r a g i n g .
example basing
before
this
correctness
409
B_!_o_cL_s!~9~!uj~L__aan_.q~a_q~
1.5
Both
blocks
and
rrocedures
local level of ~aling.
permit their user to i n t r o d u c e a
Since the names defined within different
(even nested) b l o c k s do not have to be distinct, of 1.3 will not suffice as a state. language
in
"remember"
which
Consider
no recursion is allowed.
the value of a variable,
a nested block in which a n o t h e r to o v e r c o m e this problem statically
distinct
for
the
of
a
say x, over the lifetime of
be
to
make
all
one way
identifiers
example, q u a l i f y i n g them with a
unique b l o c k
number.
The
renaming scheme would not, however,
static
case
It is necessary to
variable x is declared,
would
by,
the s i m p l e DEN
be a d e q u a t e if
recursion
were also allowed.
It would then be necessary to keep
distinct,
multiple i n s t a n c e s of a variable which is declared in
a recursive block.
Before
considering
the passing of procedures as parameters,
is a p p r o p r i a t e to discuss c a l l - b y - r e f e r e , c e since i,troduces
a
tool
which
makes
the
its
it
solution
remaining problems both
easier te state and solve.
C o n s i d e r the following -
b__eain ~oc
p(x): ia~
x; x := a;
p(a)
If
the variable a is passed by reference, the parameter x will
refer to a. In an i m p l e m e n t a t i o n the
parameter
location. i~
x
would
result
in
a
reference a
reference
the argument
in all
referenced.
In
places
and
to the same
The d e s c r i p t i o n of Algol 60 in the Algol Report,
this part very operational.
become -
the non-local
was
The model given was to copy in
where
the
formal
parameter
was
this way the body of the above p r o c e d u r e would
410
a := a.
Some
care
because
was
discussion using
necessary
concrete of
when
abstract
describe
with
What
The storage
directly
with
has really
be i n s e r t e d ) .
for
class
[I]).
and
for
there
associate
of the class. which
to
least is
a
is to show the s h a r i n g
of objects
below
the
Eut even
tedious At
call-by-name}
component
bee~ done
(of.
somewhat
The idea
identifiers
rule" partly
should
ref.
the same m e m b e r
"copy
discussed
becomes
below
i, the e x a m p l e
the
being
(of.
by an e n v i r o n m e n t
(chosen
!ocations)~
it
mechanism.
some a u x i l i a r y
is m a i n t a i n e d
values
(see
equivalent,
by h a v i n g
LOCs
parentheses
copying
call-by-reference
identifiers
were
programs,
this
simpler~
in d e s c r i b i n g
strings
maps
This
identifiers
to be s u g g e s t i v e will
no
but i n s t e a d
is to d e c o m p o s e
into
of m a c h i n e
longer with
both
association
associate
locations.
-
DE~ i d - - ..... > VAL into-
ENV
STG
id ....... > LOC ....... > VAt
hut
in doing
so,
the
possibility
is i n t r o d u c e d
to have
idl .... I--->
n--->
v
id2 ....
so
that
any change
references more
via
thaw
via
one of the i d e n t i f i e r s
the other.
the
expression
(The use of of
an
£0C is,
is r e f l e c t e d
to
in this model,
no
eguivalence
relation
over
identifiers).
Tt
is
now
necessary
environments mentioned dynamica!ly
handles
above~
consider
the block
~he
distinct,
to
and
locations
how
model
recursive
will
so the problem
a
block
be g e n e r a t e d
of entering
which
a
has
problems
so as to block
be and
411
destroying
a
denctafion
c e r t a i n l y been overcome. environment location the
is
case
of
base
bindings,
The
will
later
be
r e q u i r e d has
generated
mapping
the
new
local
to
a new
identifier
(notice such a copying of DEN would be incorrect). a
~eeper blocks the
which
All that happens is that a
block
which
can be known by and called from
(i.e. a procedure), it is necessary to
environment,
to
which
it
will
show
insert
how
its local
is to be found.
most
mcdel
economical
would
be to assume again that all
identifiers are distinct in which case it is possible that
In
any
to
show
valid calling e n v i r o n m e n t c o n t a i n s the required base
environment as a sub-part.
"most
existence
does nct cover the case where procedures can be passed
parameters!
ref.
This
is
variable
the
recent"
as
any
then, only
solution
recent"
of
TB this case,
precisely
references are possible.
[7]).
The
operational
because
then,
"most
discussion
see
will be to "remember"
what its base e n v i r o n m e n t should
be.
In
an
model one would make a procedure d e n o t a t i o n contain
the pair of procedure text and environment. semantics
other than
(For a fuller
general solution,
for any procedure
can be r e f e r r e d to. This
In
a
mathematical
style d e f i n i t i o n one wo~Id use these two entities to
create a function.
The language to be considered is -
El proc :: s-nm:id s-parms:id~
E2 s t
= call-st
~ as-st
E3 call-st :: s-pn:id
~4 as-st :: s-lhs:id
Identifiers
--.
s-args:id~
s-rhs:expr
then correspond either to variables
is considered) required,
I
proc-set s - d c l s : i d - s e t st
in
or procedures: the
latter
in a
associated object.
E5 ENV : id ->
(LOC | PROC-DE~)
the
former
procedure
(only one type
case
a
denotation,
LOC as
is the
412
A not yet initialised
value for a variable is allowed,
so -
~6 S : LOC ~> VAL
E7 LOC = sere infinite set
E8
VAL
=
INTG
~
A functio~a] type for ~rocedure d e n o t a t i o n s is given
Eg PROC-D~N
: (LOC ~ PROC-DEN)~
->
-
{S -> S)
(Notice that qo_tc is net is the c u r r e n t language,
so ~BN is not
required) o In order tc cover recursive
procedures it is necessary that the
d e n o t a t i o n of a procedure is a v a i l a b l e indirectly) denotation
call
itself.
Since
denotations
wherever it might
is discussed
here are functional objects,
(let
env'
validit7
of
= proc;
= = env
+
([parm-l[i] - d e n ° l i t ] [id - eval-dcl(id) [s-nm(proc)
~ 1- YROC-D~N
413
Eli
eva!- dcl (id) : let
l:alloc () ;
assigm (I,?) result is
type:
E12
;
(I)
id => LOC
int-st(st,env|: cases st: mk-call-st (pn,arg-l)
->
(let f = env(pn); let den-i =
The functions -
~13
allot : :> r OC
El@
release : LOC =>
extend
and
restrict,
respectively,
the
domain
of storage.
While-
E15
assign : LOC VAL =>
E16
contents : LOC => INTG
modify and read,
Based
on
the
respectively,
environment
important points.
it
values of storage.
is
possible
First, it is interesting
is precisely the response
to
to note
clarify that
two this
of both c o n s t r u c t i v e and m a t h e m a t i c a l
414
definitions ]a,guageo
to
the
This
problem
the above manipulation cf ~ref.
the environment,
as
the
well
in the state.
he modified the
defining
a
block
structure
environment
better
than,
is say
The VDL models used the grand state approach and
[2~3"?
co,tailed
of
leads to the second point: in what respects
as
all
"stacked"
versions,
This ~eans that, potentially,
by any function.
were
they can
It was then necessary to show that
i n t e r p r e t a t i o n of two successive statements in a statement
list was under the same environment. non-trivial call, had
because if the first statement was, for example,
the e n v i r o n m e n t to
call,
show
was indeed dumped
environments exactly
that by termination of the i n t e r p r e t a t i o n of the
as
arguments,
that two
the
argument.
~£~i~i~.
certain
the
other
calls
of
(This
was
of
hand, shows guite int-st
the
are
passed
subject of the,
~roefs of the first two lemmas in ref. [g].)
language now available
%hat
on
successive
sa~e
somewhat tedious,
of ~ £ n ~
a
then changed: the proof
the dumped e n v i r o n m e n t had been restored. The passing
explicitly
The
Moreover, such proofs were
is rich enough
to discuss the topic
In the d e f i n i t i o n given, it
conditions
is
assumed
hold for a b s t r a c t programs which are
not expressed by the syntax rules.
For example,
the
definition
would simply be undefined for a program which attempted to call a simple variable or which called a defined procedure less
arguments
than
type
rules
the
unnecessary
in
parameters. abstract
encumberance.)
syntax
however,
of
ref.
It would, of course,
write appropriate checks i~tc
but
with
(The attempt to include such
the
defining
[ I]
was
an
he possible to
functions.
This,
would net show that the checks, in this language,
of a static
nature.
That
is,
it
is
possible
to
define
are a
predicate -
is-well-formed
: ~roc ->
{tr_ue,_fa_!se}
which only yields tzue in the case that all such static context c o n d i t i o n s are fulfilled. position
This
is
not
intended
to
take
a
on to what extent type questions in a language should
be s t a t i c a l l y checkable° useful
~roperty
static
and
what
of can
a
It is only
to
definition
to e x p l i c i t l y show what
only
be
checked
argue
that
it
dynamically.
is
a is (Rn
415
associated
~oint
implementation errors
It
is
i~ an unexecnted
would
be
procedures
1.6
that
for programs
it
permits
which contain
freedom
statically
to
an
checkable
part).
to
possible
define
both
and function
blocks
in the style of this section.
F ux!her ToEiN~
This
section
ccnstrucis With
will consider how some other,
familiar,
language
could be tackled in the same spirit as above.
regard
definition
to fl£~£, it is straightforward
to extend the last
to cover S£~2 out of procedure calls.
This,
with the merging of the other features already defined,
along is done
in the Appen@ix.
~he problem is simple because only known,
therefore
recently activated,
most
referenced.
[f the language allows
parameters
this
the
passing
property no longer holds.
to make each instance
of
labels
as
It is now necessary
cf a label dynamically
this requires some mechanism
and
instances of labels can be
distinct
like the activation
and to do
identifiers
of
ref. [~].
The
i,troduction
variables)
into
consideration
a of
label
language
target variable definition
the
something
The
(or,
with
it
affected of
variables
is
forced
fact,
the
a label instance
to be not-greater PL/q d ~ s
in
entry
additional
which Do longer
the
lifetime
than the lifetime of the
not make this restriction to
of
add
a
validity
and
check via
like a set of currently active activations.
definition
creation model
brings
referencing
label being assigned. so
variables
Algol 68 avoids this by constraining
exists. any
of
of
call-by-value
of a special by the in
any
location
changes
A ]gol
60
via
is simple to include
which
will
be
the parameter.
description
aD imaginary block.
of
the
by the
only
one
This is a close
assignment
to
new
The fact that Pi/I makes the
416
choice
between
calling
call-by-value
side
ca~!-~y-~a~a
is
of Algo~
passing
procedures.
it
is
frequently
the
i~Dlementer
wider
cf
definition write
it
referenced
could
is
freedo~
the
First
-
that
in
definition
reasonable
is warning
to leave
This
is
permit
the
user
order.
in
references
such orders
is an open
a
the
to be made
function
In a l l o w i n g
for
an e q u i v a l e n t
to
in an expression contains
such freedom
powerful
mechanism
which g u a r a n t e e
be
designer
define
note
may
the
on
pmore
the
rely on any particular
of how tc formally
(a +
via
in a !a,guage
side-effects.
which
The
of order of evaluation.
variables
language
programs
[4].
handled
if the e x p r e s s i o n cause
call-by-reference
ref.
than o p t i m i s a t i o n s
any order even which
60
~cr instance,
access
and
in
desirable
some
freedom
result.
shown
in a
not
to
The question problem!
(b + c))
the d e f i n i t i o n
may
wish
to allow
not only
-
a h c, a c b, b c au c b a
but also c
It
a
h etc.
is net,
at each The
therefores
branch:
response
of the
performed
were in any
any a v a i l a b l e
The
its
Droperty
of
such
arbitrary
this
was
VDL
to this
~achine But
in
ref.
LAMBDA
into
order
cculd
The
in fact a r e a s o n a b l y
control
operations
to
randomly
be be
chose
for execution.
by
very similar
building
the definition. occur
arbitrariness.
if they were to
function
[I] was in fact nature
the
was to sake the
branches
IAMBDA
of the ccnt£cl
functional
tree.
parallel the
one path or the other
to inherit problem
into a
on
crder and
leaf
definiticn
achieved
to choose
it is n e c e s s a r y of
component
executed
adequate
was in
Since
this
the only
expression
small impact.
and only relevant place
evaluation
417
The
definition
meta-language not
solved
provides
in
ref.
[4]
by introducing because
for the
the
shifts
the
problem
the "," separator.
definition
inheritance
of
of
a
to
the
The problem
combinator
sequencing
freedom
is
which is
not
provided. 9.
Beki%,
pieces than
among
others,
of "interfering" their
has pointed out that to combine two
program it is necessary
(extensional)
this problem is discussed
in ref.
be
the
generally
required,
pursue the definition defining
of
know
more
His approach to
[3]. While this approach
current
author
more axiomatically:
conditions
to
functional meaning.
good
may
would prefer to
in the first place by
cooperation
that
guarantee
~on-interference. One
final
axiomatic great
area,
that of Storage
parts of a definition.
freedom
left
to
the
reasons,
In
some
implementor
mappings of the data structures efficiency
Models,
are
mappings.
worthwhi3e example, list
viewing
I. 7
in
an
ref.
array
there
is
to what storage
required.
In
PL/I,
somewhat
[q],
however,
storage
model
for
and
to
express
the
constrain
it
was
found
abstractly
value as a mapping
For a fuller discussion
(for
from a (hypera
additional
linearised constraints
see ref. [ 2].
S_~u_m_ma__r Z
difficulty
of defining a programming
from a purely functional occur.
A
functional
extensionally Three
as
of integers to values, rather than as
thereof)
separately.
The
Even
to state the basic
)rectangle
languages
the programmer can take alternative views
of an aggregate and thus the language does the
brings in the role of
definition
(of. section
alternative
definition
in
solutions
were program
to
which functions
I. 2) is not immediatel~
as an interpreting
all interpreting
language as distinct
language is that changes
di~ussed;
a
state
are read
applicable.
to consider the
(ref. [ 2 ~ D ; to
functions as ha~ing an additional
consider
argument
and
418
result which is the state fu,ctions [!9]}. to
(ref. [I]): %o consider the
defining
as producing a f u n c t i o n s from states to states
I% is possible, by adopting c a r e f u l l y
write
in
a
chosen
notation,
style which can be read in more than one way.
Except for the problem cf a r b i t r a r y order, c o m b i n a t o r s provided which permit
The
advantages
are returned
Whatever
(ref.
of
can
be
ref. [4] to be read in all three ways.
the different ways of viewing a definition
to in the next part of the paper.
one's chosen viey of a semantic definition,
there are
more important c o n s i d e r a t i o n s which influence what is
written.
The
central
guide-line
proposed is that the definition should
not possess properties ~hich are no% inherent in being
defined.
This
is
that
a proof is often
a
such
details
it
program,
required that certain
the model haYe no effect on the final outcome. eliminate
language
,or to say that such definitions are
wrong in the overall effect they d e s c r i b e for rather
the
but
details of
If the model can
will facilitate the use envisaged
below. ~xa~ples
of
where d e f i n i t i o n s can be o v e r - s p e c i f i c range
the use ef the grand state a p p r o a c h use
of
lists
vhere
to trivial items
like
from the
sets conld suffice for components of the
state in ref~ [25]. Before
proceeding
%e
the
discussion
of
proving a compiler
correct with respect to a language definition, should
be
spePt
on
the
language definition. corresponds hope.
Most
properties
to
a
such
standards
(axioms)
The
direction
encouraging
moments
definition,
descriptions
the
definition
there is little
are
a
mixture
and partial models for the language. which it is not, for the natural
he read precisely,
best i0cemplete,
verbal
few
of the correctness of the
Rs far as -proving" that normal
if it were possible, to
question
a
such d e f i n i t i o n s
of Even
language
would be shown to be at
at worst inconsistent. followed
by
ref.
[25]
is,
of
course,
very
in that it is a huge step towards s t a n d a r d i s i n g via
a document which could be considered
to be formal.
419
In spite of the difficulties possible
to consider
most trivial languages passing
level a definition
should be checked the
implicitly
defined
seeing
why
currently
task
under
cn
a
the source
first
be
is required. "formally"
question
of
the must
two
to
source to
be
to
to
be
entail
is
the creation of a
of
Secondly,
functions from states generated
environment
an understanding is
returned
of to
process to be discussed
resolved
is
is which of the reading
are
to
be
adopted.
the interpretive
This
which a
have
mapping
been shown in the source in the choice of code to of
source
to states has more to be
functions:
will also require
author,
view of the
Firstly, it is very unlikely that
are exactly those required
produced.
It is
(and even ref. [9]).
for taking
distinctions,
case
source
specification
ucrked out. There seem, to the current
arguments
the
definition,
the
The question of whether this doCume, ted
definition
as the basic one.
these
aid
some extent remain open until more examples
definition
be
would
is
this process can begin,
must be
have bee~ fully to
has been
definition
definition
The step by step development
question
an
to the object language.
very similar to that in ref. [15]
styles
the proof
consideration
Based
machine
understanding
The
a
as
of
texts of a source language into texts of a
from
that before
object
below.
Much more
existence
this section discusses how to obtain a
mapping
obvious
believe
like
OF i TRANSL~TOB SPECIFICATICN
language.
a
by current
In ref. [4] an attempt
authors
the
under consideration.
overall
language,
and
it is
it
errors
%o a function.
termination
objects.
program to translate machine
of
The guestion of what
2. DEVELOPMENT
the
of the size required
~cst and assertion comments
the
"consistent".
correctness,
consistency,
to be free of clerical
guesticns
made tc insert pre,
for
like
the wrong number of arguments
subtle are
The
in establishing
some property
programs
developed
to than
manipulations of, for example, modelling in the object state.
the
420
The
abstract
state
of
the
source
definition
permit a large range of implementations. with
a
particular
designer ~ s task particular
machine
is
to
object
might be a
developing
something
of ref.
in view, to be more specific. The
find
machine,
development
concrete
realisations,
for
abstract
the
multi-stage
process
version
properties example
a
in
on
state. the
case
his This of
like a n ~NV to a d i s p l a y model like those
[7]. E a c h stage of d e v e l o p m e n t
concrete,
was chosen to
The time has now come,
of
an
object.
not possessed
by
the
proTides This
more
a
new,
more
model
will
have
abstract
object.
For
list has an o r d e r i n g p r o p e r t y not present in a set.
For this reason,
the a p p r o p r i a t e style to document the r e l a t i o n
believed
to express the c o r r e c t n e s s is from
abstract~
Thus,
(more)
c o n c r e t e to
if Z is a set and modelled by a list L, the set
is r e t r i e v e d by -
retr-$ {L) =
{L[i] ) !- pre-oP(q) and
the
required
{t_~rue,f~!se}
= (3#') (G [OP] ,')
other
cf
which
specifies the input/output
post-OP
: Z Z ->
pre-OP(#)
{tr_ue,false}
^ w [OP] #' = post-OP(#,#')
by -
If both of these conditions hold the fact is recorded pre-OP These
relation
for the operation -
two
post-OP
predicates,
the meaning of OP
then, record everything
(there is of course
necessary
about
nothing about performance
etc.). Suppose
a
proceed?
The problem is decomposed
specification
is
given in this form, how does one
a set of
(simpler)
operations,
to sub-problems
on
the sub-operations
the
be
true
of statements
that time they provide
will be recorded
The
most
languages execution
trivial is of
to the
way
of
separate first
execution of the second. such a combination
in the language
specifications
is
to
with be
of the two operations
in exactly assumptions
work.
two
operations
":"
showing
immediately
The conditions
necessary -
The
being used. Until
for further
combining them
could be
specification.
the sate style. Eventually all of the sub-operation will
choosing
which, if one had them,
combined in some stated way to fulfil assumptions
by
in most
that
the
followed to show
by
that
428
pro-OPt
post-OPl
pre-OP2
post-OP2
will satisfy pro
are
firstly
stated
post
that
each
pro(G1)
= pre-OP1 (=1) A post-OPt(,1,=2)
secondly
~erived
the
overall
such a sisp!e
combination,
requirement
not being suggested
should
be
accompanied
post(,1,,3)
to
In wany sit~atious
For instance,
list has been provided
can be
is proved in refo [12])~ prove
however,
true.
For
three special
with total operations
the first two lemmas are vacuously
it is certainly program
the
excessive.
cases can be applied.
relation
^ post-OP2{,2,~3)=
of these c o n d i t i o n s
appears
~X~)
input/output
-
~ post-OPl(~1,e2)
adequacy
lemmas
%hat
= pre-OP2(q2)
frow the combination
pre{#1) (The
will only be used over its
domain-
pre(,1) and
=
operation
(pre
Furthermore,
that every use of ":" in
by formal
proofs:
a
but a check
to which an appeal can be made
in
case
cf doubt. The
reader
the
next
should stage
properties
observe of
relied
onQ
that the development current providing
proof.
that a specification
development
which
Thus it is not n e c e s s a r y
cf that operation
There
is passed on to
states
is
a
complete
does
not
all
of
the
to later show disturb
split of the problem
the of
J ~stif ications.
More
methods
style
by ref. [ 12], section 9 of that paper also considers
of
combining
operations
the set could be further extended.
are defined in the same how
429
3~
many
respects
this might be the more important
forms of abstraction
being considered:
of the two
it is certainly
the
one
which is under-employed. The idea of data abstraction the earlier mappings
has, in fact,
parts of the paper.
have
both
used
an
The
language
abstraction
(i.e. that class of objects described That
this
was
alternative
necessary
of redefining
can
already
be
been used
definitions
and
of the program text
by the abstract seen
in
by
syntax).
considering
%hose functions in terms of
the
concrete
strings. If
then,
terms
it
ef
is
difficult
detailed
impossible
tc
data
tc even state the specification
representations,
it
will
write a r g u m e n t s for correctness
in
become
at such a level
of detail. The
proposal
is
that development
bring in those properties effect
on
percentage section
the
below.
represe,tafion performance The
of the data structure
algorithm.
That
cf the development
3.~
~hus
this
that
permits
can be seen
one
in
a
have
an
reasonable
the
example
is able to postpone
of
fixing the
of a data object until adequate reason
of a sub-algorithm)
interface
of an algorithm should only
(e.g. the
can be ascertained.
between operations might, then,
terms of sets or maps for example:
in
the
be described
final
code
in
linked
lists or hash tables might be the chosen representations. ~t
is
necessary to discuss
which refiDes a data doing is adding
properties
has an ordering
property
possible
what goes on in a development
representation. to
not
Essentially
the data structure present
in
the
what
step is
(e.g. the list set).
to re~iev~e all of the data of the abstract
the more concrete.
one
It
is
level from
430
Suppose
an
o p s r a t i c ~ ~n states of c l a s s D has been used, s u c h
that -
prod postd
and one now wishes to show that -
pree poste is
an
adequate
simulation.
If
is
sufficient
to
find
a
r e l a t i o n s h i p b e t w e e n the two state c l a s s e s -
retrd : g -> D w h i c h shows that OPe works on a wide enough class of states -
pred(d)
and
that
retrd)
the
ne~
operation
produces states m a t c h i n g
those produced by the old o p e r a t i o n
prod(d)
(This
^ r e t r d ( e ) = d = pree(e}
^ retrd(e)=d
~ction
differs
O p e r a t i o n s are Dot,
(under
-
^ poste {e ,e ') = postd(d,retrd(e'))
from
that
in
ref.
[ 18]
in that the
necessarily, functions).
3j___~le___o_f_~_xa_re_s si o_n_C_o_~ t_ii_on
The
input/output
[elation given for trans-expr in section
defined %o operate conveniently linear form consideriDg
cn
chosen
objects
parsing and
consider a reverse-~olish first p a r s e -
type
"expr".
These
2 is were
to be tree r e p r e s e n t a t i o n s of %he o r i g i n a l
(presumably infix). the
of
Without going all of the way to
tokenising of an e x t e r n a l string,
text which might result from
such
a
431
c-expr
::= c-inf-expr
c-inf-expr c-var-ref c-const The
~ c-const
::= c-expr c-expr c-op ::= ...
::= ...
relation
retrieve
~ c-Tar-tel
of
this to the class expr can be specified
function
which
hy a
uses a stack -
retr-expr (tl) =
!_o_r i = ~ _to ! t l ~o (is-c-var-ref (tl[i ]) -> ~ush (retr-var (tl[i ])) is-c-const (tl[i ]) -> p.ush (retr-ccnst (tl[i ]) ) is-c-op (till ]) -> (!_e_t e2: p_q~; l~t e1:~o~; 9.~sh(mk-ex~r (el,retr-op (tl[i ]) ,e2) ) ) ) ; result type: Not
is
(~o~)
c-expr -> expr
only
criterion suggestive
does
this
retrieve
for the following o f a way
to track
function
translate
give the correctness
function,
the temporaries,
the stacking
is
432
Assuming
an external
variable b -
trans-c-ezpr (t i) : _~_~_r i = I t c _! t l do (is-c-var-ref (tl[i ]) -> (b:=b+
I;
assign (rib ],contents (retr-var (tl[i ~ ) ) ) is-c-const (tl[i ]) -> t;
(b : = b +
assign (t[ b 3, retr-const (tl[i ])) )
is-c-op(tl[i])
->
(assign (t[b-1 ].appl y-op (contents (t[ b- I), retr-op (tl[i ~ , contents (t[ b]) )) ; b := b-l)
type:
c-expr =>
The correctness
ca~ now be proved
if -
b := O; trans-c-expr(tl| trans-expr (retr-expr (tl) , I) This
result
follews
from a Froof, by induction
{and possible c c n s t r u c t i o n s
of)
on the length
tl, of the stronger statement
-
b := k; trans-c~expr(tl) leaves b = k + I and creates
the same as
frans-expr (retr-expr (tl) ,k+ I)
The
rather
short
leave the reader Whilst
the
correctly
unclear as tc
notation
made the
development
treatment of Formal
via
used
step
in
from
operations,
he,
a
ref.
Development bigger [12]
development the
example
offered
example
may
looks.
is t h o u g h t to have via of
functions
to
that report
is
433
unconvincing. was
so
This is mainly because the a l g o r i t h m
oriented
to
arrays
considered
that the use of an abstract data
r e p r e s e n t a t i o n is somewhat artificial.
The
example
ref. [11] is more i n t e r e s t i n g with respect to
of
data a b s t r a c t i o n and a
short
outline
of
a
rewrite
of
its
development is now given -
Specification:
find
a
(general,
table
driven
recogniser)
algorith~ -
REC
: grammar nt s y m b * - >
{_Y_E_S,~O}
Where the abstract form of a grammar is -
grammar : nt -> rhs-set rhs = el* e! = symb I nt
The
pro-condition
n o n - t e r m i n a l and sentence
defines that
that
there
non-terminal.
there
is
The
are
exactly
rules
one
post-condition
should yield "YES" if and only if produced from the given gra$mar.
the
for
rule
states
symbol
each
for
the
that
R~C
string
can
be
("Produceable" is defined).
Step I: Splits the problem into an input stage which stores the grammar:
a main stage which c r e a t e s
"State
contain
information on all possible top-down parses:
stage which yields YES or NO depending on a state
sets.
terms of the the
which
will
an output
predicate
of
the
The storage for the grammar is still s p e c i f i e d in (abstract)
programmer
map.
assigned
This may be
the
task
a
disappointment
to
of c o n s t r u c t i n g the input
routine since he has little to work on yet. fixing
Sets"
But
the
cost
of
this interface for his c o n v e n i e n c e is that the far more
time c o n s u m i n g parsing operation has not yet been developed far enough to get h~s
views on an e f f i c i e n t representation.
The state sets are a l s o described a b s t r a c t l y
(as a list of sets
of tuples)
is
certain
since the purpose of this
stage
to
show
that
upper and lower b o u n d s on the state sets are s u f f i c i e n t
434
to make the final Fredicate correct. of abstract
definition.
Notice this
give~ bounds and different a l g o r i t h m s could be use
this
freedom.
(In
fact
the
considerable use in considering
Scanning] the
which generate nee stales. ~inimum
constructed
specification
to
has been of
(Prediction,
Completion,
The state sets are defined
sets satisfying a certain equation.
are showP to fall
form
optimisations).
Step 2: introduces E a r l e y ' s operations
as
extreme
There is a great deal of freedom in the
within the bounds stated in
Step
Such sets I.
Notice
that not o~ly are these operations defined in terms of abstract data objects, distinction
they are also
implicitly
Step
~:
not
Begins
%c
restriction would,
algorithm
4:
Makes
is not
representing ~t
this
rhs)
the
Using
~t
algorithm
the restriction.
similar ordering step to the data structure
grammars.
what the common oFerations on the
Now is the time to give
REFEP
abstraction [S].
complete.
to the allowable grammars.
the c o n c r e t e
(In fact those used were quite complicated the
lists
mcint most of the algorithms as such, is designed and
it ix clear are.
yet
have been possible to use a different
a
of
on lists it is necessary to iDtrodnce a
(no zero length
however,
by mapping state
object in a yon NeQmann machine,
at _this stage of development and avoid
Step
in
development
But notice that, since a list
a convenient data
chosen
this form of
ccnsider representations
step to a concrete representation the
is
later.
sets onto state lists. is
This
to ref. [8 ] in which the a l g o r i t h m s are programmed
with operations on the a b s t r a c t data: is employed
defined.
option.)
Doing
data
(~L/I)
structures
data objects!
HASEE variables with
this prompts a macro style of data
like refo [5] which is similar to that used in ref.
435
4. SU~M~RY
The
aim
of the paper has been to show how a large problem,
this case the development of a compiler, can be decomposed small
steps.
Providing
each
step is adequately
complete design history is thus
obtained.
One
in
into
documented, of
the
a
views
expressed is that each stage of development should be supported by a justification. recorded
in
a
~his
notation
correctness argument. with
a
view
implies
to
cn
Such
that
which
it
correctness
steps
of
design
are
is nossible to base a arguments
are
sought
human readers rather than mechanical theorem
checkers.
The
key
to
making
ahstraction. minimum
properties
more difficult to provide
such
a
an approach practical is the use of
In each of the sections the value of stating has find
been shown. an
construction,
the
Although it is frequently
~FFropriate
abstraction
than
to
the a d v a n t a g e s of the former make the
effort worthwhile.
By
emF]cying
both data and operational abstraction,
the development are
rea]isations
of detail. argument
is based on showing how the from the
it is possib]e
The
of the same algorithm at ever greater
Taking this view, the normal
he retrieved
operations
design
(more)
of
levels
correctness
abstract model can
(more) concrete realisation.
In this
of
a
way
design language to be able to specify
imp!ici%ly and tc ~se very abstract data objects
different
particular
style
tc avoid general equivalence proofs.
requirements
very
a view of
process is obtained where the s u c c e s s i v e stages
from these of ~rogramming languages.
netatiGn
language,
(that of ref.
are
A!tho~gh a
[4]) has been employed as the
it should be emphasised
that it is the method
not a Farticular notation which is being proposed here.
436
Ackncwle~sement
~ost
of
the
ideas
contained in this paper were developed
co]]ahcraticn
with
se~bers
of
Laboratories.
The members and
the
Hsrs!ey
and
vienna
i~ IBM
meetings of IYIP WG 2.3 have also
been a great stimulus.
~eferences
[I]
C.9.%]]en~
9.N.Chap~an
Definition
of Algol 60",
12.1~5 August
[2]
[Ed.)
on
~ormal
Report,
TR
Algorithmic
Languages"
Lecture
Notes
in
1979.
H.Beki&~
P r e s e n t a t i o n on "Semantics of Actions" given at
Newcastle
University,
September
197~.
Ho~eki£ et a] "A Formal ~efinition of PL/I" to be printed
A.Hansal,
Re~ort Gf T ~
"Soft,are
Laboratory Vienna.
Devices for Processing Graphs Using
Facilities",
W.~e~hap]
a~d
Statements
in the VDL ~, I ~
March
[7]
of
SF~iDger-Verlag
IS8, October
PL/I c o m p i l e - t i m e
[6]
IB~ Hursley T e c h n i c a l
Semantics
E.Engeler,
as a Technical
[5]
"A
H.~ekie and K. Walk, " P e r m a l i s a ~ i o n of Storage Properties"
~ a t h e m a % i c s No.
[~]
C.E. Jones,
1972.
i~ ~'Sym~osiu~
[3]
and
C.B. Jcneso
Into Proc Letters.
1974.
~'On the Interpre%ation of Gore Vienna
Note,
LN
25.3.065,
1970.
~.He~hapl
and
C.B. Jones,
Possible Implementations, IBM Vienna Tech,ical
"The with
Hlock Proofs
Report, TN 25.104,
Concept and Some of
~quivalence",
April
1970.
437
[8]
C.A.R.Hoare,
"Proof
?epresentations",
ef
Correctness
Acta Informatica,
Vo!.
of
1,
pp
Eata 271-281,
1972. [9]
C.B.Jo~es
and
Implementation Algorithmic
P. Lucas,
Techniaues",
Languages"
"Proving
(Ed.) E. Engeler,
Lecture Notes in Mathematics [10] C.B. Jones,
"Sufficie ~t
Correctness:
Assignment
Correctness
in "Symposium
on Semantics
Springer-Verlag
No. 188, October
Properties Language",
of of
for
1970.
Implementation
IB~ Hursley
Note,
TN
9002, June 1971. [11 ] C.B.Jcnes,
"Formal Development
Example Based on Harley's S!GPLAN Conference,
of Correct
~ecogniser",
SIG~AN
Algorithms:
presented
at
An AC~
Notices Vol. 7, No.l, January
1972. [12] C.B.Jcnes,
"Formal Development
Technical
Deport,
[13] P.J.Landin,
"R
Correspondence
Church's Lambda-Notation: No.2, February [14] P.Lucas,
of Programs",
Constructive
"On
Development
[16] P.Lucas
Program
Realisations
cf
I mple me nfa tions",
Press,
[17] J.McCarthy, Computation"
60
ACM,
and
Vol.8,
of
the Block
Vienna
Technical
and
the
presented
Stepwise at
IB~
1972.
K. Walk, "On the Formal Description
in Annual Review in Automatic Pergamon
IB~
Correctness
at Pisa University,
and
Algol
1965.
"Two
Conference
Between
Part !", Comm. of
Co,cept and their Equivalence", Report, TR 25.085, 1968. [15] P.Lucas,
!~M Rursley
TR 12. 117, June 1973.
Programming,
Vol.6,
of PL/I" Part 3,
1969. "Towards
a
Mathematical
presented af !FIP Congress
1962.
Science
of
438
[18] R.~il~er,
"An Algebraic
Programs"~
Stanford
[ Ig] P.gosses,
"~he
Definition
University
~athematical
Oxford University Computing
of Simulation
AIM-I~2,
February
Semantics
Laboratory,
of
Between 1971.
Algol
PNG-12,
60",
January
197g. [20] Po
Naur,
"An Experiment
pp 3~7=365, [21 ] J.C.~eynolds,
Languages",
Conference,
August
Languages",
Computers
Series
Brcok!yn,
1971.
[23] C. Strachey,
presented
~icrowave
Vo!.21,
"~bstract
PL/I", TBM Vienna Technical [25] "PL/I BASIS/I"
National
ACM
A
Semantics
of the Symposium
Research
Polytechnic
,'Continuations:
al,
for Higher-Order
25th
"Toward a Mathematical
which can deal with Full Jumps", et
at
in "Proceedings
and Automata",
Symposia
[2~] K.Wa~k
HIT 12,
1972o
and C.Strachey,
for Computer on
Eevelopment",
~'Definitic~al I,terpreters
Programming
[22] DoScott
on Program
1972.
Institute
Institute
Mathematical
of
Semantics
unpublished.
Syntax
and Interpretation
Report, T~ 25.098,
ECMA ANSI working document,
of
lg69.
February
19g~.
439 A~pendix This appendix c o n t a i n s a definition merging
the separate
written
in a style
mathematical
features
of the language obtained
of section
I.
(cf. the tlpe clauses)
The definition
is
which can he read
as
sesantics.
B___STIR A ]%CC~. SYNTAX prOC
:: s-nm:id
St
=
as-st
:: s-lhs:id
goto-st
:: in
call-st
:: s-pn:id
as-st
s-parms:id~
~ gore-st
s-args:id~
:: nmd-st*
nmd-st
:: s-ns:[id S s-body:st
expr
=
inf-expr
:: expr o~ expr
vat-tel
:: id
const
:: INTG
in
}
op
)
~ vat-rot
/'args dcl,
~ const
not further specified
INTG )
DOMAINS
ENV: i d - > S: L O C - >
(LOCI
v~n = I ~ T G
PROC-DEN: AB.
= [ id ]
~ROC-DEN)
VAL
LOC = infinite
cpd-st
~ cpd-st
s-rhs:expr
cpd-st
inf-expr
proc-set s-dcls:id-set
| call-st
set
I ! (LOC I PROC-DEN}~
-> (S -> S ABN)
bT
proc or parma/
440
FUNCTIONS
e v a ! - ~ r o c - d c l (proc) (env) = le_t < i d , p a r ~ - l , ~ r o c s , d c l s g m k - c p d - s t ( n s - l )
> = proc;
!e_t f~den:!)= {!_~! env'
: e~v + ([ par,-![i ]
den-l[i]
~ 11
an= a o ( a 2 ) { n - 1 ) / 2
Somit e r h a l t e n w i r f a l l s n=l f a l l s n gerade sonst
a n = ~(a2) n/2 I ~.(a2) (n'1)/2
(5)
oder umgeschrieben
(5 ~ )
1)
proc p2 = (!~ a, N n) M: if n:l then a elsf
even n then p 2 ( q u a ( a ) , n/2) else a.p2(qua(a), (n-l)/2)
fi,
mit proc
I)
qua = (M a) M:
a.a
.
Man b e a c h t e , dab der 0bergang von (4) nach ( 4 ) nach ( 5 ' )
eigentlich
nur o r t h o g r a p h i s c h e r
bzw. yon (5)
Natur i s t .
455 Wir haben h i e r
e i n e Form d e r P o t e n z b e r e c h n u n g ,
gewendet w i r d ,
da s i c h d e r T e s t ,
die Division gesetze
durch
2
leicht
ob
n
die
h~ufig
geradzahlig
realisieren
lassen.
(P1) und (P2) geben aber Raum auch f u r
ist
anund
Die Rechen-
andere Ver-
fahren:
(6)
p r o c p3 = (M a, N n) M: if
n = I
then a then q u a ( a )
elsf
n = 2
elsf
n mod 3 = 0 then c u b ( p 3 ( a , n / 3 ) )
elsf
n mod 3 = I
then a . c u b ( p 3 ( a ,
(n-l)/3))
else qua(a).cub(p3(a,
(n-2)/3))
fi
mit
proc cub : Auf e i n e
n~here D i s k u s s i o n
gegenUber nicht
von (6)
vorteilhafter
sein
NatUrlich
p2 und p3 ~ q u i v a l e n t
Funktion Gibt
(5')
n~her e i n g e h e n .
ten p l ,
wir
(~ a) M: a . q u a ( a ) .
zum Axiom
M
(A)
Frage, wir
warm (6)
hier
Rechenvorschrif-
dab s i e
dieselbe
ein
Einselement,
dann haben
das Axiom
3eEM: ¥xEM: x . e = e . x : x
Das E i n s e l e m e n t
ist
ein zweites
, dann f o l g t
e'
eindeutig
e.e'
= e, a b e r auch
e.e'
= e', a l s o
e
=
bestimmt,
denn g i b t
es neben
e
e'
Somit k~nnen w i r
mit
Hilfe
gew~hlte Bezeichnung fur eins :
Menge
sind die drei
i n dem S i n n e ,
es nun i n d e r H a l b q r u p p e
Gilt
wollen
repr~sentieren.
zus~tzlich
(E)
oder a u f d i e k~nnte,
Axiom NU{O}
eines dieses
~eEM: VxEM: x - e :
(E)
, so kann d i e
Kennzeichnungsterms eine freiEinselement e.x :
Definition
der nicht-negativen
einfUhren
x.
d e r Potenz a u f d i e
ganzen Z a h l e n a u s g e d e h n t wer-
456 den, was b e k a n n t l i c h a° = e i n s
setzt.
Halbgruppe ist, nition
die
die erstere
letztere.
wobei
Wir
Struktur
dab d i e D e f i -
kompatibel
ist
mit
bekommen a l s o d i e R e c h e n v o r s c h r i f t
p
n = 0 then e i n s e l s e
mit
pl
~
p2
oder
p r o c ( M , N ) M p = p2.
p(a,n)
p3
p
bei
im S i n n e e i n e s d e f e n s i v e n ~ d . h .
werden kann,
auf eine b e r e i t s
d er K o n s t r u k t i o n die
mes k o n s e r v i e r e n d e n , P r o g r a m m i e r s t i l s Auch im H i n b l i c k
fi,
identifiziert
Der R U c k g r i f f
handene R e c h e n v o r s c h r i f t
Korrektheit
sollte
I d e e kommen, etwa a n s t e l l e
sehr z w e c k m ~ i g
man n ~ m l i c h von
p2
yon
mit
q
kann
sein.
ist
d i e s e Vor-
sp~ter wirklich p3
vor-
e i n e s Program-
auf die ~nderungsfreundlichkeit
gehensweise ratsam, fen,
Einselement eine
p r o c q = (~ ao NU{O} n) M: if
z.B.
da~ man
Halbaruppe mit
w i r d man d a r U b e r h i n a u s f o r d e r n ,
der Potenz f u r
der f u r (7)
i n der Weise q e s c h i e h t ,
Da nun j e d e
arbeiten
auf die zu w o l -
so kann d i e ~nderung d a d u r c h g e s c h e h e n , da~ man d i e
Iden-
tit~tsdeklaration p r o c ( H , N ) M p = m2 durch proc(M,N)M p = p3 ersetzt. Nimmt man nun zu den Axiomen
(A)
und
(E)
die
E x i s t e n z des
Inversen dazu, (I)
VxEM: ~yEM: x . y
dann e r h ~ I t
= eins,
man d i e S t r u k t u r
y = yo(x.y') eindeutig
= y.x
bestimmt
zeichnungsterms die
= (y.x)'y ist,
w~hlte Bezeichnung f u r wie bei
Gruppe. Da das I n v e r s e wegen
~ = y'
k~nnen w i r
Intention
seiT~inverses z u o r d n e t , dabei ~hn!ich
einer
formal
wieder mit
der F u n k t i o n ,
Hilfe
eines
Kenn-
d i e jedem Element
niederschreiben
und e i n e f r e i g e -
diese Funktion einfUhren.
Wir v e r f a h r e n
d e r E i n f U h r u n g der B e z e i c h n u n g
eins
457 proc i n v e r s :
(Mx) M: IyEM: y . x = x . y = eins
Wir nennen d i e s e D e k l a r a t i o n diese Deklaration angibt,
nicht
tionswerte.
lediglich
Entsprechend
genschaft eines Objektes, aber k e i n V e r f a h r e n
Menge
M
(die ja
ist
eins
auch d i e D e k l a r a t i o n intentional,
Z
da d o r t
d i e es e i n d e u t i g
der Funkfur die freizwar e i n e E i -
bestimmt,
zur Auswahl d i e s e s Objektes
sowieso n i c h t
procr
aus der
Gruppen a u f d i e Menge
d e r ganzen Zahlen u n t e r Wahrung der K o m p a t i b i l i t ~ t
(8)
angegeben
k o n k r e t gegeben i s t ) .
Die U b l i c h e E r w e i t e r u n g der Potenz f u r dann s o f o r t
, weil
d i e E i g e n s c h a f t der F u n k t i o n s w e r t e
aber e i n V e r f a h r e n zur K o n s t r u k t i o n
g e w ~ h l t e Bezeichnung ist
i n t e n t i o n a 1
liefert
die Rechenvorschrift : if
(Ma,Zn) M: n