Programming Methodology: 4th Informatik Symposium, IBM Germany Wildbad, September 25–27, 1974 [1 ed.] 3540071318, 9783540071310 [PDF]


192 81 17MB

English-German Pages 502 [504] Year 1975

Report DMCA / Copyright

DOWNLOAD PDF FILE

Table of contents :
On the development of systems of men and machines....Pages 1-10
A new look at the program development process....Pages 11-37
Organizing for structured programming....Pages 38-86
The reliability of programming systems....Pages 87-113
Fehleranalyse und Fehlerursachen in Systemprogrammen....Pages 114-160
APLGOL a structured programming language for APL....Pages 161-191
Systemprogrammierung aus der Sicht der Universitaet....Pages 192-202
Systemprogrammiersprachen und strukturiertes Programmieren....Pages 203-224
Software engineering or methods for the multi-person construction of multi-version programs....Pages 225-235
Knowledge and reasoning in program synthesis....Pages 236-277
A new approach to program testing....Pages 278-290
Interprocedural analysis and the information derived by it....Pages 291-321
Neue Verfahren zur Optimisierung und Parallflisierung von Programmen....Pages 323-346
Automatic programming....Pages 347-361
Nonprocedural programming....Pages 362-385
Formal definition in program development....Pages 387-443
Programmierte Strukturen....Pages 444-465
Axiomatisierung von Programmiersprachen und ihre Grenzen....Pages 466-476
Formalization, history, present, and future....Pages 477-501
Papiere empfehlen

Programming Methodology: 4th Informatik Symposium, IBM Germany Wildbad, September 25–27, 1974 [1 ed.]
 3540071318, 9783540071310 [PDF]

  • 0 0 0
  • Gefällt Ihnen dieses papier und der download? Sie können Ihre eigene PDF-Datei in wenigen Minuten kostenlos online veröffentlichen! Anmelden
Datei wird geladen, bitte warten...
Zitiervorschau

ON THE DEVELOPMENT OF SYSTEMS OF MEN AND MACHINES H. D. Mills,

IBM Corporation

and The Johns Hopkins University

With Appendix by R. C. Linger,

IBM Corporation

ABSTRACT

We formulate the development programming

of systems of men and machines

problem for multiprocessing

are men and some are machines. courses,

etc., are determined

as machine specifications, of the total operation. miniature

in which some processors

In this way, users guides,

training

from processing requirements,

just

which are consistent with the objectives

An Appendix illustrates

for a supermarket

as a

this idea in

checkout operation.

Systems of Men and Machines

Our topic is the architecture, large s y s t e ~

implementation,

of men and machines

enterprise -- managing a business, tion system,

women)-tenders;

operating an airline reserva-

running a government agency,

and back, etc.

and operation of

in some definite and coherent

getting men to the moon

In such systems there are many kinds of men (and

managers,

clerks, specialists

of various kinds, and machine

there are also many kinds of machines -- computers,

minals, sensors,

Ordinarily

actuators,

communications

computer programming

of the system, and programmers

equipment,

that modern principles necessity

etc.

is regarded as part of the machine side as part of the machine tenders.

We bring a different view -- that the architecture tions of the enterprise

ter-

is programming, of programming

as well.

of the operaOur thesis is

-- forced on us out of

in dealing with machines of much logical capability but no common

sense -- can play as vital a role in bringing systematic

discipline and

standards into all phases of large systems development.

For this point of view we escalate the concept of programming providing comprehensive The operations

instructions

to that of

for either man or machine activities.

of an enterprise becomes a multiprocesssing

operation of men

and machines;

then the architecture

tions and types of men and machines direct each type of man and machine. one part of a cooperating

of the operations in the system,

defines the configura-

and the programs which

For example, a users guide becomes

system of programs operating in separate processors

(the user and a machine).

Of course~

the characteristics

of man and machine are quite different

in

such a system, just as machines are different among themselves.

And yet

their architectural

For example,

properties

can be treated in a uniform way.

in deciding on a particular machine requirement,

various considerations

of

physical or logical capability will arise, as well as whether the requirements can be met with off-the-shelf apply to a man requirement,

equipment;

these same considerations

in common terms -- e.g., how many letters can

a postal clerk sort in an hour, and can this be done with minimal

training,

etc.

It is well understood things.

that men and machines do well at quite different

Machines are good at doing what they are told to do, very rapidly

and accurately. instructions

Men are good at using common sense -- even disobeying

when they are obviously misconceived;

-- discovering

at pattern recognition

information by no special or dependable process;

invention -- creating a new idea for the enterprise to act on. but very important

form of pattern recognition

speech to and from machine readable text. activities

and at One simple,

is the translation of human

This is routine for clerical

that interface with persons outside the enterprise -- e.g., in

an airline reservation system.

It may be asked how the concep~ of programming so differently and unpredictably,

applies to men, which operate

compared to machines.

noting that programs are used in a local way by machines moment,

under these conditions,

do this next.

It applies by -- i,e., at this

A good deal of programming

on the man side is already subsumed under general instructions

and common

sense -- if the telephone rings, answer it; if you want to execute a program, keypunch a job deck and submit it to the operator. of explicitly programming

human activities

We have no intention

now done by general instructions

and common sense; the typical level of human programming which is found in users guides, operating with machine operations.

instructions,

envisioned

is that

ere., associated

Principles of Pro~rammin~ for Men and Machines

The recent, twenty-five year, history of computer programming has seen an explosive and traumatic growth of human activity and frustration in attempting to realize the promise and potential of electronic and electromechanlcal devices for processing d~ta in science, industry and government.

Out of

this history has come the stored program computer; programming languages, compilers, and libraries; and new technical foundations for programming computers, e.g., as propounded by McCarthy [ 4], Dijkstra [ i], Hoare [ 3], and Wirth [I0].

In this period the computer has been the center of attention.

In the beginning, the numerical computing power was so great, compared to manual methods, and the availability so limited, that men readily adapted to this new tool -- from decimal to binary, from equations to algorithms. But in a short time, the remarkable possibilities for more general data processing (nonnumerical) were realized, and a new industry was horn in just a few years.

In the later part of this period (up to now) the large

data processing systems appeared -- management information systems, airline reservation systems, space tracking systems, etc.

Even then, although

human factors were considered, these systems were conceived primarily as data processing systems, which responded to users.

But in our proposed

perspective, the users are as much a part of the multiprocesslng enterprise as the machines and their programs.

Thus, whereas the computer has forced us to find more effective programming principles than we would otherwise have, it has also warped our sense of perspective. focus.

By this time, simple human factors questions are in better

There is enough data processing power available to invest part of

it in creating more human-like interfaces than binary and machine code. But in extending programming to instructing men, with their entirely different characteristics, additional ideas and principles are needed, such as

1.

Languages.

The programming languages used in the architecture of

systems of men and machines need be near natural languages.

The difficulties

of processing natural languages are well known and that is not proposed. Rather, what is proposed is a "naturalization" of processable programming languages which is close enough for use as a dialect of natural language by nonprogrammers of the enterprise. both sides of this.

Sammet [ 8] and Zemanek [ii] discuss

2.

Procedures.

The concept of procedure should be extended to include

indefinite procedures where it is not possible or desirable to define them. In simple terms, "find values for these variables so that these equations hold" (Wilkes discusses this in [ 9]), or more complex, "make sure no one's feelings are hurt".

3.

Interactions.

The principal subject of the architecture is multi-

processing -- the conduct of the operation of the enterprise through programs distributed to men and machines.

The creation of such programs in an orderly,

systematic way will require a new and fundamental development beyond programming principles for synchronous operations.

The idea of the Petri net [ 7]

may be an embryonic step in such a development.

Dennis illustrates

Petri nets as multiprocessing control mechanisms, in [ 2].

Architecture P r i n c i l ~

A system depends on its components.

The architecture of a system specifies

the types of men and machines required~ as well as how they are to interact as a system operation, i.e.~ the selection and arrangement of the system components, as well as detailed instructions for their behavior.

In the

case of machines, the usual considerations apply -~ define feasible requirements, either already embodied in existing maehines~ or possible with special development where justified; tradeoffs and comparisons with alternative approaches, even with manual approaches where feasible~ etc. men, the system architect is frequently shortsighted.

In the case of

There is a reason; it

is much more difficult to predict a human performance than a machine performance.

Sometimes the human fails to live up to a requirement.

But often

the human will exceed a requirement in a totally unexpected way, by acquiring a skill not imagined possible beforehand.

This is happening with computer

programmers right today, who are beginning to program with a precision not believed possible five years ago.

It happened with typing when touch typing

was introduced early in this century.

In retrospect, it is easy to identify a pitfall in overestimating machine possibilities, leading to a common scenario in many large systems in recent history, which put a "man in the loop" at the last moment, with marginal operational results.

In such a case, the operation was originally planned

as completely automatic, depending on some key algorithm (often involving some form of pattern recognition); as a result of the planning, the data processing functions surrounding the algorithm were developed in parallel

according to a general system design; then at the last moment, the performance of the key algorithm proves inadequate, and the man is brought into the loop, with two costs: 1.

The human factors are bad (e.g., interacting with programs which

require long, fixed argument lists); these can be fixed up.

2. the loop.

There is a lost opportunity in not having a better trained man in The effort on algorithm development is frequently different than

that required for insight development for a human executing an indefinite procedure, and the time is lost, anyway.

These operational experiences lead to the following principles of systems architecture:

i.

Components.

Regard men and machines as equal status components

for system operations with equal requirements for development, state-of-art projections, and improvement, according to their own characteristics.

2.

Evolution.

Plan on the unexpected, by well-structured interfaces,

that permit the replacement of components by improved versions which perform identical functions more effectively.

Parnas [ 6] deals with this inter-

changability by axiomatizing such interfaces.

3.

Intesrity.

Value system integrity above all else, by requiring

that the multiprocessing operation of men and machines be described and scrutinized according to the best principles of programming -- particularly with respect to methods of specification and validation of programs.

Note

especially the technique defined by Wirth [I0] and Mills [ 5].

In an Appendix dealing with a minature problem, R. C. Linger illustrates the idea of programming an operation through a set of abstract processes which only later are specialized to either men or machines, depending on their requirements.

It is an easy transition from man as a user to man as

a processor and yet it seems a critical one in providing system coherence and integrity with respect to a given operation.

Literatur

[i]

O. ~J. Dahl, E. W. Dijkstra and C. A. R. Hoare, Structured Programming, Academic Press, London. 1972.

[2]

J. B. Dennis~ ~Concurrency in Software Systems", Lecture Notes in Economics and Mathematical Systems, 81 Advanced Course on Software Engineering (Ed. F. L. Bauer), Springer-Verlag, Berlin, Heidelberg~ New York.

[3]

C. A. R. Hoare~ ~An axiomatic basis for computer programmlng"" CACM 12.

[4]

1973.

i969.

pp. 576-580, 583.

J~ McCarthy, "A basis for a mathematical theory of computation", Co___mputer pro~rammin$ and Forma! Systems, (Eds. P. Braffort and D. Hirschberg)~ North-Holland Publishing Company.

[5]

(To appear).

D. L0 Parnas, "A technique for software module specification with examples", CACM 15, 5.

[7]

pp. 33-70.

H. D. Mills, "The new math of computer programming", CACM 17, 1975.

[6]

1963.

May 1972.

pp. 330-336

C. Ao Petri, Communications with Automata.

Supplement i to

Technical Report KADC-TR-65-377, Vol. i, Griffiss Air Force Base, New York.

1966.

(Originally published in German:

Kommunikation mit Automaten,

[8]

J. E. Sammet, "The use of English as a programming language", CACM 9, 3.

[9]

March 1966.

pp. 228-230.

M. Wilkes, "Constraint-type statements in programming languages", CACM 7.

[i0]

University of Bonn, 1962).

1964~

pp. 587-588.

N. Wirth, Systematic Programming: Hall, Englewood Cliffs, New Jersey.

[Ii]

An Introduction, Prentice1973.

H. Zemanek, "Semiotics and programming languages '~, CACM 9, 3.

March 1966.

pp. 139-143.

Appendix

SUPERMARKET

CHECKOUT AS A MAN-MACHINE MULTIPROCESSING

ACTIVITY

R. C. Linger, IBM Corporation

Consider the problem of specifying a design for a super market checkout operation.

Several possibilities

and cashbox,

come to mind; a man with an adding machine

a man with a cash register,

encoded prices,

etc.

Whatever

design with a procedural

a man with an OCR device which reads

the final configuration,

we can begin our

description of the checkout process itself.

view, the dynamics of the process are of central interest.

In this

That is, we do

not begin with a static description of components which must somehow fit together in a presumed system~ but rather begin with a process which can be shown to work, and derive required

(and possibly alternate)

urations of men and machines from it.

component config-

Our tentative first refinement

is:

start checkout open checkout station at 9 a.m. do while before 5 p.m. checkout next customer,

if any

od close checkout station stop

Here, the actions of men and machine are abstracted;

we cannot determine

which does what, but can agree that the description seems reasonable so far. Our second refinement might be: start checkout establish man on duty at 9 a.m. power up machines at 9 a.m. do while before 5 p.m. accept next customer,

if any

total cost of all items inform customer of total accept payment present change, if any, to customer bag all items od balance cash total with sales total power down machines stop

Here the functions of man and machine begin to emerge; presumably

they will

cooperate to "total cost of all items" through manual keystroke entry or OCR input, and the man will "bag all items," without mechanical identify

"before 5 p.mo" as a man predicate,

for clockwatching,

We

which hopefully yields to common sense to complete checkout

of customers waiting in line at quitting timel processing"

assistance.

in view of the human propensity

arises in this procedure,

coffee break is desired~

The possibility

of "exception

as when the machines break down, or a

These situations

are analogous

to the interrupt

handling facilites of modern computers.

An aspect of the activity distribution between man and machine can be explored through elaboration

of the process to "total cost of all items:"

start total cost of all items initialize subtotals

to zero

do while items remain if item type is produce then add price to produce subtotal else if item type is meat then add price to m e a t

subtotal

else add price to grocery subtotal fi fi od add subtotals

to find total cost

stop

The cooperating process here~ expressions

give and take actions of man and machine appear as an abstract

"items remain" is likely a man predicate,

appear to be machine predicates

set by man.

lations are best handled as machine functions.

while the "item type" The subtotal accumu-

We can establish concrete procedures for man and machine by defining an interface between them.

In illustration, consider an electromechanlcal

cash register with control keys as follows:

0, 1,...,9

price digits decimal point

I

initialize (for new customer)

P

produce

M

meat

G

grocery

T

total

For this interface, the man procedure becomes

start total cost of all items -- man part push I do while items remain get next item push digit and decimal keys for price i_~fitem type is produce then push P else i f item type is meat then push M else push G fi fi od push T stop

and the corresponding machine procedure is:

10

start total cost of all items -- machine part (I key depressed) set produce subtotal to zero set meat subtotal to zero set grocery subtotal to zero do until T key wait for TIPIMIG key i f ~ T key then read and clear price register I f P key then add price to produce subtotal else i f M key then add price to meat subtotal else add price to grocery subtotal fi fi fi od add produce, meat, grocery subtotals display result

A different set of procedures derive from an alternate interface, as with a cash register using OCR input, and equipped with only I and T keys.

In

this case the man procedure simplifies to

start total cost of all items -- man part push I do while items remain pass item label over OCR window od push T

and the matching machine procedure must extract both item type and price from encoded labels.

We observe that the man procedures above can evolve naturally into users guides and training courses, possibly containing instructions no machine would understand, as, "Don't be distracted while pushing price keys."

The

machine procedures give a explicit basis for electromechanical component design, or executable procedures in the case of programmable devices.

A New Look at the Program Development Process P. Hiemann,

IBM Boeblingen

This paper is a contribution to the discussion of defining a program development process frame with distinct validation points for analysing and measuring process results providing methods,

techniques and tools to support

the program development process.

O.

INTRODUCTION

In the 25 years history of professional system programming,

system

programmers have developed systems of increasing complexity in function and size. Their experience shows that it is increasingly more difficult to meet specified system integrity/quality objectives without complying to disciplined programming methods based upon a firm definition of the program development process. System programmers have identified and developed a series of programming support techniques which are mandatory for producing and validating a high quality of specifications and code. In the first section of this paper, a process frame with its validation points and some basic process measurements are described. The second section describes the methods supporting the program development process. The final section of the paper provides some proof regarding the quality of succeeding OS releases that have been improved by applying new process support techniques.

12

I.

THE PROGP~d~9,1ING PROCESS AND SOME MEASURES

Most people

think of programming

four activities Designing:

(Figure

Where

Testing_! Where

While

during

is written°

that

testing

the pieces

an application, testing

composed of

do and how it w~ll work.

it is verified

uncovered

I n t e g r a t i n g : Where create

will

Where the program

problems

as being

it is determined what is needed and defined

what the p r o g r a m m i n g Coding:

cycles

I):

the program will work and are fixed.

or modules

a subsystem,

and integration

are put together

or an entire

are closely

to

system.

related

in that inte-

gration has to take place before

larger units of code can be

tested,

if one has to go back to design

it is resource

during coding The results programs

or even testing.

of coding,

of growing

testing

to the smallest

It may be a module,

- A function a command, - A component

is composed

a macro,

is composed of functions.

- The components

self-contained

piece

of

A function may be

routine. A component may be an

or the supervisor.

are integrated

I, OS Release

2)

or a subroutine.

or a recovery

a compiler,

are operational

(Figure

of program units.

a parameter,

access method,

Release

and integration

size and complexity

- Program Unit relates code.

consuming

to form a system~

21, and so on.

as VS/2

13

About

two or three years

and statistical

ago,

information

customer was experiencing

IBM started to analyse

relating

to quality of systems.

an APAR rate that was too high.

shows the average number of all APARs in other words, customers. APARs

financial

all submitted APARs,

The

Figure

3

for OS submitted per customer, divided by the number of

The area labeled Valid is the average number of valid

submitted per customer.

A valid APAR is one that requires

a fix either in code or in documentation. valid APARs

and the total,

user errors.

largely represents

The average number of APARs

not changed significantly

The difference duplicates

between and

submitted per user has

over the past four releases,

nor has

the average number of valid APARs. Corresponding

to the high error rates the analysis

76 % of the cost for Release

only 24 % was spent on development cludes

cost related to designing

and printing publications, integration. laboratory

Maintenance

(Figure 4). Development

and coding programs,

and to performing includes

in-

to writing

all testing

and

FE cost to find problems

also indicated that the cost of finding

increased as the programming

process progressed.

and

thirty times the cost of fixing it in unit test, that the code goes through

The cost of finding As we progress

a problem during

coding

at each testing

stage. More

with the code as it moves

(Figure

5).

is virtually nothing.

field usage,

and more people

through the cycle,

(an APAR) was

the first set

in development

±rom unit test through

a problem

The cost of finding

and fixing a problem after the system was released

creases

and

costs to fix the problems.

The analysis

of tests

showed that

]8 was spent on m a i n t e n a n c e

the cost inare involved

and more people

and their work are affected when a problem occurs. The net result of the analysis was to establish that will help in: reducing

errors

finding problems

as early as possible.

an improved process

14

The base of the new process was a frame of seven distinct phases with distinct Validation

validation points at the end of each phase

regards

to

Quality of the results produced Progress

in terms of goals

Usefulness

determined

during one phase

and schedules

and Profitability

to be developed The process

of the system/component/function

further.

should be continued validation

criteria

to its next phase only~

about the functions

data processing

capability.

if pre-

are met.

Phase 0 is a planning phase to develop statements

(Figure 6).

requirements.

These are

needed to provide new or improved

People study,

analyze,

and perhaps

survey users to determine what they require. Phase I determines package° formance~ Phase

the requirements

It addresses

configuration,

storage

programming

requirements,

per-

etc.

II is the external

commands,

for a particular

parameters,

design phase.

output

major system components

the module

and interfaces

and the interfaces

design phase~

structure

are developed.

that covers all functional Phase 0 to Ill are strictly the implementation

(for example with

are developed.

Phase !II is the internal the program,

formats)

User interfaces

The internal

logic of

as well as internal

During this phase,

variations sequential

and resting'phases,

data areas

a test plan

is also developed. steps; phase however,

IV and V

interact with

each other~ Nevertheless,

Phase

IV is the main coding phase

and function

testing.

including

unit

15

Each phase has its own unique output or results: Documentation

in the first 4 phases

Code in Phase IV A system in Phase V Fixes in the maintenance Besides

the planning

directly Market

Phase VI

documentation,

the following

documents

relate

to the program being developed:

requirements

define a need,

considerations.Objectives programming

package.

of the program,

independent

of any programming

state the requirements

External

specifications

its user interfaces,

other parts of the system.

and its interfaces

Internal specifications

logic and the method of operation of the program interfaces.

Code is also considered

field engineer

receives

of a particular

define the purpose the

and its internal

as documentation

source code documented

with

describe

since the

in the form of

microfiche.

2.

THE METHODS

According

SUPPORTING THE PROGRAMMING

to the phase structure

PROCESS

of programming

projects,

there

are -

methods

supporting

-

methods

and criteria

results

of a phase

In describing -

the programm

the methods,

Principles Tools and Techniques Management

Aspects

development

process

to check the completeness

this paper distinguishes

of the

between

16

2.1

Principles

2.1.1

Top-Down Design

Top-down design is used today for most new development projects° In top-down design, we take a functional viewpoint.

We define

the basic functions of the program and design them with a high level of detail.

Then each of those basic functions

into more detailed subfunctions. of detail,

is broken

We move down through levels

creating a tree structure like the one shown in Figure

We continue this process until all subfunctions a consistent

are defined to

level of detail. When we are finished with the top-

down design process,

we will know about all of our interfaces,

all of our logic decisions, box in the tree structure a code segment,

and how the data is structured.

represents

Each

a function and, in most cases,

an inline block of code, or a subroutine

that

does not exceed a listing page. Top-down design avoids simultaneous, finition.

inconsistent

interface de-

Top-down design has reduced the complexity of the design

and of the programs

that result from the design.

for orderly logic development.

It also provides

When design is done from the bottom

up, decisions may be assumed to be in the upper-level

logic,

when in fact they should happen in the bottom levels.

2.1o2

Structured Programs

(Figure 8)

To reflect the program structure

that has been developed during

design, we emphasize on development of structured code. Structured code is composed of one-page

segments.

Control always enters

a segment at its top and leaves at the bottom. All segments referenced by name. GO TOs are not permitted. returns

7.

to its caller or moves

are

Each segment either

inline to the next segment.

17 Some advantages

-

of structured

code are:

It is easy to read. A listing of a program can be read and understood if the program

- Structured

can learn structured

and practice;

Elements

Structured

coding methods.

and to extend.

coding in about a week of

and once he has learned it, he generally

will not go back to traditional 2.1.3

of Structured

methods. Code

(Figure 9)

coding uses basic program elements

to code any program.

By using simple building

blocks

THEN-ELSE,

we can simplify programming,

and DO-WHILE,

code

This is often impossible

is written using traditional

code is easier to maintain

A programmer coursework

very quickly.

in structured

for sequential

statements,

IF-

reducing

complexity. 2.1.4

Continuous

As mentioned

earlier,

of code pieces

(Figure

code development

of growing

We use a process integrated

Integration

10)

implies

size and complexity

called continuous

the integration into the system.

Integration.

Functions

into the system as they become available

to a disciplined

plan.

be added in a logical

"Disciplined" sequence,

voted to the right functions

means that functions must

and that resources must be de-

at the right time. We start inte-

gration early,

for large efforts

fore shipment.

A driver is a subsystem used for subsequent

Drivers

are

according

are built by continually

as much as eighteen months beadding

functions.

system followed top-down design and implementation, can take the integration plan directly

testing.

If the entire then one

from the top-down plan.

18

Here

are some of the advantages

of continuous

integration:

Code is entered

into the system when it is complete.

sets of modules

are added at one time,

to put all the modules gration process

Small

instead of trying

together at once~

i.e.

the inte-

is less complex.

A real system is always

running,

and it is easier to keep

it running. Detailed planning to think through

becomes

their dependencies

faces will become more to become more

2.1~5

Tests

We d i s t i n g u i s h gration

necessary.

apparent.

realistic

are forced

so that their inter-

It also forces people

in their scheduling.

& Test Criteria between

Programmers

(Figure

11)

four types of tests

according

to the inte-

levels:

. Unit Test o Function Test Component

Test

System Test Unit testing It tests

is done by the developer

the smallest

a module~ completion

a macro,

self-contained

or a subroutine.

after his code is completed. piece

of code,

The m i n i m u m

is that all the branches

for example

criterion

for

in the code are executed

both ways. Function test occurs test, we put units a command,

after unit

test is completed.

together to form discrete

a recovery procedure,

~or c o m p l e t i o n

is 1OO % execution

hich must be successful.

In function

functions,

or a parameter.

for example

The criterion

of all test cases,

90 % of

19

Component

test takes place after function test is completed.

Functions

are put together to form a component.

be an access method, for completion

the supervisor,

A component may

or a compiler.

The criterion

is 100 % of all test cases were executed success-

fully. The final test is system test. The components to form the system, Criteria

for completion

and all problems groups,

I.

are all test cases executed successfully

fixed. Whatever

it is important

throughout

are put together

for example OS 20.1 or VS/2 Release

criteria

that predetermined

the whole process

(including

is used by other development criteria are used

specification

validation)

and should never be compromised.

2.2

Tools and Techniques

This section describes

techniques

to perform their tasks according 2.2.1

HIPO

and tools that support programmers to established principles.

(Figure 12)

We are supporting

the design task by a technique

called HIPO

that in turn is supported by a tool that provides

of updating

and print facilities. HIPO stands for Hierarchy, means proceeding following

plus Input, Process,

from very generalized

top-down design practices.

sively lower-level

to very detailed

The diagrams

drawing,

refer to succes-

diagrams.

Each diagram shows the inputs to a function, within the function,

and the outputs.

volved in the process, replacement

Output. Hierarchy

therefore,

for flowcharts.

the processing

steps

The data and function in-

becomes visible.

HIPOs is a

20

2.2.2

Cause / Effect Graph

Due to the absence of formal design languages, gram functions racteristic

in prose~

of burying

less visible.

Unfortunately

relationships,

A technique

we describe pro-

prose text has the chai.e. to make relationships

of analyzing

external

specifications

is to draw a cause and effect graph. A cause and effect graph is a Boolean representation mented in specifications, representing expression. omissions

a valid such as a measure

review of a specification.

The cause/effect

test of all variations graphs

design of such test case buckets. and a tool produces all causes

Figure

the cause/effect

The test bucket

requires

Integration

at different

to each other°

codes the graph

14 shows what the tool would prograph shown in Figure 13: for having

all causes in-

observed.

integrating

levels and possibly

needs

The programmer

& Build Support

of continuously

Figure

of program

can be used to derive the

6 test cases

voked and all related effects

The prQcess

test case

the list of test cases needed for testing

and effects.

duce assuming

projects

They provide

a large effort is needed to develop

for a comprehensive

functions~

manner°

13 shows an example of a graph of how to construct

inconsistencies.

for a thorough

docu-

Test Case Design

Traditionally~

2.2.4

relationships

These graphs help to find errors early,

and a structure

buckets

Figure

some specifications

and logical

2.2.3

of the logical

code and building

drivers

for a series of programming

a complex tool to perform all steps in a controlled 15 shows the support

functions

and how they relate

21

The programmer stores new code in a development library. He tests his code under a specific test system called driver. Upon test completion

(successful) his code is integrated and stored in

a master library. At that point, this code becomes available to other development groups. In particular,

test system or drivers are built from the master

library and are then used by all development groups. Finally, the system to be released is build from the master library. Errors found during testing are fixed in the development library and, after successful testing the modified code is integrated into the master library. The whole integration and build process is fully controlled in that all actions are recorded and problems as for example unresolved dependencies are reported. Preferrably,

the described library system should be capable of

supporting specification development as well. 2.2.5

VM/370 (Figure 16)

Testing needs a high amount of computing resources; testing system programs in particular requires different hardware configurations. VM/370 provides for the capability to have a variety of test systems running on the same installation. This makes the testing effort more flexible and economical. 2.3

Management Aspects

The previous sections dealt with the principals of the programming development process and the tools supporting it. I want to add some comments about managing this process. All techniques and tools, phase plans, and controls will not work unless programmers are involved in the implementation of the process. The programmers must do the detailed planning upon which the manager can base his overall plans.

22

People are evaluated on the basis of the results whereby tity,

results

are stated in terms of quality,

and cost.

cording

It should become

and analyzing

Programming

Programmers Generally

professional

quan-

that reare part

responsibility.

Teams

of different

skills are best organized

in teams.

one to three teams report to a first-level

Each first-level

manager has a librarian

The roles of the team members Figure

timeliness,

common understanding,

the above process measurements

of a system programmers 2.3.1

they achieve

manager.

to support his teams.

and the librarien are shown in

17.

The team leader is responsible cification

and has technical

specifications,

for preparing

responsibility

and all code. He makes

for the project.

For most design,

his job is to review and analyse programmers.

him.

(Evaluation

and employee

addition,

the team leader designs,

The co-team

writes

and code rather

for specifications

logic specs,

In

and codes

that his team is producing.

of the product.

responsibility

logic

by other

are done by the manager.)

leader helps the team leader

also codes key elements leadership

appraisals

of the product

the programmer,

of end results

spe-

decisions

logic specifications,

and code,

the key elements

for all design,

the technical

the results produced

In this role, he supports

than evaluates

the functional

if necessary.

in all his duties

and

He will assume full team He is a technical

peer

of the team leader and his back up. The programmers and code,

in the team are responsible

their own detailed planning,

for lower-level

and testing.

design

23 The librarian is an important part of the team. His job is to create, maintain,

and own the library for the project. The project

library includes both documentation and code. The librarian schedules and receives the runs from the computing center, and provides clerical services to the rest fo the group. The librarian is a full-fledged team member. There are many advantages in programming teams. Senior people are directly and actively involved in the project.

In the past

these people would design the function and sometimes leave the project. Keeping senior people involved provides continuity. Since they generally produce work of higher quality, they can educate the junior members of the team. The team approach provides for more detailed and realistic plans. 2.3.2

Formal Reviews

During the last two years, there have been several projects which applied very formal review techniques to specifications,

code,

test cases, and publications. One review technique has been called Walkthru,

another Inspection which word stems from a common in-

spection practice in the world of hardware engineering.

Both

techniques are common in that a very detailed structured review, attended by the most knowledgeable and affected persons regarding system technology and dvelopment process,

is performed. However,

there is a difference between a Walkthru and an Inspection, in that only the latter emphasizes the recording and analysis of not only defects discovered but also development process measurments like number of defects by type. The formal review is used by a developer as a resource to help him produce error-free products. The developer's attitude must be that other people are there to help, not to kill him with personal comments. The other participants

(a moderator,

the designer,

24

other developer(s),

a tester) must also realize that they are

there to help. Managers feelings.

Without

will not work, evaluating

this psychological

Formal

number of problems

Formal

reviews

to reinforce

atmosphere,

of existing

just that they are found

or projected

are performed

study the external

quality exposures.

during phases

questions.

specification

They probe

II, III, and IV. Fi-

by all present.

the material

The issues

walkthru

concentrates

material

like HIPOs

a code walkthru.

The participants

and related material

it for errors

In the meeting

about the

should use the number of problems

18 shows the way a Phase II walkthru works.

effect graphs.

tool for

is not concerned

found at this stage;

these

formal reviews

are not a management

The manager

The manager

as an indicator

gure

reviews

programmers.

and corrected.

have consciously

a list of

is thoroughly

analyzed

are only recorded;

on internal

like cause/

and develop

the phase

specifications

and Test Cases. The phase

III

and related

IV walkthru

is

The length and the number of participants

may

vary. A review of an entire spec could involve eight people, a couple of days notice,

and an offsite meeting.

few changes have been made to an existing may involve only three people: and the independent We have analyzed The results a problem

If relatively

spec, the walkthru

the developer,

his team leader,

reviewer.

the effect

of walkthrus

on costs

(Figure

19).

showed that it was 14 to 15 times cheaper to find

during a walkthru

test is the point

than it was in unit test. And unit

in the testing

cycle at which it is cheap to

find and fix an error.

3.

CLOSING

We believe

to have proof that the analysis

the programming quality,

development

and improvement

process has resulted

i.e. fewer APARs relative

of

in improved

to the released

code

(Figure 20)~

25

We are observing

even better results with newly developed

code.

Figure 2] shows the APAR rate for the TSO scheduler code the rate of which is about one third the rate of code that consists in a conglomerate The same methods

of old, changed and added code. that increase quality have also a favorable

effect on costs accounted

over the whole process

from inception

until shipment of a system/component. We hope that more and more system programming contribute veloping

new ideas on how to improve

systems.

will

of de-

This will be needed if new systems of growing

size and increasing tude in managing

professionals

the total process

functional

capability

the development

system programmers,

and hardware

require

another magni-

process which involves engineers.

users,

26 REFERENCES_

(I)

W.B° Cammack and H.J. Rodgers, Jr. Improving the Programming Process, IBM Technical Report 00.2483 (Oct. 73)

(2)

W.R° Elmendorf Cause-Effect

Graphs in Functional Testing

IBM Technical Report 00.2487

(3)

(Nov.

73)

F.T. Baker System Quality through Structured Programming 1972 Fall Joint Computer Conference

(4)

F.T° Baker Chief Programmer Team Management

of Production Programming

IBM Systems Journal, Vol. If, Nov. I, 1972 (5)

H.D. Hills Top-Down Programming

in Large Systems Courant Computer Sciences

Symposium I New York Univ., June 1971

(6)

(7)

P.M. Metzger Managing a Programming Prentice-Hall 1973

Project

IBM HIPO Audio Education Package SR20-9413

27

Fig.

I

System build up

Fig.

2

28

Experience Average APARs Per User OS

4

Total APARs/User

Valid 18

19

20

21

Release Fig.

3

Release 18 G FCS ÷ 2yrs.

Fig.

4 Ill D ' V

29

of

0 L-.-~

C~ling

Fig.

Unit Test Time

APAR's

5

Fig. 6 see page 3 o - ~

Top- Down Design ....

Fig.

7

30

8t

Structured Programs

Fig.

Structured Program Elements Sequence rNm

Lm

m.m

n

w,m m ~

n

i

mmm ~

I

i

~

n

~

~

mmm ~

m

m

~

~

~

mmm w,m

m ~ e m m m

~

i

um.

m

w.n

~

~

~

~ *

uw

~

,..~

u--.i

u

If T h e n Else

r

~

m

,

~

I i

~

I

.............. L

~

mmm ~m

wmmP I n t o

~m

~lmm uwm

~

~

J

L. . . . . . . . . . . . . . . . . . . . . . Do While

Fig. 9

~

~

~

~

~

1

I I

I

_1

32

I n t e g r a t i o n - As it Is FUNCTION

A--B---" C---D-'E---F--G---Driver 1 Driver 2 Driver 3 Fig.

Fig°

Io

11

33

HIPO Hierarchy

Input

CVT

Process

Fig. 12

Draw the Graph

Nodes 1. "OPI" is fixed binary 2. "OP2" is fixed decimal 3. "OPERATOR" is + 4. "OPERATOR" is 5. "OPI' is invllid 6. "OP2" is invalid 7. e x p m l l i o n is valid 8. OPERATOR is valid 9. OPERATOR is invalid

Fig. 13

Outpu,

34

The Test Case Library Design lnvocable causes 1 2 3 4

T E S T S

I 2 3 4 5

| | ! 1 | 61

,I

,

,

Observable effects I

SSSS ~lSI I l l S l l S S i S i S S l i IS I lrllllll I 1234

i l 12 g3 t4 15 16

5 I6 7 9 T E S T S

1 2 3 4 5 6

I ! i I I i

P PAP AAPA AAPA AAA P APAA PAAA

1 2 3 4 5 6

5679 | = Involved S = Suppressed Fig.

P = Present A = Absent

14

Integration and Build New Code ~rror ;orrection

Integration Source Code

I Fig.

15

T

i

System Build

I I

Test

eQ ee e

J.

J.l~

i

i

i ~iii i~i¸

36

Phase 1I Walkthru * Probe spec for mistak.es and omissions

Probb

uj

• Record problems Problem I

* Resolve problems

Fig. 18

Find Programming Errors Early

Cost errors

Fig. 19

Walkthru

Unit Test

37

Results Average APARs Against Base Declining

OS

Valid APARs

Per I n s t ~

18 Fig.

19

20

21

Release

2o

Results New Code is Higher Quality TSO Scheduler Valid APARs

(.o..dzed)

Base Plus Fig.

2~

C~,,ged

New

Organizing

for Structured Programming

F. T. Baker;

IBM Federal Systems Division,

Maryland,

Gaithersburg,

USA

ABSTRACT A new type of programming methodology,

built around structured

programming ideas, has been gaining widespread acceptance production programming.

for

This paper discusses how this method-

ology has been introduced into a large production programming organization.

Finally it analyzes the advantages and disad-

vantages of each component of the methodology and recommends ways it can be introduced in a conventional programming environment.

INTRODUCTION At this point in timer the ideas of structured programming have gained widespread acceptance,

not only in academic

circles, but also in organizations doing production programming.

An issue [1] of Datamation,

one of the leading business

data processing oriented magazines in the U.S., several articles on the topic.

featured

The meetings of SHARE and

GUIDE, two prominent computer user groups, have had an increasing number of sessions on subjects related to structured programming.

The IBM Systems Science Institutes are offering

courses and holding seminars,

and several books on the topic

are in print° What is perhaps not so widely appreciated, the organizations,

however,

is that

procedures and tools associated with the

implementation of structured programming are critical to its success.

This is particularly true in production programming

environments, grams)

where program systems

are developed,

reliable, maintainable

(rather than single pro-

people come and go, and the attainment of software on time and within cost esti-

mates is a prime management objective.

In this environment,

89

module level coding and debugging activities typically account for about 20~ of the effort spent on software development [2] . Thus, narrow applications of structured programming ideas limited only to these activities have correspondingly It is therefore desirable to adopt a broad,

limited effects.

integrated approach

incorporating the ideas into every aspect of the project from concept development to program maintenance to achieve as many quality improvements and cost savings as possible.

BACKGROUND

The IBM Federal Systems Division

(FSD) is an organization in-

volved in production programming on a large scale.

Although

much of its software work is performed for federal,

state and

local governmental agencies,

the division also contracts with

private business enterprises for complex systems development work.

Work scope ranges from less than a man-year of effort

on small projects to thousands of man-years spent on the development and maintenance of large, evolutionary,

long-term

systems such as the Apollo/Skylab ground support software. Varying customer requirements cause the use of a wide variety of hardware,

programmihg languages,

software tools, documenta-

tion procedures, management techniques,

etc.

Problems range

from software maintenance through pure applications programming using commercially available operating systems and program products to the concurrent development of central processors, ~eripherals,

firmware,

support software and applications soft-

~are for avionics requirements.

Thus, within this single or-

ganization can be found a wide range of software development efforts.

FSD has always been concerned with the development of improved software tools, techniques and management methods.

Most re-

cently, FSD has been active in the development of structured programming techniques

[3]

This has led to organizations,

procedures and tools for applying them to production programming projects,

particularly with a new organization called a

Chief Programmer Team. [4]

The Team, a functional organization

40

based on standard support tools and disciplined application of structured programming principles,

had its first trial on

a major software development effort in 1969-71. [5],[6]

In the

three years since the completion of that experimental project, FSD has been incorporating structured programming techniques into most of its software development projects. scope and diversity of these projects,

Because of the

it was impossible to

adopt any single set of tools and procedures or any rigid type of organization to all or even to a majority of them. cause of the ongoing nature of many of these systems,

And beit was

necessary to introduce these techniques gradually over a period of many years.

The approach which was adopted,

the problems

which have been encountered and the results which were achieved, are the subject of this paper.

It is believed that any software

development organization can improve the quality and reduce the costs of its software projects in a similar way.

PLAN

To introduce the ideas into FSD work practices and to evaluate their use, a plan with four major components was implemented. First, a set of guidelines was established to define the terminology associated with the ideas with sufficient precision to permit the introduction and measurement of individual components of the overall methodology. and directives Second,

These guidelines were published,

regarding their implementation were issued.

support tools and methodologies were developed,

par-

ticularly for projects using commercial hardware and operating systems.

For those projects where these were not employed,

standards based on the developed tools enabled them to provide their own support.

Third, documentation of the techniques and

tools, and education in their user were both carried out.

These

were done on a broad scale covering management techniques,

pro-

41

gramming methodologies and clerical procedures.

Fourth,

a

measurement program was established to provide data for technology evaluation and improvement.

This program included both

broad measurements which were introduced immediately,

and de-

tailed measurements which required substantial development work and were introduced later.

The next four sections cover

the components of this plan and their implementation in detail.

GUIDELINES

A number of important considerations

influenced the establish-

ment of a set of guidelines for the application of structured programming technology within FSD.

First and most important,

they had to permit adaptation to the wide variety of project environments described above.

This required that they be use-

ful in program maintenance situations where unstructured program systems were already in being,

as well as in those where com-

pletely new systems were to be developed.

Second, they had to

allow for the range of processors and operating systems in use. This necessitated the description of functions to be provided instead of specific tools to be used.

Third, they had to allow

for differences in organizations and methodology fications,

documentation,

(e.g., speci-

configuration management)

required

or in use on various projects.

The guidelines resulting from these considerations are a hierarchical set of four components, in Figure I.

graphically illustrated

Use of the component at any level presupposes use

of those below it.

Thus, by beginning at a level which a

project's environment and status permit,

and then progressing

upward, projects can evolve gradually toward full use of the technology.

42

1.

Development

The introductory

Support Libraries level is the Development

Support Library,

which is a tool designed with two key principles a°

Keep current project status organized all times.

in mind:

and visible at

In this way, any programmer,

manager or

user can find out the status or study an approach directly without depending b.

Make it possible

on anyone else.

for a trained secretary to do as

much library maintenance clerical

from intellectual

A DSL is normally Librarian~

as possible, activity.

the primary responsibility

Programmers

thus separating

of a Programming

interface with the computer primarily

through the library and the Programming

Librarian.

This allows

better control of computer activity and ensures that the library is always complete with up-to-date standings

and current.

versions

of programs

and inconsistencies

and modification

Programmers

are always working

and data,

so that misunder-

are greatly reduced.

A version

level are associated with all material

in the

library to permit change control and assist in configuration nanagement.

In general,

in increasing

the library system is the prime factor

the visibility

reducing risk and increasing The guidelines

reliability.

provide that a Development

being used if the following

ao

of a developing project and thus

conditions

A library system providing the PPL

(see TOOLS below)

Support Library is

prevail:

the functional is being used.

equivalent of

43

b.

The library system is being used throughout the development process, ject code,

c.

not just to store debugged source or ob-

for example.

Visibility of the current status of the entire project, as well as past history of source code activities and run executions,

d.

is provided by the external library.

Filing procedures are faithfully adhered to for all runs, whether or not setup was performed by a librarian.

e.

The visibility of the code is such that the code itself serves as the prime reference for questions of data formats, program operation,

etc.

f.

Use of a trained librarian is recommended.

2.

Structured Pro~rammin~

In order to provide for use of structured programming techniques on maintenance as well as development projects, necessary to depart from t h e c l a s s i c a l and adopt a narrower definition.

In FSD, then, we distinguish

between those practices used in system development Development)

it was

use of the terminology (Top-Down

and those used in coding individual program modules

(Structured Programming).

Our use of the term "structured pro-

gramming" in the guidelines thus refers primarily to coding standards governing control flow, and module organization and construction.

They require three basic control flow figures and

permit two optional ones, as shown in Figure 2. a Guide [7]

They refer to

(see DOCUMENTATION below) which contains general

information and standards for structured programming,

as well as

44

d e t a i l e d standards

for use of various p r o g r a m m i n g

languages,

They also require that code be r e v i e w e d by someone other than the developer~

The d e t a i l e d g u i d e l i n e s for s t r u c t u r e d pro-

g r a m m i n g are as follows:

a.

The c o n v e n t i o n s e s t a b l i s h e d in the S t r u c t u r e d P r o g r a m m i n g Guide are being followed. documented.

E x c e p t i o n s to c o n v e n t i o n s are

If a language is being used for w h i c h con-

v e n t i o n s have not been p u b l i s h e d in the Guide,

then use

of a locally g e n e r a t e d set of conventions c o n s i s t e n t w i t h the rules of s t r u c t u r e d p r o g r a m m i n g is acceptable.

b.

The code is being r e v i e w e d for functional i n t e g r i t y and for a d h e r e n c e to the s t r u c t u r e d p r o g r a m m i n g conventions.

c.

A Development

Support L i b r a r y is being used.

3.

Top-Down Development

T o p - d o w n d e v e l o p m e n t refers to the process of c o n c u r r e n t design and d e v e l o p m e n t of p r o g r a m systems c o n t a i n i n g more than a single c o m p i l a b l e unit. which minimizes

It r e q u i r e s d e v e l o p m e n t to proceed in a way interface p r o b l e m s n o r m a l l y e n c o u n t e r e d during

the i n t e g r a t i o n p r o c e s s typical of "bottom-up development" by i n t e g r a t i n g and testing m o d u l e s as soon as they are developed. Other a d v a n t a g e s are that:

a.

It permits a p r o j e c t to man up more g r a d u a l l y and should reduce the total m a n p o w e r required.

bo

C o m p u t e r time r e q u i r e m e n t s tend to be spread more evenly over the d e v e l o p m e n t period.

45

c.

The user gets to work with major portions of the system much earlier and can identify gross errors before acceptance testing.

d.

Most of the system has been in use long enough by the time it is delivered that both the user and the developer have confidence in its reliability.

e.

The really critical interfaces between control and function code are the first ones to be coded and tested and are in operation the longest.

The term "top-down" may be somewhat misleading if taken too literally.

What top-down development really implies in every-

day production programming is that one builds the system in a way which ideally eliminates

(or more practically,

minimizes)

writing any code whose testing is dependent on other code not yet written,

or on data which is not yet available.

This re-

quires careful planning of the development sequence for a large system consisting of many programs and data sets, since some programs will have to be partially completed before other programs can be begun.

In practice,

it also recognizes that exi-

gencies of customer requirements or schedule may force deviations from what would otherwise be an ideal development sequence. The guidelines for top-down development are as follows:

a.

Code currently being developed depends only on code already operational,

except in those portions where devi-

ations from this procedure are justified by special circumstances.

46

b.

The project schedule reflects a continuing integrationr as part of the development process,

leading directly to

system test, as opposed to a development, integration,

followed by

followed by system test, cycle.

c.

Structured Programming

is being used.

d.

A Development Support Library system is being used. (While ongoing projects may not be able to meet this criterion,

an implementation of structured coding practice

is acceptable

e.

in these cases.)

The managers of the effort have attended a structured programming orientation course

4.

(see EDUCATION below).

Chief Progrgmmer Teams

A Chief Programmer Team

(CPT) is a functional programming or-

ganization built around a nucleus of three experienced professionals doing well-defined parts of the programming development process using the techniques and tools described above. It is an organization uniquely oriented around them and is a logical outgrowth of their introduction and use. detail in

[4], [5], and

Described in

[6], it has been used extensively in

FSD on projects ranging up to approximately i00,000 lines of source code and is being experimented with on larger projects. The guidelines

as

for CPT's are as follows:

A single person;

the chief programmer,

technical responsibility

has complete

for the effort.

He will

ordinarily be the manager of the other people.

47

b.

There is a backup programmer prepared to assume the role of chief programmer.

c.

Top-D~wn Development, velopment

d.

Structured Programming

and a De-

Support Library are all being used.

Top level code segments and the critical control paths of lower level segments

are being coded by the chief and

backup programmers.

e.

The chief and backup programmers

are reviewing the code

produced by other members of the team. f.

Other programmers

are added to the team only to code

specific well defined functions within a framework established by the chief and backup programmers.

TOOLS

Tools are necessary

in order to permit effective

of, and achieve m a x i m u m benefits programming.

Development

implementation

from the ideas of structured

Support Libraries,

introduced above,

are a recognized and required component of the methodology ployed in FSD.

em-

Standards are necessary to ensure a consistent

approach and to help realize benefits of improved project communications effective

and manageability.

Procedures

are required for

use of the tools and to permit functional breakup and

improved overall efficiency ly, other techniques ment can be helpful

in the programming

of design,

programming,

in a structured programming

well as in a conventional

one.

process.

Final-

testing and manageenvironment

as

48

1.

Development Su~ort

Libraries

The n e e d for and value of D e v e l o p m e n t Support L i b r a r y support,

(DSL)

both as a n e c e s s i t y for s t r u c t u r e d p r o g r a m m i n g and

as a vehicle

for p r o j e c t c o m m u n i c a t i o n and control, has been

t h o r o u g h l y covered in

[4],

[5],

[6],

[7], and

[8].

Early work

on DSL's c e n t e r e d on the p r o v i s i o n of libraries for p r o j e c t s using IBM's S y s t e m / 3 6 0 O p e r a t i n g System and Disk O p e r a t i n g System.

The OS/360 P r o g r a m m i n g P r o d u c t i o n L i b r a r y

(PPL) is

typical of those we are using in b a t c h p r o g r a m m i n g d e v e l o p m e n t situations. external

It consists of internal

(human-readable)

procedures.

libraries,

(computer-readable)

and

and office and m a c h i n e

Similar concepts and approaches apply to our other

library systems,

i n c l u d i n g those w o r k i n g in an online environ-

ment.

The PPL keeps all m a c h i n e a b l e data on a p r o j e c t - source code, object code, test data,

linkage e d i t o r language,

job control language~

and so on - in a series of data sets w h i c h c o m p r i s e

the internal l i b r a r ~

(see Figure 3).

Since all data is kept

i n t e r n a l l y and is fully b a c k e d up, there is no need for programmers to g e n e r a t e or m a i n t a i n their own p e r s o n a l copies.

Cor-

r e s p o n d i n g to each type of data in the internal library there is a set of c u r r e n t status binders w h i c h c o m p r i s e the external librar[

(see F i g u r e

~).

These are filed c e n t r a l l y and used by

all as a s t a n d a r d m e a n s of communication.

There is also a set

of a r c h i v e s of s u p e r s e d e d status pages w h i c h are r e t a i n e d to assist in d i s a s t e r recovery, run results.

Together,

and a set of run books c o n t a i n i n g

these record the a c t i v i t i e s - current

and h i s t o r i c a l - of an entire p r o j e c t and keep it c o m p l e t e l y organized°

49

The m a c h i n e p r o c e d u r e s , as the name implies,

are c a t a l o g e d

p r o c e d u r e s w h i c h p e r f o r m internal library maintenance, up, e x p a n s i o n and so on.

back-

Most of them are used by Program-

ming L i b r a r i a n s by means of simple control cards they have been trained to prepare.

A c o m p l e t e list is given in Table i.

The 0ffice p r o c e d u r e s are a set of "clerical algorithms" used by the P r o g r a m m i n g L i b r a r i a n to invoke the m a c h i n e procedures, to prepare the input and file the output.

Once new code has

been created and placed in the library initially,

a programmer

makes c o r r e c t i o n s to it by m a r k i n g up pages in the external library and giving them to the P r o g r a m m i n g L i b r a r i a n to make up control and data cards to cause the c o r r e s p o n d i n g changes or additions to be made to the internal library.

As a result,

clerical effort and w a s t e d time on the part of the p r o g r a m m e r s are s i g n i f i c a n t l y reduced.

Figure 5 shows the w o r k flow and the

central role of the P r o g r a m m i n g L i b r a r i a n in the process.

Be-

cause p r o g r a m m e r s are served by the L i b r a r i a n and the PPL, they are freed from i n t e r r u p t i o n s and can work on more routines in parallel than they p r e v i o u s l y did. ~ The PPL m a c h i n e and office p r o c e d u r e s are d o c u m e n t e d for programmers librarians in

in

[7] and for

[9].

S u b s e q u e n t work on DSL's in FSD has e x t e n d e d the support to some of the n o n - S y s t e m / 3 6 0

equipment in use and also introduced

interactive DSL's for use both by librarians and programmers. Furthermore,

a study of general requirements

for DSL's has

been p e r f o r m e d under c o n t r a c t to the U. S. Air Force and has been p u b l i s h e d in

[i0].

DSL's are now available for and in

use on m o s t p r o g r a m m i n g projects in FSD.

50

2.

Standards

To support s t r u c t u r e d p r o g r a m m i n g in the v a r i o u s languages used, p r o g r a m m i n g standards were required.

These covered

both the i m p l e m e n t a t i o n of the control flow figures in each language as w e l l as the c o n v e n t i o n s

for f o r m a t t i n g and in-

d e n t i n g p r o g r a m s in that language.

There are four a p p r o a c h e s w h i c h can be taken to provide the basic and o p t i o n a l control flow figures in a p r o g r a m m i n g language,

a~

and each was used in certain situations

The figures may be d i r e c t l y a v a i l a b l e as s t a t e m e n t s in the language.

In the case of PL/I, all of the basic

figures were of this variety. (with slight restrictions) PERFORM)

b.

in FSD.

In COBOL,

the IFTHENELSE

and the D O U N T I L

(as a

were present'.

The figures may be e a s i l y s i m u l a t e d using a few s t a n d a r d statements.

The CASE s t a t e m e n t may be readily s i m u l a t e d

in PL/I using an indexed GOTO and a LABEL array, w i t h e a c h case i m p l e m e n t e d via a D O - g r o u p e n d i n g in a GOTO to a c o m m o n null s t a t e m e n t f o l l o w i n g all cases.

c.

A s t a n d a r d p r e - p r o c e s s o r may be used to augment the basic language s t a t e m e n t s to p r o v i d e n e c e s s a r y features.

The

m a c r o a s s e m b l e r has been used in FSD to add s t r u c t u r i n g features to System/360,

System/370

and System/7 A s s e m b l e r

Languages.

d.

A special p r e - p r o c e s s o r may be w r i t t e n to compile augmented language statements

into standard ones, w h i c h may

then be p r o c e s s e d by the normal computer.

This was done

51

for the FORTRAN language, which directly contains almost none of the needed features.

The result of using these four approaches was a complete set of figures for PL/I, COBOL, FORTRAN and Assembler.

Using these

as a base, similar work was also done for several specialpurpose languages used in FSD. To assist in making programs readable and in standardizing communications and librarian procedures,

it was desirable that

programs in a given language should be organized, indented in the same way.

Link Editor Languages as well as of the procedural mentioned above.)

formatted and

(This was true of the Job Control and languages

Coding conventions were developed for each

covering the permitted control structures, naming, use of comments,

segment formatting,

labels, and indentation and formatting

for all control flow and special

(e.g., OPEN, CLOSE, DECLARE)

statements. 3.

Procedures

An essential aspect of the use of DSL's is the standardization of the procedures associated with them.

The machine procedures

used in setting up, maintaining and terminating the libraries were mentioned above in that connection.

However,

the office

procedures used by librarians in preparing runs, executing them and filing the results are also quite extensive.

These were

developed and documented [9] in a form readily usable by nonprogramming oriented librarians. ~.

Other

While the above constitute the bulk of the work originated by FSD, certain other techniques and procedures have been assimi-

52

lated into the methodology in varying degrees. management techniques,

These include

HIPO diagrams and structured walkthroughs.

FSD has been a leader in the development of management techniques for programming projects.

A book [II] resulting from

a management course and guide used in FSD has become a classic in the field.

As top-down development and structured program-

ming came into use, it became apparent that traditional management practices would have to be substantially revised IMPLEMENTATION EXPERIENCE below).

(see

An initial examination was

done, and a report [12] was issued which has been very valuable in guiding managers

into using the new methodology.

This

material is now being added to a revised edition of the FSC Programming Project Management Guide,

from which the book [II]

mentioned above was drawn. A documentation technique called HIPO Process-Output)

diagrams

[8,13]

(Hierarchy plus Input-

developed elsewhere in IBM has

proved valuable in supporting top-down development.

HIPO con-

sists of a set of operational diagrams which graphically describe the functions of a program system from the general to the detail level.

Not to be confused with flowcharts, which de-

scribe procedural

flow, HIPO diagrams provide a convenient means

of documenting the functions identified in the design phase of a top-down development effort.

They also serve as a useful intro-

duction to the code contained in a DSL and as a valuable maintenance tool following delivery of the program system. Structured walk-throughs [8] were developed on the second CPT project as a formal means for design and code reviews during the development process.

Using HIPO diagrams and eventually the

53

code itself, reviewers. grammer

the d e v e l o p e r

"walks through" his efforts for the

These latter may consist of the Chief or Backup Pro-

(or lead p r o g r a m m e r if a CPT is not being employed),

other p r o g r a m m e r s and a r e p r e s e n t a t i v e formally test the programs. detection,

not correction,

from the group w h i c h will

Emphasis is on error avoidance and and the attitude is open and non-

d e f e n s i v e on the part of all participants be tomorrow's reviewee).

(today's reviewer will

The reviewers prepare for the walk-

through by studying the diagrams or code before the meeting, and followup is the r e s p o n s i b i l i t y of the reviewee, who m u s t notify the reviewers of corrective actions taken.

D O C U M E N T A T I O N AND E D U C A T I O N

Once the fundamental tools and guidelines were established, was n e c e s s a r y to begin d i s s e m i n a t i n g them throughout FSD.

it Much

e x p e r i m e n t a l work had already been done in d e v e l o p i n g the tools and guidelines themselves,

so that a cadre of people familiar

w i t h them was already in being.

Most of the d o c u m e n t a t i o n has been referred to above.

The pri-

mary reference for p r o g r a m m e r s was the FSC S t r u c t u r e d Pro@ramm i n @ Guide [7]

In addition to the standards for each language

and for use of the PPL, it contained general i n f o r m a t i o n on the use of top-down d e v e l o p m e n t and structured programming, as the p r o c e d u r e s

as well

for m a k i n g exceptions to them when necessary.

It also contained provisions

for sections to be added locally

when s p e c i a l - p u r p o s e languages or libraries were in use. tributed t h r o u g h o u t FSD,

Dis-

the Guide has been updated and is still

the standard reference for programmers.

The FSC P r o ~ r a m m i n ~ Li-

b r a r i a n ' s Guide [9] serves a similar purpose for librarians and also has provisions for local sections where necessary.

While

54

the use of the macros

for System/360 A s s e m b l e r L a n g u a g e was in-

cluded in the Pr_ogrammin~ Guide, was a v a i l a b l e on them if desired. m e n t a t i o n in the form of

[ii] and

a d d i t i o n a l d o c u m e n t a t i o n [14] Finally,

m a n a g e m e n t docu-

[12] was also available.

It was r e c o g n i z e d that p r o v i d i n g d o c u m e n t a t i o n alone was not s u f f i c i e n t to p e r m i t m o s t p e r s o n n e l to b e g i n a p p l y i n g the techniqueso

S t r u c t u r e d p r o g r a m m i n g r e q u i r e s s u b s t a n t i a l changes

in the p a t t e r n s and p r o c e d u r e s of programming,

and a s i g n i f i c a n t

m e n t a l e f f o r t and amount of p r a c t i c e is n e e d e d to o v e r c o m e old habits and instill new ones. m a j o r language)

A series of courses

in s t r u c t u r e d p r o g r a m m i n g and DSL techniques. five hours, portantly,

L a s t i n g twenty-

these courses p r o v i d e d i n s t r u c t i o n and, m o r e imp r a c t i c e p r o b l e m s w h i c h forced the p r o g r a m m e r s to

begin the t r a n s i t i o n process. retrained,

(one for each

was set up to train e x p e r i e n c e d FSD p r o g r a m m e r s

Once all p r o g r a m m e r s had been

these courses w e r e discontinued,

and s t r u c t u r e d pro-

g r a m m i n g is now i n c l u d e d as p a r t of the basic p r o g r a m m e r training courses given to newly hired personnel.

The same s i t u a t i o n held true for m a n a g e r s as well as programmers.

Because FSD w i s h e d to apply the m e t h o d o l o g y as rapidly as

possiblea

it was d e s i r a b l e to a c q u a i n t m a n a g e r s w i t h it and its

p o t e n t i a l immediately.

Thus, one of the first actions taken

was to give a h a l f - d a y o r i e n t a t i o n course to all FSD managers. This p e r m i t t e d them to e v a l u a t e the depth to w h i c h they could begin to use it on current projects, its use on p r o p o s e d projects.

and to begin to plan for

This was then followed up by a

t w e l v e - h o u r course for e x p e r i e n c e d p r o g r a m m i n g managers,

ac-

q u a i n t i n g t h e m w i t h m a n a g e m e n t and control t e c h n i q u e s p e c u l i a r to t o p - d o w n d e v e l o p m e n t and s t r u c t u r e d programming.

(It was ex-

55

pected that most of these managers would also attend one of the structured programming courses described above to acquire the fundamentals.)

Again, now that most programming managers

have received this form of update, the material has now been included in the normal programming management co~rse given to all new programming managers.

MEASUREMENT

One of the problems of the production programming world is that it has not developed good measures of its activities.

Various

past efforts, most notably the System Development Corporation studies [15] have attempted to develop measurement and prediction techniques for production programming projects.

The

general results have been that a number of variables must be accurately estimated to yield even rough cost and schedule predictions,

and that the biggest factors are the experience and

abilities of the programmers involved.

Nevertheless,

it was

felt in FSD that some measures of activity were needed, not so much for prediction as for evaluation of the degree to which the methodology was being applied and the problems which were experienced in its use.

To these ends, two types of measure-

ments were put into effect.

The first type of measurement,

implemented immediately, was a

monthly report required from each programming project. programming manager was required to state:

I.

The total number of programmers on the project.

2.

The number currently programming.

3.

The number using structured programming.

Each

56

4.

The number of p r o g r a m m i n g groups on the project.

5.

The number of CPT~s.

6.

W h e t h e r a DSL was in use.

7.

W h e t h e r top-down d e v e l o p m e n t was in use.

These figures w e r e s - ~ m a r i z e d m o n t h l y for various

levels of

FSD m a n a g e m e n t and were a v a l u a b l e tool in e n s u r i n g that the m e t h o d o l o g y was indeed being introduced.

The second type of m e a s u r e m e n t was a m u c h more c o m p r e h e n s i v e one.

It r e q u i r e d a great deal of r e s e a r c h in its preparation,

and e v e n t u a l l y took the form of a q u e s t i o n n a i r e from w h i c h data was e x t r a c t e d to b u i l d a m e a s u r e m e n t data base. naire contains

The q u e s t i o n -

105 q u e s t i o n s o r g a n i z e d into the f o l l o w i n g eight

sections:

1.

I d e n t i f i c a t i o n of the project.

2.

D e s c r i p t i o n of the c o n t r a c t u a l environment.

3.

D e s c r i p t i o n of the p e r s o n n e l environment.

4.

D e s c r i p t i o n of the p e r s o n n e l themselves.

5.

D e s c r i p t i o n of the technical environment.

6.

D e f i n i t i o n of the size, type and q u a l i t y of the p r o g r a m s produced.

7.

I t e m i z a t i o n of the financial~

c o m p u t e r and m a n p o w e r re-

sources used in their development.

57

8.

Definition of the schedule.

The questionnaire

is administered at four points during the

lifetime of every project. ginning,

The first point is at the be-

in which all questions are answered with estimates.

The next administration is at the end of the design phase, when the initial estimates are updated as necessary.

It is

again filled out halfway through development, when actual figures begin to be known.

And it is completed for the last

time after the system has been tested and delivered, results are in.

and all

The four points provide for meaningful com-

parisons of estimates to actuals,

and allow subsequent projects

to draw useful guidance for their own planning.

The data base

permits reports to be prepared automatically and statistical comparisons to be made.

IMPLEMENTATION EXPERIENCE

Each of the four components of the methodology which FSD has introduced has resulted in substantial benefits.

However,

experience has also revealed that their application is neither trivial nor trouble-free.

This section presents a qualitative

analysis of the experience to date, describing both the advantages and the problems.

i.

Development Support Libraries

Most projects of any size have historically gravitated toward use of a program library system of some type.

This was cer-

58

tainly true in FSD~ w h i c h had some h i g h l y d e v e l o p e d systems already in place w h e n the m e t h o d o l o g y was introduced. p r i m a r i l y used as m e c h a n i s m s to control the code,

These w e r e

so that dif-

fering v e r s i o n s of c o m p l e x systems could be segregated.

In

some cases they p r o v i d e d p r o g r a m d e v e l o p m e n t services such as compilation,

t e s t i n g and so forth.

However, none were being

used p r i m a r i l y to achieve the goals of i m p r o v e d c o m m u n i c a t i o n s or w o r k f u n c t i o n a l i z a t i o n w h i c h are the p r i m a r y b e n e f i t s of a DSL.

In fact, the general a t t i t u d e toward the services they

p r o v i d e d was that they were there to be used w h e n and if the p r o g r a m m e r s wished.

M o s t code in them was p r e s u m e d private,

with

the usual e x c e p t i o n s of m a c r o and s u b r o u t i n e libraries.

One of the m o s t d i f f i c u l t p r o b l e m s

in the i n t r o d u c t i o n of the

DSL a p p r o a c h was to c o n v i n c e o n g o i n g p r o j e c t s that their p r e s e n t library systems f u l f i l l e d n e i t h e r the r e q u i r e m e n t s nor the intents of a DSL.

A DSL is as m u c h a m a n a g e m e n t tool as a pro-

grammer convenience.

A P r o g r a m m i n g L i b r a r i a n ' s p r i m a r y respon-

s i b i l i t y is to management,

in the sense of s u p p o r t i n g control

of the p r o j e c t ' s assets of code and data -- a n a l o g o u s to a controller's r e s p o n s i b i l i t y to m a n a g e m e n t of s u p p o r t i n g control of financial assets.

The p r o j e c t as a w h o l e should be e n t i r e l y

d e p e n d e n t on the DSL for its operation, other criterion,

and this, more than any

is the d e t e r m i n i n g factor in w h e t h e r a library

s y s t e m m e e t s the g u i d e l i n e s as a DSL.

When all functions are provided,

and a p r o j e c t implements a DSL,

then a h i g h degree of v i s i b i l i t y is available.

P r o g r a m m e r s use

the source code as a b a s i c means of c o m m u n i c a t i o n and rely on it to answer q u e s t i o n s on i n t e r f a c e s or suggest approaches to their problemso

M a n a g e r s use the code itself

(or the summary

59

features of more sophisticated progress

of the work.

basis.

even at an early

basis.

in itself is valuable,

even on a laissez-faire

But when it is coupled with w e l l - m a n a g e d above,

that the specifications

the planned test coverage, and last but not least,

ensure

reviews the code,

have been addressed,

assisting

in standards

constructively

While the review procedure concomitant

The walk-

or equivalent procedures,

that someone in addition to the developer verifying

code-reading

it also provides quality improvements.

throughs described

of the

of beginning to use the de-

system on an experimental

procedures,

the

from the ready availability

test data and the feasibility

The visibility

to determine

Users also benefit,

stage of implementation, veloping

DSL's)

criticizing

checking

compliance the content.

is obviously greatly facilitated by

use of structured programming,

it is possible with-

out it and was included with the DSL guidelines

to encourage

its adoption.

The archives which are an integral part of a DSL provide an ability to refer to earlier versions of a routine - sometimes useful in tracing intent when a program is passed from hand to hand.

More importantly,

cover from a disaster stroyed.

they give a project the ability to re-

in which part of its resources

(It is perhaps obvious but worth m e n t i o n i n g

will not be complete

from the working versions.)

sees

separate

There was an initial tendency

and to over-formalize

It appears unnecessary object code,

that this

insurance unless project management

to it that the backup data sets are stored physically FSD to over-collect

are de-

in

the archiving process.

to retain more than a few generations

run results and so forth.

of

The source code and test

60

data g e n e r a l l y w a r r a n t longer retentionl

but even here it

rapidly b e c o m e s i m p r a c t i c a l to save all versions. s u f f i c i e n t archives

In general,

should be r e t a i n e d to provide c o m p l e t e

r e c o v e r y c a p a b i l i t y w h e n used in c o n j u n c t i o n w i t h the b a c k u p data sets, plus enough a d d i t i o n a l to provide back references.

The s e p a r a t i o n of f u n c t i o n i n t r o d u c e d by the DSL office procedures has two main benefits.

The obvious one is of lowered

cost t h r o u g h the use of c l e r i c a l p e r s o n n e l instead of p r o g r a m mers for p r o g r a m maintenance,

run setup and filing activities.

A s i g n i f i c a n t a d d i t i o n a l b e n e f i t comes about t h r o u g h the resulting m o r e c o n c e n t r a t e d use of programmers. terruptions~

By r e d u c i n g in-

librarians afford the p r o g r a m m e r s a w o r k environ-

m e n t in w h i c h errors are less likely to occur.

Furthermore,

they p e r m i t p r o g r a m m e r s to work on more routines in p a r a l l e l than t y p i c a l l y is the case.

The last major b e n e f i t d e r i v e d from a DSL rests in its support of a p r o g r a m m i n g m e a s u r e m e n t activity.

By a u t o m a t i c a l l y col-

lecting s t a t i s t i c s of the types d e s c r i b e d above,

they can en-

h a n c e our ability to m a n a g e and improve the p r o g r a m m i n g process. The e a r l y DSL's in FSD did not include m e a s u r e m e n t features, and the next g e n e r a t i o n is only b e g i n n i n g to come into use, so a full a s s e s s m e n t of this support is not yet possible.

It was d i f f i c u l t to c o n v i n c e FSD p r o j e c t s in some cases that a w e l l - q u a l i f i e d P r o g r a m m i n g L i b r a r i a n could b e n e f i t a p r o j e c t as m u c h as a n o t h e r programmer.

In fact, there was an initial

t e n d e n c y to use junior p r o g r a m m e r s or p r o g r a m m e r t e c h n i c i a n s to provide librarian support.

This had two d i s a d v a n t a g e s

is not recommended.

the use of p r o g r a m m i n g - q u a l i f i e d

First,

and hence

61

personnel is not necessary because of the well-defined procedures inherent in the DSL's. dividuals

Use of overqualified in-

in some cases led to boredom and sloppy work with

a resulting loss of quality.

Second,

such personnel cannot

perform other necessary functions when needed.

One of the

advantages of using secretaries as librarians is that they can use both skills effectively over the lifetime of a typical project.

During design and documentation phases,

they can

provide typing and transcription services; while during coding and testing phases,

they can perform the needed librarian

work.

Two problems remain in defining completely the role of librarians.

First,

the increasing use of interactive systems

for program development is forcing an evolution of librarian skills toward terminal operation and test support rather than coding of changes and extensive filing.

The most effective

division of labor between programmer and librarian in such an environment remains to be determined.

It also appears

possible to use librarians to assist in documentation, as in preparation of HIPO diagrams.

such

Second, FSD has a number

of small projects in locations remote from the major office complexes and support facilities - frequently on customer premises.

Here it is not always possible to use a librarian

cost-effectively.

In this situation,

the programmer/librarian

better definition of

relationship in the interactive system

development environment may permit development and librarian support to some extent from the central facility instead of requiring all personnel to be on-site.

62

2.

Structured

Pro~rammin@

Recall that the FSD use of the term "structured programming" is a narrow one, adopted to permit ongoing projects to use some of the methodology.

In this usage,

might be called "structured techniques

coding"

used in developing

a single compilable

Combined with usage of a DSL, of code, enforces modularity and maintainability, manageability properties on here.

programming

and thus encourages testing,

and accountability.

An additional,

(module).

changeability

and permits improved

These are all well-known

programming unplanned

and need not be elaborated

for, benefit of structured

is that it tends to encourage

~11ocality of reference",

unit

it provides enhanced readability

simplifies

of structured

it is more like what

and is limited to those

the property of

which improves performance

in a virtual

systems environment. Reflecting

on the advantages

ming and the use of DSL's, techniques

fundamentally

gramming discipline. individualistic, cipline

attributed

program-

are directed toward encouraging

Historically,

undisciplined

programming

activity.

Thus,

introducing

in-

plus those due to better

and control.

The introduction achieved

itself,

dis-

recog-

yields double rewards -- the advantages

herent in the methodology standardization

pro-

has been a very

in the form of practices which most programmers

nize as beneficial,

in FSD.

of structured programming

was not easily

The broad variety of projects,

support has already been mentioned, DSL's,

to structured

one is struck by the fact that the

languages

and the development

and

of

the Guides [7'9] and the education program were necessary

before widespread

application

of the methodology

could take

63

place.

Furthermore,

the ongoing nature of many of the systems

meant that structured programming could take place only as modules were rewritten or replaced. This gradual introduction created a problem of educational timing.

Practically,

it was most expedient to have programmers

attend the education courses between assignments.

The nature

of the courses was such that they introduced the techniques and provided some initial practice.

Yet they required sub-

stantial work experience using the techniques to be fully effective.

Structured programming requires the development of a

whole new set of personal patterns in programming.

Until old

habits are unlearned and replaced by new ones, it is difficult for programmers to fully appreciate the advantages of structured programming.

For best results, this work experience and

the overcoming of the natural reluctance to change habits should follow the training immediately.

This was not always feasible

and resulted in some loss of educational effectiveness. A second problem arose because of the real-time nature of a significant fraction of FSD's programming business.

Here the

difficulty was one of demonstrating that structured programming was not detrimental to either execution speed or core utilization.

While it is difficult to verify the advantages quantita-

tively, a working consensus has arisen.

Simply stated, it is

that the added time and thought required to structure a program pay off in better core utilization and improved efficiency which generally are comparable to the effects achieved in unstructured programs by closer attention to detail. note that even in "critical" programs,

It is also useful to a relatively small frac-

tion of the code is really time- or core-sensitive,

and this

64

fraction may not in fact be predictable a priori.

Hence it is

probably a better strategy to use structured programming throughout to begin with.

Then, if performance bottlenecks do appear

and cannot be resolved otherwise,

at most small units of

code must be hand-tailored to remedy the problems. way the visibility,

In this

manageability and maintainability

ad-

vantages of structured programming are largely retained. Perhaps the most difficult problem to overcome in applying structured programming is the purist syndrome,

in which the

goal is to write perfectly structured code in every situation. it must be emphasized that structured programming is not an end in itself, but is a means to achieving better, more reliable, more maintainable programs.

In some cases

a loop when a search is complete,

(e.g., exiting from

handling interrupt conditions),

religious application of the figures allowed by the Guide may produce code which is less readable than that which might contain a GO TO

(e.g., to the end of the loop block, or to return

from the interrupt handler to a point other than the point of interrupt).

Clearly the exceptions must be limited if discipline

is to be maintained,

but they must be permitted when desirable.

Our approach in FSD has been to require management approval and documentation

for each such deviation.

This ensures that only

cases which are clearly justified will be nominated as exceptionsr

3.

since otherwise the requirements are prohibitive.

Top-Down_Developme~

As defined above~

top-down development is the sequencing of pro-

gram system development to eliminate or avoid interface problems.

65

This permits development and integration to be carried out in parallel and provides additional advantages such as early availability discussed under GUIDELINES.

Top-down development is the most difficult of the four components to introduce, probably because it requires the heaviest involvement and changes of approach on the part of programming managers.

Top-down development has profound

effects on traditional programming management methodology. While the guidelines

sound simple, they require a great deal

of careful planning and supervision to carry out thoroughly in practice,

even on a small project.

top-down development,

The implementation of

unlike structured programming and DSL's,

thus is fundamentally a management and not a programming problem.

Let us distinguish at this point between what might be called "top-down programming"

and true top-down development.

While

they were originally used interchangeably and the guidelines do not distinguish between them, the two terms are valuable in delineating levels of scope and complexity as use of the methodology increases.

Top-down programming is primarily a single-program-oriented concept.

It applies to the development of a "program", typical-

ly consisting of one or a few load modules and a number of independently compilable units, which is developed by one or a few programmers.

At this level of complexity the problems are pri-

marily ones of program design, and the approaches used are those of classical structured programming definition)

(here not the narrower FSD

such as "levels of abstraction "[16] and the use of

Mills' Expansion Theorem [3] .

Within this scope of development

66

external problems and constraints are not as critical, while management involvement is needed, pervasive as in top-down development.

and

it need not be so Many of FSD's suc-

cessful projects have been of this nature,

and the experience

gained on them has been most valuable.

Top-down development,

on the other hand,

program oriented idea.

is a multiple-

It applies to the development of a

"program system ~', typically consisting of many load modules and perhaps a hundred or more independently compilable units, which is developed by one or more programming departments with five or more people in each.

Now the problems expand to those

of system architecture~

and external problems and constraints

become the major ones.

The programs in the system are usually

interdependent and have a large number of interfaces, perhaps directly but also frequently through shared data sets or communications

lines.

They may operate in more than one processor

concurrently - for example, System/370

in a System/7

"front end" and a

"host".

The complexity of such a system makes management involvement in its planning and development essential even when external constraints are minimal.

It involves all aspects of the project

from its inception to its termination.

For example,

a proposal

for a project to be implemented top-down should differ from one for a conventional

implementation

and usage of computer time.

in the proposed manning levels

Functions must be carefully analyzed

during the system design phase to ensure that the requirements of minimum code and data dependency are met, and a detailed implementation sequence must be planned in accordance with the overall proposed plan and schedule.

The design of the system

67

very probably should differ significantly from what it would have been if a bottom-up approach were to be used. implementation,

sure that this sequence is being followed, are being met.

During

progress must be monitored via the DSL to enand that schedules

The early availability of parts of the system

must be coordinated with the user if he intends to use these parts for experimentation or production. ferent type of test plan must be prepared, testing over the entire period. components,

An entirely diffor incremental

Rather than tracking individual

the manager is more concerned with the progress of

the system as a whole, which is a more complicated matter to assess.

This is normally not determinable until the integration

phase in a bottom-up development,

when it suddenly become a

critical item; in top-down work it is a continuing requirement, but one which enables the manager to identify problems earlier and to correct them while there is still time to do so. In a typical system development environment such as those in FSD, however, exception. be met.

external constraints are the rule rather than the

A user will have schedule requirements which must

A particular data set must be designed to interface

with an existing system.

Special hardware may arrive late in

a development cycle and may vary from that desired.

These are

typical of situations not directly under the developers'

control

which have profound effects on the sequence in which the system is produced.

Now the manager's job becomes still more complex

in planning and controlling development.

Each of these external

constraints may force a deviation from what would otherwise be a classical,

no-dependency development sequence.

Provision may

have to be made for testing, documentation and delivery of

88

products

at intermediate

will typically will p r o b a b l y

points

in the overall

change the schedule

This and

increase the complexity of the m a n a g e m e n t

This is especially hundred thousand

true on a very large project

lines of source code or more),

realistic

schedule may well require

developed

in parallel

fashion

cycle.

from the ideal one,

job.

(several since any

that major subsystems be

and integrated

in a nearly conventional

(hopefully at an earlier point in time than the end of

the project). of 400,000

This was carried out successfully

lines of source code,

on a project

the largest known to the

author to date. When carried to its fullest extent~

top-down development

large system p r o b a b l y has greater effects on quality indirectly,

on productivity)

methodology.

than any other component

Even when competent m a n a g e m e n t

its implementation,

of the

is fully devoted

to

there are two other problems which potential-

ly can arise and must be planned for. overlapping

of a

(and thus,

nature of design,

These both relate to the

development

and integration

in a

top-down environment. The first of these concerns the system to be delivered ly, a user receives

the nature of materials

documenting

to and reviewed by the user.

Typical-

a p r o g r a m design document at the end of the

design phase and must express his concurrence before d e v e l o p m e n t proceeds.

This is impractical

in top-down development

because

d e v e l o p m e n t must proceed in some areas before design

is complete

in others.

a detailed

functional

To give a user a comparable specification

all external

is desirable

aspects of a system,

gorithms of concern to a user,

opportunity,

instead.

This describes

as well as any processing

but does not address

al-

its internal

69

design.

This type of specification is probably more readily

assimilated by typical users,

is more meaningful than a de-

sign document and should pose no problems in most situations. Where standardized procurement regulations

(such as U.S.

Government Armed Services Procurement Regulations) effect,

are in

then efforts must be made to seek exceptions.

top-down development becomes more prevalent,

(As

then it is hoped

that changes to such procedures will directly permit submission of this type of specification.)

The second problem is one of the most severe to be encountered in any of the components and is one of the most difficult to deal with.

It has to do with the depth to which a design should

be carried before implementation is begun.

If a complete,

de-

tailed design of an entire system is done, and implementation of key code in all areas is carried out by the programmers who begin the project,

then the work remaining for programmers added

later is relatively trivial.

In some environments this may be

perfectly appropriate and perhaps even desirable; may lead to dissatisfaction latecomers.

in others it

and poor morale on the part of the

It can be avoided by recognizing that design to

the same depth in all areas of most systems is totally unnecessary.

The initial system design work

"architecture"

(the overworked term

still seems to be appropriate here)

should con-

centrate on specifying all modules to be developed and all inter-module interfaces.

Those modules which pose significant

schedule, development or performance problems should be identified, and detailed design work and key code writing done only on these.

This leaves scope for creativity and originality on

the part of the newer programmers,

subject obviously to review

and concurrence through normal project design control procedures.

On some projects,

the design of entire support sub-

70

systems w i t h interfaces to a m a i n s u b s y s t e m only through standard,

s t r a i g h t f o r w a r d data sets has been left to late in the

project.

Note that while this may solve the p r o b l e m s of chal-

lenge and morale,

it also poses a risk that the d i f f i c u l t y has

been u n d e r e s t i m a t e d .

Thus, here again m a n a g e m e n t is c o n f r o n t e d

w i t h a d i f f i c u l t d e c i s i o n w h e r e an incorrect a s s e s s m e n t may be n e a r l y i m p o s s i b l e to r e c o v e r from.

4o

Chief P r o g r a m m e r Teams

The i n t r o d u c t i o n of CPT~s should be a natural o u t g r o w t h of t o p - d o w n development.

The use of a smaller group b a s e d on a

nucleus of e x p e r i e n c e d people tends to reduce the c o m m u n i c a t i o n s and control p r o b l e m s e n c o u n t e r e d on a typical project.

Use of

the other three c o m p o n e n t s of the m e t h o d o l o g y e n h a n c e s these a d v a n t a g e s t h r o u g h s t a n d a r d i z a t i o n and visibility.

In order for a CPT to function effectively, m u s t be given the time,

r e s p o n s i b i l i t y and a u t h o r i t y to p e r f o r m

the t e c h n i c a l d i r e c t i o n of the project. this poses no problems;

the Chief P r o g r a m m e r

In some e n v i r o n m e n t s

in FSD it is sometimes d i f f i c u l t to

achieve b e c a u s e of other demands w h i c h may be levied upon the chief.

In a c o n t r a c t p r o g r a m m i n g e n v i r o n m e n t he may be called

upon to p e r f o r m three d i s t i n c t types of activities:

technical

m a n a g e m e n t - the s u p e r v i s i o n of the d e v e l o p m e n t p r o c e s s itself, p e r s o n n e l m a n a g e m e n t - the s u p e r v i s i o n of the p e o p l e r e p o r t i n g to him,

and c o n t r a c t m a n a g e m e n t

t i o n s h i p s w i t h the customer.

- the s u p e r v i s i o n of the rela-

This latter in p a r t i c u l a r can be a

very t i m e - c o n s u m i n g f u n c t i o n and also is the s i m p l e s t to secure a s s i s t a n c e on.

Hence m a n y FSD CPT's have a p r o g r a m m a n a g e r who

has the p r i m a r y c u s t o m e r i n t e r f a c e r e s p o n s i b i l i t y in all non-

71

technical matters.

The Chief remains r e s p o n s i b l e for technical

c u s t o m e r i n t e r f a c e as well as the other two types of m a n a g e ment;

in m o s t cases this makes the situation manageable,

and

if not then a d d i t i o n a l support can be p r o v i d e d where needed.

The Backup P r o g r a m m e r role is one that seems to cause people a great deal of d i f f i c u l t y in accepting, p r o b a b l y because there are o v e r t o n e s of "second-best"

in the name.

Perhaps the name

could be improved, but the functions the Backup performs are e s s e n t i a l and cannot be d i s p e n s e d with.

One of the p r i m a r y

rules of m a n a g e m e n t is that every m a n a g e r should identify and train his successor.

This is no less true on a CPT and is a

m a j o r reason for the e x i s t e n c e of the Backup position.

It is

also h i g h l y d e s i r a b l e for the Chief to have a peer w i t h w h o m he can freely and openly interact, e s p e c i a l l y in the critical stages of system design. and balance on the Chief.

The Backup is thus an e s s e n t i a l check Because of this,

it is important that

the C h i e f have the right of refusal on a p r o p o s e d Backup;

if

he feels that an open r e l a t i o n s h i p of mutual trust and respect cannot be achieved,

then it is useless to proceed.

The require-

m e n t that the Backup be a peer of the Chief also should not be waived,

since it is always p o s s i b l e that a Backup w i l l be

called on to take over the project and must be fully q u a l i f i e d to do so.

One of the limits on a CPT is the scope of a p r o j e c t it can r e a s o n a b l y undertake.

It is d i f f i c u l t for a single CPT to get

much larger than eight people and still permit the Chief and Backup to e x e r c i s e the e s s e n t i a l amount of control and supervision.

Thus, even at the h i g h e r - t h a n - n o r m a l p r o d u c t i v i t y rates

achievable by CPT's it is d i f f i c u l t for a single Team to produce

72

much more than perhaps 20,000 lines of code in its first year and 30-~0q000 lines thereafter.

Larger projects must there-

fore look to multiple CPT'sr which can be implemented in two ways.

First, as mentioned above under Top-Down Development,

interfaces may be established and independent subsystems may be developed concurrently by several CPT's and then integrated. Second,

a single CPT may be established to do architecture

and nucleus development for the entire system.

It then can

spin off subordinate CPT's to complete the development of these subsystems.

The latter approach is inherently more appealing,

since it carries the precepts of top-down development through intact.

It is also more difficult to implement;

the experiment

under way by the author ran into problems because equipment being developed concurrently ran into definition problems and prevented true top-down development. It is difficult to identify problems unique to CPT~s which differ from those of top-down development discussed above.

Perhaps the

most significant one is the claim frequently heard that,

"We've

had Chief Programmer Teams in place for years - there's nothing new there for us. ~' While it is certainly true that many of the elements of CPT~s are not new, the identification of the CPT as a particular ciplined,

form of functional organization using a dis-

precise methodology

particular,

suffices to make it unique.

In

the emphasis on visibility and control through

management code-reading,

formal structured programming tech-

niques and DSL's differentiate true CPT's from other forms of programming teams [17]

And it is this same set of features

which make the CPT approach so valuable in a production programming environment where close control is essential if cost and schedule targets are to be met.

73

MEASUREMENT RESULTS

It is not possible, because it would reveal valuable business information,

to present significant amounts of quantitative

information in this paper.

At this time, the results of the

measurement program do show substantial improvements in program~ing productivity where the new technology has been used. A graph has been prepared where each point represents an FSD project.

The horizontal axis records the percentage of struc-

tured code in the delivered product, records the productivity. the project,

and the vertical axis

(The latter includes all effort on

including analysis,

design,

testing, management,

support and documentation as well as coding and debugging.

It

also is based only on delivered code, so that effort used to produce drivers,

code written but replaced, etc., tends to

reduce the measured productivity.)

A weighted least squares

fit to the points on the graph shows a better than 1.5 to 1 improvement in the coding rate from projects which use no structured programming to those employing it fully.

It is also possible, leased elsewhere,

because the data has already been re-

to make one quantitative comparison between

productivity rates experienced using various components of the technology on some of the programming support work which FSD has performed for the National Aeronautics and Space Administration's Apollo and Skylab projects.

This comparison

is especially significant because the only major change in approach was the degree to which the new methodology was used; the people, experience

level, management and support were all

substantially the same in each area.

Figure 6 shows the productivity rates and the components of the technology used.

In the Apollo project,

a rate of 1161

74

bytes of new code per m a n - m o n t h was e x p e r i e n c e d on the Ground Support S i m u l a t i o n work. all p r o j e c t effort.)

(Again, all numbers are based on over-

This work used none of the components

d e s c r i b e d in this paper. the Skylab project,

In the d i r e c t l y c o m p a r a b l e effort on

a DSL,

s t r u c t u r e d p r o g r a m m i n g and t o p - d o w n

d e v e l o p m e n t were all employed,

and a rate of 3756 bytes of

new code per m a n - m o n t h was a c h i e v e d -- almost twice as m u c h new code was p r o d u c e d w i t h s l i g h t l y m o r e than half the effort. It is i n t e r e s t i n g also to remark that this was a c h i e v e d on the p l a n n e d s c h e d u l e in spite of over Ii00 formal changes made d u r i n g the d e v e l o p m e n t of that product, m a n p o w e r and c o m p u t e r time.

along w i t h cuts in b o t h

Finally, w h i l e the i m p r o v e m e n t

may rest to some extent on the similar w o r k done previously, this was not d e m o n s t r a t e d C o n t r o l work.

in the p a r a l l e l M i s s i o n O p e r a t i o n s

There p r o d u c t i v i t y d r o p p e d from 15~7 to 841 bytes

per m a n - m o n t h on c o m p a r a b l e w o r k which in neither case used anything other than a DSL.

In a d d i t i o n to m a k i n g q u a l i t y m e a s u r e m e n t s

and d e t e r m i n i n g

p r o d u c t i v i t y rates, the m e a s u r e m e n t a c t i v i t y has served a number of other useful purposes.

First,

it has built up a substantial

data base of i n f o r m a t i o n about FSD projects. added,

As new data is

checks are made to ensure its validity,

data is r e v i e w e d before being added.

and q u e s t i o n a b l e

The result is an in-

c r e a s i n g l y c o n s i s t e n t and useful set of data.

Second,

it has

enabled FSD to begin studies on the value of the components of the methodology. of other factors p r o j e c t activity. ongoing projects~

Third,

and related,

(e.g.~ environment, Fourth,

it also permits the study personnel)

affecting

it is used to assist in r e v i e w i n g

w h e r e the o b j e c t i v e data it contains has

p r o v e d quite valuable. for p r o p o s e d projects,

And fifth,

it is used in e s t i m a t i n g

w h e r e it affords an o p p o r t u n i t y to com-

pare the new w o r k a g a i n s t similar w o r k done in the past, to i d e n t i f y risks w h i c h may exist.

and

75

CONCLUSIONS

It should be clear at this point that FSD's experience has been a very positive one.

Work remains to be done, particularly

in the management of top-down development and the formalization and application of CPT's.

Nevertheless,

FSD is fully committed

to application of the methodology and is continuing to require its use.

In retrospect,

the plan appears to have been a success and

could serve as a model for other organizations interested in applying the ideas.

The FSD experience shows that this is

neither easy nor rapid. and, most important,

It takes substantial time and effort

commitments and support from management,

to equip an organization to apply the methodology.

To summarize,

it appears that once a base of tools,

and education exists,

standards

it is most appropriate to begin with use

of structured and top-down programming and DSL's.

Where the

people, knowhow and opportunity exist, then top-down development should be applied on a few large, complex projects to yield an experienced group of people and the required management techniques.

It is likely that one or more of these may

also present the opportunity to introduce a CPT.

This is es-

sentially the approach that FSD has taken, and it appears to be an excellent way to organize for structured programming.

76 REFERENCES

[ 1]

Datamation,

[2]

B. W. Boehm,

"Software

Assessment",

Datamation,

H. Do Mills,

Mathematical

Programming,

Report No. FSC 72-6012,

[3]

Vol~

Gaithersburg,

[4]

H. D. Mills, Procedures,

Production

Vol.

A Quantitative

19, No. 5, May,

Foundations

Chief Programmer

Teams:

Maryland,

USA, June,

~'Chief Programmer

Programming",

1973, p. 52

for Structured IBM Corporation,

USA, February,

Report No. FSC 71-5108,

F° T. Baker,

1973, pp. 50-63

and its Impact:

Maryland,

Gaithersburg,

[ 5]

19, No. 12~ December,

1972 Principles

and

IBM Corporation, 1971

Team Management

IBM Systems Journal,

of Vol.

ii.~

Noo I, 1972, pp. 56-73

[ 6]

Fo T. Baker;

~'System Quality Through Structured

ming ~', AFIPS Conference

Proceedings,

Vol.

Program-

41, Part I,

1972r pp~ 339-343 [7]

Federal

Systems Center Structured

Report No. FSC 72-5075, Maryland,

[8]

USA, July,

Improved Technology ment Overview, August,

[ 9]

Programming

IBM Corporation,

1973

Guide,

Gaithersburg,

(revised)

for Application

IBM Corporation,

Development:

Bethesda,

Manage-

Maryland,

USA,

1973

Federal Systems Center Programm.ing Librarian's Report No~ FSC 72-5074, Maryland,

USA, April,

IBM Corporation,

1972

Guide,

Gaithersburg,

77

[io]

F. M. Luppino and R. L. Smith, Programming (PSL) Functional poration, Contract

Requirements:

Gaithersburg,

[ii]

Final Report,

Maryland,

#F30602-74-C-0186

HQ Rome Air Development

IBM Cor-

USA, prepared under

with the U. S. Air Force

Center,

Griffiss Air Force Base,

New York, USA, July,

197~

Contracting

Mr. Paul DeLorenzo)

officer,

Support Library

(Release subject to approval of

P. W. Metzger,

Managing a Programming

Prentice-Hall,

Englewood Cliffs,

Project,

New Jersey,

USA,

1973 [12]

R. C. McHenry, Structured Maryland,

[13]

Management Concepts

Programming, USA, November,

HIPO - Hierarchical

Gaithersburg,

1972

Input - Process - Output Docu-

mentation Technique: Corporation,

for Top Down

IBM Corporation,

Audio Education Package,

Form No. SR20-9~13

IBM

(Available through

any IBM Branch Office) [14]

M. M° Kessler, ming Macros, USA,

[15]

Assembly Language Structured

IBM Corporation,

September,

Gaithersburg,

ProgramMaryland,

1972

G. F. Weinwurm et al, Research into the Management Computer Programming:

A Transitional

Estimation

System Development

Techniques,

Santa Monica,

California,

from the Clearinghouse nical Information

Analysis

USA, November,

Corporation, 1965

for Federal Scientific

as AD 631 259)

of

of Cost (available and Tech-

78

[16]

E. W. Dijkstra~ ming System", No~ 5, May,

[17]

~'The Structure

Communications

ii.,

1968, pp. 341-346

G. M. Weinbergr

The Psychology

ming, Van Nostrand 1971

of the THE Multiprogram-

of the ACM, Vol.

Reinhold,

of Computer Program-

New York, New York, USA,

79 J ro~trelation. On the other hand, they need not be checked at any reference but just at the interface. This leads to the second reason: It is the designers responsibility to minimize the information passed across interfaces.

7.

SUMMARY

The preceding sections presented an attempt to handle errors systematically. Error isolation a n d ~ecovery were treated under one aspect~ Error isolation led to the realization of the p a r t i a l fanction concept and the provision of resource manager~. Error recovery in addition necessitated the i n t r o d u c t i o n Of a c o m m i t m e n t discipline in conjunction with a h i e r a r c h i c a l structure of processes.

REFERENCES

I.

J.J. Ho~ning and B. Eandell: Process Computing Surveyse Vol. 5, No I, March 1973

2.

N. Wirth: Informatica

.

The Programmin 9 I, 35-63 (1971)

N. Habermann: Language Pascai,

Language

Structuring,

Pascal,

Acta

3ritical Zomments on the Programming Acta Informatica 3, ~7-57 (1973)

P. Brinch Hansell: Concurrent Programming Computing Surveys, Vol. 5, No 4, December 1973 5~

Ch. T. Dawies, Jr.: Recovery Semantics System, 1973 P r o c e e d i n g s of the ACM

6.

L.A. Bjork: Proceedings

recovery Scenario of the ACM

for

a DB/DC

for

Conceptsr

a

DB/DC

System,

1973

7. D. L. Parnas: Software Engineering or Methods for the Multi-Person Construction of Multi-Version Programs, published in these proceedings 8. N. Wirth: The Design of 1, 309-333 (1971)

a P A S C A L Compiler~

Software,

Vol

97

Precedence Relation between Processes

Fig.

I

98

Mapping of Input States to Output States defined by a Process

l~p = {X+,X2} d

X2 Xl

~ig. 2

99

Distinction between Functional and Exceptional Actions of a Process I

~ig. 3

IIII

I

II

I

100

Use of Rangesand Error Descriptions

Example 1: 1 X(IXl '

The result form.

It

is an A P L G O L is left

source p r o g r a m in

in a global

expanded,

c h a r a c t e r array

indented

v a r i a b l e named

"OUT".

The

APLGOL

System

was d e s i g n e d

requires d i f f e r e n t human or editing its

a p r o g r a m than when

structure.

d e s i g n e d to

accept text with

the A P L G O L

the

user

to display

compiler

c o m b i n a t i o n s of

has

to the

p r o d u c e s source p r o g r a m s w i t h

in statements,

i n d e n t i n g for

been

a b b r e v i a t e d and

and m a n y source statements

line, w h i l e the REVERSE c o m p i l e r fully spelled k e y w o r d s

that

he lists a p r o g r a m

Consequently,

completely spelled keywords

with t w o - s p a c e

recognizing

factors c a p a b i l i t i e s w h e n he is typing

one s t a t e m e n t

each layer

per line,

of nesting.

Thus,

a

p r o c e d u r e can be entered as:

SAMPL; ~ A~B I~B A+C; ~ J÷I ~ L(N-1 +I)÷2 L2+L3÷L4L-J-I; L4÷ -L-l+pY÷7 DYADF L; E; E E A÷D; 2:ppZ T X C[2]÷(I+pZ)÷N~ Z+N,C,N,P,Q,R; E while

it is

p r o d u c e d by

the

reverse c o m p i l e r

s u b s e q u e n t editing as: ~ROCEDURE SAMPL~ ZF A~B ~HEN BEGIN A+C; ~OR J÷I ~NTIL L(N-I+I)~2 ~0 L2÷LS+L4L-J-I; BEGIN Lq÷-L-I+pY÷7 DYADF L; END~ END ELSE A÷D; ~Z i : p p Z ~HE~ [XIT C[2]÷(I+pZ)÷N~ Z+N,C,N,P,Q,R~ END PROCEDURE

and used

for

173

3.1

The A P L G O L Editor

A single c o n t e x t - o r i e n t e d editor is used to enter new programs, edit

old

ones,

and

p a t t e r n e d after with typical APL not to

of

a

lines

of text

b o t t o m of

relative to

A P L G O L statements lines

may

an implied

or down

a few

The text

to

cursor,

has no special

enter

an

one

replacing,

lines or

been

to context and

changing

deleting,

and p h y s i c a l lines;

be used

has

4). C o m p a r e d

functions include locating

inserting,

the text.

editor (Ref.

this editor is keyed

Its

cursor up

This

VM/370

c h a r a c t e r pattern,

pattern to another,

m o v i n g the

programs.

editor in

editors,

line numbers.

occurrence

list

the CMS

the next character

or p r i n t i n g

and,

finally,

to the

top or

r e l a t i o n between

an a r b i t r a r y

A P L G O L statement,

number of or

many

statements can be entered on a line.

APLGOL

source

Keywords all recognition,

programs are begin w i t h

actually

an u n d e r l i n e d

APL

character

first letter

arrays. for easy

while other letters are not u n d e r l i n e d in order to

reduce keystrokes.

Also, w h e n programs

the first u n d e r l i n e d

are being entered,

letter need be used.

facilitate the e n t e r i n g

of programs and to

This

only

is intended to

reduce m i s s p e l l i n g

of keywords.

To create an entirely new source

p r o g r a m the editor is invoked

by entering:

NEWPGM+EDIT

(O,N)p

~

t

974

where

NEWPGM

specifies and may lines

is the

the accept

commands

of source

is i n v o k e d

name

line width.

text.

according

of

for

new

source

is then

inserting,

To edit to the

the

The e d i t o r

and

N

C o m m a n d Mode

deleting,

a text array,

following

text~

in

or c h a n g i n g

the edit p r o c e d u r e

example:

ARRAYs-EDIT A R R A Y

where

ARRAY

the newly

is the

edited

and new texts

typing

may

INSERT

to abort

status

The E D I T O R ....

typing

this,

text

an

any other

DELETE

Mode

by

the

....

the

text

editing

following

after

"I" f o l l o w e d lines

the other

Insertion entered

editing

source

edited

and

for the old

either into

by

typing

internal

process

"FILE"

APL,

without

or by

affecting

commands:

the c u r r e n t

is in C o m m a n d Mode,

successive

one after

being

can be used

or I ....

W h e n the e d i t o r by

the old text names

of the procedure.

accepts

Insert

terminate

the A P L G O L

"QUIT"

the prior

of both

Different

if desired.

The p r o g r a m m e r to t r a n s l a t e

name

text.

and

pressing

by a

line.

Insertion carriage

Mode

is e n t e r e d

return.

Following

of text m a y be inserted

as they return

are to appear to

Command

the c a r r i a g e

return

by t y p i n g

in the text. Mode,

a

null

on a n e w

them

To leave line

is

line b e f o r e

character.

n or D n

Delete

n

lines b e g i n n i n g

with

the

current

line.

If

n is

175

omitted,

then

1 is assumed.

PRINT n or P n Print n

lines b e g i n n i n g

omitted,

1 is assumed.

NEXT n

w i t h the c u r r e n t

line.

If

n is

or N n

Step

forward

assume

n lines

in

the Text.

If n is

omitted

then

i.

UP n or U n Step up n lines.

If n is o m i t t e d

then assume

i.

TOP or T Position

line p o i n t e r

at first line

line p o i n t e r

at last line

in text.

B O T T O M or B Position

L O C A T E / .... / Search

or

L/ .... /

the lines

containing

the

following text s t r i n g

are used to

delimit

part of the

string being

be used

in text.

....

the argument

as a delimiter.

of the function,

the c u r r e n t

line

for the line

The c h a r a c t e r s of locate;

searched

for.

If the p o i n t e r

the search will b e g i n

/

and

/

they are not

Any c h a r a c t e r is on the last

may line

at the

top of the

and replace

it by text2

function.

CHANGE

/textl/text2/

Search

or C / t e x t l / t e x t 2 /

the c u r r e n t

if it occurs. delimiter

line for textl,

As noted,

and may

above,

be replaced

the c h a r a c t e r by any

/ serves

character.

as a

If textl

176

is null,

text2

If text2

is nu!ll

REPLACE

textl

at the b e g i n n i n g

of the

line.

is deleted.

.... or R ....

Replace given,

FILE

is i n s e r t e d

the

current

this

line by the

is the same

as DELETE,

text

....

If no

UP,

INSERT

the newly

edited

text

is

combined.

or F Leave target

the editor array

and a s s i g n

text

to the

status

of the

specified.

QUIT or Q Leave

the editor

function. it

and make

will

still

function,

remain

If

the a r g u m e n t

DELETE

is such

that

last

line.

point

3.2

The

The A P L G O L

compiler

syntax text

to the

is

scanner, generator.

initialize invoking

the

then

the

line,

line

to editing,

it was

a

defined

be unaltered.

NEXT,

Similarly,

the top first

to

If

prior

PRINT,

LOCATE,

the line p o i n t e r w o u l d

end of the function,

point b e y o n d

will

to the

undefined

undefined.

its d e f i n i t i o n

Note:

to the

no change

If the f u n c t i o n was

line p o i n t e r if the

move past

or the

is set to point

the UP c o m m a n d line p o i n t e r

tries is set

to to

in the text.

Compiler

organized

into three

(2) a lexical One of tables

the syntax

two for

scanner.

scanner, driving the

main and

sections: (3) an

procedures

appropriate

(i)

a

APL o b j e c t is

syntax

used

to

before

177 The syntax scanner is the c o n t r o l l i n g procedure, symbols

from

rules, and

the

lexical scanner,

then i n v o k i n g

the text

obtaining meta

identifying g e n e r a t o r as

the

grammar

n e c e s s a r y to

produce the a p p r o p r i a t e A P L object text.

On each invocation, symbol

the lexical scanner returns

number r e p r e s e n t i n g

character,

or

an APL

scanned for one

a

expression.

e.g.,

the first

Source

special

text characters

are

search for a specific meta symbol;

c h a r a c t e r in each reserved

w o r d is u n d e r l i n e d m e t a symbol, both

scanner and for the reader.

listed in

a single m e t a label,

characters, which is

d i s t i n g u i s h i n g it as a p a r t i c u l a r

for the lexical symbols

word,

of a very limited set of

then used to key a d e t a i l e d

to aid in

reserved

Appendix

II are

The terminal meta

detected

in the

lexical

scanner.

When a

grammar rule is i d e n t i f i e d

text g e n e r a t o r is

by the syntax

invoked with a number

scanner,

the

c o r r e s p o n d i n g to that

grammar rule to locate the applicable p o r t i o n of the generator. Information a c c u m u l a t e d in

the compile stack and

then used to g e n e r a t e labels, branches,

elsewhere is

or to produce a single

APL text line from a b u f f e r e d APL expression.

An output

p r o c e d u r e is

text,

is invoked from the

and

line is created. workspace,

so

employed to

form an

array of

object

text g e n e r a t o r each time

a new

This object text array remains in the c o m p i l e r that

extraneous b r a n c h e s

and

labels

can

be

removed after all the object text has been produced.

The semantics

of the control

structures are indicated

in the

e n c l o s e d figures.

In

the

APL

object

compiler, many of

text

produced

by

the labels and branches

the

simple

one-pass

may be unnecessary,

178

particularly absolute

the o b j e c t After by

those

branch. code

all the

an

target

of the

appear

on a

removed,

3.3

It

label

absolute single

been

internal

APLGOL

possible

to

produced

by

the single

branch.

one

and

APL

APLGOL

implementation

entity

from

with

if m u l t i p l e

labels

all

but one

are

accordingly.

systems

to r e t a i n

and to

rather

heavily

the other there

(Re,.

dissimilar,

this

to

requires

and

not m u l t i p l e

from

forms

of

of the

In the o r i g i n a l

not

only

a separate

guarantee

level.

by h a v i n g

been

advantage

source was

one could

reduced

it has

by the p a t t e r n s

The m a i n

same m a i n t e n a n c e

are

This

single

translation

conditioned way.

are

The

5) the A P L G O L

the A P L object, at the

a

translate

as necessary.

get out of s y n c h r o n i z a t i o n .

requirements

removed

to the

text,

translators.

is

form is that to

followed

for each direction.

both

however,

program

as

references.

a label and

and an

tables

to r e f e r e n c e s

Further,

are c h a n g e d

APL are

translation

were

revised

of o b j e c t

label

builds

and their

detected

representation,

source

forms

be

each APL p r o c e d u r e

construct

A P L to APLGOL,

storage

can

in

a pair of t r a n s l a t o r s ,

Although

of a

Back-Compiler

customary

form for

labels

suitably

line

The A P L to A P L G O L

or from a c h a r a c t e r

both

showing

and the r e f e r e n c e s

has

consist

the c o m p i l e r

code has been produced,

branch

to that

which

them,

is formed object

absolute

references

statements

To remove

that

Additionally, a single

copy

of a procedure.

The

reverse

stylized

the o r i g i n a l duplicate

translation

canonical

form,

APLGOL

program

the original,

an i n d e n t e d graphically.

format w h i c h In this

from

APL to

w h i c h may as

entered.

the t r a n s l a t o r displays

form,

keywords

APLGOL

appear

will p r o d u c e

quite

Rather was

different

than a t t e m p t

designed

the s t r u c t u r e are fully

a

from to

to produce

of the p r o g r a m

spelled,

and each

179

control

level is

statement

per

consistently

line.

This

indented,

graphical

typically with

representation

p r o g r a m is very useful to depict the p r o g r a m structure. this form is

a v a i l a b l e for listing an A P L G O L

been s u c c e s s f u l l y

t r a n s l a t e d to

always

in a

be listed

APL, the

standard format,

of

one the

Because

p r o g r a m that has

APLGOL program thereby p r o m o t i n g

can a

c o n s i s t e n t style.

In general the of the

reverse c o m p i l e r does not

A P L G O L program.

number of APL statements edited

to have

present in

two

however, keep

in each block.

or m o r e

the o r i g i n a l

will add a BEGIN

It does,

change the structure

statements

A P L G O L source,

...END pair

count of

the

If the A P L f u n c t i o n is where

only one

the reverse

was

compiler

around the statements during back

compilation.

4.

ACKNOWLEDGEMENTS

The author wishes to a c k n o w l e d g e the S c i e n t i f i c Center staff

work of the IBM Palo Alto

for their i n s p i r e d work

i m p l e m e n t i n g and testing the A P L G O L

system.

in designing,

The m a j o r o r i g i n a l

c o n t r i b u t i o n s were made by Robert A. Kelly, John R. Walters and Dan M c N a b b Myers

(See Refs.

for the

5-7).

Special thanks are due to H. Joseph

more recent

modifications

and testing

APLGOL

under m a n y conditions.

The

author is

making

p a r t i c u l a r l y grateful

available to

us

functions w r i t t e n in APL.

his

to John

m o s t recent

(Ref.8)

R. Walters

copyrighted

for

APLGOL

180

APPENDIX I:

References

io

APL/360 User's Manual, GH20-0683,

2.

Dijkstra,

E.

Lecture,

Communications

Wo~

"The

Humble of

IBM Corporation.

Programmer",

the

ACM,

Vol.

1972 15,

Turing No.

I0,

October 1972. 3.

Dijkstra,

E.W.,

Dahl and C.

"Structured Programming",

A. R. Hoare) Academic

(with O.

Press, London,

J.

October

1972. 4.

IBM

Virtual Machine Facility/370:

5.

Kelley,

Edit

Guide, GC20-1805,

IBM Corporation. for

R. A.,

APL",

"APLGOL,

IBM

Palo

a Structured Programming Language

Alto

Scientific Center

Report

No.

320-3299, August 1972. 6.

7.

Kelley,

R.

A.,

"APLGOL,

Programming

Language",

Development,

VOlo

Kelley, R. Programming

an

IBM

Experimental Journal

of

Structured Research

and

17, No. I, January 1973.

A. and Walters, System for

J. R.,

APL"r

IBM

"APLGOL-2 A Structured Palo Alto

Scientific

Center Report No. 320-3318, August 1973. 8.

Kelley, R. Programming

A. and Walters, Language

J. R.,

System

APL-VI Conference, May 14-17, 9.

Knuth~ Stanford

D.

E.,

"A

University

STAN-CS-73-371,

Review

for

Proceedings

of

1973. of

Computer

June 1973.

"APLGOL-2 A Structured APL".

Structured Science

Programming" Department,

181

APPENDIX .

.

.

.

.

.

.

.

II: .

.

.

APLGOL

.

.

.

.

.

.

.

.

.

SYNTAX .

.

.

[I]

[2]

[3] [4]

< S T A T E M E N T LIST> ::= < S T A T E M E N T > ; I

[5] [6]

< S T A T E M E N T > ::: ::= < C O M M E N T S T A T E M E N T > I

[9] [10] [11]

< S T A T E M E N T - A > ::= < E X P R E S S I O N > :

NULL :

~12]

~13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23]

P R O C E D U R E _I_

::=

_I_

~:=

PROCEDURE

LIST>

;

;

EZZT [XIT < B E G I N > < S T A T E M E N T L I S T > ~NTIL < T R U E P A R T > < S T A T E M E N T > ~ 0 < S T A T E M E N T > < W H I L E H E A D > DO < S T A T E M E N T > DO < C A S E E X P R > B E G I N < S U B C A S E L I S T >

[24]

CASE>

::=

HEAD>

~F



EASE

182

A. P. P. E. N. D.I.X. . II: . . .

A.P .L G. O. L.

.

.

[40]

[41]

B ~0 [3] BEGIN [4] REPEAT [5] B+C[I]; [6] WHILE I z in WORLD5° We may attempt to achieve this relation directly, by inserting the instruction sort(x z) in the middle of the program, yielding

251

the p r o g r a m sort(x y) sort(x z) sort(y z). However, this action, though ultimately

correct, risks disturb-

ing the relation x(y p r o t e c t e d in WORLD5.

A more prudent course

is to pass the predicate back still further to WORLDI, r e l a t i o n is p r o t e c t e d at all.

where no

Passing the r e l a t i o n back to WORLDI

gives the two predicates x ( z if y > z and x ~ y in WORLDI and y ( z if x > z and x > y in WORLDI. We must achieve both of these relations.

If we achieve the first

r e l a t i o n first, we insert the i n s t r u c t i o n sort(x z) at the beginning of the program, vially satisfied.

then beeuase x(z, the second relation is tri-

The p r o g r a m o b t a i n e d is the

sort(x z) sort(x y) s o r t ( y z). If, on the other hand we try to achieve the second r e l a t i o n first, by i n s e r t i n g the i n s t r u c t i o n sort(y z) at the b e g i n n i n g of the program,

the first r e l a t i o n will also be satisfied automatically,

and the p r o g r a m w i l l then be sort(y z) sort(x y) sort(y z). The simultaneous goal p r o b l e m is essential to all p r o g r a m synthesis:

what we have said in this section only begins to explore

the subject. This concludes the p r e s e n t a t i o n of our basic p r o g r a m synthesis techniques.

In the next part we will show how the same techni-

ques work together in the synthesis of some more complex examples.

III. PROGRAM SYNTHESIS:

THE P A T T E R N - M A T C H E R

We w i l l present the synthesis of a simple p a t t e r n - m a t c h e r to show how the concepts discussed in the pre%ious n o n - t r i v i a l problem.

section can be applied to a

Later, in Part IV, we shall show how we can con-

252

struct an even more complex program~ the u n i f i c a t i o n a l g o r i t h m of R o b i n s o n thesize.

[1965], by m o d i f y i n g the p r o g r a m we are about to syn-

We must first describe the data structures

tive operations

i n v o l v e d in the p a t t e r n - m a t c h i n g

and primi-

and u n i f i c a t i o n

problems.

A.

Domain and N o t a t i o n s

The main objects in our domain are 9 x p r e s s i 0 n ~ and substitutions. i.

Expressions

Expressions

are atoms or nested lists of atoms{ cog.,

is an expression.

An atom may be e i t h e r a v a r i a b l e or a constant.

(In our examples we will use A,B,C,... for variables.)

(A B (X C) D)

for constants

We have basic p r e d i c a t e s

and U,V,W,...

atom, var and eonst to

d i s t i n g u i s h these objects: atom(R) var(~) and

~ ~ is an atom, ~ Z is a variable,

const(Z)

~ ~ is a constant.

To decompose an expression, we will use the p r i m i t i v e functions head(~)

and tail(Z)~

defined w h e n ~ is not an atom.

head(~)

is the first element of Z~

tail(~)

is the list of all but the first element of Z.

Thus head(((A

(X) B) C (D X)))

: (A (X) B),

tail(((A

(X) B) C (D X)))

: (C (D X)).

We will a b b r e v i a t e head(~)

as £i and tail(~)

as ~2"

To construct e x p r e s s i o n s we have the c o n c a t e n a t i o n function: is any e x p r e s s i o n and m is a n o n a t o m i c expression,

if

~.m is the

e x p r e s s i o n w i t h £ i n s e r t e d before the first element of m.

For

example (A (X) B)

(C (D X)) = ((A (X) B) C (D X)).

The p r e d i c a t e o c c u r s i n ( x ~) is true if x is an atom that occurs in e x p r e s s i o n ~ at any level, e.g., oceursin(A

(C (B (A) B) C)) is true

but o c c u r s i n ( x Y) is false. Finally, we will introduce the p r e d i c a t e

constexp(~), w h i c h is

true if ~ is made up e n t i r e l y of constants.

Thus

253

constexp((A

(B) C (D E)))

constexp(X)

is false.

is true

but

Note that

constexp

differs

from const in that constexp may be

true on nonatomic expressions. 2.

Substitutions

A substitution

replaces

other expressions. of pairs.

certain variables

We will represent

of an expression

a substitution

by

as a list

Thus ( )

is a substitution. The instantiation to an expression

function inst(s £.

For example,

£) applies the substitution if s is the substitution

s

above

and £ is (X (A Y) X) then inst(s

~) is

((A B) (A (C Y)) Note that the substitution currences

(A B ) ) . is applied by first replacing

of X simultaneously

of Y simultaneously

by

(C Y).

all oc-

by (A B) and then all occurrences Thus,

if the substitution

s were

( ), then inst(s

~) would be

(C (A C) C). The empty substitution Thus,

A is represented by the empty list of pairs.

for any expression

~,

inst(A ~) = ~. We regard two substitutions

s I and s 2 as equal

(written Sl=S 2) if

and only if inst(s I ~) = inst(s 2 ~) for every expression

~.

Thus

( ) and ( ) are regarded

as the same substitution.

We can build up substitutions position):

If v is a variable

using the functions

p a i r and o (com-

and t an expression,

pair(v t) is

254

the s u b s t i t u t i o n that replaces v by t; i.e.~ a ~ i ~ ( v t) : (). If s I and s 2 are two substitutions~

Sl°S 2 is the s u b s t i t u t i o n

w i t h the same effect as a p p l y i n g s I followed by s 2.

Thus

inst(sl°s 2 ~) : inst(s 2 inst(s I ~)). For examp!e~

if

s I = ( ) and s then

2

= ( )

Sl°S 2 = ( ). Note that for the empty s u b s t i t u t i o n A A°s : s°A : s for any s u b s t i t u t i o n

B.

s.

The s p e c i f i c a t i o n s

The p r o b l e m of p a t t e r n - m a t c h i n g may be d e s c r i b e d as follows. are given two expressions,

but ar~ is assumed to contain no variables; is true.

We

p a t and ar___gg. Pat can be any expression, i.e.,

constexp(a_r~)

We want to find a s u b s t i t u t i o n z that transforms pat in-

to arg, such that inst(z pat)

= arg.

We will call such a s u b s t i t u t i o n a match.

If no m a t c h exists, we

want the p r o g r a m to return the d i s t i n g u i s h e d For examp!e~

constant NOMATCH.

if is

(X A

(Y B))

and is (C A

(D B)),

we want the p r o g r a m to find the m a t c h ( ). On the other hand, if pat is

(X A (X B))

arg is

(B A (D B)),

and then no s u b s t i t u t i o n will t r a n s f o r m pa t into arg, so the p r o g r a m will y i e l d NOMATCH. This version of the p a t t e r n ~ m a t c h e r

is simpler than the pattern-

m a t c h i n g a l g o r i t h m s u s u a l l y i m p l e m e n t e d in p r o g r a m m i n g languages because of the absence of "sequence"

or "fragment" variables.

Our variables must m a t c h exactly one expression, whereas ment v a r i a b l e may m a t c h any n u m b e r of expressions.

a frag-

Because of

255

the absence of fragment variables, be unique.

a match,

if it exists, w i l l

Thus if pat is

(X Y Z) and

at@ is

((A B) C (A B)),

X and Z must be bound to

(A B), and Y must be bound to C.

If

pat is (X Y) and arg is (A B C), no match is possible at all.

(If X and Y were fragment variables,

four matches would be possible.) In m a t h e m a t i c a l n o t a t i o n the specifications

for our p a t t e r n - m a t c h e r

are:

Goal i: I match(p0t a~g) : Find z such that inst(z pat) = arg

r

.......else ..... z = NOMATCH

where

......

"Find z such that P(z) else Q(z)" means find a z such that

P(z) if one exists; otherwise,

find a z such that Q(z).

The above s p e c i f i c a t i o n s do not c o m p l e t e l y capture our intentions; for instance,

if

pat is (X Y), and arg is (A B), then the s u b s t i t u t i o n z = ( ) will satisfy our s p e c i f i c a t i o n s as well as z

= ( ).

We have n e g l e c t e d to include in our s p e c i f i c a t i o n s that no substitutions

should be made for variables that do not occur in pat.

We will call a m a t c h that satisfies this additional condition a most seneral match. An i n t e r e s t i n g c h a r a c t e r i s t i c of the synthesis we present is that even if the user forgets to require that the match found be most general~

the system will be able to strengthen the specifications

a u t o m a t i c a l l y to imply this condition, in Section II-D.

using the method outlined

T h e r e f o r e we will begin the synthesis using the

w e a k e r specifications.

C.

T h e Synthesis:

The Base Cases

Rather than listing all the k n o w l e d g e we require in a special see-

256

tion

at the beginning~

about

to be used.

vial we will

we will m e n t i o n

Furthermore~

omit

a rule

if a rule

it entirely.

only w h e n

seems

The g e n e r a l

it is

excessively

strategy

tri~

is to first

work on Goal

2:

Find

If this

z such that

inst(z

pat)

is found to be i m p o s s i b l e

such

z exists)~

Goal

3:

we will

is seen to be t r i v i a l l y

Thus~

from now on we will be w o r k i n g

ever,

in w o r k i n g

cases

in w h i c h

return

Goal

NOMATCH~

We have

satisfied

is i m p o s s i b l e

2 is p r o v e n

which

base

Goal

z to be NOMATCH.

on Goal

a portion

to achieve.

impossible~

satisfies

in our k n o w l e d g e

by t a k i n g

primarily

on any goal we devote

the goal

that no

z = NOMATCH;

which

that

if it is proven

work on

Find a z such that

showing

= arg.

(i.e.,

2.

How-

of our time

to

When we find

we will

automatically

3.

a number

of rules

concerning

inst,

including Rule

i:

inst(s

x) = x for any s u b s t i t u t i o n

Rule

2:

inst(pair(v

s

if constexp(x) t) v) = t

if var(v) We assume

that

these

tion i n v o c a t i o n cons texp(pat) ditions;

on Goal

and pat

their

truth

to the program~ thetical

Rule

We will

are r e t r i e v e d

2.

Rule

= arg.

In the

i tells have

as they

to t i g h t e n

stand now~

The p o r t i o n

of the p r o g r a m we have

case that

on the p a r t i c u l a r

the

inputs

conditions

is a s a t i s f a c t o r y

simply

constructed

con-

for a hypo-

specifications

we will

func-

of these

as conditions

any s u b s t i t u t i o n

later;

arg)

in the

either

case that both of these

our p r o g r a m

match(pat

only

prove

depends

predicates

us that

occasion

by p a t t e r n - d i r e c t e d

i applies

We cannot

or f a l s e h o o d

We use these

world-split.

are true~ match.

rules

return

of z÷any.

so far reads

=

if c o n s t e x p ( p a t ) then

if pair = ar_~g then

z + any

else.o. On the other

hand~

in the

case

constex~(pat)

and pat

~ arig, Rule

I

257

tells

us that inst(z

for

any

z.

pat)

Hence

~ arg

we are

led to try

to s a t i s f y

Goal

3 and take

z÷NOMATCH. We n o w

consider

the

case

~ constexp(pat). Ruel

2 establishes

the

subgoal

var(pat). This

is a n o t h e r

var(pat)

occasion

is true,

g r a m we h a v e

the

for a h y p o t h e t i c a l

program

constructed

m a t c h ( p a t arg)

must

return

world-split.

~air(pat

arg);

When the p r o -

so far is =

if e o n s t e x ~ ( p a t ) then

if p a t

else

= arg

then

z ÷ any

else

z ÷ NOMATCH

if v a t ( p a t ) then

z ÷ pair(pat

arg)

else .... Heneefore

we

also

~ 9onstexp(pat).

ing

that

assume

additional

Rule

3:

This

rule

and y.

- var(pat).

knowledge

inst(s

x.y)

applies

We have

about

the

= inst(s

to o u r Goal

some

Recall

To p r o c e e d

that

function

x). inst(s 2 if p a t : x . y

additional

we h a v e

we m a k e

knowledge

been

use of the

assuming follow-

inst:

y)

for any

for

some

about

substitution expressions

expressions

s.

x

in ge-

neral: Rule

4:

u = uI

u2

if ~ atom(u) R e c a l l that u I is an a b b r e v i a t i o n t i o n for tail(u). Rule

5:

u ¢ v.w

for any u, v,

for h e a d ( u )

and

u 2 is an a b b r e v i a -

and w

if atom(u) Using

Rule

4, we

generate

a subgoal

- atom(pat). Since

we h a v e

can a c t u a l l y Therefore

already prove

assumed

~ atom(pat)

p a t = p a t l . p a t 2 and

~ constexp(pat) using

using

knowledge

Rule

and ~ v a t ( p a t ) , in the

3 our Goal

system.

2 is then

we

258

reduced Goal

to

~:

Find

We n o w m a k e Rule

6:

z such that

use

of some

To p r o v e

Applying

this

for

u a n d v.

arg some

x

rule~

u = a__~l a n d

= U

inst(z

general

• y = u we

patl)~inst(z

p a t 2 ) = ar_~.

list-processing

. v, p r o v e

generate

knowledge.

x = u a n d y = v.

a subgoal

to

show that

• V Applying

Rule

4, w e k n o w

this

is t r u e w i t h

v : arg 2 if

~ atom(arg). This

is a n o t h e r

Thus,

by R u l e

duces

to

Goal

for

6~ in t h e

Find

5:

occasion

a hypothetical

case

that

world-split.

~ atom(arg),

our

subgoal

re-

z such that

inst(z

p a t I)

= arg I

inst(z

p a t 2)

: a r g 2.

and We w i l l

portpone

sidered

the

treatment

other

case,

of ~his

goal

until

after we have

con-

in w h i c h

atom(ar___~g) holds.

In t h i s

for

any

z.

can

take

The

program

case

inst(z

Rule

p a t I)

Hence,

our

5 tells

- inst(z goal

us t h a t

p a t 2)

~ arg

is u n a c h i e v a b l e

in t h i s

case,

a n d we

z + NOMATCH. so f a r

match(pat

is ar~)

:

if constexp(~at) then

else

For the forth

as y e t

using

a r g l " arg 2 •

if pat then

z + any

else

z + NOMATCH

if var(pat) then

z ÷ pair(pat

else

if atom(arg)

untreated

Rule

= arg

4 we

arg)

then

z ÷ NOMATCH

else

.. .

case neither

assume

that

pat

pat

nor

a r ~ is

atomic.

is p a t l . p a t 2 a n d a r g

is

Hence-

259

D.

The Synthesis:

The Inductive Case

We will describe the remainder of the synthesis in less detail, because the reader has already seen the style of reasoning we have been using.

Recall that we had postponed our discussion of

Goal 5 in order to consider the ease in which arg is atomic. Now that we have completed our development of that case, we resume our work on Goal 5: Find z such that inst(z pat I) = argl , and inst(z pat 2) = ~Fg2" This is a conjunctive goal, and is treated analogously to the goal in the simultaneous treated separately.

linear equations example:

each conjunct is

The system will attempt to use a recursive

call to the pattern-matcher itself in solving each conjunct. The interaction between the two conjunets is part of the challenge of this synthesis.

It is quite possible to satisfy each conjunct

separately without being able to satisfy them both together.

For

example, if pat=(X X) and ~rg=(A B) then patl=X , pat2=(X) , argl=A and ar~2=(B).

Thus z=() satisfies the first conjunct,

z=() satisfies the second conjunct, but no substitution will satisfy both conjuncts because it cannot match X against both A and B.

Some mechanism is needed to ensure that the expression

assigned to a variable in solving the first conjunct is the same as the expression assigned to that variable in solving the second conjunct. There are several ways to approach this difficulty. programmer may satisfy the two conjuncts

For instance~ the

separately and then attempt

to combine the two substitutions thereby derived into a single substitution.

Or he may actually replace those variables

in pat 2

that also occur in a~9~l by whatever expressions they have been matched against~ before attempting to match pat 2 against arg 2. Or he may simply pass the substitution that satisfied the first conjunct as a third argument to the pattern-marcher in working on the second conjunct. The pattern-marcher must then check that the matches assigned to variables are consistent with the substitution given as the third argument.

260

We will examine in this

section how a system would discover the

second of these methods

and consider in a later section how the

third method

could be discovered.

method here because

We will not consider the first

it is not easily adapted to the unification

problem. Our strategy We will

for a p p r o a c h i n g

consider the first conjunct

Goal 6:

satisfies

z into the second conjunct, 7:

Prove inst(z

If we are successful

independently:

~at2)

find a broader

this goal~ we will substitute

z.

= ar___~g 2.

that satisfy Goal 6 and

select one that also satisfies

strategy,

rule that relates

if we fail,

In other words, we will try to

class of substitutions

method we introduced to solve conjunctive A p p l y i n g this

that

giving

in Goal 7~ we are done~ however,

we will try to generalize from these

goal is as follows.

Find z such that inst(z patl ) = argl.

If we find a z that

Goal

the conjunctive

Goal

7.

This

is the

goals in Section II-E.

we begin work on Goal 6.

We first use a

the construct

Find z such that P(z) to the construct Find z such that P(z) else Q(z). Rule 7:

To find z such that P(z), find z I such that P(z I) else Q(z I) for some predicate

Q,

and prove ~ Q(Zl). This rule~ Goal

8:

applied to Goal 6 causes the generation

of the subgoal

Find z I such that inst(z I pat I) = arg I else q(Zl)~ and prove ~ Q(Zl).

This subgoal matches NOMATCH.

the top-level

This suggests

Goal i, where Q(z I) is Zl=

establishing

a recursion

at this point,

taking z I = m a t c h ( p a t l argl). T e r m i n a t i o n is easily shown, because both pat I and arg I are proper subexpressions show,

of pat and @rg, respeetively,

according to Rule

7, that

zi~NOMATCH.

This

It remains

to

causes another

261

hypothetical NOMATCH

world-split:

in the case Zl=match(patl

(i.e., no substitution will cause p atl and arg I to match),

we can show that no substitution either,

and hence

match(~atl However,

ar~l)=

can cause pat and arg to match

can take z=NOMATCH.

argl)~NOMATCH,

On the other hand,

we try to show Goal

inst(z I pat 2) = arg 2. we fail in this attempt;

if

7, i.e.,

in fact we can find sample in-

puts pa t and arg that provide

a counter-example

to Goal

7 (e.g.,

pat=(A X), arg=(A B), Zl=A).

Thus we go back and try to genera-

lize our solution to Goal 6. We already have a solution to Goal 6:

we know inst(z I patl)=arg I.

We also can deduce that constexp(argl) , because we have assumed constexp(arg).

Hence Rule i tells us that

inst(w arg I) = arg I for any substitution w. Hence inst(w inst(z I p atl))

= argl,

i.e., inst(zl°w p a t I) = arg I for any substitution w.

Thus having one substitution

z I that sa-

tisfies Goal 6, we have an entire class of substitutions, Zl°W ~ that also satisfy Goal 6. sidered to be "extensions"

These substitutions

of Zl; although

sfy Goal "7, perhaps

some extension

The above reasoning

is straightforward

further work is needed to motivate

z I itself may not sati-

of z I will. enough to justify,

9:

but

a machine to pursue it.

It remains now to find a single w so that Zl°W satisfies Goal

of form

may be con-

Goal

7, i.e.,

Find w such that inst(zl°w pat.2)=ar__zg2,

or equivalently, Find w such that inst(w inst(z I P at2))=arg 2. applying Rule 7, we establish Goal i0:

a new goal

Find w such that inst(w inst(z I pat2))=arg 2 else Q(w), and prove

~ Q(w)

This goal is an instance inst(z I pat2),

of our top-level

goal, taking pat to be

a rg to be a rg2' and Q(w) to be w=NOMATCH.

attempt to insert the recursive

call

Thus we

262

z 2 * m a t c h ( $ n & t C z I pat2) into our p r o g r a m must

at this p o i n t

arg2 )

and take w to be z 2.

However,

we

first e s t a b l i s h Q(z2) , m a t c h ( i n s t ( z I pat2)

We c a n n o t p r o v e pat

this.

a r g 2) ~ N O M A T C H .

We have

= (A A)~ ~

eounter-examples,

e.g.,

if

= (A B) and z I = A,

then match(inst(A Therefore

we split

A)

B) = NOkIATCH.

on this

condition.

'In the case m a t e h ( i n s t ( z I pat2) Goal

i0 is s a t i s f i e d .

Z=Zl°Z 2 satisfies Our p r o g r a m

arg2 ) ~ N O M A T C H

Thus w=z 2 also

Goal

satisfies

Goal

9, and

7.

so far is

mateh(p_~

ar_~) =

if e o n s t e x ~ ( p a t ) t h e n if ~

= arg

then

z ÷ any

else

z ÷ NOMATCH

else if v a r ( p a t ) then

z ÷ pair(pat

ar$)

else if a t o m ( a < [ ) then

z + NOMATCH

else

z I ÷ m a t c h ( p a t I argl )

if z I = N O M A T C H t h e n z + N0~IATCH else

z 2 ÷ m a t c h ( i n s t ( z I p_a~2 ) arg2)

if z 2 = N O M A T C H

E.

The S y n t h e s i s :

We have

gone this

fications~ general.

...

else

z ÷ Zl°Z 2

The S < r e n ~ t h e n i n g

far t h r o u g h

i.e., w i t h o u t In fact~

then

the

requiring

the m a t c h

of the

synthesis

Specifications

u s i n g the w e a k

that the m a t c h

speci-

f o u n d be m o s t

f o u n d may or may not be m o s t g e n e r a l

263

d e p e n d i n g on the v a l u e taken for the u n s p e c i f i e d s u b s t i t u i o n "any" p r o d u c e d in the very first case.

The synthesis is nearly complete.

However, we will be unable to complete it without the s p e c i f i c a t i o n s

strengthening

and m o d i f y i n g the p r o g r a m accordingly.

have only one case left to consider.

We now

This is the case in w h i c h

z2 = NO~TCH i.e., m a t c h ( i n s t ( z i pat 2) = NOMATCH. This means that no s u b s t i t u t i o n w satisfies inst(w inst(z I pat2))

= arg2,

or, e q u i v a l e n t l y i n s t ( z l ° w pat 2) # arg 2 for every s u b s t i t u t i o n w. This means that no s u b s t i t u t i o n of form Zl°W could p o s s i b l y satisfy i n s t ( z l ° w pat) We here have a choice:

= a rg. we can try to find a s u b s t i t u t i o n s not of

form Zl°W that satisfies inst(s pat I) = arg 1 and repeat the process; stitution

or we could try to show that only a sub-

s of form Zl°W could p o s s i b l y satisfy inst(s pat I) = argl,

and t h e r e f o r e we can take z=NOMATCH. We have already e x t e n d e d the class of s u b s t i t u t i o n s that satisfy the condition once; t h e r e f o r e we pursue the latter course. try to show that the set of substitutions

We

S=Zl°W is the entire

set of solutions to inst(s pat l) = arg I. In o t h e r words, we show that for any s u b s t i t u t i o n s, if inst(s pat I) = arg I then s = Zl°W for some w. This condition is equivalent to saying that z I is a most general match.

We cannot prove this about z I itself; however,

since z 1

is m a t c h ( p a t I arg I) it suffices to add the condition to the specifications

for match, as d e s c r i b e d in Section II-D.

s p e c i f i c a t i o n s now read Find z such that {inst(z pat) = arg and for all s [if inst(s pat) = arg then s = z°w for some w]} else z = NOMATCH.

The s t r e n g t h e n e d

264

O n c e we h a v e go t h r o u g h

the

cifications In this

strengthened entire

are

case

the

program

satisfied,

no m a j o r

specifications and

see that

modifying

modifications

it is n e c e s s a r y the

new,

to

stronger

spe~

the p r o g r a m

if n e c e s s a r y .

are n e c e s s a r y ;

however,

the

assignment z + any that

occurs

tant

is

in the

further

case

in w h i c h

specified

pat

and

ar$

are

equal

and

to r e a d

z ÷ A. Our final

program

is t h e r e f o r e

match(pat

a_E~) :

if 9 o n s t e x z ( p a t ) then

if p a t = arg

then

z + A

else

z ÷ NOMATCH

else

if v a r ( p a t )

then

z + pair(pat

else

if a t o m ( a r g )

ar$ ,)

then

z ÷ NOMATCH

else

z I ÷ match(~atl

argl )

if z I = N O M A T C H then

z ÷ NOHATCH

else

z 2 ÷ m a t c h ( i n s t ( z I p a t 2) arg 2)

if z 2 = N O M A T C H then

z ÷ NOMATCH

else

z ÷ zl°z 2.

cons-

265

F.

Alternative

Programs

The above p a t t e r n - m a t c h e r

is only one of may pattern-matchers

that can be derived to satisfy the same specifications. suing the synthesis the alternative altogether,

the system has made many choices;

some of

paths result in a failure to solve the problem

whereas

better programs.

other paths result in different,

In this

might make a different sequently,

In pur-

possibly

section we will indicate how a system

decision

in the above synthesis

and, con-

drive a different program.

In the above synthesis we derived a goal Find w such that inst(zl°w

(Goal 9):

Pat2 ) = arg 2.

We chose at that point to transform the goal to the equivalent Find w such that inst(w

inst(z I Pat2))

and then apply Rule 7, which added an else-clause yielding Goal I0. goal, introducing

= arg2, to the goal,

We then matched Goal i0 against our top-level the second recursive

It would be equally plausible

call into our program.

that the system should apply Rule

7 directly to Goal 9, giving a new goal Goal I@': Find w such that inst(zl°w

pat 2) = ar__~2

else Q(w), and prove ~ Q(w). The system may now attempt to use a reeursive Goal I0'

However,

top-level

this goal is not a precise

call to satisfy instance of our

goal~ we are demanding not only that a match be found

between pat 2 and arg2, but also that the match be of form Zl°W , where z I is the output of the previous recursive therefore and alist. alist.

generalize

call.

our program to take three inputs:

We must pat,

ar___gg,

We insist that the match found be an extension of

In the case that alist is A, the new p r o g r a m will be iden-

tical to the old one. and arg:arg2,

On the other hand,

a recursive

if alist=zl~

pat=pat2

call to the new p r o g r a m suffices to

satisfy Goal i0' In mathematical

notation,

the stronger

I Find z such that

specifications

[inst(z pa t ) = arg and z = alist°w for some w]

[ else z : NOMATCH.

now read

266

We will call the g e n e r a l i z e d

program

m a t c h 2 ( R a ~ arg alist). The original p r o g r a m match will be written general

auxiliary program: match(pat ar_~) = m a t c h 2 ( p a t

The portion of the p r o g r a m already of match must be s y s t e m a t i c a l l y ed requirements First,

as a call to the more

ar e A).

constructed

m o d i f i e d to satisfy the strengthen-

of match2.

in the case that constexp(pat)

took z÷any.

in the synthesis

However~

where p at=arg~ we originally

we must now take

z ÷ alist°any, becuase the substitution

found must be an extension of alist.

In the case that vat(pat) however, matched

here we must against

we originally

took z+pair( a ~

check to see w h e t h e r ~

ar_~g);

has already been

something other than ar__~. The new p r o g r a m seg-

ment is if inst(alist

~at)

= pat

then z ÷ alis~°pair( ap~_ arg) else if i n s t ( a l i s t

pat)

= arg

then z ÷ alist else z ÷ NOMATCH. Of course.we derivation

are omitting the details

of synthesis;

the actual

is somewhat more lengthy.

The call z I ÷ m a t c h ( p a t I argl ) must be replaced by z I * mateh2(pat I arg I alist). Having m o d i f i e d the p o r t i o n of the p r o g r a m already constructed~ the system continues

the synthesis.

A recursive

satisfies

z 2 ÷ match2( a~_~2 arg 2 z I) Goal 9; the balance of the synthesis

velopment

of the first program,

ing of the specifications

including

call

parallels

the further

the de-

strengthen-

to ensure that the match found is the

most general possible. The value of the second recursive strengthened

top-level

The p r o g r a m derived

call is shown to satisfy the

goal.

from this alternative

synthesis

is

267

m a t c h ( p a t arg) = m a t c h 2 ( p a t arg A) where m a t e h 2 ( p a t arg alist)

=

if constexp(pat) then if ~ a t = arg then z ÷ a!ist else z ÷ N O ~ T C H else if var(pat) then if inst(a!ist pat)

= pat

then z ÷ a l i s t ° p a i r ( p a t arg) else if inst(alist pat)

= arg

then z + alist else z ÷ N O M A T C H else if atom(ar$) then z ÷ N O M A T C H else z I + m a t c h 2 ( p a t I ~ g l

alist)

if z I = N O M A T C H then z ÷ N O M A T C H else z + m a t c h 2 ( p a t 2 IV.

P R O G R A M MODIFICATION:

arg 2 Zl).

THE U N I F I C A T I O N A L G O R I T H M

In general, we cannot expect a system to synthesize an entire complex p r o g r a m from scratch, as in the p a t t e r n - m a t c h e r example.

We

w o u l d like the system to r e m e m b e r a large body of programs that have been s y n t h e s i z e d before and the m e t h o d by w h i c h they were constructed.

When p r e s e n t e d with a new problem, the system should

cheek to see if it has solved a similar p r o b l e m before.

If so, it

may be able to adapt the t e c h n i q u e to the old p r o g r a m to make it solve the new problem. There are several difficulties i n v o l v e d in this approach.

First,

we cannot expect the system to r e m e m b e r every detail of every synthesis in its history. and w h a t to forget.

Therefore,

it must decide what to r e m e m b e r

Second, the s y s t e m must decide w h i c h p r o b l e m s

are similar to the one being considered, larity is somewhat ill-defined.

and the concept of simi-

Third, having found a similar

program, the system must somehow modify the old synthesis to solve the new problem.

We will concentrate only on the latter of these

268

problems

in this discussion.

program modification RobinsonTs A.

We will illustrate

a technique

as applied to the synthesis

unification

for

of a version of

algorithm.

The Specifications

Unification matching

may be considered

in which variables

to be a generalization

of pattern

appear in both Pat and arg.

lem is to find a single substitution

(called a "unifier")

when applied to both pat and arg , will yield identical for instance,

The probthat,

expressions.

if

pat = (X A) and arg : (B Y)~ then a possible

unifier of pat and arg is

(). The close analogy between pattern-matehing clear.

and unification

If we assume that the system remembers

marcher we constructed goal structure unification

in Sections

The specifications cal notation,

the pattern-

III-2 through

involved in the synthesis,

problem is greatly

is

III-5 and the

the solution to the

facilitated.

for The unification

algorithm,

in mathemati-

are

unify(p_a~ arg) I Find z such that inst(z pat) = inst(z arg) else z : NOPLATCH B°

The Analogy with the Pattern-Matcher

For purposes

of comparison

match(pat

we rewrite the match specifications:

arg) =

Find z such that inst(z pat)

= arg

else z = NOMATCH. In formulating

the analgy, we identify unify with match, pat with

pat, the ar___ggin unify with arg~

(~at arg) with arg, and inst(z arg) also

In accordance

alter the goal structure example,

with this analogy,

we must systematically

of the pattern-matcher

Goal 5 becomes modified to read

synthesis.

For

269

Find z such that inst(z pat I) = inst(z argl ) and inst(z pat 2) : inst(z aFg2 ). In constructing the p a t t e r n - m a t c h e r , we had to break down the synthesis into various oases.

We will try to m a i n t a i n this case

structure in f o r m u l a t i n g our new program.

Much of the savings

derived from m o d i f y i n g the p a t t e r n - m a t c h e r instead of constructing the u n i f i c a t i o n a l g o r i t h m from scratch arises because we do not have to deduce the ease splitting all over again. A difficult step in the p a t t e r n - m a t e h e r synthesis i n v o l v e d the s t r e n g t h e n i n g of the specifications

for the entire program.

We

added the condition that the match found was to be "most general." In f o r m u l a t i n g the u n i f i c a t i o n synthesis, we will i m m e d i a t e l y s t r e n g t h e n the s p e c i f i c a t i o n s in the analogous way. thened s)ecifications

The streng-

read

u n i f y ( p a t argi =

.........

Find z such that {inst(z pat)

= inst(z erg) and

for all s [if inst(s pat)

: inst(s arg)

then s = z°w for some w]} else z = NOMATCH. Following Robinson

[1965], we will refer to a unifier s a t i s f y i n g

the new c o n d i t i o n as a "most general unifier." Note that this a l t e r a t i o n process is p u r e l y syntactic;

there is

no reason to assume that the altered goal structure corresponds to a valid line of reasoning.

For instance,

simply because a-

chieving Goal 2 in the p a t t e r n - m a t c h i n g p r o g r a m is useful in a c h i e v i n g Goal i does not n e c e s s a r i l y imply that a c h i e v i n g Goal 2' in the u n i f i c a t i o n a l g o r i t h m will have any bearing on Goal I' The extent to w h i c h the r e a s o n i n g carries over depends on the soundness of the analogy.

If a portion of the goal structure

proves to be valid, the c o r r e s p o n d i n g segment of the p r o g r a m ean still remain; otherwise, we must construct a new p r o g r a m segment.

C.

The M o d i f i c a t i o n

Let us examime the first two cases of the u n i f i c a t i o n synthesis in full detail,

so that we can see exactly how the m o d i f i c a t i o n

270

process works.

In the pattern-macher,

we generated

the subgoal

(Goal 2) Find z such that inst(z pat) The corresponding

unification

subgoal is

Find z such that inst(z pat) In the p a t t e r n - m a t c h e r where pat=arg.

= arg. = inst(z

arg).

we first considered the case constexp(pat)

In this case the corresponding

p r o g r a m segment

is

z ÷ A. This

segment also satisfies the m o d i f i e d ins t(A a~9~)

The system must also

: inst(A

goal in this case, because

arg).

check that i is a most general

for any s [if inst(s pat)

: inst(s

unifier,

i.e.,

ar__~g)

then s = A°w for some w]. This

condition

is easily

satisfied,

case, the p r o g r a m segment

is correct without

The next case does require marcher,

taking w=s.

Thus,

any modification.

some modification.

In the pattern-

when c o n s t e x p ( p ! ~) is true and p!tCarg,

be NOMATCHo

However,

in this

z is taken to

in this case in the u n i f i c a t i o n

algorithm

we must check that inst(s pat)

~ inst(s arg),

i.eo~ nat

~ inst(s ar$)

for any s, in order to take z=NOMATCH. may contain variables, must therefore way.

In this case

the u n i f i c a t i o n

this condition

try to achieve

Since for u n i f i c a t i o n cannot be satisfied.

the specifications

(where constexp(pat)),

a l g o r i t h m reduce

arg

We

in some other

the specifications

of

to

Find z such that {pat = inst(z

arg) and

for any s [if pat = inst(s

arg)

then s = z°w for some w]} else z = NOMATCH. These

specifications

pattern-marcher

are p r e c i s e l y

with ~

the specifications

and arg reversed;

of the

consequently,

we can

invoke m a t c h ( a r g ap_a~) at this point in the program. The balance manner.

of the m o d i f i c a t i o n

The derived

can be carried out in the same

unification

algorithm is

271

unify(pat

arg)

=

if ~constexp(pat) t h e n if pat

= arg

then

z ÷ A

else

z ÷ match(arg

pat)

else if v a t ( p a t ) t h e n if o c c u r s i n ( p a t

arg)

t h e n z + NO[%ATCH else

z + pair(pat

arg)

else if a t o m ( a r g ) then

z ÷ unify(arg

pat)

else

zI ÷ unify(patl

ar~l)

if z I = N O M A T C H then z ÷ N O M A T C H else z 2 ÷ u n i f y ( i n s t ( z I a ~ 2 )

i n s t ( z I arg2))

if z 2 = N O M A T C H then z + NOMATCH else z + Zl°Z 2. Recall that occursin(pat

arg) m e a n s that pat o c c u r s

in a r & as a

subexpression. The t e r m i n a t i o n

of this p r o g r a m

is c o n s i d e r a b l y

to p r o v e t h a n was the t e r m i n a t i o n ever,

the c o n s t r u c t i o n

of the u n i f i c a t i o n

algorithm

pattern-matcher

is m u c h e a s i e r t h a n the i n i t i a l

pattern-mateher

itself.

N o t e that the p r o g r a m we have branch.

more

difficult

of the p a t t e r n - m a t c h e r .

constructed

f r o m the

synthesis

contains

How-

of the

a redundant

The e x p r e s s i o n if pat = arg then z + A else z ÷ m a t e h ( a F ~

pat)

c o u l d be r e d u c e d to z ÷ mateh(arg Such i m p r o v e m e n t s phase.

pat).

w o u l d not be m a d e until

a later optimization

272

V.

DISCUSSION

A.

l~l~entat~on

Implementation way.

of the techniques

presehted

in this paper is under-

Some of them have already been implemented.

quire further development

before

Others will re-

an implementation

will be possible.

We imagine the rules,

used to represent

reasoning tactics,

expressed

as programs

in a PLANNER-type

language.

mentation

is in QLISP

(Reboh and Sacerdoti

to be

Our own imple-

[1973]).

Rules are

summoned by pattern-directed

function invocation.

Worlds have been implemented

using the context mechanism of QLISP,

which was introduced structure

necessary

actual splitting

in QA4 (Rulifson e t a l . for the hypothetical

[1973]) of INTERLISP

thetical world-splitting experiment

II, or the loop-free The generalization a difficult

strategies

Similarly,

[1974]).

(Bobrow The hypo-

but we have yet to

for controlling

it.

simple programs

the program to sort two variables

segments

of the pattern-matcher

of specifications

technique

develop heuristics tion.

(Teitelman

system is capable of producing

as the union function,

environments

has been implemented,

with the various

The existing

The control-

of the control path as well as the assertional

data base~ is expressed using the multiple and Wegbreit

[1972]).

worlds, which involve an

(Seotions

to apply without

to regulate our approach

such

from Part

from Part III.

II-4 and III-5) is

its going astray.

We will

it in the course of the implementato conjunctive

goals

(Section !I-5)

needs further explication. \

B.

H~storical

Early'work Waldinger bilities

Context

and Contemporary

in program synthesis

Research

(e.g. Simon [19631, Green [1969],

and Lee [1969]), was limited by the problem-solving of the respective

formalisms

involved

lem Solver in the case of Simon, resolution case of the others).

(the General Prob-

theorem proving in the

Our paper on loop formation

get [1971]) was set in a theorem-proging attention to the implementation

problems.

capa-

framework,

(Hanna and Waldinand paid little

273

It is typical of contemporary program synthesis work not to attempt to restrict itself to a formalism;

systems are more likely

to write programs the way a human programmer would write them. For example, the recent work of Sussman [1973] is modelled after the debugging process.

Rather than trying to produce a oorrect

program at once, Sussmants system rashly goes ahead and writes incorrect programs which it then proceeds to debug.

The work re-

ported in Green et al. [1974] attempts to model a very experienced programmer.

For example,

if asked to produce a sort program, the

system recalls a variety of sorting methods and asks the user which he would like best. The work reported here emphasizes reasonging more heavily than the papers of Sussman and Green.

For instance,

in our synthesis

of the pattern-marcher we assumed no knowledge about patternmatching itself.

Thus our system would be unlikely to ask the

user what kind of pattern-matcher he would like. do assume extensive knowledge of lists,

Of course we

substitutions,

and other

aspects of the subject domain. Although Sussman's debugging approach has influenced our treatment of program modification and the handling of simultaneous goals, we tend to rely more on logical methods than Sussman. Furthermore,

Sussman deals only with programs that manipulate

blocks on a table; therefore he has not been forced to deal with problems that are more crucial in conventional programming, as the formation of conditionals

such

and loops.

The work of Buchanan and Luckham [1974]

(see also Luckham and

Buchanan [1974]) is closest to ours in the problems it addresses. However, there are differences in detail between our approach and theirs: The Buchanan-Luckham

specification language is first-order pre-

dicate calculus; ours allows a variety of other notations. method of forming conditionals

uses contexts and the Bobrow-Wegbreit Buchanan-Luckham

Their

involves an auxiliary stack; ours control structures.

In the

system the loops in the program are iterative,

and are specified in advance by the user as "iterative rules," whereas in our system the

(recursive)

loops are introduced by

the system itself when it recognizes a relationship between the

274

top-level goal and a subgoal.

The treatment of programs with

side effects is also quite different in the Buchanan-Luckham system, in which a model of the world is maintained and updated, and assertions

are removed when they are found to contradict

other assertions

in the model.

Our use of contexts allows the

system to recall past states of the world and avoids the tricky problem of determining when a model is inconsistent.

I should

be added that the implementation of the Buchanan-Luekham system is considerably more advanced than ours. C.

Conclusions and Future Work

We hope we have managed to convey in this paper the promise of program synthesis, without giving the false impression that automatic synthesis is likely to be immediately practical.

A compu-

ter system that can replace the human programmer will very likely be able to pass the rest of the Turing test as well. Some of the approaches to program synthesis that we feel will be most fruitful in the future have been given little emphasis this paper because they are not yet fully developed.

in

For example,

the technique of program modifieation, which occupied only one small part of the current paper, we feel to be central to future program synthesis work.

The retention of previously

constructed

programs is a powerful way to acquire and store knowledge.

Further-

more program optimization and program debugging are just special cases of program modification. Another technique that we believe will be valuable is the use of more visual or graphic representations, properties

that convey more of the

of the object being discussed in a single structure.

For example, we have found that the synthesis of the pattern matchef could be made shorter and more intuitive by the introduction of the substitution notation of mathematical resent an expression P as P(Xl,...,Xn),

logic.

If we rep-

where Xl,...,x n is the

complete list of the variables that oeeur in P, then P(tl,...,t n) is the result of substituting variables x i by terms t i in P. can then formulate the problem of pattern matching as follows:

We

275

Let a ~

= pat Lxl,..,,xn)

Find z such that if ar$ = pat(tl,...,t n) for some tl,...,t n then z = { ,...,} else x = NOMATCH. Note that this specification

includes

implicityly

the restriction

that the match found be a most general match, because each of the variables

x i actually occurs in pat.

tions do not need to be strengthened

Therefore,

the specifica-

during the course of the

synthesis. We hope to experiment of applications.

with visual representations

Clearly,

in a variaty

while the reasoning required is simpli-

fied by the use of pictorial notation,

the handling of innovations

such as the ellipsis notation in an implementation

is correspond-

ingly more complex. ACKNOWLEDGEMENTS We wish to thank Robert Boyer, land for giving detailed

Bertram Raphael,

critical readings

would also like to thank Nachum Dershowitz, Fikes, Akira Fusaoka, Carl Hewitt,

Shmuel Katz, David Luckham,

We

Peter Deutsch~ Richard

Cordell Green and his students,

breit for conversations this paper.

and Georgia Suther-

of the manuscript.

Irene @reif,

Earl Saeerdoti,

that aided in formulating

and Ben Weg-

the ideas in

We would also like to thank Claire Collins

and Hanna

Z£es for typing many versions of this manuscript. This research was primarily dation under grants GJ-36146

sponsored by the National and GK-35493.

Science

Foun-

276

BIBLZOGRAPHY i. Balzer, R. M. (September 1972), "Automatic Programming," Institute Technical Memo, University of Southern California/Information Sciences Institute. 2. Biermann, A. W., R. Baum, R. Kirisknasw~my and F. E. Petry (October 1973)~ '~Automatic Program Synthesis Reports," Computer and Information Sciences Technical Report TR-73-6, Ohio State University. 3. Bobrow~ D. G. and B. Wegbreit

(August 1973), "A Model for Control

Structures for Artificial Intelligence Pr0grammin ~ Languages," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence, 253, Stanford University, 4. Boyer, R. S. and J

246-

Stanford, California.

S. Moore (1973), "Proving Theorems about

LISP Functions," Adv. Papers 3d. Intl. Conf. on Artificial Intelligence. 5. Buchanan~ J. R. and D. C. Luckham (March 1974), "On Automating the Construction of Programs," Memo, Stanford Artificial Intelligence Project, Stanford~ California. 6. Bundy, A.

(August 1973), "Doing Arithmetic with Diagrams," Adv.

Papers 3d. Intl. Conf. on Artificial Intelligence, 130-138, Stanford University,

Stanford, California.

7. Floyd, R. W., (1967), "Assigning Meanings to Programs," Proc. of a Symposium in A~plied Mathematics, Vol. 19, (J. T. Schwartz, ed.), Am. Math. Sock, 19-32. 8. Green, C. C. (May 1969), "Application of Theorem Proving to Problem Solving~" Proc. Intl. Joint Conf. on Artificial Intelligenoe~ 219-239. 9. Green~ C. C., R. Waldinger, R. Elschlager, D. Lenat, B. McCune, and D. Shaw~

(1974), "Progress Report on Program-Understanding

Programs~" Memo, Stanford Artificial Intelligence Project, Stanford, California. i0. Hardy, S. (December 1973), "Automatic Induction of LISP Functions," Essex University.

277

ii. Hewitt, C. (1972)~ "Description and Theoretical Analysis Schemata) of PLANNER:

(Using

A Language for Proving Theorems and Mani-

pulating Models in a Robot," AI Memo No. 251, MIT, Project MAC, April 1972. 12. Hoare, C. A. R., (October 1969), "An Axiomatic Basis for Computer Programming," C. ACM 12, i0, 576-580, 583. 13. Kowalski, R. (March 1974), "Logic for Problem Solving," Memo No. 75, Department of Computational Logic, University of Edinburgh, Edinburgh. 14. Luckham, D. and J. R. Buchanan (March 1974), "Automatic Generation of Programs Containing Conditional Statements," Memo, Stanford Artificial Intelligence Project, Stanford, California. 15. Manna, Z. and R. Waldinger (March 1971), "Toward Automatic Program Synthesis," Comm. ACM, Vol. 14, No. 3, pp. 151-165. 16. McCarthy, J. (1962), "Towards a Mathematical Science of Computation," Prec. IFiP Congress 62, North Holland, Amsterdam, 17. Reboh, R. and E. Saeerdoti

21-28.

(August 1973), "A Preliminary QLISP

Manual," Tech. Note 81, Artificial Intelligence Center, Stanford Research Institute, Menlo Park, California. 18. Robinson, J. A., (January 1965), "A Machine-0riented Logic Based on the Resolution Principle," Jour. ACM, Vol. 12, No. I, 23-41. 19. Rulifson, J. F., J. A. Derksen, and R. J. Waldinger 1972), "QA4:

(November

A Procedural Calculus for Intuitive Reasoning,"

Tech. Note 73, Artificial Intelligence Group, Stanford Research Institute, Menlo Park, California. 20. Simon, H. A., (October 1963), "Experiments with a Heuristic Comuter," Jour. ACM, Vol. I0, No. 4, 493-506. 21. Sussman, G. J. (August 1973), "A Computational Model of Skill Acquisition," Ph.D. Thesis, Artificial Intelligence Laboratory, M.I.T., Cambridge, Mass. 22. Teitelman, W., (1974), INTERLISP Reference Manual, Xerox, Pale Alto, California. 23. Waldinger, R. J., and R. C. T. Lee

(May 1969), "PROW:

A Step To-

ward Automatic Program Writing," Prec. Intl. Joint Conf. on Artificial Intelli~enee,

241-252.

A NEW APPROACH TO PROGRAM TESTING

James C. King, IBM T. J. Watson Research Center, Yorktown Heights, New York, USA

ABSTRACT: The current approach for testing a program is, in principle, quite primitive. Some small sample of the data that a program is expected to handle is presented to the program. If the program produces correct results for the sample, it is assumed to be correct. Much current work focuses on the question of how to choose this sample. We propose that a program can be more effectively tested by executing it "symbolically". Instead of supplying specific constants as input values to a program being tested, one supplies symbols. The normal computational definitions for the basic operations performed by a program can be expanded to accept symbolic inputs and produce symbolic formulae as output.

If the flow of control in the program is completely independent of its input parameters, then all output values can be symbolically computed as formulae over the symbolic inputs and examined for correctness.

When the control flow of the program is input dependent, a case analysis can be performed

producing output formulae for each class of inputs determined by the control flow dependencies. Using these ideas, we have designed and implemented an interactive debugging/testing system called EFFIGY.

INTRODUCTION

As tools for realizing correct programs, program testing and program proving are the ends of a spectrum whose range is the number of times the program must be executed. To establish its correctness through testing, one must execute the program at least once for all possible unique inputs; usually an infinite number of times. To establish its correctness through a rigorous correctness proof, one need not execute the program at all; but he may be faced With a tedious, if not difficult, formal analysis. These two extreme points of the spectrum offer other contrasts as well. Correctness proofs usually ignore certain realities encountered in actual test runs, for example, machine dependent details like overflow and precision. (One notable effort to bring machine dependent issues into correctness proofs is the recent thesis by Sites [7]). On the other hand one may finish a proof of correctness, but seldom do we ever finish testing a program, Normal testing and correctness proofs also differ in the degree to which the user is required to supply a formal specification of "correct" program behavior. While a careful statement of correctness may be recommended for program testing, it is not required. A user

279

may choose an interesting input case and then decide a posteriori, in this specific case, if the output appears to be correct. In a formal proof of correctness one must have a careful program specification.

A testing tool is described in this paper which allows one to choose intermediate points on the spectrum between individual test runs and general correctness proofs. One can perform a single "symbolic execution" of the program that is equivalent to a large (usually infinite) number of normal test runs. Test run results can not only be checked by careful manual inspection but if a machine interpretable program specification is supplied with the program it can be used to automatically check the results. Furthermore, by varying the degree to which symbolic information is introduced into the symbolic execution one can move from normal execution (no symbolic data) to a symbolic execution which, in some cases, provides a proof of correctness.

SYMBOLIC EXECUTION

The notion of symbolically executing a program follows quite naturally from normal program execution. First assume that there is a given programming language and a normal definition of program execution for that language. This execution definition must be used for production executions but an alternative symbolic execution semantics for the language can be defined to great advantage for debugging and testing. The individual programs themselves are not to be altered for testing. The definition of the symbolic execution must be such that trivial cases involving no symbols should be equivalent to normal executions and any information learned in a symbolic execution should apply to the corresponding normal executions as well.

An execution of a procedure becomes symbolic by introducing symbols as input values in place of real data objects (e.g., in place of integers and floating point numbers). Here "inputs" is to be taken generally meaning any data external to the procedure, including that obtained through parameters, global variables, explicit READ statements, etc.

Choosing symbols to represent procedure inputs

should not be confused with the similar notion of using symbolic program variable names. A program variable may have many different specific values associated with it during a particular execution whereas a symbolic input symbol is used in the static mathematical sense to represent some unknown yet fixed value. Values of program variables may be symbols representing the non-specific procedure inputs.

Once a procedure has been initiated and given symbolic inputs, execution can proceed as in a normal execution except when the symbolic inputs are encountered. This occurs in two basic ways: computation of an expression involving procedure inputs, and conditional branching dependent on procedure inputs.

Computation of Expressions The programming language has a set of basic computational operators such as addition (+), multiplication (*), etc. which are defined over data objects such as integers. Each operator must be extended to

280

deal with symbolic data. For arithmetic data this can be done by making use of the usual relationship between arithmetic and algebra. The arithmetic computations specified by these operators can be "delayed" or generalized by the appropriate algebraic formula manipulations. For example, suppose the symbolic inputs a and /3 are supplied as argument values to a procedure with formal parameter variables A and B. Denote the value of a program variable X by v(X). Then initially, v(A) = a and v(B) =/1. If the assignment statement C := A + 2*B were symbolically executed in this context C would be assigned the symbolic formula (a + 2*/3). The statement D := C - A , if executed next, would result in v(D) = 2"/I.

Similar symbolic generalization can be done, at least in theory, for all computational operations in the programming language. In the most difficult case, one could at least record in some compact notation the sequence of computations which would have taken place had the arguments been non-symbolic. The success in doing this in practice depends upon how easily these recordings can be read and understood and how easily they can be subsequently manipulated and analyzed mechanically.

Conditional Branching Consider the typical decision-making program statement, the IF statement, taking the form: IF B THEN $I ELSE Sz, where B is some Boolean valued expression in the language and $1 and $2 are other statements. Normally, either v(B) = true and statement S~ is executed or v(B) = false and statement Sz is executed. However, during a symbolic execution v(B) could be true, false or some symbolic formula over the input symbols. Consider the latter case. The predicates v(B) and , v ( B ) represent complementary constraints on the input symbols that determine alternative control flow paths through the procedure. For now, this case is called an "unresolved" execution of a conditional statement. The notion will be refined as the presentation develops. Since both alternatives paths are possible the only complete approach is to explore both: the execution forks into two "parallel" executions, one assuming v(B), the other assuming ~ v(B),

Assume the execution has forked at an unresolved conditional statement and consider the further execution for the case where v(B),

The execution may arrive at another unresolved conditional

statement execution with associated boolean, say C. Expressions v(B) and v(C) are both over the procedure input symbols and it is possible that either v(B) = v(C) or v(B) ~ ~v(C). Either implication being true would show that the assumption made at the first unresolved execution, namely v(B), is strong enough to resolve the subsequent test, namely to show that either v(C) or , v ( C ) .

Because the assumptions made in the case analysis of one unresolved conditional statement execution may be effective in resolving subsequent unresolved statement executions they are preserved as part of the execution state, along with the variable values and the statement counter, and are called the "path condition*' (denoted pc). At the beginning of a program execution the pc is set to true. The revised

281

rule for symbolically executing a condition statement with associated Boolean expression B is to first form v(B ) as before and then form the expressions: ~

v(B)

~ ~v(B). If pc is not identically false then at most one of the above expressions is true. If the first is true the assumptions already made about the procedure inputs are sufficient to eompletely resolve this test and the exe,cution follows only the v(B) case. Similarly if the second expression is true it follows the ~v(B) case.

Both of these cases are considered "resolved" or non-forking executions of the conditional

statement.

The remaining case when neither expression is true is truly an unresolved (forking) execution of the conditional statement. Even given the earlier constraints on the procedure inputs (PC), v(B) and ~v(B) are both satisfiable by some non-symbolic procedure inputs. As discussed above, unresolved conditional statement executions fork into two parallel executions. One when v(B) is assumed, in which case the pc is revised to pc ^ v(B), the other w h e n ~ v ( B ) is assumed and then oe becomes pc ^ ~v(B). Note that the forking is a property of a conditional statement execution not the s t a t e m e n t itself.

One

execution of a particular statement may be resolved yet a later execution of the same statement may not.

The pc is the accumulator of conditions on the original procedure inputs which determine a unique control path through the program.

Each path, as forks are made, has its own pc.

No pe is ever

identically f a l s e since the original pc is true and the only changes are of the form pc := pc ^ q and those only in the case when pc ^ q is satisfiable ((pc ^ q) = , (PC = ~q) which is satisfiable if pc ~ - q is not a theorem). Each path caused by forking also has a unique pc since none are identically false and

they all differ in some term, one containing a q the other a ~ q.

S Y M B O L I C E X E C U T I O N TREE

One can characterize the symbolic execution of a procedure by an "execution tree".

Associate with

each program statement execution a node and with each transition between statements a directed arc connecting the associated statement nodes.

For each forking (unresolved) conditional statement the

associated execution node has more than one arc leaving it labeled by and corresponding to the path choices made in the statement.

In the previous discussion of I F statements there were two choices

corresponding to v(B) and , v ( B ) . The node associated with the first statement of the procedure would have no incoming arcs and the terminal statement of the procedure ( R E T U R N or END statement) is represented by a node with no outgoing arcs.

Also associate the complete current execution state, i.e., variable values, statement counter, and pc with each node.

In particular, each terminal node will have a set of program variable values given as

formulae over the procedure input symbols, and a pc which is a set of constraints over the input

282

symbols characterizing the conditions under which those variable values would be computed, A user can examine these symbolic results for correctness as he would normal test output or substitute them into a formal output specification which should then simplify to true.

The execution tree for a program will be infinite whenever the program contains a loop for which the number of interations is dependent, even indirectly, on some procedure inputs. It is this fact that prevents symbolic execution from directly providing a proof of correctness technique.

Symbolic

execution is indeed an execution and at least in this simplest form described here provides an advanced testing methodology. BurstaU [1] has independently developed the notion of symbolic execution and

added the required induction step needed to have a complete proof of correctness method. Deutsch [2], also independently, developed the notion of symbolic execution as an implementation technique for an interactive program prover based on Floyd's method [3]. In fact, one can see the basic elements of the notion of using symbolic execution as the basis for a correctness method in the earlier work of Good [4]. The author and his colleagues have been pursuing the idea of symbolic execution in its own right as a debugging/testing technique. A particular system we have built called EFFIGY is described briefly in the next section.

EFFIGY - - AN INTERACTIVE SYMBOLIC EXECUTOR

The author and his colleagues at IBM Research have been developing an interactive symbolic execution system for testing and debugging programs written in a simple P L / I style programming language. The language is restricted to integer valued variables and vectors (one dimensional arrays). It has many interactive debugging features including:

execution tracing, break-points, and state saving/restoring.

Of course, it provides symbolic execution and uses a formula manipulation package and theorem prover developed previously by the author [5, 6].

The generat facilities and capabilities available are all that is of real interest and these are perhaps simplest and most economically explained by a system demonstration. An APPENDIX is included which shows an actual script (annotated in italics) from such a demonstration. A method for exploring execution trees with their multitude of forks and parallel executions is up to the user. He is provided the ability to choose particular forks at unresolved conditional statement executions (via go true, go false,

and a s s u m e )

and also has the state save/restore ability so that he may return to

unexplored alternatives later. We are currently experimenting with various "test path-managers" which would embody some heuristics for automating this process, exhaustively exploring all the "interesting" paths, As with previous testing methods the crucial .issue is: if one cannot execute all cases, which ones should he do; which are the interesting ones.

We are also working on practical methods for dealing with more odvanced programming language

283

features such as pointer variables. While, as mentioned above, most such enhancements are straightforward "in theory" many offer fundamental problems in practice.

CONCLUSION

Interactive debugging/testing systems have shown themselves to be powerful, useful tools for program development. A symbolic execution capability added to such a system is a major improvement. The normal facilities are always available as a special case, In addition, the basic system components of a symbolic executor provide a convenient toolbox for other forms of program analysis, including program proving, test case generation, and program optimization. Since such a system does offer a natural growth from today's systems, an evolutionary approach for achieving the systems of tommorrow is available. Valuable user experience and support is also provided. While practical use of the EFFIGY system is still quite limited, considerable insight into and understanding of the general notion of symbolic execution has been gained during its construction.

ACKNOWLEDGMENTS

The colleagues at IBM Research collaborating with me in this work are: S. M. Chase, A. C. Chibib, J. A. Darringer, and S. L. Hantler. They have all contributed significantly to the ideas presented here and to the design and implementation of our EFFIGY system.

We also appreciate the support and

encouragement received from D. P. Rozenberg, P. C. Goldberg, and P. S. Dauber. The manuscript was typed by J. M. Hanisch.

REFERENCES

[1]

Burstall, R. M. Program proving as hand simulation with a little induction, IFIP Congress 74 Proc., Aug. 1974, pp. 308-312.

[2]

Deutsch, L.P. An interactive program verifier, Ph.D. dissertation, Dept. Comp. Sci., Univ. of Calif., Berkeley CA., May 1973.

[3]

Floyd, R.W. Assigning meanings to programs, Proc. Symp. Appl. Math., Amer. Math. Soc., vol. 19, pp. 19-32, 1967.

[4]

Good, D.I. Toward a man-machine system for proving program correctness, Ph.D. dissertation, Comp. Sci. Dept., Univ. of Wisc., Madison, Wisc., June 1970.

[5]

King, J.C. and Floyd, R.W. An interpretation oriented theorem prover over integers, Journal of Comp. and Sys. Sci., vol. 6, no. 4, August 1972, pp. 305-323.

[6]

King, J.C. A program verifier, IFIP Congress 71 Proc., Aug. 1971, pp. 235-249.

[7]

Sites, R.L. Proving that computer programs terminate cleanly, Ph.D. dissertation, Comp. Sei. Dept., Stanford Univ., Stanford, CA., May 1974.

284

APPENDIX

A script from an actual EFFIGY session is shown below. The user's inputs are in lowercase letters and the system responses are in uppercase letters. To prevent any possible confusion the symbol " ~ " is shown here to ~he left of the user inputs. Explanatory comments, in italic letters, have been added as a right hand column.

When EFFIGY is initially invoked it is in an "immediate" mode and will execute statements as they are typed.

Any statement executed in this context is considered part of a main initial procedure called

MAIN. The concept of the MAIN procedure and the concept of immediate execution are distinct since statements can also be executed in an immediate mode in the context of other procedures. MAIN is unique in that it has an immediate mode only and it is the onty procedure privileged to execute the managerial system commands. Programs are made available to EFFIGY for stored program execution by declaring them, in MAIN, with the PROC statement similar to the way that internal procedures are declared in PL/I. However, EFFIGY does consider all procedures as EXTERNAL and they must be declared in MAIN.

Procedures are tested by a CALL from MAIN. SymboIic inputs can be supplied by enclosing a symbol string in double quotes, e.g., "a", "Dog". These symbolic constants can be used in most places instead of integer constants.

The system responses drop the quotes since the context always makes the

distinctions between different uses of identifiers quite clear.

Values always involve the input symbols

and never program variable names. Formulae are stored internal to EFFIGY in a "normalized" form and some of the expressions may appear quite different from what one might expect (e.g., A < @ will be typed out as A - n > - q ). The formulae are also kept in a simplified form (e.g,, 2 * B = 4 is stored as

B-2=0). EFFIGY runs on CMS under V M / 3 7 0 on an IBM/370 model 168. The CMS filing system and context editor are used as an integral part of EFFIGY for creating, changing, and storing procedures and command files. The INPUT command directs EFFIGY to read its input from the designated file (files have two part names in CMS) instead of directly from the user's terminal. As procedures are entered into E FFIGY (by a PROC ,.. END declaration) the statements are sequentially numbered. These statement numbers are used to reference particular points in the procedure for inserting breakpoints, turning tracing on and off, etc.

~effigy EFFIGY READY ~ e d i t a b s o l u t e effigy; NEW FZLE: ~input ~absolute: proc(i,o); dcl (i,o) integer; if i-1)) TRUE BRANCH O=-A STOPPED BETWEEN 3 AND 5 ~display variables, assumption; IN A B S O L U T E I:A O:-A ((A~>-I))

~ r e s t o r e I; STATE I RESTORED. ~ g o false; ((A~>-I))

'

terminal input. Declare a variable in MAIN. Try a numeric execution. Result of display statement.

All tracing on in proc. "absolute'. Set back m MAIN. Try a symbolic input "a". Each statement execution is traced by printing it. Evaluated result of I 0 T H E N DO; ((i~ 0 T H E N DO; ((A~

3 12 -

12 . . . .

4 *---

13

-

14

I

t

5

I g (a) ;

f(b,c)

is sind

yon c der

G,

kosmt

man

in f(c,c) nach

.2

enthaltene

lnalcg ergibf

sich:

b - c}

schliesslich

Ton A angedeuted ,=

enthaltenen

ersetzt is%. Zur der

neuerliohen

A zur~ck, da dutch die Neudefinition in

b

gSnzlich

and

sin~)

ges~ss

smmtlicbe

(wie

dass sp~tere Workommnisse in terpretieren

darin,

h

gehen

c := g (a)

b - c

wird,

und

in

dutch c := b ersetzen

lquival~zinformaticn,

vo.

bet.its

welter

Anueisung

.,. B6 = .=* 6,

{,2-~ ~ #2-A].

~terpretation

a -> f(b,c)

in die

wobei darch die Unterstreichang

Zustands~eschreib~ng

mieder

wieder

•a.A7 = A_,i.¥ dass

is%

yon A in B e z u g auf

speichern.

wiederus:

,a.8 =

,o.~

(mine

indem sie

und dater b -> g(a)

Information

d~r Bedeatu~g,

Vorkcmmnisse

der

Zustandsheschreibung,

Formal:

sich,

San kann jedoch einen

~olgenden

dies

...? = ,,.B8 Wenn B in Bezug auf $,

Ferner

wird,

ist

~ ,o.¥}

Information die

darauf wird diese Eintragung

,~ = { b -> g(a)}.

1

"o

GemMss

Ausdr @cke

[$o-~

wird).

der

verfngbar)

die

Opti aisier un gsver f a hr en =

Interpretation

Zastandsheschreibung aQfgencmmen.

verf~gbar

das

ange.andt

,..A T.

a

Neudefinition

,o.~.

wird auf einen Modal angewandt,

|it

is

indmm

Kapitel

Boole 'sche

$o.e

zunSchst die Aufnahme

ist

unmittelhar die

in

Meiter

Alternativen

gleichbedmu%end

yon

We rden

nicht sich

Zustandsbeschreibung

gege~ene~

Zustandsbeschreibang

wird, in Zeichen

R ekursionsg leich ,ngss ystes mit

der

nims% das Verfahrmn

Information enthaltende)

gleichbedeutmnd

einbmzogen,

bei

welch, Ausdr~cke in welchen Variablen

auf den ~au~tsodal ~ "angewand%" des

wird der ~egrlff miner

der,

Information

won a

wiederum

330

~.~

Das

= ~.~7

= A~..7

I~%erpre%a%icnsverfahren

endet, sobald ein Modu! in ~ezug

auf eine Zus%andsbeschreibu~g auf

die

er

Verfahrers der

A~zahl

yon

I,formafion, Da

ei~e

bereits

ergibt

interprefier%

Mcdule~

s%eht es frei, daf~r eine,

[A ~I I B 5,)

System

entspricht

obeD

vollst~ndig Variahle~

in

blieben.

zu eliminieren

sich notwendig, zu ersefzen

das

vereinfachf,

unver~ndert

~bb.

Um

Idee,

die

is

~u tragen, dass in einem

gebrachf

al!erdi~gs

die Namen

yon

Ausdr~cke ist es

wesent!ichen wurde),

an yon vom

um dem

Programm verschiedene

Variable bezeichnen

f~r die M~glichkeit

werden

!~ferpretatic~sprezesses.

f~r redundante

Namen fragen kBnnen und dass umgekehrt

Versfandnis

zU gehen,

mehr.

yon /I/ @bernommen

Nalen die gleiche

Um ein h~sseres Verfahre~s

zu

wiedergegebene

(wie in /14/ beschrieben)

(eise

gleichen

Namen

Variablen dutch nquivalenzklassen

Ussfand Rec~nung den

6

redundante

Algorifh~us

verschiedene

Modul wieder

ergibt sich das

als darin

value-numbering Variablen

der ~@glichen

neuen

Eli~inationsverfahren

inscfer~e

des

als Besultatprogramm:

keine Redundanzen

¥iedergegebene wurde

Variab!en f~r

Es eufh@If

Bezug

~nd!ichkeit

angewandt auf einen

~o = {P)

in

eingeht.

Wird dies f~r cbiges Beispiel gemach%,

Ausdr~cke

und

der

und aus der Endlichkeit

Zustandsbeschreibung

Flussdiagra~eo Das

ist

~ie ~ndlichkeit

aus

die in die Zustandsbeschreib~ng

fo!gende R e k u r s i c n s g l e i c h u n g S S T S % e m

Diesem

wurde.

sich dabei aufoma%isch

einen Mcdu! ergib%, w~h!en.

xu inferpretieren

im

ohne Das

Folgenden Ausfahrung ers%e

des angegebenen

noch des

Beispiel

kBnnen.

2

~eispiele

eigentlichen betrifft

die

331

Opti~isierung

des

fclgenden

{einem

Programmteiles

unver~ffentlichten ~rtikel yon J.C. Heatt~ entnommen) :

DO I = 1 % 0

N;

IF K(I)

> 0 TH~N M (1)=~(J) ;

END ;

Mit

herr~mmliche~

OFtimisierungsverfahren

sich der nach der ersten Ausf~hrung nicht eliminiereD, Seiteneffekten o~iges ~bh.

8

dam

ResultatFrogramm.

Der

dass

der

in

wurde.

urspr~Dg]ic~

einfache

f~r

duplizierter

Code

ist

Sch!eife

Beis~ie]

und

darin

sell der

kann.

Interyretationsverfahren

der

dieser

Einschl~ss

Zustandsbeschreibung e~alich viele Divergenzgefahr

(do

Bocle,sche gegeben)

der

Ausdruck

wie

das

2

Variable

2 wie

angegebene

yon

ist

ist

nachtr~gliche

er!aubt es, in gevisse~ F~!len ~berflHssige

is

Wahrheitswert in

Wahrheitswerte gibt

yon

Wahrheitswerten

Information

nut

und

i(J)

Entscheidung

entsprechende

es

die in

~inschluss

Propagierung

~ekannt.

Duplizierung

den jeder

dass

werden kann.

zeigen,

Nach

Graphen wiederus

ersichtlich,

durch

dutch

und

werden

symholiscbe

benannte

Aus Abb. 8

Optimisierungsverfahren erweiter%

sei allerdings

das

gerich%ete,

gewHnscht h~chstens einmal ausgefHhrt

Wahrheitswerten

an

erhaltene

nech eine Reduktion durchgef~hrt wurde

Schleifen aufgelRst w~rde

zweite

halber

Ansch!uss

unn~tig

zusammengelegt

Dos

L(J)

Abb. 7 zeigt

Cptimisierung

Vel!s%~ndigkeit

dabei

(wie aus der Syntaxthecrie in

kommen kBnnte.

dutch

Interpre%ationsverfahren beka~nt),

~usdruck

in geeigneter For~ als ~lussdiagramm kodiert.

zeigt

hinzugef~gt,

redundan%e

l~sst

da es sonst 7u dos Programs falsifizierenden

(~nsafe code morion)

Beispiel

(code motion)

ja die

und

wiederum

nut keine

verwendung doyen ~ntscheidungen

zu

erke~nen und damit beis~ielsweise Schleifen zu diagnostizieren, die selbst ge~bten Prograssierern sanchsal Ahb.

9 9ibt ein solches Prog~ammfragsent.

in Abb.

10 zeigt, dass

SchleifeDdurchlSufe

das

~och

Pregramm sinnvoll

des drit%en Sch!eifendurchlauf ~m

Fermalismus

der

f~r

verborgen die

ersten

heiden

ist, und dass es erst nach

unbeding% zu schleifen

symbolischen

bleiben.

Dos Resultatprogra~s

Interpretation

ist

beginnt. dieser

332

Zustand

durch

dam

Auf%re%en

eines

unbedingten,

rekursiven

~oduls leicht z~ erkennen.

~. ~ X ~ A L Z

PARRLLELISIERUNG

Parallelisierung aus

eigent~ich

l~sendes)

ist

~hnlich

dynawisches

Preblem.

Im

der Op%imisierung ein yon Natur (nut

letz%en

~rograms!aufzeit

zu

Abschnitt wurde gezeigt,

zur

dass

sich f~r die Optisisierung ein Tell dieser eines

z~r

~berse%zungs~ei%

Prograsma~lauf werden,

vorwegnehmen

dass

sich,

Paral]e!itMt

zur

Verfahres beruht auf

der

in

~od~larisierungstechnik beschriebe~. gezeigfeD f(1},

Es

wird

her ist

wird) klar~

voneinander erfolgen

Parallelismus

~apitel ist

illustriert

f(100)

2

an

l~sen

Hand

der

und

in

tas

zweiten

/13/

des

Operator

der

l~sst,

angegehenen

/10/

das

~rkennung

Dam Programm berechnet (wobei

gezeigt

~lussdiagramme, -

in

wird

n~her

~bb.

11

die ~usdr~cke f

unhestismt

und summiert diese Ausdr@cke auf. Von der Logik dams

die

uDabh@ngig

kant.

f~r

in

symbolischen

Folgenden

itpliziert

und

Flussdiagramms.

f(2},...,

ge]asse,

Is

vollst~ndig - was eine

La~fzeif

bereits

durchgef~hrten

l~sst.

zumindestens

Para!lelisierungs~rcble,

~ynasik

Zu

Rus~ertung ist

~eigen

~nd ist,

aller

sithin wie

dieser

parallel dieser

~usdr@cke zueinander

"dynasische"

(der im allgemeinen nicht ~on der Programsgr@sse

scnder~ vc~ der L@nge des

Programmablaufs

abh@ng%)

ermittelt

werden kant. Gem~ss ergehen

der sich

in Abschni%t 2 angegebenen die

fclgenden

~ekursio~sgleichunges symbo]ischer

Programs

~ B C a G ~

beschreihenden

Anweisungen

@else als A, B, C, ... angef~hrt,

angede~%e~): ~=

das

(die einzelnen

Modularisierungstechnik werden nur in

wie in

A~h.

11

333

Der

gresse

Vcrteil

Parallelisierung

dieser

ist,

KontrollahhMngigkei%en

-

Rnweisungsausf~hrung aufgezeigt

durch

das

yon

werden.

Programmdarstellung

dass

sind

sie

die

Moduls der

cder

eines

darin

liegt, dann h~n9% die Ausf~hrnng

~usf~hrung

Ahh~ngigkeit

des

wird

(siehe

/7/),

wobei

(d~rch

Anweisung

Modulaufrufes yon

aS.

den

Des leichteren im

in

~olgenden

Er~ei%erung

(Zuweisung

Yariablen

und

lesen, bzw°

Expansionsoperaticnen hinzukcmmen,

dieser

weiteren yon

Um

zu

die ihrer

ist

~ine

Verst~ndnisses

in graphischer

~orm

basierend auf des Konzep% eines Datenflussgraphen

Basisopera%ionen yon

Analyse

~Iternativen

herauszuarbeiten

unumg~nglich.

diese

durchgef~hrf,

Dated

-

im Eereich eines

enthaltenen

Anweisungsausf~hrung

ben~tig%en

Datenflussanalyse halber

betreffenden

eider

~usf~hrung

die

Bereich

eider

~ntscheidungen

Warm immez eine Anweisung

im

die die

Abh~ngigkeiten

vcrangehenden

~oduls liegt, d.h. entweder direkt in einer seiner vorkommt

f@r

genau

(dutch

dieses

Konzeptes

zu

den

Eingabe/Ausgaheoperationen), nach Variablen

schreihen,

Doppelkreis

gekennzeichnet)

noch

die bei Ausf~hrung einen Knoten in einen weiteren

eine

Al~ernative

gegebenen)

Datenflussgraphen

e~pandieren. Ein

Dat~nflussgraph

des

betreffenden

wird gezeichnet,

Medals

nach

der

Schreihcperationen

in

jede Variable

K~sfchenform

beschrieben

(in

Pfeilfcrm gezeichne%

fortlaufende

Kopien

Numerier~ng

nut die let~te dieser d.h.

vcneinander

angefer%ig%en

in darauf folge~den

Ausl~s,ng

der

Entscheidungsvariable ~infachpfeile pote~%ielle

den

~U

oder

V

einen vcn

ergib%

(die

durch

verden)

verwendet

sich

und

iesevorgang allein

werden

yon

der

masgehenden)

Doppelpfeil

angegehen.

Expansionsoperationen

~bergehen

dargeste]Ite Bild. DeE Nebenmodul

einmal

"Aktualit~tswert"

die ers% bei Ausfffhrung

Datenli~ien

Hau~%~odul

Kopien der

and

~uftre%en ein

sind

unterschieden

Expansion

dutch

Datenlinien,

i~ tats~chliche ~nr

h@chstens

Operationen

ist

Lese-

uerden, wohei abet

angegeben) an2ufertigen

dart. F~r Expansionsoperaticnen (f~r

die

werden dart, sodass bei wiederholtes

and @erselben Variablen

besitzt,

indem f~r eine Operation anderen

der

sind

~xpansion

k@nnen. dabei

a hat 2

das

in

~Iternativen,

Ahh.

12

deren

334

entspreche~de wirk]ich

Datenflussgraphen

kommen

kann,

tragen

ist

wiedergegebene i~

yon Eingabe Eine

s ben~%igt einer

gew~hrleisten

ist.

Betrachter

hestimmt),

bereits

Ausf~hrung

erkennen.

werden.

Dies

vorgegeb~nen eigenes Bei

Graphen

deren Ausf~hrung geschrieben

gel~schf

ausgewMhlt

/2/)

de[

Ausf~hru~gs~glichkeiten VerschwindeD

beschreibf°

des Netzes

ausser~alb

dee

Betrachtungen).

ergibt

erweisen.

im

~r

zu

dabei

der

dass

Anweisung

momentan

Die

als

Netz jedem

Graphen" erlaubte, endet

mit

~uswirkungen ~edien,

maximal

der

steht

mBgliche

Programmansf~hrung

yon

oder Nichtvorhandensein

Programs

sich ~

hestimmt

Zu

"precedence

externen

bei

das gegebene

~herzeugen,

ihre

bestehende

Die ~usf~hrung

dutch das Vorha~deDsein

vo, Date~]inien.

das

(die eigentlichen

Parallelit~t

sich

dieser

eingesetzt.

s~mt]iche

das Schreiben

Ausf~hrungen

kann

sohald

~er Weft

wird in

Programmausf~hrung,

se!bst

Wet% an die

werden soll, und eine Eopie des

sfe!!t das Ne%z den

dar,

einfach

ein darin

vorhanden

selhst

ausf~hrbar,

Expansionsoperation

Ausf~hr~Dgszeitpu~k%

selbst,

sind

wird der entsprechende

Datenf]ussgra~hen

der

(vg!.

erzeugt mit der

~xpansion

ist,

scbald ihre Eingabewerte

definiert ist.

Alternative

eDtsFrechenden

parallelen

beginnend

~. Sobald dutch

werden

~tscheidu,gsvariable

anste!]e

einen

ausschliesslich der

und die Operation

zu

werden.

~xpansicnsopera%icnen welche

sie

f~r

m~ssen ~uerst einsal

hergestellt

ausf~hrbar,

jedoch durch

2umindestens

Expansion,

Expa~sio~soperaticn

Dafenf]ussnetz

~usgabevariable

des

dutch

nut

|s := s)

Kcntrollmechanismus

Operationen

nicht

Mod~lal%ernative

(und f~r den sind

geschieh%

Basisoperationen

(hier

~nalyse zeigt ferner,

Kopieroperation

lassen~

den

Zuge

f@r jede

wird, deren Hereitste!l~ng

Datenflussgraphen

auch

zum

serge zu

tiefergehende

eigenen

menschliche~

~amit

~xpansion "interface"

~alle eines Aufrufes der zweiten

~ie Variable

siDd.

zeigt.

und ~usgabevaria~len

abet in /13/ beschriebene)

Einf~hrung

Die

13

bei einer

f~r ein gleichf~rmiges

(gleich A ~ a h l

~odula]~ernafive). dass

~bb.

jede dieser ~]%ernativen

dabei

m@ge sich der die

voneinander

Leser

wiederholten una~hmngig

335

So

anschaulich

u~geeignet n~chster

schri%%

linearisiert gebracht. (nicht

graphische

sein

mag,

wird die graphische

Da~stellung

Lis%e)

einer

der

Operatienen

versehen

werden,

getren~ten

V =

als Menge Graphen

wobei Modulaufrufe die

die

dutch

Eingabe und Ausgabevariablen

Beispiel

ergibt

sich

damit

in

Form:

{So:=O; no:=1; write sz}

Po:=no

(!e_~t m k - X ( y , z )

= w:

f (y,z)) is...

mk...

At

the

the

manipulation

points

w h e r e i t is n e c e s s a r y of

functions,

use

to d i s c u s s is

made

more carefully of

the

Lamhda

notationf(x)

= ...x...

~owever,

these

(see ref.

[13])

let

x = e:

-

uses

will

o f t e n be

"sugared"

) )

g(x)

f = Xx . . . . x...

)

-

( x x . g ( x ) ) (e)

in L a n d i n ' s

style

394

Raps

are

used where %he graph of a function can be explicitly

co,structed -

[dl -> r~ J

explicit definition

[d->

implicit definition

r ] p(d,r)]

+

joining

are the c o ~ n t e r r a r t s of the set concepts.

To

come

now

to

the

problem

of

Semantic

le_~n_it_~o_~n°

d e f i n i t i o n s given beloN will be written in terms from

stated

se@~ntics VDL

domains

to

it is intended

style

models,

ranges.

[16],

in

c o m p o n e n t exists in the i n t e r p r e t i n g discussed

more

construct).

The

full~

below

re la ticn

mathematical

sema,tics

be reviewed

in c c D n e c t i o n

considered.

To begin

in

between

the d i s t i n c t i o n

from

the

which an ezplicit control machine connection functional

(this with

point the

semantics

is

set_e and

ref. [22] is of more i n t e r e s t

and will

with several of the l a n g u a g e

features

with i% is worth showing the definition of

a language which is itself f u n c t i o n a l and thus affords path tc functional

functions

In using the term f_u_n_ct_~onal

to e m p h a s i s e

ref.

of

The

an

easy

semantics.

1.2

Consider

the

language

given by the f o l l o w i n g

rules -

BI expr = inf-expr

B2 inf-expr

B3 var-ref

I var-ref

:: ex~r o F expr

:: id

B~ const =: I~TG

~ const

abstract syntax

395

The

class

op

is

not

existence of a functiom B5 apply-op

Now,

further

specified

than by the

-

: INTG op INTG -> INTG

for a given set of denotations

avoided

other

because

it

will

be

used

(the term "environment" below)

for

the

is

free

identifiers-

B6

DEN

the

: id->

INIG

denotation

of an expression,

which is also an integer,

is

given hy B7 eval-expr(e, den) = cases e: mk-inf-expr (el,op,e2)

->

(l_et v1=eval-expr(el,den) ; l_e_% v2=eval-expr (e2,den) ; r~su_!it_ is

{apply-op(~1,op,v2)))

mk-var-ref(id) mk-co~st(n) type:

This

expr DEN -> INTG

definition

composite

is

programming mathematics, at

from)

introduction

variable

a

has

expression

constructed The

-> den(id)

-> n

the depends

only

the denotations of

perhaps

on

that

the denotation

(and

can

of its component

].anguage. are is forced

most The

distinctive fact

that,

be

expressions.

feature in

of a

therefore

the concept of a dynamic assignment the

given point in time

semantics.

property

to a

of

a

contrast

to

to consider the value of a

variable

Foses problems for the definition

of

396

t.3

&~ais~a!_~a~£~

Consider

the

assignment

language

statements

whose programs consist of a s e q u e n c e of

(as-st)

which can be d e s c r i b e d -

CI as-st*

C2 as-st

The

:: s-lhs:id

effect

of

s-~hs:expr

such

a

sequence

of

statements

transfer~ so~e initial set of d e n o t a t i o n s step

by

step,

into

longer sufficient

their

final

to c o n s i d e r

for

the

denotations.

the DEN as

an

will

be to

variables,

Thus it is no

argument

to

the

interpretation:

the DEN reguired as the a r g u m e n t to the second

(and subsequent)

calls of

changed

by

the

the

interpretation

interpretation

may

have

been

of the first assignment. this proh!em

omitting

This is done because the

a]l

mention

of

the

DEN.

intention is to offer a number of

different

by

The

function given below a p p e a r s to ignore

simply

explanations.

should

be p o s s i b l e to see the intent of what is~ written

accepts

that ~ssign -changes"

assumes e v a l - e x p r

the DEN for

uses the c u r r e n t

the

given

It

if one

id,

and

DEN -

C3 int-st-l(st-!,!)

if i ~ lst-I !_he_n (let mk-as-st(lhs,rhs)

= st-l[i ];

l e t v:eval-expr (rhs) : assign(lhs,w) : int-st-I (st-l,i# I))

i type: It

is

formulae.

as-st ~- IN~g possible

to

The first

=> consider

three

ways

of

possibility is to read them

reading as

such

programs.

397

As

such,

each functicn

refers to one non-local course,

would ccrrespond variable

the same non-local

(i.e.

variable

eval-expr and all calls of int-st-l. trivially

defined

tc

modify

has its usual ordering are

written

in both cases to distinguish c~iven

this view and

the

DEN).

referenced

It

is,

of

by the modified

The sub-program

this variable.

implications.

with "=>", and

to a subrouti,e which

assign

is

The separator

"~"

Subroutine

type

clauses

their calls are marked with a ":", them from pure functions.

using the notation of ref. [ 12], the types

could be given in full by int-st-I

:: DEN:as-st~ INTG ->

eval-expr

:: DEN:expr

assign

:: DEN:id INTG ->

With

this simple, constructive,

discuss

one

of

definition.

the

dynamica~!y

functions. which te~t, there make

has

with

is no incentive

sequencing

language State" to

which

to

a

"Grand

State"

now

tc

all

part

of

the

state

this approach

is that it

the

discussion

The second

ref. [I ], would

in

as-st • INTG DEN -> DEN

eval-expr:

expr DEN -> INTG DEN

assign:

id INTG DEN -> DEN

not

The be

counter

eval-expr.

of alternative

views of the

interpretation

in each case a DEN. This,

int-st-l:

show

variable.

that the statement

possible

give-

and

would

functions as taking an extra argument

an extra result:

style

of program

Although in %his trivial language

further inspection,

int-st-l.

change

to do so, it would have been possible to counter

of taking

without

Returning regard

a

"Small

the possible exception

could net be affected h~, for erample,

function

of

by a side effect on this new non-local

disadvantage clear,

term

put into the state used by the defining

variables,

statement

the

in which only those things

are

are put into the state. the

properties

used

This is in contrast

all

view it is already possible to

desirable

McCarthy

describe a defi,itie, very

-> INTG

is

to

and yielding

which is the view of

398

(I,

fact

examplet

it

is

often

possible

since it can cause

nc

to

simplify;

changes,

in the above

eval-expr

need

not

return a DEN.)

But it is ne longer possible to rely on %he p r o g r a m m i n g view of ":".

It is n e c e s s a r y to d e s c r i b e it

functions.

as

a

combinator

The task of doing this is c o m p l i c a t e d

betwee,

by the various

a l t e r n a t i v e c o n t e x t s and it is e a s i e r to show the result would

come fros using

which

the c o m b i n a t o r -

int-st-I (st- l,itden)

=

if i -< ! s t - I

the__n (l_e~ mk-as-st(lhs,rhs) let

(v,denl)

= st-l[i ];

= eval-expr(rhs,den);

!_et den2 = assign(lhs,v,denl) ; result is

(int-st-l(st-l,i+1,den2)))

_e!s~e den

type:

Since

the

sugared

The

as-st* IN~G DEN -> DEN

"lets" are

now on pure functions,

a

fors cf lambda expressions.

third view one could

of ref.

they are simply

[22 ].

functiona~

The c o m m e n t

language,

take of the f u n c t i o n i n t - s t - i was made on the

definition

that the de,oration of its

was al! that was necessary to d e t e r m i n e

the

sub-components of

a

unit. By regarding the d e n o t a t i o n of an a s s i g n m e n t statement

as

a function this

is

it is again sc

~ossible

to enjoy

the

is that of

denotation

this property.

(That

is mcre c l e a r l y shown if the abstract syntax of a

cosposite statement

is given

recursively.)

would be-

int-st-l:

as-st ~ INTG ->

eval-expr:

expr ->

assign:

id->

(DEN ~> DE~)

(DEN -> INTG BEN)

(IN~G->

(DEN->

DEN})

The r e s u l t i n g

types

399

Again

in

this

combinator. (after

view

it

is

necessary

But now the fact that the

applying

the

functions

to

define

units

to

";"

be

as a

combined

to the static components)

basically f u n c t i o n s of the type ~EN -> ~E~ means that the

are very

pleasing model of functional c e m p o s i t i o n is adequate.

The

Oxford

group

(refs. [22, 23, 19]) have gone rather

far in

designing c c m b i n a f o r s which weuld permit f o r m u l a t i o n s like -

int-st-! (st- l,i) =

X~en.(!f

i

(DEN -> DEN)

w here -

CO~B(vf,uf)

=

kden.(kv,denl.(uf(v) (den1)) (vf(den)))

(It

should

be

made

clear

that if this had been written by a

genuine devotee of mathematical semantics, very

different.

it would have looked

It is the c u r r e n t author's view that excessive

zeal in shortening d e f i n i t i o n s makes them less rather than more readable,

Since

of. ref. [19],

ref. [1]).

this is a function one can look at its result

program!

Consider -

p = (X := I; y := x * 2)

for a test

400

Then after reduction-

i~t-st-I (ptl) = (kden.den + [y -> den(x)+2])

which

is

the result expected.

o (Xden.den + [x -> I])

(This e x e r c i s e is somewhat more

i l l u m i n a t i D g on larger examples.)

to

the earlier d i s c u s s i o n of grand

versus small state a p p r o a c h e s ,

(Reverting

it is worth noting that it would

have

for

been

a

moment

possible

tc make the

(undesirable)

step of putting

the s t a t e m e p t c o u n t e r in the state and still give a in

terms

cf a function

would ~ot,

howe~er, have been possible to provide

combinators. counter

to f u n c t i o n s from states

In

particular,

is required

definition

to states. such

~t

simple

the static role of the s t a t e m e n t

to provide the

required

decomposition

of

the semantics.)

If the a p p r o p r i a t e c o m b i n a t o r d e f i n i t i o n s bow provided three questio~

The

ways of reading

were written,

the f o r m u l a

we have

int-st-l.

positicD

advanced

in the next part of this paper is that

the i n t e r p r e t i v e view is useful during the development translator

mapping

thinkiDg

in

problems be resolved

~rograms

terms

~a~ipulated

definitions

notation

meta-~anguage.

written

in

reasoning

this

style

will

be

the m a t h e m a t i c a l view

or any

is

it

so far it would be easy

variant of the "fRr". the question

reasonable

to

add

In

doing

arises to

as the

Would it, for instance, be a c c e p t a b l e to have a

while c o n s t r u c t ? of

that certain

Thus the notation will develop

this for any r e a s o n a b l y c o m p l e x language m~ch

however,

this leads to r e t e n t i o n of

the a~ount of notation i n t r o d u c e d

how

the

to when d e c i s i o n s are otherwise unclear.

to define ~'if then else"

to

Observe,

of c e m b i n a t o r s ~ e ~ _ ~ d

as if they were oFerational;

say he a~pealed

With

to functions.

moze carefully:

also this view of the formulae. the style above~

of

and it is cn!y then that one need take the

view of mapping source %hat

The

of which should be used must now be discussed.

lhe answer

about

must a l w a y s be sought in

a construct.

the

ease

Thus in ref. [~] both while

401

and ~9~ constructs more

have been included,

restricted

kind

than

the

but they are of

FOR

a

much

of PL/I which was being

defined.

This

topic

definition

1.q

The

brings

in its banishment

standards

valuable

to

situations them

committees. have

and

some

an

problem

hop,

its ability

attempt

for

of

it

has not

languages

appears to he

abnormal

to provide formal

defining ~ 9

its

is

has

the

terminated

sequencing

definitions

for

as

both

technigue

this

the

stack

so

be

order.)

that

below

of the interFreting stack:

in of

if

with

a

can be shorter

The subject of

the

connection

state with

is itself

operation

obeyed and the next step of operation

of

the but

arbitrary

the recta-language

LAMBDA in ref.

removes the top

this

exits,

was more general,

describing

(called

function

block

in ref. [ 2,] was to

into

a VDL definition

function

compound

in the ne,t section.

the control

Instead

as functions,

interpreting

the

whose semantics

discussed.

component

[In fact

upsets the

same phrase structure

for modelling ~oto employed

point is discussed

control

can

is discussed

~acbine.

evaluation

section

this is simpler than

that

problems

introduce a control

~irectly

In

the phrase structure

Although

property

phrase structures

of a unit solely in terms of the

sub-units.

chosen

block terminations

abstract

is that, other than the local

and initiated abnormally

definition

the

serious,

tc leave or enter

should be pEovided.

an

be

mechanism

ability

with

of

statement

this

from descriptions

To

to state the semantics

semantics

The

a semantic

may be one of the tools for comparison.

The

it

the challenge of giving

debate on the morality of the Horn construct

yet resulted by

to

Lan~ua~

Goto

long

us

of ~o~o.

described [24]).

by

A step

instruction

from

is elementary,

it is

is performed;

in the case

402

of

a

"macro"

the centre]

operation~

the a p p r o p r i a t e o p e r a t i o n s are put on

stack so that the next step will e n c o u n t e r them.

So far this can be thought of as a way of d e s c r i b i n g functional application. component

The

an

its e x p l i c i t phrase

purpose,

explicit

~anipulation.

stzucture

structured delete

hoover,

was

Thus one

to

the

making

the

control

way to model ~ot_o out of

define

the

in line with the phrase

from

of

part of the state was to make p o s s i b l e

control

"obvious"

structure,

component

any

a

operations

but

to

simply

operations

which

c o r r e s p o n d e d to parts of the program being jumped over.

The effect of this was that, in general, present a r g u m e n t s structure

whose inductive

of the program.

structure f o l l o w e d the p h r a s e

It was of c o u r s e p o s s i b l e to present

proofs,

but they had to be by i n d ~ t i o n

states

generated

by

it was not p o s s i b l e to

LAMBDA.

One

over

could

the

sequence

argue

p r e c i s e l y the undesirable effect of the ~t__Qo, but in definition

had

the generality, instruction,

gone

too far.

in this case

forced

of

that this is fact

the

It was one of the places where

to

change

the

control

in

any

one to show that, in precisely the places

one did not require the power, it was not used.

In fact the deletion used more

of parts of the control

for exits from blocks: local

phrase

was sometimes only

the reason that it was not used for

structures

was

the

s o l u t i o n adopted for

h a n d l i n g abnormal entry into such phrase structures. some

definitions

was simply changed q_oto

into

and

to point tc the next

statement.

out of phrase structures

on exit from the phrase structure.

However,

unit

by

~anipulating

treatment of ~ND in ref. The

current author

action and

the

to be

into the

tended to cloud

the n e c e s s i t y to describe finding

made

performed

(This can, in fact he v i e w e d

part cf the iAMBDA function

such d e f i n i t i o n s

This

very easy to describe

providing there was no special e p i l o g u e action

as a b s o r b i n g

Hasical!y,

a d o p t e d a "current s t a t e m e n t s e l e c t o r " which

definition).

the normal action by

the successor to an

Fointer

(see,

for

embedded

example,

the

[25]).

became c o n v i n c e d

that setting up the normal

letting a ~oto "take the machine

by

surprise"

was

403

the

wrong

model.

The

proposal

could result in abnormal "abnormal"

result,

made

termination

which

was

was that any unit which

should

some

return

an

case. Any call of a function which could result in an return

must

(Together with W. Henhapl,

selector

treatment,

order

to

this was written

ref.

who provided the

statement

up in ref. [6]).

define Algol 60# in which it is possible to ~Qt__qo

into both "_if" and compound address

abnormal

test for this possibility and perform a p p r o p r i a t e

actions.

In

extra

null value in the normal

statements,

the other part of the problem.

[I ] is to provide functions

structure

without

to fo]]cw.

Since these functions

it

was

necessary

to

The approach employed in

which run through

the

phrase

executing but setting up all of the actions prompted the e x e c u t i o n

where to begin they became called

"cue-functions"

as

to

(as in acting

- Dam Stichwort).

Consider,

for example,

the following

-

~ot_a l: i_f P ~_h_e_n I: sl else s2 : s3

Not oD1y should

this,

rather odd, transfer of control

without evaluating p, it should also set up the performed

thereafter

so

thal

s2

is

get to sl

events

to

be

skipped and s3 is next

considered.

The c o m p l e t e l y functional definition of Algol given in ref. [ 1] became tedious because of the man~ places where the effect of a ~o~o

can

cause

a cha~ge of e v e n t s and therefore the abnormal

return value must be tested. most

common

action

was

It was, however,

next operation and pass back the abnormal level.

that

the

value

to

the

next

In fact %here are very few places where it is necessary

to describe any sFecial action. Lucas

clear

simply to refrain from execQting the

proposed

that

Based on

adopting

some

this unit

observation to

trap

P. the

404

interpretation

where the action

free %o drop the ,,test and

This

abbreviation

normal • ~!

is

for

ABN.

e x p l i c i t l y by the retur,ed

a

to

be

Non-~i~l

exi~

the

(implied)

values

statement.

for

ABN

Normal

return

of

All a

are returned

action

with the same value for

~i~

unit b r a c k e t e d

which it applies.

The

one

on

being

ABN.

An

explicit

performed for a non-hi I ~BN value is defined by

means of a ~ a ~

containing

leave

non-hi ! ABN value is to terminate also the calling

function abnormally action

would

the one used in ref. [ ~] and below.

~eturns are written omitting

walue

to

was required

return" case by convention°

together with the statement

C o m p l e t i o n of the trap unit completes the

function.

developmert of this idea has been d e s c r i b e d in terms of an

i~terpre%er partly

partly because

because

it

is

this is how it actually o c c u r e d

probably

easier

to

first

read

and the

following functions in this way. (In

fact

the

separation

i n t - n s - I and c u e - i n t - n s - i were more

taking

the

mathematical

of

the. largely similar,

would probably not

purely i n t e r p r e t i v e view

given

be

functions

made

view: it is only the

if

one

for the

below

that

functions

are

given

by the f o l l o w i n g abstract

written separately°)

The

language

syntax. is

considered

It is assumed

something

is

that among

the unlisted statement

types

like the a s s i g n m e n t of the previous section which

would force retention of the DEN component.

D1 st = goto-st

} cpd-st

I ---

D2 goto-st :: id

93 cpd-st :: nmd~st • D~ hind-st :: s - n m = [ i d ~ s-body:st

id net further defined

405 The defining "the unique

functions

can now be given

object satisfying")

(the "b" operator

-

D5 int-st(st): cases

st:

ink-gore-st(lab)

-> ~Ii~(lab)

mk-cpd-st(ns-l)

-> int-ns-l(ns-l,1)

type:

st =>

D6 int-ns-l(ns-l,i) :

if i _~ !ns-I

t~_n {(trap exii_t (lab) wit_~h if is-contained (lab,ns-l) then cue-int-ns-l(ns-l,lab) else exit (lab) ; int-st(s-body(ns-l[i~)) int- ns-I (ns-l,im I) )

_e!s~e ! type:

hind-st* INIG =>

;

means

406 D7 c u e - i n t ~ n s - l ( , s - ] , l a b ) : !et i =

(hi) (is-contained(lab,))

if lab = s-nm(ns-l[i]) thhen int-ns-l(ns-l,i)

e_is_e witl

(~ra~ exit(lab)

if is-co~tained |lab,ns-l) t_he~n c u e - i n t - n s - i (ns-l,lab) else exit (lab) ; c u e - i n t - ns-I (s-body (ns-l[i ]) ,lab) ) int- ns-I (ns-l,i+ I) )

type:

hind-st* id =>

D8 i s - c o n t a i n e d ( ] a b , n s - l ) (3i) (s-nm(ns-l[i])

= = lab)

v

(39) (is-cpd-st (s-body(ns-l[j ]) ) ^ i x - c o n t a i n e d (lab,x-body(ns-l[j 9))

type:

[%

id n m d - s t ~ - >

was

observed

obtained

terms.

constructs.

that

a deeper

This

view

is

~irst it is n e c e s s a r y

in the "=>"

int-st:

Tt

above

understanding

by viewing a m o r n - l a n g u a g e c o n s t r u c t

sesantics

hidden

{_t_rue,fa_!_se)

now to

in

a t t e m p t e d of the above uncover

what

st ->

n~d-st~ INTG -) (DEN -> EEN IBN) hind-st* id

to

define

s t a t e m e n t s separated

by

(DEN -> DEN A~N)

the

denotation

";"

in

terms

of two m e t a - l a n g u a g e of

their

denot atic,s. stl;st2

-

been

(DEN -> DEN ~%BN)

int-rs-l:

easy

has

of the type c l a u s e s -

cue-int-ns-l:

is

is often

mathematical

kden. (le_t (denl,abnl)

= st1{den) ;

if abnl = _nil then st2(denl) else

{denl,abn I)

individual

407

This

gives

us the way of creating

-> ~ ABN}

from two functions

test

dynamic

is

~qt_o will, !t

is

a function

depend on the state.

to write a very straightforward

for the t r!2 e_/xi~ but, if this results in the

fact

functions applied make

which

to the whole text

for

which

one

or free

although

the

labels

(in the sense they may the set of labels the

labels.

by the following

example -

1

I:$2 /*no contained

Then

that

can do something is known: it is precisely

~ f p ~he._,.nn~

=

is

within the unit),

This point can be illustrated

S

dynamic

(of the current unit) would

to the trap are unknown

he either contained set of contained

equally

to ascribe a semantics of the required type

The key observation

will come

an

combinator

that the t_ra_~ exit body again uses defining

it impossible

to a unit.

{~

unavoidable because the occurence of the

in general,

is also possible

action,

whose type is

of similar type: the fact that t h e

labels

*/

-

int- ns-l(S,1 ) = !_e_% (den1 ,abnl)

= int-st(if

p _th_e.~ng_o_to 1 e_!Is_~e~Lqto =);

(abnl = all -> int-st(s2) abnl = 1

-> cue-int-ns-l(S,l)

T

-> exit (abnl))

Wow s i n c e cue-int-ns-l{S,l)

it

= int-st(S2)

can be seen how tc construct the denotation

of course, statements

a trivial case. introduces

But eve~

where

looping, an equation

fixed point can be sought.

the

of S. This was, graph

of

~o

will be given whose

408

It should be conceded at ~rife " d e f i N i t i o n s " combination°

The

above

which

do

not

permit

static

This is a cause for further consideration.

mechanis~ is not the one usually e m p l o y e d

mathematical semantics: been

this point that it is also p o s s i b l e to

using exits

accepted

{cf.

the mechanism

ref.

[23])

which

is

in giving

appears

that of

to

have

"Continuations".

Basically,

the denotation of a label is the function

~)

r e p r e s e n t s starting at that label and running to the

which

(say ~

->

end of the program!

This is c e r t a i n l y a more powerful concept:

that

more g e n e r a l languages,

it

can define

found out to his cost possible

when

continuations

while

tried

to

the current author show

that

to e l i m i n a t e c o n t i n u a t i o n s in ref. [21].

maxim is to be sparing

general.

he

to

model Algol 60 labels

was

~owever,

c~ power in the m e t a - l a n g u a g e {ref. [ 1 9 ~

It seems unfortunate, for instance,

it

and

the

using

may be too

that in -

p do

!_f q !_b_e_~so__t_o I: I: $2

ea_d the

The

label I cannot be "treated

actual

locally".

choice between c o n t i n u a t i o n s and the model offered

here must be made in the c o n t e x t of the definition.

Since

use

be

decided.

%.he language

the Oxford group has an interest in proving

c o m p i l e r s c o r r e c t it will await a larger can

of

The

experience

with

arguments on e~i~t is, sc far, e n c o u r a g i n g .

example basing

before

this

correctness

409

B_!_o_cL_s!~9~!uj~L__aan_.q~a_q~

1.5

Both

blocks

and

rrocedures

local level of ~aling.

permit their user to i n t r o d u c e a

Since the names defined within different

(even nested) b l o c k s do not have to be distinct, of 1.3 will not suffice as a state. language

in

"remember"

which

Consider

no recursion is allowed.

the value of a variable,

a nested block in which a n o t h e r to o v e r c o m e this problem statically

distinct

for

the

of

a

say x, over the lifetime of

be

to

make

all

one way

identifiers

example, q u a l i f y i n g them with a

unique b l o c k

number.

The

renaming scheme would not, however,

static

case

It is necessary to

variable x is declared,

would

by,

the s i m p l e DEN

be a d e q u a t e if

recursion

were also allowed.

It would then be necessary to keep

distinct,

multiple i n s t a n c e s of a variable which is declared in

a recursive block.

Before

considering

the passing of procedures as parameters,

is a p p r o p r i a t e to discuss c a l l - b y - r e f e r e , c e since i,troduces

a

tool

which

makes

the

its

it

solution

remaining problems both

easier te state and solve.

C o n s i d e r the following -

b__eain ~oc

p(x): ia~

x; x := a;

p(a)

If

the variable a is passed by reference, the parameter x will

refer to a. In an i m p l e m e n t a t i o n the

parameter

location. i~

x

would

result

in

a

reference a

reference

the argument

in all

referenced.

In

places

and

to the same

The d e s c r i p t i o n of Algol 60 in the Algol Report,

this part very operational.

become -

the non-local

was

The model given was to copy in

where

the

formal

parameter

was

this way the body of the above p r o c e d u r e would

410

a := a.

Some

care

because

was

discussion using

necessary

concrete of

when

abstract

describe

with

What

The storage

directly

with

has really

be i n s e r t e d ) .

for

class

[I]).

and

for

there

associate

of the class. which

to

least is

a

is to show the s h a r i n g

of objects

below

the

Eut even

tedious At

call-by-name}

component

bee~ done

(of.

somewhat

The idea

identifiers

rule" partly

should

ref.

the same m e m b e r

"copy

discussed

becomes

below

i, the e x a m p l e

the

being

(of.

by an e n v i r o n m e n t

(chosen

!ocations)~

it

mechanism.

some a u x i l i a r y

is m a i n t a i n e d

values

(see

equivalent,

by h a v i n g

LOCs

parentheses

copying

call-by-reference

identifiers

were

programs,

this

simpler~

in d e s c r i b i n g

strings

maps

This

identifiers

to be s u g g e s t i v e will

no

but i n s t e a d

is to d e c o m p o s e

into

of m a c h i n e

longer with

both

association

associate

locations.

-

DE~ i d - - ..... > VAL into-

ENV

STG

id ....... > LOC ....... > VAt

hut

in doing

so,

the

possibility

is i n t r o d u c e d

to have

idl .... I--->

n--->

v

id2 ....

so

that

any change

references more

via

thaw

via

one of the i d e n t i f i e r s

the other.

the

expression

(The use of of

an

£0C is,

is r e f l e c t e d

to

in this model,

no

eguivalence

relation

over

identifiers).

Tt

is

now

necessary

environments mentioned dynamica!ly

handles

above~

consider

the block

~he

distinct,

to

and

locations

how

model

recursive

will

so the problem

a

block

be g e n e r a t e d

of entering

which

a

has

problems

so as to block

be and

411

destroying

a

denctafion

c e r t a i n l y been overcome. environment location the

is

case

of

base

bindings,

The

will

later

be

r e q u i r e d has

generated

mapping

the

new

local

to

a new

identifier

(notice such a copying of DEN would be incorrect). a

~eeper blocks the

which

All that happens is that a

block

which

can be known by and called from

(i.e. a procedure), it is necessary to

environment,

to

which

it

will

show

insert

how

its local

is to be found.

most

mcdel

economical

would

be to assume again that all

identifiers are distinct in which case it is possible that

In

any

to

show

valid calling e n v i r o n m e n t c o n t a i n s the required base

environment as a sub-part.

"most

existence

does nct cover the case where procedures can be passed

parameters!

ref.

This

is

variable

the

recent"

as

any

then, only

solution

recent"

of

TB this case,

precisely

references are possible.

[7]).

The

operational

because

then,

"most

discussion

see

will be to "remember"

what its base e n v i r o n m e n t should

be.

In

an

model one would make a procedure d e n o t a t i o n contain

the pair of procedure text and environment. semantics

other than

(For a fuller

general solution,

for any procedure

can be r e f e r r e d to. This

In

a

mathematical

style d e f i n i t i o n one wo~Id use these two entities to

create a function.

The language to be considered is -

El proc :: s-nm:id s-parms:id~

E2 s t

= call-st

~ as-st

E3 call-st :: s-pn:id

~4 as-st :: s-lhs:id

Identifiers

--.

s-args:id~

s-rhs:expr

then correspond either to variables

is considered) required,

I

proc-set s - d c l s : i d - s e t st

in

or procedures: the

latter

in a

associated object.

E5 ENV : id ->

(LOC | PROC-DE~)

the

former

procedure

(only one type

case

a

denotation,

LOC as

is the

412

A not yet initialised

value for a variable is allowed,

so -

~6 S : LOC ~> VAL

E7 LOC = sere infinite set

E8

VAL

=

INTG

~

A functio~a] type for ~rocedure d e n o t a t i o n s is given

Eg PROC-D~N

: (LOC ~ PROC-DEN)~

->

-

{S -> S)

(Notice that qo_tc is net is the c u r r e n t language,

so ~BN is not

required) o In order tc cover recursive

procedures it is necessary that the

d e n o t a t i o n of a procedure is a v a i l a b l e indirectly) denotation

call

itself.

Since

denotations

wherever it might

is discussed

here are functional objects,

(let

env'

validit7

of

= proc;

= = env

+

([parm-l[i] - d e n ° l i t ] [id - eval-dcl(id) [s-nm(proc)

~ 1- YROC-D~N

413

Eli

eva!- dcl (id) : let

l:alloc () ;

assigm (I,?) result is

type:

E12

;

(I)

id => LOC

int-st(st,env|: cases st: mk-call-st (pn,arg-l)

->

(let f = env(pn); let den-i =

The functions -

~13

allot : :> r OC

El@

release : LOC =>

extend

and

restrict,

respectively,

the

domain

of storage.

While-

E15

assign : LOC VAL =>

E16

contents : LOC => INTG

modify and read,

Based

on

the

respectively,

environment

important points.

it

values of storage.

is

possible

First, it is interesting

is precisely the response

to

to note

clarify that

two this

of both c o n s t r u c t i v e and m a t h e m a t i c a l

414

definitions ]a,guageo

to

the

This

problem

the above manipulation cf ~ref.

the environment,

as

the

well

in the state.

he modified the

defining

a

block

structure

environment

better

than,

is say

The VDL models used the grand state approach and

[2~3"?

co,tailed

of

leads to the second point: in what respects

as

all

"stacked"

versions,

This ~eans that, potentially,

by any function.

were

they can

It was then necessary to show that

i n t e r p r e t a t i o n of two successive statements in a statement

list was under the same environment. non-trivial call, had

because if the first statement was, for example,

the e n v i r o n m e n t to

call,

show

was indeed dumped

environments exactly

that by termination of the i n t e r p r e t a t i o n of the

as

arguments,

that two

the

argument.

~£~i~i~.

certain

the

other

calls

of

(This

was

of

hand, shows guite int-st

the

are

passed

subject of the,

~roefs of the first two lemmas in ref. [g].)

language now available

%hat

on

successive

sa~e

somewhat tedious,

of ~ £ n ~

a

then changed: the proof

the dumped e n v i r o n m e n t had been restored. The passing

explicitly

The

Moreover, such proofs were

is rich enough

to discuss the topic

In the d e f i n i t i o n given, it

conditions

is

assumed

hold for a b s t r a c t programs which are

not expressed by the syntax rules.

For example,

the

definition

would simply be undefined for a program which attempted to call a simple variable or which called a defined procedure less

arguments

than

type

rules

the

unnecessary

in

parameters. abstract

encumberance.)

syntax

however,

of

ref.

It would, of course,

write appropriate checks i~tc

but

with

(The attempt to include such

the

defining

[ I]

was

an

he possible to

functions.

This,

would net show that the checks, in this language,

of a static

nature.

That

is,

it

is

possible

to

define

are a

predicate -

is-well-formed

: ~roc ->

{tr_ue,_fa_!se}

which only yields tzue in the case that all such static context c o n d i t i o n s are fulfilled. position

This

is

not

intended

to

take

a

on to what extent type questions in a language should

be s t a t i c a l l y checkable° useful

~roperty

static

and

what

of can

a

It is only

to

definition

to e x p l i c i t l y show what

only

be

checked

argue

that

it

dynamically.

is

a is (Rn

415

associated

~oint

implementation errors

It

is

i~ an unexecnted

would

be

procedures

1.6

that

for programs

it

permits

which contain

freedom

statically

to

an

checkable

part).

to

possible

define

both

and function

blocks

in the style of this section.

F ux!her ToEiN~

This

section

ccnstrucis With

will consider how some other,

familiar,

language

could be tackled in the same spirit as above.

regard

definition

to fl£~£, it is straightforward

to extend the last

to cover S£~2 out of procedure calls.

This,

with the merging of the other features already defined,

along is done

in the Appen@ix.

~he problem is simple because only known,

therefore

recently activated,

most

referenced.

[f the language allows

parameters

this

the

passing

property no longer holds.

to make each instance

of

labels

as

It is now necessary

cf a label dynamically

this requires some mechanism

and

instances of labels can be

distinct

like the activation

and to do

identifiers

of

ref. [~].

The

i,troduction

variables)

into

consideration

a of

label

language

target variable definition

the

something

The

(or,

with

it

affected of

variables

is

forced

fact,

the

a label instance

to be not-greater PL/q d ~ s

in

entry

additional

which Do longer

the

lifetime

than the lifetime of the

not make this restriction to

of

add

a

validity

and

check via

like a set of currently active activations.

definition

creation model

brings

referencing

label being assigned. so

variables

Algol 68 avoids this by constraining

exists. any

of

of

call-by-value

of a special by the in

any

location

changes

A ]gol

60

via

is simple to include

which

will

be

the parameter.

description

aD imaginary block.

of

the

by the

only

one

This is a close

assignment

to

new

The fact that Pi/I makes the

416

choice

between

calling

call-by-value

side

ca~!-~y-~a~a

is

of Algo~

passing

procedures.

it

is

frequently

the

i~Dlementer

wider

cf

definition write

it

referenced

could

is

freedo~

the

First

-

that

in

definition

reasonable

is warning

to leave

This

is

permit

the

user

order.

in

references

such orders

is an open

a

the

to be made

function

In a l l o w i n g

for

an e q u i v a l e n t

to

in an expression contains

such freedom

powerful

mechanism

which g u a r a n t e e

be

designer

define

note

may

the

on

pmore

the

rely on any particular

of how tc formally

(a +

via

in a !a,guage

side-effects.

which

The

of order of evaluation.

variables

language

programs

[4].

handled

if the e x p r e s s i o n cause

call-by-reference

ref.

than o p t i m i s a t i o n s

any order even which

60

~cr instance,

access

and

in

desirable

some

freedom

result.

shown

in a

not

to

The question problem!

(b + c))

the d e f i n i t i o n

may

wish

to allow

not only

-

a h c, a c b, b c au c b a

but also c

It

a

h etc.

is net,

at each The

therefores

branch:

response

of the

performed

were in any

any a v a i l a b l e

The

its

Droperty

of

such

arbitrary

this

was

VDL

to this

~achine But

in

ref.

LAMBDA

into

order

cculd

The

in fact a r e a s o n a b l y

control

operations

to

randomly

be be

chose

for execution.

by

very similar

building

the definition. occur

arbitrariness.

if they were to

function

[I] was in fact nature

the

was to sake the

branches

IAMBDA

of the ccnt£cl

functional

tree.

parallel the

one path or the other

to inherit problem

into a

on

crder and

leaf

definiticn

achieved

to choose

it is n e c e s s a r y of

component

executed

adequate

was in

Since

this

the only

expression

small impact.

and only relevant place

evaluation

417

The

definition

meta-language not

solved

provides

in

ref.

[4]

by introducing because

for the

the

shifts

the

problem

the "," separator.

definition

inheritance

of

of

a

to

the

The problem

combinator

sequencing

freedom

is

which is

not

provided. 9.

Beki%,

pieces than

among

others,

of "interfering" their

has pointed out that to combine two

program it is necessary

(extensional)

this problem is discussed

in ref.

be

the

generally

required,

pursue the definition defining

of

know

more

His approach to

[3]. While this approach

current

author

more axiomatically:

conditions

to

functional meaning.

good

may

would prefer to

in the first place by

cooperation

that

guarantee

~on-interference. One

final

axiomatic great

area,

that of Storage

parts of a definition.

freedom

left

to

the

reasons,

In

some

implementor

mappings of the data structures efficiency

Models,

are

mappings.

worthwhi3e example, list

viewing

I. 7

in

an

ref.

array

there

is

to what storage

required.

In

PL/I,

somewhat

[q],

however,

storage

model

for

and

to

express

the

constrain

it

was

found

abstractly

value as a mapping

For a fuller discussion

(for

from a (hypera

additional

linearised constraints

see ref. [ 2].

S_~u_m_ma__r Z

difficulty

of defining a programming

from a purely functional occur.

A

functional

extensionally Three

as

of integers to values, rather than as

thereof)

separately.

The

Even

to state the basic

)rectangle

languages

the programmer can take alternative views

of an aggregate and thus the language does the

brings in the role of

definition

(of. section

alternative

definition

in

solutions

were program

to

which functions

I. 2) is not immediatel~

as an interpreting

all interpreting

language as distinct

language is that changes

di~ussed;

a

state

are read

applicable.

to consider the

(ref. [ 2 ~ D ; to

functions as ha~ing an additional

consider

argument

and

418

result which is the state fu,ctions [!9]}. to

(ref. [I]): %o consider the

defining

as producing a f u n c t i o n s from states to states

I% is possible, by adopting c a r e f u l l y

write

in

a

chosen

notation,

style which can be read in more than one way.

Except for the problem cf a r b i t r a r y order, c o m b i n a t o r s provided which permit

The

advantages

are returned

Whatever

(ref.

of

can

be

ref. [4] to be read in all three ways.

the different ways of viewing a definition

to in the next part of the paper.

one's chosen viey of a semantic definition,

there are

more important c o n s i d e r a t i o n s which influence what is

written.

The

central

guide-line

proposed is that the definition should

not possess properties ~hich are no% inherent in being

defined.

This

is

that

a proof is often

a

such

details

it

program,

required that certain

the model haYe no effect on the final outcome. eliminate

language

,or to say that such definitions are

wrong in the overall effect they d e s c r i b e for rather

the

but

details of

If the model can

will facilitate the use envisaged

below. ~xa~ples

of

where d e f i n i t i o n s can be o v e r - s p e c i f i c range

the use ef the grand state a p p r o a c h use

of

lists

vhere

to trivial items

like

from the

sets conld suffice for components of the

state in ref~ [25]. Before

proceeding

%e

the

discussion

of

proving a compiler

correct with respect to a language definition, should

be

spePt

on

the

language definition. corresponds hope.

Most

properties

to

a

such

standards

(axioms)

The

direction

encouraging

moments

definition,

descriptions

the

definition

there is little

are

a

mixture

and partial models for the language. which it is not, for the natural

he read precisely,

best i0cemplete,

verbal

few

of the correctness of the

Rs far as -proving" that normal

if it were possible, to

question

a

such d e f i n i t i o n s

of Even

language

would be shown to be at

at worst inconsistent. followed

by

ref.

[25]

is,

of

course,

very

in that it is a huge step towards s t a n d a r d i s i n g via

a document which could be considered

to be formal.

419

In spite of the difficulties possible

to consider

most trivial languages passing

level a definition

should be checked the

implicitly

defined

seeing

why

currently

task

under

cn

a

the source

first

be

is required. "formally"

question

of

the must

two

to

source to

be

to

to

be

entail

is

the creation of a

of

Secondly,

functions from states generated

environment

an understanding is

returned

of to

process to be discussed

resolved

is

is which of the reading

are

to

be

adopted.

the interpretive

This

which a

have

mapping

been shown in the source in the choice of code to of

source

to states has more to be

functions:

will also require

author,

view of the

Firstly, it is very unlikely that

are exactly those required

produced.

It is

(and even ref. [9]).

for taking

distinctions,

case

source

specification

ucrked out. There seem, to the current

arguments

the

definition,

the

The question of whether this doCume, ted

definition

as the basic one.

these

aid

some extent remain open until more examples

definition

be

would

is

this process can begin,

must be

have bee~ fully to

has been

definition

definition

The step by step development

question

an

to the object language.

very similar to that in ref. [15]

styles

the proof

consideration

Based

machine

understanding

The

a

as

of

texts of a source language into texts of a

from

that before

object

below.

Much more

existence

this section discusses how to obtain a

mapping

obvious

believe

like

OF i TRANSL~TOB SPECIFICATICN

language.

a

by current

In ref. [4] an attempt

authors

the

under consideration.

overall

language,

and

it is

it

errors

%o a function.

termination

objects.

program to translate machine

of

The guestion of what

2. DEVELOPMENT

the

of the size required

~cst and assertion comments

the

"consistent".

correctness,

consistency,

to be free of clerical

guesticns

made tc insert pre,

for

like

the wrong number of arguments

subtle are

The

in establishing

some property

programs

developed

to than

manipulations of, for example, modelling in the object state.

the

420

The

abstract

state

of

the

source

definition

permit a large range of implementations. with

a

particular

designer ~ s task particular

machine

is

to

object

might be a

developing

something

of ref.

in view, to be more specific. The

find

machine,

development

concrete

realisations,

for

abstract

the

multi-stage

process

version

properties example

a

in

on

state. the

case

his This of

like a n ~NV to a d i s p l a y model like those

[7]. E a c h stage of d e v e l o p m e n t

concrete,

was chosen to

The time has now come,

of

an

object.

not possessed

by

the

proTides This

more

a

new,

more

model

will

have

abstract

object.

For

list has an o r d e r i n g p r o p e r t y not present in a set.

For this reason,

the a p p r o p r i a t e style to document the r e l a t i o n

believed

to express the c o r r e c t n e s s is from

abstract~

Thus,

(more)

c o n c r e t e to

if Z is a set and modelled by a list L, the set

is r e t r i e v e d by -

retr-$ {L) =

{L[i] ) !- pre-oP(q) and

the

required

{t_~rue,f~!se}

= (3#') (G [OP] ,')

other

cf

which

specifies the input/output

post-OP

: Z Z ->

pre-OP(#)

{tr_ue,false}

^ w [OP] #' = post-OP(#,#')

by -

If both of these conditions hold the fact is recorded pre-OP These

relation

for the operation -

two

post-OP

predicates,

the meaning of OP

then, record everything

(there is of course

necessary

about

nothing about performance

etc.). Suppose

a

proceed?

The problem is decomposed

specification

is

given in this form, how does one

a set of

(simpler)

operations,

to sub-problems

on

the sub-operations

the

be

true

of statements

that time they provide

will be recorded

The

most

languages execution

trivial is of

to the

way

of

separate first

execution of the second. such a combination

in the language

specifications

is

to

with be

of the two operations

in exactly assumptions

work.

two

operations

":"

showing

immediately

The conditions

necessary -

The

being used. Until

for further

combining them

could be

specification.

the sate style. Eventually all of the sub-operation will

choosing

which, if one had them,

combined in some stated way to fulfil assumptions

by

in most

that

the

followed to show

by

that

428

pro-OPt

post-OPl

pre-OP2

post-OP2

will satisfy pro

are

firstly

stated

post

that

each

pro(G1)

= pre-OP1 (=1) A post-OPt(,1,=2)

secondly

~erived

the

overall

such a sisp!e

combination,

requirement

not being suggested

should

be

accompanied

post(,1,,3)

to

In wany sit~atious

For instance,

list has been provided

can be

is proved in refo [12])~ prove

however,

true.

For

three special

with total operations

the first two lemmas are vacuously

it is certainly program

the

excessive.

cases can be applied.

relation

^ post-OP2{,2,~3)=

of these c o n d i t i o n s

appears

~X~)

input/output

-

~ post-OPl(~1,e2)

adequacy

lemmas

%hat

= pre-OP2(q2)

frow the combination

pre{#1) (The

will only be used over its

domain-

pre(,1) and

=

operation

(pre

Furthermore,

that every use of ":" in

by formal

proofs:

a

but a check

to which an appeal can be made

in

case

cf doubt. The

reader

the

next

should stage

properties

observe of

relied

onQ

that the development current providing

proof.

that a specification

development

which

Thus it is not n e c e s s a r y

cf that operation

There

is passed on to

states

is

a

complete

does

not

all

of

the

to later show disturb

split of the problem

the of

J ~stif ications.

More

methods

style

by ref. [ 12], section 9 of that paper also considers

of

combining

operations

the set could be further extended.

are defined in the same how

429

3~

many

respects

this might be the more important

forms of abstraction

being considered:

of the two

it is certainly

the

one

which is under-employed. The idea of data abstraction the earlier mappings

has, in fact,

parts of the paper.

have

both

used

an

The

language

abstraction

(i.e. that class of objects described That

this

was

alternative

necessary

of redefining

can

already

be

been used

definitions

and

of the program text

by the abstract seen

in

by

syntax).

considering

%hose functions in terms of

the

concrete

strings. If

then,

terms

it

ef

is

difficult

detailed

impossible

tc

data

tc even state the specification

representations,

it

will

write a r g u m e n t s for correctness

in

become

at such a level

of detail. The

proposal

is

that development

bring in those properties effect

on

percentage section

the

below.

represe,tafion performance The

of the data structure

algorithm.

That

cf the development

3.~

~hus

this

that

permits

can be seen

one

in

a

have

an

reasonable

the

example

is able to postpone

of

fixing the

of a data object until adequate reason

of a sub-algorithm)

interface

of an algorithm should only

(e.g. the

can be ascertained.

between operations might, then,

terms of sets or maps for example:

in

the

be described

final

code

in

linked

lists or hash tables might be the chosen representations. ~t

is

necessary to discuss

which refiDes a data doing is adding

properties

has an ordering

property

possible

what goes on in a development

representation. to

not

Essentially

the data structure present

in

the

what

step is

(e.g. the list set).

to re~iev~e all of the data of the abstract

the more concrete.

one

It

is

level from

430

Suppose

an

o p s r a t i c ~ ~n states of c l a s s D has been used, s u c h

that -

prod postd

and one now wishes to show that -

pree poste is

an

adequate

simulation.

If

is

sufficient

to

find

a

r e l a t i o n s h i p b e t w e e n the two state c l a s s e s -

retrd : g -> D w h i c h shows that OPe works on a wide enough class of states -

pred(d)

and

that

retrd)

the

ne~

operation

produces states m a t c h i n g

those produced by the old o p e r a t i o n

prod(d)

(This

^ r e t r d ( e ) = d = pree(e}

^ retrd(e)=d

~ction

differs

O p e r a t i o n s are Dot,

(under

-

^ poste {e ,e ') = postd(d,retrd(e'))

from

that

in

ref.

[ 18]

in that the

necessarily, functions).

3j___~le___o_f_~_xa_re_s si o_n_C_o_~ t_ii_on

The

input/output

[elation given for trans-expr in section

defined %o operate conveniently linear form consideriDg

cn

chosen

objects

parsing and

consider a reverse-~olish first p a r s e -

type

"expr".

These

2 is were

to be tree r e p r e s e n t a t i o n s of %he o r i g i n a l

(presumably infix). the

of

Without going all of the way to

tokenising of an e x t e r n a l string,

text which might result from

such

a

431

c-expr

::= c-inf-expr

c-inf-expr c-var-ref c-const The

~ c-const

::= c-expr c-expr c-op ::= ...

::= ...

relation

retrieve

~ c-Tar-tel

of

this to the class expr can be specified

function

which

hy a

uses a stack -

retr-expr (tl) =

!_o_r i = ~ _to ! t l ~o (is-c-var-ref (tl[i ]) -> ~ush (retr-var (tl[i ])) is-c-const (tl[i ]) -> p.ush (retr-ccnst (tl[i ]) ) is-c-op (till ]) -> (!_e_t e2: p_q~; l~t e1:~o~; 9.~sh(mk-ex~r (el,retr-op (tl[i ]) ,e2) ) ) ) ; result type: Not

is

(~o~)

c-expr -> expr

only

criterion suggestive

does

this

retrieve

for the following o f a way

to track

function

translate

give the correctness

function,

the temporaries,

the stacking

is

432

Assuming

an external

variable b -

trans-c-ezpr (t i) : _~_~_r i = I t c _! t l do (is-c-var-ref (tl[i ]) -> (b:=b+

I;

assign (rib ],contents (retr-var (tl[i ~ ) ) ) is-c-const (tl[i ]) -> t;

(b : = b +

assign (t[ b 3, retr-const (tl[i ])) )

is-c-op(tl[i])

->

(assign (t[b-1 ].appl y-op (contents (t[ b- I), retr-op (tl[i ~ , contents (t[ b]) )) ; b := b-l)

type:

c-expr =>

The correctness

ca~ now be proved

if -

b := O; trans-c-expr(tl| trans-expr (retr-expr (tl) , I) This

result

follews

from a Froof, by induction

{and possible c c n s t r u c t i o n s

of)

on the length

tl, of the stronger statement

-

b := k; trans-c~expr(tl) leaves b = k + I and creates

the same as

frans-expr (retr-expr (tl) ,k+ I)

The

rather

short

leave the reader Whilst

the

correctly

unclear as tc

notation

made the

development

treatment of Formal

via

used

step

in

from

operations,

he,

a

ref.

Development bigger [12]

development the

example

offered

example

may

looks.

is t h o u g h t to have via of

functions

to

that report

is

433

unconvincing. was

so

This is mainly because the a l g o r i t h m

oriented

to

arrays

considered

that the use of an abstract data

r e p r e s e n t a t i o n is somewhat artificial.

The

example

ref. [11] is more i n t e r e s t i n g with respect to

of

data a b s t r a c t i o n and a

short

outline

of

a

rewrite

of

its

development is now given -

Specification:

find

a

(general,

table

driven

recogniser)

algorith~ -

REC

: grammar nt s y m b * - >

{_Y_E_S,~O}

Where the abstract form of a grammar is -

grammar : nt -> rhs-set rhs = el* e! = symb I nt

The

pro-condition

n o n - t e r m i n a l and sentence

defines that

that

there

non-terminal.

there

is

The

are

exactly

rules

one

post-condition

should yield "YES" if and only if produced from the given gra$mar.

the

for

rule

states

symbol

each

for

the

that

R~C

string

can

be

("Produceable" is defined).

Step I: Splits the problem into an input stage which stores the grammar:

a main stage which c r e a t e s

"State

contain

information on all possible top-down parses:

stage which yields YES or NO depending on a state

sets.

terms of the the

which

will

an output

predicate

of

the

The storage for the grammar is still s p e c i f i e d in (abstract)

programmer

map.

assigned

This may be

the

task

a

disappointment

to

of c o n s t r u c t i n g the input

routine since he has little to work on yet. fixing

Sets"

But

the

cost

of

this interface for his c o n v e n i e n c e is that the far more

time c o n s u m i n g parsing operation has not yet been developed far enough to get h~s

views on an e f f i c i e n t representation.

The state sets are a l s o described a b s t r a c t l y

(as a list of sets

of tuples)

is

certain

since the purpose of this

stage

to

show

that

upper and lower b o u n d s on the state sets are s u f f i c i e n t

434

to make the final Fredicate correct. of abstract

definition.

Notice this

give~ bounds and different a l g o r i t h m s could be use

this

freedom.

(In

fact

the

considerable use in considering

Scanning] the

which generate nee stales. ~inimum

constructed

specification

to

has been of

(Prediction,

Completion,

The state sets are defined

sets satisfying a certain equation.

are showP to fall

form

optimisations).

Step 2: introduces E a r l e y ' s operations

as

extreme

There is a great deal of freedom in the

within the bounds stated in

Step

Such sets I.

Notice

that not o~ly are these operations defined in terms of abstract data objects, distinction

they are also

implicitly

Step

~:

not

Begins

%c

restriction would,

algorithm

4:

Makes

is not

representing ~t

this

rhs)

the

Using

~t

algorithm

the restriction.

similar ordering step to the data structure

grammars.

what the common oFerations on the

Now is the time to give

REFEP

abstraction [S].

complete.

to the allowable grammars.

the c o n c r e t e

(In fact those used were quite complicated the

lists

mcint most of the algorithms as such, is designed and

it ix clear are.

yet

have been possible to use a different

a

of

on lists it is necessary to iDtrodnce a

(no zero length

however,

by mapping state

object in a yon NeQmann machine,

at _this stage of development and avoid

Step

in

development

But notice that, since a list

a convenient data

chosen

this form of

ccnsider representations

step to a concrete representation the

is

later.

sets onto state lists. is

This

to ref. [8 ] in which the a l g o r i t h m s are programmed

with operations on the a b s t r a c t data: is employed

defined.

option.)

Doing

data

(~L/I)

structures

data objects!

HASEE variables with

this prompts a macro style of data

like refo [5] which is similar to that used in ref.

435

4. SU~M~RY

The

aim

of the paper has been to show how a large problem,

this case the development of a compiler, can be decomposed small

steps.

Providing

each

step is adequately

complete design history is thus

obtained.

One

in

into

documented, of

the

a

views

expressed is that each stage of development should be supported by a justification. recorded

in

a

~his

notation

correctness argument. with

a

view

implies

to

cn

Such

that

which

it

correctness

steps

of

design

are

is nossible to base a arguments

are

sought

human readers rather than mechanical theorem

checkers.

The

key

to

making

ahstraction. minimum

properties

more difficult to provide

such

a

an approach practical is the use of

In each of the sections the value of stating has find

been shown. an

construction,

the

Although it is frequently

~FFropriate

abstraction

than

to

the a d v a n t a g e s of the former make the

effort worthwhile.

By

emF]cying

both data and operational abstraction,

the development are

rea]isations

of detail. argument

is based on showing how the from the

it is possib]e

The

of the same algorithm at ever greater

Taking this view, the normal

he retrieved

operations

design

(more)

of

levels

correctness

abstract model can

(more) concrete realisation.

In this

of

a

way

design language to be able to specify

imp!ici%ly and tc ~se very abstract data objects

different

particular

style

tc avoid general equivalence proofs.

requirements

very

a view of

process is obtained where the s u c c e s s i v e stages

from these of ~rogramming languages.

netatiGn

language,

(that of ref.

are

A!tho~gh a

[4]) has been employed as the

it should be emphasised

that it is the method

not a Farticular notation which is being proposed here.

436

Ackncwle~sement

~ost

of

the

ideas

contained in this paper were developed

co]]ahcraticn

with

se~bers

of

Laboratories.

The members and

the

Hsrs!ey

and

vienna

i~ IBM

meetings of IYIP WG 2.3 have also

been a great stimulus.

~eferences

[I]

C.9.%]]en~

9.N.Chap~an

Definition

of Algol 60",

12.1~5 August

[2]

[Ed.)

on

~ormal

Report,

TR

Algorithmic

Languages"

Lecture

Notes

in

1979.

H.Beki&~

P r e s e n t a t i o n on "Semantics of Actions" given at

Newcastle

University,

September

197~.

Ho~eki£ et a] "A Formal ~efinition of PL/I" to be printed

A.Hansal,

Re~ort Gf T ~

"Soft,are

Laboratory Vienna.

Devices for Processing Graphs Using

Facilities",

W.~e~hap]

a~d

Statements

in the VDL ~, I ~

March

[7]

of

SF~iDger-Verlag

IS8, October

PL/I c o m p i l e - t i m e

[6]

IB~ Hursley T e c h n i c a l

Semantics

E.Engeler,

as a Technical

[5]

"A

H.~ekie and K. Walk, " P e r m a l i s a ~ i o n of Storage Properties"

~ a t h e m a % i c s No.

[~]

C.E. Jones,

1972.

i~ ~'Sym~osiu~

[3]

and

C.B. Jcneso

Into Proc Letters.

1974.

~'On the Interpre%ation of Gore Vienna

Note,

LN

25.3.065,

1970.

~.He~hapl

and

C.B. Jones,

Possible Implementations, IBM Vienna Tech,ical

"The with

Hlock Proofs

Report, TN 25.104,

Concept and Some of

~quivalence",

April

1970.

437

[8]

C.A.R.Hoare,

"Proof

?epresentations",

ef

Correctness

Acta Informatica,

Vo!.

of

1,

pp

Eata 271-281,

1972. [9]

C.B.Jo~es

and

Implementation Algorithmic

P. Lucas,

Techniaues",

Languages"

"Proving

(Ed.) E. Engeler,

Lecture Notes in Mathematics [10] C.B. Jones,

"Sufficie ~t

Correctness:

Assignment

Correctness

in "Symposium

on Semantics

Springer-Verlag

No. 188, October

Properties Language",

of of

for

1970.

Implementation

IB~ Hursley

Note,

TN

9002, June 1971. [11 ] C.B.Jcnes,

"Formal Development

Example Based on Harley's S!GPLAN Conference,

of Correct

~ecogniser",

SIG~AN

Algorithms:

presented

at

An AC~

Notices Vol. 7, No.l, January

1972. [12] C.B.Jcnes,

"Formal Development

Technical

Deport,

[13] P.J.Landin,

"R

Correspondence

Church's Lambda-Notation: No.2, February [14] P.Lucas,

of Programs",

Constructive

"On

Development

[16] P.Lucas

Program

Realisations

cf

I mple me nfa tions",

Press,

[17] J.McCarthy, Computation"

60

ACM,

and

Vol.8,

of

the Block

Vienna

Technical

and

the

presented

Stepwise at

IB~

1972.

K. Walk, "On the Formal Description

in Annual Review in Automatic Pergamon

IB~

Correctness

at Pisa University,

and

Algol

1965.

"Two

Conference

Between

Part !", Comm. of

Co,cept and their Equivalence", Report, TR 25.085, 1968. [15] P.Lucas,

!~M Rursley

TR 12. 117, June 1973.

Programming,

Vol.6,

of PL/I" Part 3,

1969. "Towards

a

Mathematical

presented af !FIP Congress

1962.

Science

of

438

[18] R.~il~er,

"An Algebraic

Programs"~

Stanford

[ Ig] P.gosses,

"~he

Definition

University

~athematical

Oxford University Computing

of Simulation

AIM-I~2,

February

Semantics

Laboratory,

of

Between 1971.

Algol

PNG-12,

60",

January

197g. [20] Po

Naur,

"An Experiment

pp 3~7=365, [21 ] J.C.~eynolds,

Languages",

Conference,

August

Languages",

Computers

Series

Brcok!yn,

1971.

[23] C. Strachey,

presented

~icrowave

Vo!.21,

"~bstract

PL/I", TBM Vienna Technical [25] "PL/I BASIS/I"

National

ACM

A

Semantics

of the Symposium

Research

Polytechnic

,'Continuations:

al,

for Higher-Order

25th

"Toward a Mathematical

which can deal with Full Jumps", et

at

in "Proceedings

and Automata",

Symposia

[2~] K.Wa~k

HIT 12,

1972o

and C.Strachey,

for Computer on

Eevelopment",

~'Definitic~al I,terpreters

Programming

[22] DoScott

on Program

1972.

Institute

Institute

Mathematical

of

Semantics

unpublished.

Syntax

and Interpretation

Report, T~ 25.098,

ECMA ANSI working document,

of

lg69.

February

19g~.

439 A~pendix This appendix c o n t a i n s a definition merging

the separate

written

in a style

mathematical

features

of the language obtained

of section

I.

(cf. the tlpe clauses)

The definition

is

which can he read

as

sesantics.

B___STIR A ]%CC~. SYNTAX prOC

:: s-nm:id

St

=

as-st

:: s-lhs:id

goto-st

:: in

call-st

:: s-pn:id

as-st

s-parms:id~

~ gore-st

s-args:id~

:: nmd-st*

nmd-st

:: s-ns:[id S s-body:st

expr

=

inf-expr

:: expr o~ expr

vat-tel

:: id

const

:: INTG

in

}

op

)

~ vat-rot

/'args dcl,

~ const

not further specified

INTG )

DOMAINS

ENV: i d - > S: L O C - >

(LOCI

v~n = I ~ T G

PROC-DEN: AB.

= [ id ]

~ROC-DEN)

VAL

LOC = infinite

cpd-st

~ cpd-st

s-rhs:expr

cpd-st

inf-expr

proc-set s-dcls:id-set

| call-st

set

I ! (LOC I PROC-DEN}~

-> (S -> S ABN)

bT

proc or parma/

440

FUNCTIONS

e v a ! - ~ r o c - d c l (proc) (env) = le_t < i d , p a r ~ - l , ~ r o c s , d c l s g m k - c p d - s t ( n s - l )

> = proc;

!e_t f~den:!)= {!_~! env'

: e~v + ([ par,-![i ]

den-l[i]

~ 11

an= a o ( a 2 ) { n - 1 ) / 2

Somit e r h a l t e n w i r f a l l s n=l f a l l s n gerade sonst

a n = ~(a2) n/2 I ~.(a2) (n'1)/2

(5)

oder umgeschrieben

(5 ~ )

1)

proc p2 = (!~ a, N n) M: if n:l then a elsf

even n then p 2 ( q u a ( a ) , n/2) else a.p2(qua(a), (n-l)/2)

fi,

mit proc

I)

qua = (M a) M:

a.a

.

Man b e a c h t e , dab der 0bergang von (4) nach ( 4 ) nach ( 5 ' )

eigentlich

nur o r t h o g r a p h i s c h e r

bzw. yon (5)

Natur i s t .

455 Wir haben h i e r

e i n e Form d e r P o t e n z b e r e c h n u n g ,

gewendet w i r d ,

da s i c h d e r T e s t ,

die Division gesetze

durch

2

leicht

ob

n

die

h~ufig

geradzahlig

realisieren

lassen.

(P1) und (P2) geben aber Raum auch f u r

ist

anund

Die Rechen-

andere Ver-

fahren:

(6)

p r o c p3 = (M a, N n) M: if

n = I

then a then q u a ( a )

elsf

n = 2

elsf

n mod 3 = 0 then c u b ( p 3 ( a , n / 3 ) )

elsf

n mod 3 = I

then a . c u b ( p 3 ( a ,

(n-l)/3))

else qua(a).cub(p3(a,

(n-2)/3))

fi

mit

proc cub : Auf e i n e

n~here D i s k u s s i o n

gegenUber nicht

von (6)

vorteilhafter

sein

NatUrlich

p2 und p3 ~ q u i v a l e n t

Funktion Gibt

(5')

n~her e i n g e h e n .

ten p l ,

wir

(~ a) M: a . q u a ( a ) .

zum Axiom

M

(A)

Frage, wir

warm (6)

hier

Rechenvorschrif-

dab s i e

dieselbe

ein

Einselement,

dann haben

das Axiom

3eEM: ¥xEM: x . e = e . x : x

Das E i n s e l e m e n t

ist

ein zweites

, dann f o l g t

e'

eindeutig

e.e'

= e, a b e r auch

e.e'

= e', a l s o

e

=

bestimmt,

denn g i b t

es neben

e

e'

Somit k~nnen w i r

mit

Hilfe

gew~hlte Bezeichnung fur eins :

Menge

sind die drei

i n dem S i n n e ,

es nun i n d e r H a l b q r u p p e

Gilt

wollen

repr~sentieren.

zus~tzlich

(E)

oder a u f d i e k~nnte,

Axiom NU{O}

eines dieses

~eEM: VxEM: x - e :

(E)

, so kann d i e

Kennzeichnungsterms eine freiEinselement e.x :

Definition

der nicht-negativen

einfUhren

x.

d e r Potenz a u f d i e

ganzen Z a h l e n a u s g e d e h n t wer-

456 den, was b e k a n n t l i c h a° = e i n s

setzt.

Halbgruppe ist, nition

die

die erstere

letztere.

wobei

Wir

Struktur

dab d i e D e f i -

kompatibel

ist

mit

bekommen a l s o d i e R e c h e n v o r s c h r i f t

p

n = 0 then e i n s e l s e

mit

pl

~

p2

oder

p r o c ( M , N ) M p = p2.

p(a,n)

p3

p

bei

im S i n n e e i n e s d e f e n s i v e n ~ d . h .

werden kann,

auf eine b e r e i t s

d er K o n s t r u k t i o n die

mes k o n s e r v i e r e n d e n , P r o g r a m m i e r s t i l s Auch im H i n b l i c k

fi,

identifiziert

Der R U c k g r i f f

handene R e c h e n v o r s c h r i f t

Korrektheit

sollte

I d e e kommen, etwa a n s t e l l e

sehr z w e c k m ~ i g

man n ~ m l i c h von

p2

yon

mit

q

kann

sein.

ist

d i e s e Vor-

sp~ter wirklich p3

vor-

e i n e s Program-

auf die ~nderungsfreundlichkeit

gehensweise ratsam, fen,

Einselement eine

p r o c q = (~ ao NU{O} n) M: if

z.B.

da~ man

Halbaruppe mit

w i r d man d a r U b e r h i n a u s f o r d e r n ,

der Potenz f u r

der f u r (7)

i n der Weise q e s c h i e h t ,

Da nun j e d e

arbeiten

auf die zu w o l -

so kann d i e ~nderung d a d u r c h g e s c h e h e n , da~ man d i e

Iden-

tit~tsdeklaration p r o c ( H , N ) M p = m2 durch proc(M,N)M p = p3 ersetzt. Nimmt man nun zu den Axiomen

(A)

und

(E)

die

E x i s t e n z des

Inversen dazu, (I)

VxEM: ~yEM: x . y

dann e r h ~ I t

= eins,

man d i e S t r u k t u r

y = yo(x.y') eindeutig

= y.x

bestimmt

zeichnungsterms die

= (y.x)'y ist,

w~hlte Bezeichnung f u r wie bei

Gruppe. Da das I n v e r s e wegen

~ = y'

k~nnen w i r

Intention

seiT~inverses z u o r d n e t , dabei ~hn!ich

einer

formal

wieder mit

der F u n k t i o n ,

Hilfe

eines

Kenn-

d i e jedem Element

niederschreiben

und e i n e f r e i g e -

diese Funktion einfUhren.

Wir v e r f a h r e n

d e r E i n f U h r u n g der B e z e i c h n u n g

eins

457 proc i n v e r s :

(Mx) M: IyEM: y . x = x . y = eins

Wir nennen d i e s e D e k l a r a t i o n diese Deklaration angibt,

nicht

tionswerte.

lediglich

Entsprechend

genschaft eines Objektes, aber k e i n V e r f a h r e n

Menge

M

(die ja

ist

eins

auch d i e D e k l a r a t i o n intentional,

Z

da d o r t

d i e es e i n d e u t i g

der Funkfur die freizwar e i n e E i -

bestimmt,

zur Auswahl d i e s e s Objektes

sowieso n i c h t

procr

aus der

Gruppen a u f d i e Menge

d e r ganzen Zahlen u n t e r Wahrung der K o m p a t i b i l i t ~ t

(8)

angegeben

k o n k r e t gegeben i s t ) .

Die U b l i c h e E r w e i t e r u n g der Potenz f u r dann s o f o r t

, weil

d i e E i g e n s c h a f t der F u n k t i o n s w e r t e

aber e i n V e r f a h r e n zur K o n s t r u k t i o n

g e w ~ h l t e Bezeichnung ist

i n t e n t i o n a 1

liefert

die Rechenvorschrift : if

(Ma,Zn) M: n