The Early History Of Smalltalk by Alan Kay

III. 1970-72--Xerox PARC: The KiddiKomp, miniCOM, and Smalltalk-71

In July 1970, Xerox, at the urging of its chief scientist Jack Goldman, decided to set up a long range research center in Palo Alto, California. In September, George Pake, the former chancellor at Washington University where Wes Clark's ARPA project was sited, hired Bob Taylor (who had left the ARPA office and was taking a sabbatical year at Utah) to start a "Computer Science Laboratory." Bob visited Palo Alto and we stayed up all night talking about it. The Mansfield Amendment was threatening to blindly muzzle the most enlightened ARPA funding in favor of directly military research, and this new opportunity looked like a promising alternative. But work for a company? He wanted me to consult and I asked for a direction. He said: follow your instincts. I immediately started working up a new version of the KiddiKomp that could be made in enough quantity to do experiments leading to the user interface design for the eventual notebook. Bob Barton liked to say that "good ideas don't often scale." He was certainly right when applied to the FLEX machine. The B5000 just didn't directly scale down into a tiny machine. Only the byte-codes did, and even these needed modification. I decided to take another look at Wes Clark's LINC, and was ready to appreciate it much more this time [Clark 1965].

I still liked pattern-directed approaches and OOP so I came up with a language design called "Simulation LOGO" or SLOGO for short (I had a feeling the first versions might run nice and slow). This was to be built into a SONY "tummy trinitron" and would use a coarse bit-map display and the FLEX machine rubber tablet as a pointing device.

Another beautiful system that I had come across was Peter Deutsch's PDP-1 LISP (implemented when he was only 15) [Deutsch 1966]. It used only 2K (18-bit words) of code and could run quite well in a 4K machine (it was its own operating system and interface). It seemed that even more could be done if the system were byte-coded, run by an architecture that was hospitable to dynamic systems, and stuck into the ever larger ROMs that were becoming available. One of the basic insights I had gotten from Seymour was that you didn't have to do a lot to make a computer an "object for thought" for children, but what you did had to be done well and be able to apply deeply.

Right after New Years 1971, Bob Taylor scored an enormous coup by attracting most of the struggling Berkeley Computer Corp to PARC. This group included Butler Lampson, Chuck Thacker, Peter Deutsch, Jim Mitchell, Dick Shoup, Willie Sue Haugeland, and Ed Fiala. Jim Mitchell urged the group to hire Ed McCreight from CMU and he arrived soon after. Gary Starkweather was there already, having been thrown out of the Xerox Rochester Labs for wanting to build a laser printer (which was against the local religion). Not long after, many of Doug Engelbart's people joined up--part of the reason was that they wanted to reimplement NLS as a distributed network system, and Doug wanted to stay with time-sharing. The group included Bill English (the co-inventor of the mouse), Jeff Rulifson, and Bill Paxton.

Almost immediately we got into trouble with Xerox when the group decided that the new lab needed a PDP-10 for continuity with the ARPA community. Xerox (which had bought SDS essentially sight unseen a few years before) was horrified at the idea of their main competitor's computer being used in the lab. They balked. The newly formed PARC group had a meeting in which it was decided that it would take about three years to do a good operating system for the XDS SIGMA-7 but that we could build "our own PDP-10" in a year. My reaction was "Holy cow!" In fact, they pulled it off with considerable panache. MAXC was actually a microcoded emulation of the PDP-10 that used for the first time the new integrated chip memories (1K bits!) instead of core memory. Having practical in-house experience with both of these new technologies was critical for the more radical systems to come.

One little incident of LISP beauty happened when Allen Newell visited PARC with his theory of hierarchical thinking and was challenged to prove it. He was given a programming problem to solve while the protocol was collected. The problem was: given a list of items, produce a list consisting of all of the odd indexed items followed by all of the even indexed items. Newell's internal programming language resembled IPL-V, in which pointers are manipulated explicitly, and he got into quite a struggle to do the program. In 2 seconds I wrote down:

oddsEvens(x) = append(odds(x), evens(x))

the statement of the problem in Landin's LISP syntax--and also the first part of the solution. Then a few seconds later:

where odds(x) = if null(x) v null(tl(x)) then x
                   else hd(x) & odds(ttl(x))
     evens(x) = if null(x) v null(tl(x)) then nil
                   else odds(tl(x))

This characteristic of writing down many solutions in declarative form and having them also be the programs is part of the appeal and beauty of this kind of language. Watching a famous guy much smarter than I struggle for more than 30 minutes to not quite solve the problem his way (there was a bug) made quite an impression. It brought home to me once again that "point of view is worth 80 IQ points." I wasn't smarter but I had a much better internal thinking tool to amplify my abilities. This incident and others like it made it paramount that any tool for children should have great thinking patterns and deep beauty "built-in."
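As a rough illustration of how directly the declarative statement doubles as a program, the definitions above can be transcribed almost line for line into Python. This is only a sketch: it assumes that `hd`, `tl`, and `ttl` mean first element, rest, and rest-of-rest, that `&` prepends an item to a list, and that "odd indexed" counts from 1. The snake_case names are adaptations, not part of the original.

```python
def odds(x):
    """Items in positions 1, 3, 5, ... (counting from 1)."""
    # Mirrors: if null(x) v null(tl(x)) then x else hd(x) & odds(ttl(x))
    if len(x) < 2:
        return list(x)
    return [x[0]] + odds(x[2:])


def evens(x):
    """Items in positions 2, 4, 6, ... (counting from 1)."""
    # Mirrors: if null(x) v null(tl(x)) then nil else odds(tl(x))
    if len(x) < 2:
        return []
    return odds(x[1:])


def odds_evens(x):
    """All odd-indexed items followed by all even-indexed items."""
    return odds(x) + evens(x)


print(odds_evens(["a", "b", "c", "d", "e"]))
```

Note that `evens` needs no logic of its own: the even-positioned items of a list are simply the odd-positioned items of its tail.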

Right around this time we were involved in another conflict with Xerox management, in particular with Don Pendery, the head "planner". He really didn't understand what we were talking about and instead was interested in "trends" and "what was the future going to be like" and how could Xerox "defend against it." I got so upset I said to him, "Look. The best way to predict the future is to invent it. Don't worry about what all those other people might do, this is the century in which almost any clear vision can be made!" He remained unconvinced, and that led to the famous "Pendery Papers for PARC Planning Purposes," a collection of essays on various aspects of the future. Mine proposed a version of the notebook as a "Display Transducer," and Jim Mitchell's was entitled "NLS on a Minicomputer."

Bill English took me under his wing and helped me start my group as I had always been a lone wolf and had no idea how to do it. One of his suggestions was that I should make a budget. I'm afraid that I really did ask Bill, "What's a budget?" I remembered at Utah, in pre-Mansfield Amendment days, Dave Evans saying to me as he went off on a trip to ARPA, "We're almost out of money. Got to go get some more." That seemed about right to me. They give you some money. You spend it to find out what to do next. You run out. They give you some more. And so on. PARC never quite made it to that idyllic standard, but for the first half decade it came close. I needed a group because I had finally realized that I did not have all of the temperaments required to completely finish an idea. I called it the Learning Research Group (LRG) to be as vague as possible about our charter. I only hired people that got stars in their eyes when they heard about the notebook computer idea. I didn't like meetings: didn't believe brainstorming could substitute for cool sustained thought. When anyone asked me what to do, and I didn't have a strong idea, I would point at the notebook model and say, "Advance that." LRG members developed a very close relationship with each other--as Dan Ingalls was to say later: "... the rest has enfolded through the love and energy of the whole Learning Research Group." A lot of daytime was spent outside of PARC, playing tennis, bikeriding, drinking beer, eating Chinese food, and constantly talking about the Dynabook and its potential to amplify human reach and bring new ways of thinking to a faltering civilization that desperately needed it (that kind of goal was common in California in the aftermath of the sixties).

In the summer of '71 I refined the KiddiKomp idea into a tighter design called miniCOM. It used a bit-slice approach like the NOVA 1200, had a bit-map display, a pointing device, a choice of "secondary" (really tertiary) storages, and a language I now called "Smalltalk"--as in "programming should be a matter of ..." and "children should program in ...". The name was also a reaction against the "IndoEuropean god theory" where systems were named Zeus, Odin, and Thor, and hardly did anything. I figured that "Smalltalk" was so innocuous a label that if it ever did anything nice people would be pleasantly surprised.

This Smalltalk language (today labeled -71) was very influenced by FLEX, PLANNER, LOGO, META II, and my own derivatives from them. It was a kind of parser with object-attachment that executed tokens directly. (I think the awkward quoting conventions come from META.) I was less interested in programs as algebraic patterns than I was in a clear scheme that could handle a variety of styles of programming. The patterned front-end allowed simple extension, patterns as "data" to be retrieved, a simple way to attach behaviors to objects, and a rudimentary but clear expression of its eval in terms that I thought children could understand after a few years' experience with simpler programming. Program storage was sorted into a discrimination net and evaluation was straightforward pattern-matching.

Smalltalk-71 Programs

to T 'and' :y do 'y'
to F 'and' :y do F

to 'factorial' 0 is 1
to 'factorial' :n do 'n*factorial n-1'

to 'fact' :n do 'to 'fact' n do factorial n. ^ fact n'

to :e 'is-member-of' [] do F
to :e 'is-member-of' :group
          do 'if e = firstof group then T
              else e is-member-of rest of group'

to 'cons' :x :y is self
to 'hd' ('cons' :a :b) do 'a'
to 'hd' ('cons' :a :b) '<-' :c do 'a <- c'
to 'tl' ('cons' :a :b) do 'b'
to 'tl' ('cons' :a :b) '<-' :c do 'b <- c'

to :robot 'pickup' :block
         do 'robot clear-top-of block.
             robot hand move-to block.
             robot hand lift block 50.
             to 'height-of' block do 50'
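To make the idea of pattern-directed evaluation a little more concrete, here is a minimal Python sketch of rules being tried against an incoming message. It is not a reconstruction of Smalltalk-71: the names `to`, `match`, and `send` are hypothetical, patterns are plain tuples, variables are written as strings beginning with `:`, and rules are tried by a linear scan in definition order rather than through a discrimination net.

```python
rules = []  # (pattern, action) pairs, tried in the order they were defined


def to(pattern, action):
    """Register a rule, loosely analogous to a `to ... do ...` definition."""
    rules.append((pattern, action))


def match(pattern, message):
    """Return a dict of variable bindings if message fits pattern, else None."""
    if len(pattern) != len(message):
        return None
    bindings = {}
    for p, m in zip(pattern, message):
        if isinstance(p, str) and p.startswith(":"):
            bindings[p[1:]] = m      # a variable such as ":n" binds anything
        elif p != m:
            return None              # a literal token must match exactly
    return bindings


def send(*message):
    """Evaluate a message by running the first rule whose pattern matches."""
    for pattern, action in rules:
        bindings = match(pattern, message)
        if bindings is not None:
            return action(**bindings)
    raise LookupError(f"no rule matches {message!r}")


# Assumed translations of two of the listings above.
to(("factorial", 0), lambda: 1)
to(("factorial", ":n"), lambda n: n * send("factorial", n - 1))
to((True, "and", ":y"), lambda y: y)
to((False, "and", ":y"), lambda y: False)

print(send("factorial", 5))
print(send(True, "and", False))
```

Because the more specific `factorial 0` rule is registered first, it wins over the general `:n` rule. A discrimination net would reach the same rule without scanning every pattern.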

As I mentioned previously, it was annoying that the surface beauty of LISP was marred by some of its key parts having to be introduced as "special forms" rather than as its supposed universal building block of functions. The actual beauty of LISP came more from the promise of its metastructures than its actual model. I spent a fair amount of time thinking about how objects could be characterized as universal computers without having to have any exceptions in the central metaphor. What seemed to be needed was complete control over what was passed in a message send; in particular when and in what environment did expressions get evaluated?

An elegant approach was suggested in a CMU thesis of Dave Fisher [Fisher 70] on the synthesis of control structures. ALGOL60 required a separate link for dynamic subroutine linking and for access to static global state. Fisher showed how a generalization of these links could be used to simulate a wide variety of control environments. One of the ways to solve the "funarg problem" of LISP is to associate the proper global state link with expressions and functions that are to be evaluated later so that the free variables referenced are the ones that were actually implied by the static form of the language. The notion of "lazy evaluation" is anticipated here as well.
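The effect of carrying the static environment along with a function can be seen in any language with lexical closures. The following Python sketch is only an analogy to the mechanism described above, not Fisher's scheme: `make_adder`, `call_with_other_n`, and `delay` are hypothetical names chosen for illustration.

```python
def make_adder(n):
    # `n` is a free variable of `add`. The closure keeps a link to the
    # environment in which `add` was defined (the "static" link).
    def add(x):
        return x + n
    return add


def call_with_other_n(f):
    n = 1000  # a different `n` that exists only in the caller's environment
    return f(1)


add5 = make_adder(5)
# With the static link attached, the call sees n = 5, not the caller's 1000.
print(call_with_other_n(add5))


def delay(expr):
    """Wrap a zero-argument function so it is evaluated at most once, on demand."""
    cache = []

    def force():
        if not cache:
            cache.append(expr())  # evaluated in the environment where expr was written
        return cache[0]

    return force


lazy_sum = delay(lambda: sum(range(10)))  # nothing is computed yet
print(lazy_sum())                          # computed here, on first use
```

The `delay` helper shows how the same link gives a simple form of lazy evaluation: an expression packaged with its environment can be run later, and still refer to the variables it was written against.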

Nowadays this approach would be called reflective design. Putting it together with the FLEX models suggested that all that should be required for "doing LISP right" or "doing OOP right" would be to handle the mechanics of invocations between modules without having to worry about the details of the modules themselves. The difference between LISP and OOP (or any other system) would then be what the modules could contain. A universal module (object) reference--a la B5000 and LISP--and a message holding structure--which could be virtual if the senders and receivers were sympatico--that could be used by all would do the job.

If all of the fields of a messenger structure were enumerated according to this view, we would have:

GLOBAL: the environment of the parameter values
SENDER: the sender of the message
RECEIVER: the receiver of the message
REPLY-STYLE: wait, fork, ...?
STATUS: progress of the message
REPLY: eventual result (if any)
OPERATION SELECTOR: relative to the receiver
# OF PARAMETERS:
P1:
...:
Pn:

This is a generalization of a stack frame, such as used by the B5000, and very similar to what a good intermodule scheme would require in an operating system such as CAL-TSS--a lot of state for every transaction, but useful to think about.
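One way to picture the field list above is as a record that travels with each message send. The Python sketch below is an assumption-laden illustration rather than any historical implementation: the class and field names, the `ReplyStyle` values, the status strings, and the `deliver` helper (which simply looks the selector up on the receiver) are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class ReplyStyle(Enum):
    WAIT = "wait"   # sender blocks until the reply arrives
    FORK = "fork"   # sender continues without waiting


@dataclass
class Message:
    global_env: dict          # GLOBAL: environment of the parameter values
    sender: Any               # SENDER
    receiver: Any             # RECEIVER
    reply_style: ReplyStyle   # REPLY-STYLE
    selector: str             # OPERATION SELECTOR, relative to the receiver
    params: list = field(default_factory=list)   # P1 ... Pn
    status: str = "pending"   # STATUS (these status values are assumed)
    reply: Optional[Any] = None                  # REPLY: eventual result, if any

    @property
    def n_params(self) -> int:
        """# OF PARAMETERS, derived from the parameter list."""
        return len(self.params)


def deliver(msg: Message) -> Message:
    """Resolve the selector on the receiver and record the result in the message."""
    handler = getattr(msg.receiver, msg.selector)
    msg.status = "running"
    msg.reply = handler(*msg.params)
    msg.status = "done"
    return msg


msg = deliver(Message({}, "caller", [3, 1, 2], ReplyStyle.WAIT, "count", [1]))
print(msg.status, msg.reply, msg.n_params)
```

Seen this way, an ordinary stack frame is the special case in which the reply style is always "wait" and most of the fields are implicit.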

Much of the pondering during this state of grace (before any workable implementation) had to do with trying to understand what "beautiful" might mean with reference to object-oriented design. A subjective definition of a beautiful thing is fairly easy but is not of much help: we think a thing beautiful because it evokes certain emotions. The cliche has it like "in the eye of the beholder" so that it is difficult to think of beauty as other than a relation between subject and object in which the predispositions of the subject are all important.

If there are such things as universally appealing forms then we can perhaps look to our shared biological heritage for the predispositions. But, for an object like LISP, it is almost certain that most of the basis of our judgement is learned and has much to do with other related areas that we think are beautiful, such as much of mathematics.

One part of the perceived beauty of mathematics has to do with a wondrous synergy between parsimony, generality, enlightenment, and finesse. For example, the Pythagorean Theorem is expressible in a single line, is true for all of the infinite number of right triangles, is incredibly useful in understanding many other relationships, and can be shown by a few simple but profound steps.

When we turn to the various languages for specifying computations we find many to be general and a few to be parsimonious. For example, we can define universal machine languages in just a few instructions that can specify anything that can be computed. But most of these we would not call beautiful, in part because the amount and kind of code that has to be written to do anything interesting is so contrived and turgid. A simple and small system that can do interesting things also needs a "high slope"--that is a good match between the degree of interestingness and the level of complexity needed to express it.

A fertilized egg that can transform itself into the myriad of specializations needed to make a complex organism has parsimony, generality, enlightenment, and finesse--in short, beauty, and a beauty much more in line with my own esthetics. I mean by this that Nature is wonderful both at elegance and practicality--the cell membrane is partly there to allow useful evolutionary kludges to do their necessary work and still be able to act as a component by presenting a uniform interface to the world.

One of my continual worries at this time was about the size of the bit-map display. Even if a mixed mode was used (between fine-grained generated characters and coarse-grained general bit-map for graphics) it would be hard to get enough information on the screen. It occurred to me (in a shower, my favorite place to think) that FLEX-type windows on a bit-map display could be made to appear as overlapping documents on a desktop. When an overlapped one was refreshed it would appear to come to the top of the stack. At the time, this did not appear as the wonderful solution to the problem but it did have the effect of magnifying the effective area of the display enormously, so I decided to go with it.
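The observation that a refreshed window "comes to the top" falls out naturally if windows are simply redrawn onto the bit-map in order. The Python sketch below illustrates that idea under stated assumptions: the `Desktop` class, its method names, and the use of a window name per cell in place of real pixels are all hypothetical simplifications, not a description of the FLEX or PARC software.

```python
class Desktop:
    """Windows painted back to front onto a shared grid standing in for a bit-map."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.windows = []  # back-to-front order; the last entry is on top

    def open(self, name, x, y, w, h):
        self.windows.append((name, x, y, w, h))

    def refresh(self, name):
        """Redraw a window last, so it appears to come to the top of the stack."""
        for i, win in enumerate(self.windows):
            if win[0] == name:
                self.windows.append(self.windows.pop(i))
                return
        raise KeyError(name)

    def render(self):
        """Paint every window in order; later windows overwrite earlier ones."""
        screen = [[None] * self.width for _ in range(self.height)]
        for name, x, y, w, h in self.windows:
            for row in range(max(y, 0), min(y + h, self.height)):
                for col in range(max(x, 0), min(x + w, self.width)):
                    screen[row][col] = name
        return screen


desk = Desktop(4, 2)
desk.open("a", 0, 0, 3, 2)
desk.open("b", 1, 0, 3, 2)
print(desk.render()[0])   # "b" overlaps most of "a"
desk.refresh("a")
print(desk.render()[0])   # "a" now overlaps "b"
```

The two windows together cover more area than the screen has, which is the sense in which overlapping magnifies the effective display.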

To investigate the use of video as a display medium, Bill English and Butler Lampson specified an experimental character generator (built by Roger Bates) for the POLOS (PARC OnLine Office System) terminals. Gary Starkweather had just gotten the first laser printer to work and we ran a coax over to his lab to feed him some text to print. The "SLOT machine" (Scanning Laser Output Terminal) was incredible. The only Xerox copier Gary could get to work on went at 1 page a second and could not be slowed down. So Gary just made the laser run at that rate with a resolution of 500 pixels to the inch!

The character generator's font memory turned out to be large enough to simulate a bit-map display if one displayed a fixed "strike" and wrote into the font memory. Ben Laws built a beautiful font editor and he and I spent several months learning about the peculiarities of the human visual system (it is decidedly non-linear). I was very interested in high-quality text and graphical presentations because I thought it would be easier to get the Dynabook into schools as a "trojan horse" by simply replacing school books rather than to try to explain to teachers and school boards what was really great about personal computing.

Things were generally going well all over the lab until May of '72 when I tried to get resources to build a few miniCOMs. A relatively new executive ("X") did not want to give them to me. I wrote a memo explaining why the system was a good idea (see Appendix II), and then had a meeting to discuss it. "X" shot it down completely, saying among other things that we had used too many green stamps getting Xerox to fund the time-shared MAXC and this use of resources for personal machines would confuse them. I was shocked. I crawled away back to the experimental character generator and made a plan to get 4 more made and hooked to NOVAs for the initial kid experiments.

I got Steve Purcell, a summer student from Stanford, to build my design for bit-map painting so the kids could sketch as well as display computer graphics. John Shoch built a line drawing and gesture recognition system (based on Ledeen's [Newman and Sproull 72]) that was integrated with the painting. Bill Duvall of POLOS built a miniNLS that was quite remarkable in its speed and power. The first overlapping windows started to appear. Bob Shur (with Steve Purcell's help) built a 2 1/2 D animation system. Along with Ben Laws' font editor, we could give quite a smashing demo of what we intended to build for real over the next few years. I remember giving one of these to a Xerox executive, including doing a portrait of him in the new painting system, and wound it up with a flourish declaring: "And what's really great about this is that it only has a 20% chance of success. We're taking risk just like you asked us to!" He looked me straight in the eye and said, "Boy, that's great, but just make sure it works." This was a typical executive notion about risk. He wanted us to be in the "20%" one hundred percent of the time.

That summer while licking my wounds and getting the demo simulations built and going, Butler Lampson, Peter Deutsch, and I worked out a general scheme for emulated HLL machine languages. I liked the B5000 scheme, but Butler did not want to have to decode bytes, and pointed out that since an 8-bit byte had 256 total possibilities, what we should do is map different meanings onto different parts of the "instruction space." This would give us a "poor man's Huffman code" that would be both flexible and simple. All subsequent emulators at PARC used this general scheme.
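The idea of carving up the 256 byte values can be sketched as follows. The ranges and operation names here are invented purely for illustration, since the text does not give the actual PARC layouts; the point is only that common operations with small operands get large slices of the space, so the whole instruction fits in one byte with no separate operand field to decode.

```python
# Hypothetical partition of the 256 possible byte values.
# Each entry: (first code, last code, operation name). Ranges are assumed.
INSTRUCTION_SPACE = [
    (0x00, 0x7F, "push_literal"),  # 128 codes: very common, operand 0..127
    (0x80, 0xBF, "load_local"),    #  64 codes: operand 0..63
    (0xC0, 0xEF, "send"),          #  48 codes: operand 0..47
    (0xF0, 0xFF, "special"),       #  16 codes: rare operations
]


def decode(byte):
    """Map a byte to (operation, operand) by the region of the space it falls in."""
    if not 0 <= byte <= 0xFF:
        raise ValueError(f"not a byte: {byte!r}")
    for first, last, name in INSTRUCTION_SPACE:
        if first <= byte <= last:
            return name, byte - first
    raise AssertionError("instruction space does not cover all 256 values")


for b in (0x05, 0x83, 0xC0, 0xFF):
    print(hex(b), decode(b))
```

Giving frequent operations more of the space plays the role that short codes play in a true Huffman code, while every instruction stays exactly one byte wide, which keeps the emulator's dispatch simple.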

I also took another pass at the language for the kids. Jeff Rulifson was a big fan of Piaget (and semiotics) and we had many discussions about the "stages" and what iconic thinking might be about. After reading Piaget and especially Jerome Bruner, I was worried that the directly symbolic approach taken by FLEX, LOGO (and the current Smalltalk) would be difficult for the kids to process since evidence existed that the symbolic stage (or mentality) was just starting to switch on. In fact, all of the educators that I admired (including Montessori, Holt, and Suzuki) seemed to call for a more figurative, more iconic approach. Rudolf Arnheim [Arnheim 69] had written a classic book about visual thinking, and so had the eminent art critic Gombrich [Gombrich **]. It really seemed that something better needed to be done here. GRAIL wasn't it, because its use of imagery was to portray and edit flowcharts, which seemed like a great step backwards. But Rovner's AMBIT-G held considerably more promise [Rovner 68]. It was kind of a visual SNOBOL [Farber 63] and the pattern matching ideas looked like they would work for the more PLANNER-like scheme I was using.

Bill English was still encouraging me to do more reasonable appearing things to get higher credibility, like making budgets, writing plans and milestone notes, so I wrote a plan that proposed over the next few years that we would build a real system on the character generators cum NOVAs that would involve OOP, windows, painting, music, animation, and "iconic programming." The latter was deemed to be hard and would be handled by the usual method for hard problems, namely, give them to grad students.


Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
HOPL-II/4/93/MA, USA
© 1993 ACM 0-89791-571-2/93/0004/0069...$1.50
