Artificial Intelligence

I. What is Artificial Intelligence?

According to John McCarthy, the man who coined the term, “[Artificial Intelligence] is the science and engineering of making intelligent machines, especially intelligent computer programs” where “intelligence is the computational part of the ability to achieve goals in the world.” 

An intelligent machine can be a machine that mimics the way humans think, feel, move and make decisions.  It could also act in conjunction with a human to complement and improve their ability to do those things.  There are many possible approaches to the challenge and the definition has never had a static solution.

Even the name 'Artificial Intelligence' has been subject to argument, as some researchers feel it sounds unscientific.  They argue the word 'artificial' suggests lesser or fake intelligence, more like science fiction than academic research.  They prefer to use terms like computational neuroscience or emphasize the particular subset of the field they work in, like semantic logic or machine learning.  Nevertheless, the term 'Artificial Intelligence' has gained popular acceptance and graces the names of various international conferences and university course offerings.

This paper does not attempt to come up with a precise characterization of the field.  Instead, it examines what Artificial Intelligence has been so far by leading the reader through an admittedly non-comprehensive collection of projects and paradigms, especially at MIT and in the United States.  

Unlike many fields, Artificial Intelligence has not had a linear progression and its research and breakthroughs have not grown toward an easily identified Sun.  Computing, in contrast, has been noted for its exponential growth and improvement characterized by Moore's law, “the empirical observation that the complexity of integrated circuits, with respect to minimum component cost, doubles every 24 months” (wikipedia).  The path of AI, however, more resembles the intertwining world wide web, spiraling out and looping back in many directions.  

Here you will find a rough chronology of some of AI's most influential projects.  It is intended for both non-scientists and those ready to continue experimentation and research tomorrow.  Included is a taste of who the main players have been, the concepts they and their projects have explored and how the goals of AI have evolved and changed over time.  Many will be surprised that some of what we now consider obvious tools, like search engines, spell check and spam filters, are all outgrowths of AI research.  

II. Foundations

Though the term 'Artificial Intelligence' did not exist until 1956, the advances and ideas from the preceding decades evoked many of the future themes.  At a time when digital computers had only just been invented, using programming to emulate human intelligence was barely even imaginable.

Understanding the context into which Artificial Intelligence was born helps illustrate the technological obstacles that researchers had to overcome in the search for machine intelligence as well as elucidating many of the original paths.

Beyond Number-Crunchers: Programmable Machines

The idea of machines that could not just process, but also figure out how to solve equations was seen as the first step in creating a digital system that could emulate brain processes and living behavior.  What would it mean to have a machine that could figure out how to solve equations?  Let's go through an example using basic algebra. 

In order to create a machine that can solve more complicated equations than 2+2=4, a machine needs to have a strategy for deciding on the multiple steps necessary to come up with a solution.  For example, if you told the machine, X+Y=7 and X=3, you would like the machine to  deduce that 3 + Y = 7, then that Y = 7 – 3, then that 7 – 3 = 4, and finally that Y = 4.  Assuming someone has already told the machine what '+', '-', and '=' mean, you would traditionally tell the machine how to solve those simple problems by defining a step-by-step procedure called a program.
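
The step-by-step procedure described above can be sketched as a short program.  This is only an illustration in modern Python, not a reconstruction of any historical machine; the function name solve_for_y and the idea of recording each deduction as a line of text are assumptions made for the example.

```python
def solve_for_y(total, x):
    """Solve x + Y = total for Y, recording each deduction step as text."""
    steps = [f"{x} + Y = {total}"]        # substitute the known value of X
    steps.append(f"Y = {total} - {x}")    # move the known term to the other side
    y = total - x                         # carry out the subtraction
    steps.append(f"{total} - {x} = {y}")
    steps.append(f"Y = {y}")              # state the conclusion
    return y, steps


# The example from the text: X + Y = 7 and X = 3.
y, steps = solve_for_y(7, 3)
for line in steps:
    print(line)
```

Note that the program only follows the fixed recipe it was given; it has no strategy for deciding which steps to take when faced with a different kind of equation.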

As early as 1930, Vannevar Bush of MIT published a paper about a Differential Analyzer, doing just that for another class of mathematical problems.  Computers had not been invented at that point, but his paper nonetheless described a set of rules that would automatically solve differential equations if followed precisely. 

The next major idea came in Alan Turing's 1937 paper about a general automatic programmable system, known as the Turing Machine.  This concept established that building many different kinds of programmable devices out of different materials is redundant, because any one of them can be set up to mimic the input-output characteristics of any other. 
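
A rough way to see this idea is a single simulator that can run any machine handed to it as a table of rules.  The sketch below is a simplified, assumed formulation (one tape, a sparse dictionary for the cells, a step limit for safety), and the flip_bits rule table is a hypothetical example rather than anything from Turing's paper.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R" and the state "halt" stops the machine.
    """
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: flip every bit, then halt at the first blank cell.
flip_bits = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

result = run_turing_machine(flip_bits, "1011")
print(result)
```

Changing the behavior means changing only the rule table, never the simulator, which is the sense in which one programmable device can stand in for many.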

Bush and Turing did not yet know how one would go about actually making that universal programmable device, but in 1948 Claude Shannon would publish “A Mathematical Theory of Communication,” the paper that founded information theory and set up the foundations for using digital electronics to represent information.  This idea became the basis of using machines to manipulate symbols (like the X and Y in the example above) to execute complex operations.


Early 'Computers' were Room-Sized Calculators

Technology has improved by leaps and bounds since the start of World War II when computers were first coming into use.  The first electronic computer, ABC, came in 1940, while the first programmable American computer, Mark I, followed in 1944.

Constructed from wires, magnetic cores and vacuum tubes, they were huge devices that literally filled rooms.  They had about the functionality of a modern-day scientific calculator, but no monitor or keyboard.  Instead, if you wanted the computer to compute the value of a calculation, you would punch buttons in sequence or feed in stacks of punch cards, and it would eventually print you back the results.

A description of computing pioneer Grace Hopper's experience with a computer was representative of the kinds of problem computers were used for at the time:

[Hopper] was commissioned a lieutenant in July 1944 and reported to the Bureau of Ordnance Computation Project at Harvard University, where she was the third person to join the research team of professor (and Naval Reserve lieutenant) Howard H. Aiken. She recalled that he greeted her with the words, "Where the hell have you been?" and pointed to his electromechanical Mark I computing machine, saying "Here, compute the coefficients of the arc tangent series by next Thursday."
Hopper plunged in and learned to program the machine, putting together a 500-page Manual of Operations for the Automatic Sequence-Controlled Calculator in which she outlined the fundamental operating principles of computing machines. By the end of World War II in 1945, Hopper was working on the Mark II version of the machine.  (Maisel)
Grace Hopper will also be remembered for discovering and naming the first computer “bug” in 1945 as well as inventing the idea of a computer compiler, a program that can translate higher level programming languages into machine language that the computer knows how to execute.

The other revolutionary electronic creation of the decade was the transistor, created by Bell Labs in 1947, soon replacing vacuum tubes.  A tribute to its importance according to wikipedia, an open encyclopedia that all can edit (see bibliography), follows:

The transistor is considered by many to be one of the greatest inventions in modern history, ranking in importance with the printing press, automobile and telephone. It is the key active component in practically all modern electronics.

Its importance in today's society rests on its ability to be mass produced using a highly automated process (fabrication) that achieves vanishingly low per-transistor costs... The transistor's low cost, flexibility and reliability have made it an almost universal device for non-mechanical tasks, such as digital computing.

Analog Intelligence: Emulating Brain Function

Before the new digital technology caught on, many were asking themselves a question that has recently been having a resurgence in Artificial Intelligence: if we know how the brain works, why not make machines based on the same principles?  While nowadays most people try to create a programmed representation with the same resulting behavior, early researchers thought they might create non-digital devices that also had the same electronic characteristics on the way to that end.  In other words, while new approaches try to represent the mind, analog approaches tried to imitate the brain itself.

Modern systems also look to the brain for inspiration though ultimately do the actual programming using a computer, but early researchers believed we could create analog circuits that mimic the electrical behavior of the brain identically and therefore fundamentally replicate actions and intelligence.  Their methodology rested on the feedback and control heralded in Norbert Wiener's 1948 paper Cybernetics.

Examples of these analog brains ranged from Shannon's mechanical 'mice', which could remember which path to take through a maze to get to the 'cheese', to the better known Grey Walter Turtles, with wandering, home-seeking and curiosity drives that depended on their energy levels.  These machines relied on cleverly arranged circuits using resistors, capacitors and basic subcomponents that automatically behave in a certain way based on sensor input or charge levels.

III. 1950's: Establishing the Field

The fifties saw the growth of an AI community, experimentation with the first digital AI machines, the inaugural Dartmouth Artificial Intelligence Conference, and the creation of one of its strongest initial proponents, DARPA.

The Turing Test: An AI Legend

How can one know if a machine is intelligent?  While the larger issue of defining the field is subject to debate, the most famous attempt to answer the intelligence question is the Turing Test.  With AI's history of straddling a huge scope of approaches and fields, everything from abstract theory and blue-sky research to day-to-day applications, the question of how to judge progress and 'intelligence' becomes very difficult.  Rather than get caught up in a philosophical debate, Turing suggested we look at a behavioral example of how one might judge machine intelligence. 

The actual test involves examining a transcript of an on screen conversation between a person and a computer, much like instant messenger.  If a third party could not tell which one was the human, the machine would then be classified as intelligent. The test was intended merely to illustrate a point, but has since ascended to the level of legend in the AI community. 

Even today, the Loebner Prize uses the Turing Test to evaluate artificial conversationalists and awards a bronze medal annually to the “most human” computer.  Many former winners are available to talk to online.  The organization also offers a $100,000 prize, which has yet to be won, to the program that can pass the test.  

Though its methodology and exclusive focus on human-style communication is contentious, one can not learn about AI without knowing what the Turing Test is.  It is a common feature in any AI journal, class or conference and still serves to motivate the AI community though its literal goal is still far from being achieved.

Thinking Machine: The Logical Theorist

Early in 1956, two young CMU researchers, Al Newell and Herbert Simon, implemented a working AI machine.  Their 'Logical Theorist' had a built-in system that could deduce geometric proofs. 

In honor of its 50-year anniversary, the story was reported in this year's Pittsburgh Post-Gazette:

“Over the Christmas holiday,” Dr. Simon famously blurted to one of his classes at Carnegie Institute of technology, “Al Newell and I invented a thinking machine...”  Dr. Simon concentrated on developing “heuristics,” or rules of thumb, that humans use to solve geometry problems and that could be programmed into a computer, while Dr. Newell and Mr. Shaw in California, developed a programming language that could mimic human memory processes...

Their machine used symbolic reasoning to solve systems of equations, pioneering an AI methodology that involved programming knowledge and information directly into a computer. 

The Dartmouth Artificial Intelligence Conference and General Problem Solver

The 1956 Dartmouth Artificial Intelligence Conference originated with a proposal submitted to the Rockefeller Foundation by McCarthy, Minsky, Rochester and Shannon, requesting funding for a summer retreat dedicated to exploring the potential of the field whose name it coined. 

It is interesting to note how relevant the seven research pillars they outlined still are: 

  1. Automatic Computers
  2. How Can a Computer be Programmed to Use a Language
  3. Neuron Nets
  4. Theory of the Size of a Calculation
  5. Self-Improvement
  6. Abstractions
  7. Randomness and Creativity.

Though they made little concrete progress that summer, it marked the start of a new age and McCarthy's use of the controversial name 'Artificial Intelligence' stuck.

Given that it was the first working implementation of digital AI, it might seem curious that the Logical Theorist project did not seem to significantly impress the other people at the Dartmouth Conference.  One explanation is that Newell and Simon had been invited to the conference almost as an afterthought, as they were less well known than many of the other attendees.  But by 1957, the same duo created a new machine called the General Problem Solver (GPS) that they heralded as an epochal landmark in intelligent machines, believing that it could solve any problem given a suitable description. 

While its ability to solve complex problems was disappointing, the reasons for which will be discussed below, the GPS did explore and formalize the problem-solving process and helped researchers better understand the issues at stake in achieving an effective program.  It was also the first program that aimed at a general problem-solving framework.  This inspired much further research.

Optimism about the rate of AI Progress: GPS and NP-hard Problems

In retrospect, other established researchers admit that following the Dartmouth conference, they mostly pursued other routes that did not end up working as well as the Newell-Simon GPS paradigm.  Later they acknowledged Newell and Simon's original insights and many joined the symbolic reasoning fold (McCorduck). 

This reaction fits into a reputation that this field has of unrealistic predictions of the future.  Unfortunately, many see AI as a big disappointment, despite the many ways its advances have now become a fundamental part of modern life.  If you look at the rash claims of its original proponents, however, such a conclusion may not seem far fetched. 

A particularly exuberant example of this disconnection was Newell's claim after the creation of General Problem Solver that “there are now in the world machines that think, that learn and create.  Moreover, ...in a visible future – the range of problems they can handle will be coextensive with the range to which the human mind has been applied.” (Norvig)

One limitation he overlooked was the curse of 'NP-hard' problems.  In these cases, it is not that one cannot write an appropriate program to find a solution, but rather that it will, in effect, never return an answer because the computation will take so long.  For every known general method of solving these problems, execution time grows exponentially with the size of the input, and it turns out there are many, many problems with these characteristics.  In other words, if two inputs take 2^2 = 4 seconds to compute, three inputs might take 2^3 = 8 seconds, eight might take 2^8 = 256 seconds and so forth.
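
To make the exponential growth concrete, here is a minimal sketch of an exhaustive search for the subset-sum problem, a standard example of this class of problems.  The choice of subset sum and the function name subset_sum_exhaustive are assumptions made for illustration; the point is only that when no answer exists, a search that tries every subset of n numbers must examine 2^n of them.

```python
from itertools import combinations


def subset_sum_exhaustive(numbers, target):
    """Try every subset of numbers; return (matching_subset_or_None, subsets_examined)."""
    examined = 0
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            examined += 1
            if sum(subset) == target:
                return subset, examined
    return None, examined


# Eight numbers whose total is 255, so no subset can reach 1000
# and the search is forced to examine every one of the 2^8 subsets.
found, examined = subset_sum_exhaustive([1, 2, 4, 8, 16, 32, 64, 128], 1000)
print(found, examined)
```

Each additional number doubles the work in the worst case, which is why inputs of even modest size can put an answer out of practical reach.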

Modern researchers seem to have a more cautious approach to speculations about the future, having learned from history.  Some see AI research as a way to appreciate and understand the complexity of the human mind.  It has certainly been much harder than most realized to achieve even a small part of what organic brains can do.  When I asked what advice they would give a novice AI researcher, one AAAI Fellow recommended, “Choose an easy problem.  Then make it simpler. It will always turn out to be much harder than you'd expect.”

ARPA: Early AI's Fairy God Mother

If the Turing Test was the spirit-leader of early AI research, ARPA was the day-job that paid the bills, although one of its original heads, J. C. R. Licklider, also encouraged many new conceptualizations of the purpose and potential of technology.  Licklider's paper, Man-Computer Symbiosis, outlined a way of envisioning the human-technology relationship, in which a machine assists and works with a human to accomplish tasks.  The extensive resources that the organization provided were indispensable to the start of the field.

Short for the Advanced Research Projects Agency, and a part of the Defense Department, ARPA (now known as DARPA) was created in 1958 after Sputnik I went into orbit, with the explicit purpose of catching up with Russian space capabilities.  When Eisenhower decided that space should be civilian-controlled and founded NASA, however, ARPA found computing to be its new niche.

It began operations by contributing large research block grants starting in 1963 and supported a range of AI and computer science efforts over the years, with MIT, Stanford and Carnegie Mellon among the first recipients.

LISP: The language that made AI possible

John McCarthy introduced LISP in 1958, heralded as the language that made AI programming possible.  LISP is special because it was the first language that allowed information to be stored as lists of objects rather than just lists of numbers.  An object is essentially a placeholder or symbol that is defined somewhere else.  This structuring makes it possible to program recursive functions and abstract ideas directly into the machine.
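
The idea of lists that hold symbols, processed by a recursive function, can be sketched as follows.  This is written in Python rather than LISP, so it only imitates the style: nested Python lists stand in for LISP lists, strings stand in for symbols, and the names evaluate, OPS and env are assumptions for the example.

```python
import operator

# Symbols for the operations this sketch understands.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Recursively evaluate a nested prefix expression such as ["+", "x", 2]."""
    if isinstance(expr, (int, float)):
        return expr                     # a number is its own value
    if isinstance(expr, str):
        return env[expr]                # a symbol is defined somewhere else
    op, *args = expr                    # a list: operator followed by operands
    values = [evaluate(arg, env) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


# x + (2 * y), with the symbols x and y defined in a separate table.
expr = ["+", "x", ["*", 2, "y"]]
print(evaluate(expr, {"x": 3, "y": 4}))
```

Because an expression is itself just a list, the same list-handling tools that manipulate data can also build or rewrite expressions.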

As part of the shift from batch-processing to interactive computers, McCarthy designed LISP to have an interactive environment, in which one could see errors in the code in real time.  The capability of evaluating and seeing on-screen feedback one function at a time, rather than having to run the entire file, can greatly facilitate finding bugs in one's code.

While many other early languages have died out, LISP remains the most common programming language for Artificial Intelligence in the United States and is used on par with Prolog in Europe and Japan.  According to Peter Norvig, Director of Research at Google and author of a popular textbook on the subject, one reason for the continuing popularity of Lisp is the flexibility of its simple list data structure.  In his words, “The list is a very versatile data structure, and while lists can be implemented in any language, Lisp makes it easy to use them.  Many AI applications involve lists of constantly changing size, making fixed-length data structures like vectors harder to use.” (Norvig 25)

It is also easily extensible because there are no limitations on how one defines and manipulates both programs and data, so one can easily rename or add functions to better fit the problem at hand.  Its simple elegance has survived the test of time while capturing all the necessary functionality; functions, data structures and a way to put them together. 

Research at MIT: The Artificial Intelligence Project

The first coordinated AI research at MIT began in 1959 when John McCarthy and Marvin Minsky founded the Artificial Intelligence Project as part of both the Research Laboratory for Electronics (RLE) in Building 26 and the Computation Center.  They were junior faculty at the time and had known each other since graduate school at Princeton, where Minsky had studied artificial neural networks (cybernetics).  A theoretician, he immediately began work on theories of computation relevant to creating intelligent machines in Computation: Finite and Infinite Machines.

AI and computation have long had a mutually inspiring relationship.  Much AI research could not be implemented until we had different or better machines, and AI theories influenced the way those strides forward would be achieved.  The early gurus of the field, like the hackers described below, were often pioneers in both, creators and consumers of the new technologies.  The tools they created became part of the expected package for the next generation of computers, and they explored and improved upon the features that any new machine might have.

MIT Hackers: Starting a Computer Culture

On the other end of the spectrum from governmental initiatives and administration, computers also captured the imagination of the idealistic relays-and-wiring-obsessed sect of the Tech Model Railroad at MIT.  They created a breed of 'hackers' that believed in the power, beauty and freedom of computing.  The 'Hacker Ethic' that still exists at MIT today found its roots in the fifties and, as taken from Steven Levy's book about the subject, consisted of the following precepts:

  1. Access to computers – and anything which might teach you something about the way the  world works – should be unlimited and total.  Always yield to the Hands-On Imperative.
  2. All information should be free.
  3. Mistrust Authority – Promote Decentralization.
  4. Hackers should be judged by their hacking, not bogus criteria such as degrees, age, race, or position.
  5. You can create art and beauty on a computer.
  6. Computers can change your life for the better.


A scant few years before, computers had only existed as a heavily regulated industry or military luxury that took up whole rooms guarded by designated personnel who were the only ones actually allowed to touch the machine.  Programmers were far removed from the machine and would pass their punch card programs on to the appropriate personnel, who would add them to the queue waiting to be processed.  The results would get back to the programmers eventually as a binary printout, which was then deciphered to find the result.

Thus, the Hacker's desire to play with the machine itself was revolutionary for the time.  With the reverence surrounding the expensive machines, the concept of spending one's day in front of a computer at the modern office would have sounded ludicrous.  In contrast and immune to the social mores of the time, the hackers felt challenged and inspired by the worlds of possibility they saw in these new machines that allowed them to create virtual universes. 

Hacker Innovations

In the late fifties and even after, computers were put to work day and night because they were so expensive (and slow).  So it was common practice for these young computer enthusiasts to keep late hours and take advantage of the less-utilized middle of the night machine time.  They even developed a system whereby someone would watch out for when another sleepy user did not show up for their slot.  The information would be immediately relayed to the rest of the group at the Model Railroad club and someone would make sure the computer time did not go to waste. 

One of the most important hacker innovations was hooking up a screen and teletype machine to the computer, first used for interactive debugging.  Doing so gave users an interactive, real-time relationship with the machine and drastically changed the way they would use and relate to it.  Several of these innovations would grow into the life, gas, and solar corona video clips available on this website.

As a result of using the machine so much, they knew where they wanted to optimize machine performance and what tools to create to elicit new kinds of functionality from the machines.  Early hackers created better languages and even hardwired new commands into the computer circuitry.  The most famous program was Spacewar, the first real computer game.  It involved maneuvering spacecraft and torpedoes, and was created on a machine with little memory and virtually no features. 

Soon Spacewar spread through the entire computing community, and was even used by the Digital Equipment Corporation to assure customers that their computers were working properly.  As told on wikipedia, “Spacewar was a fairly good overall diagnostic of the PDP-1 computer and Type 30 Precision CRT Display, so DEC apparently used it for factory testing and shipped PDP-1 computers to customers with the Spacewar program already loaded into the core memory; this enabled field testing as when the PDP was fully set up, the field representative could simultaneously relax and do a final test of the PDP.”

IV. 1960's: Pursuing Machine Genius

In terms of projects, the sixties saw the creation of the first comprehensive mathematics programs, an attempt at decoding sentence meaning in word problems and the creation of now integral operating system tools like user interfaces and word processors.  In addition, a conversing parody of a psychoanalyst gained notoriety, the first industrial robot made its appearance and the expert system DENDRAL derived conclusions in the area of chemistry.  If this section seems like something of a laundry list, that is because there are so many different subareas which saw their beginnings in these seminal projects.  

As years progressed, each new computer would form a new image in the strobe light morphing from big hulking machine to interactive personal computer.  The growing capabilities opened up new possibilities for AI.  For example, imagine having a computer without a screen.  It was Lincoln Labs' computer LINC that incorporated a TV-style CRT screen into a commercial computer, giving a user immediate feedback instead of making the user wait for a printout.  Everything from graphics to word processing to user interfaces has hinged on that addition.  

On the other coast at the Stanford Research Institute (SRI), Doug Engelbart invented the mouse and on-screen cursor in his experiments with different kinds of user interfaces, as well as windows and multiple raster monitors, all of which he demoed in 1968.

The computer systems in those days were far from failsafe.  In 1960, one Defense computer mistakenly identified the moon as an incoming missile which understandably caused great consternation.  Another example came during the Cuban Missile crisis, when communications were blocked for several days.  These shortcomings would help motivate high-level encouragement and support for the computer industry.

At the same time, computer science was gaining growing acceptance as a field.  First, IBM declared separate departments for software and hardware, meaning pure programmers officially would have a declared place to develop programs and environments.  In the academic sphere, universities began granting the first degrees in Computer Science.  The decade also saw the birth of the BASIC programming language, designed to be easy to understand, and UNIX, a way of structuring and communicating with an operating system that now underlies all Macs and Linux-based computers.

With the new DARPA funding in 1963, MIT created a new research group, Project MAC.  Mirroring the wide range of research it would inspire, Project MAC brought together disparate researchers from departments across the institute, including those from the AI Project.  All moved over to Tech Square, originally occupying two floors, complete with machine shop and research areas, including Minsky's beanbags and project testing haven, the Play-Pen. 

The lab, under Bob Fano's initial leadership, focused on mimicking higher cognitive levels of human intelligence.  They worked on systems that could play chess, do SAT analogy problems, higher level math, and infer logical conclusions from a given set of preconditions.  One fun invention was Ivan Sutherland's Virtual Reality head-mounted display, the first of its kind.

Slagle, Moses, Bobrow, Evans MIT

The initial use of programs to solve complex mathematics was not a matter of rote application of straightforward computations, but rather involved programs that could actively figure out what that solution or a close approximation might be.

The first step at MIT, SAINT, was created by PhD student James Slagle and could solve basic integrations.  It also had the dual fame of  being the first LISP program ever written.  CSAIL has a reading room that preserves the collection of all these early thesis projects, and although not the only institution that could claim this, early titles read much like a timeline of developments in AI and Computer Science at that time. 

Expanding upon the more traditional approach of using computers as high-powered calculators, the mammoth MACSYMA entered the scene in 1967.  The predecessor of Matlab and still widely used by mathematicians and scientists, this program used symbolic reasoning for integration problems, in other words, a logic based system.  It became the go-to program for mathematical operations and one of the earliest expert systems.  Its creator was Joel Moses of MIT and he initially used a collection of mostly unstructured LISP functions to accomplish a wide variety of operations.

Another very different approach to doing math on a computer was Danny Bobrow's thesis in 1964 that solved high-school level algebra word problems, using semantic rules to interpret natural (human) language.  The year before, Thomas Evans had created ANALOGY, a program that could solve SAT-level analogy problems.  ANALOGY used a way of deciphering relationships between words that was similar to that used in Bobrow's project.  Though they may seem at first glance more human than mammoth-calculator MACSYMA, Norvig, Director of Research at Google, Inc., comments that these kinds of programs “derive simplicity because they deal with simplified worlds.”

Building Tools at MIT: TECO, SKETCHPAD
Greenblatt and Murphy, Sutherland, MIT

TECO was a text editor created at MIT by Greenblatt and Murphy in 1962.   Predominantly used for writing code at the time, the concept would evolve into the word processor functionality that later helped computers break into the workplace.  In one colorful description, author Steven Levy declared the young Greenblatt a “single-minded, unkempt, prolific, and canonical MIT hacker who went into the night phase so often that he zorched his academic career.”

The next big tool was SKETCHPAD, a drawing program that invented the graphical user interface.  According to wikipedia:
Ivan Sutherland demonstrated... that computer graphics could be utilized for both artistic and technical purposes in addition to showing a novel method of human-computer interaction.
Sketchpad was the first program ever to utilize a complete graphical user interface. Sketchpad used an x-y point plotter display as well as the then recently invented light pen. The clever way the program organized its geometric data pioneered the use of "objects" and "instances" in computing and pointed forward to object oriented programming.

LOGO, 1967: early AI language. 
Papert, MIT

There is a large presence of LOGO and LOGO turtle videos in the TechSquare film clips.  Invented by Seymour Papert of MIT, LOGO is famous for being an easier-to-understand programming language.  It pioneered the idea of educational programming programs for children, the first of which occurred down the street from MIT in Lexington, MA. 
Students and researchers could type in the human-friendly commands over teletype, a typewriter-like contraption that was wired into the main computer and could make simple math, word or whatever-else-they-could-imagine programs.
The next major innovation came when they hooked the system up to a 'turtle' robot whose movements were scripted by the LOGO programs.  It provided a way for the students and researchers to immediately see their program in action and test out their algorithms by watching its motion. 
By strapping a marker or pencil to the turtles and giving them some simple rules for movement, the researchers made the robots famous for tracing complex and beautiful patterns on the paper beneath them.  Using the same algorithms to create a path in pixels, they created some of the first screensaver-like graphics.
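As a rough illustration of how a few simple movement rules produce a traced pattern, here is a minimal sketch in Python (not LOGO itself).  The function name run_turtle and the command format are assumptions made for this example; it only computes the points the pen would visit, leaving the drawing to the reader.

```python
import math

def run_turtle(commands, start=(0.0, 0.0), heading_deg=0.0):
    """Return the list of points visited by a pen-down turtle.

    commands is a list of (name, value) pairs, where name is
    "forward" (a distance) or "right" (a clockwise turn in degrees).
    This command format is a simplification invented for this sketch.
    """
    x, y = start
    heading = heading_deg
    path = [(x, y)]
    for name, value in commands:
        if name == "forward":
            x += value * math.cos(math.radians(heading))
            y += value * math.sin(math.radians(heading))
            path.append((round(x, 6), round(y, 6)))
        elif name == "right":
            heading = (heading - value) % 360
        else:
            raise ValueError(f"unknown command: {name}")
    return path

# Roughly what LOGO's REPEAT 4 [FORWARD 100 RIGHT 90] does: a square.
square = [("forward", 100), ("right", 90)] * 4

# The same rule with a different angle, repeated 18 times, gives a
# closed star-like rosette instead of a square.
rosette = [("forward", 100), ("right", 100)] * 18
```

Feeding the points from run_turtle(rosette) to any line-drawing routine reproduces, in pixels, the kind of pattern a turtle would trace on paper.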

Vision Project, 1966: They thought they could Solve Machine Vision in a Summer

By connecting cameras to the computers, researchers experimented with ways of using AI to interpret and extract information about vision data.  No one really understood how difficult that would be and the initial MIT attempt is one of my favorite AI anecdotes.

Rumor has it that the task of figuring out how to extract objects and features from video camera data was originally tossed to a part-time undergraduate student researcher to figure out in a few short months.  What is known for certain is that there was a summer vision project sometime in the sixties, in which researchers fully expected to establish many of the main concepts by the start of the next semester.

As would often be the case in AI, they had vastly underestimated the complexity of human systems, and the field is still working on how to make fully functional vision systems today.

UNIMATE, 1961: The First Industrial Robot
Engelberger and Devol, General Motors

According to the Computer History Museum, “The first industrial robot UNIMATE started out in 1961 on the TV picture tube manufacturing line, then went to work at General Motors. Weighing 4,000 pounds, the robot arm obeyed commands one-by-one to stack and sequence die-cast metal.”

Robots would become a major area in AI experimentation, with initial applications in factories or with human controllers but later expanding into some cooperative and autonomous tasks.  The word 'robot' is derived from the Czech word for worker, but nowadays the machines are used for everything from acting in the entertainment industry (see the movies Gremlins, Jurassic Park, A.I.) to high precision medical surgeries, toys and autonomous vacuum cleaners.

ELIZA, 1965: A pattern-matching psychologist
Weizenbaum, MIT

ELIZA is a simple pattern-matching conversational program, the first of her kind, created by MIT computer scientist Joseph Weizenbaum in 1966.  She crudely simulates a psychotherapist, and a version now ships with the Emacs editor. 

Weizenbaum was not a proponent of AI.  In fact, in the paper about his invention, he says, “Machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer.  But once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible.”

Whether or not you agree with Weizenbaum's characterization, the best way to understand the program is to look at an example.  The following comes from the description Daniel Bobrow gave of an unplanned encounter between the Vice President of BBN and an abandoned ELIZA program unwittingly left running on the computer:

Here is how Bobrow tells the story:
One Saturday morning about 9 am., a Vice President of Bolt, Beranek, and Newman [BBN]... found a note on the PDP-1 teletype that read: "TYPE ALL MESSAGES INTO THIS TELETYPE AND END THEM WITH A PERIOD." This note was left there by one of the programming bums who had just succeeded in making run a LISP version of the DOCTOR program (based on ELIZA as developed by Weizenbaum). However, the VP thought that this note indicated that this teletype was linked with the one in my home, since I often use the computer all night. Therefore, he started typing to "me," but in fact was typing to this version of the DOCTOR program. The following conversation ensued:
vp: Do you think that I can use the computer this morning?
eliza: Why do you ask?
vp: I might be able to make some additional Telcom sales.
eliza: Why aren't you so sure?
vp: My prospective customers have never seen a demonstration of the system.
eliza: Does this mean very much to you?
vp: Of course it does.
eliza: You seem very positive about that.
vp: This is ridiculous.
eliza: What do you mean, this is ridiculous?
vp: Is there any reason why I can't use the computer now?
eliza: Tell me more about your prospective customers.
vp: Please dial me up on 491-1850
Note that after the last remark the VP did not obey instructions, and left out the period. Therefore, of course, the computer didn't answer him. This so infuriated the VP, who thought I was playing games with him, that he called me up, woke me from a deep sleep, and said:
vp: Why are you being so snotty to me?
bobrow: What do you mean I am being snotty to you?
The VP angrily read me the dialog that "we" had been having, and couldn't get any response but laughter from me. It took a while to convince him it really was a computer.
The basic algorithm is the following:  (1) Look at the user's input, (2) Find a pattern that matches the input, (3) Look up and print out the corresponding response.  Though you can, of course, form your own opinion, I find it amazing that such a simple setup can result in such an amusing and complex situation.
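The three steps can be sketched in a few lines of Python.  The rules below are hypothetical ones written for this illustration, loosely echoing the dialog above; they are not Weizenbaum's original DOCTOR script, and the names RULES, DEFAULT and respond are assumptions of the sketch.

```python
import re

# Each rule pairs a pattern with a response template.  The text captured
# by the pattern's group is echoed back inside the response.
RULES = [
    (re.compile(r"\bdo you think (.+?)[.?!]*$", re.I), "Why do you ask?"),
    (re.compile(r"\bmy (.+?)[.?!]*$", re.I), "Tell me more about your {0}."),
    (re.compile(r"\bthis is (.+?)[.?!]*$", re.I), "What do you mean, this is {0}?"),
]

# Fallback used when no pattern matches the input.
DEFAULT = "Does this mean very much to you?"

def respond(text):
    """(1) Look at the input, (2) find a matching pattern,
    (3) fill in and return the corresponding response."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT
```

With these rules, respond("This is ridiculous.") returns "What do you mean, this is ridiculous?".  The program has no idea what is ridiculous; it simply reflects the user's own words back.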

DENDRAL, 1966: Chemistry Expert System analyzing organic compounds
Buchanan, Feigenbaum, Lederberg, Sutherland, Stanford

One of the clearest examples of applied AI research, DENDRAL analyzed organic compounds using mass spectrogram and nuclear magnetic resonance data to determine their structure.   It limited the search space using constraint satisfaction, increasing the probability that the system would find a solution. 
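To give a feel for how constraints shrink a search space, here is a toy sketch in Python.  It is not DENDRAL's actual method or rule set: it assumes compounds made only of carbon, hydrogen and oxygen, uses nominal integer atomic masses, and applies two textbook valence constraints to decide which formulas are worth considering for a measured molecular mass.  The function name candidate_formulas is invented for this example.

```python
def candidate_formulas(target_mass):
    """List (carbons, hydrogens, oxygens) whose nominal mass equals target_mass.

    Constraints applied (simplifying assumptions for C/H/O compounds):
      - at least one carbon atom
      - hydrogens <= 2 * carbons + 2 (a saturated chain has the most H)
      - an even number of hydrogens
    """
    results = []
    for c in range(1, target_mass // 12 + 1):
        for o in range(0, (target_mass - 12 * c) // 16 + 1):
            # Whatever mass is left over must be hydrogen (mass 1 each).
            h = target_mass - 12 * c - 16 * o
            if h <= 2 * c + 2 and h % 2 == 0:
                results.append((c, h, o))
    return results
```

For a mass of 46, only CH2O2 (formic acid) and C2H6O (ethanol or dimethyl ether) survive the constraints, so a later stage has two formulas to examine rather than every arithmetic combination.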

The heuristics and rules it used to trace which structures and characteristics correspond to what kind of molecules were painstakingly gathered from interviewing and shadowing experts in the field.  It involved a very different approach to intelligence from a universal problem solving structure, requiring extensive specialized knowledge about a system.

DENDRAL evolved into the MetaDendral system, which attempted to automate the knowledge gathering bottleneck of building an expert system.  MetaDendral made the first scientific discovery by a machine regarding an unknown chemical compound in 1975.


V. 1970's – A Rising Industry

Directions of AI advancement accelerated in the seventies with the introduction of the first personal computers, a medical diagnostic tool MYCIN, new conceptualizations of logic, and games like Pong and PacMan. 

Expanding from abstract tools to applications, Project Gutenberg began compiling electronic versions of books in 1970, an ongoing effort now available online.  The first reading machine was created by Kurzweil in 1976 and was used to assist the blind.  Whether robots or keyboards, the next evolutionary step in both AI and computer science came with the control, interpretation and coordination of peripheral devices. 

Computers, previously inaccessible to individuals outside of the military, academia and large banks, were suddenly available for personal ownership for a mere few thousand dollars.   At the start, the machine did not even have a screen, just a set of LEDs and buttons one had to punch in sequence to program the machine.  Market forces soon welcomed in a flood of peripheral devices to improve input and output capabilities.  As Microsoft and Apple Computers began operations and the first children's computer camp occurred in 1977, major social shifts in the status of computer technology were underway.

Back at MIT, former director Rod Brooks relates that in the seventies, “Patrick Winston became the director of the Artificial Intelligence Project, which had newly splintered off Project MAC.  The lab continued to create new tools and technologies as Tom Knight, Richard Greenblatt and others developed bit-mapped displays, fleshed out how to actually implement time-sharing and included e-mail capabilities. 

“Knowledge representation, knowledge-based systems, reasoning and natural language processing continued to motivate innovations in projects and programming languages as the lab expanded in size, accepting former students Gerry Sussman, Carl Hewitt and Ira Goldstein into the faculty ranks.” 

Early Mobile Robots: Shakey, Freddy 
Stanford and University of Edinburgh

DARPA funded various initial robot projects across the country including Stanford's mobile robot Shakey.  In a similar vein, the University of Edinburgh soon created their own mobile robot, Freddy, in 1973.  Both robots used visual perception and other inputs to create internal models of the world around them, which they would then use to navigate through space.  More specifically, wikipedia declares:

SRI International's Shakey became the first mobile robot controlled by artificial intelligence. Equipped with sensing devices and driven by a problem-solving program called STRIPS, the robot found its way around the halls of SRI by applying information about its environment to a route. Shakey used a TV camera, laser range finder, and bump sensors to collect data, which it then transmitted to a DEC PDP-10 and PDP-15. The computer radioed back commands to Shakey, who then moved at a speed of 2 meters per hour.

Robots and Natural Language Processing at MIT: Copy Demo, Silver Arm, SHRDLU

Early robotics included the 1961 MH1 robot-hand project and the 1970 copy-demo, in which a robotic arm and camera could visually determine the structure of a stack of cubes and then construct an imitation.  Both of the projects are well documented on the AI Films website.

The first computer controlled robot intended for small parts assembly came in 1974 in the form of David Silver's arm.  Its fine movements and high precision required great mechanical engineering skill and used feedback from touch and pressure sensors.  Patrick Winston soon expanded the idea of cube manipulation with his program ARCH, which learned concepts from examples in the world of children's blocks.

Another major program that was applied to the blocks world was SHRDLU, a natural language processing system created by Terry Winograd in 1970. According to wikipedia:


SHRDLU carried on a simple dialog (via teletype) with a user, about a small world of objects (the BLOCKS world) shown on an early display screen (DEC-340 attached to a PDP-6 computer). 
The name SHRDLU was derived from ETAOIN SHRDLU, the arrangement of the alpha keys on a Linotype machine, arranged in descending order of usage frequency in English.
A side effect of this memory, and the original rules SHRDLU was supplied with, is that the program could answer questions about what was possible in the world and what was not. For instance, SHRDLU would deduce that blocks could be stacked by looking for examples, but would realize that triangles couldn't be stacked, after having tried it. The "world" contained basic physics to make blocks fall over, independent of the language parser.
Finally, SHRDLU could also remember names given to objects, or arrangements of them. For instance one could say "a steeple is a small triangle on top of a tall rectangle"; SHRDLU could then answer questions about steeples in the blocks world, and build new ones.
To clarify, an example SHRDLU dialog might consist of the following:
Computer: OK.
Computer: OK.
Computer: THE BOX.
Computer: FOUR OF THEM.
Computer: YES, THE RED CUBE. (wikipedia)

The SHRDLU research was exciting because it allowed the user, albeit in highly constrained circumstances, to communicate directly with the computer in English, rather than having to learn a machine programming language.

MYCIN: Medical Diagnostic Expert System
Edward Shortliffe, Stanford

MYCIN was the first fully developed expert system, diagnosing and recommending treatment for infectious blood diseases.  The name 'Mycin' was used because it is a common antibiotic suffix.  Created in the beginning of the decade, by 1979 MYCIN was declared as good as medical experts by the Journal of the American Medical Association.  The system even adjusted recommended dosages to the patient's body weight.  Its functionality can be described as follows:

Mycin operated using a fairly simple inference engine, and a knowledge base of ~500 rules. It would query the physician running the program via a long series of simple yes/no or textual questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the probability of each diagnosis, its confidence in each diagnosis' probability, the reasoning behind each diagnosis (that is, Mycin would also list the questions and rules which led it to rank a diagnosis a particular way), and its recommended course of drug treatment.

Mycin was never actually used in practice. This wasn't because of any weakness in its performance — in tests it outperformed members of the Stanford medical school. It was as much because of ethical and legal issues related to the use of computers in medicine — if it gives the wrong diagnosis, who can be held responsible? Issues with whether human experts would find it acceptable to use arose as well. (wikipedia)
The creators of MYCIN found that doctors were unwilling to accept its advice if the system could not explain why it reached its conclusions.  Therefore, they included the ability to answer questions about how it was making its decisions.  As described in one AI textbook, “[MYCIN] uses rules that tell it such things as, If the organism has the following set of characteristics as determined by the lab results, then it is likely that it is organism X.  By reasoning backward using such rules, the program can answer questions like 'Why should I perform that test you just asked for?' with such answers as 'Because it would help to determine whether organism X is present.'” (Rich 59)  It is important that programs provide justification of their reasoning process in order to be accepted for the performance of important tasks. 
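The idea of reasoning backward from a goal, while remembering why each question is being asked, can be sketched as follows.  This is a minimal illustration in Python under stated assumptions: the two rules and all of the names (RULES, prove, explain) are hypothetical, and the sketch leaves out the confidence weighting and the roughly 500 rules of the real system.

```python
# Hypothetical rules for illustration only: each conclusion maps to the
# list of conditions that must all hold for it to be concluded.
RULES = {
    "organism_x_present": ["gram_negative", "rod_shaped"],
    "recommend_drug_a": ["organism_x_present", "no_allergy_to_drug_a"],
}

def prove(goal, facts, why):
    """Backward chaining: work from the goal toward the lab results.

    facts maps observable findings to True/False.
    why is filled in with, for each sub-question, the conclusion it serves.
    """
    if goal not in RULES:
        # A leaf: something only the lab or the physician can supply.
        return facts.get(goal, False)
    for premise in RULES[goal]:
        why[premise] = goal  # remember what this question is for
        if not prove(premise, facts, why):
            return False
    return True

def explain(question, why):
    """Answer 'Why are you asking about this?' from the recorded trace."""
    return f"Because it would help to determine whether {why[question]} holds."
```

After a call such as prove("recommend_drug_a", facts, why), the why dictionary lets the program justify any question it asked, which is the kind of explanation doctors wanted before trusting the advice.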


VI. 1980's: Boom and Crash

The start of the eighties was the golden age for Artificial Intelligence in the US, as the field caught the imagination of the larger population.  Institutions across the board, from video game companies to Campbell's Soup, were suddenly setting up departments of Artificial Intelligence.  The most common applications came in the form of MYCIN-style expert systems, wizards that could give advice or information about how to do something in their area of expertise. 

These expert systems were specialized, capturing the knowledge of gurus in a field.  For example, in the case of Campbell's Soup, a factory manager might be curious about the tub-cleaning requirements between making different batches of soup.  As related in an interview with one AAAI Fellow, if you were going from Chicken Broth to Chicken Noodle, you could proceed right away, but if the ordering was Clam Chowder to Vegetarian Minestrone, the tanks had better be spic and span in between. 

Family and work computers started to become commonplace in the 1980's with six million computers sold in 1983.  Most of the tool builders at MIT left the lab in the eighties to work in new companies and bring their work to the consumer.  IBM introduced its 'PC' and Xerox, LMI and Symbolics had a variety of Lisp machines.  In addition, Apple's LISA and then Macintosh hit the market, and ARPANET, a precursor to the Internet, opened up to civilians. Despite these advances, by the end of the decade the 'AI Winter' left the field, especially companies, struggling to defend their funding and reputation amid a downturn in public interest.

In 1985, Professor Nicholas Negroponte and former MIT President Jerome Wiesner started the MIT Media Laboratory.  According to the Media Lab website:

[The Media Lab grew] out of the work of MIT's Architecture Machine Group, and building on the seminal work of faculty members in a range of other disciplines from cognition and learning to electronic music and holography...   In its first decade, much of the Laboratory's activity centered around abstracting electronic content from its traditional physical representations, helping to create now-familiar areas such as digital video and multimedia. The success of this agenda is now leading to a growing focus on how electronic information overlaps with the everyday physical world. The Laboratory pioneered collaboration between academia and industry, and provides a unique environment to explore basic research and applications, without regard to traditional divisions among disciplines.

The MIT AI lab was also in full swing, directing its talents at replicating the visual and mobility capabilities of a young child, including face recognition, object manipulation and the ability to walk and navigate through a room.  Tomas Lozano-Perez pioneered path search methods used for planning the movement of a robotic vehicle or arm.  Marc Raibert worked on legged robots, and John Hollerbach and Ken Salisbury created dexterous robot hands.  This decade was also when famed roboticist and current director of CSAIL Rodney Brooks built his first robots.

Wabot-2, 1980: Robot that reads Sheet Music and plays Organ
Waseda University, Japan


The name WABOT is from 'WAseda roBOT', honoring the University in Japan at which it was designed.  In this case, the story is best told by its originators.  The description of the project on the Waseda University website follows:

It has been forecast that robots will penetrate society in 21st century... In that case, robots will be required to have anthropomorphic appearances and faculties... Developing the anthropomorphic intelligent robot WABOT (WAseda roBOT) [aimed] to finally develop a "personal robot" which resembled a person as much as possible.

In 1980, our laboratories... commenced the WABOT-2 project. Playing a keyboard instrument was set up as an intelligent task that the WABOT-2 aimed to accomplish, since an artistic activity such as playing a keyboard instrument would require human-like intelligence and dexterity.

...The robot musician WABOT-2 can converse with a person, read a normal musical score with its eye and play tunes of average difficulty on an electronic organ. The WABOT-2 is also able of accompanying a person while he listens to the person singing. The WABOT-2 was the first milestone in developing a "personal robot."

It is interesting to note that the research group sees WABOT-2 as the first generation of an oncoming class of personal robots.  It may seem far-fetched at the moment, but look how far personal computers have come since they were first conceived of fifty years ago.

HEARSAY, 1982: Speech Understanding Program
Erman, Hayes-Roth, Lesser, Reddy at CMU

HEARSAY was a speech understanding program developed at CMU in 1982 that pioneered a useful model for solving perceptual problems, that is, problems in which a machine is trying to derive meaning out of complex input signals.  That process might involve decoding words from someone's voice, recognizing someone's face from a set of vision data or tactilely distinguishing different kinds of textures.

Because it is a widely applicable problem, below you will find a textbook summary of the steps one must consider in figuring out how a machine can glean information from sensory data.  As HEARSAY was a CMU project, it seems appropriate to include a summary from an Artificial Intelligence textbook by Elaine Rich of CMU:


It is important to divide the overall understanding process into manageable pieces.  We can do this by dividing the process of analyzing either a speech sample or a picture into the following five stages:
Digitization: Divide the continuous input into discrete chunks.  For speech recognition, this can be done by measuring the amplitude of the signal at fixed intervals, such as 20,000 times per second...
Smoothing: Eliminate sporadic large variations in the input.  Because the real world is mostly continuous, these spikes in the input are usually the result of random noise.
Segmentation: Group the small chunks produced by digitization into larger chunks corresponding to logical components of the signal.  For speech understanding, these segments correspond to logical components of the signal... such as s or a.  These segments are often called phones...
Labeling: Attach to each of the segments a label that indicates which, of a set of building blocks, that segment represents...  So the labeling procedure can do one of two things.  It can assign multiple labels to a segment and leave it up to the later analysis procedure or choose the one that makes sense in the context of the entire input.  Or it can apply its own analysis procedure in which many segments are examined to constrain the choice of label for each segment.
Analysis: Put all the labeled segments together to form a coherent object... when surrounding pieces are considered, the number of interpretations that lead to a consistent overall interpretation [also known as constraint satisfaction] is considerably reduced... In speech, this results from such things as intonation patterns that cover whole sentences. (Rich 349)
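The five stages above can be sketched as a toy pipeline in Python.  This is an illustration only, not the HEARSAY implementation: the function names, the median-of-three smoothing filter, the loudness threshold and the 'sound'/'silence' labels are all simplifying assumptions chosen to keep the example short.

```python
def digitize(signal, duration, rate):
    """Sample a continuous signal (a function of time) at fixed intervals."""
    return [signal(i / rate) for i in range(int(duration * rate))]

def smooth(samples):
    """Median-of-three filter: replaces isolated spikes with a neighbor's value."""
    if len(samples) < 3:
        return list(samples)
    middle = [sorted(samples[i - 1:i + 2])[1] for i in range(1, len(samples) - 1)]
    return [samples[0]] + middle + [samples[-1]]

def segment(samples, threshold):
    """Group consecutive samples that fall on the same side of the threshold."""
    segments = []
    for s in samples:
        loud = abs(s) >= threshold
        if segments and segments[-1][0] == loud:
            segments[-1][1].append(s)
        else:
            segments.append((loud, [s]))
    return segments

def label(segments):
    """Attach a building-block label to each segment."""
    return ["sound" if loud else "silence" for loud, _ in segments]

def analyze(labels):
    """Combine the labels into one overall interpretation: a count of sound bursts."""
    return labels.count("sound")

# A noise spike (9) followed by a genuine burst of sound (the 5s).
raw = [0, 0, 9, 0, 5, 5, 5, 0, 0]
bursts = analyze(label(segment(smooth(raw), threshold=3)))
```

Without the smoothing stage the spike would be counted as a second burst, which is why the stages are applied in the order the textbook gives.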

The actual HEARSAY program parsed audio information using a 'blackboard model' that follows the above techniques in a way that traces up and down the complexity levels of sound, syllable, word, as well as right to left, in sentences where there are ambiguous signals.  Like constructing a jigsaw puzzle, the fastest method is invariably putting together the easily parsed border and then filling in the less obvious pieces.  This method becomes particularly useful when words are not enunciated clearly.


AARON, 1985: An Autonomous Artist
Harold Cohen, UCSD

Harold Cohen is an English artist who almost accidentally encountered programming at Stanford and then became father to the first robot artist, AARON.  Who knows what the rising technological art community will come up with next.  According to Cohen's homepage:

The AARON program, an ongoing research effort in autonomous machine (art making) intelligence... began when [Cohen] was a visiting scholar at Stanford University's Artificial Intelligence Lab in 1973.  Together, Cohen and AARON have exhibited at London's Tate Gallery, the Brooklyn Museum, the San Francisco Museum of Modern Art, Amsterdam's Stedelijk Museum and many more of the world's major art spaces...

One of the few artists ever to have become deeply involved in artificial intelligence, Cohen has given invited papers on his work at major international conferences on AI, computer graphics and art technologies...

AARON has produced many thousands of drawings, to a few dozen of which Cohen has added color... The painting machine with which AARON colored real drawings in the real world was premiered at an exhibit at the Computer Museum in Boston in the spring of 1999.

A picture being created by the latest version of AARON side by side with its creator appears above. 

Allen, 1985: Starting a New Generation of Reactive Robots 
Rodney Brooks, MIT AI Lab

One of the original MIT AI Lab groups was named the Mobot Lab and was dedicated to making mobile robots.  'Allen' was the group's first creation and shares Brooks's middle name.

According to author Kevin Kelly:

"Allen" was the first robot Brooks built. It kept its brains on a nearby desktop, because that's what all robot makers did at the time... The multiple cables leading to the brain box [a.k.a. computer] from Allen's bodily senses of video, sonar, and tactile were a never ending source of frustration for Brooks and crew... Brooks vowed that on their next project they would incorporate the brains inside the robot -- where no significant wiring would be needed -- no matter how tiny the brains might have to be.

They were thus forced to use very primitive logic steps and very short and primitive connections in "Tom" and "Jerry," the next two robots they built. But to their amazement they found that the 'dumb' way their onboard neural circuit was organized worked far better than a [complex] brain in getting simple things done.

Since then, Rodney Brooks has become one of the most famous proponents of robotics and is the current head of CSAIL, MIT's Computer Science and Artificial Intelligence Laboratory.

VII. Catching up to the Present

Since the eighties, several projects stand out as major new shifts and developments in the field.  When Deep Blue beat world chess champion Garry Kasparov in 1997, some say it marked the end of an era in which specialized programs and machines reigned.  One potential new direction, the first official RoboCup, kicked off that very same year, posing a challenge that requires integrating all kinds of intelligence.  Its goal is to field a robot team able to beat the winning World Cup soccer team by 2050. 

With the results of the DARPA Grand Challenge this year, that potentially rash aspiration seems more plausible.  After the first year's race when none of the autonomous vehicles made it even ten miles past the start of the 131.2 mile course, this year saw five of the twenty-three DARPA Grand Challenge competitors reach the finish with time to spare.

Other developments include the efforts started in 2002 to recreate a once wonder-of-the-world-status library in Egypt as an online e-book collection called Bibliotheca Alexandrina.  The transition to computerized medical records has been sluggish, but in other areas of medicine, from imagery to high precision surgery, the new facilities machines can give a surgeon have saved lives and made new diagnoses and operations possible.
We have all heard about NASA space robots, but less well known were the $400,000 'His' and 'Her' robots featured in the 2003 Neiman Marcus Christmas catalog.  Clearly, our relationship with machines in society is in transition.  One of the most important examples of that was Cynthia Breazeal's research on machine emotion and social interaction with her MIT thesis-project Kismet in 2002. 

New versions of ELIZA-like programs are becoming commonplace, such as AOL Instant Messenger's SmarterChild, an agent that can search the web to answer your questions about movie times or tell you not to have a 'potty mouth.'  

While we do not have full realization of Licklider's man-machine symbiosis, the idea of machines and tools becoming agents that work hand in hand with human beings seems more and more natural with each generation.  iRobot's vacuum cleaner Roomba is kickstarting a new household robotics industry with record sales. 

John McCarthy believes that fundamental new ideas are required before AI can reach human-level intelligence, rather than just needing large databases and faster computers.  He declares on his website, “My own opinion is that the computers of 30 years ago were fast enough if only we knew how to program them.”  

Whether or not human-level intelligence is even the main goal of the field anymore, it is one of the many that entice our interest and imagination.  It is clear that AI will continue to impact and contribute to a range of applications and only time will tell which paths it will travel along the way.

Heather Knight received her B.S. in Electrical Engineering with a minor in Mechanical Engineering from MIT in 2006 and has been accepted into their EECS Masters of Engineering program.  She has also worked at the MIT Media Lab since 2002 with Professor Cynthia Breazeal of the Robotic Life group as well as Professor Deb Roy of the Cognitive Machines group.


I. Project Background

The Recovering MIT's AI Film History project was born in 2001, when a collection of old film reels showed up on some dusty shelves during the move from Tech Square to Frank Gehry's architectural creation, the Ray and Maria Stata Center.  The Stata Center is the home of the now joined AI Lab and Computer Science departments known as CSAIL, the Computer Science and Artificial Intelligence Laboratory. 

Thanks to the support of the National Science Foundation, these films and more are now available on the project website, http://projects.mit.edu/films.  The original NSF proposal to digitize and create a website was worded as follows:

This project will collect, organize and preserve historic materials, particularly film, that are part of the historical record of the field of Artificial Intelligence (AI). It will create an organized digital archive and use highlights selected from the archive to illustrate the intellectual history of AI...  Sources for this project included notes, memos and technical reports from MIT and elsewhere, and in particular, an uncatalogued, unconserved and uncurated collection of films that recently came to light at MIT... The project will create a web site or DVD to showcase the selected clips, the connecting narrative, and other more technical materials.

The opening of the website fortuitously coincided with both the 50th anniversary of Artificial Intelligence (as the term was coined at the Dartmouth conference in 1956) and the American Association for Artificial Intelligence (AAAI) conference in Boston, MA June 16-22, 2006.  There we had the opportunity to interview on video more than one quarter of the AAAI Fellows in attendance.  The footage is now part of the site.  The Fellows include the most influential innovators in the field of Artificial Intelligence and many of the original founders of the field were present.

Another primary source for the site was Rick Greenblatt, who began his MIT career in the 1960s.  He was extraordinarily generous with his time, watching each and every one of the site's film clips and leaving an audio 'podcast' of his reminiscences for each one. 

The Recovering MIT's AI Film History website itself was created over the summer of 2006, led by CSAIL's Outreach Officer Tom Greene and produced by Luis Gomez (University of Florida undergrad), Heather Knight (MIT MEng student) and Matt Peddie (MIT undergrad), who collectively did the research, web design and interviews contained within the site. 

I would like to personally thank MIT Electrical Engineering and Computer Science Professors Fernando Corbato and Bob Fano, as well as Harvard History of Science PhD candidate Hallam Stevens for reading drafts of this paper.  I have not done full justice to the feedback they offered, but the content is more complete and less error-ridden because of their help.

II. Artificial Intelligence in Popular Culture

Asimov, Isaac. I, Robot (1950), Caves of Steel (1954), Robots of Dawn (1983). Robot Science Fiction, book.  Conceives fictional Three Laws of Robotics.
Orwell, George.  1984 (1949). Big Brother uses computers to enslave humanity, book.
Shelley, Mary. Frankenstein (1818), book.
Kubrick, Stanley. “2001: A Space Odyssey” (1968), movie. (Based on book by Arthur C. Clarke)  
“Star Wars” (1977), movie.

III. AI Organization Timeline*
*many appendix timeline events adapted from Mark Kantrowitz's compilation

1951 IEEE founded.
1956 The Dartmouth AI Conference, McCarthy coins name.
1958 DARPA created.
1958 Teddington (UK) Conference. McCarthy, Minsky, Selfridge
1969 First IJCAI Conference in Washington DC.
1974 First SIGGRAPH conference.
1980 First AAAI conference.  Stanford.
1982 ICOT formed.  Japan.

IV. MIT Research Centers Timeline

1959 Artificial Intelligence Project starts, led by Minsky and McCarthy
1963 Project MAC, led by Minsky and Papert
1969 AI Lab splits off from Project MAC, led by Pat Winston
1975 LCS (Laboratory for Computer Science) replaces Project MAC
1980 The Media Lab founded by Negroponte?
2003 CSAIL (Computer Science and Artificial Intelligence Laboratory) grows out of an LCS and AI Lab merger, co-directed by the former heads of both, Victor Zue and Rod Brooks respectively.

V. Academic Research Centers Timeline

1959* MIT's Artificial Intelligence Project, founded by John McCarthy and Marvin Minsky.
1963 Stanford AI Lab (SAIL), founded by John McCarthy
1963* MIT's Project MAC, begun under Minsky and Seymour Papert,  $2 million DARPA grant.
CMU AI Lab, founded
1966 Edinburgh AI Lab, founded by Donald Michie.
1979 CMU Robotics Institute, founded by Raj Reddy.
1980* MIT Media Laboratory

VI. Major Early AI Companies:
(rashly incomplete)

DEC, Artificial Intelligence Corp., Apple, Microsoft, Symbolics, Xerox, Intel, LMI, Teknowledge, Thinking Machines, Google

VII. AI Projects Timeline

1947 Grey Walter builds electro-mechanical “turtle”
1949 Turing and colleagues try to create a chess program on the Manchester Mark 1.
1950 Chess Program proposed as search problem.  Shannon.
1956 The Logic Theorist, solves math problems. Newell, Shaw and Simon.
1957 General Problem Solver, “means-ends analysis.”  Newell, Shaw and Simon.
1959 Checkers Program beats best human players. Samuel.
1959 Timesharing. Kurtz and Kemeny.

1961* SAINT, first Lisp program. PhD work. J. Slagle.
1962* TECO, text editor for PDP-1.  Murphy and Greenblatt. MIT.
1962 First Commercial Industrial Robots
1963* ANALOGY, solves SAT-level analogy problems.  PhD work. Thomas Evans.
1963* SKETCHPAD, drawing tool.  Sutherland.
1963 Parser, tested on “Time flies like an arrow.” Susumu Kuno.
1964* STUDENT, solves high-school level algebra word problems. PhD. Danny Bobrow.
1964* SIR.  PhD work. Bert Raphael.
1965* ELIZA, conversational psychotherapist. Joseph Weizenbaum.
1965* First Virtual Reality head-mounted display.  Ivan Sutherland.
1966 DENDRAL, chemistry knowledge-based sys.  Buchanan, Feigenbaum, Lederberg, Sutherland. Stanford.
1967* LOGO, early AI language.  Papert.
1967* MACSYMA, symbolic reasoning for integration problems, logic based system.  Joel Moses.
1968* Tentacle Arm, aka Minsky-Bennett arm.

1970 PROLOG. Alain Colmerauer.
1970 Shakey, first computer controlled mobile robot.  Stanford.
1970 INTERNIST, aid in disease diagnosis. Pople and Myers.
1970* SHRDLU, natural language processing, blocks world. Terry Winograd.
1970* ARCH.  Winston.
1970 Project Gutenberg, free electronic versions of books. M. Hart.
1971 PARRY, paranoid conversation agent. Colby.
1971 STRIPS, first motion planning system?  Nils Nilsson and Rich Fikes.
1972 Smalltalk. Xerox PARC.
1972 PONG, early video game. Nolan Bushnell.
1973 Scripts developed. Schank and Abelson.
1973 MYCIN, medical diagnostic expert system. PhD. Edward Shortliffe. Stanford.
1974* Silver Arm, first computer controlled robot, intended for small parts assembly. David Silver.
1975 MetaDendral, first scientific discovery by a machine.
1976 Adventure, first adventure game.  Crowther and Woods.
1976* First LISP machine. Greenblatt.
1976 First reading machine. Kurzweil.
1976 Automated Mathematician.  Lenat.
1976* Primal Sketch for Visual Representation. David Marr et al.
1978 VisiCalc. Bricklin.
1978 Version Spaces.  Tom Mitchell. Stanford.
1978 MYCIN generalized.  PhD. Bill VanMelle.  Stanford.
1979 Stanford Cart crosses chair-filled room without help.  Hans Moravec.
1979 Pac-Man brought to market.

1980 HEARSAY, uses blackboard model. Erman, Hayes-Roth, Lesser, Reddy.  CMU.
1980 Expert systems up to 1000 rules.
1980 Japanese 5th Generation Project.  Kazuhiro Fuchi.
1981 Connection Machine Designed, powerful parallel architecture. Danny Hillis. Thinking Machines.
1983 SOAR. John Laird & Paul Rosenbloom with Allen Newell. PhDs. CMU.
1984 Neural Nets with backpropagation widely used. John Hopfield.
1984 “Wabot-2” reads sheet music and plays organ.
1985 Aaron, autonomous drawing program.  Harold Cohen.
1985* Allen, autonomous reactive robot.  Rod Brooks.

1990 Human Genome Project begins
1997 Deep Blue beats world chess champion Garry Kasparov.
1997 First Official RoboCup, start of a new paradigm
2000* Kismet, robot that recognizes and displays emotion.  PhD. Cynthia Breazeal.
2000 AIBO introduced.
2002 Bibliotheca Alexandrina
2003 Neiman Marcus's Christmas catalog features $400,000 his and her robots.


VIII. AI Papers Timeline

1930* Differential Analyzer, Vannevar Bush, MIT
1937 “On Computable Numbers,” Turing Machine.  Turing.

1943 Neural Networks.  McCulloch and Pitts.
1945* “As We May Think.” Vannevar Bush, MIT.
1948 “Cybernetics.” Norbert Wiener.
1949 Information Theory.  Shannon.

1950 “Computing Machinery and Intelligence,” Turing Test. Turing.
1957* “Syntactic Structures.” Chomsky.
1958* Perceptron, Rosenblatt.

1962 “Structure of Scientific Revolutions.” Kuhn.
1962 “Possible Worlds Semantics.” Kripke.
1963 Semantic Networks as a Knowledge Representation.  M. Ross Quillian.
1963* “Steps Toward Artificial Intelligence.” Marvin Minsky.
1968* “Semantic Information Processing.” Marvin Minsky.
1968* “The Sound Pattern of English.” Chomsky and Halle.
1969* “Perceptrons,” discusses limits of single layer neural networks. Minsky and Papert.
1969* “Some Philosophical Problems from the Standpoint of Artificial Intelligence,” situation calculus. McCarthy and Pat Hayes.

1972 “What Computers Can't Do.” Dreyfus.
1974* “A Framework for Representing Knowledge.” Marvin Minsky.
1974 “Creative Computing.” Ahl.
1974 “Computer Lib.” Nelson.
1976 Metalevel reasoning, PhD. R. Davis. Stanford.
1979 Mycin as good as medical experts.  Journal of American Medical Association.
1979* AI Lab Flavors OOP memo.  Weinreb and Moon.
1979* Non-monotonic logics. McDermott and Doyle (MIT), McCarthy (Stanford).

1980 “The Knowledge Level.” Allen Newell.
1980 “Gödel, Escher, Bach,” wins Pulitzer.  Hofstadter.
1983 “The Fifth Generation.” Feigenbaum and McCorduck.
1984 “Common Lisp: The Language.” Steele.
1985* “The Society of Mind.” Marvin Minsky.


IX. Landmarks in Computation

1940 The ABC, first electronic computer. Atanasoff and Berry.
1941 Z3, first programmable computer. Zuse. Germany.
1944 Mark I, first programmable computer in US.  Aiken.
1945 First computer “bug.” Grace Hopper.
1947 Transistor. Shockley, Brattain and Bardeen.  Bell Labs.

1950 UNIVAC, first commercial computer.  Eckert and Mauchly.
1952 Compiler. Grace Hopper.
1956 FORTRAN, programming language.  IBM.
1958 Integrated Circuit. Jack St. Clair Kilby.
1959 PDP-1 sells for $159,000.  DEC.

1960 Defense computer mistakes moon for incoming missile.
1960 LINC, first computer with integrated CRT.  Lincoln Labs.
1961 All Dartmouth students required to be computer literate. Kemeny's timesharing system.
1964 PDP-8, first mass-produced minicomputer. DEC.
1964 IBM 360 series.
1964 BASIC, programming language. Kemeny and Kurtz.
1967 IBM distinguishes hardware and software.
1967 Mouse, windows and multiple raster monitors demoed.  Engelbart. SRI.
1968 First PhD in Computer Science. Wexelblat. University of Pennsylvania.
1969 UNIX, Thompson and Ritchie. AT&T.

1970 Floppy Disks.
1971 Intel 8008, first microprocessor in US.
1975 BASIC for a microcomputer, Gates and Allen.
1975 Altair 8800, first personal computer with 256 bytes memory.
1975 BYTE, magazine.
1977 Apple Computer.  Wozniak and Jobs.
1977 Apple II, Radio Shack TRS80, Commodore PET.
1977 First children's computer camp.
1977 Microsoft founded.

1980 Lisp machines widely marketed. Xerox, LMI, Symbolics.
1981 IBM Introduces Personal Computer (PC)
1983 Six million computers sold.
1984 Apple LISA
1984 Compact Disk (CD) technology. Sony.
1984 Apple introduces Macintosh.
1987 ARPANET opens to civilians

*at MIT

AAAI Fellow Interviews. “Oral Histories.” Recovering MIT's AI Film History Website. MIT. June 2006. <http://projects.csail.mit.edu/films>.

“Artificial Intelligence: The Next Twenty-Five Years.” Edited by Matthew Stone and Haym Hirsh. AI Magazine, 25th Anniversary Issue. Winter 2005.

Brooks, Rodney. “Artificial Intelligence Laboratory.” Electrons and Bits. Ed. John V. Guttag. Cambridge, MA, Electrical Engineering and Computer Science Department: 2005.

Buchanan, Bruce and McCarthy, John. AAAI 2002. Brief History of Artificial Intelligence. <http://www.aaai.org/AITopics/bbhist.html>.

Buchanan, Bruce G. “A (Very) Brief History of Artificial Intelligence.” AI Magazine, 25th Anniversary Issue. Winter 2005.

Chandler, David. Volkswagen wins robotic race across the desert. NewScientist.com news service. Oct. 10, 2005 <http://www.newscientist.com/article.ns?id=dn8119>.

Cohen, Paul R. “If Not Turing's Test, Then What?” AI Magazine, 25th Anniversary Issue. Winter 2005.

Edwards, Paul N. The Closed World: Computers and the Politics of Discourse in Cold War America. Cambridge, MA: The MIT Press, 1996.

Garfinkel, Simson L. Architects of the Information Society: Thirty-Five Years of the Laboratory for Computer Science at MIT. Ed. Hal Abelson. Cambridge, MA: The MIT Press, 1999.

Greenblatt, Rick. “Podcasts.” Recovering MIT's AI Film History Website. MIT. June 2006. <http://projects.csail.mit.edu/films>.

Güzeldere, Güven, and Stefano Franchi. “Dialogues with Colorful Personalities of early AI.” SEHR: Constructions of the Mind. Vol. 4.2, 24 July 1995. <http://www.stanford.edu/group/SHR/4-2/text/toc.html>.

“Harold Cohen.” Personal Homepage at Center for Research in Computing and the Arts. University of California San Diego. 1999 <http://crca.ucsd.edu/~hcohen/>.

“Harold Cohen's 'Aaron' – The Robot as an Artist.” SciNetPhotos. 1997 <http://www.scinetphotos.com/aaron.html>.

Kantrowitz, Mark. “Milestones in the Development of AI.” CMU 'comp.ai' Newsgroup Archives. 1994.

Kelly, Kevin. “Machines with Attitude.” Out of Control: The New Biology of Machines, Social Systems and the Economic World. Chapter 3. Perseus Books Group: 1995 <http://www.kk.org/outofcontrol/ch3-b.html>.

Kirsh, David. “Foundations of artificial intelligence: The big issues.” Artificial Intelligence 47 (1991): 3-30.

Levy, Steven. Hackers. New York:  Dell Publishing Co., 1984.

Luger, George. “AI: Early History and Applications.” Ch1 of Artificial Intelligence: Structures and Strategies for Complex Problem-Solving. Addison Wesley; 4th edition. January 15, 2002. <http://www.cs.unm.edu/%7Eluger/ai-final/chapter1.html>

MIT Media Laboratory. MIT, 2006. <http://www.media.mit.edu>.

Maisel, Merry and Laura Smart. “Admiral Grace Murray Hopper.” Women in Science. San Diego Supercomputer Center, 1997 <http://www.sdsc.edu/ScienceWomen/hopper.html>.

McCarthy, John. “Reminiscences on the History of Time Sharing.” Stanford University, 1983. <http://www-formal.stanford.edu/jmc/history/timesharing/timesharing.html> 2006.

McCarthy, John, M.L. Minsky, N. Rochester, C.E. Shannon. “A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence.” August 31, 1955.

McCarthy, John. “What is Artificial Intelligence?” Website FAQ. 24 Nov. 2004.

McCorduck, Pamela. Machines Who Think. (Original ed. 1974). Natick, MA: A K Peters, 2004.

Minsky, Marvin. “Steps toward Artificial Intelligence.” Computers and Thought.  Ed. Edward Feigenbaum. place: publisher, 1963: 406-450.

Nilsson, Nils J. “Human-Level Artificial Intelligence? Be Serious!” AI Magazine, 25th Anniversary Issue. Winter 2005.

Norvig, Peter. Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. San Francisco, CA: Morgan Kaufmann Publishers, 1992.

Turing, A.M. “Computing Machinery and Intelligence” Computers and Thought. * 1963: 11-35.

Rich, Elaine. Artificial Intelligence: International Student Edition. The University of Texas at Austin. Singapore: McGraw-Hill, 1983.

“Robots and AI Timeline.” The Computer History Museum. Mountain View, CA. 2006 <http://www.computerhistory.org/timeline/timeline.php?timeline_category=rai>.

Spice, Byron. “Over the holidays 50 years ago, two scientists hatched artificial intelligence.” Pittsburgh Post-Gazette. 2 Jan. 2006. <http://www.post-gazette.com/pg/06002/631149.stm>.

“WABOT: Waseda roBOT.” Humanoid Robotics Institute. Waseda University, Japan. <http://www.humanoid.waseda.ac.jp/booklet/kato02.html>.

Waldrop, M. Mitchell. The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal. New York: Penguin, 2002.

Wikipedia* August 2006.  <http://wikipedia.org>.

*A note on wikipedia:
The use of wikipedia as a source is sometimes viewed with skepticism, as its articles are created voluntarily rather than by paid encyclopedia writers.  I contend that not only is the concept of wikipedia  an outcropping of the field this paper is about, but it probably has more complete and up to date information than many other sources about this particular topic.  The kind of people that do or are interested in AI research are also the kind of people that are most likely to write articles in a hackeresque virtual encyclopedia to begin with.  Thus, though multiple sources were consulted for each project featured in this paper, the extensive use of wikipedia is in keeping with championing clever technological tools that distribute and share human knowledge.


Source: http://projects.csail.mit.edu/films/aifilms/AIFilms.doc



History of Artificial Intelligence
Compiled by Dana Nejedlová in October 2003

When we look to the past, we can see that people have always striven to ease their lives by making machines to perform tasks that demand strength, speed, or dull repetition. In the beginning this involved only physical tasks, but later people needed help with tasks that until then could only be solved mentally. A typical task of this kind is computing with large numbers. It is now evident that it is possible to construct machines, named computers, that can compute with large numbers far faster than most people can. But people had greater ambitions than computing with large numbers. They wanted to construct an artificial man that would behave like a genuine man, and this task has proved extremely difficult. From this you can deduce that artificial intelligence, or AI, is connected with computers and with the wish of people to construct artificial men, or robots. But where does this curious wish of people to make beings like themselves, by means other than ordinary and easy breeding, originate? I think that it stems from people's curiosity about how they themselves are constructed. What makes them feel what they feel?

All this effort to create intelligent beings other than biological ones needs some theoretical background. This background has been built up since antiquity. One necessary ingredient was logic. The field of logic was initiated by Aristotle (384 – 322 BC) with his method of deductive reasoning characterised by the syllogism. A syllogism is a form of reasoning in which two statements are made and a logical conclusion is drawn from them. For example: all men are mortal; Socrates is a man; therefore Socrates is mortal.

In the 17th century materialist philosophy flourished in Europe. French philosopher René Descartes (1596 – 1650) proposed that the bodies of animals are nothing more than complex machines. British philosopher Thomas Hobbes (1588 – 1679), in his book Leviathan (1651), came to the idea that any intelligent entity must have a body, that this body need not be all in one piece but could be spread all over the place, and that reasoning or mind could be reduced to computation. So the body of an artificial intelligence must be material and it must be able to compute.

The scientific field of artificial intelligence could not go any further until computers were constructed. The first computing machines were envisaged in the 17th century by Gottfried Wilhelm von Leibniz (1646 – 1716), and a programmable computer was designed in the 19th century by Charles Babbage (1791 – 1871). The pace of computer building was accelerated by the Second World War. Since 1940 computer construction has no longer been a merely theoretical pursuit but has been part of the activities that bring strategic advantage to the most developed countries.

The ideas of creating an artificial formal language patterned on mathematical notation in order to classify logical relationships, and of reducing logical inference to a purely formal and mechanical process, were due to Leibniz. Leibniz’s own mathematical logic, however, was severely defective, and he is better remembered simply for introducing these ideas as goals to be attained than for his attempts at realising them.

Although philosophy provided the initial ideas for artificial intelligence, it took mathematics to turn these ideas into a formal science. In 1847, British mathematician and logician George Boole (1815 – 1864) developed a mathematical theory of binary logic and arithmetic known as Boolean algebra. Boolean algebra is a two-valued, or binary, algebra, in which a proposition can have the value yes or no, true or false, 1 or 0; there are no intermediate values. Boolean logic and its derivatives provided a formal language that allowed mathematicians and philosophers to explicitly describe the logic proposed by Aristotle in a precise and unambiguous way.
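As a rough illustration of how two-valued logic makes reasoning mechanical, the sketch below checks logical laws by trying every combination of truth values. The helper names `implies` and `is_tautology` are invented for this example and do not come from the original text.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: 'p implies q' is false only when p is true and q is false."""
    return (not p) or q

def is_tautology(formula, n_vars: int) -> bool:
    """Return True if the formula holds for every assignment of True/False to its variables."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))

# De Morgan's law: not (p and q) is equivalent to (not p) or (not q).
de_morgan = lambda p, q: (not (p and q)) == ((not p) or (not q))

# A valid argument form (modus ponens): from p and 'p implies q', conclude q.
modus_ponens = lambda p, q: implies(p and implies(p, q), q)

# An invalid argument form: from q and 'p implies q', conclude p.
affirming_consequent = lambda p, q: implies(q and implies(p, q), p)

print(is_tautology(de_morgan, 2))
print(is_tautology(modus_ponens, 2))
print(is_tautology(affirming_consequent, 2))
```

Because every proposition is either true or false, a formula with n variables has only 2^n cases to check, so validity can be decided by exhaustive enumeration.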

Information processing in contemporary computers employs the binary principles of Boolean algebra developed by George Boole. Among the people who were designing computers in the 1940s and 1950s, the Hungarian-born American John von Neumann (1903 – 1957) and the British scientist Alan Mathison Turing (1912 – 1954) in particular were interested in the principles of artificial intelligence.

While John von Neumann is known for determining the architecture of computer hardware, Alan Turing formalised the concept of the algorithm for digital computers. Turing did this by conceiving an abstract representation of a computing device, known today as the “Turing machine”, which he published in his paper “On Computable Numbers” in 1936. The Turing machine consists of a reading and writing head that scans a (possibly infinite) one-dimensional tape divided into squares, each of which is inscribed with a 0 or 1. Computation begins with the machine, in a given “state”, scanning a square. It erases what it finds there, prints a 0 or 1, moves to an adjacent square, and goes into a new state. This behaviour is completely determined by three parameters: (1) the state the machine is in, (2) the number on the square it is scanning, and (3) a table of instructions. The table of instructions specifies, for each state and binary input, what the machine should write, which direction it should move in, and which state it should go into. (E.g., “If in State 1 scanning a 0: print 1, move left, and go into State 3”.) The table can list only finitely many states, each of which becomes implicitly defined by the role it plays in the table of instructions. These states are often referred to as the “functional states” of the machine. Computer scientists and logicians have shown that Turing machines, given enough time and tape, can compute any function that any conventional digital computer can compute. In this sense the Turing machine is essentially equivalent to today’s general-purpose computers. The concept of the Turing machine was revolutionary for its time. Most computers in the 1950s were designed for a particular purpose or a limited range of purposes. What Turing envisioned was a machine that could do anything, something that we take for granted today. The method of instructing the computer was very important in Turing’s concept.
He essentially described a machine that knew a few simple instructions. Making the computer perform a particular task was simply a matter of breaking the job down into a series of these simple instructions. This is the same process programmers go through today. He believed that an algorithm could be developed for almost any problem. The hard part was determining what the simple steps were and how to break down the larger problems.
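The table-driven behaviour described above can be sketched in a few lines. This is only an illustrative simulator, not Turing's own formulation: the function name `run_turing_machine`, the state names, the blank symbol "_" and the example instruction table (which adds one to a binary number) are all assumptions made for the example.

```python
from collections import defaultdict

def run_turing_machine(table, tape, state="start", blank="_", max_steps=10_000):
    """Run an instruction table on a tape and return the non-blank tape contents.

    table maps (state, scanned symbol) -> (symbol to write, move 'L' or 'R', next state).
    """
    cells = defaultdict(lambda: blank, enumerate(tape))  # tape is unbounded in both directions
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = table[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    used = [i for i, symbol in cells.items() if symbol != blank]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))

# Hypothetical instruction table: walk to the right end of the number, then add 1 with carry.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "L", "halt"),
    ("carry", "_"): ("1", "L", "halt"),
}

print(run_turing_machine(INCREMENT, "1011"))
```

Each entry of the table has the same shape as the example instruction in the text: given a state and a scanned symbol, it names what to print, which way to move, and which state to enter next.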

Alan Turing is considered to be one of the fathers of artificial intelligence. In 1950 he wrote a paper describing what is now known as the “Turing Test”. The test consists of a person putting questions via keyboard to both a person and an intelligent machine. Turing believed that if the questioner could not tell the machine apart from the person after a reasonable amount of time, the machine could be regarded as intelligent. The Turing test can thus be used to provide a possible definition of intelligence. The test can vary in difficulty, ranging from a talk about some limited subject to solving common-sense problems, which are the most difficult for machines because they require facts about a vast quantity of everyday objects to be entered into the machine. In 1990 Hugh Gene Loebner agreed with The Cambridge Center for Behavioral Studies to underwrite a contest designed to implement the Turing Test. Dr. Loebner pledged a Grand Prize of $100,000 and a Gold Medal for the first computer whose responses were indistinguishable from a human's. Each year a prize of $2000 and a bronze medal are awarded to the most human-like computer. The winner of the annual contest is the best entry relative to the other entries that year, irrespective of how good it is in an absolute sense.

Although the computer provided the technology necessary for artificial intelligence, it was not until the early 1950s that the link between human intelligence and machines was really observed. Norbert Wiener (1894 – 1964) was one of the first Americans to make observations on the principle of feedback theory. The most familiar example of feedback is the thermostat: it controls the temperature of an environment by measuring the actual temperature of the house, comparing it to the desired temperature, and responding by turning the heat up or down. What was so important about his research into feedback loops was that Wiener theorised that all intelligent behaviour was the result of feedback mechanisms, mechanisms that could possibly be simulated by machines. This idea influenced much of the early development of artificial intelligence.
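The thermostat loop can be written out as a small simulation. This is a toy sketch: the function names, the tolerance band and the heating and cooling rates are invented for illustration and do not model any real device.

```python
def thermostat_step(actual, desired, heater_on, band=0.5):
    """Compare the measured temperature with the target and decide the heater state."""
    if actual < desired - band:
        return True        # too cold: switch the heater on
    if actual > desired + band:
        return False       # too warm: switch the heater off
    return heater_on       # close enough: leave the heater as it is

def simulate(start_temp, desired, steps, heat_gain=0.8, heat_loss=0.3):
    """Repeat the measure, compare, respond cycle and record the temperature."""
    temp, heater_on, history = start_temp, False, []
    for _ in range(steps):
        heater_on = thermostat_step(temp, desired, heater_on)
        temp += heat_gain if heater_on else -heat_loss  # assumed net change per step
        history.append(round(temp, 2))
    return history

print(simulate(start_temp=15.0, desired=20.0, steps=30))
```

The output of each step (the new temperature) becomes the input of the next decision, which is what makes this a feedback loop: the room settles into a narrow range around the desired temperature without anyone planning the sequence of switch events in advance.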

Since Turing, there have been two kinds of approach to the human mind. The first holds that it is basically a digital computer; the second holds that it is not. The first approach, also called Good Old-fashioned Artificial Intelligence or symbolic artificial intelligence, was the dominant approach in artificial intelligence through the mid-1980s. On this view, the mind just is a computer that manipulates symbols, and this symbol manipulation can be regarded as thinking. The second approach has been called New-fangled Artificial Intelligence, and its most prominent branch has been connectionism. Good Old-fashioned Artificial Intelligence assumes that an intelligent machine represents the world somehow in its memory and is able to operate on this representation to achieve its goals. Symbolic AI tries to reconstruct human intelligence from the top down, by breaking intellectual abilities into forms that machines can carry out. New-fangled Artificial Intelligence goes from the bottom up, from the simplest reactions to the more complex behaviour of machines that is supposed to emerge.

In 1943 Warren McCulloch and Walter Pitts published their paper dealing with what are generally regarded as the first neural networks. These researchers recognised that combining many simple neurons into neural systems was the source of increased computational power. The weights on a McCulloch-Pitts neuron are set so that the neuron performs a particular simple logic function. The neurons can be arranged into a net to produce any output that can be represented as a combination of logic functions. They were hard-wired logic devices, which proved that networks of simple neuron-like elements could compute. Because they were hard-wired, they did not have the mechanisms for learning, and so they were extremely limited in modelling the functions of the more flexible and adaptive human nervous system. Solving tasks via neural networks is also called connectionism. Connectionism is a movement in cognitive science, which hopes to explain human intellectual abilities using artificial neural networks. Connectionist networks get their name from the fact that they consist of multiply connected units that interact among themselves. Obviously these systems are modelled on the biological nervous system. The interdisciplinary field of cognitive science brings together computer models from AI and experimental techniques from psychology to try to construct precise and testable theories of the workings of the human mind.
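A minimal sketch of such hard-wired threshold units follows. It is a simplification: the original McCulloch-Pitts model treats inhibition as an absolute veto, whereas this example uses a negative weight, and the function names are chosen only for illustration.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

# Fixed weights and thresholds make each unit compute one simple logic function.
def AND(a, b):
    return mp_neuron([a, b], [1, 1], threshold=2)

def OR(a, b):
    return mp_neuron([a, b], [1, 1], threshold=1)

def NOT(a):
    return mp_neuron([a], [-1], threshold=0)

# Arranging units into a net yields functions that no single unit computes, such as XOR.
def XOR(a, b):
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

Nothing in this net changes with experience: the weights are set by hand, which is the limitation the paragraph above points out.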

In 1949 Donald Olding Hebb, a psychologist at McGill University in Canada, designed the first learning law for artificial neural networks. His premise was that if two neurons were active simultaneously, then the strength of the connection between them should be increased.
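Hebb's premise is often written as a weight change proportional to the product of the two activities. The sketch below uses that common textbook form; the learning rate, the function name `hebbian_update` and the three-neuron example are assumptions for illustration, not Hebb's own notation.

```python
def hebbian_update(weights, activity, rate=0.1):
    """Strengthen the connection between every pair of neurons that are active together.

    weights[i][j] is the connection strength between neuron i and neuron j;
    activity[i] is 1 if neuron i is active and 0 otherwise.
    """
    n = len(activity)
    updated = [row[:] for row in weights]  # leave the caller's matrix untouched
    for i in range(n):
        for j in range(n):
            if i != j:
                updated[i][j] += rate * activity[i] * activity[j]
    return updated

weights = [[0.0] * 3 for _ in range(3)]
for _ in range(5):
    # Neurons 0 and 1 fire together repeatedly; neuron 2 stays silent.
    weights = hebbian_update(weights, [1, 1, 0])

print(weights)
```

After repeated co-activation only the connection between neurons 0 and 1 has grown, while connections involving the silent neuron remain at zero.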

In 1951 two graduate students in the Princeton mathematics department, Marvin Minsky and Dean Edmonds, built the SNARC (Stochastic Neural-Analog Reinforcement Computer), the first neural network computer. It was a randomly wired neural network learning machine consisting of 40 neurons, based on the reinforcement of simulated synaptic transmission coefficients. Marvin Minsky, who is one of the most prominent figures in artificial intelligence, has made many contributions to artificial intelligence, cognitive psychology, mathematics, computational linguistics, robotics, and optics.

In late 1955, Allen Newell and Herbert A. Simon of Carnegie Institute of Technology, now Carnegie Mellon University (CMU), together with J. C. Shaw of the RAND Corporation, developed The Logic Theory Machine, also called the Logic Theorist, considered by many to be the first artificial intelligence program. The program was basically a tree-search system for finding proofs of mathematical theorems. The impact that the Logic Theorist made on both the public and the field of artificial intelligence has made it a crucial stepping stone in the development of the field.

In 1956 John McCarthy of Dartmouth College, regarded as the father of artificial intelligence, organised a conference to draw on the talent and expertise of others interested in machine intelligence for a two-month workshop. The other participants were Marvin Minsky from Harvard, Nathaniel Rochester from IBM, Claude Shannon from Bell Telephone Laboratories, Trenchard More from Princeton, Arthur Samuel from IBM, Oliver Selfridge and Ray Solomonoff from MIT, and Allen Newell and Herbert Simon from Carnegie Tech. John McCarthy invited them to Hanover, New Hampshire, for “The Dartmouth summer research project on artificial intelligence”. From that point on, because of McCarthy, the field would be known as Artificial Intelligence. Although not a huge success, the Dartmouth conference did bring together the founders of artificial intelligence, and served to lay the groundwork for the future of artificial intelligence research. In the seven years after the conference, artificial intelligence began to pick up momentum. Although the field was still undefined, ideas formed at the conference were re-examined and built upon. Centres for artificial intelligence research began forming at Carnegie Mellon, MIT, Stanford, and IBM, and new challenges were faced: first, creating systems that could solve problems efficiently by limiting the search, such as the Logic Theorist; and second, making systems that could learn by themselves.

The early years of AI were full of successes – in a limited way. Given the primitive computers and programming tools of the time, and the fact that only a few years earlier computers were seen as things that could do arithmetic and no more, it was astonishing whenever a computer did anything remotely clever. From the beginning, AI researchers were not shy in making predictions of their coming successes. In 1958 Herbert Simon predicted that within 10 years a computer would be chess champion, and an important new mathematical theorem would be proved by machine. Claims such as these turned out to be wildly optimistic. The barrier that faced almost all AI research projects was that methods that sufficed for demonstrations on one or two simple examples turned out to fail miserably when tried out on wider selections of problems and on more difficult problems.

The work of mathematicians such as David Hilbert (1862 – 1943) and Kurt Gödel (1906 – 1978) led to the realisation that there are some functions on the integers that cannot be represented by an algorithm, that is, they cannot be computed. This motivated Alan Turing to try to characterise exactly which functions are capable of being computed. This notion is actually slightly problematic, because the notion of a computation or effective procedure really cannot be given a formal definition. However, the Church-Turing thesis, which states that the Turing machine is capable of computing any computable function, is generally accepted as providing a sufficient definition. Turing also showed that there are some functions that no Turing machine can compute. For example, no machine can tell in general whether a given program will return an answer on a given input or run forever. This is the so-called halting problem.

Gödel is best known for his proof of "Gödel's Incompleteness Theorems". In 1931 he published these results in Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme. He proved fundamental results about axiomatic systems, showing that in any consistent axiomatic mathematical system capable of expressing general arithmetic (equality, addition and multiplication of natural numbers) there are propositions that cannot be proved or disproved within the axioms of the system. In particular, the consistency of the axioms cannot be proved within the system itself. This ended a hundred years of attempts to establish axioms which would put the whole of mathematics on an axiomatic basis. One major attempt had been by Bertrand Russell with Principia Mathematica (1910-1913). Another was Hilbert's formalism, which was dealt a severe blow by Gödel's results. The theorems did not destroy the fundamental idea of formalism, but they did demonstrate that any system would have to be more comprehensive than that envisaged by Hilbert. Gödel's results were a landmark in 20th-century mathematics, showing that mathematics is not a finished object, as had been believed. They also imply that a computer can never be programmed to answer all mathematical questions.

Although undecidability and noncomputability are important to an understanding of computation, the notion of intractability has had a much greater impact. Roughly speaking, a class of problems is called intractable if the time required to solve instances of the class grows at least exponentially with the size of the instances. The distinction between polynomial and exponential growth in complexity was first emphasised in the mid-1960s (Cobham, 1964; Edmonds, 1965). It is important because exponential growth means that even moderate-sized instances cannot be solved in any reasonable time. Therefore, one should strive to divide the overall problem of generating intelligent behaviour into tractable subproblems rather than intractable ones. The second important concept in the theory of complexity is reduction, which also emerged in the 1960s (Dantzig, 1960; Edmonds, 1962). A reduction is a general transformation from one class of problems to another, such that solutions to the first class can be found by reducing them to problems of the second class and solving the latter problems. How can one recognise an intractable problem? The theory of NP-completeness, pioneered by Stephen Cook (1971) and Richard Karp (1972), provides a method. The concept of NP-completeness was invented by Cook, and the modern method for establishing a reduction from one problem to another is due to Karp. Cook and Karp have both won the Turing award, the highest honour in computer science, for their work. Cook and Karp showed the existence of large classes of canonical combinatorial search and reasoning problems that are NP-complete. Any problem class to which an NP-complete problem class can be reduced is likely to be intractable. These results contrast sharply with the “Electronic Super-Brain” enthusiasm accompanying the advent of computers. Despite the ever-increasing speed of computers, subtlety and careful use of resources will characterise intelligent systems.
Put crudely, the world is an extremely large problem instance! Before the theory of NP-completeness was developed, it was widely thought that “scaling up” to larger problems was simply a matter of faster hardware and larger memories. The optimism that accompanied the development of resolution theorem proving, for example, was soon dampened when researchers failed to prove theorems involving more than a few dozen facts. The fact that a program can find a solution in principle does not mean that the program contains any of the mechanisms needed to find it in practice.

The field of complexity analysis analyses problems rather than algorithms. The first gross division is between problems that can be solved in polynomial time and those that cannot be solved in polynomial time, no matter what algorithm is used. The class of polynomial problems is called P. These are sometimes called “easy” problems, because the class contains those problems with running times like O(log n) and O(n). But it also contains those with O(n^1000), so the name “easy” should not be taken literally. Another important class of problems is NP, the class of nondeterministic polynomial problems. A problem is in this class if there is some algorithm that can guess a solution and then verify whether or not the guess is correct in polynomial time. The idea is that if you either have an exponentially large number of processors so that you can try all the guesses at once, or you are very lucky and always guess right the first time, then the NP problems become P problems. One of the big open questions in computer science is whether the class NP is equivalent to the class P when one does not have the luxury of an infinite number of processors or omniscient guessing. Most computer scientists are convinced that P ≠ NP, that is, that the hardest NP problems are inherently hard and have no polynomial-time algorithms. But this has never been proven. Those who are interested in deciding if P = NP look at a subclass of NP called the NP-complete problems. The word complete is used here in the sense of “most extreme”, and thus refers to the hardest problems in the class NP. It has been proven that either all the NP-complete problems are in P or none of them is. This makes the class theoretically interesting, but the class is also of practical interest because many important problems are known to be NP-complete. An example is the satisfiability problem: given a logical expression, is there an assignment of truth values to the variables of the expression that makes it true?
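
The guess-and-verify idea behind the satisfiability problem can be sketched in a few lines of Python. This is only an illustration: the encoding of clauses as lists of signed integers and the function names are assumptions made here, not something taken from the literature cited above. Verifying one guess is cheap, whereas the naive search may have to try all 2^n assignments.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Verify a guess: every clause must contain at least one true literal.

    A literal k > 0 means variable k is true; k < 0 means variable |k| is false.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return the first satisfying one, else None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(formula, 3)

# (x1) and (not x1) has no satisfying assignment
unsat = brute_force_sat([[1], [-1]], 1)
```

With three variables the search is trivial, but each extra variable doubles the number of assignments to try, which is the exponential growth discussed above.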

In 1957 Frank Rosenblatt at the Cornell Aeronautical Laboratory invented the Perceptron in an attempt to understand human memory, learning, and cognitive processes. On the 23rd of June 1960, he demonstrated the Mark I Perceptron, the first machine that could “learn” to recognise and identify optical patterns. Rosenblatt’s work was a progression from the biological neural studies of noted neural researchers such as Donald Hebb and from the works of Warren McCulloch and Walter Pitts that I have already mentioned. The most typical perceptron consisted of an input layer of neurons, analogous to the retina in the eye, connected by paths to an output layer of neurons. The weights on the connection paths were adjustable. The perceptron learning rule uses an iterative weight adjustment that is more powerful than the Hebb rule.
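
The iterative weight adjustment can be sketched as follows. This is a minimal illustration of the perceptron rule on the logical AND function, not a model of the Mark I hardware; the function names, the learning rate and the training data are assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 when the weighted sum exceeds zero."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20, rate=1.0):
    """Perceptron rule: after each misclassification, shift the weights
    by rate * (target - output) * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                errors += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if errors == 0:  # every sample classified correctly
            break
    return weights, bias

# Truth table of logical AND as (inputs, target) pairs
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
```

Because AND is linearly separable, the loop stops after a handful of passes with weights that classify all four inputs correctly.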

In 1957, the first version of a new program, the General Problem Solver (GPS), was tested. The program was developed by the same team that had developed the Logic Theorist. The GPS was an extension of Wiener’s feedback principle, and was capable of solving a wider range of common sense problems. Unlike the Logic Theorist, this program was designed from the start to imitate human problem-solving protocols. Within the limited class of puzzles it could handle, it turned out that the order in which the program considered subgoals and possible actions was similar to the way humans approached the same problems. Thus, GPS was probably the first program to embody the “thinking humanly” approach.

In 1960 Bernard Widrow and his student Marcian Ted Hoff developed a learning rule, which usually either bears their names, or is designated the least mean squares or delta rule, that is closely related to the perceptron learning rule. The similarity of models developed in psychology by Rosenblatt to those developed in electrical engineering by Widrow and Hoff is evidence of the interdisciplinary nature of neural networks. The Widrow-Hoff learning rule for a two-layer network is a precursor of the backpropagation rule for multilayer nets. Work of Widrow and his students is sometimes reported as ADAptive LINEar Systems or neurons or ADALINES. In 1962 their work was extended to MADALINES as multilayer versions of ADALINES.
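
For comparison with the perceptron rule, the least mean squares (delta) rule can be sketched like this. The sketch assumes a single linear unit, and the data set and parameter values are hypothetical. The key difference is that the error is measured on the raw linear output rather than on a thresholded one.

```python
def train_lms(samples, epochs=200, rate=0.1):
    """Delta rule: adjust weights in proportion to the error of the linear output."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = bias + sum(w * xi for w, xi in zip(weights, x))  # no threshold
            error = target - output
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
    return weights, bias

# Hypothetical noise-free data generated from y = 2*x1 - x2 + 0.5
data = [((0.0, 0.0), 0.5), ((1.0, 0.0), 2.5), ((0.0, 1.0), -0.5), ((1.0, 1.0), 1.5)]
weights, bias = train_lms(data)
```

On this small data set the learned weights approach 2 and -1 and the bias approaches 0.5, the values used to generate the data.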

While more programs were being produced, McCarthy, who had moved from Dartmouth to MIT, was busy developing a major breakthrough in artificial intelligence history. In 1958 McCarthy announced his new development, the LISP language, which is still used today, especially in the USA. LISP stands for LISt Processing, and was soon adopted as the language of choice among most artificial intelligence developers. Later, special “LISP processors” and computers were developed to speed up processing, for example SYMBOLICS 36XX (Symbolics Inc.), XEROX 11XX (Xerox Company), and EXPLORER (Texas Instruments).

A couple of years after the GPS, IBM contracted a team to research artificial intelligence. Herbert Gelernter spent three years working on a program for solving geometry theorems called the Geometry Theorem Prover, which was completed in 1959. Like the Logic Theorist, it proved theorems using explicitly represented axioms.

In 1959, Minsky, along with John McCarthy, founded the Artificial Intelligence Laboratory at MIT. It was here that the first theories of artificial intelligence were formulated and applied. Work at MIT in the mid-to-late 1960s focused on getting computers to manipulate blocks, which meant they had to understand three-dimensional geometry and certain aspects of physics. And they had to be able to see. The problem of how to make a computer not only see, through video cameras, but more importantly and problematically how to make it make sense of what it sees, was tackled by a variety of researchers at MIT including Larry Roberts, Gerald Sussman, Adolfo Guzman, Max Clowes and David Huffman, David Waltz, Patrick Winston, and Berthold Horn. The end result of their efforts was “micro-blocks world”, where a robot was able to see the set of blocks on the table and move and stack them. Minsky supervised a series of students who chose limited problems that appeared to require intelligence to solve. These limited domains became known as microworlds. The most famous microworld was the blocks world, which consists of a set of solid blocks placed on a tabletop. A task in this world is to rearrange the blocks in a certain way, using a robot hand that can pick up one block at a time.

In 1963 John McCarthy founded the Artificial Intelligence Laboratory at Stanford University.

In 1963 Marvin Minsky’s student James Slagle wrote SAINT (Symbolic Automatic INTegrator), which worked like the Logic Theorist but on problems of calculus rather than logic. This is a tricky domain because, unlike simple arithmetic, to solve a calculus problem – and in particular to perform integration – you have to be smart about which integration technique should be used: integration by partial fractions, integration by parts, and so on.

In 1965 John Alan Robinson formulated a general method of automatic deduction of sentences in predicate calculus based on the so-called resolution principle. In the same year Dr. Lotfi A. Zadeh of the University of California at Berkeley developed fuzzy logic. The importance of fuzzy logic derives from the fact that most modes of human reasoning, and especially common sense reasoning, are approximate in nature. Fuzzy logic, as its name suggests, is a technique of applying logic to “fuzzy” or imprecise data and complex situations. In classical logic, a statement is either true or false. In fuzzy logic, a statement is true to some degree: everything is a matter of degree. Knowledge is interpreted as a collection of elastic or, equivalently, fuzzy constraints on a collection of variables, and inference, which is the deriving of a conclusion, is viewed as a process of propagation of elastic constraints. For decades fuzzy logic has been applied on a massive scale in Japanese industry, which has probably helped that country to overtake the USA in many industrial branches.
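
The notion that truth is a matter of degree can be sketched as below. The membership function for “warm” and the humidity value are hypothetical; the minimum, maximum and complement operators are the standard choices usually attributed to Zadeh, although other operators are also in use.

```python
def warm(temp_c):
    """Hypothetical membership function: degree to which a temperature is 'warm'."""
    if temp_c <= 15:
        return 0.0
    if temp_c >= 25:
        return 1.0
    return (temp_c - 15) / 10  # linear ramp between 15 and 25 degrees Celsius

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def fuzzy_not(a):
    return 1.0 - a

degree_warm = warm(20)   # partly warm
degree_humid = 0.75      # assumed degree of "humid", for illustration only
# "warm AND NOT humid" is itself true only to some degree
comfortable = fuzzy_and(degree_warm, fuzzy_not(degree_humid))
```

Here a temperature of 20 degrees is warm to degree 0.5, and the compound statement comes out true to degree 0.25 rather than simply true or false.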

Marvin Minsky’s student Daniel Bobrow from MIT produced STUDENT in 1967, which could solve algebra story problems.

In 1968 Marvin Minsky’s student Tom Evans from MIT created the program ANALOGY, which had excellent results on automated analogies of the form “figure A is to figure B as figure C is to figure D”.

Bertram Raphael from MIT wrote SIR (Semantic Information Retrieval) in 1968 that was able to accept input statements in a very restricted subset of English and answer questions thereon.

Joseph Weizenbaum from MIT created the natural language processing program ELIZA in 1966. It was more or less an intellectual exercise to show that natural language processing could be done. ELIZA is an automated psychoanalysis program based on the psychoanalytic principle of repeating what the patient says and drawing introspection out of the patient without adding content from the analyst. It actually just borrowed and manipulated sentences typed into it by a human. Weizenbaum believed a computer program shouldn’t be used as a substitute for human interpersonal respect, understanding, and love. He rejected its use on ethical grounds. “ELIZA - a computer program for the study of natural language communication between man and machine” is the title of the paper that Weizenbaum published about it in 1966.

In 1968 Carl Engelman, William Martin, and Joel Moses of MIT developed a large interactive mathematics expert system called MACSYMA, which could manipulate mathematical expressions symbolically. The project entailed 100 person-years of software design and LISP programming. It is the most powerful system yet developed to solve algebraic problems on a computer. The user enters formulas and commands, which the system converts into solutions to extremely complex symbolic problems.

In 1969 Ross Quillian proposed a model for semantic knowledge in the form of a computer program called TLC – Teachable Language Comprehender. Quillian’s goal was to explore the way that knowledge about the meaning of words and concepts could be stored in a computer program that represented an artificial intelligence model for language comprehension.

Work similar to the blocks world at Stanford University eventually led to a robot that could construct an automobile water pump from randomly scattered parts, and then in 1969 to “SHAKEY”, a wobbly robot on wheels that was able to move around rooms picking up and stacking boxes. It was the first autonomous robot.

Numerous refinements to the artificial intelligence control programs were made over the years. Each tiny improvement took a lot of effort. A program called STRIPS (an abbreviation of STanford Research Institute Problem Solver) was one of the earliest robot-planning programs. It was developed by R. E. Fikes and N. J. Nilsson during 1971 and 1972. STRIPS was attached to the robot SHAKEY, which had simple vision capabilities as well as tactile sensors. It took the lead over GPS for a while. STRIPS and GPS were similar in that they both used means-ends analysis. The main difference between them was in their control strategy for selecting operators. GPS used an operator-difference table; STRIPS used theorem proving. At the same time C. Hewitt from MIT presented the system called PLANNER.

Then along came Terry Winograd’s SHRDLU (a nonsense name with no meaning). SHRDLU, developed in 1969 at MIT, was more than an incremental advance; it was a considerable one. It was a pioneering natural language processing system that let humans interrogate the robot in a blocks world. It could manipulate coloured building blocks based on a set of instructions and was programmed to ask questions for clarification of commands. SHRDLU was part of the micro worlds project, also called blocks worlds, which consisted of research and programming in small worlds (such as one with a limited number of geometric shapes). The MIT researchers, headed by Marvin Minsky, demonstrated that when confined to a small subject matter, computer programs could solve spatial problems and logic problems. The result of these programs was a refinement in language comprehension and logic.

In 1969, Marvin Minsky and Seymour Papert published a book called Perceptrons: An Introduction to Computational Geometry, which emphasised the limitations of the perceptron and criticised claims about its usefulness. In effect, this killed funding for neural network research for 12-15 years. Minsky and Papert demonstrated there that the perceptron could not solve so-called linearly inseparable problems, the simplest example of which is XOR, also called the exclusive-or function.
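
The difference between a separable function such as AND and the inseparable XOR can be illustrated with a small search. This sketch assumes a single threshold unit with two inputs and tries every weight combination on a coarse grid. The grid and the function names are arbitrary choices for the example, and a finite search is of course an illustration rather than a proof.

```python
from itertools import product

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def threshold_unit(w1, w2, bias, x):
    """Single perceptron unit with two inputs."""
    return 1 if w1 * x[0] + w2 * x[1] + bias > 0 else 0

def find_unit(table, grid):
    """Search a grid of weights for one threshold unit that reproduces the table."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(threshold_unit(w1, w2, bias, x) == y for x, y in table.items()):
            return (w1, w2, bias)
    return None

grid = [v / 2 for v in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
and_unit = find_unit(AND, grid)       # some weights work for AND
xor_unit = find_unit(XOR, grid)       # no weights on the grid work for XOR
```

The general argument is short. XOR requires bias ≤ 0, w1 + bias > 0 and w2 + bias > 0; adding the last two gives w1 + w2 + bias > -bias ≥ 0, which contradicts the fourth requirement, w1 + w2 + bias ≤ 0. So no choice of weights can succeed, on the grid or off it.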

Partly because of the perceptron critique, the 1970s saw the advent of the expert system, which belongs to symbolic artificial intelligence. Expert systems predict the probability of a solution under set conditions. The programs for expert systems use a lot of IF THEN statements and heuristics. They search through the space of possible solutions, and are guided by rule-of-thumb principles. The modern term for the latter idea is “heuristic search”, a heuristic being any rule-of-thumb principle that cuts down the amount of searching required in order to find the solution to a problem. Programming using heuristics is a major part of modern artificial intelligence, as is the area now known as machine learning. Over the course of ten years, expert systems were introduced to forecast the stock market, to help doctors diagnose disease, and to direct miners to promising mineral locations.

During the 1970s many new methods in the development of artificial intelligence were tested, notably Minsky’s frames theory in 1975. A frame is a data structure for representing a stereotyped situation, for example, a living room or birthday party. It includes information about how to use it, what to expect, what to do if expectations are not met. It can be thought of as a network of nodes and relations. The frames theory adopted a structured approach, collecting together facts about particular object and event types, and arranging the types into a large taxonomic hierarchy analogous to a biological taxonomy.
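
A frame with slots and taxonomic inheritance can be sketched roughly as follows. The class name, the slot names and the example rooms are invented for illustration and do not follow Minsky’s notation; the point is only that a frame inherits default expectations from more general frames unless it overrides them.

```python
class Frame:
    """A frame: named slots plus an optional parent that supplies default values."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        """Look up a slot locally, then climb the hierarchy for a default."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)

# A small taxonomy: room -> living room -> one particular living room
room = Frame("room", walls=4, has_ceiling=True)
living_room = Frame("living room", parent=room, furniture=["sofa", "table"])
my_room = Frame("my living room", parent=living_room, walls=5)

walls_default = living_room.get("walls")   # inherited from "room"
walls_actual = my_room.get("walls")        # overridden locally
```

The stereotyped expectation (four walls) is stored once at the top of the hierarchy, and a specific situation records only what differs from the stereotype.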

British psychologist David Marr pioneered the mathematical analysis of vision. In his research, he studied such questions as how depth is perceived, how motion is perceived, and what defines boundaries in the visual field. He claimed that to process nearly infinite combinations, the brain must operate on visual information in certain mathematical ways and have the ability to be finely tuned on many different scales. He intensively studied the fly visual system, working out many of its details. His work has been incredibly important, not only in the understanding of human vision, but in creating the possibility of machine vision.

In 1973 Alain Colmerauer presented an outline of PROLOG, a logic-programming language for expert systems that he had first proposed in 1967. The language has become enormously popular, especially in Europe and Japan, and was adopted for use in the Japanese Fifth Generation Program announced in 1981. This was a 10-year plan to build intelligent computers running PROLOG in much the same way that ordinary computers run machine code. The idea was that with the ability to make millions of inferences per second, computers would be able to take advantage of vast stores of rules. The project proposed to achieve full-scale natural language understanding, among other ambitious goals. Many of those goals have not been achieved yet, but the project helped to make a qualitative leap in computer development.

The first expert systems were DENDRAL and MYCIN. DENDRAL took ten years, from 1965 to 1975, to develop at Stanford University under a team headed by Edward Feigenbaum and Robert Lindsay. Feigenbaum is today considered the guru of expert systems. DENDRAL was designed to help chemists determine the structure of molecules from spectroscopic data, a problem previously done painstakingly by trial and error and relying on the expertise of the chemist. DENDRAL, programmed in LISP, worked very well until the number of rules and logic grew beyond a certain point of complexity, when it became very difficult to add new rules or make adjustments to existing ones while maintaining stability. The system essentially became chaotic, with a small change in initial conditions having large and unforeseen impacts down the line. Nevertheless, DENDRAL has been routinely used since 1969 via computer net, which makes it the expert system used for the longest time.

INTERNIST, an internal medicine expert system that is now called CADUCEUS, was developed at the University of Pittsburgh in the early 1970s by Harry Pople and Jack Myers to analyse hundreds of clinical problems. The program begins by asking the physician to describe the patient’s symptoms and medical history. Each symptom is then analysed to determine the disease. Written in LISP, the system addresses some 500 diseases, 25 percent of which are within the realm of internal medicine.

MYCIN, designed to diagnose infectious blood diseases, went some way toward overcoming DENDRAL’s shortcoming by separating the rules governing when to apply the rules from the knowledge base, which is itself a list of IF THEN rules. MYCIN’s problem domain was the selection of antibiotics for patients with serious infections. Medical decision making, particularly in clinical medicine, is regarded as an “art form” rather than a “scientific discipline”: this knowledge must be systemised for practical day-to-day use and for teaching and learning clinical medicine. Its target users were physicians and possibly medical students and paramedics. The originator of MYCIN was Edward Shortliffe from the Department of Medicine and Computer Science at Stanford University School of Medicine in California, who created it in 1972. EMYCIN is a problem-independent version of MYCIN, which is still used in American medical practice.

DENDRAL was an all-or-nothing system. It would only provide an answer when it was 100% certain of the correctness of its response. As we all know, in daily life, few things are certain. This is certainly true of medicine, a profession which, for all its high-tech gadgetry, still relies heavily on physician intuition or heuristic decisions. MYCIN, appropriately for a medical expert, incorporated probability into its decisions. Its answers would not be a straight Yes or No, but “There’s a 63% chance the patient has X infection”. As I have already said, the fundamental advance represented by MYCIN over DENDRAL was that its knowledge base was separated from the control structure. All modern expert systems use this two-part structure, which facilitated the development of expert system “shells”: a control structure plus empty slots into which one could feed expert knowledge from any domain. The primary difference among the large number of modern expert systems is not how they reason, since they all reason in pretty much the same way. The difference, rather, is in what they know. One expert system may know about infectious diseases, another about oil-bearing rock formations. The hardest part of creating a new expert system is transferring knowledge from a human expert into the system’s knowledge base. In education, we call this “teaching”. In artificial intelligence, it’s known as “knowledge engineering”.
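
The two-part structure of a shell can be sketched as one generic control routine that is handed different knowledge bases. The rules below are hypothetical and greatly simplified. Note also that MYCIN itself reasoned backward from goals and attached certainty values to its conclusions, whereas this sketch uses plain forward chaining without certainties.

```python
def forward_chain(rules, facts):
    """Generic control structure: fire IF-THEN rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known

# Two hypothetical knowledge bases fed to the same engine.
# Each rule is (IF conditions, THEN conclusion).
medical_rules = [
    (("fever", "stiff neck"), "suspect meningitis"),
    (("suspect meningitis",), "order lumbar puncture"),
]
geology_rules = [
    (("porous rock", "impermeable cap"), "possible oil trap"),
]

diagnosis = forward_chain(medical_rules, {"fever", "stiff neck"})
survey = forward_chain(geology_rules, {"porous rock"})
```

The function `forward_chain` knows nothing about medicine or geology; all domain expertise lives in the rule lists, which is what makes the engine reusable as a shell.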

Knowledge engineering got its start with TEIRESIAS, a program developed in 1976 by Randall Davis that helped the human expert spot gaps and inconsistencies in the knowledge being transferred to the system. DENDRAL and MYCIN were terrific advances for artificial intelligence in an academic and scientific sense, but they were not ready for prime time in the real world of chemists or doctors. They were not big enough and not powerful enough.

In 1975 Carnegie Mellon University developed HEARSAY, a system for speech understanding. It accepts a speech wave as input and produces a list of hypotheses about what was enunciated as well as a database query based on the best guess of its meaning. The system possessed a 1,000-word vocabulary and a 75 percent accuracy rate in interpreting human speech. The system also demonstrated the clear superiority of the heuristic method over the algorithmic method in dealing with speech understanding.

A commercial expert system, PROSPECTOR, was developed in the late 1970s at Stanford Research Institute International (SRI) by a team of prominent scientists including Richard Duda, Peter Hart, and P. Barnett. The LISP-based system locates valuable ore deposits and produces maps and geological site evaluations. The team worked with a number of mineral experts to fashion the system’s five models. Once the initial data is entered into the system, PROSPECTOR selects the model that best explains the data. This system gained popularity by finding a $100 million molybdenum deposit in Washington during the first six weeks of its usage.

PUFF was developed at Stanford in 1980 to interpret readings from respiratory tests given to patients in a pulmonary (lung) function lab. The system interfaces directly with the lab’s lung machines and measures the capacity of the patient’s lungs and their ability to get oxygen in and carbon dioxide out on a regular basis. PUFF relies on 64 rules stored in the knowledge base to interpret the test data. The system’s accuracy rate is about 93 percent.

The program that counts as the first real-world application of expert system technology was Digital Equipment Corporation (DEC)’s XCON – “Expert Configurer”. XCON, originally called R1 and introduced in 1982 by John McDermott of Carnegie Mellon University, helped DEC salespeople decide what configuration of hardware components was best for a given customer’s needs (DEC sold “clusters” of minicomputers that could be configured in hundreds of different ways). XCON then helped DEC production engineers put the components together. The system simply took a customer’s order as input and drew a set of diagrams that would be used by the assemblers to build a computer. XCON was credited with making DEC profitable. But like DENDRAL and MYCIN before it, XCON too would eventually become bogged down as it grew in size and complexity. Other approaches in artificial intelligence were needed.

In 1974, in his Harvard doctoral thesis, Paul Werbos discovered a learning rule for a multilayer neural net called backpropagation of errors or the generalised delta rule. Although his work did not gain wide publicity, this learning rule was independently rediscovered by people like David Parker in 1985 and Yann LeCun in 1988. The basic elements of the theory can be traced back to the work of Bryson and Ho in 1969. Parker’s work was refined and publicised in 1986 by psychologists David Rumelhart, of the University of California at San Diego, and James McClelland, of Carnegie Mellon University. The discovery played a major role in the re-emergence of neural networks in the 1980s, because it showed how to solve tasks that Minsky and Papert had declared unsolvable by neural nets in 1969.

In 1980 the American philosopher John Searle used his “Chinese room” thought experiment to cast doubt on Good Old-Fashioned artificial intelligence, also called Strong Artificial Intelligence. Strong Artificial Intelligence claims approximately the following: a computer programmed in the right way really is a mind, that is, it can understand and have other cognitive states, which means that the programs actually explain human cognition. Opposed to Strong Artificial Intelligence is Weak Artificial Intelligence, which says that the computer is a useful tool for the study of the human mind, and that it helps us formulate and test our hypotheses in a more precise, rigorous way. Another definition that distinguishes Weak AI from Strong AI is this: the assertion that machines can be made to act as if they were intelligent is called the Weak AI position, while the Strong AI position claims that machines that act intelligently have real, conscious minds. Searle has no objection to Weak Artificial Intelligence, only to Strong Artificial Intelligence. To put it another way, Searle has no objection to the use of computers to simulate intelligence; what he objects to is the notion that intelligence is nothing but manipulating symbols. The Chinese room experiment is about a man sitting in a room with Chinese symbols and rules for composing meaningful sentences from these symbols. The man in the room doesn’t know Chinese, but he knows the language in which the rules are described. He can’t get the meaning of the Chinese symbols from these rules, but he is able to compose correct answers to the questions in Chinese that someone outside the room sends into the room. This man could pass the Turing test without knowing what the conversation was about.
Searle wanted to show by this thought experiment that Strong Artificial Intelligence and the cognitive sciences cannot examine the inner states of mind, because these are accessible only via introspection. Margaret A. Boden, Professor of Philosophy and Psychology at the University of Sussex, has replied that we should regard the Chinese room as a whole: through emergence, components that individually do not know Chinese together form a system that “knows” Chinese.

In 1982 John Hopfield of the California Institute of Technology, together with David Tank, a researcher at AT&T, introduced a model of neural nets that came to be known as Hopfield networks and that again revived research in the neural network area. The Hopfield neural network is a simple artificial network that is able to store certain memories or patterns in a manner rather similar to the brain – the full pattern can be recovered if the network is presented with only partial information.
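
Pattern recovery from partial information can be sketched with a very small network. The sketch assumes units with values +1 and -1, Hebbian storage of a single pattern and one-at-a-time updates; the pattern itself and the function names are arbitrary choices for the example.

```python
def train_hopfield(patterns):
    """Hebbian storage: w[i][j] sums the products of units i and j over all patterns."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, sweeps=5):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    for _ in range(sweeps):
        previous = list(state)
        for i in range(len(state)):
            total = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if total >= 0 else -1
        if state == previous:
            break
    return state

stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
corrupted = [1, -1, 1, -1, 1, 1, -1, -1]  # two units flipped
restored = recall(weights, corrupted)
```

Starting from the corrupted input, the network settles back into the stored pattern. With many stored patterns recall becomes less reliable, which this tiny example does not show.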

In 1982 Teuvo Kohonen, of Helsinki University of Technology, developed self-organising feature maps that use a topological structure for the cluster units. These nets have been applied to speech recognition for Finnish and Japanese words in 1988, the solution of the “Travelling Salesman Problem” also in 1988, and musical composition in 1989.

In 1986 a team at Johns Hopkins University led by Terrence Sejnowski trained a VAX computer in the rules of phonetics, using a multilayer perceptron network called NETtalk. In just twelve hours of learning the machine was able to read and translate text patterns into sounds with a 95% success rate. The team noted that the machine sounded uncannily like a child learning to read aloud while it was training. NETtalk used the back-propagation learning rule.

Applications using nets like NETtalk can be found in virtually every field that uses neural nets for problems that involve mapping a given set of inputs to a specified set of target outputs. As is the case with most neural networks, the aim is to train the net to achieve a balance between the ability to respond correctly to the input patterns that are used for training, which is called memorisation, and the ability to give reasonably good responses to input that is similar, but not identical, to that used in training, which is called generalisation. Neural nets are now used as a tool for solving a wide variety of problems like speech recognition, optical character recognition (OCR), knowledge bases, bomb detectors, data visualisation, financial market predictions, medical diagnoses, and much, much more.

In 1997 IBM’s computer Deep Blue defeated the World Chess Champion, Garry Kasparov. One persistent “problem” is that as soon as an artificial intelligence technique truly succeeds, in the minds of many it ceases to be artificial intelligence, becoming something else entirely. For example, when Deep Blue defeated Kasparov, there were many who said Deep Blue wasn’t artificial intelligence, since after all it was just a brute force parallel minimax search. Deep Blue was an IBM RISC System/6000 Scalable Power Parallel System. It had 32 processors dedicated to calculation, each connected to 8 chess-specific processors. It calculated 200,000,000 moves per second.
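
The minimax search mentioned above can be sketched on a toy game tree. This is plain minimax only; it is not a description of Deep Blue’s actual search, which is not detailed here, and the tree with its leaf scores is hypothetical.

```python
def minimax(node, maximizing):
    """Return the best achievable score, assuming both sides play optimally.

    A node is either a number (the score of a final position) or a list of
    child nodes (the positions reachable in one move).
    """
    if isinstance(node, (int, float)):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# Hypothetical two-ply tree: three moves for us, three replies for the opponent.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = minimax(tree, maximizing=True)
best_move = max(range(len(tree)), key=lambda i: minimax(tree[i], maximizing=False))
```

The opponent is assumed to pick the reply that is worst for us, so the three moves are worth 3, 2 and 2 respectively and the first move is chosen. “Brute force” refers to the fact that the number of positions grows exponentially with the search depth.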

A recent development in artificial intelligence is a computerised toddler named HAL, after the self-aware machine created by Arthur C. Clarke in 2001: A Space Odyssey. According to its inventors, it may be the first form of artificial intelligence to understand human language. Researchers at Artificial Intelligence Enterprises (Ai), an Israeli company, claim that HAL has developed the linguistic abilities of a child of 15 months, making it the first to satisfy a standard test of a true mechanical mind. The HAL software, which is compact enough to run on a laptop computer, learns in a similar fashion to children. It is capable of speaking a few simple words, and may eventually develop the language capacity of a child aged five. Using “learning algorithms” and children’s stories, scientists at Ai claim to have created a computer that can teach itself to “speak”. The researchers, led by Chief Scientist Jason Hutchens, developed a small program that uses no input other than typed-in words in natural everyday language to teach the computer to understand and communicate via human speech. HAL is trained by having a single human “carer” type in children’s stories. HAL then responds to questions in simple sentences and the carer responds back as a parent would. The only motivation in the program is described as a “built-in desire for positive reinforcement from the carer”. After a training session the carer analyses HAL’s responses and gives feedback to the algorithm designers, who make new algorithms, which are then fed back into HAL, and the cycle continues. As the algorithms improve, this kind of training could take only days, instead of the years it takes human babies to learn a language. Ai’s approach is different from other speech programs, which use statistical and grammatical rules linked to giant vocabulary lists. 
This should allow HAL to respond to commands in more normal language, instead of the rigid syntax and specific command words necessary for existing voice command programs.

At the end of my presentation let me explain some artificial intelligence terms. One of them is soft computing. Soft computing, according to the definition by the inventor of fuzzy logic, Lotfi Zadeh, differs from conventional (hard) computing in that, unlike hard computing, it is tolerant of imprecision, uncertainty and partial truth. In effect, the role model for soft computing is the human mind. The guiding principle of soft computing is to exploit the tolerance for imprecision, uncertainty and partial truth to achieve tractability, robustness and low solution cost. At this juncture, the principal constituents of soft computing (SC) are fuzzy logic (FL), neural network theory (NN) and probabilistic reasoning (PR), with the latter subsuming belief networks, genetic algorithms, chaos theory and parts of learning theory. What is important to note is that SC is not a melange of FL, NN and PR. It is rather a partnership in which each of the partners contributes a distinct methodology for addressing problems in its domain. In this perspective, the principal contributions of FL, NN and PR are complementary rather than competitive.

An agent is a software system that engages and helps users. More generally, an agent is just something that perceives and acts. An example of an agent that you probably know is the paperclip in Microsoft Word, a simple and sometimes annoying agent that monitors the user’s input and offers assistance when a common task is recognised. The dynamic and complex nature of both information and applications requires software that does not merely respond to users’ requests but also intelligently anticipates, adapts, and helps the user. Such systems are called intelligent software agents. The study of agents is as old as the field of artificial intelligence. John McCarthy, one of artificial intelligence’s founders, conceived the idea of an intelligent software agent in the mid 1950s. The term “agent” was coined by Oliver Selfridge a few years later.
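
The idea of "something that perceives and acts" can be sketched as a simple reflex agent. This is only an illustration: the class AssistantAgent, its trigger phrases and its offers are invented here and do not describe how the Microsoft assistant actually works.

```python
from typing import Optional


class AssistantAgent:
    """Simple reflex agent: each percept (a typed line) maps to an optional action."""

    # Hypothetical rules: lowercase prefix of the typed line -> offer of help
    RULES = {
        "dear ": "It looks like you are writing a letter. Would you like help?",
        "1. ": "It looks like you are making a list. Would you like help?",
    }

    def perceive_and_act(self, typed_line: str) -> Optional[str]:
        """Return an offer of assistance if a common task is recognised, else None."""
        text = typed_line.lower()
        for trigger, offer in self.RULES.items():
            if text.startswith(trigger):
                return offer
        return None  # nothing recognised, so the agent stays quiet


agent = AssistantAgent()
for line in ["Dear Sir,", "The weather is fine today."]:
    action = agent.perceive_and_act(line)
    if action is not None:
        print(action)
```

An intelligent software agent in the sense described above would go further than these fixed rules, for example by adapting its rules to the individual user.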

During the 1960s and 1970s, Dr. John Holland from the University of Michigan, along with his students and colleagues, laid the foundation for an area of artificial intelligence research that is now called genetic algorithms. Genetic algorithms are not a separate discipline under artificial intelligence research, but are considered part of evolutionary computation. The field of evolutionary computation is mainly made up of evolution strategies, evolutionary programming and genetic algorithms. Research in this field is based on the idea that evolution could be used as an optimisation tool for engineering problems. The common thread in all evolutionary systems is the belief that it is possible to evolve a population of candidate solutions to a given problem, using operators inspired by natural genetic variation and natural selection. John Holland’s original intent was not to design algorithms that solve specific problems, as was the mindset of the day, but instead to study the process of adaptation. Genetic algorithms have their origins in the work of IBM researcher R. M. Friedberg (1958), who attempted to produce learning by mutating small FORTRAN programs. Since most mutations to the programs produced inoperative code, little progress was made. John Holland (1975) reinvigorated the field by using bit-string representations of agents such that any possible string represented a functioning agent. John Koza (1992) has championed more complex representations of agents coupled with mutation and mating techniques that pay careful attention to the syntax of the representation language. Current research appears in the annual Conference on Evolutionary Programming.
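
The following is a minimal sketch of a genetic algorithm on bit strings, in the spirit of the bit-string representation mentioned above. The fitness function (count the 1 bits), tournament selection, elitism and all parameter values are assumptions chosen for brevity, not Holland's original formulation.

```python
import random


def fitness(bits):
    """Toy objective: the more 1 bits, the fitter the candidate."""
    return sum(bits)


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def crossover(a, b, rng):
    """Single-point crossover: head of one parent joined to tail of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def select(population, rng):
    """Tournament of two: the fitter of two random candidates becomes a parent."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve(pop_size=30, length=20, generations=40, rate=0.02, seed=0):
    """Evolve a population; return the best candidate and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best[:]]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            child = crossover(select(population, rng), select(population, rng), rng)
            children.append(mutate(child, rate, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print(fitness(best), history)
```

Because every bit string is a valid candidate, mutation and crossover can never produce "inoperative" individuals, which is exactly the difficulty that hampered the early work on mutating programs.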

Genetic algorithms, also called machine evolution, are also regarded as a sub-field of so-called artificial life, or a-life. One of the most interesting projects of this kind is Tierra. Tierra was started in 1990 by the American biologist Thomas Ray. Originally run on a single, massively parallel Connection Machine, the program is now run on several computers linked by the Internet, giving it a much larger and more diverse environment in which to evolve. The researchers hope that Tierrans will evolve into commercially harvestable software. The Tierra C source code creates a virtual computer and its Darwinian operating system, whose architecture has been designed in such a way that the executable machine codes are evolvable. This means that the machine code can be mutated (by flipping bits at random) or recombined (by swapping segments of code between algorithms), and the resulting code remains functional enough of the time for natural (or presumably artificial) selection to be able to improve the code over time.
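
What "evolvable machine code" means can be illustrated with a toy virtual machine. The sketch below assumes a hypothetical four-instruction set encoded in two bits, so that every possible bit pattern decodes to a valid instruction. This is not Tierra's real instruction set, and the function names are invented for the example.

```python
import random

# Hypothetical 2-bit instruction set: all four bit patterns are valid opcodes,
# so no mutation can ever produce an undecodable program.
OPS = ["NOP", "INC", "DEC", "DOUBLE"]


def run(genome, start=0):
    """Execute a list of 2-bit codes on a single accumulator and return its value."""
    acc = start
    for code in genome:
        op = OPS[code & 0b11]
        if op == "INC":
            acc += 1
        elif op == "DEC":
            acc -= 1
        elif op == "DOUBLE":
            acc *= 2
    return acc


def flip_random_bit(genome, rng):
    """Mutation: flip one randomly chosen bit of one randomly chosen instruction."""
    mutant = genome[:]
    i = rng.randrange(len(mutant))
    mutant[i] ^= 1 << rng.randrange(2)
    return mutant


def swap_segments(a, b, start, end):
    """Recombination: exchange the segment [start, end) between two programs."""
    return (a[:start] + b[start:end] + a[end:],
            b[:start] + a[start:end] + b[end:])


rng = random.Random(1)
program = [1, 1, 3]                  # INC, INC, DOUBLE
mutant = flip_random_bit(program, rng)
print(run(program), run(mutant))     # both programs still execute
```

The mutant may compute something different, or something useless, but it always runs. That property is what lets selection act on the code over time.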


Source: http://multiedu.tul.cz/~dana.nejedlova/AI/AIhist.doc
