M68k


The Motorola 680x0 family of CISC microprocessors (also written 0x0, m68k, 68k, or 68K) presented a 32-bit instruction set and register model from the start, and was the primary competition for the Intel x86 family of chips.

The 68k family members


- Generation one
  - Motorola 68000 a hybrid 16/32 bit chip
  - Motorola 68EC000
  - Motorola 68HC000
  - Motorola 68008 a hybrid 8/16/32 bit chip
  - Motorola 68010
  - Motorola 68012
- Generation two (fully 32-bit)
  - Motorola 68020
  - Motorola 68EC020
  - Motorola 68030
  - Motorola 68EC030
- Generation three (fully 32-bit)
  - Motorola 68040
  - Motorola 68EC040
  - Motorola 68LC040
- Generation four (fully 32-bit)
  - Motorola 68060
- Others
  - Motorola CPU32 (aka Motorola 68330)
  - Motorola ColdFire
  - Motorola DragonBall

Main uses

The 68k line of processors has been used in a variety of systems, from Texas Instruments calculators up to critical control systems of the Space Shuttle. However, they are best known as the processors powering desktop computers such as the Macintosh, the Amiga, the Atari, and others. Today, these desktop systems are either end-of-line (in the case of the Amiga and the Atari) or use different processors (as is the case for the Macintosh). Since these desktops are now more than a decade old, the original manufacturers are either out of business or no longer provide an operating system for the hardware; however, the Debian and NetBSD operating systems still support m68k processors.

Architectural heritage

People who are familiar with the PDP-11 or VAX usually feel comfortable with the 68000. With the exception of the split of general purpose registers into specialized data and address registers, the 68000 architecture is in many ways a 32-bit PDP-11.

Where did the 68050 go? Was there no -070?

There is no 68050 because the design that was destined to be the 68050 was eventually released as a version of the 68040. There is also no revision of the 68060: Motorola was in the process of shifting away from the 68k and 88k processor lines into its new PowerPC business, so the 68070 was never developed. Had it been, it would have been a revised 68060.

The next 68k generation

The 4th generation 68060 shared most of the features of the Intel P5 architecture of x86. Had Motorola decided to stick with the 680x0 series, it is very likely that the next processor (68080) would have resembled Intel's P6 architecture.

Other variants

After the demise of the mainline 68k processors, the 68k family continued to be used to some extent in microcontroller and embedded microprocessor versions. These chips include the ones listed under "Others" above, i.e. the CPU32 (aka 68330), the ColdFire, and the DragonBall.

Competitors to the mainstream 68ks

The principal competitors in the microcomputer market for generation one were the x86 architecture P1 and P2 IA-16 chips (8088, 80286). For generation two, it was the P3 IA-32 chips (80386), and for generation three it was the P4 IA-32 chips (80486). Generation four did compete against the P5 IA-32 chips (Pentiums), but to a lesser extent, as much of the hitherto 68k marketplace was shifting over to the PowerPC, sounding the death knell for the 680x0 on the desktop.

Motorola

Motorola is a global communications company based in Schaumburg, Illinois, a Chicago suburb.

Divisions


- Mobile Devices (MD)
- Networks
- Connected Home Solutions (CHS)
- Government & Enterprise Mobility Solutions (GEMS)
- Corporate (CORP)

History

The company started as Galvin Manufacturing Corporation in 1928. The name Motorola was adopted in 1947, but the word had been used as a trademark since the 1930s. Founder Paul Galvin came up with the name Motorola when his company started manufacturing car radios. A number of early companies making phonographs, radios, and other audio equipment in the early 20th century used the suffix "-ola", the most famous being Victrola; there was also a film editing device called a Moviola. Many of Motorola's products have been radio-related, starting with a battery eliminator for radios, through the first walkie-talkie in the world, defense electronics, cellular infrastructure equipment, and mobile phone manufacturing. The company was also strong in semiconductor technology, including integrated circuits used in computers. Motorola has been the main supplier for the microprocessors used in Commodore Amiga, Apple Macintosh and Power Macintosh personal computers. The chip used in the latter computers, the PowerPC family, was developed in partnership with IBM and Apple (known as the AIM alliance). Motorola also has a diverse line of communication products, including satellite systems, digital cable boxes and modems. On October 6, 2003, Motorola announced that it would spin off its semiconductor product sector into a separate company called Freescale Semiconductor, Inc. The new company began trading on the New York Stock Exchange on July 16 of the following year. See also: List of Motorola products (including Freescale's semiconductors). The Six Sigma quality system was developed at Motorola, even though it became best known through its use by General Electric. It was created by engineer Bill Smith under the direction of Bob Galvin, the son of founder Paul Galvin, when he was running the company. Motorola University is one of many places that provide Six Sigma training.
Recently, a massive turnaround plan has been executed successfully by CEO Edward Zander, although many credit former CEO Chris Galvin with taking the first comprehensive steps. Due to recent layoffs and the spinoff of Freescale Semiconductor, the number of employees working for Motorola has gone from just over 150,000 to approximately 66,000. Motorola has recently been regaining market share in the cellular-phone business from Nokia, Samsung and others due to stylish new cellular phone designs like the Motorola RAZR V3. In addition, the company unveiled the first ever iTunes phone, the Motorola ROKR E1, in September 2005.

Directors and officers

Current members of the board of directors of Motorola are: Laurance Fuller, Judy Lewent, Walter Massey, Thomas Meredith, Nicholas Negroponte, Indra Nooyi, Samuel Scott, Ron Sommer, James Stengel, Douglas Warner, John A. White, Miles White, and Edward Zander.
- Chairman and CEO: Ed Zander (formerly of Sun Microsystems)
- CFO: David Devonshire (formerly of Ingersoll-Rand)
- CMO: Geoffrey Frost (formerly of Nike, Inc.)
- General Counsel: Peter Lawson (formerly of Baxter International)
- CSO: Richard N. Nottenburg (formerly of Vitesse Semiconductor)
- CTO: Padmasree Warrior

Diversity

Motorola received a 100% rating on the Corporate Equality Index released by the Human Rights Campaign starting in 2004, the third year of the report.

Products


- See List of Motorola products.

Competitors


- Nokia
- Ericsson
- Lucent
- Nortel Networks
- ZTE
- Huawei
- EADS
- M/A-COM
- OpenSky
- EDACS
- E.F. Johnson
- Dataradio
- Tait
- Texas Instruments
- Samsung Electronics
- BenQ
- LG Electronics
- Cyon
- Anycall
- SK Teletech
- Pantech Curitel
- KTF Ever
- VK Mobile

External links


- [http://www.motorola.com/ Motorola official website]
  - [http://www.motorola.com/content/0,,115-110,00.html A short history timeline]
- [http://webloga.com/mobile_phones,0,mobile_phones.html Motorola mobile phones]


CISC

A Complex Instruction Set Computer (CISC) is a microprocessor instruction set architecture (ISA) in which each instruction can execute several low-level operations, such as a load from memory, an arithmetic operation, and a memory store, all in a single instruction. The term was coined in contrast to Reduced Instruction Set Computer (RISC). Before the first RISC processors were designed, many computer architects tried to bridge the "semantic gap": to design instruction sets that support high-level programming languages by providing "high-level" instructions such as procedure call and return, loop instructions such as "decrement and branch if non-zero", and complex addressing modes that allow data structure and array accesses to be combined into single instructions. Additionally, the compact nature of a CISC ISA results in smaller program sizes and fewer accesses to main memory, which at the time (the 1960s) resulted in tremendous savings on the cost of a computer. While these designs achieved their aim of allowing high-level language constructs to be expressed in fewer instructions, it was observed that they did not always result in improved performance. For example, on one processor it was discovered that it was possible to improve performance by not using the procedure call instruction but using a sequence of simpler instructions instead. Furthermore, the more complex the instruction set, the greater the overhead of decoding any given instruction, both in execution time and silicon area. This is particularly true for processors which used microcode to decode the (macro)instructions. In other words, adding a large and complex instruction set to the processor could slow down even the execution of simple instructions. Implementing all these complex instructions also required a lot of work on the part of the chip designer, and a lot of transistors; this left less room on the processor to optimize performance in other ways.
Examples of CISC processors are the VAX, PDP-11, Motorola 68000 family and the Intel x86 CPUs. The term, like its antonym RISC, has become less meaningful with the continued evolution of both CISC and RISC designs and implementations. Modern "CISC" CPUs, such as recent x86 designs like the Pentium 4, whilst they usually support every instruction that their predecessors did, are designed to work most efficiently with a subset of instructions more resembling a typical "RISC" instruction set. Indeed, many CISC CPUs (such as modern x86 processors from both Intel and AMD) decode many x86 instructions into a series of smaller internal "micro-operations" that are then executed internally by the processor.
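The idea of one complex instruction standing in for several low-level operations, and of a processor internally splitting it into simpler "micro-operations", can be illustrated with a minimal Python sketch. The instruction name `ADD_MEM`, the micro-op tuples, and the temporaries `t0` and `t1` are all hypothetical and chosen for illustration; they do not reflect the internal encoding of any real processor.

```python
# Hypothetical illustration: the instruction name and micro-op format are
# assumptions for this sketch, not any real processor's internals.

def decode_to_micro_ops(instruction):
    """Split one memory-to-memory 'ADD_MEM dst, src' into simpler steps."""
    op, dst, src = instruction
    if op != "ADD_MEM":
        raise ValueError(f"unsupported instruction: {op}")
    return [
        ("LOAD", "t0", src),    # t0 <- memory[src]
        ("LOAD", "t1", dst),    # t1 <- memory[dst]
        ("ADD", "t1", "t0"),    # t1 <- t1 + t0
        ("STORE", dst, "t1"),   # memory[dst] <- t1
    ]

def run_micro_ops(micro_ops, memory):
    """Execute the micro-ops against a dict that stands in for memory."""
    temps = {}
    for op, a, b in micro_ops:
        if op == "LOAD":
            temps[a] = memory[b]
        elif op == "ADD":
            temps[a] = temps[a] + temps[b]
        elif op == "STORE":
            memory[a] = temps[b]
    return memory

memory = {0x10: 5, 0x20: 7}
micro_ops = decode_to_micro_ops(("ADD_MEM", 0x10, 0x20))
run_micro_ops(micro_ops, memory)
print(len(micro_ops), memory[0x10])
```

In this sketch a single "complex" instruction expands into two loads, an add, and a store, which is roughly the sequence a load/store (RISC-style) program would have to spell out explicitly.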

See also


- CPU
- RISC
- ZISC
- microprocessor
- computer
- CPU design
- computer architecture

Central processing unit

A central processing unit (CPU), or sometimes simply processor, is the component in a digital computer that interprets instructions and processes data contained in software. Microprocessors are CPUs that are manufactured on integrated circuits, often as a single-chip package. Since the mid-1970s, single-chip microprocessors have become the most common and prominent implementations of CPUs, and today the term is almost always applied to this form. The phrase "central processing unit" is, in general terms, a functional description of a certain class of logic machines that can execute complex computer programs. This broad definition can easily be applied to many early computers that existed long before the term "CPU" ever came into widespread usage. The term itself and its acronym have been in use at least since the early 1960s.

History

Prior to the advent of machines that resemble today's CPUs, computers such as ENIAC had to be physically rewired in order to perform different tasks. These machines are often referred to as "fixed program computers" since they had to be physically reconfigured in order to run a different program. Since the term "CPU" is generally defined as a software (program) executing device, the earliest devices that could rightly be called CPUs came with the advent of the stored program computer. The idea of a stored program computer was already present during the design of ENIAC, but was not used in that computer due to speed considerations. Before ENIAC was even completed, on June 30, 1945, mathematician John von Neumann published the paper entitled First Draft of a Report on the EDVAC, which outlined the design of a stored program computer that would eventually be completed in August 1949. EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC to run. Significantly, the programs written for EDVAC were stored in high speed computer memory, rather than being specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC, which was the large amount of time and effort it took to reconfigure the computer to perform a new task. With Von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the computer's memory. While Von Neumann is most often credited with the design of the stored program computer due to his design of EDVAC, others before him such as Konrad Zuse had suggested similar ideas. Additionally, the so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory.
The key difference between the Von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily Von Neumann in design, but elements of the Harvard architecture are commonly seen as well. Being digital devices, all CPUs deal with discrete states and therefore require some kind of switching elements to differentiate between and change these states. During the era of early electromechanical and electronic computers, electrical relays and vacuum tubes (thermionic valves) were commonly used as switching elements. Although these had distinct speed advantages over earlier, purely mechanical designs, they were unreliable for various reasons. For example, building direct current sequential logic circuits out of relays requires additional hardware to cope with the problem of contact bounce. While vacuum tubes don't suffer from contact bounce, they must heat up before becoming fully operational and eventually stop functioning due to the slow contamination of their cathodes that occurs when the tubes are in use. Usually, when a tube failed, the CPU would have to be diagnosed to locate the failing unit so it could be replaced. Therefore, early electronic (vacuum tube based) computers were generally faster, but less reliable than electromechanical (relay based) computers. Tube computers like EDVAC tended to average eight hours between failures, whereas relay computers like the (slower, but earlier) Harvard Mark I failed very rarely. In the end, tube based CPUs became dominant because the significant speed advantages afforded generally outweighed the reliability problems. Most of these early synchronous CPUs ran at low clock rates compared to modern microelectronic designs. Clock signal frequencies ranging from 100 kHz to 4 MHz were very common at this time, limited largely by the speed of the switching devices they were built with.

Discrete component transistor CPUs

The design and complexity of CPUs increased as various technologies facilitated building smaller and more reliable electronic devices. The first such improvement came with the advent of the transistor. Transistorized CPUs during the 1950s and 1960s no longer had to be built out of bulky, unreliable, and fragile switching elements like vacuum tubes and electrical relays. With this improvement, more complex and reliable CPUs were built onto one or several printed circuit boards containing discrete transistor components. In 1964, IBM introduced its System/360 computer architecture, which was used in a series of computers that could run the same programs with different speed and performance. This was significant at a time when most electronic computers were incompatible with one another, even those made by the same manufacturer. To facilitate this improvement, IBM utilized the concept of a microprogram, which still sees widespread usage in modern CPUs (often called "microcode"). The System/360 architecture was so popular that it dominated the mainframe computer market for the next few decades and left a legacy that is still continued by similar modern computers like the IBM zSeries. In the same year (1964), Digital Equipment Corporation (DEC) introduced another influential computer aimed at the scientific and research markets, the PDP-8. DEC would later introduce the extremely popular PDP-11 line that originally was built with discrete transistors but was eventually implemented with integrated circuits once these became practical. While discrete component transistor CPUs were in heavy usage, new high performance designs like SIMD (Single Instruction Multiple Data) vector processors began to appear. These early experimental designs later gave rise to the era of specialized supercomputers like those made by Cray Inc. Transistor based computers had several distinct advantages over their predecessors. 
Aside from facilitating increased reliability and lowered power consumption, transistors also allowed CPUs to operate at much higher speeds due to the short switching time of a transistor in comparison to a tube or relay. Thanks to both the increased reliability as well as the dramatically increased speed of the switching elements (which were almost exclusively transistors by this time), CPU clock rates in the tens of megahertz were obtained during this period.

Microprocessors

An innovation that has significantly affected the design and implementation of CPUs came in the 1970s with the microprocessor. Since the introduction of the first microprocessor (the Intel 4004) in 1971 and the first widely-used microprocessor (the Intel 8080) in 1974, this class of CPUs has almost completely overtaken all other implementations. This fact, combined with the advent of the personal computer, has led to the term "CPU" being applied almost exclusively to microprocessors in the past few decades. While the previous generation of CPUs were integrated as discrete components on one or more circuit boards, microprocessors are manufactured onto compact integrated circuits (ICs), often a single chip. The smaller transistor sizes mean faster switching time, largely due to decreased gate parasitic capacitance. This has allowed synchronous microprocessors to utilize clock rates ranging from tens of megahertz to several gigahertz. Additionally, as the ability to construct exceedingly small transistors on an IC has increased, the complexity and number of transistors in a single CPU has increased dramatically. This trend has been observed by many and is often described by Moore's law, which has proven to be a fairly accurate model of the growth of CPU (and other IC) complexity to date. While the complexity, size, construction, and general form of CPUs have changed drastically over the past sixty years, it is notable that the basic design and function have not changed much at all. Almost all common CPUs today can be very accurately described as Von Neumann stored program machines. As the aforementioned Moore's law continues to hold true, concerns about the limits of integrated circuit transistor technology have become much more prevalent. Extreme miniaturization of electronic gates is causing the effects of phenomena like electromigration and subthreshold leakage to become much more significant.
These newer concerns are among the many factors causing researchers to investigate new methods of computing such as the quantum computer as well as expand the usage of parallelism and other methods that extend the usefulness of the classical Von Neumann model.

CPU operation

The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. Herein we are discussing devices that conform to the common aforementioned Von Neumann architecture. The program is represented by a series of numbers that are kept in some kind of computer memory. There are four steps that nearly all Von Neumann CPUs use in their operation: fetch, decode, execute, and writeback. The first step, fetch, involves retrieving an instruction (which is a number or sequence of numbers) from program memory. The location in program memory is determined by a program counter (PC), which stores a number that identifies the current position in the program. In other words, the program counter keeps track of the CPU's place in the current program. Having been used to fetch an instruction, the PC is incremented by the length of the instruction word in terms of memory units. Often the instruction to be fetched must be retrieved from relatively slow memory, causing the CPU to stall while waiting for the instruction to be returned. This issue is largely addressed in modern processors by cache and superscalar architecture (see below). The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the decode step, the instruction is broken up into parts that have significance to other portions of the CPU. The way in which the numerical instruction value is interpreted is defined by the CPU's instruction set architecture (ISA). Often, one group of numbers of the instruction, called the opcode, indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an addition operation. The operands may contain a constant value in the instruction itself (called an immediate value), or a place to get a value: a register or a memory address.
In older designs the portions of the CPU responsible for instruction decoding were unchangeable hardware devices. However, in more abstract and complicated CPUs and ISAs, a microprogram is often used to assist in translating instructions into various configuration signals for the CPU. This microprogram is often rewritable and can be modified to change the way the CPU decodes instructions even after it has been manufactured. After the fetch and decode steps, the execute step is performed. During this step, various portions of the CPU are "connected" (by a switching device such as a multiplexer) so they can perform the desired operation. If, for instance, an addition operation was requested, an arithmetic logic unit (ALU) will be connected to a set of inputs and a set of outputs. The inputs provide the numbers to be added, and the outputs will contain the final sum. If the addition operation produces a result too large for the CPU to handle, an arithmetic overflow flag in a flags register may also be set (see the discussion of integer precision below). Various structures can be used for providing inputs and outputs. Often, relatively fast and small memory areas called CPU registers are used when a result is temporary or will be needed again shortly. Various forms of computer memory (for example, DRAM) are also often used to provide inputs and outputs for CPU operations. These types of memory are much slower compared to registers, both due to physical limitations and because they require more steps to access than the internal registers. However, compared to the registers, this external memory is usually less expensive and can store much more data, and is thus still necessary for computer operation. The final step, writeback, simply "writes back" the results of the execute step to some form of memory. Very often the results are written to some internal CPU register for quick access by subsequent instructions.
Some types of instructions manipulate the program counter rather than produce a single clearly defined result. These are generally called "jumps" and facilitate behavior like loops, conditional program execution (through the use of a conditional jump), and functions in programs. Many instructions will also change the state of digits in a "flags" register. These flags can be used to influence how a program behaves, since they often indicate the outcome of various operations. For example, one type of "compare" instruction considers two values and sets a number in the flags register according to which one is greater. This flag could then be used by a later instruction to determine program flow. After the execution of the instruction and writeback of the resulting data, the entire process repeats, with the next instruction cycle normally fetching the next-in-sequence instruction due to the incremented value in the program counter. If the completed instruction was a jump, the program counter will be modified to contain the address of the instruction that was jumped to, and program execution continues normally. In more complex CPUs than the one described here, multiple instructions can be fetched, decoded, and executed simultaneously. This section describes what is generally referred to as the "Classic RISC pipeline," which in fact is quite common among the simple CPUs used in many electronic devices (often called microcontrollers).
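The four steps described in this section can be sketched as a toy interpreter in Python. Everything about the machine below is an assumption made for illustration: the 16-bit instruction word, the 4-bit opcode field, the opcode numbers, and the 8-bit registers are invented and do not correspond to any real ISA.

```python
# Toy machine: all encodings here are hypothetical and chosen for illustration.
# Word layout (16 bits): [opcode:4][a:4][b:4][c:4]; the low 8 bits double as
# an immediate value or a jump target.
HALT, LOADI, ADD, SUB, JNZ = 0, 1, 2, 3, 4

def run(program, num_registers=4, max_steps=1000):
    registers = [0] * num_registers
    pc = 0  # program counter
    for _ in range(max_steps):
        # Fetch: read the instruction word and advance the program counter.
        word = program[pc]
        pc += 1
        # Decode: split the number into the opcode and operand fields.
        opcode = (word >> 12) & 0xF
        a = (word >> 8) & 0xF
        b = (word >> 4) & 0xF
        c = word & 0xF
        imm = word & 0xFF
        # Execute: perform the operation selected by the opcode.
        if opcode == HALT:
            return registers
        elif opcode == LOADI:
            result = imm
        elif opcode == ADD:
            result = (registers[b] + registers[c]) & 0xFF
        elif opcode == SUB:
            result = (registers[b] - registers[c]) & 0xFF
        elif opcode == JNZ:
            # A jump modifies the program counter instead of producing a result.
            if registers[a] != 0:
                pc = imm
            continue
        else:
            raise ValueError(f"unknown opcode {opcode}")
        # Writeback: store the result in the destination register.
        registers[a] = result
    raise RuntimeError("step limit reached")

# Sums 5 + 4 + 3 + 2 + 1 into r0 using a loop built from a conditional jump.
PROGRAM = [
    0x1000,  # LOADI r0, 0
    0x1105,  # LOADI r1, 5
    0x1201,  # LOADI r2, 1
    0x2001,  # ADD   r0, r0, r1
    0x3112,  # SUB   r1, r1, r2
    0x4103,  # JNZ   r1, 3
    0x0000,  # HALT
]
print(run(PROGRAM))
```

The program itself is just a list of numbers, as the text describes, and the jump at address 5 shows how changing the program counter produces a loop.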

Design and implementation

Integer precision

The way a CPU represents numbers is a design choice that affects the most basic assumptions about how the device functions. Some early digital computers used the common decimal (base ten) numeral system to internally represent numbers. Other computers have used more exotic numeral systems like ternary (base three). By far, the most common numeral system used in CPUs is the binary (base two) system. Nearly all modern CPUs represent numbers in binary form, each digit being interpreted from some physical quantity such as "high" and "low" voltage. Related to number representation is the size and precision of numbers that a CPU can represent. In the case of a binary CPU, a 'bit' refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "word size," "bit width," "data path width," or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2^8 or 256 discrete numbers. Integer precision can also affect the number of locations in memory the CPU can "address" (locate). For example, if a binary CPU uses 32 bits to represent a memory address, and each memory address represents one octet (8 bits), the maximum quantity of memory that CPU can address is 2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many modern designs use much more complex addressing methods like paging in order to locate more memory with the same integer precision. Higher levels of integer precision require more structures to deal with the additional digits, and therefore more complexity, size, power usage, and generally expense.
It is not at all uncommon, therefore, to see 4 or 8 bit microcontrollers used in modern applications, even though CPUs with much higher precision (such as 16, 32, 64, even 128 bit) are available. The simpler microcontrollers are usually cheaper, use less power, and therefore dissipate less heat, all of which can be major design considerations for electronic devices. However, in higher-end applications, the benefits afforded by the extra precision (most often the additional address space) are more significant and often affect design choices. To gain some of the advantages afforded by both lower and higher bit precisions, many CPUs are designed with different bit widths for different portions of the device. For example, the IBM System/370 used a CPU that was primarily 32-bit, but it used 128-bit precision inside its floating point units to facilitate greater accuracy and range in floating point numbers. Many later CPU designs use similar mixed bit width, especially when the processor is meant for general purpose usage where a reasonable balance of integer and floating point capability is required.
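The arithmetic behind the 8-bit and 32-bit examples in this section can be checked with a few lines of Python. The function names are invented for this sketch, and the overflow check models only the simple unsigned case (a carry out of the top bit), which is an assumption; real CPUs usually distinguish several flags.

```python
def representable_values(bits):
    """Number of distinct values an integer of the given bit width can hold."""
    return 2 ** bits

def addressable_gib(address_bits, octets_per_address=1):
    """Directly addressable memory in GiB for a flat address space."""
    return (2 ** address_bits) * octets_per_address / 2 ** 30

def add_with_overflow(a, b, bits=8):
    """Unsigned add: returns (wrapped result, flag set when the sum does not fit)."""
    total = a + b
    mask = (1 << bits) - 1
    return total & mask, total > mask

print(representable_values(8))      # 8-bit CPU
print(addressable_gib(32))          # 32-bit byte addresses
print(add_with_overflow(200, 100))  # sum 300 does not fit in 8 bits
```

The last call shows why a flags register is useful: the 8-bit result wraps around to 44, and only the flag tells later instructions that the true sum was too large.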

Clock rate

Most CPUs, and indeed most sequential logic devices, are synchronous in nature. That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a "clock signal," usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals can move in various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal. This period must be longer than the amount of time it takes for a signal to move, or propagate, in the worst-case scenario. In setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU significantly, both from a design perspective and a transistor count perspective. However, it also carries the disadvantage that the entire CPU must wait on its slowest elements, even though some portions of it are much faster. This limitation has largely been compensated for by various methods of increasing CPU parallelism (see below). Architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs, though. For example, a clock signal is subject to the delays of any other electrical signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to require multiple identical clock signals rather than a single signal that would be significantly delayed if it drove all the switching elements. Another major issue as clock rates increase dramatically is the amount of heat that is dissipated by the CPU. The constantly changing clock causes many components to switch, regardless of whether or not they are being used at that time.
In general, a component that is switching uses more energy than an element in a static state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require more effective cooling solutions. One method of dealing with the switching of unneeded components is a technique called clock gating, which involves turning off the clock signal to unneeded components (effectively disabling them). However, this is often regarded as difficult to implement and therefore does not see common usage outside of very low-power designs. Another method of addressing some of the problems with a global clock signal is the removal of the clock signal altogether. While removing the global clock signal makes the design process considerably more complex in many ways, "clockless" (or asynchronous) designs carry marked advantages in power consumption and heat dissipation in comparison with similar synchronous designs. While somewhat uncommon, entire CPUs have been built without utilizing a global clock signal. Two notable examples of this are the ARM compliant AMULET and the MIPS R3000 compatible MiniMIPS. Rather than totally removing the clock signal, some CPU designs allow certain portions of the device to be asynchronous. For example, asynchronous ALUs can be used in conjunction with superscalar pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether totally asynchronous designs can perform at a comparable or better level than their synchronous counterparts, it is evident that they do at least excel in simpler math operations. This, combined with their excellent power consumption and heat dissipation properties, makes them very suitable for embedded computers.
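The rule that the clock period must exceed the worst-case propagation delay can be made concrete with a short Python sketch. The path names, the delay values, and the 25% safety margin below are all invented for illustration; real timing analysis is considerably more involved.

```python
# Hypothetical per-path propagation delays in nanoseconds (invented values).
PATH_DELAYS_NS = {"decoder": 5.0, "register_file": 3.0, "alu": 8.0}

def clock_from_delays(path_delays_ns, margin=1.25):
    """Return (period_ns, frequency_hz) based on the slowest path plus a margin."""
    worst_case_ns = max(path_delays_ns.values())
    period_ns = worst_case_ns * margin  # period must exceed the worst-case delay
    frequency_hz = 1e9 / period_ns
    return period_ns, frequency_hz

period_ns, frequency_hz = clock_from_delays(PATH_DELAYS_NS)
print(f"period = {period_ns} ns, frequency = {frequency_hz / 1e6} MHz")
```

With these assumed numbers the slowest path (8 ns) sets a 10 ns period, or 100 MHz, even though the other two paths would be ready much sooner. This is the "wait on its slowest elements" cost of a globally synchronous design.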

Parallelism

The description of the basic operation of a CPU offered in the previous section describes the very simplest form that a CPU can take. This type of CPU, usually referred to as scalar, operates on and executes one instruction on one or two pieces of data at a time. This process gives rise to an inherent inefficiency in purely scalar CPUs. Since only one instruction is executed at a time, the entire CPU must wait for that instruction to complete before proceeding to the next instruction. In other words, any given portion of a strictly scalar CPU is idle most of the time since it must wait for the current instruction to finish before it receives new input to process. Digital designers quickly realized this inefficiency and sought to reduce the amount of time that scalar CPU circuitry remained idle. The result has been a variety of design methodologies that cause the CPU to behave less linearly (that is, step-by-step) and more in parallel. When referring to parallelism in CPUs, two terms are generally used to differentiate between the main techniques used to increase parallelism. Instruction level parallelism (ILP) seeks to increase the rate at which instructions are executed within a CPU, and Thread level parallelism (TLP) aims to increase the number of threads (effectively individual programs) that a CPU can execute simultaneously. The two methodologies differ both in the ways in which they are implemented and in the relative effectiveness they afford in increasing the CPU's performance for an application.

ILP: Instruction pipelining and superscalar architecture

One of the simplest methods used to accomplish increased parallelism is to begin the first steps of instruction fetching and decoding before the previous instruction has finished execution. This is the simplest form of a technique known as instruction pipelining, and is utilized in almost all modern general-purpose CPUs. Pipelining does, however, introduce the possibility for a situation where the result of the previous operation is needed to complete the next operation, a condition often termed data dependency conflict. To cope with this, additional care must be taken to check for these sorts of conditions and delay a portion of the instruction pipeline if this occurs. Naturally, accomplishing this requires additional circuitry, so pipelined processors are more complex than strictly scalar ones (though not very significantly so). Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be superscalar include a much longer instruction pipeline and multiple identical execution units. In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether or not the instructions can be executed in parallel (simultaneously). If they can, the CPU dispatches them to any available execution units, resulting in the ability for several instructions to be executed simultaneously. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle. Therefore, most of the difficulty in the design of a superscalar CPU architecture lies in creating an effective dispatcher. The dispatcher needs to be able to quickly and correctly determine whether instructions can be executed in parallel, as well as dispatch them in such a way as to keep as many execution units busy as possible. 
This requires that the instruction pipeline is filled as often as possible and gives rise to the need in superscalar architectures for significant amounts of CPU cache. It also makes hazard-avoiding techniques like branch prediction, speculative execution, and out-of-order execution crucial to maintaining high levels of performance. By attempting to predict which branch (or path) a conditional instruction will take, the CPU can minimize the number of times that the entire pipeline must wait until a conditional instruction is completed. Speculative execution often provides modest performance increases by executing portions of code that may or may not be needed after a conditional operation completes. Out-of-order execution queues somewhat rearrange the order in which instructions are executed to reduce delays due to data dependencies. Both simple pipelining and superscalar design increase a CPU's ILP by allowing a single processor to complete execution of instructions at rates surpassing one instruction per cycle (IPC). Most modern CPU designs are at least somewhat superscalar, and nearly all general purpose CPUs designed in the last decade are superscalar. In later years, some of the emphasis in designing high-ILP computers has moved out of the hardware of the CPU and into the CPU's software interface, or ISA. The strategy of the Very long instruction word (VLIW) causes some ILP to become implied directly by the software, reducing the amount of work the CPU must perform to boost ILP, and thereby reducing the design's complexity.
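
The data dependency check described above can be sketched in a few lines. This is a deliberately simplified model rather than a description of any real pipeline or dispatcher: the `Instr` record, the register names, and the rule of comparing only adjacent instructions are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """A toy instruction: an operation, one destination, and source registers."""
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def has_data_dependency(earlier: Instr, later: Instr) -> bool:
    """True if `later` reads a register that `earlier` writes."""
    return earlier.dest is not None and earlier.dest in later.srcs


def count_dependent_pairs(program) -> int:
    """Count adjacent pairs where the second instruction must wait for the first."""
    return sum(has_data_dependency(a, b) for a, b in zip(program, program[1:]))


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),  # reads r1, so it depends on the add
    Instr("mul", "r6", ("r7", "r8")),  # independent of the sub
]

print(count_dependent_pairs(program))
```

In this toy model, a pipeline would have to delay the `sub` until the `add` has produced `r1`, while a superscalar dispatcher could issue the `sub` and the `mul` together because neither reads the other's result.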

TLP: Simultaneous thread execution

Another strategy commonly used to increase the parallelism of CPUs is to include the ability to run multiple threads (programs) at the same time. In general, high-TLP CPUs have been in use much longer than high-ILP ones. Many of the designs pioneered by Cray during the late 1970s and 1980s concentrated on TLP as their primary method of enabling enormous (for the time) computing capability. In fact, TLP in the form of multiple thread execution improvements has been in use since as early as the 1950s. In the context of single processor design, the two main methodologies used to accomplish TLP are Chip-level multiprocessing (CMP) and Simultaneous multithreading (SMT). On a higher level, it is very common to build computers with multiple totally independent CPUs in arrangements like Symmetric multiprocessing (SMP) and Non-uniform memory access (NUMA). While being very different means, all of these techniques accomplish the same goal; increasing the number of threads that the CPU(s) can run in parallel. The CMP and SMP methods of parallelism are similar to one another and the most straightforward. These involve little more conceptually than the utilization of two or more complete and independent CPUs. In the case of CMP, multiple processor "cores" are included in the same package, sometimes on the very same integrated circuit. SMP, on the other hand, includes multiple independent packages. NUMA is somewhat similar to SMP, but is considered a much more scalable method, successfully allowing many more CPUs to be used in one computer. SMT differs somewhat from other TLP improvements in that it attempts to duplicate as few portions of the CPU as possible. While considered a TLP strategy, its implementation actually more resembles superscalar design, and indeed is often used in superscalar microprocessors. 
Rather than duplicating the entire CPU, SMT designs only duplicate parts needed for instruction fetching, decoding, and dispatch, as well as things like general purpose registers. This allows a SMT CPU to keep its execution units busy more often by providing them instructions from two different software threads. Again, this is very similar to the ILP superscalar method, but simultaneously executes instructions from multiple threads rather than executing multiple instructions from the same thread concurrently.

Vector processors and SIMD

A less common but increasingly important paradigm of CPUs (and indeed, computing in general) deals with vectors. The processors discussed earlier are all referred to as some type of scalar device. As the name implies, vector processors deal with multiple pieces of data in the context of one instruction. This contrasts with scalar processors, which deal with one piece of data for every instruction. The two schemes of dealing with data are generally referred to as SISD (Single instruction, single data) and SIMD (Single instruction, multiple data), respectively. The great utility in creating CPUs that deal with vectors of data lies in optimizing tasks that tend to require the same operation (for example, a sum or a dot product) to be performed on a large set of data. Some classic examples of these types of tasks are multimedia applications (images, video, and sound), as well as many types of scientific and engineering tasks. Whereas a scalar CPU must complete the entire process of fetching, decoding, and executing each instruction and value in a set of data, a vector CPU can perform a single operation on a comparatively large set of data with one instruction. Of course, this is only possible when the application tends to require many steps which apply one operation to a large set of data. Most early vector CPUs, such as the Cray-1, were associated almost exclusively with scientific research and cryptography applications. However, as multimedia has largely shifted to digital media, the need for some form of SIMD in general purpose CPUs has become significant. Shortly after floating point execution units started to become commonplace to include in general purpose processors, specifications for and implementations of SIMD execution units also began to appear for general purpose CPUs. Some of these early SIMD specifications like Intel's MMX were integer-only. 
This proved to be a significant impediment for some software developers, since many of the applications that benefit from SIMD primarily deal with floating point numbers. Progressively these early designs were refined and remade into some of the common modern SIMD specifications, which are usually associated with one ISA. Some notable modern examples are Intel's SSE and its successors, SSE2 and SSE3, the PowerPC-related AltiVec (also known as VMX), and MIPS MDMX.
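
The idea of one instruction operating on several integers at once can be emulated in software. The sketch below packs four 16-bit lanes into one 64-bit word and adds them lane by lane; the lane layout and function names are assumptions chosen for illustration, and the code does not reproduce the actual MMX instruction set or any real SIMD API.

```python
LANES = 4          # number of values packed into one word (assumed)
LANE_BITS = 16     # width of each value in bits (assumed)
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into a single wide integer, lane 0 lowest."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a packed word back into its individual lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Add two packed words lane by lane; each lane wraps around independently."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 4])
b = pack([10, 20, 30, 40])
print(unpack(packed_add(a, b)))
```

A hardware SIMD unit performs the equivalent of `packed_add` as a single instruction, whereas a scalar CPU would fetch, decode, and execute one add per element.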

See also


- Arithmetic logic unit
- CISC and RISC
- Computer bus
- Computer engineering
- CPU cache
- CPU cooling
- CPU core voltage
- CPU design
- CPU power dissipation
- Floating point unit
- Instruction pipeline
- Instruction set
- Microprocessor
- Microprogram
- Notable CPU architectures
- SIMD
- Superscalar
- Vector processor
- Wait state

Notes

# Since the program counter counts memory addresses and not instructions, it is incremented by the number of memory units that the instruction word contains. In the case of simple fixed-length instruction word ISAs, this is always the same number. For example, a fixed-length 32-bit instruction word ISA that uses 8-bit memory words would always increment the PC by 4 (except in the case of jumps). ISAs that use variable length instruction words, such as x86, increment the PC by the number of memory words corresponding to the last instruction's length. Also, note that in more complex CPUs, incrementing the PC does not necessarily occur at the end of instruction execution. This is especially the case in heavily pipelined and superscalar architectures (see the relevant sections below).
# Because the instruction set architecture of a CPU is fundamental to its interface and usage, it is often used as a classification of the "type" of CPU. For example, a "PowerPC CPU" uses some variant of the PowerPC ISA. Some CPUs, like the Intel Itanium, can actually interpret instructions for more than one ISA; however this is often accomplished by software means rather than by designing the hardware to directly support both interfaces. (See emulator)
# Some early computers like the Harvard Mark I did not support any kind of "jump" instruction, effectively limiting the complexity of the programs they could run. It is largely for this reason that these computers are often not considered to contain a CPU proper, despite their close similarity as stored program computers.
# This description is, in fact, a simplified view even of the Classic RISC pipeline. It largely ignores the important role of CPU cache, and therefore the access stage of the pipeline. See the respective articles for more details.
# In fact, all synchronous CPUs use a combination of sequential logic and combinatorial logic. (See boolean logic)
# One notable late CPU design that uses clock gating is that of the IBM PowerPC-based Xbox 360. It utilizes extensive clock gating in order to reduce the power requirements of the aforementioned videogame console it is used in.
# It should be noted that neither ILP nor TLP is inherently superior over the other; they are simply different means by which to increase CPU parallelism. As such, they both have advantages and disadvantages, which are often determined by the type of software that the processor is intended to run. High-TLP CPUs are often used in applications that lend themselves well to being split up into numerous smaller applications, so-called "embarrassingly parallel problems." Frequently, a computational problem that can be solved quickly with high TLP design strategies like SMP takes significantly more time on high ILP devices like superscalar CPUs, and vice versa.
# Best-case scenario (or peak) IPC rates in very superscalar architectures are difficult to maintain since it is impossible to keep the instruction pipeline filled all the time. Therefore, in highly superscalar CPUs, average sustained IPC is often discussed rather than peak IPC.
# While TLP methods have generally been in use longer than ILP methods, Chip-level multiprocessing is more or less only seen in later IC-based microprocessors. This is largely because the term itself is inapplicable to earlier discrete component devices and has only come into use recently.
# For several years during the late 1990s and early 2000s, the focus in designing high performance general purpose CPUs was largely on highly superscalar IPC designs, such as the Intel Pentium 4. However, this trend seems to be reversing somewhat now as major general-purpose CPU designers switch back to less deeply pipelined high-TLP designs. This is evidenced by the proliferation of dual and multi core CMP designs and notably, Intel's newer designs resembling its less superscalar P6 architecture. Late designs in several processor families exhibit CMP, including the x86-64 Opteron and Athlon 64 X2, the SPARC UltraSPARC T1, IBM POWER4 and POWER5, as well as several video game console CPUs like the Xbox 360's triple-core PowerPC design.
# Although SSE/SSE2/SSE3 have superseded MMX in Intel's general purpose CPUs, later IA-32 designs still support MMX. This is usually accomplished by providing most of the MMX functionality with the same hardware that supports the much more expansive SSE instruction sets.

External links

Microprocessor designers


- [http://www.amd.com/ Advanced Micro Devices] - Advanced Micro Devices, a maker of primarily x86-compatible desktop oriented CPUs.
- [http://www.arm.com/ ARM Ltd] - ARM Ltd, one of the few CPU designers that profits solely by licensing their designs rather than manufacturing them. ARM architecture microprocessors are among the most popular in the world for embedded applications.
- [http://www.freescale.com/ Freescale Semiconductor] (formerly of Motorola) - Freescale Semiconductor, designer of several embedded and SoC PowerPC based processors.
- [http://www-03.ibm.com/chips/ IBM Microelectronics] - Microelectronics division of IBM, which is responsible for many POWER and PowerPC based designs, including many of the CPUs that power late video game consoles.
- [http://www.intel.com/ Intel Corp] - Intel, a maker of several notable CPU lines, including IA-32, IA-64, and XScale. Also a producer of various peripheral chips for use with their CPUs.
- [http://www.mips.com/ MIPS Technologies] - MIPS Technologies, developers of the MIPS architecture, a pioneer in RISC designs.
- [http://www.ti.com/home_p_allsc Texas Instruments] - Texas Instruments semiconductor division. Designs and manufactures several types of low power microcontrollers among their many other semiconductor products.

Further reading


- [http://www.gamezero.com/team-0/articles/math_magic/micro/index.html Processor Design: An Introduction] - Detailed introduction to microprocessor design. Somewhat incomplete and outdated, but still worthwhile.
- [http://computer.howstuffworks.com/microprocessor.htm How Microprocessors Work]
- [http://arstechnica.com/articles/paedia/cpu/pipelining-2.ars/2 Pipelining: An Overview] - Good introduction to and overview of CPU pipelining techniques by the staff of Ars Technica
- [http://arstechnica.com/articles/paedia/cpu/simd.ars/ SIMD Architectures] - Introduction to and explanation of SIMD, especially how it relates to desktop computers. Also by Ars Technica.

Intel

The following article is about the multinational corporation; intel is also an abbreviation for intelligence, used in reference to military intelligence and espionage. Intel Corporation, founded in 1968, is a U.S.-based multinational corporation that is best known for designing and manufacturing microprocessors and specialized integrated circuits. Intel also makes network cards, motherboard chipsets, components, and other devices. Intel has advanced research projects in all aspects of semiconductor manufacturing, including MEMS.

Overview

Intel was founded in 1968 by Gordon E. Moore (a chemist and physicist) and Robert Noyce (a physicist and co-inventor of the integrated circuit) when they left Fairchild Semiconductor. It is noteworthy that Intel competitor AMD was also founded by former Fairchild employees, in 1969. Intel's employee number four was Andy Grove (a chemical engineer), who ran the company through much of the 1980s and the high-growth 1990s. It is Grove who is now remembered as the company's key leader. By the end of the 1990s, Intel was one of the largest and most successful businesses in the world, though fierce competition within the semiconductor industry has since diminished its position somewhat.

SRAMS and the microprocessor

The company's first products were random-access memory integrated circuits, and Intel grew to be a leader in the fiercely competitive DRAM, SRAM, and ROM markets throughout the 1970s. Concurrently, Intel engineers Marcian Hoff, Federico Faggin, Stanley Mazor and Masatoshi Shima invented the first microprocessor. Originally developed for the Japanese company Busicom to replace a number of ASICs in a calculator already produced by Busicom, the Intel 4004 was introduced to the mass market on November 15, 1971, though the microprocessor did not become the core of Intel's business until the mid-1980s. (Note: Intel is usually given credit with Texas Instruments for the almost-simultaneous invention of the microprocessor.)

From DRAM to microprocessors

In 1983 at the dawn of the personal computer era, Intel's profits came under increased pressure from Japanese memory-chip manufacturers, and then-President Andy Grove drove the company into a focus on microprocessors. Grove described this transition in the book Only the Paranoid Survive. A key element of his plan was the notion, then considered radical, of becoming the single source for successors to the popular 8086 microprocessor. Until then, manufacture of complex integrated circuits was not reliable enough for customers to depend on a single supplier, but Grove began producing processors in three geographically distinct factories, and ceased licensing the chip designs to competitors such as Zilog and AMD. When the PC industry exploded in the late 1980s and 1990s, Intel was one of the primary beneficiaries.

The rise of PC architecture

During the 1990s, Intel's Intel Architecture Labs (IAL) was responsible for many of the hardware innovations of the personal computer, including the PCI Bus, the PCI Express (PCIe) bus, the Universal Serial Bus (USB), and the now-dominant architecture for multiprocessor servers. IAL's software efforts met with a more mixed fate; its video and graphics software was important in the development of software digital video, but later its efforts were largely overshadowed by competition from Microsoft. The competition between Intel and Microsoft was revealed in testimony at the Microsoft antitrust trial.

Partnership with Apple

On June 6 2005, Apple Computer CEO Steve Jobs announced in his keynote address at WWDC that Apple would be transitioning from its long-favored PowerPC Architecture to Intel CPUs. Reasons stated for the change were vague, but included thermal issues, as recent G5-class PowerPC chips are well-known for running hot. Also, it was implied that the future PowerPC roadmap was unable to satisfy Apple's needs in terms of computing power. In particular, the large power requirement of the G5 chips was seen as a major stumbling block, preventing the placement of such a chip in one of Apple's laptop computers, the PowerBook and iBook. The switchover to Intel will begin in mid-2006, reportedly appearing first in Apple's low-end machines and portables.

Competition and antitrust

Intel's dominance in the x86 microprocessor market led to numerous charges of antitrust violations over the years, including FTC investigations in both the late 1980s and in 1999, and civil actions such as the 1997 suit by Digital Equipment Corporation (DEC) and a patent suit by Intergraph. Intel's market dominance (at one time it controlled over 85% of the market for 32-bit PC microprocessors) combined with Intel's own hardball legal tactics (such as its infamous 338 patent suit versus PC manufacturers) made it an attractive target for litigation, but few of the lawsuits ever amounted to anything. Currently, the only major competitor to Intel on the x86 processor market is Advanced Micro Devices (AMD), with which Intel has had full cross-licensing agreements since 1976: each partner can use the other's patented technological innovations without charge. Some smaller competitors such as Transmeta produce low-power processors for portable equipment. In June 2005, AMD sued Intel in two jurisdictions for anticompetitive practices. The Japanese Fair Trade Commission found in favor of AMD; the other case will be heard by a court in Delaware. The case in Japan led to "dawn raids" by the European Commission on some European Intel offices in July 2005. Intel filed its response[http://www.intel.com/pressroom/archive/releases/20050901corp.htm] in September to AMD's lawsuit and refuted AMD's claims, stating that its business practices are fair and lawful. 
In its rebuttal, Intel laid out the skeleton of its legal defense, which included a deconstruction of AMD's offensive strategy and levied the charge that AMD's long-struggling market position is largely a result of bad business decisions and management incompetence, including underinvestment in essential manufacturing capacity and overreliance on outsourcing chip foundries.[http://www.forbes.com/technology/2005/09/02/intel-amd-antitrust-cz_dw_0902intel.html?partner=yahootix] Legal experts predict the lawsuit will most likely drag out for a number of years, since Intel's response indicates they are not likely to try and settle with AMD.

Leadership

Robert Noyce was Intel's CEO at its founding in 1968, followed by co-founder Gordon Moore in 1975. Andy Grove became the company's President in 1979, to which he added the CEO title in 1987 when Moore became Chairman. In 1997 Grove succeeded Moore as Chairman, and Craig Barrett, already company president, took over as CEO. On May 18, 2005, Barrett handed the reins of the company over to Paul Otellini, who previously was the company president and was responsible for Intel's design win in the original IBM PC. The board of directors elected Otellini, and Barrett replaced Grove as chairman of the board. Grove stepped down as Chairman, but will be retained as a special advisor.

Corporate governance

Current members of the board of directors of Intel are: Craig Barrett, Charlene Barshefsky, John Browne, James Guzy, Reed Hundt, James Plummer, David Pottruck, Jane Shaw, John Thornton, and David Yoffie.

Origin of the name

At its founding, Gordon Moore and Robert Noyce wanted to name their new company "Moore Noyce". But the name did not sound good in electronics—noise being associated with bad interference. They then used the name NM Electronics for almost a year, before deciding to call their company INTegrated ELectronics or "Intel" for short. However, Intel was already trademarked by a hotel chain, so they had to buy the rights for that name at the beginning.

Financial information

Its market capitalization is about $154 billion (March 2005).

Stock exchanges


- Intel is publicly traded at NASDAQ with the symbol INTC.

Indices


- Dow Industrials
- S&P 500
- Nasdaq 100
- SOX (PHLX Semiconductor Sector)
- GSTI Software Index

Diversity

Intel received a 100% rating on the first Corporate Equality Index released by the Human Rights Campaign in 2002. It has maintained this rating in 2003 and 2004. In addition, the company was named one of the 100 Best Companies for Working Mothers in 2005 by Working Mother magazine. However, Intel's working practices still face criticism, most notably from Ken Hamidi, a former employee who has been subject to multiple unsuccessful lawsuits from Intel. [http://www.faceintel.com/]

Controversial issues

Antitrust claims

In June 2005, AMD, Intel's chief rival in the x86 microprocessor market, filed an antitrust claim against Intel and its Japanese subsidiary in a Delaware court. Amongst other accusations, AMD alleged that Intel was unlawfully maintaining its monopoly through unfair business practices, such as drastically lower pricing for customers on the condition that Intel microprocessors were used exclusively in their systems. Whilst proving that Intel holds a monopoly is simple (the company is reckoned to have an 80%–90% share of the processor market), the debate over the "scare and coercion" tactics supposedly employed by Intel is likely to be more protracted. IT insiders foresee the case to be a landmark ruling in what is a fiercely competitive market.

Advertising

Intel has become one of the world's most recognizable brands following its long-running "Intel Inside" campaign. The campaign, which started in 1990, was created by Intel marketing manager Dennis Carter. The five-note jingle was introduced the following year and by its tenth anniversary was being heard in 130 countries around the world. The Intel Inside program is very lucrative for advertisers. Intel pays half the advertising costs for any ad that uses the "Intel Inside" logo. However, if in print, the ad page cannot contain any references to competitors, such as AMD. If the ads do not meet these requirements, Intel does not pay half the cost and the advertiser is prohibited from using the "Intel Inside" logo. Intel employs a large staff whose primary function is looking for advertisements which violate the agreement. Advertisers found doing so (many of which are "mom and pop" shops ignorant of the reimbursement agreement) are requested to stop violating the use of the logo and are then told how to legally use the logo and get part of their advertising costs reimbursed. PC companies advertising products containing Intel chips are required to include the jingle in their film and television adverts in order to receive the reimbursement. The Centrino advertising campaign has been hugely successful, leading to the ability to access wireless internet from a laptop becoming linked in consumers' minds to Intel chips. In the UK this has caused some controversy, as the ASA upheld complaints that this was a misleading advert.

See also


- List of Intel microprocessors
- List of Intel chipsets

External links


- [http://www.intel.com/ Intel website]
- [http://www.intel.com/intel/finance/ Intel Investor Relations site]
- [http://www.intel.com/intel/intelis/museum/ Intel Museum]
- [http://www.amdboard.com/pintospecial.html Intel vs. AMD saga]
- [http://www.inteltechnology.net/ Intel Technology]

Data


- [http://biz.yahoo.com/ic/13/13787.html Yahoo! - Intel Corporation Company Profile]

Motorola 68000

The Motorola 68000 is a CISC microprocessor, the first member of a successful family of microprocessors from Motorola, which were mostly software compatible with one another. The entire series was often referred to as the m68k, or simply 68k.

History

Initial samples of the MC68000 were released in 1979. At the time, there was fierce competition among several of the then established manufacturers of 8 bit processors to bring out 16 bit designs. Intel was first, with the Intel 8086, but Motorola marketing made a point of the 68000 being a much more complete 16 bit design. This was reflected in the complexity. The transistor cell count, which was said to be 68,000 (in reality around 70,000), was more than twice that of the 29,000 cells of the 8086. By 1982, the 68000 was clocked at a then fast 8 MHz, with the simplest instructions taking four clocks but the most complex ones requiring many more, and an assumed average of 1 MIPS. Each instruction could do more than those of the Intel processors. Motorola ceased production of the original NMOS 68000 in 2000, although derivatives continue in production, such as the 68HC000 (a pin compatible, HCMOS version of 68000), the 680x0 family and the CPU32 family. As of 2001, Hitachi continued to manufacture the 68000 under license. The MC68000 was used for the design of computers like the Apple Macintosh, Commodore Amiga, Atari ST, early HP 9000s, and the original Sun Microsystems UNIX machines as well as the Apollo/Domain workstations. It was also used in the Sega Genesis/MegaDrive, NeoGeo and several arcade machines, including Atari's classic Marble Madness, as their main CPU. In the Sega Saturn, the 68000 was used as the sound processor, and in the Atari Jaguar they were used as a main controller for all the other dedicated hardware ICs. In the Silicon Graphics's IRIS 1000 and 1200 terminals, the 68000 was also used. 68000 derivatives persisted in the UNIX market for many years, because the architecture so strongly resembles the Digital PDP-11 and VAX, and is an excellent computer for running C code. The 68000 eventually saw its greatest success as a controller. Thousands of HP, Printronix and Adobe printers used it. 
Its derivative microcontrollers, the CPU32 and Coldfire processors have been manufactured in the millions as automotive engine controllers. It also sees use by medical manufacturers and many printer manufacturers because of its low cost, convenience, and good stability. The DragonBall low-voltage versions of the processor were used in the popular Palm Pilot series of PDAs from Palm Computing and the Handspring Visor, until the architecture was gradually phased out in favor of the ARM processor core. A small family of derivatives with integrated hi-speed serial ports (68302 and 68360) was used in many communication products from Cisco, 3com, Ascend, Marconi and others. The Motorola 68000 is also used in Texas Instruments' latest line of graphing calculators and in AlphaSmart's Dana portable electronic typewriters.

Architecture

Address bus

The 68000 was a clever compromise. When the 68000 was introduced, 16-bit buses were really the most practical size. However, the 68000 was designed with 32-bit registers and address spaces, on the assumption that hardware prices would fall. It is important to note that even though the 68000 had 16-bit ALUs, addresses were always stored as 32-bit quantities, i.e. it had a flat 32-bit address space. Contrast this with the 8086, which had a 20-bit address space, but could only access 16-bit (64 kilobyte) chunks without manipulating segment registers. The 68000 achieved this functionality using three 16-bit ALUs. In normal operation, two 16-bit ALUs are chained together to perform an address operation, while the third executes the 16-bit arithmetic. For example, a 32-bit address register postincrement on a 16-bit ADD.W (An)+, Dn runs without speed penalty. So even though it started out as a "16-bit" CPU, the 68000 has an instruction set that describes a 32-bit architecture. The importance of architecture cannot be emphasized enough. Throughout history, addressing pains have not been hardware implementation problems, but always architectural problems (instruction set problems, i.e. software compatibility problems). The successor 68020 with 32-bit ALU and 32-bit databus runs unchanged 68000 software at "32-bit speed", manipulating data up to 4 gigabytes, far beyond what software of other "16-bit" CPUs (for example, the 8086) could do. However, forwards-incompatible software sometimes resulted from programmers storing data in the address bits (24 through 31) that weren't implemented on the bus. When such code was executed on a machine with a wider address bus, bus errors resulted. Software upgrades were required before Macintosh computers could use more than 8 MB of RAM. This software usually remained backwards compatible. (Many applications were written with more foresight, and never had such problems.) To address the perceived markets, two M68000 variants were designed. 
The MC68000 had a 24-bit address, and a 16-bit data bus. The short form, the MC68008 (as used in the Sinclair QL), had a 20-bit address (20-bit in the DIP/48-pin version and 22-bit in the PLCC/52-pin version, introduced later), and an 8-bit data bus.
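
The forwards-compatibility problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the helper names `tag_pointer` and `bus_address`, the flag value, and the example address are hypothetical and not taken from any real system software.

```python
# Sketch: why storing flags in the unimplemented upper address byte
# worked on a 24-bit bus and failed on a 32-bit bus.
# All names and values are illustrative assumptions.

ADDRESS_MASK_24 = 0x00FFFFFF  # 68000: only address bits 0-23 reach the bus
ADDRESS_MASK_32 = 0xFFFFFFFF  # 68020: all 32 address bits reach the bus


def tag_pointer(address: int, flags: int) -> int:
    """Pack an 8-bit flag byte into bits 24-31 of a 32-bit pointer."""
    return ((flags & 0xFF) << 24) | (address & ADDRESS_MASK_24)


def bus_address(pointer: int, mask: int) -> int:
    """Return the address that would actually be driven onto the bus."""
    return pointer & mask


pointer = tag_pointer(0x00012340, flags=0x80)

# On a 24-bit bus the flag byte is silently dropped.
print(hex(bus_address(pointer, ADDRESS_MASK_24)))
# On a 32-bit bus the flag byte becomes part of the address.
print(hex(bus_address(pointer, ADDRESS_MASK_32)))
```

Software that masked its pointers before use, or never tagged them at all, behaves the same under either mask, which is why it kept working on the wider bus.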

Internal registers

The CPU had eight 32-bit general-purpose data registers (D0-D7), and eight address registers (A0-A7). The last address register was also the standard stack pointer, and could be called either A7 or SP. This was a good number of registers in many ways. It was small enough to allow the 68000 to respond quickly to interrupts (because only 15 or 16 had to be saved), and yet large enough to make most calculations fast. Having two types of registers was mildly annoying at times, but not hard to use in practice. Reportedly, it allowed the CPU designers to achieve a higher degree of parallelism, by using an auxiliary execution unit for the address registers. Integer representation in the 68000 family is big-endian.
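
As a brief illustration of big-endian integer representation, the following sketch uses Python's standard `struct` module to show how a 32-bit value would be laid out in memory; the little-endian line is included only for contrast, and the sample value is arbitrary.

```python
import struct

value = 0x12345678

# Big-endian: most significant byte at the lowest address (68000 family order).
big = struct.pack(">I", value)
# Little-endian, shown only for contrast.
little = struct.pack("<I", value)

print(big.hex())
print(little.hex())

# Reading the 16-bit word at the lowest address of the big-endian image
# yields the upper half of the 32-bit value.
upper_word = int.from_bytes(big[0:2], "big")
print(hex(upper_word))
```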

Status register

The 68000 comparison, arithmetic and logic operations set bits in a status register to record their results for use by later conditional jumps. The bits were "Z"ero, "C"arry, o"V"erflow, e"X"tend, and "N"egative. The e"X"tend bit deserves special mention, because it was separated from the Carry. This permitted the extra bit from arithmetic, logic and shift operations to be separated from the carry for flow-of-control and linkage.
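
The value of keeping eXtend separate from Carry can be sketched with a small model of the flags. This is a simplified illustration, not a cycle-accurate emulation: `add16` and `compare16` are hypothetical helpers, and the Z-flag handling of the real add-with-extend instruction is not modelled. The sketch chains two 16-bit additions into a 32-bit sum, with a comparison in between that changes Carry but leaves eXtend alone.

```python
MASK16 = 0xFFFF
SIGN16 = 0x8000


def add16(a, b, x_in=0):
    """Add two 16-bit values plus an incoming eXtend bit.

    Returns (result, flags). Both C and X receive the carry out.
    """
    a &= MASK16
    b &= MASK16
    total = a + b + (x_in & 1)
    result = total & MASK16
    carry = 1 if total > MASK16 else 0
    # Signed overflow: operands share a sign that the result does not.
    overflow = 1 if (~(a ^ b) & (a ^ result) & SIGN16) else 0
    flags = {
        "N": 1 if result & SIGN16 else 0,
        "Z": 1 if result == 0 else 0,
        "V": overflow,
        "C": carry,
        "X": carry,
    }
    return result, flags


def compare16(dest, src, flags):
    """Model a compare: set N, Z, V, C from dest - src, leave X untouched."""
    dest &= MASK16
    src &= MASK16
    diff = dest - src
    result = diff & MASK16
    updated = dict(flags)  # X is copied through unchanged
    updated["N"] = 1 if result & SIGN16 else 0
    updated["Z"] = 1 if result == 0 else 0
    updated["V"] = 1 if ((dest ^ src) & (dest ^ result) & SIGN16) else 0
    updated["C"] = 1 if diff < 0 else 0
    return updated


# 0x0000FFFF + 0x00000001 computed as two 16-bit halves.
low, flags_after_low = add16(0xFFFF, 0x0001)
# A compare used for flow control clears C here, but X survives.
flags_after_compare = compare16(5, 5, flags_after_low)
high, _ = add16(0x0000, 0x0000, flags_after_compare["X"])

print(hex((high << 16) | low))
```

Had the second addition consumed C instead of X, the intervening compare would have lost the carry out of the low half.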

The instruction set

The designers attempted to make the assembly language orthogonal. That is, instructions were divided into operations and address modes, and almost all address modes were available for almost all instructions. Many programmers disliked the "near" orthogonality, while others were grateful for the attempt.

At the bit level, the person writing the assembler would clearly see that these "instructions" could become any of several different op-codes. It was quite a good compromise because it gave almost the same convenience as a truly orthogonal machine, and yet also gave the CPU designers freedom to fill in the op-code table.

The minimal instruction size was huge for its day at 16 bits. Furthermore, many instructions and addressing modes added extra words on the back for addresses, more address-mode bits, etc.

Many designers believed that the MC68000 architecture had compact code for its cost, especially when produced by compilers. This belief in more compact code led to many of its design wins, and much of its longevity as an architecture. Most embedded system designers are acutely aware of the costs of memory. This belief (or feature, depending on the designer) continued to make design wins for the instruction set (with updated CPUs) up until the ARM architecture introduced the Thumb instruction set that was similarly compact.
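
The split into operation and address-mode fields can be sketched by assembling the 16-bit opcode word of a MOVE instruction by hand. The field layout follows the MOVE encoding in the Programmer's Reference Manual; the function `encode_move` and the constant names are hypothetical, and only modes that need no extension words are covered.

```python
# Sketch: building the opcode word for MOVE from its fields.
# Layout: 00 | size(2) | dst reg(3) | dst mode(3) | src mode(3) | src reg(3)

SIZE_BITS = {"b": 0b01, "w": 0b11, "l": 0b10}

MODE_DATA_REG = 0b000   # Dn
MODE_ADDR_REG = 0b001   # An
MODE_INDIRECT = 0b010   # (An)
MODE_POSTINC = 0b011    # (An)+
MODE_PREDEC = 0b100     # -(An)


def encode_move(size, src_mode, src_reg, dst_mode, dst_reg):
    """Return the 16-bit opcode word for MOVE.<size> src,dst."""
    return (
        (SIZE_BITS[size] << 12)
        | (dst_reg << 9)
        | (dst_mode << 6)
        | (src_mode << 3)
        | src_reg
    )


# MOVE.L D0,D1
print(hex(encode_move("l", MODE_DATA_REG, 0, MODE_DATA_REG, 1)))
# MOVE.W (A0)+,D1
print(hex(encode_move("w", MODE_POSTINC, 0, MODE_DATA_REG, 1)))
# MOVE.B D2,-(A7)
print(hex(encode_move("b", MODE_DATA_REG, 2, MODE_PREDEC, 7)))
```

The same mode and register fields recur across many other instructions, which is what gives the instruction set its near-orthogonal feel.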

Privilege levels

The CPU, and later the whole family, implemented exactly two levels of privilege. User mode gave access to everything except the interrupt level control. Supervisor privilege gave access to everything. An interrupt always became supervisory. The supervisor bit was stored in the status register, and visible to user programs. A real advantage of this system was that the supervisor level had a separate stack pointer. This permitted a multitasking system to use very small stacks for tasks, because the designers did not have to allocate the memory required to hold the stack frames of a maximum stack-up of interrupts.

Interrupts

The CPU recognized 8 interrupt levels. Levels 0 through 7 were strictly prioritized. That is, a higher-numbered interrupt could always interrupt a lower-numbered interrupt. A privileged instruction allowed one to set the current minimum interrupt level in the status register, blocking lower priority interrupts. Level 7 was not maskable; in other words, it was an NMI. Level 0 could be interrupted by any higher level. The level was stored in the status register, and was visible to user-level programs.

Hardware interrupts are signalled to the CPU using three inputs that encode the highest pending interrupt priority. A separate interrupt controller is usually required to encode the interrupts, though for systems that do not require more than three hardware interrupts it is possible to connect the interrupt signals directly to the encoded inputs at the cost of additional software complexity. The interrupt controller can be as simple as a 74LS148 priority encoder, or may be part of a VLSI peripheral chip such as the MC68901 Multi-Function Peripheral, which also provided a UART, timer, and parallel I/O.

The "exception table" (interrupt vector addresses) was fixed at addresses 0 through 1023, permitting 256 32-bit vectors. The first vector was the starting stack address, and the second was the starting code address. Vectors 2 through 15 were used to report various errors: bus error, address error, illegal instruction, zero division, CHK & CHK2 vector, privilege violation, and some reserved vectors that became line 1010 emulator, line 1111 emulator, and hardware breakpoint. Vector 24 started the real interrupts: spurious interrupt (no hardware acknowledgement), and level 1 through level 7 autovectors, then the 16 TRAP vectors, then some more reserved vectors, then the user defined vectors.

Since at a minimum the starting code address vector must always be valid on reset, systems commonly included some nonvolatile memory (e.g. ROM) starting at address zero to contain the vectors and bootstrap code. However, for a general purpose system it is desirable for the operating system to be able to change the vectors at runtime. This was often accomplished by either pointing the vectors in ROM to a jump table in RAM, or through use of bank-switching to allow the ROM to be replaced by RAM at runtime.

The 68000 did not meet the Popek and Goldberg virtualization requirements for full processor virtualization because it had a single unprivileged instruction, "MOVE from SR", which allowed user-mode software read-only access to a small amount of privileged state.

The 68000 was also unable to easily support virtual memory, which requires the ability to trap and recover from a failed memory access. The 68000 does provide a bus error exception which can be used to trap, but it does not save enough processor state to resume the faulted instruction once the operating system has handled the exception. Several companies did succeed in making 68000 based Unix workstations with virtual memory that worked, by using two 68000 chips running in parallel on different phased clocks. When the "leading" 68000 encountered a bad memory access, extra hardware would interrupt the "main" 68000 to prevent it from also encountering the bad memory access. This interrupt routine would handle the virtual memory functions and restart the "leading" 68000 in the correct state to continue properly synchronized operation when the "main" 68000 returned from the interrupt.

These problems were fixed in the next major revision of the 68K architecture, with the release of the MC68010. The Bus Error and Address Error exceptions pushed a large amount of internal state onto the supervisor stack in order to facilitate recovery, and the MOVE from SR instruction was made privileged. A new unprivileged "MOVE from CCR" instruction was provided for use in its place by user mode software; an operating system could trap and emulate user-mode MOVE from SR instructions if desired.
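
The priority and masking rules, and the layout of the vector table, can be summarised in a short sketch. It is a simplified model under stated assumptions: the function names are hypothetical, level 0 is treated as "no interrupt pending", and the edge-triggered behaviour of level 7 on real hardware is ignored.

```python
def highest_pending(pending_levels):
    """Model a priority encoder: report the highest pending level (0 if none)."""
    return max(pending_levels, default=0)


def interrupt_accepted(level, mask):
    """Decide whether a request at `level` is taken given the mask in SR.

    Level 7 is always taken; other levels must exceed the current mask.
    """
    if level == 7:
        return True
    return level > mask


def autovector_number(level):
    """Autovectors for levels 1-7 follow the spurious interrupt vector (24)."""
    if not 1 <= level <= 7:
        raise ValueError("autovectors exist only for levels 1 through 7")
    return 24 + level


def vector_address(vector_number):
    """Each of the 256 vectors is a 32-bit entry in a table starting at 0."""
    if not 0 <= vector_number <= 255:
        raise ValueError("vector number out of range")
    return vector_number * 4


level = highest_pending({2, 5})
print(level, interrupt_accepted(level, mask=3))
print(hex(vector_address(autovector_number(level))))
```

With the mask set to 3, the level 5 request is taken and the level 2 request stays pending until the mask drops below 2.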

Instruction set details

The standard addressing modes are:
- Register direct
  - data register, e.g. "D0"
  - address register, e.g. "A6"
- Register indirect
  - Simple address, e.g. (A0)
  - Address with post-increment, e.g. (A0)+
  - Address with pre-decrement, e.g. -(A0)
  - Address with a 16-bit signed offset, e.g. 16(A0)
  - Note that the actual increment or decrement size was dependent on the operand request: a byte read instruction incremented the address register by 1, a word read by 2, and a long read by 4.
- Register indirect with an Index
  - 8-bit signed offset, e.g. 8(A0, D0) or 8(A0, A1)
- PC (program counter) relative with displacement
  - 16-bit signed offset, e.g. 16(PC). This mode was very useful.
  - 8-bit signed offset with index, e.g. 8(PC, D2)
- Absolute memory location
  - Either a number, e.g. "$4000", or a symbolic name translated by the assembler
  - Most 68000 assemblers used the "$" symbol for hexadecimal, instead of "0x".
- Immediate mode
  - Stored in the instruction, e.g. "#400".

Plus: access to the status register, and, in later models, other special registers.

Most instructions had dot-letter suffixes, permitting operations to occur on 8-bit bytes (".b"), 16-bit words (".w"), and 32-bit longs (".l"). Most instructions are dyadic, that is, the operation has a source and a destination, and the destination is changed. Notable instructions were:
- Arithmetic: ADD, SUB, MULU (unsigned multiply), MULS (signed multiply), DIVU, DIVS, NEG (additive negation), and CMP (a sort of subtract that set the status bits, but did not store the result)
- Binary Coded Decimal Arithmetic: ABCD, and SBCD
- Logic: EOR (exclusive or), AND, NOT (logical not)
- Shifting: LSL, LSR (logical, i.e. right shifts put zero in the most significant bit); ASL, ASR (arithmetic, i.e. right shifts sign-extend the most significant bit); ROXL, ROXR (rotates through eXtend); ROL, ROR (rotates that do not go through eXtend)
- Bit manipulation in memory: BSET (to 1), BCLR (to 0), and BTST (set the Zero bit)
- Multiprocessing control: TAS, test-and-set, performed an indivisible bus operation, permitting semaphores to be used to synchronize several processors sharing a single memory
- Flow of control: JMP (jump), JSR (jump to subroutine), BSR (relative address jump to subroutine), RTS (return from subroutine), RTE (return from exception, i.e. an interrupt), TRAP (trigger a software exception similar to software interrupt), CHK (a conditional software exception)
- Branch: Bcc (a branch where the "cc" specified one of 16 tests of the condition codes in the status register: equal, greater than, less-than, carry, and most combinations and logical inversions, available from the status register).
- Decrement-and-branch: DBcc (where "cc" was as for the branch instructions), which fell through to the next instruction if the condition was true; otherwise it decremented a D-register and branched to a destination unless the register had been decremented to -1. This use of -1 instead of 0 as the terminating value allowed the easy coding of loops which had to do nothing if the count was 0 to begin with, without the need for an additional check before entering the loop.
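
The termination rule of the decrement-and-branch instruction can be made concrete with a small model. This is a sketch only: `dbcc` and the two loop helpers are hypothetical names, the counter is treated as the low 16 bits of a data register, and the condition is held false throughout (the always-false form, usually written DBRA or DBF).

```python
def dbcc(counter, condition_true):
    """Model one execution of DBcc; return (new_counter, branch_taken)."""
    if condition_true:
        # Condition met: fall through without touching the counter.
        return counter, False
    counter = (counter - 1) & 0xFFFF
    # 0xFFFF is -1 as a 16-bit value: the loop ends when it is reached.
    return counter, counter != 0xFFFF


def runs_entering_at_top(initial_count):
    """Body first, DBcc at the bottom: the body runs initial_count + 1 times."""
    counter = initial_count & 0xFFFF
    runs = 0
    while True:
        runs += 1
        counter, taken = dbcc(counter, condition_true=False)
        if not taken:
            return runs


def runs_entering_at_bottom(initial_count):
    """Jump to the DBcc first: the body runs exactly initial_count times."""
    counter = initial_count & 0xFFFF
    runs = 0
    while True:
        counter, taken = dbcc(counter, condition_true=False)
        if not taken:
            return runs
        runs += 1


print(runs_entering_at_top(3), runs_entering_at_bottom(3))
print(runs_entering_at_top(0), runs_entering_at_bottom(0))
```

Entering the loop at the bottom is what lets a count of 0 skip the body entirely without a separate test.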

See also


- 68k
- x86

References


- [http://www.freescale.com/files/archives/doc/ref_manual/M68000PRM.pdf Motorola MC68000 Family Programmer's Reference Manual]
- [http://www.esacademy.com/automation/faq/m68k/ comp.sys.m68k FAQ]

External links


- [http://68k.hax.com/ Descriptions of assembler instructions]
- [http://www.cpu-collection.de/?tn=1&l0=cl&l1=68000 68000 images and descriptions at cpu-collection.de]

Motorola 68010

The Motorola MC68010 processor is a 16/32-bit microprocessor from Motorola, made in the early 1980s. It is largely similar to the Motorola 68000 CPU, with the addition of several instructions for breakpoint and register control (CCR instead of SR), as well as the ability to save all of the processor state on an interrupt or exception. This made it possible to use the processor for virtual memory applications, for which the 68000 was unsuited (in detail: unlike the 68000, the 68010 saved enough state to resume the faulted instruction after a bus error).

Additionally, the 68010 had a "loop mode", i.e. a mini instruction cache, which accelerates loops that consist of only 2 instructions. But the overall speed gain compared to the 68000 was below 10% in practice, so it did not make much sense to replace a 68000 CPU with its pin compatible successor.

The 68010 was not 100% software compatible with the 68000. The most problematic difference was the exception stack frame.

The 68010 could be used with the 68451 MMU, but problems with the design, in particular a 1 clock memory access penalty, made this configuration unpopular and led other vendors such as Sun Microsystems to use their own MMU designs.

The 68010 was never as popular as the 68000, as the added complexity and cost turned out not to be worthwhile in practice. Most vendors looking for MMU functionality waited for the 68020 instead. However, due to its small speed boost over the 68000, it can be found in a number of Unix workstations and research machines.

The 68010 had a feature useful to hackers. The Vector Base Register (VBR) allows the exception vectors to be moved from low memory to an arbitrary location. A monitor/debugger program can intercept the interrupts, and maintain the ability to activate on demand even if the low-memory vectors are modified.
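
The effect of the Vector Base Register can be sketched as a change in where a vector is fetched from. The function name `exception_vector_address` and the example base address are hypothetical; the sketch assumes 256 four-byte vectors, as described for the 68000 exception table.

```python
def exception_vector_address(vector_number, vbr=0):
    """Return the address a vector is fetched from, given the VBR contents.

    With vbr=0 the table sits in low memory, as on the 68000.
    """
    if not 0 <= vector_number <= 255:
        raise ValueError("vector number out of range")
    return (vbr + 4 * vector_number) & 0xFFFFFFFF


BUS_ERROR_VECTOR = 2

# Table in low memory.
print(hex(exception_vector_address(BUS_ERROR_VECTOR)))
# Table relocated by a monitor to an arbitrary (here made-up) base address.
print(hex(exception_vector_address(BUS_ERROR_VECTOR, vbr=0x00FF0000)))
```

Because the processor consults the relocated table, writes to the low-memory vectors no longer affect which handlers run.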

External links


- [http://www.cpu-collection.de/?tn=1&l0=cl&l1=68010 68010 images and descriptions at cpu-collection.de]

Motorola 68020

The Motorola 68020 is a microprocessor from Motorola. It is the successor to the Motorola 68010 and is succeeded by the Motorola 68030.

Description

The 68020 had 32-bit internal and external data and address buses. A lower cost version, the 68EC020, only had a 24-bit address bus.

Improvements over 68010

The 68020 added many improvements to the 68010 including a 32-bit arithmetic and logical unit (ALU) and external data bus and address bus, and new instructions and addressing modes. The 68020 (and 68030) had a proper three-stage pipeline.

Multiprocessing features

The Motorola multiprocessing model was added with the 68020. This allowed up to eight processors per system to co-operate; these eight could be any number of CPUs and FPUs, but only a single MMU (either a 68841 or 68851). This had some limitations, as each CPU had to be the same model (not necessarily the same clock) and each FPU had to be the same model (again, not necessarily the same clock). Multiprocessing a 68020/25 with a 68030/25 was therefore not allowed (the 020, for example, could not be aware of the 030's internal MMU), but a 68020/25 with a 68882/33 was perfectly acceptable and quite common. It was, however, extremely uncommon to see more than one CPU or FPU in the same system. Most Unix boxes made with 68020s were simply the '020, an FPU (68881 or 68882) and an MMU (68841 or 68851).

Instruction set

The new instructions included some minor improvements and extensions to the supervisor state, several instructions for software management of a multiprocessing system (which were removed in the 68060), some support for high-level languages which did not get used much (and was removed from future 680x0 processors), bigger (32 x 32-bit) multiply and divide instructions, and bit field manipulations.

Addressing modes

The new addressing modes added another level of indirection to many of the pre-existing modes, and added quite a bit of flexibility to various indexing modes and operations. Though it was not intended, these new modes made the 68020 very suitable for page printing; most laser printers in the early '90s had a 68EC020 at their core.

The instruction buffer (an instruction cache) was 256 bytes, arranged as 64 direct-mapped 4-byte entries. Although small, it made a significant difference in the performance of many applications.
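
How a cache of 64 direct-mapped 4-byte entries behaves on a small loop can be sketched as follows. This is a simplified model, not a description of the actual hardware: the class `DirectMappedCache` is hypothetical, the tag is taken to be just the address bits above the index, and instruction fetches are assumed to be 16-bit words at made-up addresses.

```python
ENTRY_BYTES = 4
NUM_ENTRIES = 64


class DirectMappedCache:
    """Each address maps to exactly one entry, chosen by its index bits."""

    def __init__(self):
        self.tags = [None] * NUM_ENTRIES
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up one fetch; fill the entry on a miss. Returns True on a hit."""
        index = (address // ENTRY_BYTES) % NUM_ENTRIES
        tag = address // (ENTRY_BYTES * NUM_ENTRIES)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag
        self.misses += 1
        return False


# A 32-byte loop fetched one 16-bit word at a time, executed 10 times.
cache = DirectMappedCache()
for _ in range(10):
    for address in range(0x1000, 0x1020, 2):
        cache.access(address)

print(cache.hits, cache.misses)
```

Only the first pass through the loop misses; every later pass is served from the cache. Two addresses exactly 256 bytes apart share an entry and evict each other, which is the usual cost of a direct-mapped design.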

Usage

The 68020 was used in many models of the Amiga and Apple Macintosh II series of personal computers and Sun 3 workstations. For more information on the instructions and architecture see Motorola 68000.

External links


- [http://www.cpu-collection.de/?tn=1&l0=cl&l1=68020 68020 images and descriptions at cpu-collection.de]

Motorola 68030

The Motorola 68030 is a 32-bit microprocessor in Motorola's 68000 family. The 68030 was the successor to the Motorola 68020, and was followed by the Motorola 68040. In keeping with general Motorola naming, this CPU is often referred to as the 030.

The 68030 features on-chip split instruction and data caches of 256 bytes each. It also has an on-chip memory management unit. The 68881 and the faster 68882 FPU (floating point unit) chips could be used with the 68030. A lower cost version of the 68030, the Motorola 68EC030, was also released, lacking the on-chip MMU.

As a microarchitecture, the 68030 is uninteresting. It is little more than a 68020 core with an added data cache (which made little difference to performance) and a process shrink. Motorola used the process shrink to pack more hardware on the die; in this case it was the MMU, which was 68851 compatible. Per clock, the 68030 did not differentiate itself in performance from the 68020 that it was derived from. The finer manufacturing process, however, allowed Motorola to scale the processor to 50MHz. The EC variety topped out at 40MHz.

The 68030 was used in many models of the Apple Macintosh IIx and Amiga series of personal computers, as well as the NeXT Cube and some descendants of the Atari ST line such as the Atari TT and the Atari Falcon. This processor also powers Cisco Systems' 2500 Series Router, a small-to-medium enterprise computer internetworking appliance.

External links


- [http://www.cpu-collection.de/?tn=1&l0=cl&l1=68030 68030 images and descriptions at cpu-collection.de]

Motorola 68EC030

The 68EC030 is a microprocessor from Motorola. It is a lower cost version of the Motorola 68030, the difference between the two being that the 68EC030 does not have an on-chip memory management unit. The 68EC030 was used as the CPU of one model of the Amiga 4000.

Motorola 68040

The Motorola 68040 is a microprocessor from Motorola. It is the successor to the Motorola 68030 and is followed by the Motorola 68060 (the 68050 was an abandoned project and never shipped, the 050 was to the 040 what the 030 was to the 020, a simple die shrink and cache size increase). In keeping with general Motorola naming, the 68040 is often referred to as simply the 040. The stripped-down version of the 68040 that lacks the FPU is the 68LC040. Used mainly by Macintosh computers, the 68040 was found mainly in the high-end Quadras. The fastest 68040 processor was clocked at 40MHz and it was only used in the Quadra 840AV. The more expensive models in the (short-lived) mid-high Centris also used the 68040, while the cheaper Centris and Performas used the 68LC040. The 68040 is the first 680x0 family member with an on-chip FPU (floating point unit). It thus includes all of the functionality that previously required external chips, namely the FPU a