JTAG implements standards for on-chip instrumentation in electronic design automation (EDA) as a complementary tool to digital simulation. It specifies the use of a dedicated debug port implementing a serial communications interface for low-overhead access without requiring direct external access to the system address and data buses. The interface connects to an on-chip Test Access Port (TAP) that implements a stateful protocol to access a set of test registers that present chip logic levels and device capabilities of various parts.
The Joint Test Action Group formed in 1985 to develop a method of verifying designs and testing printed circuit boards after manufacture. In 1990 the Institute of Electrical and Electronics Engineers codified the results of the effort in IEEE Standard 1149.1-1990, entitled Standard Test Access Port and Boundary-Scan Architecture.
The JTAG standards have been extended by many semiconductor chip manufacturers with specialized variants to provide vendor-specific features.
In the 1980s, multi-layer circuit boards and integrated circuits (ICs) using ball grid array and similar mounting technologies were becoming standard, and connections were being made between ICs that were not available to probes. The majority of manufacturing and field faults in circuit boards were due to poor solder joints on the boards, imperfections among board connections, or the bonds and bond wires from IC pads to pin lead frames. The Joint Test Action Group (JTAG) was formed in 1985 to provide a pins-out view from one IC pad to another so these faults could be discovered.
The industry standard became an IEEE standard in 1990 as IEEE Std. 1149.1-1990 after many years of initial use. In the same year, Intel released their first processor with JTAG (the 80486) which led to quicker industry adoption by all manufacturers. In 1994, a supplement that contains a description of the boundary scan description language (BSDL) was added. Further refinements regarding the use of all-zeros for EXTEST, separating the use of SAMPLE from PRELOAD and better implementation for OBSERVE_ONLY cells were made and released in 2001. Since 1990, this standard has been adopted by electronics companies around the world. Boundary scan is now mostly synonymous with JTAG, but JTAG has essential uses beyond such manufacturing applications.
Although JTAG's early applications targeted board level testing, here the JTAG standard was designed to assist with device, board, and system testing, diagnosis, and fault isolation. Today JTAG is used as the primary means of accessing sub-blocks of integrated circuits, making it an essential mechanism for debugging embedded systems which might not have any other debug-capable communications channel. On most systems, JTAG-based debugging is available from the very first instruction after CPU reset, letting it assist with development of early boot software which runs before anything is set up. An in-circuit emulator (or, more correctly, a "JTAG adapter") uses JTAG as the transport mechanism to access on-chip debug modules inside the target CPU. Those modules let software developers debug the software of an embedded system directly at the machine instruction level when needed, or (more typically) in terms of high level language source code.
System software debug support is for many software developers the main reason to be interested in JTAG. Many silicon architectures such as PowerPC, MIPS, ARM, x86 built an entire software debug, instruction tracing, and data tracing infrastructure around the basic JTAG protocol. Frequently individual silicon vendors however only implement parts of these extensions. Some examples are ARM CoreSight and Nexus as well as Intel's BTS (Branch Trace Storage), LBR (Last Branch Record), and IPT (Intel Processor Trace) implementations. There are many other such silicon vendor-specific extensions that may not be documented except under NDA. The adoption of the JTAG standard helped move JTAG-centric debugging environments away from early processor-specific designs. Processors can normally be halted, single stepped, or let run freely. One can set code breakpoints, both for code in RAM (often using a special machine instruction) and in ROM/flash. Data breakpoints are often available, as is bulk data download to RAM. Most designs have "halt mode debugging", but some allow debuggers to access registers and data buses without needing to halt the core being debugged. Some toolchains can use ARM Embedded Trace Macrocell (ETM) modules, or equivalent implementations in other architectures to trigger debugger (or tracing) activity on complex hardware events, like a logic analyzer programmed to ignore the first seven accesses to a register from one particular subroutine.
Sometimes FPGA developers also use JTAG to develop debugging tools. The same JTAG techniques used to debug software running inside a CPU can help debug other digital design blocks inside an FPGA. For example, custom JTAG instructions can be provided to allow reading registers built from arbitrary sets of signals inside the FPGA, providing visibility for behaviors which are invisible to boundary scan operations. Similarly, writing such registers could provide controllability which is not otherwise available.
JTAG allows device programmer hardware to transfer data into internal non-volatile device memory (e.g. CPLDs). Some device programmers serve a double purpose for programming as well as debugging the device. In the case of FPGAs, volatile memory devices can also be programmed via the JTAG port, normally during development work. In addition, internal monitoring capabilities (temperature, voltage and current) may be accessible via the JTAG port.
JTAG programmers are also used to write software and data into flash memory. This is usually done using the same data bus access the CPU would use, and is sometimes handled by the CPU. In other cases the memory chips themselves have JTAG interfaces. Some modern debug architectures provide internal and external bus master access without needing to halt and take over a CPU. In the worst case, it is usually possible to drive external bus signals using the boundary scan facility.
As a practical matter, when developing an embedded system, emulating the instruction store is the fastest way to implement the "debug cycle" (edit, compile, download, test, and debug). This is because the in-circuit emulator simulating an instruction store can be updated very quickly from the development host via, say, USB. Using a serial UART port and bootloader to upload firmware to Flash makes this debug cycle quite slow and possibly expensive in terms of tools; installing firmware into Flash (or SRAM instead of Flash) via JTAG is an intermediate solution between these extremes.
JTAG boundary scan technology provides access to many logic signals of a complex integrated circuit, including the device pins. The signals are represented in the boundary scan register (BSR) accessible via the TAP. This permits testing as well as controlling the states of the signals for testing and debugging. Therefore, both software and hardware (manufacturing) faults may be located and an operating device may be monitored.
When combined with built-in self-test (BIST), the JTAG scan chain enables a low overhead, embedded solution to test an IC for certain static faults (shorts, opens, and logic errors). The scan chain mechanism does not generally help diagnose or test for timing, temperature or other dynamic operational errors that may occur. Test cases are often provided in standardized formats such as SVF, or its binary sibling XSVF, and used in production tests. The ability to perform such testing on finished boards is an essential part of Design For Test in today's products, increasing the number of faults that can be found before products ship to customers.
A JTAG interface is a special interface added to a chip. Depending on the version of JTAG, two, four, or five pins are added. The four and five pin interfaces are designed so that multiple chips on a board can have their JTAG lines daisy-chained together if specific conditions are met. The two pin interface is designed so that multiple chips can be connected in a star topology. In either case a test probe need only connect to a single "JTAG port" to have access to all chips on a circuit board.
The connector pins are:
The TRST pin is an optional active-low reset to the test logic, usually asynchronous, but sometimes synchronous, depending on the chip. If the pin is not available, the test logic can be reset by switching to the reset state synchronously, using TCK and TMS. Note that resetting test logic doesn't necessarily imply resetting anything else. There are generally some processor-specific JTAG operations which can reset all or part of the chip being debugged.
Since only one data line is available, the protocol is serial. The clock input is at the TCK pin. One bit of data is transferred in from TDI, and out to TDO per TCK rising clock edge. Different instructions can be loaded. Instructions for typical ICs might read the chip ID, sample input pins, drive (or float) output pins, manipulate chip functions, or bypass (pipe TDI to TDO to logically shorten chains of multiple chips).
As with any clocked signal, data presented to TDI must be valid for some chip-specific Setup time before and Hold time after the relevant (here, rising) clock edge. TDO data is valid for some chip-specific time after the falling edge of TCK.
The maximum operating frequency of TCK varies depending on all chips in the chain (the lowest speed must be used), but it is typically 10-100 MHz (100-10 ns per bit). Also TCK frequencies depend on board layout and JTAG adapter capabilities and state. One chip might have a 40 MHz JTAG clock, but only if it is using a 200 MHz clock for non-JTAG operations; and it might need to use a much slower clock when it is in a low power mode. Accordingly, some JTAG adapters have adaptive clocking using an RTCK (Return TCK) signal. Faster TCK frequencies are most useful when JTAG is used to transfer much data, such as when storing a program executable into flash memory.
Clocking changes on TMS steps through a standardized JTAG state machine. The JTAG state machine can reset, access an instruction register, or access data selected by the instruction register.
JTAG platforms often add signals to the handful defined by the IEEE 1149.1 specification. A System Reset (SRST) signal is quite common, letting debuggers reset the whole system, not just the parts with JTAG support. Sometimes there are event signals used to trigger activity by the host or by the device being monitored through JTAG; or, perhaps, additional control lines.
Even though few consumer products provide an explicit JTAG port connector, the connections are often available on the printed circuit board as a remnant from development prototyping and/or production. When exploited, these connections often provide the most viable means for reverse engineering.
Reduced pin count JTAG uses only two wires, a clock wire and a data wire. This is defined as part of the IEEE 1149.7 standard. The connector pins are:
It is called cJTAG for compact JTAG.
The two wire interface reduced pressure on the number of pins, and devices can be connected in a star topology. The star topology enables some parts of the system to be powered down, while others can still be accessed over JTAG; a daisy chain requires all JTAG interfaces to be powered. Other two-wire interfaces exist, such as Serial Wire Debug.
In JTAG, devices expose one or more test access ports (TAPs). The picture above shows three TAPs, which might be individual chips or might be modules inside one chip. A daisy chain of TAPs is called a scan chain, or (loosely) a target. Scan chains can be arbitrarily long, but in practice twenty TAPs is unusually long.
To use JTAG, a host is connected to the target's JTAG signals (TMS, TCK, TDI, TDO, etc.) through some kind of JTAG adapter, which may need to handle issues like level shifting and galvanic isolation. The adapter connects to the host using some interface such as USB, PCI, Ethernet, and so forth.
The host communicates with the TAPs by manipulating TMS and TDI in conjunction with TCK, and reading results through TDO (which is the only standard host-side input). TMS/TDI/TCK output transitions create the basic JTAG communication primitive on which higher layer protocols build:
So at a basic level, using JTAG involves reading and writing instructions and their associated data registers; and sometimes involves running a number of test cycles. Behind those registers is hardware that is not specified by JTAG, and which has its own states that is affected by JTAG activities.
Most JTAG hosts use the shortest path between two states, perhaps constrained by quirks of the adapter. (For example, one adapter[which?] only handles paths whose lengths are multiples of seven bits.) Some layers built on top of JTAG monitor the state transitions, and use uncommon paths to trigger higher level operations. Some ARM cores use such sequences to enter and exit a two-wire (non-JTAG) SWD mode. A Zero Bit Scan (ZBS) sequence is used in IEEE 1149.7 to access advanced functionality such as switching TAPs into and out of scan chains, power management, and a different two-wire mode.
Instruction register sizes tend to be small, perhaps four or seven bits wide. Except for BYPASS and EXTEST, all instruction opcodes are defined by the TAP implementor, as are their associated data registers; undefined instruction codes should not be used. Two key instructions are:
On exit from the RESET state, the instruction register is preloaded with either BYPASS or IDCODE. This allows JTAG hosts to identify the size and, at least partially, contents of the scan chain to which they are connected. (They can enter the RESET state then scan the Data Register until they read back the data they wrote. A BYPASS register has only a zero bit; while an IDCODE register is 32-bits and starts with a one. So the bits not written by the host can easily be mapped to TAPs.) Such identification is often used to sanity check manual configuration, since IDCODE is often unspecific. It could for example identify an ARM Cortex-M3 based microcontroller, without specifying the microcontroller vendor or model; or a particular FPGA, but not how it has been programmed.
A common idiom involves shifting BYPASS into the instruction registers of all TAPs except one, which receives some other instruction. That way all TAPs except one expose a single bit data register, and values can be selectively shifted into or out of that one TAP's data register without affecting any other TAP.
The IEEE 1149.1 (JTAG) standard describes a number of instructions to support boundary scan applications. Some of these instructions are "mandatory", but TAPs used for debug instead of boundary scan testing sometimes provide minimal or no support for these instructions. Those "mandatory" instructions operate on the Boundary Scan Register (BSR) defined in the BSDL file, and include:
IEEE-defined "Optional" instructions include:
Devices may define more instructions, and those definitions should be part of a BSDL file provided by the manufacturer. They are often only marked as PRIVATE.
Devices communicate to the world via a set of input and output pins. By themselves, these pins provide limited visibility into the workings of the device. However, devices that support boundary scan contain a shift-register cell for each signal pin of the device. These registers are connected in a dedicated path around the device's boundary (hence the name). The path creates a virtual access capability that circumvents the normal inputs and outputs, providing direct control of the device and detailed visibility for signals.
The contents of the boundary scan register, including signal I/O capabilities, are usually described by the manufacturer using a part-specific BSDL file. These are used with design 'netlists' from CAD/EDA systems to develop tests used in board manufacturing. Commercial test systems often cost several thousand dollars for a complete system, and include diagnostic options to pinpoint faults such as open circuits and shorts. They may also offer schematic or layout viewers to depict the fault in a graphical manner.
To enable boundary scanning, IC vendors add logic to each of their devices, including scan cells for each of the signal pins. These cells are then connected together to form the boundary scan shift register (BSR), which is connected to a TAP controller. These designs are parts of most Verilog or VHDL libraries. Overhead for this additional logic is minimal, and generally is well worth the price to enable efficient testing at the board level.
An example helps show the operation of JTAG in real systems. The example here is the debug TAP of an ARM11 processor, the ARM1136 core. The processor itself has extensive JTAG capability, similar to what is found in other CPU cores, and it is integrated into chips with even more extensive capabilities accessed through JTAG.
This is a non-trivial example, which is representative of a significant cross section of JTAG-enabled systems. In addition, it shows how control mechanisms are built using JTAG's register read/write primitives, and how those combine to facilitate testing and debugging complex logic elements; CPUs are common, but FPGAs and ASICs include other complex elements which need to be debugged.
Licensees of this core integrate it into chips, usually combining it with other TAPs as well as numerous peripherals and memory. One of those other TAPs handles boundary scan testing for the whole chip; it is not supported by the debug TAP. Examples of such chips include:
Those processors are both intended for use in wireless handsets such as cell phones, which is part of the reason they include TAP controllers which modify the JTAG scan chain: Debugging low power operation requires accessing chips when they are largely powered off, and thus when not all TAPs are operational. That scan chain modification is one subject of a forthcoming IEEE 1149.7 standard.
This debug TAP exposes several standard instructions, and a few specifically designed for hardware-assisted debugging, where a software tool (the "debugger") uses JTAG to communicate with a system being debugged:
IDCODE, standard instructions as described above
INTEST, standard instructions, but operating on the core instead of an external boundary scan chain.
EXTESTis nominally for writing data to the core,
INTESTis nominally for reading it; but two scan chains are exceptions to that rule.
SCAN_NARM instruction to select the numbered scan chain used with
INTEST. There are six scan chains:
0- Device ID Register, 40 bits of read-only identification data
1- Debug Status and Control Register (DSCR), 32 bits used to operate the debug facilities
4- Instruction Transfer Register (ITR), 33 bits (32 instruction plus one status bit) used to execute processor instructions while in a special "Debug Mode" (see below)
5- Debug Communications Channel (DCC), 34 bits (one long data word plus two status bits) used for bidirectional data transfer to the core. This is used both in debug mode, and possibly at runtime when talking to debugger-aware software.
6- Embedded Trace Module (ETM), 40 bits (7 bit address, one 32-bit long data word, and a R/W bit) used to control the operation of a passive instruction and data trace mechanism. This feeds either an on-chip Embedded Trace Buffer (ETB), or an external high speed trace data collection pod. Tracing supports passive debugging (examining execution history) and profiling for performance tuning.
7- debug module, 40 bits (7 bit address, one 32-bit long data word, and a R/W bit) used to access hardware breakpoints, watchpoints, and more. These can be written while the processor is running; it does not need to be in Debug Mode.
RESTART, ARM11-specific instructions to halt and restart the CPU. Halting it puts the core into the "Debug Mode", where the ITR can be used to execute instructions, including using the DCC to transfer data between the debug (JTAG) host and the CPU.
ITRSEL, ARM11-specific instruction to accelerate some operations with ITR.
That model resembles the model used in other ARM cores. Non-ARM systems generally have similar capabilities, perhaps implemented using the Nexus protocols on top of JTAG, or other vendor-specific schemes.
Older ARM7 and ARM9 cores include an EmbeddedICE module which combines most of those facilities, but has an awkward mechanism for instruction execution: the debugger must drive the CPU instruction pipeline, clock by clock, and directly access the data buses to read and write data to the CPU. The ARM11 uses the same model for trace support (ETM, ETB) as those older cores.
Newer ARM Cortex cores closely resemble this debug model, but build on a Debug Access Port (DAP) instead of direct CPU access. In this architecture (named CoreSight Technology), core and JTAG module is completely independent. They are also decoupled from JTAG so they can be hosted over ARM's two-wire SWD interface (see below) instead of just the six-wire JTAG interface. (ARM takes the four standard JTAG signals and adds the optional TRST, plus the RTCK signal used for adaptive clocking.) The CoreSight JTAG-DP is asynchronous to the core clocks, and does not implement RTCK. Also, the newer cores have updated trace support.
One basic way to debug software is to present a single threaded model, where the debugger periodically stops execution of the program and examines its state as exposed by register contents and memory (including peripheral controller registers). When interesting program events approach, a person may want to single step instructions (or lines of source code) to watch how a particular misbehavior happens.
So for example a JTAG host might HALT the core, entering Debug Mode, and then read CPU registers using ITR and DCC. After saving processor state, it could write those registers with whatever values it needs, then execute arbitrary algorithms on the CPU, accessing memory and peripherals to help characterize the system state. After the debugger performs those operations, the state may be restored and execution continued using the RESTART instruction.
Debug mode is also entered asynchronously by the debug module triggering a watchpoint or breakpoint, or by issuing a BKPT (breakpoint) instruction from the software being debugged. When it is not being used for instruction tracing, the ETM can also trigger entry to debug mode; it supports complex triggers sensitive to state and history, as well as the simple address comparisons exposed by the debug module. Asynchronous transitions to debug mode are detected by polling the DSCR register. This is how single stepping is implemented: HALT the core, set a temporary breakpoint at the next instruction or next high-level statement, RESTART, poll DSCR until you detect asynchronous entry to debug state, remove that temporary breakpoint, repeat.
Modern software is often too complex to work well with such a single threaded model. For example, a processor used to control a motor (perhaps one driving a saw blade) may not be able to safely enter halt mode; it may need to continue handling interrupts to ensure physical safety of people and/or machinery. Issuing a HALT instruction using JTAG might be dangerous.
ARM processors support an alternative debug mode, called Monitor Mode, to work with such situations. (This is distinct from the Secure Monitor Mode implemented as part of security extensions on newer ARM cores; it manages debug operations, not security transitions.) In those cases, breakpoints and watchpoints trigger a special kind of hardware exception, transferring control to a "debug monitor" running as part of the system software. This monitor communicates with the debugger using the DCC, and could arrange for example to single step only a single process while other processes (and interrupt handlers) continue running.
Microprocessor vendors have often defined their own core-specific debugging extensions. Such vendors include Infineon, MIPS with EJTAG, and more. If the vendor does not adopt a standard (such as the ones used by ARM processors; or Nexus), they need to define their own solution. If they support boundary scan, they generally build debugging over JTAG.
Freescale has COP and OnCE (On-Chip Emulation). OnCE includes a JTAG command which makes a TAP enter a special mode where the IR holds OnCE debugging commands for operations such as single stepping, breakpointing, and accessing registers or memory. It also defines EOnCE (Enhanced On-Chip Emulation) presented as addressing real time concerns.
ARM has an extensive processor core debug architecture (CoreSight) that started with EmbeddedICE (a debug facility available on most ARM cores), and now includes many additional components such as an ETM (Embedded Trace Macrocell), with a high speed trace port, supporting multi-core and multithread tracing. Note that tracing is non-invasive; systems do not need to stop operating to be traced. (However, trace data is too voluminous to use JTAG as more than a trace control channel.)
Nexus defines a processor debug infrastructure which is largely vendor-independent. One of its hardware interfaces is JTAG. It also defines a high speed auxiliary port interface, used for tracing and more. Nexus is used with some newer platforms, such as the Atmel AVR32 and Freescale MPC5500 series processors.
The target's JTAG interface is accessed using some JTAG-enabled application and some JTAG adapter hardware. There is a wide range of such hardware, optimized for purposes such as production testing, debugging high speed systems, low cost microcontroller development, and so on. In the same way, the software used to drive such hardware can be quite varied. Software developers mostly use JTAG for debugging and updating firmware.
There are no official standards for JTAG adapter physical connectors. Development boards usually include a header to support preferred development tools; in some cases they include multiple such headers, because they need to support multiple such tools. For example, a microcontroller, FPGA, and ARM application processor rarely share tools, so a development board using all of those components might have three or more headers. Production boards may omit the headers, or when space is limited may provide JTAG signal access using test points.
Those connectors tend to include more than just the four standardized signals (TMS, TCK, TDI, TDO). Usually reset signals are provided, one or both of TRST (TAP reset) and SRST (system reset). The connector usually provides the board-under-test's logic supply voltage so that the JTAG adapters use the appropriate logic levels. The board voltage may also serve as a "board present" debugger input. Other event input or output signals may be provided, or general purpose I/O (GPIO) lines, to support more complex debugging architectures.
Higher end products frequently use dense connectors (frequently 38-pin MICTOR connectors) to support high-speed tracing in conjunction with JTAG operations. A recent trend is to have development boards integrate a USB interface to JTAG, where a second channel is used for a serial port. (Smaller boards can also be powered through USB. Since modern PCs tend to omit serial ports, such integrated debug links can significantly reduce clutter for developers.) Production boards often rely on bed-of-nails connections to test points for testing and programming.
Adapter hardware varies widely. When not integrated into a development board, it involves a short cable to attach to a JTAG connector on the target board; a connection to the debugging host, such as a USB, PCI, or Ethernet link; and enough electronics to adapt the two communications domains (and sometimes provide galvanic isolation). A separate power supply may be needed. There are both "dumb" adapters, where the host decides and performs all JTAG operations; and "smart" ones, where some of that work is performed inside the adapter, often driven by a microcontroller. The "smart" adapters eliminate link latencies for operation sequences that may involve polling for status changes between steps, and may accordingly offer faster throughput.
As of 2018USB link from the host are the most common approach. Higher end products often support Ethernet, with the advantage that the debug host can be quite remote. Adapters which support high speed trace ports generally include several megabytes of trace buffer and provide high speed links (USB or Ethernet) to get that data to the host., adapters with a
Parallel port adapters are simple and inexpensive, but they are relatively slow because they use the host CPU to change each bit ("bit banging"). They have declined in usefulness because most computers in recent years don't have a parallel port. Driver support is also a problem, because pin usage by adapters varied widely. Since the parallel port is based on 5V logic level, most adapters lacked voltage translation support for 3.3V or 1.8V target voltages.
RS-232 serial port adapters also exist, and are similarly declining in usefulness. They generally involve either slower bit banging than a parallel port, or a microcontroller translating some command protocol to JTAG operations. Such serial adapters are also not fast, but their command protocols could generally be reused on top of higher speed links.
With all JTAG adapters, software support is a basic concern. Many vendors do not publish the protocols used by their JTAG adapter hardware, limiting their customers to the tool chains supported by those vendors. This is a particular issue for "smart" adapters, some of which embed significant amounts of knowledge about how to interact with specific CPUs.
Most development environments for embedded software include JTAG support. There are, broadly speaking, three sources of such software:
All such software tends to include basic debugger support: stopping, halting, single stepping, breakpoints, data structure browsing, and so on. Commercial tools tend to provide tools like very accurate simulators and trace analysis, which are not currently available as open source.
Serial Wire Debug (SWD) is an alternative 2-pin electrical interface that uses the same protocol. It uses the existing GND connection. SWD uses an ARM CPU standard bi-directional wire protocol, defined in the ARM Debug Interface v5. This enables the debugger to become another AMBA bus master for access to system memory and peripheral or debug registers. Data rate is up to . SWD also has built-in error detection. On JTAG devices with SWD capability, the TMS and TCK are used as SWDIO and SWCLK signals, providing for dual-mode programmers.