Multithreading and Multitasking Improve Wireless IC Production Testing
By Joe Kelly, Verigy U.S., Inc.
Production testing of RF-containing SOC (System on a Chip) devices for wireless communications has changed considerably over the past few years, driven primarily by economic factors. In particular, semiconductor manufacturers have worked with ATE (Automated Test Equipment) vendors to increase the number of devices that can be tested in parallel in order to better utilize their capital equipment. The cost of testing (referred to as Cost of Test or COT) can be up to 5% of the overall cost of manufacturing integrated wireless RF-containing SOCs. From a test engineer’s point of view, two key measures of COT are parallel test efficiency and throughput (units tested in a given amount of time). Often, only multi-site efficiency is considered, but in reality, throughput is the more important factor because it directly equates to production output. These metrics drive development of both the hardware and software features of ATE.
ATE and Test Architecture
Figure 1 shows the fundamental parts of a test as implemented with ATE. They are:
1. Acquisition (A) of the measured data
2. Upload (U) of the measured data to the processor
3. Processing (P), or calculation, of the results
Consider acquisition, uploading, and processing of the measurement data and even the separate tests (Test 1, Test 2, etc.) to be “tasks” involved in typical analog and RF measurements on SOC devices. Through the following discussion, the tie-in of throughput to improved parallelism of these tasks will be shown.
Figure 2 shows the typical ATE architecture for testing RF devices. The RF signal arrives at the tester from the DUT (Device Under Test) and the signal is downconverted to some intermediate or baseband frequency. Because the incoming RF signal is a physical analog signal, all of the pieces in the dashed box of the ATE that manipulate the signal to this point are hardware-based.
The ADC (digitizer) converts the signal from analog to digital and stores it as arrays of discrete data in the system. Once the data have been placed into these arrays, they are uploaded to a workstation where the calculations are done. The upload and processing of large amounts of data in RF ATE systems can decrease throughput if they are not handled properly within the architectural design of the ATE.
Multi-Site Testing and Multi-Site Efficiency
The term site, as in multi-site, is used throughout this discussion; each site is a position at which one DUT is tested. In the ATE industry, there is a constant drive toward multi-site test programs, in which more than one DUT is tested at a time using as much parallelism as the ATE can provide.
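The ATE industry expresses parallel test efficiency with a figure of merit, multi-site efficiency (MSE), given in percent as
MSE = [1 - Δt / ((ΔN)(t1))] x 100%, (1)
where Δt is the difference in test time relative to single-site, ΔN is the difference in number of sites relative to single-site, and t1 is the single-site test time. From both a mathematical and a definitional point of view, the number of sites, N, is always greater than 1. Since ΔN and Δt are always referenced to the single-site case (ΔN = N - 1 and Δt = tMS - t1, where tMS is the multi-site test time), this is more simply written as
MSE = [1 - (tMS - t1) / ((N - 1)(t1))] x 100%. (2)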
Applying this definition requires a qualitative understanding of what multi-site efficiency means. Within the realm of ATE, multi-site efficiency measures the deviation from perfect parallel execution of a test program (or individual test) across multiple sites in a test cell. Perfect parallelism is exhibited when one site is tested in a fixed amount of time and adding sites does not consume any additional test time; this ideal corresponds to 100% multi-site efficiency. In contrast, if there is no parallelism at all during testing, the overall multi-site test execution time scales with the number of sites,
tMS = (N)(t1). (3)
This corresponds to 0% multi-site efficiency and is obviously not desirable, because testing more devices takes proportionally more time and throughput does not improve. For a detailed analysis of multi-site efficiency and how it impacts the ATE industry, see Reference [2].
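The two limiting cases described above can be checked numerically. The following is a minimal sketch, assuming the commonly used single-site-referenced form MSE = [1 - (tMS - t1) / ((N - 1)(t1))] x 100%; the function name and the example test times are illustrative assumptions, not values from a real test program.

```python
def multi_site_efficiency(t1: float, t_ms: float, n_sites: int) -> float:
    """Return multi-site efficiency in percent.

    t1      -- single-site test time
    t_ms    -- total test time with n_sites tested together (same unit as t1)
    n_sites -- number of sites, must be greater than 1
    """
    if n_sites <= 1:
        raise ValueError("n_sites must be greater than 1")
    if t1 <= 0:
        raise ValueError("t1 must be positive")
    return (1.0 - (t_ms - t1) / ((n_sites - 1) * t1)) * 100.0


if __name__ == "__main__":
    t1 = 2.0  # seconds, illustrative value only
    # Perfect parallelism: four sites take no longer than one site.
    print(multi_site_efficiency(t1, t_ms=2.0, n_sites=4))
    # No parallelism: test time scales with the number of sites, tMS = (N)(t1).
    print(multi_site_efficiency(t1, t_ms=8.0, n_sites=4))
    # An intermediate case: 0.3 s of overhead for three additional sites.
    print(multi_site_efficiency(t1, t_ms=2.3, n_sites=4))
```

The first two calls reproduce the 100% and 0% limits; the third lands close to 95%, which shows how little added time per site is needed to fall short of perfect parallelism.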
The ATE industry standard definition of throughput, which is directly related to multi-site efficiency (MSE), is the number of units (devices) tested per hour (UPH). Throughput is a key contributor to the COT calculation. Throughput can be derived from Equations (2) and (3) by rearranging them to solve for the total multi-site execution time, tMS,
tMS = (t1)[1 + (N - 1)(1 - MSE)], (4)
where MSE is expressed as a fraction (for example, 0.95 for 95%). Then, with tMS in seconds, throughput is calculated by
UPH = (3600)(N) / tMS. (5)
Parallel ATE Measurement Hardware Resources
Using Equations (4) and (5) to examine throughput for various values of multi-site efficiency, as in Figure 3, one can observe that, aside from the linear case of perfect parallelism, throughput approaches an asymptotic value as the number of sites increases. This asymptote is due to the degradation in parallel efficiency caused by the need not only to make the measurement with test equipment, but also to perform calculations on the data, as outlined by the three components of a test shown in Figure 1. The traditional and most obvious way to improve throughput is simply to add more hardware so that more parallelism can be achieved. This comes, however, at the expense of capital equipment, which cannot practically increase without bound.
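The asymptotic behavior can be reproduced with a few lines of arithmetic. The sketch below assumes the multi-site test time takes the form tMS = (t1)[1 + (N - 1)(1 - MSE)], with MSE as a fraction, and that throughput is UPH = (3600)(N) / tMS with times in seconds. Index and handling time are ignored, and the function names and the 2-second single-site test time are hypothetical.

```python
def multi_site_test_time(t1: float, n_sites: int, mse: float) -> float:
    """Total multi-site test time; mse is a fraction between 0 and 1."""
    return t1 * (1.0 + (n_sites - 1) * (1.0 - mse))


def units_per_hour(t1: float, n_sites: int, mse: float) -> float:
    """Throughput in units per hour, with t1 given in seconds."""
    return 3600.0 * n_sites / multi_site_test_time(t1, n_sites, mse)


def asymptotic_uph(t1: float, mse: float) -> float:
    """Limit of units_per_hour as n_sites grows; unbounded when mse is 1."""
    if mse >= 1.0:
        return float("inf")
    return 3600.0 / (t1 * (1.0 - mse))


if __name__ == "__main__":
    t1 = 2.0  # seconds, illustrative value only
    for mse in (1.0, 0.99, 0.95, 0.80):
        row = [round(units_per_hour(t1, n, mse)) for n in (1, 2, 4, 8, 16, 32)]
        print(f"MSE={mse:.2f}: UPH={row} limit={asymptotic_uph(t1, mse):.0f}")
```

Under these assumptions, only the 100% case grows linearly with the number of sites. For any lower efficiency the throughput is bounded by 3600 / ((t1)(1 - MSE)), so adding sites yields diminishing returns.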
Until recently, when parallel processing and multi-core workstations became prevalent, multi-site efficiencies of typically only 95% were obtained. However, with increasing computing power, as discussed in the next section, higher multi-site efficiencies can be achieved, and the asymptotic limits on increasing the number of parallel sites have been nearly eliminated.
The measurement resources within the ATE determine how much measurement parallelism can be achieved in real time. Traditionally, however, one of the bottlenecks, as shown in Figure 3, has been the limited computing power of the workstation or computer that controls the overall ATE system. Modern multi-core workstations significantly reduce this bottleneck through features such as multithreading. Without them, even if the hardware can perform the measurement portion of the test and acquire data in parallel, the parallelism of the calculations, or processing, is limited by the lack of parallel processes, or threads.
Multithreading uses both hardware and software to make the most of the multiple processors (often with multiple cores) found in today’s high-end workstations. Multithreading is implemented in the code that controls the test program, using standard libraries, native to Linux, that provide functions to control threading activity within an application. One key consideration is ensuring that all code implemented in a multithreaded program is thread-safe.
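To illustrate what thread safety means for per-site processing, here is a minimal sketch. It uses Python's standard threading module purely for illustration; a real test program would use the tester's own software environment and the Linux-native threading libraries mentioned above. The class and function names, and the choice of a simple mean as the "processing" step, are hypothetical.

```python
import threading
from typing import Dict, List


class ResultStore:
    """Collects per-site results; a lock makes concurrent writes safe."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._results: Dict[int, List[float]] = {}

    def add(self, site: int, value: float) -> None:
        # The update of the shared dictionary is guarded by the lock so that
        # two threads cannot interleave their modifications.
        with self._lock:
            self._results.setdefault(site, []).append(value)

    def snapshot(self) -> Dict[int, List[float]]:
        # Return a copy so callers never hold a reference to shared state.
        with self._lock:
            return {site: list(vals) for site, vals in self._results.items()}


def process_site(site: int, samples: List[float], store: ResultStore) -> None:
    """Hypothetical processing step: mean of the uploaded samples."""
    store.add(site, sum(samples) / len(samples))


def run_sites(data: Dict[int, List[float]]) -> Dict[int, List[float]]:
    """Spawn one processing thread per site, then resynchronize."""
    store = ResultStore()
    threads = [
        threading.Thread(target=process_site, args=(site, samples, store))
        for site, samples in data.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every site before reading the results
    return store.snapshot()


if __name__ == "__main__":
    print(run_sites({0: [1.0, 2.0, 3.0], 1: [4.0, 4.0]}))
```

The join calls are the resynchronization point: the main program reads the results only after every site's thread has finished, and all shared state is touched only while the lock is held.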
The hardware on the ATE workstation determines the maximum achievable performance. Workstations have one or more processors. To achieve the best throughput, a dual-site program should ideally be run on a computer with two or more cores, a quad-site program on four or more cores, and an octal-site program on eight or more cores.
Figure 4 shows how, through the use of multithreading, the test time of a quad-site program can be greatly reduced. In each of the scenarios, the traditional test sequence (A, U, P) is shown for Test 1.
Scenario A demonstrates the least efficient implementation, i.e., fully serial execution. Even if there are parallel-capable hardware resources on the ATE, this scenario leaves them unused. The addition of parallel hardware in the ATE allows parallel measurements to be made (Scenario B). Through multithreading (Scenario C), multiple resources within the tester can be controlled in parallel (Test 2 is performed in parallel with other tests), and uploading of data and calculation of results can all be done in parallel, thus completely hiding one test behind another. Scenario D shows the benefit of performing multiple uploads of the measured data in parallel. This reduces the overall test time of Test 1, thereby allowing additional tests to be hidden behind Test 2.
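The four scenarios can be compared with a simplified timing model. The sketch below is an illustration under stated assumptions, not a reconstruction of Figure 4. It considers only Test 1 and ignores Test 2. It assumes every site has the same acquisition (a), upload (u), and processing (p) durations. It also assumes that in Scenario C the uploads share a single interface and run one after another while processing runs on one thread per site. The function name and the example durations are hypothetical.

```python
from typing import Dict


def scenario_times(a: float, u: float, p: float, n_sites: int) -> Dict[str, float]:
    """Test time of one test (A, U, P) across n_sites for four simplified cases."""
    return {
        # A: every site is acquired, uploaded, and processed one after another.
        "A_serial": n_sites * (a + u + p),
        # B: all sites are acquired at once; upload and processing stay serial.
        "B_parallel_acquisition": a + n_sites * (u + p),
        # C: uploads are serial, but each site's processing runs on its own
        #    thread, so only the last site's processing adds to the total.
        "C_threaded_processing": a + n_sites * u + p,
        # D: uploads also run in parallel, so the total equals one site's time.
        "D_parallel_upload": a + u + p,
    }


if __name__ == "__main__":
    # Illustrative durations in milliseconds for a quad-site program.
    for name, t in scenario_times(a=10.0, u=2.0, p=5.0, n_sites=4).items():
        print(f"{name}: {t:.1f} ms")
```

In this model each step removes one serial term from the total. Any time saved on Test 1 is time in which other tests can run concurrently, which is the hiding effect described above.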
Many aspects of ATE, in both hardware and software, can contribute to improved throughput. Advances in processor technology, including multiple cores, provide parallel processing, or multithreading, capability. This enables multiple calculations on the measured data to be performed in parallel. It also allows multiple events to be controlled on the ATE hardware at the same time, thereby maximizing throughput.
[1] Semiconductor Industry Association (SIA), “International Technology Roadmap for Semiconductors, 2007 Edition” (2007)
[2] J. Kelly, “Multi-Site Efficiency and Throughput,” Verigy Technical Note, GoSEMI Newsletter, October (2008)
[3] J. Rivoir, “Parallel Test Reduces Cost of Test More Effectively than Just a Cheap Tester,” Proceedings of SEMICON Europe 2005 (2005)
[4] T. Lecklider, “Reducing the Cost of Test,” Evaluation Engineering, July 2011 (2011)
1 References [3] and [4] provide excellent overviews of the many additional contributors to COT.
2 Thread-safe has numerous implications, but in its most general form it refers to the ability of a segment of a program to be spawned off to a separate thread and resynchronized with the program without breaking itself or the program.
3 Throughput is affected by upload and calculation. With reference to CPUs, only the calculation portion is impacted. Upload throughput is a function of the data transfer interface between tester and workstation.
About the Author
Joe Kelly, Ph.D., is a principal test engineer with Verigy. He has been with HP, then Agilent, and now Verigy since 1999, working on analog (RF and mixed signal) test methodologies for production testing of SOC devices.
SEMI Global Update
July 5, 2011