
AI & Energy: Bending the Curve

By Pushkar P. Apte and Melissa Grupen-Shemansky

Artificial intelligence (AI) is scaling at a pace that is reshaping semiconductor roadmaps, data center design, and long-term infrastructure strategy. AI promises many economic and social benefits, but its growth comes with an escalating demand for power, and energy has emerged as a major challenge.

SEMI, as the global semiconductor and electronics association connecting over 4,000 companies, continues to unite the entire ecosystem to “bend the curve”: to maximize AI performance while minimizing power consumption. In a series of successful, sold-out workshops that the SEMI Smart Data-AI Initiative held on this topic, a resonant theme has emerged: sustaining AI progress requires energy-efficient computing with holistic co-design and co-optimization across materials, devices, systems, data transmission, data centers, emerging architectures and software. While this dialog is an important starting point, the ultimate goal is to drive concrete action through collaborative innovation.

The AI Energy Challenge

AI training compute for frontier models is growing at an estimated 4–5x per year, driving unprecedented demand for hardware capability and infrastructure capacity. That trajectory has resulted in a global “data center gold rush” and is testing the limits of energy availability. As model sizes scale exponentially, so too does the energy required to train and deploy them, and power consumption has become a significant limiter to performance gains. Rising power consumption also increases heat dissipation and requires innovations such as direct liquid cooling.

Modern AI and high-performance computing systems now operate at power levels comparable to small cities, with tens of megawatts per installation and a trajectory toward gigawatt-scale data center campuses. Grid capacity, both in the U.S. and globally, may be challenged to keep pace with projected demand.
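
To put the estimated 4–5x annual growth in training compute in perspective, the short Python sketch below converts an annual growth factor into a doubling time and a cumulative multiple. The function names and the five-year horizon are illustrative assumptions, not figures from the workshops, and the calculation assumes the growth rate stays constant.

```python
import math


def doubling_time_months(annual_factor: float) -> float:
    """Months needed to double at a constant annual growth factor."""
    return 12.0 * math.log(2.0) / math.log(annual_factor)


def cumulative_multiple(annual_factor: float, years: float) -> float:
    """Total growth multiple after `years` of constant annual growth."""
    return annual_factor ** years


if __name__ == "__main__":
    # 4x and 5x per year bracket the estimate quoted in the article.
    for factor in (4.0, 5.0):
        months = doubling_time_months(factor)
        total = cumulative_multiple(factor, 5)  # 5-year horizon is an assumption
        print(f"{factor:.0f}x/year: doubles every {months:.1f} months, "
              f"{total:,.0f}x after 5 years")
```

At 4x per year, compute doubles every six months; at 5x, it doubles roughly every five months. Sustained for five years, either rate compounds to a multiple in the thousands.
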
Thus, AI infrastructure is no longer just a technical challenge; it is an energy, systems, and policy challenge.

System-Technology Co-Optimization

Continuous advances in chip and inference efficiency have delivered orders-of-magnitude improvements over many decades. These gains must now be extended through holistic co-optimization of the entire compute system, from silicon technologies to the data center to the grid.

For example, processors can be made more efficient by customizing them for specific workloads. However, only part of total data center power is consumed by the processor itself. A significant portion is used by data movement, power conversion and cooling. The energy required to move data increases dramatically with distance. Moving bits across packages, boards, and networks can consume far more energy than the compute operations themselves. This makes locality a critical design principle. The opportunity, and the necessity, therefore lies in cross-layer optimization: efficient compute, efficient communication, and intelligent power management across the entire system.

Not surprisingly, advanced packaging and integration are becoming central to performance. These technologies can enable architectures that tightly couple compute, memory, and I/O using 2.5D and 3D integration techniques, reducing energy per bit and increasing bandwidth. Photonic interconnects and low-power materials can further lower the cost of processing and moving data.

The bottom line is that incremental chip-level gains alone will not be sufficient and energy optimization cannot be siloed: system-technology co-optimization is needed.

Hardware-Software Co-Optimization

Keeping data as localized as possible depends as much on software algorithms as it does on hardware architectures. The challenge is that the development cycles are mismatched: new software models can be developed in months, while designing and fabricating new hardware can take years.
While this cycle mismatch is fundamental, closer coordination between hardware and software developers can significantly improve efficiency. For example, offloading selected functions in the algorithm, including to distributed data processing units (DPUs), and reducing the level of data precision can reduce energy use. Partitioning workloads logically across the hardware/software stack between cloud services and compute-on-edge can also reduce energy appreciably. Further, risk mitigation techniques, such as building in strategic redundancy, can make future designs more resilient to shifts in software algorithms and models.

Diverse Computing Modalities

While AI dominates current infrastructure investment, the future of computing will likely include multiple, diverse computational modalities such as quantum, neuromorphic, photonic and analog computing.

Different computational paradigms will be applied where they are most effective. For example, quantum computing is likely to complement, not replace, classical systems, especially for specific classes of problems where it offers exponential advantages. However, progress in quantum computing is tightly coupled to advances in semiconductor infrastructure. Error correction, orchestration, and hybrid algorithms all depend on high-performance classical systems operating with low latency alongside quantum processors. While there is no single silver bullet, system-level design can ensure that multiple computing modalities work together within unified workflows spanning edge, cloud, and exascale environments.

Why It Matters / What to Watch

Energy will now be a key constraint for AI performance and infrastructure expansion.
The evolution of gigawatt-scale AI campuses and their interaction with public energy grids will accelerate, or slow down, AI growth.
Data movement, memory bandwidth, interconnect efficiency, advanced packaging and heterogeneous integration will be strategic levers.
Enhanced system-technology co-optimization and integration of advanced technologies like 3D ICs and photonics will be critical.
Co-optimization across hardware, software, and systems will be required.
Future architectures will blend classical and emerging compute modalities like quantum, photonic and neuromorphic.

In conclusion, AI has become a defining global force with much promise, but its trajectory will be shaped by technology, energy and infrastructure economics working together. This is a formidable challenge because it requires many diverse players with divergent priorities to collaborate effectively.

We invite you to join the SEMI Smart Data-AI Initiative to collaboratively address this challenge and help realize AI’s full potential sustainably. Our next workshop in this series will be on September 9 in Silicon Valley; please join us for this exciting event.

Sources

SEMI Smart Data-AI Initiative – Future of Computing
Energy-Efficient Computing for AI and Beyond, SEMICON West, October 2025
Sustainable AI Systems, SEMI HQ, March 2026

About the Authors

Dr. Pushkar P. Apte is the Strategic Technology Advisor for SEMI and Global Lead for the Smart Data-AI Initiative.
Dr. Melissa Grupen-Shemansky is Senior VP and CTO of SEMI.

May 5, 2026

August 13, 2019

Breaking the Memory Wall: The AI Bottleneck

By Michael Hall

In the long unfolding arc of technology innovation, artificial intelligence (AI) looms immense. In its quest to mimic human behavior, the technology touches energy, agriculture, manufacturing, logistics, healthcare, construction, transportation and nearly every other imaginable industry, a defining role that promises to fast-track the fourth Industrial Revolution. And if the industry oracles have it right, AI growth will be nothing shy of explosive.

“The gains these days are not incremental,” said Ajit Manocha, SEMI president and CEO, to a gathering in July of the Chinese American Semiconductor Professional Association (CASPA) for its Summer Symposium at SEMI’s headquarters in Milpitas. “They are hockey stick – exponential – with AI semiconductors growing in market size from $4 billion this year to $70 billion in 2025.”

Manocha left little doubt that AI is remaking the semiconductor industry and, in the process, the world at large. Internet of Things (IoT) and 4G/5G, both key AI enablers, will account for more than 75 percent of device connections by 2025.

“Today, 30 billion devices worldwide are connected,” Manocha said, citing an Applied Materials prediction that the number of connected devices globally will grow to between 500 billion and 1 trillion by 2030. Those devices will generate stunning amounts of data to be collected, interpreted and used to reason, solve problems, learn and plan, leading to the holy grail of autonomous machine behavior.

To process the colossal amount of data central to the promise of AI, the industry must break through the limits of a key technology: memory.

Memory a Critical AI Bottleneck

The challenge for memory starts with performance. Historically, gains in compute performance have outpaced improvements in memory speed by 100 times every decade, and over the past 20 years that gap has grown, said Steven Woo, a fellow and distinguished inventor at Rambus, presenting at the symposium.
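
As a rough illustration of the gap Woo described, the Python sketch below annualizes a 100x-per-decade difference between compute and memory improvement and compounds it over two decades. It assumes the ratio holds constant from decade to decade, which is a simplification, and the function names are hypothetical.

```python
def annualized_gap(per_decade_ratio: float) -> float:
    """Yearly factor by which compute pulls ahead of memory, given a per-decade ratio."""
    return per_decade_ratio ** (1.0 / 10.0)


def gap_after(per_decade_ratio: float, years: float) -> float:
    """Cumulative compute-to-memory gap after `years` at a constant rate."""
    return per_decade_ratio ** (years / 10.0)


if __name__ == "__main__":
    ratio = 100.0  # per-decade figure quoted in the article
    print(f"Compute outpaces memory by about {annualized_gap(ratio):.2f}x per year")
    print(f"Gap after 20 years at a constant rate: {gap_after(ratio, 20):,.0f}x")
```

A 100x gap per decade corresponds to compute pulling ahead of memory by a factor of roughly 1.58 each year, and to a 10,000x gap over 20 years if the rate were to hold steady.
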
The upshot is that memory has bottlenecked compute and, in turn, AI performance. The industry has responded with new ways to implement memory systems on AI chips. Each is suited to unique performance requirements and, of course, comes with trade-offs. Among the frontrunners:

On-chip memory delivers the highest bandwidth and power efficiency but is limited in capacity.
HBM (High Bandwidth Memory) offers both very high memory bandwidth and density.
GDDR balances trade-offs among bandwidth, power efficiency, cost and reliability.

Since 2012, AI training capability has grown 300,000 times, doubling every 3.5 months, a blistering pace compared with the 18-month doubling cycle of Moore’s Law, which it has bested by 25,000 times, Woo said. The staggering improvements have been driven by parallel computing capacity and new application-specific silicon like Google’s Tensor Processing Unit (TPU).

These specialized silicon architectures and parallel engines are key to sustaining future gains in compute performance and combating the slowing of Moore’s Law and the end of power scaling, Woo said. By rethinking the way processors are architected for certain markets, chipmakers can develop dedicated hardware capable of operating with 100 to 1,000 times greater energy efficiency than general-purpose processors to overcome another big limiter to scaling compute performance: power.

For its part, the memory industry can improve performance by signaling at higher data rates, by using stacked architectures like HBM for greater power efficiency and performance, and by bringing compute closer to the data.

Memory Scaling for AI

A key challenge is scaling memory for AI. Demand for better voice, gesture and facial recognition experiences and more immersive virtual reality and augmented reality interactions is tremendous, said Bill En, senior director at AMD, speaking at the symposium.
These capabilities require more processing power across both high-performance computing (HPC) for big data analytics and machine learning, which rely on AI and machine intelligence to generate meaningful insights. Emerging machine learning applications include classification and security, medicine, advanced driver assistance, human-aided design, real-time analytics and industrial automation. And with 75 billion IoT-connected devices, all generating data, expected by 2025, there will be no shortage of data to analyze, En said. The wings alone of a new Airbus A380-1000 feature some 10,000 sensors.

Mountains of this data are stored in massive data centers on magnetic hard drives, then transferred to DRAM before moving to SRAM within the CPU for the handoff to the compute hardware for analysis.

With data growing at an exponential clip, the question is how to make sure all other memory systems can handle the flood of data. AMD’s answer is a chiplet architecture featuring eight smaller chips around the edge that drive the compute and a large chip in the center that doubles the I/O interface and memory capability to in turn double chip bandwidth.

AMD has also moved from a legacy GDDR5 memory chip configuration to HBM to bring memory bandwidth closer to the GPU for more efficient processing of AI applications. The HBM provides much higher bandwidth while reducing power consumption. Compared to DRAM, AMD’s HBM delivers a much faster data rate and far greater memory density, En said.

Over the next decade, look for more performance improvements from multi-chip architectures, innovations in memory technology and integration, aggressive 3D stacking and streamlined system-level interconnects, he said. The industry will also continue to drive performance gains in devices, compute density and power through technology scaling.

Michael Hall is a global marketing communications manager at SEMI.
