In the Age of AI, How to Hitch a Ride on Nvidia's Bandwagon?
In global capital markets, NVIDIA stands out for its focused technological innovation and robust earnings growth. As the rollout of AI technology accelerates, its data center business in particular has shown astonishing growth momentum.
Specifically, NVIDIA's revenue for the first quarter of fiscal 2025 reached $26.04 billion, up 262% from the same period last year. The data center business accounted for $22.6 billion of that, a year-over-year increase of 427%, representing 87% of total revenue. Thanks to the high profitability of the data center business, net profit climbed to $14.88 billion, a 628% increase over the same period last year.
Following the earnings announcement, NVIDIA's stock price broke through the $1,000 mark, rising 6.05% in after-hours trading, a sign of investors' recognition of the company's financial results and future potential.
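As a quick sanity check on the figures above, the sketch below recomputes the data center share of revenue and backs out the prior-year values implied by the quoted growth rates. The dictionary layout and function name are illustrative choices, not something from NVIDIA's report.

```python
# Figures quoted in the text: value in USD billions, year-over-year growth as a fraction.
q1_fy2025 = {
    "total_revenue": (26.04, 2.62),        # +262%
    "data_center_revenue": (22.6, 4.27),   # +427%
    "net_profit": (14.88, 6.28),           # +628%
}


def implied_prior_year(value, yoy_growth):
    """Back out the year-earlier figure from the current value and its YoY growth."""
    return value / (1.0 + yoy_growth)


data_center_share = (
    q1_fy2025["data_center_revenue"][0] / q1_fy2025["total_revenue"][0]
)
print(f"Data center share of revenue: {data_center_share:.1%}")

for name, (value, growth) in q1_fy2025.items():
    prior = implied_prior_year(value, growth)
    print(f"{name}: ${value:.2f}B now, roughly ${prior:.2f}B a year earlier")
```

The share comes out at just under 87%, consistent with the rounded figure in the text.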
Previously, NVIDIA was known mainly for its gaming graphics cards. In fiscal 2016, the company's total revenue was $5.01 billion, with the gaming GPU business contributing $2.818 billion and the data center business only $339 million. However, with the rise of cloud computing, the spread of remote work, and the boom in cryptocurrency mining, data center revenue grew rapidly. In fiscal 2023, data center revenue surpassed gaming revenue for the first time, reaching $15.005 billion, and it has kept growing rapidly since.
In the fiscal year 2024, NVIDIA's total revenue reached $60.922 billion, a significant increase of 126% year-over-year, while the revenue from the data center business realized an astonishing growth of 217%, accounting for nearly 80% of the total revenue. In the latest quarter, the growth rate of the data center business escalated to 427%, with its proportion further increasing to 87%.
With the continued growth of the data center GPU business, NVIDIA has gradually transformed from a hardware supplier into a leading enterprise in AI computing power. This new positioning has attracted widespread attention in the capital markets and opened up substantial room for valuation growth, with the company's market value breaking through the $2 trillion mark.
NVIDIA's rise is not only the epitome of the AI wave; it also shows the imagination and opportunities that open up for those who hold the keys to the AI era.
But at its core, NVIDIA's ability to stand out in this wave of AI and become the biggest beneficiary is mainly due to the huge growth in demand for computing power in the AI industry.
With the launch of ChatGPT at the end of 2022, artificial intelligence entered the era of large models. The rapid emergence and upgrading of large language models, together with significant improvements in their performance, has driven the continuous expansion of training and inference workloads and a rapid increase in demand for computing power. According to a report by Haitong Securities, training OpenAI's GPT-3 model, with its 175 billion parameters, required approximately 3,640 PFLOPS-days of computing power, using 1,024 A100 GPUs over 34 days. By contrast, GPT-4 has a parameter count 500 times that of GPT-3, with training involving around 20,000 to 30,000 A100 GPUs and lasting about a month.
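To make the GPT-3 figures more tangible, the following minimal sketch converts the quoted compute budget into an average sustained throughput per GPU. The function name is hypothetical, and the calculation assumes the 1,024 GPUs ran for the full 34 days.

```python
PFLOPS_DAYS = 3640   # total training compute quoted for GPT-3
NUM_GPUS = 1024      # A100 GPUs, from the text
DAYS = 34            # training duration, from the text


def sustained_tflops_per_gpu(pflops_days, num_gpus, days):
    """Average sustained throughput per GPU in TFLOPS (1 PFLOPS = 1000 TFLOPS)."""
    cluster_pflops = pflops_days / days   # average PFLOPS across the whole cluster
    return cluster_pflops / num_gpus * 1000.0


per_gpu = sustained_tflops_per_gpu(PFLOPS_DAYS, NUM_GPUS, DAYS)
print(f"Implied sustained throughput: {per_gpu:.1f} TFLOPS per A100")
```

The result, a little over 100 TFLOPS per GPU, is an average implied by the quoted numbers rather than a measured value. For comparison, NVIDIA's published peak for dense FP16 Tensor Core math on the A100 is 312 TFLOPS, so the implied figure would correspond to roughly a third of peak.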
At NVIDIA's GTC conference, CEO Jensen Huang mentioned that training a GPT model with 1.8 trillion parameters requires 8,000 H100 GPUs drawing 15 megawatts of power and running continuously for 90 days. According to a forecast by Dongwu Securities, the computing power required for inference will eventually exceed training needs several times over, possibly reaching several hundred thousand H100 GPUs. As models continue to evolve, this thirst for computing power seems endless. IDC expects the data center GPU market to grow from $10.3 billion in 2022 to $65.4 billion in 2027, a compound annual growth rate (CAGR) of 44.55%.
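The training run Jensen Huang described can be turned into GPU-days and total energy with simple arithmetic. The sketch below assumes the 15 megawatts is a constant draw for the whole 90 days, which is a simplification; the variable names are illustrative.

```python
NUM_GPUS = 8000   # H100 GPUs, from the text
POWER_MW = 15     # quoted power draw in megawatts
DAYS = 90         # quoted training duration

gpu_days = NUM_GPUS * DAYS                  # total GPU time consumed
energy_mwh = POWER_MW * DAYS * 24           # megawatt-hours at constant draw
kw_per_gpu = POWER_MW * 1000 / NUM_GPUS     # all-in power budget per GPU

print(f"GPU-days: {gpu_days:,}")
print(f"Energy: {energy_mwh:,} MWh ({energy_mwh / 1000:.1f} GWh)")
print(f"Power budget per GPU, all-in: {kw_per_gpu:.3f} kW")
```

The per-GPU figure is a share of the whole facility's draw, so it presumably includes CPUs, networking, and cooling rather than the GPU alone.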
AI servers, as the carriers of computing power, span multiple large industry chains, including GPUs, CPUs, storage, and communication technologies. Data from TrendForce shows that global AI server shipments reached 855,000 units in 2022 and are expected to reach 2.369 million units in 2026, a CAGR of 29.02% from 2022 to 2026, significantly exceeding the growth rate of the overall global server market. For the Chinese market specifically, the "2022-2023 China Artificial Intelligence Computing Power Development Assessment Report" released by IDC and Inspur Information shows that the Chinese AI server market was worth $5.92 billion in 2021 and is expected to grow to $12.34 billion by 2026, a CAGR of 15.82% from 2021 to 2026.
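The growth rates quoted for these forecasts follow from the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below applies it to the endpoints given in the text, including the data center GPU forecast quoted earlier; the function name and list layout are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction, e.g. 0.29 for 29%."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


forecasts = [
    # (label, start value, end value, number of years between the two)
    ("Global AI server shipments, 2022-2026 (thousand units)", 855, 2369, 4),
    ("China AI server market, 2021-2026 (USD billions)", 5.92, 12.34, 5),
    ("Data center GPU market, 2022-2027 (USD billions)", 10.3, 65.4, 5),
]

for label, start, end, years in forecasts:
    print(f"{label}: {cagr(start, end, years):.2%}")
```

The first two results line up with the 29.02% and 15.82% quoted above. The third comes out slightly above the quoted 44.55%, which could simply reflect rounding in the published endpoints.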
The growing demand for AI servers is therefore deepening the industry's dependence on NVIDIA's data center GPUs, and NVIDIA's expansion from GPUs to complete AI servers has brought the company significant revenue growth. According to the company's official website, the products of each GPU architecture cycle fall into three brand categories: GeForce, NVIDIA RTX/Quadro, and Data Center. The GeForce series primarily targets the consumer gaming market, while NVIDIA RTX/Quadro serves professional fields such as industrial design and media development. Data center GPUs, as NVIDIA's core product, have become the company's major source of income amid the surge in AI server demand, and their growth rate is notable.
Within NVIDIA's supply chain, the business area with the most significant growth is data center GPUs. NVIDIA's product naming conventions give a rough guide to which GPUs are meant for data centers. Names containing GeForce are typically used for gaming and other consumer products. Products without GeForce but with RTX or Quadro in the name are often used in professional design, with RTX specifically indicating that the GPU supports ray tracing. Products named directly after the initial letter of the GPU architecture, such as A100 (Ampere), H100 (Hopper), L4 (Ada Lovelace), and B200 (Blackwell), are mainly used for high-performance computing tasks in data centers.
Comparing GPU performance, at the end of 2023 AMD's MI300X briefly surpassed NVIDIA's flagship at the time, the H200, in floating-point computing performance. However, with the release of the Blackwell-based B200 chip in 2024, NVIDIA regained the lead in low-precision computing. This matters particularly for the training and inference of large models, which usually rely on low-precision calculations, where NVIDIA's latest GPU offers a significant performance advantage.
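The naming rules of thumb above can be written down as a small classifier. This is only a sketch of the heuristic as described here, not an official NVIDIA scheme: the function name, the category labels, and the example product "GeForce RTX 4090" are illustrative, and real product names have exceptions that this does not handle.

```python
import re

# Architecture initials listed in the text; extend as needed.
ARCH_BY_INITIAL = {
    "A": "Ampere",
    "H": "Hopper",
    "L": "Ada Lovelace",
    "B": "Blackwell",
}


def classify_gpu(name: str) -> str:
    """Guess the market segment of a GPU from its product name."""
    upper = name.strip().upper()
    if "GEFORCE" in upper:
        return "consumer/gaming"
    if "RTX" in upper or "QUADRO" in upper:
        return "professional design"
    # Architecture initial followed by digits, e.g. A100, H100, L4, B200.
    match = re.fullmatch(r"([A-Z])\d+[A-Z]*", upper)
    if match and match.group(1) in ARCH_BY_INITIAL:
        return f"data center ({ARCH_BY_INITIAL[match.group(1)]})"
    return "unknown"


for product in ["GeForce RTX 4090", "H100", "L4", "B200"]:
    print(f"{product}: {classify_gpu(product)}")
```

Checking for GeForce before RTX matters, since consumer cards often carry both words in their names.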
It is worth noting that NVIDIA has an undeniable advantage at the software level. After more than a decade of development, its CUDA software ecosystem has built a strong moat for NVIDIA, significantly increasing market acceptance for its GPU products.
NVIDIA's ambition in the data center business is not limited to GPU chips themselves. In recent years it has steadily built out data center businesses and integrated solutions around its GPUs, including the multi-GPU DGX series servers and the rack-level solutions composed of these servers. For example, at the 2024 GTC conference, NVIDIA launched a range of new products, including the DGX GB200 server and the liquid-cooled, rack-level DGX GB200 SuperPod. One DGX GB200 SuperPod is composed of 8 DGX GB200 servers, each containing 36 GB200s, and each GB200 is made up of two B200 GPUs and one Grace CPU.
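The SuperPod composition described above multiplies out as follows. The constant names are illustrative; the counts are the ones given in the text.

```python
SERVERS_PER_SUPERPOD = 8   # DGX GB200 servers per SuperPod
GB200_PER_SERVER = 36      # GB200 superchips per server
B200_PER_GB200 = 2         # B200 GPUs per GB200
GRACE_PER_GB200 = 1        # Grace CPUs per GB200

total_gb200 = SERVERS_PER_SUPERPOD * GB200_PER_SERVER
total_gpus = total_gb200 * B200_PER_GB200
total_cpus = total_gb200 * GRACE_PER_GB200

print(f"GB200 superchips: {total_gb200}")
print(f"B200 GPUs: {total_gpus}")
print(f"Grace CPUs: {total_cpus}")
```

The GPU total of 576 matches the maximum NVLink domain size mentioned later in the article.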
Modern large models require ever more powerful computing support, and they evolve rapidly. NVIDIA's leading position in the AI server market gives its business strong momentum and has produced explosive earnings growth. Coupled with optimistic market expectations for its prospects and a supply-demand imbalance, NVIDIA's stock price has risen more than sevenfold since October 2022. Supply chain companies tied to its data center business have soared accordingly.
Investing in the NVIDIA theme mainly means looking at the company's flagship offerings, such as the rack-level liquid-cooled DGX GB200 SuperPod, along with the surrounding industry chain and the corresponding supply chain enterprises. In terms of how the industry chain is put together, frequently mentioned products like the H100 and A100 are essentially GPU boards, made up mainly of GPU chips, HBM, and other electronic components. On top of these basic units, multiple GPUs are combined with CPUs and SSDs to build AI servers; clusters of AI servers, together with switches and racks, in turn form server farms.
AI servers can be divided by function into computation, storage, interconnection interfaces, I/O, power, firmware, and other accessories and modules. These modules cover GPUs, CPUs, HBM memory, DRAM, SSD and RAID storage, on-board chip interconnects, connections between servers, network interface cards, power management, BMC/BIOS, PCBs, power modules, cooling modules, other components and cables, and OEM services.
In terms of each component's share of the Bill of Materials (BOM), taking NVIDIA's DGX A100 system as an example, the shares from highest to lowest are GPU (48%), OEM (15%), SSD (10%), HBM memory (9%), DRAM (9%), CPU (7%), network card (1%), interconnect interface (0.7%), PCB (0.6%), cooling system (0.3%), power module (0.3%), other components (0.1%), power management (0.09%), and BIOS/BMC (0.02%).
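The BOM breakdown above is easier to work with as a table in code. The sketch below ranks the components, sums the shares, and groups memory and storage together; the grouping and the key names are illustrative choices rather than categories from the source.

```python
# BOM shares for the DGX A100 as quoted in the text, in percent.
bom_share_pct = {
    "GPU": 48,
    "OEM": 15,
    "SSD": 10,
    "HBM memory": 9,
    "DRAM": 9,
    "CPU": 7,
    "Network card": 1,
    "Interconnect interface": 0.7,
    "PCB": 0.6,
    "Cooling system": 0.3,
    "Power module": 0.3,
    "Other components": 0.1,
    "Power management": 0.09,
    "BIOS and board management": 0.02,
}

ranked = sorted(bom_share_pct.items(), key=lambda item: item[1], reverse=True)
total = sum(bom_share_pct.values())
memory_and_storage = (
    bom_share_pct["SSD"] + bom_share_pct["HBM memory"] + bom_share_pct["DRAM"]
)

for component, share in ranked:
    print(f"{component:<28}{share:>6}%")
print(f"Sum of quoted shares: {total:.2f}%")
print(f"SSD + HBM + DRAM: {memory_and_storage}%")
```

The quoted shares sum to slightly more than 100%, so they are best read as rounded estimates rather than an exact partition.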
Since the DGX GB200 system uses copper cables internally and has a liquid cooling system, the value share of copper cables and the cooling module is expected to be higher than in the DGX A100. Market estimates suggest that copper cables will account for 5-7% of the BOM of a single server.
Turning to the companies involved in each segment: the upstream, midstream, and downstream stages of the GPU chip industry chain are chip design, chip manufacturing, and chip packaging and testing, respectively. NVIDIA sits at the top of the design segment. TSMC, with its leading 4nm and 7nm processes and CoWoS packaging technology, is the main manufacturer of NVIDIA's GPU chips and also the main provider of the packaging that combines the GPU chips with SK Hynix HBM chips. Simply put, after manufacturing the GPU chips, TSMC packages them together with HBM chips and hands them over to other companies (such as Foxconn) for assembly into the finished GPU card. Currently, the participation of domestic (mainland Chinese) companies is concentrated mainly in packaging and testing.
The CPU, the other core chip in an AI server, plays the role of the "brain" that controls GPU operations. For instance, an eight-GPU NVIDIA DGX server pairs two CPUs with eight GPU chips. NVIDIA previously used Intel CPUs in the HGX H100 servers, but in its newly released GB200 superchip it has opted for its self-developed Grace CPU.
In addition to their powerful processing capabilities, NVIDIA's newly released DGX B200 and DGX GB200 servers are also equipped with DPU (data processing unit) chips, which accelerate the system by offloading networking, storage, and security tasks from the CPU.
As for memory technology, there are two main types on the market: GDDR and HBM (High Bandwidth Memory). Because HBM has a clear advantage in bandwidth, it has become the preferred memory type for NVIDIA's mid-to-high-end data center GPUs (such as the A100, H100, and H200). HBM is a new type of high-speed, high-value DRAM product in which multiple DRAM dies are stacked vertically on a buffer die using TSV (Through-Silicon Via) technology, forming a single integrated memory module. HBM stacks can contain 4, 8, or more layers, giving a DRAM stack with large capacity and a wide bus. Importantly, HBM is connected to the GPU, CPU, or SoC not through traditional external interconnects but through an interposer that allows a fast, close-range connection, significantly reducing data transfer latency and power consumption. The 3D stacked structure gives CPUs and GPUs higher bandwidth and more tightly integrated memory, and packaging it as a SiP (System in Package) also saves space and reduces the footprint. For example, a machine equipped with 4 to 8 A100 80GB GPU cards has a total HBM capacity of 320GB to 640GB.
According to SK Hynix's forecast, demand for HBM is expected to grow at about 60% per year. HBM supply is currently dominated by three companies: SK Hynix, Samsung, and Micron, with SK Hynix being NVIDIA's major HBM supplier. Domestic enterprises, such as ChangXin Memory Technologies, serve NVIDIA indirectly by supplying SK Hynix.
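Two of the numbers in the HBM discussion are simple to reproduce: the total HBM capacity of a multi-GPU machine and the effect of compounding demand growth. The sketch below uses the 80GB per-GPU capacity and the roughly 60% annual growth rate quoted in the text; the three-year horizon and the function names are illustrative assumptions.

```python
HBM_PER_GPU_GB = 80   # A100 80GB, from the text


def total_hbm_gb(num_gpus, per_gpu_gb=HBM_PER_GPU_GB):
    """Total HBM capacity of a machine with the given number of GPUs."""
    return num_gpus * per_gpu_gb


def demand_multiple(annual_growth, years):
    """How many times larger demand becomes after compounding for `years`."""
    return (1.0 + annual_growth) ** years


for gpus in (4, 8):
    print(f"{gpus} GPUs: {total_hbm_gb(gpus)} GB of HBM")

# Illustrative horizon: three years at the quoted ~60% annual growth.
print(f"Demand after 3 years: {demand_multiple(0.60, 3):.2f}x today's level")
```

At that rate, demand roughly quadruples in three years, which helps explain why HBM capacity is treated as a bottleneck in the supply chain.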
Besides video memory, SSDs are also indispensable storage hardware in AI servers. Unlike DRAM, which stores information only temporarily, NAND-based SSDs are better suited to long-term storage of data that changes infrequently. The upstream of the SSD industry consists mainly of flash memory chip and controller chip manufacturers, while module manufacturers handle mid- to downstream production. Integrated Device Manufacturers (IDMs), which own the core technology and are vertically integrated, hold a central position in SSD design and production. Module manufacturers focus on assembling these components, and some have begun developing their own controller chips. SK Hynix and Samsung also hold a high share of the SSD market, a situation similar to that in HBM.
In chip and server interconnect interfaces, raising AI processing power depends not only on better GPU performance but also on highly developed chip interconnect and system connectivity technologies that can pool GPUs into massive clusters; this is the foundation of GPU scalability. NVIDIA currently uses PCIe Switch, NVLink, and NVSwitch, among other technologies.
Since its inception, PCIe has been widely used for communication between devices from different manufacturers thanks to its excellent compatibility. The sixth generation of PCIe is maturing, and the seventh-generation standard is already on the horizon, expected to arrive in 2025. In this market, the technology leaders in PCIe Switch core chips are Broadcom, Microchip Technology, and Phison Electronics, which together hold about 58% of the global market, while domestic companies such as Montage Technology are also starting to enter the field.
To overcome the signal attenuation that can occur during transmission on the PCIe bus, PCIe Retimer chips have become indispensable as signal repeaters. Despite growing demand, only two companies worldwide can currently mass-produce PCIe 5.0 Retimer chips: Montage Technology and Astera Labs.
Beyond PCIe, NVIDIA's self-developed NVLink technology is particularly noteworthy. NVLink connects multiple GPUs for high-performance parallel computing; compared with traditional PCIe, it offers several times the bandwidth and significantly lower latency, allowing more efficient data sharing and communication between GPUs. With the latest fifth-generation NVLink, a single Blackwell architecture GPU can reach a transfer rate of up to 1.8TB/s, fourteen times the bandwidth of PCIe Gen 5. NVIDIA's NVLink-C2C technology extends this connectivity further, allowing chiplets to interconnect with GPUs, CPUs, DPUs, NICs, and SoCs, among other devices. For example, the GB200 superchip uses this interconnect to link its CPU with its GPUs.
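The "fourteen times" comparison can be reproduced with a short calculation. The sketch below assumes the PCIe baseline is a x16 Gen 5 link counted in both directions at the raw signaling rate of 32 GT/s per lane, ignoring encoding overhead; those assumptions are mine, since the text does not state which PCIe configuration the comparison uses.

```python
NVLINK5_GB_PER_S = 1800        # 1.8 TB/s per Blackwell GPU, from the text
PCIE5_GT_PER_S_PER_LANE = 32   # PCIe 5.0 raw signaling rate per lane
LANES = 16                     # assumption: a x16 link


def pcie_bidirectional_gb_per_s(gt_per_s_per_lane, lanes):
    """Raw bidirectional bandwidth in GB/s, ignoring encoding overhead."""
    per_direction = gt_per_s_per_lane * lanes / 8   # Gb/s to GB/s
    return per_direction * 2                        # both directions


pcie5 = pcie_bidirectional_gb_per_s(PCIE5_GT_PER_S_PER_LANE, LANES)
ratio = NVLINK5_GB_PER_S / pcie5

print(f"PCIe Gen 5 x16, bidirectional: {pcie5:.0f} GB/s")
print(f"NVLink 5 is about {ratio:.1f}x that bandwidth")
```

Under these assumptions the ratio lands at about 14, in line with the figure quoted above.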
Any discussion of NVLink has to mention NVSwitch. According to Junjie Lai, Engineering and Solutions Director of NVIDIA in China, NVLink is similar to an Ethernet cable, while NVSwitch can be seen as a switch. It is not the familiar network data exchange device, however: it connects multiple NVLink ports internally to forward data across groups of GPUs. Taking the GB200 NVL72 as an example, NVIDIA connects 9 NVSwitches and 18 compute nodes with customized copper cables inside the cabinet. The NVIDIA GB200 NVL72 can connect up to 576 GPUs within one NVLink domain, with a total bandwidth exceeding 1 PB/s and 240 TB of fast memory. The most critical change is that NVSwitch has evolved from a chip mounted on the server's PCB into nine independent switches within a cabinet, which is what allows a single cabinet to connect up to 72 GPUs.
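The NVL72 numbers are consistent with one another, as the following sketch shows. It derives the GPUs per compute node, the number of cabinets needed to reach the maximum NVLink domain, and the aggregate NVLink bandwidth, assuming every GPU contributes the full 1.8 TB/s quoted earlier for fifth-generation NVLink. The variable names are illustrative.

```python
GPUS_PER_CABINET = 72            # GB200 NVL72, from the text
COMPUTE_NODES_PER_CABINET = 18   # from the text
MAX_GPUS_IN_DOMAIN = 576         # maximum NVLink domain size, from the text
NVLINK_TB_PER_S_PER_GPU = 1.8    # fifth-generation NVLink, from the text

gpus_per_node = GPUS_PER_CABINET // COMPUTE_NODES_PER_CABINET
cabinets_in_domain = MAX_GPUS_IN_DOMAIN // GPUS_PER_CABINET
aggregate_tb_per_s = MAX_GPUS_IN_DOMAIN * NVLINK_TB_PER_S_PER_GPU

print(f"GPUs per compute node: {gpus_per_node}")
print(f"Cabinets in a full NVLink domain: {cabinets_in_domain}")
print(f"Aggregate NVLink bandwidth: {aggregate_tb_per_s:.1f} TB/s")
```

The aggregate works out to a little over 1,000 TB/s, which matches the "exceeding 1 PB/s" figure.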
In other areas of networking, servers connect to switches through network interface cards and deliver information to its destination by exchanging data frames. This field encompasses technologies such as Ethernet, InfiniBand, and Omni-Path.
InfiniBand, known for its ultra-high bandwidth and very low latency, has become a key component in AI data centers and servers. The market is dominated by the HDR products of NVIDIA's Mellanox division, which offer end-to-end bandwidth of up to 200 Gbps. These products are widely used in NVIDIA servers, with all network cards coming from Mellanox's production line.
The PCB, often called the "mother of electronic products," is an indispensable supporting component in graphics cards and servers. PCBs are classified into single- and double-sided boards, multilayer boards, high-density interconnect (HDI) boards, flexible boards, and integrated circuit (IC) carrier boards. In GPU servers, high-end multilayer PCBs are commonly used, mainly for CPU boards, motherboards, power backplanes, hard drive backplanes, network cards, and riser cards. As server platforms evolve, the requirements for layer count and materials keep rising: for example, the move from the Purley platform to the Eagle Stream platform raises the layer count from 10-12 to 14-20 and imposes stricter requirements on the loss grade of the copper-clad laminate (CCL). According to statistics from Ping An Securities, China currently holds over 50% of the global PCB market. In addition, Guojin Securities points out that mainland Chinese PCB manufacturers have established a firm footing in global server PCB competition through joint research and supply with domestic server manufacturers; Hudian shares and Shenghong Technology, for example, provide PCBs for NVIDIA's servers and graphics cards, respectively.
The heat dissipation module is crucial for server performance. It absorbs and discharges heat generated inside the server and from external sources, keeping integrated circuits at a normal temperature and preventing heat damage. Mainstream cooling technologies fall into two groups: air cooling and liquid cooling. With the rapid development of artificial intelligence, demand for computing power and cabinet space has surged. Given the limited space available for data center construction and the constraints of environmental policy, traditional air cooling is increasingly unable to meet cooling needs, and liquid cooling is stepping in. At the chip level, liquid cooling addresses the challenge of rising chip power density; at the cabinet level, it has gradually become the ideal choice for high-power cabinets exceeding 15 kW, especially in data centers that can no longer rely on air convection for effective heat dissipation. NVIDIA's DGX GB200 server adopts liquid cooling, with its main liquid cooling partner being Vertiv, a U.S.-listed company, while some domestic suppliers serve this field indirectly through Vertiv's supply chain.
Copper cables play an important role in servers. As mentioned above, the GB200 server uses fifth-generation NVLink, with about 5,000 copper cables inside connecting the switches and GPUs.
High-speed copper DACs (Direct Attach Cables) are a cost-effective choice for short-distance, high-speed data transmission. This kind of copper connection can reach transmission rates of over 10 Gbps without the active electronic components that optical modules need, at roughly one-fifth of the price of optical modules. DACs also offer low power consumption, low latency, and low insertion loss, so they are often used in place of optical connections for short-distance, high-speed data networking. Although no mainland Chinese companies participate directly, some have established an indirect connection by supplying vendors such as Amphenol.
Optical modules continue to find new applications. For instance, NVIDIA has built a multi-tier NVLink network using fifth-generation NVLink and fourth-generation NVSwitch to provide direct interconnection and shared memory access among up to 576 GPUs. In this scheme, copper cables connect the GPUs to the NVSwitches within a cabinet, but optical interconnects are still used between the different tiers of NVSwitch, hence the steady growth in the use of optical modules. According to SemiAnalysis estimates, the DGX GB200 NVL72 has 72 OSFP ports, each accommodating a 400G or 800G optical module, and depending on the network topology, the ratio of GB200s to 800G optical modules ranges from 1:2.5 to 1:3.5.
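The quoted ratio translates directly into a module count range for a given number of GB200s. The sketch below is a rough estimate only: the function name is hypothetical, and the example of 36 GB200s is the per-server count given earlier in the article.

```python
import math


def optical_module_range(num_gb200, low_ratio=2.5, high_ratio=3.5):
    """Estimated (min, max) count of 800G optical modules for num_gb200 GB200s."""
    return math.ceil(num_gb200 * low_ratio), math.ceil(num_gb200 * high_ratio)


low, high = optical_module_range(36)
print(f"36 GB200s: between {low} and {high} 800G optical modules")
```

Where in that range a deployment falls would depend on the network topology, as the estimate itself notes.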
Switches are particularly important for building server clusters. A switch is a key network device for forwarding electrical and optical signals, providing a dedicated signal path between any two network nodes connected to it. The NVIDIA DGX H100 SuperPOD, for example, includes 32 servers and 12 switches. In 2019 and 2020, NVIDIA entered the switch business by acquiring two switch companies, Mellanox and Cumulus. In terms of hardware, a switch consists of the chassis, power supply, fans, backplane, management engine, system controller, switch module, and line cards. Currently, NVIDIA mainly uses its self-developed Spectrum-X800 Ethernet switch.
In ODM and OEM manufacturing, Foxconn Precision is NVIDIA's main OEM for AI servers. It accounts for over 50% of the AI server chip substrate supply and is also the exclusive supplier of NVIDIA's latest GH200 chip module. Foxconn Precision runs its AI server business through its subsidiaries Industrial Fulian and Hongbai Technology. Companies like Wistron InfoComm and Supermicro also cooperate with NVIDIA.
Other partners include companies from various industries that cooperate with NVIDIA in different ways. For instance, Hongboa Shares and Inspur Information have worked with NVIDIA on computing power projects; Shunwang Technology has cooperated with NVIDIA in domestic smart computing; a subsidiary of Qianfang Technology took part in joint server development with NVIDIA; Zhongdian Port is an authorized NVIDIA distributor; and Dingyang Technology supplies NVIDIA with electronic test and measurement equipment.
The information and views provided in this article do not constitute any investment advice. Before making investment decisions, investors should consult professionals and carefully consider the related investment risks.