ABCI Supercomputer
Japan’s Advanced ABCI 3.0 Supercomputer Increases AI Sovereignty. HPE has built a new AI supercomputer for AIST, one of the nation’s largest public research institutes, using thousands of NVIDIA H200 GPUs and NVIDIA Quantum-2 InfiniBand.
The Japanese National Institute of Advanced Industrial Science and Technology (AIST) will add thousands of NVIDIA H200 Tensor Core GPUs to its AI Bridging Cloud Infrastructure 3.0 (ABCI 3.0) supercomputer to boost AI sovereignty and R&D. The HPE Cray XD system will use NVIDIA Quantum-2 InfiniBand networking for performance and scalability.
ABCI 3.0, Japan’s latest open AI computing infrastructure, advances AI research and development. The partnership demonstrates Japan’s commitment to strengthening its technological independence and growing its AI capabilities.
AIST Executive Officer Yoshio Tanaka stated, “We launched ABCI, the world’s first large-scale open AI computing infrastructure, in August 2018. Building on our experience operating ABCI over the past several years, we are now upgrading to ABCI 3.0. In partnership with NVIDIA, our goal is to develop ABCI 3.0 into a computing infrastructure that will advance Japan’s research and development capabilities in generative AI.”
“It’s imperative to quickly cultivate research and development capabilities within Japan as generative AI prepares to catalyze global change,” stated Hirotaka Ogawa, producer and head of ABCI operations at AIST Solutions Co. “Through the partnership with NVIDIA and HPE, we expect this significant upgrade of ABCI to strengthen the organization’s leadership in both domestic industry and academia, advancing Japan’s AI development toward global competitiveness and serving as a cornerstone for future innovation.”
ABCI 3.0: Japanese AI Research and Development Enters a New Era
AIST, its corporate subsidiary AIST Solutions, and its system integrator Hewlett Packard Enterprise (HPE) are responsible for building and running ABCI 3.0.
The ABCI 3.0 project is part of a larger $1 billion initiative by Japan’s Ministry of Economy, Trade and Industry (METI) that covers both the ABCI effort and investments in cloud AI computing. METI has supported the project’s push to strengthen its computing resources through the Economic Security Fund.
NVIDIA is partnering closely with METI on research and education, following a visit last year by founder and CEO Jensen Huang, who met with business and political leaders, including Japanese Prime Minister Fumio Kishida, to discuss the future of AI.
NVIDIA’s Dedication to the Future of Japan
Huang pledged to collaborate on research, especially in robotics, quantum computing, and generative AI. He also promised to invest in AI startups and to provide product support, training, and education.
During his tour, Huang stressed the importance of “AI factories,” next-generation data centers built to handle the most computationally demanding AI tasks, in converting massive volumes of data into intelligence.
“The AI factory will become the bedrock of modern economies across the world,” Huang declared in a December meeting with Japanese media. With its ultra-high-density, energy-efficient data center design, ABCI offers reliable infrastructure for developing AI and big-data applications.
The system, located in Kashiwa near Tokyo, should be operational by the end of the year, providing cutting-edge resources for AI research and development.
Superior Processing Speed and Efficiency
The system will provide:
- 6 AI exaflops, a measure of AI-specific performance without sparsity
- 408 double-precision petaflops, a measure of general computing power
- NVIDIA Quantum-2 InfiniBand connecting each node with 200 GB/s of bisection bandwidth
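Since the two headline figures use different units, a quick back-of-the-envelope conversion (using only the numbers quoted above) puts them on a common scale:

```python
# Put ABCI 3.0's quoted performance figures in a common unit.
AI_EXAFLOPS = 6.0        # AI-specific (non-sparse) performance
FP64_PETAFLOPS = 408.0   # double-precision performance

ai_petaflops = AI_EXAFLOPS * 1_000  # 1 exaflop = 1,000 petaflops
ratio = ai_petaflops / FP64_PETAFLOPS

print(f"AI performance:   {ai_petaflops:,.0f} petaflops")
print(f"FP64 performance: {FP64_PETAFLOPS:,.0f} petaflops")
print(f"AI/FP64 ratio:    {ratio:.1f}x")
```

The roughly 15x gap is expected: the AI figure counts low-precision tensor-core operations, while the FP64 figure counts full double-precision math.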
NVIDIA technology forms the foundation of this effort, with hundreds of nodes each outfitted with eight NVLink-connected H200 GPUs, delivering unprecedented computational performance and efficiency.
The NVIDIA H200 is the first GPU to offer more than 140 GB of HBM3e memory at 4.8 terabytes per second (TB/s). The H200’s larger, faster memory accelerates generative AI and large language models (LLMs) and advances scientific computing for HPC workloads, with lower total cost of ownership and improved energy efficiency.
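Taking the per-GPU figures at face value, the totals for an eight-GPU H200 node can be sketched as follows (the 141 GB capacity is an assumption standing in for “more than 140 GB”; it is not a figure stated in this article):

```python
# Back-of-the-envelope totals for an 8x H200 node.
GPUS_PER_NODE = 8
HBM3E_PER_GPU_GB = 141   # assumption: "more than 140 GB" taken as 141 GB
BW_PER_GPU_TBS = 4.8     # memory bandwidth per GPU, TB/s

node_memory_gb = GPUS_PER_NODE * HBM3E_PER_GPU_GB
node_bandwidth_tbs = GPUS_PER_NODE * BW_PER_GPU_TBS

print(f"HBM3e per node:      {node_memory_gb} GB")
print(f"Aggregate bandwidth: {node_bandwidth_tbs:.1f} TB/s")
```

Roughly speaking, more than 1 TB of HBM3e per node means a single node could in principle hold over 500 billion FP16 parameters (at 2 bytes each) before accounting for activations and optimizer state.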
For AI workloads such as LLM token generation, NVIDIA H200 GPUs are 15X more energy-efficient than ABCI’s previous-generation architecture.
Combining NVIDIA Quantum-2 InfiniBand with In-Network Computing, which offloads operations from the CPU by performing data computations in the network devices themselves, ensures the efficient, high-speed, low-latency communication essential for managing large datasets and demanding AI tasks.
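To see why offloading reductions into the network helps, compare the per-node traffic of a conventional ring all-reduce with an in-network reduction, where the switch aggregates data so each node sends its buffer only once. The model below is an illustrative sketch, not a measurement of ABCI:

```python
def ring_allreduce_bytes_per_node(msg_bytes: int, n_nodes: int) -> float:
    """Bytes each node transmits in a classic ring all-reduce."""
    return 2 * (n_nodes - 1) / n_nodes * msg_bytes

def in_network_allreduce_bytes_per_node(msg_bytes: int) -> int:
    """Bytes each node transmits when the switch reduces in-network:
    each node sends its data up once and receives the result once."""
    return msg_bytes

msg = 1 << 30   # 1 GiB gradient buffer
nodes = 64
ring = ring_allreduce_bytes_per_node(msg, nodes)
sharp = in_network_allreduce_bytes_per_node(msg)
print(f"ring:       {ring / 2**30:.2f} GiB per node")
print(f"in-network: {sharp / 2**30:.2f} GiB per node")
```

At 64 nodes the ring variant moves nearly twice the data per node, and it also needs on the order of 2(N-1) latency-bound steps, which is where a low-latency, in-network reduction path pays off.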
With state-of-the-art computing and data processing capabilities, ABCI serves as a platform to accelerate joint AI research and development with industry, academia, and government.
METI’s significant investment demonstrates Japan’s strategic goal of boosting AI development capabilities and accelerating the adoption of generative AI.
ABCI infrastructure
ABCI (AI Bridging Cloud Infrastructure) 3.0 is an advanced cloud computing platform that supports artificial intelligence research and development. Here are some of ABCI 3.0’s main features and benefits:
High-Performance Computing (HPC)
- Equipped with cutting-edge GPUs and CPUs to deliver substantial processing power.
- Optimized for AI workloads, enabling faster machine learning training and inference.
Scalability
- Flexibly scales resources to match the demands of AI workloads.
- Supports both small-scale experiments and large-scale production deployments.
High-Speed Networking
- Uses high-speed networking technology to deliver high throughput and low latency.
- Enables efficient data transfer across compute nodes.
Energy Efficiency
- Designed with energy-efficient components to minimize environmental impact.
- Uses advanced cooling and power management systems to reduce energy consumption.
User-Friendly Interface
- Offers intuitive interfaces and APIs that simplify resource access and management.
- Supports a variety of AI frameworks and tools, making the platform accessible to researchers and developers.
Accelerated AI Research
- Enables rapid prototyping, training, and deployment of AI models.
- Handles large datasets and complex simulations, accelerating the pace of progress in AI.
Cost-Effectiveness
- Offers a pay-as-you-go pricing model that lets users match costs to their actual usage.
- Reduces the need for large upfront hardware investments.
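As an illustration of how a pay-as-you-go model maps usage to cost, here is a minimal sketch; the rate is a hypothetical placeholder, not ABCI’s actual pricing:

```python
def job_cost(gpu_hours: float, rate_per_gpu_hour: float) -> float:
    """Cost of a job under a simple pay-as-you-go model."""
    return gpu_hours * rate_per_gpu_hour

# Hypothetical example: 8 GPUs for 12 hours at a placeholder rate.
RATE = 3.50  # currency units per GPU-hour (placeholder, not ABCI pricing)
cost = job_cost(gpu_hours=8 * 12, rate_per_gpu_hour=RATE)
print(f"Estimated cost: {cost:.2f}")  # 96 GPU-hours at the placeholder rate
```

Users pay only for the 96 GPU-hours consumed, rather than provisioning eight GPUs permanently.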
Collaboration and Sharing
- Gives institutions and researchers shared access to powerful computing resources, making collaboration easier.
- Fosters cooperative research environments by supporting joint projects and data sharing.
Enhanced Security
- Implements robust security measures to safeguard confidential information and intellectual property.
- Provides encryption and secure access controls to ensure data confidentiality and integrity.
Support and Training
- Provides comprehensive support and educational materials to help users get the most out of the platform.
- Offers documentation, courses, and technical support for optimization and troubleshooting.

With the capabilities and resources needed to advance AI research and applications, ABCI 3.0 represents a major step forward in cloud infrastructure designed with AI in mind.