High-Performance Computing (HPC)
Overview
High-performance computing involves advanced computation that exploits parallel processing, enabling highly compute-intensive workloads such as climate research, molecular modeling, physical simulation, cryptanalysis, geophysical modeling, automotive and aerospace design, financial modeling, and data mining to be completed far more quickly. High-performance simulation demands the most efficient computing platforms. The execution time of a given simulation depends upon many factors, such as the number of CPU/GPU cores and their utilization factor, and the interconnect performance, efficiency, and scalability. Efficient high-performance computing systems require high-bandwidth, low-latency connections between thousands of multi-processor nodes, as well as high-speed storage systems.
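To make the relationship between core count, utilization, and interconnect overhead concrete, the short C sketch below applies a simple Amdahl-style estimate of simulation speedup; the serial fraction and per-core communication overhead are assumed example values, not measurements of any particular system.

```c
#include <stdio.h>

/* Toy Amdahl-style model of parallel speedup: 'serial' is the fraction of
 * the simulation that cannot be parallelized, and 'comm' is an assumed
 * per-core communication overhead contributed by the interconnect. */
static double estimated_speedup(int cores, double serial, double comm)
{
    double parallel = 1.0 - serial;
    return 1.0 / (serial + parallel / cores + comm * cores);
}

int main(void)
{
    /* Example values only: 5% serial work, 0.001% overhead added per core. */
    const double serial = 0.05, comm = 0.00001;

    for (int n = 64; n <= 4096; n *= 4)
        printf("%5d cores -> estimated speedup %.1fx\n",
               n, estimated_speedup(n, serial, comm));
    return 0;
}
```

Lowering the communication term (a faster, more efficient interconnect) is what keeps the speedup curve from flattening or reversing at high core counts, which is exactly why interconnect performance, efficiency, and scalability matter in the model above.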
One of HPC's strengths is the ability to achieve the best possible sustained performance by driving the CPU/GPU performance toward its limits. Thanks to their productivity and flexibility, computer clusters have become the most widely used hardware solution for HPC simulations. Maximizing productivity on today's HPC cluster platforms requires the use of enhanced data messaging techniques.
By providing low latency, high bandwidth, a high message rate, transport offload for extremely low CPU overhead, Remote Direct Memory Access (RDMA), and advanced communications offloads, Mellanox interconnect solutions are the most widely deployed high-speed interconnect for large-scale simulations, replacing proprietary and low-performance alternatives. Mellanox's Scalable HPC interconnect solutions are paving the road to Exascale computing by delivering the highest scalability, efficiency, and performance for HPC systems today and in the future. Mellanox's scalable HPC solutions are proven and certified across a wide range of market segments, clustering topologies, and environments (Linux, Windows). Mellanox is an active member of the HPC Advisory Council and contributes to high-performance computing outreach and education around the world.
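As a minimal illustration of the zero-copy model behind RDMA-capable interconnects, the sketch below uses the open source libibverbs API to open an RDMA device and register a buffer for direct hardware access; connection setup and the actual data transfer are omitted, and this is generic verbs code rather than anything specific to a particular Mellanox product.

```c
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices;
    struct ibv_device **devices = ibv_get_device_list(&num_devices);
    if (!devices || num_devices == 0) {
        fprintf(stderr, "no RDMA-capable devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devices[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Register a buffer so the adapter can read/write it directly (zero copy),
     * without involving the CPU in the data path once a connection exists. */
    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    printf("device %s: registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           ibv_get_device_name(devices[0]), len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devices);
    free(buf);
    return 0;
}
```

On a host with the rdma-core/OFED stack installed, this should build with a standard C compiler linked against libibverbs (for example, `gcc -libverbs`).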
High-performance computing enables a broad range of markets and applications.
Life Sciences
Life sciences data centers leverage HPC technology to reduce wait times and improve productivity. These data centers deploy multi-core, multi-processor servers and high-speed storage systems, but the network connecting the servers and storage is often not optimized for performance, which can slow everything down.
EDA
EDA simulations typically involve 3D modeling, fluid dynamics, and other compute-intensive processes that demand high-performance computing (HPC) data center solutions.
Manufacturing
Mechanical computer-aided design (MCAD) and computer-aided engineering (CAE) systems have adopted HPC cluster computing environments to speed processing times and shorten time to revenue for new products.
Media and Entertainment
Today's media data centers are investing in high-performance HPC cluster technology to reduce production delays, combining hundreds or thousands of CPUs to handle highly complex rendering workloads.
Oil and Gas Industry Modeling
Oil and gas companies use HPC technology to minimize the time involved in processing massive amounts of data in order to reduce costs and speed production.
Climate
Weather forecasting and research involve processing massive volumes of input data at high speed, so productivity is governed by processing speed. Data centers used in weather research use HPC cluster technology, combining the power of thousands of CPUs with large, high-speed storage systems.
Performance Benchmark Overview
High-performance computing (HPC) and virtualized enterprise data centers run complex simulations and research, making high-throughput, low-latency I/O solutions essential.
Mellanox InfiniBand and Ethernet adapters are the world's leading industry-standard interconnect solutions, delivering the bandwidth, latency, low power consumption, and low CPU utilization needed to maximize compute-system productivity, efficiency, and scalability.
InfiniBand Performance
High-performance computing (HPC) and virtualized enterprise data centers run complex simulations and research, making high-throughput, low-latency I/O solutions essential. InfiniBand is an industry-standard interconnect that delivers the bandwidth, latency, low power consumption, and low CPU utilization needed to maximize compute-system productivity, efficiency, and scalability.
Ethernet Performance
Mellanox provides industry-leading, cost-effective, high-performance Ethernet interconnect solutions for mainstream grid and enterprise data centers.
The Next-Generation Fabric
Intel® Omni-Path Architecture (Intel® OPA), an element of Intel® Scalable System Framework, delivers the performance for tomorrow’s high performance computing (HPC) workloads and the ability to scale to tens of thousands of nodes—and eventually more—at a price competitive with today’s fabrics. The Intel OPA 100 Series product line is an end-to-end solution of PCIe* adapters, silicon, switches, cables, and management software. As the successor to Intel® True Scale Fabric, this optimized HPC fabric is built upon a combination of enhanced IP and Intel® technology.
For software applications, Intel OPA will maintain consistency and compatibility with existing Intel True Scale Fabric and InfiniBand* APIs by working through the open source OpenFabrics Alliance (OFA) software stack on leading Linux* distribution releases. Intel True Scale Fabric customers will be able to migrate to Intel OPA through an upgrade program.
The Future of High Performance Fabrics
Current standards-based high performance fabrics, such as InfiniBand*, were not originally designed for HPC, resulting in performance and scaling weaknesses that are currently impeding the path to Exascale computing. Intel® Omni-Path Architecture is being designed specifically to address these issues and scale cost-effectively from entry level HPC clusters to larger clusters with 10,000 nodes or more. To improve on the InfiniBand specification and design, Intel is using the industry’s best technologies including those acquired from QLogic and Cray alongside Intel® technologies.
While both Intel OPA and InfiniBand Enhanced Data Rate (EDR) will run at 100Gbps, there are many differences. The enhancements of Intel OPA will help enable the progression towards Exascale while cost-effectively supporting clusters of all sizes with optimization for HPC applications at both the host and fabric levels for benefits that are not possible with the standard InfiniBand-based designs.
Intel OPA is designed to provide the:
- Features and functionality at both the host and fabric levels to greatly raise levels of scaling
- CPU and fabric integration necessary for the increased computing density, improved reliability, reduced power, and lower costs required by significantly larger HPC deployments
- Fabric tools to readily install, verify, and manage fabrics at this level of complexity
Intel® Omni-Path Key Fabric Features and Innovations
Adaptive Routing
Adaptive Routing monitors the performance of the possible paths between fabric end-points and selects the least congested path to rebalance the packet load. While other technologies also support routing, the implementation is vital. Intel's implementation is based on cooperation between the Fabric Manager and the switch ASICs. The Fabric Manager—with a global view of the topology—initializes the switch ASICs with several egress options per destination, updating these options as the fabric changes, such as when links are added or removed. Once the switch egress options are set, the Fabric Manager monitors the fabric state, and the switch ASICs dynamically monitor and react to the congestion sensed on individual links. This approach enables Adaptive Routing to scale as fabrics grow larger and more complex.
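Purely as a conceptual sketch (not Intel's actual Fabric Manager or switch ASIC logic), the following C fragment captures the division of labor described above: routes are pre-initialized with several egress options per destination, and the per-packet decision simply picks whichever option currently reports the least congestion.

```c
#include <stdio.h>

#define EGRESS_OPTIONS 4

/* Toy adaptive routing decision: given several pre-initialized egress ports
 * for a destination, forward via the one with the lowest sensed congestion. */
static int pick_egress(const double congestion[EGRESS_OPTIONS])
{
    int best = 0;
    for (int i = 1; i < EGRESS_OPTIONS; i++)
        if (congestion[i] < congestion[best])
            best = i;
    return best;
}

int main(void)
{
    /* Assumed congestion readings (e.g., buffer occupancy) on candidate links. */
    double congestion[EGRESS_OPTIONS] = { 0.72, 0.15, 0.40, 0.33 };
    printf("forwarding via egress option %d\n", pick_egress(congestion));
    return 0;
}
```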
Dispersive Routing
One of the critical roles of fabric management is the initialization and configuration of routes through the fabric between pairs of nodes. Intel® Omni-Path Fabric supports a variety of routing methods, including defining alternate routes that disperse traffic flows for redundancy, performance, and load balancing. Instead of sending all packets from a source to a destination via a single path, Dispersive Routing distributes traffic across multiple paths. Once received, packets are reassembled in their proper order for rapid, efficient processing. By leveraging more of the fabric to deliver maximum communications performance for all jobs, Dispersive Routing promotes optimal fabric efficiency.
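The following toy C example (an illustration only, not the Intel® Omni-Path implementation) shows the two halves of the idea: packets are sprayed round-robin across several paths, and the receiver places each arriving packet back into its sequence-number slot so the flow is processed in order.

```c
#include <stdio.h>

#define NPATHS 3
#define NPKTS  9

int main(void)
{
    /* Sender side: spray packet sequence numbers across paths round-robin. */
    int path_of[NPKTS];
    for (int seq = 0; seq < NPKTS; seq++)
        path_of[seq] = seq % NPATHS;

    /* Receiver side: packets may arrive path by path (out of global order);
     * placing each one by its sequence number restores the original order. */
    int reassembled[NPKTS];
    for (int p = 0; p < NPATHS; p++)
        for (int seq = 0; seq < NPKTS; seq++)
            if (path_of[seq] == p)
                reassembled[seq] = seq;

    for (int seq = 0; seq < NPKTS; seq++)
        printf("slot %d <- packet %d via path %d\n",
               seq, reassembled[seq], path_of[seq]);
    return 0;
}
```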
Traffic Flow Optimization
Traffic Flow Optimization optimizes the quality of service beyond selecting the priority—based on virtual lane or service level—of messages to be sent on an egress port. At the Intel® Omni-Path Architecture link level, variable-length packets are broken up into fixed-sized containers that are in turn packaged into fixed-sized Link Transfer Packets (LTPs) for transmission over the link. Because packets are broken up into smaller containers, a higher-priority container can request a pause and be inserted into the inter-switch link (ISL) data stream before the lower-priority packet has finished transmitting.
The key benefit is that Traffic Flow Optimization reduces the variation in latency seen through the network by high priority traffic in the presence of lower priority traffic. It addresses a traditional weakness of both Ethernet and InfiniBand* in which a packet must be transmitted to completion once the link starts even if higher priority packets become available.
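A toy C model of that behavior is sketched below; the container size and packet lengths are arbitrary assumptions used only to show how splitting packets into fixed-size containers lets a high-priority packet preempt a longer, lower-priority one mid-transmission.

```c
#include <stdio.h>

#define CONTAINER_BYTES 64  /* assumed container size, for illustration only */

int main(void)
{
    int low_bytes = 1024, high_bytes = 128;  /* assumed packet sizes */
    int low_n  = (low_bytes  + CONTAINER_BYTES - 1) / CONTAINER_BYTES;
    int high_n = (high_bytes + CONTAINER_BYTES - 1) / CONTAINER_BYTES;

    int sent_low = 0;

    /* The long low-priority packet starts transmitting container by container... */
    while (sent_low < 4)
        printf("LOW  container %d/%d\n", ++sent_low, low_n);

    /* ...a high-priority packet arrives and its containers are inserted now,
     * instead of waiting for the low-priority packet to finish. */
    for (int i = 1; i <= high_n; i++)
        printf("HIGH container %d/%d (preempts)\n", i, high_n);

    /* The low-priority packet then resumes where it left off. */
    while (sent_low < low_n)
        printf("LOW  container %d/%d (resumed)\n", ++sent_low, low_n);

    return 0;
}
```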
Packet Integrity Protection
Packet Integrity Protection allows for rapid and transparent recovery of transmission errors between a sender and a receiver on an Intel® Omni-Path Architecture link. Given the very high Intel® OPA signaling rate (25.78125G per lane) and the goal of supporting large scale systems of a hundred thousand or more links, transient bit errors must be tolerated while ensuring that the performance impact is insignificant. Packet Integrity Protection enables recovery of transient errors whether it is between a host and switch or between switches. This eliminates the need for transport level timeouts and end-to-end retries. This is done without the heavy latency penalty associated with alternate error recovery approaches.
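The sketch below is a highly simplified C illustration of link-level replay (not the actual Intel® OPA protocol): the receiver checks each packet, and a corrupted packet is simply resent over the link, so no transport-level timeout or end-to-end retry is needed.

```c
#include <stdio.h>
#include <stdbool.h>

/* Pretend CRC check: packet 3 is assumed to arrive with a transient bit error. */
static bool crc_ok(int pkt)
{
    return pkt != 3;
}

int main(void)
{
    for (int pkt = 0; pkt < 6; pkt++) {
        if (crc_ok(pkt)) {
            printf("pkt %d accepted\n", pkt);
        } else {
            /* The sender still holds the packet in its replay buffer,
             * so only this packet is retransmitted across the link. */
            printf("pkt %d failed CRC -> replayed from link-level buffer\n", pkt);
            printf("pkt %d accepted on retry\n", pkt);
        }
    }
    return 0;
}
```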
Dynamic Lane Scaling
Dynamic Lane Scaling allows an operation to continue even if one or more lanes of a 4x link fail, saving the need to restart or go to a previous checkpoint to keep the application running. The job can then run to completion before taking action to resolve the issue. Currently, InfiniBand* typically drops the whole 4x link if any of its lanes drops, costing time and productivity.
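The arithmetic behind this is straightforward; the short C snippet below uses the 25.78125 Gb/s per-lane signaling rate quoted above to show the raw link rate a 4x link retains as lanes drop out, rather than the whole link going down (encoding and protocol overhead are ignored).

```c
#include <stdio.h>

int main(void)
{
    const double lane_gbps = 25.78125;  /* per-lane signaling rate from the text */

    /* A 4x link that loses lanes keeps running at a proportionally lower rate. */
    for (int lanes = 4; lanes >= 1; lanes--)
        printf("%d lane(s) active: %8.3f Gb/s raw signaling\n",
               lanes, lanes * lane_gbps);
    return 0;
}
```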
Ask How Intel® Omni-Path Architecture Can Meet Your HPC Needs
Intel is clearing the path to Exascale computing and addressing tomorrow's HPC issues. Contact your Intel representative or any authorized Intel® True Scale Fabric provider to discuss how Intel® Omni-Path Architecture can improve the performance of your future HPC workloads.