"For example, driven by the popularity of live broadcasting of sports events and computer games, 4K video and h.265 coding algorithm have ushered in great development, and the traditional CPU is difficult to cope with the required frame rate. Similarly, artificial intelligence is also using massive data for training neural networks to learn and identify patterns in data, and then used in many different applications such as image recognition, automatic driving, search optimization and natural language translation. In these two areas, data centers increasingly use special accelerators to achieve low latency response to user queries.
Intel: We will continue to support ARM cores
At the Intel SoC FPGA Developer Forum (ISDF) held in November 2016, Yang Xu, global vice president of Intel and president of Intel China, reaffirmed four commitments to the industry: investing in new FPGA and SoC FPGA products, supporting longer product life cycles, continuing to provide customers with first-class service and support, and continuing to support ARM cores in Altera SoC FPGAs.
He said that today's technology is constantly "breaking the barriers between the digital world and the real world". Intel calls the relationship between things, devices and the cloud a "virtuous circle of growth". Within this cycle, a series of technologies increase one another's value and push Intel to keep adjusting its own strategy. The uniqueness and flexibility of FPGAs give them a strongly differentiated role in intelligent connectivity, which is the fundamental reason Intel was willing to spend $16.7 billion to acquire Altera.
According to the plan, the roadmap for Intel's next-generation FPGA and SoC FPGA products is divided into three product platforms (Figure 1), low-end, midrange and high-end, all of which support Intel Architecture (IA) integration. Low-end products mainly target industrial IoT, automotive and communications RF applications and use Intel's 22nm process technology. Midrange products mainly target 4.5G/5G wireless, UHD/8K broadcast video, industrial IoT and automotive applications and use Intel's 10nm process technology. High-end products mainly target cloud and acceleration, terabit systems and high-speed signal processing, and also use Intel's 10nm process technology.
Figure 1: Intel next-generation FPGA and SoC FPGA development roadmap
Intel promises to provide different heterogeneous architectures to suit different customer needs, including discrete CPU + FPGA, CPU + FPGA integrated in a single package, and FPGAs that integrate Intel CPU / FPGA / ARM.
As Intel's most powerful FPGA family at present, all models in the Stratix 10 FPGA / SoC FPGA series use heterogeneous 3D SiP integration technology, combining a high-density monolithic FPGA core fabric with high-speed serial transceivers and protocol blocks through Intel's proprietary Embedded Multi-die Interconnect Bridge (EMIB) technology. The series is also the first to adopt the new HyperFlex architecture. By introducing registers on every interconnect routing segment in the core, the Stratix 10 series can effectively reduce routing delay and improve overall performance.
Xilinx: Reconfigurable Acceleration Stack improves computing efficiency by 2-6 times
As the inventor of FPGA technology, Xilinx has made great progress in data center applications over the past few years. According to Steve Glaser, senior vice president of strategy and marketing at Xilinx, three of the world's seven hyperscale cloud service companies have adopted Xilinx FPGAs; among them, Baidu announced in October 2016 that it had designed a pool of Xilinx UltraScale FPGAs to accelerate machine learning inference. In May 2016, Xilinx joined AMD, ARM, Huawei, IBM, Mellanox and Qualcomm to found the Cache Coherent Interconnect for Accelerators (CCIX) consortium. Five months later, its membership had tripled. In November, Xilinx released its latest 16nm Virtex UltraScale+ FPGAs with high bandwidth memory (HBM) and CCIX technology, which increase memory bandwidth by 20 times and reduce power consumption per bit by 4 times.
However, FPGAs have always been difficult to program, and developers need both software and hardware skills. To better meet the needs of emerging markets, Xilinx followed its 2014 release of SDAccel, a software-defined development environment for FPGA acceleration, by launching the Reconfigurable Acceleration Stack at the end of 2016. The stack targets the three fastest-growing compute-intensive applications in hyperscale data centers: machine learning, data analytics and live video streaming.
This means that Xilinx will provide not only FPGA chips, but also optimized math libraries and application libraries (such as Caffe for machine learning), software framework implementations, tools supporting high-level languages such as OpenCL and C/C++, OpenStack support for easy configuration and management, and an accelerator board reference design. Andy Walsh, director of strategic market development for cloud computing at Xilinx, said that the Xilinx FPGA-based Reconfigurable Acceleration Stack delivers the industry's highest computing efficiency: 40 times that of x86 server CPUs and 6 times that of competing FPGA solutions. By swapping in the best-suited bitstream, the hardware can be re-optimized for each of these workloads in milliseconds.
"An accelerator may be fast under a specific workload, but it must also see whether it can reduce the overall operating cost of the data center." Andy Walsh explains that there are two decisive factors in the total cost of ownership of acceleration technology: the breadth of applications that accelerators can support, and how accelerators can be easily and efficiently configured and pooled to determine the utilization of accelerators.
Figure 2 lists the different options for accelerating data center workloads: CPU, custom ASIC, GPU and FPGA. According to Andy Walsh, although GPUs and custom ASICs can also be deployed in pooled configurations to improve utilization, neither can support a wide range of applications. Lacking reconfigurability, they can only support workloads that match their fixed hardware architecture. In addition, the large design investment, design risk and design cost of creating a custom ASIC make it far less economical than an FPGA.
As for the integrated CPU + FPGA strategy proposed by Intel, he believes that this approach limits both the breadth of applications and the utilization of the accelerator, leaving it in a "no man's land". Such a CPU + FPGA device is constrained by power density, which usually restricts the FPGA to mid-range and low-end devices and to a limited set of workloads. Integrating the FPGA into the CPU package also limits the ability to pool accelerators, which greatly reduces utilization.
Figure 2: differences in application breadth and utilization of different schemes for accelerating data center workload
"Altera emphasizes floating-point precision DSP, which does not match many applications, including machine learning inference, and is far lower than the computational efficiency of GPU optimized for training." Andy Walsh said that in terms of computing efficiency, Xilinx FPGA is 2-6 times higher than Altera independent FPGA, and its utilization rate is greatly improved compared with Intel Integrated MCM. Its advantages stem from its excellent DSP architecture, memory hierarchy and leading position in chip technology.
Figure 3: differences in utilization and computing efficiency of different schemes for accelerating data center workload