PCEVA (PC绝对领域): in search of real computer knowledge

Bulldozer Version 2.0 -- some news on second-generation Bulldozer

1#
royalk posted on 2010-11-7 22:53
Views: 3625 | Replies: 8
AMD recently added several new extensions for the "upcoming bdver2 processors" to the set of patches for the GNU operating system, a well-known blogger's findings reveal. The fact that Advanced Micro Devices made the new extensions available under non-disclosure agreement to select software developers means that the company's Bulldozer Version 2.0 - or, perhaps, Bulldozer NG - may be just several years away.

The new extensions that should be supported by the Bulldozer 2 processors are the following:

    * BMI - Bit Manipulation Instructions
    * TBM - Trailing Bit Manipulation
    * FMA3 - three-operand FMA [fused multiply-add] instructions

Unfortunately, we know nothing about these instructions or their potential. What we do know is that AMD's own Bulldozer already supports the FMA4 instructions, and FMA3 may be implemented for better compatibility with Intel's Sandy Bridge/Ivy Bridge chips that support FMA3.


Waiting for a translation.
2#
royalk (original poster) posted on 2010-11-8 10:54
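FMA is a combined multiply-and-add operation: it first multiplies two numbers and then adds a third to the product, which amounts to (A*B)+C. The difference between the two variants is whether the instruction takes three operands or four. Think of A, B and C as three registers. With FMA3, the result of (A*B)+C is written back into register C. AMD's version stores the result in a fourth register, D. In other words, at the end of the calculation Intel's version has to overwrite the data in register C. Ask an AMD engineer and they will tell you that FMA4 saves you that final overwrite/copy step. An Intel engineer will tell you that FMA3 gets the job done with fewer registers. One point does need to be clear, though: if you have a lot of calculations of the form (A*B)+constant, AMD's FMA4 approach will save you more clock cycles. So, from an algorithmic point of view, each approach has its strengths and weaknesses. Intel's FMA3 does not beat FMA4 outright.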
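
The register behaviour described above can be sketched with a toy register file. This is only an illustration under stated assumptions: the function names `fma3` and `fma4`, the register names, and the copy counter are all made up for this example, and plain Python arithmetic stands in for the hardware (it does not model the single rounding of a real fused operation, nor actual clock cycles). It follows the post's description, in which the three-operand form overwrites C.

```python
# Toy model of the operand semantics only. All names here are hypothetical
# and do not correspond to real mnemonics or any library API.

def fma3(regs, a, b, c):
    """Three-operand form: the result overwrites the addend register c."""
    regs[c] = regs[a] * regs[b] + regs[c]


def fma4(regs, d, a, b, c):
    """Four-operand form: the result goes to a separate register d."""
    regs[d] = regs[a] * regs[b] + regs[c]


def accumulate_fma3(xs, ys, constant):
    """Compute x*y + constant per pair using the destructive form."""
    regs = {"A": 0.0, "B": 0.0, "C": 0.0, "K": constant}
    results, copies = [], 0
    for x, y in zip(xs, ys):
        regs["A"], regs["B"] = x, y
        regs["C"] = regs["K"]  # extra copy: fma3 is about to clobber C
        copies += 1
        fma3(regs, "A", "B", "C")
        results.append(regs["C"])
    return results, copies


def accumulate_fma4(xs, ys, constant):
    """Compute x*y + constant per pair using the non-destructive form."""
    regs = {"A": 0.0, "B": 0.0, "K": constant, "D": 0.0}
    results, copies = [], 0
    for x, y in zip(xs, ys):
        regs["A"], regs["B"] = x, y
        fma4(regs, "D", "A", "B", "K")  # K survives, so no reload is needed
        results.append(regs["D"])
    return results, copies


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print(accumulate_fma3(xs, ys, 10.0))
    print(accumulate_fma4(xs, ys, 10.0))
```

In this model both versions produce the same values; the three-operand loop simply performs one extra register copy per iteration to restore the constant, while the four-operand loop ties up one more register for the destination. That is the trade-off the post attributes to the two camps.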

This is only AMD's official commentary. How much difference it will actually make for us remains to be seen.