Has anyone else tried the OpenCL implementation in ATI Stream SDK 2.0 with its experimental double-precision extension? I'm running it on my CPU, and double precision is 650 times slower than single precision (floats). That is REALLY not good. Is double-precision support any better in NVIDIA's SDK? Fast double precision is absolutely critical for scientific computation.