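On ARM systems, SSE instructions are not available for acceleration, which is a real headache for programmers whose code relies on SSE. Luckily, I found these online: sets of function interfaces implemented with the NEON instruction set that stand in for the SSE intrinsics. Sharing them here for everyone. Thanks to GitHub; let's help each other and improve together!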
No '_mm_loadu_ps'
https://github.com/merckhung/sse2neon/blob/master/SSE2NEON.h
This one is complete:
Requires #define ANDROID, and is still incomplete
https://github.com/lucien-ye/sse-neon/blob/master/sse_neon.hpp
These two look similar and come with test examples:
'_mm_loadu_ps' was not declared
https://github.com/noxo/sse2neon/blob/master/SSE2NEON.h
1197 lines, with concrete implementations and test examples
https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h
Seems to be complete
https://github.com/TuringKi/fDSST_cpp
https://github.com/TuringKi/fDSST_cpp/blob/master/src/SSE2NEON.h
https://github.com/intel/ARM_NEON_2_x86_SSE/blob/master/NEON_2_SSE.h
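Several of the notes above report that `_mm_loadu_ps` is missing from a header. As a rough sketch of what such a wrapper layer looks like (not taken from any of the linked headers), the snippet below maps three operations to NEON, to SSE, or to a plain scalar fallback, so it should build on either platform. The names `vec4`, `load_u`, `store_u`, `add4` and `add_arrays` are hypothetical, chosen so they do not clash with the real `_mm_*` intrinsics.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
  typedef float32x4_t vec4;                                      // NEON register: 4 x float
  inline vec4 load_u(const float* p)    { return vld1q_f32(p); }   // plays the role of _mm_loadu_ps
  inline void store_u(float* p, vec4 v) { vst1q_f32(p, v); }       // plays the role of _mm_storeu_ps
  inline vec4 add4(vec4 a, vec4 b)      { return vaddq_f32(a, b); } // plays the role of _mm_add_ps
#elif defined(__SSE__) || defined(_M_X64)
  #include <xmmintrin.h>
  typedef __m128 vec4;                                           // SSE register: 4 x float
  inline vec4 load_u(const float* p)    { return _mm_loadu_ps(p); }
  inline void store_u(float* p, vec4 v) { _mm_storeu_ps(p, v); }
  inline vec4 add4(vec4 a, vec4 b)      { return _mm_add_ps(a, b); }
#else
  // Scalar fallback (assumption: used only when neither NEON nor SSE is detected).
  struct vec4 { float v[4]; };
  inline vec4 load_u(const float* p) {
      vec4 r;
      for (int i = 0; i < 4; ++i) r.v[i] = p[i];
      return r;
  }
  inline void store_u(float* p, vec4 x) {
      for (int i = 0; i < 4; ++i) p[i] = x.v[i];
  }
  inline vec4 add4(vec4 a, vec4 b) {
      vec4 r;
      for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
      return r;
  }
#endif

// Adds two float arrays element-wise, four lanes at a time, with a scalar tail.
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        store_u(out + i, add4(load_u(a + i), load_u(b + i)));
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

If one of the headers above is missing a single intrinsic, adding a one-line wrapper in this style is probably easier than switching headers.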
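// Approximate reciprocal: hardware estimate refined by one Newton-Raphson step.
RETf RCP(const __m128 x) {
    __m128 recip = vrecpeq_f32(x);
    recip = vmulq_f32(recip, vrecpsq_f32(recip, x));  // vrecpsq_f32(a, b) = 2 - a*b
    return recip;
}
// Square root; vsqrtq_f32 is only available on AArch64.
RETf SQRT(const __m128 x) {
    return vsqrtq_f32(x);
}
// This version is more accurate: estimate refined by two Newton-Raphson steps.
RETf RCPSQRT(const __m128 x) {
    __m128 e = vrsqrteq_f32(x);
    e = vmulq_f32(e, vrsqrtsq_f32(x, vmulq_f32(e, e)));  // vrsqrtsq_f32(a, b) = (3 - a*b) / 2
    e = vmulq_f32(e, vrsqrtsq_f32(x, vmulq_f32(e, e)));
    return e;
}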
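For reference, the refinement lines in RCP and RCPSQRT are Newton-Raphson steps. The NEON step intrinsics compute `vrecpsq_f32(a, b) = 2 - a*b` and `vrsqrtsq_f32(a, b) = (3 - a*b) / 2`, which match the update rules below (here y_n is the current estimate and x the input; the notation is mine, not from the linked headers):

```latex
\begin{aligned}
\text{reciprocal:}\quad & f(y) = \frac{1}{y} - x, & y_{n+1} &= y_n\,(2 - x\,y_n) \\
\text{reciprocal square root:}\quad & f(y) = \frac{1}{y^{2}} - x, & y_{n+1} &= y_n\,\frac{3 - x\,y_n^{2}}{2}
\end{aligned}
```

In both cases the relative error is roughly squared by each step (for the reciprocal, an estimate (1 - ε)/x becomes (1 - ε²)/x), which is why the two-step RCPSQRT is noticeably more accurate than the raw estimate.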
On PC (x86):
RETf RCPSQRT(const simdqf x) { return _mm_rsqrt_ps(x); }
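Note that the PC version returns the raw `_mm_rsqrt_ps` estimate (roughly 12 bits of precision), while the NEON RCPSQRT above applies two Newton-Raphson steps, so the two builds can give slightly different results. A minimal sketch of one way to bring them closer, assuming a few extra multiplies on x86 are acceptable, is to refine the SSE estimate with one step as well. The names `rsqrt4_refined` and `max_rel_error` are hypothetical helpers for illustration only.

```cpp
#include <cmath>
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
  // NEON: rough estimate, then two Newton-Raphson steps (same as RCPSQRT above).
  inline void rsqrt4_refined(const float* in, float* out) {
      float32x4_t x = vld1q_f32(in);
      float32x4_t e = vrsqrteq_f32(x);
      e = vmulq_f32(e, vrsqrtsq_f32(x, vmulq_f32(e, e)));
      e = vmulq_f32(e, vrsqrtsq_f32(x, vmulq_f32(e, e)));
      vst1q_f32(out, e);
  }
#elif defined(__SSE__) || defined(_M_X64)
  #include <xmmintrin.h>
  // SSE: _mm_rsqrt_ps estimate, then one step e = e * (1.5 - 0.5 * x * e * e).
  inline void rsqrt4_refined(const float* in, float* out) {
      __m128 x = _mm_loadu_ps(in);
      __m128 e = _mm_rsqrt_ps(x);
      __m128 half_x = _mm_mul_ps(_mm_set1_ps(0.5f), x);
      __m128 t = _mm_sub_ps(_mm_set1_ps(1.5f), _mm_mul_ps(half_x, _mm_mul_ps(e, e)));
      _mm_storeu_ps(out, _mm_mul_ps(e, t));
  }
#else
  // Scalar fallback (assumption: used only when neither NEON nor SSE is detected).
  inline void rsqrt4_refined(const float* in, float* out) {
      for (int i = 0; i < 4; ++i) out[i] = 1.0f / std::sqrt(in[i]);
  }
#endif

// Largest relative error of rsqrt4_refined against 1/sqrt(x) over four inputs.
inline float max_rel_error(const float* in) {
    float out[4];
    rsqrt4_refined(in, out);
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i) {
        float exact = 1.0f / std::sqrt(in[i]);
        float err = std::fabs(out[i] - exact) / exact;
        if (err > worst) worst = err;
    }
    return worst;
}
```

One caveat with the SSE path: for an input of 0 the refinement step multiplies 0 by infinity and yields NaN instead of infinity, so inputs that may be zero need separate handling.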