[ARM] Problem with inline asm code for NEON matrix multiplication
#1
Question 
Hi,

I am using an ARM CPU with NEON, VFP and VFPv3, and the inline asm code provided in MatrixGLES.cpp is broken for me.
The code was added in this commit and has not changed since:
https://github.com/xbmc/xbmc/commit/21f1...269c79dba6
but I don't understand what the actual problem is.
Other samples that use NEON for matrix multiplication look very different, e.g.:
http://code.google.com/p/math-neon/sourc...ath_mat4.c

But the code used in XBMC looks very much like this:
http://code.google.com/p/vfpmathlibrary/...x_impl.cpp

but that's not NEON, it's VFP. Maybe Apple-specific VFP?

Has anybody been able to use this on a non-Apple device? I would like to replace it with a plain NEON implementation. What do you think?
#2
It's a little bit complicated.

Mnemonics starting with v are UAL (Unified Assembler Language) syntax; mnemonics starting with f are pre-UAL.
NEON is UAL. I believe VFPv3 is also UAL, whereas VFPv2 and below isn't.

The code we have there is VFPv2 code, not NEON.
I am unaware of which devices this code has been tested on, so I can't give further insight.

Also note there is no such thing as Apple-specific VFP code. I would recommend changing your mfloat option from neon/vfpv3 to vfpv2 to see if that helps.
¤ [McGeagh] ¤
#3
Do you mean mfpu? I have no mfloat option set.

But maybe I should submit a patch containing real NEON code? My CPU is an ARMv7 and has no vfp2 feature. IMO the #if __ARM_NEON__ guard should keep out CPUs that don't support NEON.
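
To illustrate the idea, here is a minimal sketch of a guarded NEON path, written with intrinsics from arm_neon.h rather than inline asm. It is not the code in MatrixGLES.cpp and not a finished patch. The function name MatrixMultiply4x4 is hypothetical, and the sketch assumes column-major (OpenGL-style) 4x4 float matrices and an output buffer that does not alias the inputs. When the compiler does not define __ARM_NEON__ (or __ARM_NEON), the plain C++ fallback is compiled instead.

```cpp
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper (not the XBMC function): out = a * b for
// column-major 4x4 float matrices. out must not alias a or b.
static void MatrixMultiply4x4(const float* a, const float* b, float* out)
{
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
  // Load the four columns of a into NEON quad registers.
  const float32x4_t a0 = vld1q_f32(a);
  const float32x4_t a1 = vld1q_f32(a + 4);
  const float32x4_t a2 = vld1q_f32(a + 8);
  const float32x4_t a3 = vld1q_f32(a + 12);

  for (int col = 0; col < 4; ++col)
  {
    // Each result column is a linear combination of a's columns,
    // weighted by the four entries of the matching column of b.
    float32x4_t r = vmulq_n_f32(a0, b[col * 4 + 0]);
    r = vmlaq_n_f32(r, a1, b[col * 4 + 1]);
    r = vmlaq_n_f32(r, a2, b[col * 4 + 2]);
    r = vmlaq_n_f32(r, a3, b[col * 4 + 3]);
    vst1q_f32(out + col * 4, r);
  }
#else
  // Scalar fallback for builds without NEON support.
  for (int col = 0; col < 4; ++col)
  {
    for (int row = 0; row < 4; ++row)
    {
      float sum = 0.0f;
      for (int k = 0; k < 4; ++k)
        sum += a[k * 4 + row] * b[col * 4 + k];
      out[col * 4 + row] = sum;
    }
  }
#endif
}
```

With GCC the NEON branch should only be compiled when NEON is enabled for the target, for example via -mfpu=neon on ARMv7, so the same source builds on CPUs without NEON.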
#4
Please commit some real NEON asm code for the matrix multiplication :)
#5
overflowed wrote: Do you mean mfpu? I have no mfloat option set.

But maybe I should submit a patch containing real NEON code? My CPU is an ARMv7 and has no vfp2 feature. IMO the #if __ARM_NEON__ guard should keep out CPUs that don't support NEON.

Yes, sorry, I meant mfpu... it's been a long day today!

Any code patches are more than welcome, and highly appreciated. Thanks.
¤ [McGeagh] ¤
#6
Fine, I created a ticket with a patch: http://trac.xbmc.org/ticket/11848
#7
And applied. Thanks, overflowed.