{"id":279,"date":"2007-12-02T14:53:15","date_gmt":"2007-12-02T18:53:15","guid":{"rendered":"http:\/\/www.rakkar.org\/blog\/?p=279"},"modified":"2007-12-02T14:53:15","modified_gmt":"2007-12-02T18:53:15","slug":"d3dx-math-functions-very-fast","status":"publish","type":"post","link":"https:\/\/rakkar.org\/blog\/index.php\/2007\/12\/02\/d3dx-math-functions-very-fast\/","title":{"rendered":"D3DX Math functions very fast"},"content":{"rendered":"<p>\t\t\t\tI ran across this<br \/>\n<a HREF=\"http:\/\/cache-www.intel.com\/cd\/00\/00\/01\/76\/17699_code_zohar.pdf\">http:\/\/cache-www.intel.com\/cd\/00\/00\/01\/76\/17699_code_zohar.pdf<\/a> which is a math library using SSE2 to do fast math operations.  I spent a lot of time upgrading his code to be more suitable to games, such as doing a transpose for an inverse of an orthonormal matrix.  But after profiling I found his code was slower than the equivalent D3DX math functions, even for matrix multiply.<\/p>\n<p>D3DXMatrixMultiply 10000000 times:<br \/>\ndiffGP=564 milliseconds diffDX=396 milliseconds<\/p>\n<p>Bleh.<\/p>\n<p>I also tried this: http:\/\/www.cs.nmsu.edu\/CSWS\/techRpt\/2003-003.ps<\/p>\n<p>It appears that D3DX already does better than this as well:<br \/>\nTheirs=695 milliseconds Mine=801 milliseconds<\/p>\n<p>The version without scaling was 100 milliseconds faster, but that is such a special case it&#8217;s not worth leaving in.<\/p>\n<p>So kudos to Direct3D because their math functions are very fast!<\/p>\n<p>By the way, here is something I was able to figure out while experimenting: which matrix elements hold the at, up, and right vectors.  In every library I&#8217;ve ever used, except the one at The Collective, this was very unclear.  
I think they used to always store the matrices transposed to make them easier to use or something.<\/p>\n<p>[code]<br \/>\n\/\/ Each basis vector is read from a column of the upper-left 3x3 block.<br \/>\ninline D3DXVECTOR3 * GetAtVec(D3DXVECTOR3 *out, D3DXMATRIX *in)<br \/>\n{<br \/>\n\tout->x=in->_13;<br \/>\n\tout->y=in->_23;<br \/>\n\tout->z=in->_33;<br \/>\n\treturn out;<br \/>\n}<br \/>\ninline D3DXVECTOR3 * GetUpVec(D3DXVECTOR3 *out, D3DXMATRIX *in)<br \/>\n{<br \/>\n\tout->x=in->_12;<br \/>\n\tout->y=in->_22;<br \/>\n\tout->z=in->_32;<br \/>\n\treturn out;<br \/>\n}<br \/>\ninline D3DXVECTOR3 * GetRightVec(D3DXVECTOR3 *out, D3DXMATRIX *in)<br \/>\n{<br \/>\n\tout->x=in->_11;<br \/>\n\tout->y=in->_21;<br \/>\n\tout->z=in->_31;<br \/>\n\treturn out;<br \/>\n}<br \/>\ninline D3DXVECTOR3 GetAtVec(D3DXMATRIX *in)<br \/>\n{<br \/>\n\treturn D3DXVECTOR3(in->_13,in->_23,in->_33);<br \/>\n}<br \/>\ninline D3DXVECTOR3 GetUpVec(D3DXMATRIX *in)<br \/>\n{<br \/>\n\treturn D3DXVECTOR3(in->_12,in->_22,in->_32);<br \/>\n}<br \/>\ninline D3DXVECTOR3 GetRightVec(D3DXMATRIX *in)<br \/>\n{<br \/>\n\treturn D3DXVECTOR3(in->_11,in->_21,in->_31);<br \/>\n}<br \/>\n[\/code]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I ran across this http:\/\/cache-www.intel.com\/cd\/00\/00\/01\/76\/17699_code_zohar.pdf which is a math library using SSE2 to do fast math operations. I spent a lot of time upgrading his code to be more suitable to games, such as doing a transpose for an inverse of an orthonormal matrix. 
But after profiling I found his code was slower than the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/posts\/279"}],"collection":[{"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=279"}],"version-history":[{"count":0,"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/posts\/279\/revisions"}],"wp:attachment":[{"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=279"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=279"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rakkar.org\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=279"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}