After a long read through the MFC documentation, I finally came up with something that looks like a double-buffered Huffman stream.
The good thing is that all the DMA work didn't compromise decompression speed.
So far, a 1024x768 image takes under 15 ms to decompress.
This is 3 times faster than the scalar prototype on my Athlon.
Not bad for a poor little SPE ;)
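
For readers who haven't met the idea, here is a rough sketch of what a double-buffered input stream looks like. This is not the actual SPE code: it is plain portable C, the "DMA" is a `memcpy` stand-in, and every name in it (`stream_t`, `fetch`, `stream_open`, `stream_next`, `CHUNK`) is made up for illustration. On a real SPE the `fetch` step would be an asynchronous `mfc_get` from `spu_mfcio.h`, plus a tag-status wait before the buffer is first read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 128  /* hypothetical chunk size, chosen only for this sketch */

typedef struct {
    const uint8_t *src;     /* stands in for the compressed data in main memory */
    size_t src_len;
    size_t fetched;         /* bytes already requested from src */
    uint8_t buf[2][CHUNK];  /* the two local buffers */
    size_t len[2];          /* valid bytes in each buffer */
    int cur;                /* index of the buffer being consumed */
    size_t pos;             /* read position inside buf[cur] */
} stream_t;

/* Stand-in for a DMA transfer. A real SPE version would start an
 * asynchronous mfc_get here and return immediately. */
static void fetch(stream_t *s, int which)
{
    size_t n = s->src_len - s->fetched;
    if (n > CHUNK)
        n = CHUNK;
    if (n > 0)
        memcpy(s->buf[which], s->src + s->fetched, n);
    s->len[which] = n;
    s->fetched += n;
}

static void stream_open(stream_t *s, const uint8_t *src, size_t len)
{
    assert(src != NULL || len == 0);
    s->src = src;
    s->src_len = len;
    s->fetched = 0;
    s->cur = 0;
    s->pos = 0;
    fetch(s, 0);  /* the buffer we read first */
    fetch(s, 1);  /* prefetch the one after it */
}

/* Returns the next byte, or -1 at end of stream. */
static int stream_next(stream_t *s)
{
    if (s->pos == s->len[s->cur]) {
        if (s->len[s->cur] == 0)
            return -1;        /* nothing left at all */
        int drained = s->cur;
        s->cur ^= 1;          /* switch to the prefetched buffer */
        s->pos = 0;
        fetch(s, drained);    /* refill the drained one in the background */
        if (s->len[s->cur] == 0)
            return -1;
    }
    return s->buf[s->cur][s->pos++];
}
```

The point of the two buffers is that the decoder keeps reading from one while the other is being refilled, so with a truly asynchronous transfer the wait for data mostly disappears.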
Monday, May 7, 2007
Saturday, May 5, 2007
SIMD rulez!
After doing iDCT + colorspace conversion the SIMD way, I got a nice 13x speedup ;)
The test image now decodes in 1.4 ms vs. 18.5 ms in the scalar version.
Input double buffering and code cleanup still need to be done.
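
To give an idea of the colorspace half of that work, here is a small sketch of YCbCr to RGB conversion arranged four pixels at a time, using the standard JFIF coefficients. It is an illustration only, not the decoder's code: it is plain C with an inner loop over four "lanes", and the names `LANES`, `clamp_u8` and `ycc_to_rgb4` are invented here. A real SPE version would replace the lane loop with vector intrinsics on 4 x float registers.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4  /* one 128-bit register holds four floats */

/* Clamp to 0..255 and round to nearest. */
static uint8_t clamp_u8(float v)
{
    if (v < 0.0f)
        return 0;
    if (v > 255.0f)
        return 255;
    return (uint8_t)(v + 0.5f);
}

/* Convert four pixels at once. Inputs and outputs are kept as separate
 * planes (Y, Cb, Cr in; R, G, B out) so that each arithmetic step maps
 * onto one vector operation over all four lanes. */
static void ycc_to_rgb4(const uint8_t y[LANES], const uint8_t cb[LANES],
                        const uint8_t cr[LANES],
                        uint8_t r[LANES], uint8_t g[LANES], uint8_t b[LANES])
{
    for (int i = 0; i < LANES; i++) {
        float fy = (float)y[i];
        float fcb = (float)cb[i] - 128.0f;
        float fcr = (float)cr[i] - 128.0f;
        r[i] = clamp_u8(fy + 1.402f * fcr);
        g[i] = clamp_u8(fy - 0.344136f * fcb - 0.714136f * fcr);
        b[i] = clamp_u8(fy + 1.772f * fcb);
    }
}
```

Keeping the data in planes rather than interleaved RGB triples is what makes the lane loop trivially vectorizable; the interleaving can be done once at the end when the pixels are written out.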