Resample And Composite Engine II (RACE II)
The RACE I and RACE II architectures are high-performance engines designed to be extremely memory efficient: they are the only architectures that access a voxel only once per projection, regardless of the degree of over-sampling. Furthermore, they can potentially exhibit frame-to-frame coherence for multiple view positions. Previous DVR solutions rely primarily on one of two forms of acceleration: 1) sample reduction or 2) large sample throughput (sample-memory efficiency, defined in [3]). We define sample-memory efficiency as the ability to render the dataset as fast as it can be read from the voxel-memory subsystem when voxels map one-to-one to samples and there is no sample reduction. This is the theoretical maximum performance for rendering volumes with one-to-one sampling and no sample reduction. To date, DVR architectures have only been able to make effective use of one form of acceleration. By combining both forms, the RACE II architecture is expected to achieve a two-fold speedup over other solutions. The RACE I architecture will achieve higher average perspective performance than other solutions that have more than double the voxel bandwidth of the RACE engine.

To achieve a new level of performance, RACE implements a truly hybrid algorithm. In a nutshell, it uses object-order control over image-order rendering units. This allows the RACE architectures to trivially handle perspective projections and sample-reduction techniques such as Early-Ray Termination (ERT) and space-leaping. RACE as presented in [4] utilizes only early-ray termination. RACE II supports space-leaping. Based on simulation results, the RACE architecture will achieve higher average perspective rendering performance than currently available solutions while using anywhere from 33-75% less bandwidth than the other solutions (see table below).
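The two sample-reduction techniques named above can be illustrated with a minimal software sketch of front-to-back compositing along a single ray. This is a generic textbook formulation, not the RACE hardware pipeline: the function name `composite_ray`, the scalar (grayscale) colour, the default threshold of 0.95, and the per-sample empty check standing in for space-leaping are all assumptions made for illustration.

```python
def composite_ray(samples, opacity_threshold=0.95):
    """Front-to-back compositing of (color, opacity) samples along one ray.

    Returns (accumulated_color, accumulated_opacity, samples_composited).
    Hypothetical helper for illustration only.
    """
    color = 0.0
    opacity = 0.0
    composited = 0
    for sample_color, sample_opacity in samples:
        if sample_opacity == 0.0:
            # Stand-in for space-leaping: empty samples contribute nothing
            # and are skipped without any compositing work.
            continue
        transparency = 1.0 - opacity
        color += transparency * sample_opacity * sample_color
        opacity += transparency * sample_opacity
        composited += 1
        if opacity >= opacity_threshold:
            # Early-ray termination: everything behind this point is hidden.
            break
    return color, opacity, composited


# One empty sample, one half-transparent sample, then two opaque samples.
result = composite_ray([(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (0.9, 1.0)])
print(result)  # (0.75, 1.0, 2)
```

Only two of the four samples are composited here: the first is skipped as empty, and the last is never reached because the ray is already opaque.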
RACE II renders 256x256x128 volumes at an average of 80-90 Hz and 256x256x256 volumes at an average of 44 Hz, with worst-case rates of 40 Hz and 20 Hz respectively.

Hardware Description (RACE I/RACE II):
Current limitations of OTHER approaches
Image-order architectures:
The figure above shows that the RACE II architecture achieves the largest amount of acceleration when 5-90% of the dataset contributes to the final image. RACE I achieves better aggregate performance (parallel and perspective projections) than the remaining architectures when 12-100% of the dataset contributes to the final image. Volume Pro excels when 90-100% of the dataset contributes to the final image for parallel projections. The two image-order architectures (whose plots overlap) achieve the most acceleration when less than 5% of the dataset contributes to the final image. Since VIZARD II currently uses only early-ray termination, it is unlikely that less than 70% of the dataset will contribute to the final image at reasonable opacity thresholds. However, the architecture that utilizes space-leaping (called the VG Engine) may reach as low as 10% under certain conditions (i.e., binary opacity classifications of sparse datasets). The RACE engines provide robust acceleration for general-purpose Volume Graphics and Volume Visualization, since they provide the nearest-to-optimal acceleration for over 85% of the range. As a result of efficient voxel-sample processing, we can achieve comparable or better performance than other solutions using anywhere from 33-75% less voxel bandwidth.
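To make "the percentage of the dataset that contributes to the final image" concrete, the sketch below counts, over a set of rays, the samples that are non-empty and are reached before the ray saturates. It is an illustrative software model under stated assumptions, not the measurement used for the figure: the function name `contributing_fraction`, the list-of-opacities input, and the 0.95 threshold are all hypothetical.

```python
def contributing_fraction(rays, opacity_threshold=0.95):
    """Fraction of samples that are non-empty and reached before termination.

    `rays` is a list of per-ray opacity lists, ordered front to back.
    Hypothetical helper for illustration only.
    """
    total = 0
    contributing = 0
    for opacities in rays:
        total += len(opacities)
        accumulated = 0.0
        for alpha in opacities:
            if accumulated >= opacity_threshold:
                break  # early-ray termination: the rest of the ray is hidden
            if alpha > 0.0:
                contributing += 1
                accumulated += (1.0 - accumulated) * alpha
    return contributing / total if total else 0.0


# A sparse, binary-classified toy volume: one ray hits an opaque surface,
# the other passes through empty space only.
rays = [[0.0, 1.0, 0.5, 0.5], [0.0, 0.0, 0.0, 0.0]]
print(contributing_fraction(rays))  # 0.125
```

With binary opacities and mostly empty space, only one of the eight samples contributes, which is the kind of condition under which a space-leaping architecture approaches the low end of the range.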
DVR Architecture Comparison

[Table layout not recoverable; surviving cells in source order]
RACE II
(16M samples per frame)
(opaque object) 28 Hz (single-layer transparency) 17 Hz (multiple-layer semi-transparent)
44 Hz (RACE II)
100% w/ Space-leaping (RACE II)
1.7 (RACE II)
Ray Casting
>32 GVoxel/second, >5120 MB, >160 memory units
4 GVoxel/second, 256 MB, 4 memory units
25 GVoxel/second, 4096 MB, 128 memory units
13 GVoxel/second, 2048 MB, 64 memory units
20 pipelines, 4 GVoxel/second, 256 MB units, 4 memory units (RACE II)

parallel rendered (40 Hz)
perspective rendered (48 Hz)

CT dataset parallel rendered (80 Hz)
CT dataset perspective rendered (85 Hz)

synthetic dataset parallel rendered (84 Hz)
synthetic dataset perspective rendered (89 Hz)

2) R. Osborne, H. Pfister, H. Lauer, N. McKenzie, S. Gibson, W. Hiatt, and T. Ohkami, EM-CUBE: An Architecture for Low-Cost Real-Time Volume Rendering. In Proceedings of the Siggraph/Eurographics Workshop on Graphics Hardware, pages 131-138, Los Angeles, August 1997.
5) B. Vettermann, J. Hesser, and R. Männer, Solving the Hazard Problem for Algorithmically Optimized Real-Time Volume Rendering. In International Workshop on Volume Graphics, March 1999.