A typical 3D data set is a group of 2D slice images acquired by a CT, MRI, or MicroCT scanner. Usually these are acquired in a regular pattern (e.g., one slice every millimeter) and usually have a regular number of image pixels in a regular pattern. This is an example of a regular volumetric grid, with each volume element, or voxel represented by a single value that is obtained by sampling the immediate area surrounding the voxel.
To render a 2D projection of the 3D data set, one first needs to define a camera in space relative to the volume. Also, one needs to define the opacity and color of every voxel. This is usually defined using an RGBA (for red, green, blue, alpha) transfer function that defines the RGBA value for every possible voxel value.
For example, a volume may be viewed by extracting isosurfaces (surfaces of equal values) from the volume and rendering them as polygonal meshes or by rendering the volume directly as a block of data. The marching cubes algorithm is a common technique for extracting an isosurface from volume data. Direct volume rendering is a computationally intensive task that may be performed in several ways.
Volume rendering is distinguished from thin slice tomography presentations, and is also generally distinguished from projections of 3D models, including maximum intensity projection. Still, technically, all volume renderings become projections when viewed on a 2-dimensional display, making the distinction between projections and volume renderings a bit vague. Nevertheless, the epitomes of volume rendering models feature a mix of for example coloring and shading in order to create realistic and/or observable representations.
A direct volume renderer requires every sample value to be mapped to opacity and a color. This is done with a "transfer function" which can be a simple ramp, a piecewise linear function or an arbitrary table. Once converted to an RGBA color model (for red, green, blue, alpha) value, the composed RGBA result is projected on the corresponding pixel of the frame buffer. The way this is done depends on the rendering technique.
A combination of these techniques is possible. For instance, a shear warp implementation could use texturing hardware to draw the aligned slices in the off-screen buffer.
The technique of volume ray casting can be derived directly from the rendering equation. It provides results of very high quality, usually considered to provide the best image quality. Volume ray casting is classified as image based volume rendering technique, as the computation emanates from the output image, not the input volume data as is the case with object based techniques. In this technique, a ray is generated for each desired image pixel. Using a simple camera model, the ray starts at the center of projection of the camera (usually the eye point) and passes through the image pixel on the imaginary image plane floating in between the camera and the volume to be rendered. The ray is clipped by the boundaries of the volume in order to save time. Then the ray is sampled at regular or adaptive intervals throughout the volume. The data is interpolated at each sample point, the transfer function applied to form an RGBA sample, the sample is composited onto the accumulated RGBA of the ray, and the process repeated until the ray exits the volume. The RGBA color is converted to an RGB color and deposited in the corresponding image pixel. The process is repeated for every pixel on the screen to form the completed image.
This is a technique which trades quality for speed. Here, every volume element is splatted, as Lee Westover said, like a snow ball, on to the viewing surface in back to front order. These splats are rendered as disks whose properties (color and transparency) vary diametrically in normal (Gaussian) manner. Flat disks and those with other kinds of property distribution are also used depending on the application.
The shear warp approach to volume rendering was developed by Cameron and Undrill, popularized by Philippe Lacroute and Marc Levoy. In this technique, the viewing transformation is transformed such that the nearest face of the volume becomes axis aligned with an off-screen image data buffer with a fixed scale of voxels to pixels. The volume is then rendered into this buffer using the far more favorable memory alignment and fixed scaling and blending factors. Once all slices of the volume have been rendered, the buffer is then warped into the desired orientation and scaled in the displayed image.
This technique is relatively fast in software at the cost of less accurate sampling and potentially worse image quality compared to ray casting. There is memory overhead for storing multiple copies of the volume, for the ability to have near axis aligned volumes. This overhead can be mitigated using run length encoding.
Many 3D graphics systems use texture mapping to apply images, or textures, to geometric objects. Commodity PC graphics cards are fast at texturing and can efficiently render slices of a 3D volume, with real time interaction capabilities. Workstation GPUs are even faster, and are the basis for much of the production volume visualization used in medical imaging, oil and gas, and other markets (2007). In earlier years, dedicated 3D texture mapping systems were used on graphics systems such as Silicon Graphics InfiniteReality, HP Visualize FX graphics accelerator, and others. This technique was first described by Bill Hibbard and Dave Santek.
These slices can either be aligned with the volume and rendered at an angle to the viewer, or aligned with the viewing plane and sampled from unaligned slices through the volume. Graphics hardware support for 3D textures is needed for the second technique.
Volume aligned texturing produces images of reasonable quality, though there is often a noticeable transition when the volume is rotated.
Due to the extremely parallel nature of direct volume rendering, special purpose volume rendering hardware was a rich research topic before GPU volume rendering became fast enough. The most widely cited technology was the VolumePro real-time ray-casting system, developed by Hanspeter Pfister and scientists at Mitsubishi Electric Research Laboratories, which used high memory bandwidth and brute force to render using the ray casting algorithm. The technology was transferred to TeraRecon, Inc. and two generations of ASICs were produced and sold. The VP1000 was released in 2002 and the VP2000 in 2007.
A recently exploited technique to accelerate traditional volume rendering algorithms such as ray-casting is the use of modern graphics cards. Starting with the programmable pixel shaders, people recognized the power of parallel operations on multiple pixels and began to perform general-purpose computing on (the) graphics processing units (GPGPU). The pixel shaders are able to read and write randomly from video memory and perform some basic mathematical and logical calculations. These SIMD processors were used to perform general calculations such as rendering polygons and signal processing. In recent GPU generations, the pixel shaders now are able to function as MIMD processors (now able to independently branch) utilizing up to 1 GB of texture memory with floating point formats. With such power, virtually any algorithm with steps that can be performed in parallel, such as volume ray casting or tomographic reconstruction, can be performed with tremendous acceleration. The programmable pixel shaders can be used to simulate variations in the characteristics of lighting, shadow, reflection, emissive color and so forth. Such simulations can be written using high level shading languages.
The primary goal of optimization is to skip as much of the volume as possible. A typical medical data set can be 1 GB in size. To render that at 30 frame/s requires an extremely fast memory bus. Skipping voxels means that less information needs to be processed.
Often, a volume rendering system will have a system for identifying regions of the volume containing no visible material. This information can be used to avoid rendering these transparent regions.
This is a technique used when the volume is rendered in front to back order. For a ray through a pixel, once sufficient dense material has been encountered, further samples will make no significant contribution to the pixel and so may be neglected.
Image segmentation is a manual or automatic procedure that can be used to section out large portions of the volume that one considers uninteresting before rendering, the amount of calculations that have to be made by ray casting or texture blending can be significantly reduced. This reduction can be as much as from O(n) to O(log n) for n sequentially indexed voxels. Volume segmentation also has significant performance benefits for other ray tracing algorithms. Volume segmentation can subsequently be used to highlight structures of interest.
By representing less interesting regions of the volume in a coarser resolution, the data input overhead can be reduced. On closer observation, the data in these regions can be populated either by reading from memory or disk, or by interpolation. The coarser resolution volume is resampled to a smaller size in the same way as a 2D mipmap image is created from the original. These smaller volume are also used by themselves while rotating the volume to a new orientation.
Pre-integrated volume rendering is a method that can reduce sampling artifacts by pre-computing much of the required data. It is especially useful in hardware-accelerated applications because it improves quality without a large performance impact. Unlike most other optimizations, this does not skip voxels. Rather it reduces the number of samples needed to accurately display a region of voxels. The idea is to render the intervals between the samples instead of the samples themselves. This technique captures rapidly changing material, for example the transition from muscle to bone with much less computation.
Image-based meshing is the automated process of creating computer models from 3D image data (such as MRI, CT, Industrial CT or microtomography) for computational analysis and design, e.g. CAD, CFD, and FEA.
For a complete display view, only one voxel per pixel (the front one) is required to be shown (although more can be used for smoothing the image), if animation is needed, the front voxels to be shown can be cached and their location relative to the camera can be recalculated as it moves. Where display voxels become too far apart to cover all the pixels, new front voxels can be found by ray casting or similar, and where two voxels are in one pixel, the front one can be kept.