
Hardware-assisted Feature Analysis and Visualization of Procedurally Encoded Multifield Volumetric Data

Manfred Weiler∗, University of Stuttgart

Ralf P. Botchen∗, University of Stuttgart

Jingshu Huang†, Purdue University

Yun Jang†, Purdue University

Simon Stegmaier∗, University of Stuttgart

Kelly P. Gaither‡, University of Texas

David S. Ebert†, Purdue University

Thomas Ertl∗, University of Stuttgart

Figure 1: Four visualizations of RBF encoded datasets. Left image: Interactively extracted and volume rendered vorticity from the Tornado dataset encoded with 2,100 RBFs. Second: Traces of 110 particles tracked in the experimentally obtained Channel dataset encoded with 2,105 RBFs. Third image: Volume rendering of water pressure for an injection well. The 156,642 tetrahedra dataset of a simulated black-oil reservoir is encoded using 141 RBFs. Fourth image: Isosurface rendering of vorticity magnitude. Positive helicity has been mapped to red colors and negative helicity to blue colors.

ABSTRACT

Procedural encoding of scattered and unstructured scalar datasets using Radial Basis Functions (RBF) is an active area of research with great potential for compactly representing large datasets. This reduced storage requirement allows the compressed datasets to reside completely in the local memory of the graphics card, thus enabling accurate and efficient processing and visualization without data transfer problems.

We have developed new hierarchical techniques that effectively encode data on arbitrary grids including volumetric scalar, vector, and multifield data. Once the RBF representation is transferred to texture memory, GPU-based visualization using particle advection, cutting planes, isosurfaces, and volume rendering can be performed by functional reconstruction of the encoded data in the fragment pipeline. For the special requirements of flow visualization, we derive the definitions of well-known features in RBF space, allowing us to integrate pixel-accurate hardware-accelerated feature detection and visualization techniques. By eliminating the need for storing and processing mesh information, our approach is particularly attractive for large scattered and irregular structured datasets, as well as datasets created by the emerging field of meshless simulation techniques.

CR Categories: I.3.3 [Computer Graphics]: Scientific Visualization, Feature Detection—Radial Basis Functions

Keywords: procedural encoding, volume rendering, meshless representation, feature detection, RBF, flow visualization

∗e-mail: {weiler|botchen|stegmaier|ertl}@vis.uni-stuttgart.de
†e-mail: {jhuang2|jangy|ebertd}@purdue.edu
‡e-mail: [email protected]

1 INTRODUCTION

Computational power has increased dramatically over the past decade and has allowed computational researchers to more accurately simulate many types of phenomena with added detail and precision. This increase in power has supported the calculation of complex unsteady simulations modeling real-world conditions. These complex simulations typically employ grid-based finite element methods (FEM), finite volume methods (FVM), or finite difference methods (FDM), and these grids are constructed such that they completely fill the domain of interest and obey the necessary laws governing changes of scale. The size and scale of data that is generated and saved by these complex simulations is increasing at an alarming rate, creating a data deluge for analysts wanting to visualize, interactively manipulate, and explore the problem at hand. Moreover, most current visualization techniques are datagrid specific and do not allow scientists and researchers to interactively visualize various unstructured and scattered large-scale datasets in a single system on their desktop computers.

Therefore, we have taken a new approach to interactive visualization and feature detection of large scalar, vector, and multifield CFD datasets that is also well-suited for meshless CFD methods. Previously, both Jang et al. [11] and Co et al. [4] have used Radial Basis Functions (RBFs) to procedurally encode both scattered and irregular gridded scalar datasets. The RBF encoding creates a complete, unified, functional representation of the scalar field throughout three-dimensional space, independent of the underlying data topology, eliminating the need for the original datagrid during visualization. We have extended their work in several ways. First, we have developed techniques for encoding vector and multifield datasets. Second, we have developed efficient feature detection techniques, which utilize the functional representation for efficiency. Additionally, we provide the capability to further refine regions of interest in the data by detecting and rendering features in the RBF space. The capability of commodity PC graphics hardware to accelerate reconstructing, rendering, and performing feature detection from this functional representation provides a very powerful tool for visualizing procedurally encoded volumes. With our RBF encoding and GPU-accelerated reconstruction, feature detection, and visualization tool, we have created a new, flexible system for visually exploring and analyzing large, structured, scattered, and unstructured scalar, vector, and multifield datasets at interactive rates on desktop PCs.

We begin by reviewing related work in RBF encoding (Section 2). We then present our new method for procedurally encoding volumetric scalar and vector fields (Section 3), followed by the mathematical foundations for calculating and detecting features in the radial basis (Section 4). We then discuss our hardware-accelerated rendering and reconstruction system in detail (Section 5) and conclude by presenting results achievable by our system.

2 PREVIOUS WORK

The process of approximating data (either measured or simulated) at given locations in space is one of finding a function $s(x)$ that provides a “good” fit to the given data. Given data points $p_j = (x_j, y_j)$, $j = 1, \ldots, n$ with $x_j \in \Re^3$, $y_j \in \Re^k$, we want to find a continuous function $s$ such that $s(x_j) = y_j$, $j = 1, \ldots, n$. For our purposes, “good” has been defined using information from the application domain, and the accuracy is bounded by a pre-defined error tolerance. The values $x_j$ are the spatial locations, and the values $y_j$ are the data values that exist at the corresponding spatial locations. Using RBFs to find an approximation function $s(x)$ dates back to 1968 when Hardy used multiquadric RBFs to represent topographical surfaces given sets of sparse scattered measurements [8].

RBFs are circularly-symmetric functions centered at a single point. Within computer graphics, RBFs are most often used for compactly representing surface models and for mesh reduction (e.g., [3, 18]). RBFs have also been used for surface construction and rendering of large scattered datasets (e.g., [3, 6]).

The primary advantages of RBFs include their compact description, ability to interpolate and approximate sparse, non-uniformly spaced data, and analytical gradient calculations. Common choices for the RBFs are thin-plate splines, multiquadrics, and Gaussians. Splines have no adjustable parameters and do not have local support, thus leading to a denser system of equations necessary to solve for the function $s$. Inverse multiquadrics have been proven to have a physically relevant foundation [8]. However, Gaussian RBF models offer several advantages. They are concise, robust, and have a regular and smooth behavior outside the fitting domain, providing a localized function through which local data features are preserved. The Gaussians also offer the additional advantage of being less expensive to reconstruct on modern graphics hardware.

3 TECHNIQUES FOR ENCODING SCALAR AND VECTOR FIELDS

The algorithm that we have developed to encode volumetric data is based on the work of Jang et al. [11]. We use Gaussian RBFs because of their previously mentioned advantages. Our data approximation function is of the form:

$$ s(x) = \sum_{i=1}^{N} \lambda_i \, e^{-\frac{\|x-\mu_i\|^2}{2\sigma_i^2}} \tag{1} $$

$\lambda_i$: RBF blending coefficients or Gaussian weights
$\mu_i$: RBF centers
$\sigma_i^2$: Variances or Gaussian widths
$N$: Number of basis functions
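As a minimal sketch of how Equation (1) is evaluated at a single position, the following plain Python function sums the Gaussian contributions. The function name `rbf_value` and the two-Gaussian example values are hypothetical and chosen only for illustration; they are not taken from any of the encoded datasets.

```python
import math

def rbf_value(x, centers, weights, variances):
    """Evaluate s(x) = sum_i lambda_i * exp(-||x - mu_i||^2 / (2 sigma_i^2))."""
    total = 0.0
    for mu, lam, var in zip(centers, weights, variances):
        dist2 = sum((xc - mc) ** 2 for xc, mc in zip(x, mu))
        total += lam * math.exp(-dist2 / (2.0 * var))
    return total

# Hypothetical two-Gaussian encoding (values invented for illustration).
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights = [2.0, -1.0]
variances = [0.5, 0.25]
value_at_origin = rbf_value((0.0, 0.0, 0.0), centers, weights, variances)
```

The cost per evaluation is linear in the number of basis functions, which is why limiting the Gaussians considered per position matters for interactive rendering.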

Our significant improvements on Jang et al. [11] are the encoding techniques for vector fields, the extension of the encoding system to arbitrarily large scalar and vector datasets, and a frequency approach that allows a more accurate capturing of major features in a dataset.

To encode a dataset within a user specified error tolerance, we implement a multi-level encoding scheme based on a data frequency approach to provide a multi-resolution representation. First, the scattered data points are smoothed using a low-pass filter so that we can capture the major structures. We compute the Gaussian functions in Equation (1) using the low-frequency components to position Gaussian centers and remove these low-frequency components from the dataset. For the residual high-frequency characteristics of the data, we utilize principal component analysis (PCA) clustering and fit a Gaussian function into each cluster. We repeatedly split the clusters with largest errors, adding new Gaussian functions, until the user-defined error tolerance is reached.

At each stage of the above process, we perform a least-squares fit to determine the parameters in Equation (1) by minimizing

$$ \psi = \sum_{j=1}^{n} \left[ \sum_{i=1}^{N} \lambda_i \, e^{-\frac{\|x_j-\mu_i\|^2}{2\sigma_i^2}} - y_j \right]^2 \tag{2} $$

$y_j$: Data value of the $j$th data point
$n$: Number of given data points

The least-squares fit is prohibitive in both computation and memory for very large datasets. Therefore, for large datasets, we first decompose the dataset into smaller domains, as described in Section 3.1. We encode each domain (Section 3.2) and combine the resulting RBF representation of each domain to obtain the global RBF representation. The encoding of vector fields is handled differently from scalar fields, as described in Section 3.3. Our resulting algorithm for encoding vector and scalar fields is the following:

1. Perform domain localization on the original dataset to obtain smaller domain datasets with a limited number of data points.

2. Perform a multi-level least-squares fit of the data using Equation (1) in each domain. In each level, the following steps are executed sequentially:

(a) Find the Gaussian centers with filtering and clustering techniques.

(b) Set initial Gaussian weights and widths as described in Section 3.2.

(c) Perform nonlinear optimization on Gaussian widths to minimize least-squares errors in the domain.

(d) Solve the linear system for Gaussian weights.

(e) Compute the errors at all data points in this domain. If the maximal error is less than a predefined threshold, stop encoding the current domain and go to the next domain; otherwise, encode the residual error as a new dataset starting at Step 2a.
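The control flow of Step 2, and in particular the residual loop of Step 2(e), can be sketched as follows. This is only an illustration of the loop structure: Steps (a) to (d) are replaced by deliberately naive stand-ins (the center is placed at the data point with the largest residual, the weight is set to that residual, and the variance is a fixed hypothetical constant), so it is not the filtering, clustering, and optimization procedure described in this section. All names are hypothetical.

```python
import math

def gaussian(x, mu, var):
    d2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    return math.exp(-d2 / (2.0 * var))

def encode_domain(points, values, tol, var=0.01, max_levels=100):
    """Add Gaussians level by level until the maximal residual drops below tol."""
    residual = list(values)
    rbfs = []  # list of (center, weight, variance)
    for _ in range(max_levels):
        # Step (e): check the maximal error over all data points of the domain.
        worst = max(range(len(points)), key=lambda j: abs(residual[j]))
        if abs(residual[worst]) < tol:
            break
        # Naive stand-ins for steps (a)-(d): one Gaussian at the worst point.
        mu, lam = points[worst], residual[worst]
        rbfs.append((mu, lam, var))
        # The residual becomes the dataset for the next level.
        residual = [r - lam * gaussian(p, mu, var)
                    for p, r in zip(points, residual)]
    return rbfs, max(abs(r) for r in residual)

# Hypothetical, widely spaced samples so that each Gaussian acts locally.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
values = [1.0, -0.5, 0.25]
rbfs, max_error = encode_domain(points, values, tol=1e-3)
```

The important design point carried over from the text is that each level encodes what the previous levels left behind, so the encoding can stop as soon as the tolerance is met.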

3.1 Domain Localization

As previously mentioned, encoding large datasets is computationally expensive and the encoding system might be ill-conditioned. We use domain localization to solve this problem. Domain localization was used by Nielson [16], whose method localizes the domain by multiplying by a weight function for each evenly decomposed subvolume. These spatially decomposed subvolumes, however, may generate very sparse subdomains, which do not have enough data points to be fit. Additionally, some subdomains might contain a large number of data points to be encoded.

Therefore, we use a k-d tree to decompose the volume into overlapping subdomains, each with an equal number of data points. We also define the weighting functions for the overlapping regions between subdomains. Users can predefine the maximum number of data points in each domain and the size of the overlap to which the weighting function is applied, according to their available computational power. This domain localization provides a reasonable number of data points for the encoding system. Theoretically, one subdomain is independent of others with zero error encoding. Since our encoding system is an approximation, non-zero error may exist and may be largest in the overlap areas because of the addition of subdomain encoding errors. We address this problem by re-encoding the residual dataset error from the first-round encoded dataset.
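A k-d decomposition of this kind could look roughly like the sketch below. It assumes a median split along the axis of largest extent and models the overlap by growing each leaf's bounding box by a fixed margin; both choices, the function names, and the omission of the weighting functions are assumptions made for illustration.

```python
def kd_leaves(points, max_points):
    """Recursively split at the median until each leaf holds <= max_points."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if len(points) <= max_points:
        return [points]
    # Split along the axis with the largest spatial extent (an assumption).
    axis = max(range(3),
               key=lambda a: max(p[a] for p in points) - min(p[a] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return kd_leaves(ordered[:mid], max_points) + kd_leaves(ordered[mid:], max_points)

def add_overlap(points, leaves, margin):
    """Grow each leaf's bounding box by margin and collect all points inside."""
    subdomains = []
    for leaf in leaves:
        lo = [min(p[a] for p in leaf) - margin for a in range(3)]
        hi = [max(p[a] for p in leaf) + margin for a in range(3)]
        subdomains.append([p for p in points
                           if all(lo[a] <= p[a] <= hi[a] for a in range(3))])
    return subdomains

# Hypothetical example: eight samples along the x-axis.
points = [(float(i), 0.0, 0.0) for i in range(8)]
leaves = kd_leaves(points, max_points=2)
overlapped = add_overlap(points, leaves, margin=1.0)
```

Because the split is at the median rather than at the spatial midpoint, every leaf receives nearly the same number of points regardless of how unevenly the data are distributed.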

3.2 Determining Gaussian Parameters

To define the basis functions, Gaussian centers, variances and weights need to be determined. For selecting Gaussian centers, we used different methods in the low-frequency and high-frequency components of the dataset for increased encoding accuracy. To initially capture the main features of the dataset (low-frequency components), we smooth the dataset with filtering and place centers at the peaks and valleys. The high-frequency components of the data have a very large number of peak and valley spikes, making this approach inappropriate. Therefore, we use PCA clustering [11] for the high-frequency components and fit one Gaussian center per cluster. The Gaussian center is placed at $C = p_j : \|y_j\| = \max \|y_k\|,\ k = 1, 2, \ldots, n$.

To find the variances of the Gaussians, we implement a bound-constrained nonlinear optimization algorithm [1] and perform a global least-squares estimate of the variances, $\sigma_i^2$ in Equation (1), over all the data points with the other parameters fixed. During the data filtering process, Gaussian weights are set to the temporarily smoothed data values at the centers. For each PCA clustering step, Gaussian weights are set to the residual data values¹ at cluster centers. During PCA clustering, the initial estimate of the Gaussian variances in our optimization is set by performing a local least-squares estimate of $\sigma_i^2$ that minimizes errors at the closest 25 neighbors of a Gaussian center. During filtering, the initial estimate of $\sigma_i$ is set to half of the distance between the Gaussian center and its closest neighboring center (filtered peaks and valleys).

Given Gaussian centers and computed widths, we can solve an over-determined linear system for weights. Solving the system directly involves the inverse of a very large matrix and can produce extreme weight values. Therefore, we approximate the solution iteratively. Compared to the solutions from direct methods, the approximation provides reasonable weights and a significant advantage during the reconstruction.

3.3 Vector Encoding Techniques

There are many approaches to encode vector datasets. A simple approach encodes each component of the vector separately, producing 3 separate systems of RBFs (one for each component). We have found that this not only requires a much larger number of basis functions, but that the reconstructed vectors are less accurate since the errors of the individual components may add to create larger errors. We, therefore, encode the vector data as one three-valued quantity, computing the error of the encoding based on the vector error. For more accurate encoding, we calculate separate weights and variances for each vector component, but not separate RBF center locations. We experimented with using a single weight and variance for each Gaussian RBF, but found that the encoding errors were much larger or that we needed many more basis functions. Therefore, we store one Gaussian center, three weights, and three variances for each RBF.

¹ Original data values less the already encoded data values.
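The storage scheme of one center, three weights, and three variances per RBF can be pictured with the following sketch. The `VectorRBF` record and the `reconstruct_vector` function are hypothetical names introduced here; the reconstruction simply applies Equation (1) per component with a shared center.

```python
import math
from typing import NamedTuple, Tuple

class VectorRBF(NamedTuple):
    center: Tuple[float, float, float]     # shared by all three components
    weights: Tuple[float, float, float]    # one weight per vector component
    variances: Tuple[float, float, float]  # one variance per vector component

def reconstruct_vector(x, rbfs):
    """Apply Equation (1) per component, reusing the squared distance."""
    v = [0.0, 0.0, 0.0]
    for rbf in rbfs:
        d2 = sum((a - b) ** 2 for a, b in zip(x, rbf.center))
        for c in range(3):
            v[c] += rbf.weights[c] * math.exp(-d2 / (2.0 * rbf.variances[c]))
    return tuple(v)

# Hypothetical single-RBF field (values invented for illustration).
field = [VectorRBF((0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (0.5, 1.0, 2.0))]
v_center = reconstruct_vector((0.0, 0.0, 0.0), field)
v_offset = reconstruct_vector((1.0, 0.0, 0.0), field)
```

Sharing the center means the squared distance is computed once per RBF and reused for all three components, which is the part of the evaluation that does not depend on the component.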

3.4 Error Measurements

To evaluate the quality of the encoding, we compute percentage errors at each data point with respect to the original maximal value in a dataset and analyze the error distribution histograms. For scalar fields, mean percentage errors and maximal percentage errors are defined as:

$$ e_{\max} = \max |e_j|, \quad e_{\mathrm{mean}} = \frac{1}{n} \sum_{j=1}^{n} |e_j|, \quad e_j = \frac{y_j - y'_j}{y_{\max}} \tag{3} $$

$y'_j$: RBF reconstructed value at the $j$th data point
$y_{\max}$: Maximal absolute data value in the original dataset
$n$: Number of data points

For vector fields, the error measurements are defined similarly to Equation (3) for the x-, y-, and z-components, respectively. The percentage error for the norm of a vector is $e_{j,\mathrm{norm}} = \|e_j\| = \sqrt{e_{j,x}^2 + e_{j,y}^2 + e_{j,z}^2}$, where $e_{j,x}$, $e_{j,y}$, $e_{j,z}$ are the percentage errors of the x-, y-, z-components at the $j$th data point.
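The error measures of Equation (3) and the vector norm error translate directly into code. The sketch below returns the errors as fractions of the maximal absolute data value; the function names and the sample numbers are hypothetical.

```python
import math

def scalar_errors(original, reconstructed):
    """Return (e_max, e_mean) as defined in Equation (3)."""
    y_max = max(abs(y) for y in original)
    errors = [(y - yr) / y_max for y, yr in zip(original, reconstructed)]
    e_max = max(abs(e) for e in errors)
    e_mean = sum(abs(e) for e in errors) / len(errors)
    return e_max, e_mean

def vector_norm_error(e_x, e_y, e_z):
    """Norm of the per-component errors at one data point."""
    return math.sqrt(e_x ** 2 + e_y ** 2 + e_z ** 2)

# Hypothetical data values and reconstructions.
e_max, e_mean = scalar_errors([2.0, -4.0, 1.0], [2.0, -3.0, 1.5])
```

Multiplying the returned values by 100 gives the percentages referred to in the text.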

4 FEATURE DETECTION IN THE RADIAL BASIS

Feature detection provides a powerful means of automatically identifying regions of interest. A great deal of work has been performed in defining and detecting analytically based features in computational fluid dynamics data (e.g., [10, 12, 14]). For the purposes of this research, we have chosen to compute features that can be analytically described and those for which computation in RBF space is both possible and beneficial. All of the features can be derived through the linear combination of a subset of fundamental operators, each of which has roots in vector calculus.

This base set of operators has the ability to compute properties such as magnitude, divergence, curl, and partial derivatives of multivariate functions. The magnitude operator shown in Equation (4) is the most straightforward operator to implement and can be applied to scalar and vector values alike. When applied to scalars, the result is equivalent to the absolute value.

$$ \|\vec{V}\| = \|(V_x, V_y, V_z)\| = \sqrt{\vec{V}^2} \tag{4} $$

Divergence, or the tendency of a vector field to diverge from a point, is shown in Equation (5). If $\vec{V}$ represents the velocity field of a flowing fluid, then $\mathrm{div}(\vec{V})$ or $\nabla \cdot \vec{V}$ represents the net rate of change of the mass of the fluid flowing from a given point per unit volume. If the divergence of a field is zero then $\vec{V}$ is incompressible.

$$ \nabla \cdot \vec{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z} \tag{5} $$

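The curl operator shown in Equation (6) represents the tendency of particles at the point $(x, y, z)$ to rotate about the axis that points in the direction of $\nabla \times \vec{V}$. If $\vec{V}$ once again represents the velocity field of a flowing fluid, then computing $\nabla \times \vec{V}$ shows whether or not the flow is irrotational ($\nabla \times \vec{V} = 0$). With $\vec{i}$, $\vec{j}$, and $\vec{k}$ denoting the unit base vectors of the Cartesian coordinate space, curl can be expressed as follows:

$$ \mathrm{curl}(\vec{V}) = \left( \frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z} \right) \vec{i} + \left( \frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x} \right) \vec{j} + \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) \vec{k} \tag{6} $$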

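The multivariate derivative function, commonly referred to as the Jacobian matrix, is shown in Equation (7). If $\vec{V}$ represents a velocity field in a moving fluid, then $J$ represents the velocity gradient tensor. The determinant of the Jacobian matrix represents the transformation of one volume unit from one coordinate space to another.

$$ J = \begin{pmatrix} \frac{\partial V_x}{\partial x} & \frac{\partial V_x}{\partial y} & \frac{\partial V_x}{\partial z} \\ \frac{\partial V_y}{\partial x} & \frac{\partial V_y}{\partial y} & \frac{\partial V_y}{\partial z} \\ \frac{\partial V_z}{\partial x} & \frac{\partial V_z}{\partial y} & \frac{\partial V_z}{\partial z} \end{pmatrix} \tag{7} $$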

We can construct linear combinations of these operators and create compound equations that represent features of interest. Because they are constructed from the preceding operators, all of these equations include the calculation of a partial derivative. Rather than using numerical approximation to compute these partials, we can compute them directly in the RBF space. By applying Kansa’s method of collocation [13], we can compute the partial derivatives of the function in Equation (1) directly at any location in the volume as:

$$ \frac{\partial V_x}{\partial x} = -\sum_{i=1}^{N} \frac{x - \mu_i}{\sigma_{i,x}^2} \, \lambda_{i,x} \, e^{-\frac{\|x-\mu_i\|^2}{2\sigma_{i,x}^2}} \tag{8} $$

The partials with respect to y and z can be computed accordingly.
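To make the analytic differentiation concrete, the sketch below assembles the full Jacobian of Equation (7) from terms of the form of Equation (8) and derives divergence and curl from it. It reads the numerator $x - \mu_i$ in Equation (8) as the difference of the coordinates along the differentiation axis, and it assumes the storage of one center, three weights, and three variances per RBF from Section 3.3. All function names and sample values are hypothetical.

```python
import math

def velocity(x, rbfs):
    """Reconstruct V(x); rbfs is a list of (center, weights, variances)."""
    v = [0.0, 0.0, 0.0]
    for mu, lam, var in rbfs:
        d2 = sum((x[d] - mu[d]) ** 2 for d in range(3))
        for c in range(3):
            v[c] += lam[c] * math.exp(-d2 / (2.0 * var[c]))
    return v

def jacobian(x, rbfs):
    """J[c][d] = dV_c/dx_d, evaluated analytically as in Equation (8)."""
    jac = [[0.0] * 3 for _ in range(3)]
    for mu, lam, var in rbfs:
        diff = [x[d] - mu[d] for d in range(3)]
        d2 = sum(t * t for t in diff)
        for c in range(3):
            g = lam[c] * math.exp(-d2 / (2.0 * var[c]))
            for d in range(3):
                jac[c][d] -= diff[d] / var[c] * g
    return jac

def divergence(jac):
    return jac[0][0] + jac[1][1] + jac[2][2]  # Equation (5)

def curl(jac):
    return (jac[2][1] - jac[1][2],  # Equation (6), i-component
            jac[0][2] - jac[2][0],  # j-component
            jac[1][0] - jac[0][1])  # k-component

# Hypothetical two-RBF vector field.
rbfs = [((0.0, 0.0, 0.0), (1.0, -2.0, 0.5), (1.0, 0.5, 2.0)),
        ((1.0, 0.5, -0.5), (0.3, 0.7, -1.1), (0.8, 1.2, 0.6))]
J = jacobian((0.3, -0.2, 0.4), rbfs)
```

A quick way to validate such an implementation is to compare each entry of `J` against a central finite difference of `velocity`, which should agree up to discretization error.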

4.1 Critical Point Detection

Three-dimensional vector fields are difficult to visualize and even more difficult to analyze if the vector values exist on anything other than a regularly structured grid. One method of analyzing the volumetric vector field is through the analysis of the vector field topology. The vector field topology consists of key points, curves, and surfaces that, when combined, characterize the integral manifolds. With a few exceptions, all integral manifolds must begin and end at zeros in the vector field or at the boundaries. These zeros form the critical manifolds, allowing us to characterize the flow in the areas surrounding these critical points [10].

If we compute the eigenvectors and eigenvalues of the Jacobian matrix shown in Equation (7) at every point, we can classify the underlying flow topology and identify the critical points present in the volume [9]. A positive real part indicates a repelling direction, a negative real part indicates an attracting direction, and an imaginary part denotes circulation. A purely repelling node, or a purely attracting node, is denoted by all three eigenvalues being real and having the same sign. Identifying these critical points allows the user to further examine the regions in which there is a greater probability of a region of interest.
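The classification rules stated above can be written down as a small function operating on the three eigenvalues of the Jacobian. The eigenvalue computation itself is left out; the function name, the returned dictionary keys, and the tolerance `tol` are assumptions made for this sketch.

```python
def describe_critical_point(eigenvalues, tol=1e-12):
    """Summarize the flow behavior implied by three Jacobian eigenvalues."""
    eigs = [complex(ev) for ev in eigenvalues]
    # Positive real part: repelling direction; negative: attracting direction.
    repelling = sum(1 for ev in eigs if ev.real > tol)
    attracting = sum(1 for ev in eigs if ev.real < -tol)
    # A non-zero imaginary part denotes circulation.
    circulation = any(abs(ev.imag) > tol for ev in eigs)
    # Pure node: all eigenvalues real and of the same sign.
    pure_node = (not circulation) and (repelling == 3 or attracting == 3)
    return {"repelling": repelling, "attracting": attracting,
            "circulation": circulation, "pure_node": pure_node}

# Hypothetical eigenvalue sets.
node = describe_critical_point([1.0, 2.0, 0.5])
spiral = describe_critical_point([-1 + 2j, -1 - 2j, 0.5])
```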

4.2 Helicity

By computing the helicity of a velocity field, we can examine the potential for helical flow, or flow that appears to move in a corkscrew pattern. Helicity is computed using Equation (9), and physically represents the curl in the direction of the velocity field.

$$ \mathrm{helicity} = (\nabla \times \vec{V}) \cdot \vec{V} \tag{9} $$

If the fluid moves in a dominant streamwise direction, then helicity looks similar to vorticity, discussed in the next section. However, if the flow is not dominated by a single direction, then the helicity will provide interesting and different results than those obtained by computing and analyzing either curl or the vorticity.
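Equation (9) can be evaluated from a velocity sample and its Jacobian, as in the following sketch. The example uses a hypothetical analytic corkscrew flow $\vec{V} = (-y, x, \pm 1)$ rather than an RBF-encoded dataset, so the Jacobian is written out by hand; the function name is likewise an assumption.

```python
def helicity(velocity, jac):
    """(curl V) . V, with jac[c][d] = dV_c/dx_d as in Equation (7)."""
    curl = (jac[2][1] - jac[1][2],
            jac[0][2] - jac[2][0],
            jac[1][0] - jac[0][1])
    return sum(c * v for c, v in zip(curl, velocity))

# Jacobian of the hypothetical flow V = (-y, x, w) for constant axial speed w.
jac = [[0.0, -1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
h_up = helicity((0.0, 1.0, 1.0), jac)     # sample at (1, 0, z), w = +1
h_down = helicity((0.0, 1.0, -1.0), jac)  # same rotation, w = -1
```

The sign flips with the direction of the axial flow relative to the rotation, which is the property exploited when positive and negative helicity are mapped to different colors.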

4.3 Vortex Detection

Jeong and Hussain [12] proposed a method for identifying a vortex that is commonly referred to as the $\lambda_2$-definition. To detect vortices, they decompose $J$ into its symmetric part, the strain-rate tensor $S$, and its antisymmetric part, the spin tensor $\Omega$. For vortex detection, we need only consider the contribution from $S^2 + \Omega^2$, where:

$$ S = \frac{J + J^T}{2}, \quad \Omega = \frac{J - J^T}{2} \quad \text{with } J = \nabla \vec{V}. \tag{10} $$

The vortex is defined as a connected region where $S^2 + \Omega^2$ has two negative eigenvalues. If we let $\lambda_1$, $\lambda_2$, and $\lambda_3$ be the eigenvalues such that $\lambda_1 \geq \lambda_2 \geq \lambda_3$, then we can say a point belongs to a vortex core if $\lambda_2$ is negative.

4.4 Shock Detection

Shocks represent a large class of features which more broadly are represented by connected regions of sharp discontinuities. We can identify potential shock regions and further classify them by methods first defined in [15]. We can calculate the quantities:

$$ \begin{aligned} E_1 &= \max\left( (\vec{V} / |\vec{V}|) \cdot \nabla U,\ 0 \right) \\ E_2 &= \min\left( (\vec{V} / |\vec{V}|) \cdot \nabla U,\ 0 \right) \\ E_3 &= \left| \nabla U - (\vec{V} / |\vec{V}|)\, \vec{V} \cdot \nabla U \right| \end{aligned} \tag{11} $$

$\vec{V}$ denotes the velocity field and $U$ can represent either pressure $P$, density $\rho$, or Mach number $M$. If $U = P$ and $E_1 > 0$, then the point is part of a compression shock, otherwise if $E_2 < 0$, the point is part of an expansion shock. If $U = \rho$ and $E_1 > 0$, then the point is part of an expansion shock, otherwise if $E_2 < 0$, the point is part of a compression shock. If $U = M$ and $E_1 > 0$, then the point is part of an expansion shock, otherwise if $E_2 < 0$, the point is part of a compression shock. The value $E_3$ represents shear shock orthogonal to the flow direction.
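The following sketch computes $E_1$ and $E_2$ from Equation (11) and applies the classification rules exactly as stated in the paragraph above. $E_3$ is omitted, the handling of zero velocity is an assumption, and the function names and the string labels `"P"`, `"rho"`, and `"M"` are hypothetical.

```python
import math

def shock_indicators(velocity, grad_u):
    """Return (E1, E2): the clamped derivative of U along the flow direction."""
    speed = math.sqrt(sum(v * v for v in velocity))
    if speed == 0.0:
        return 0.0, 0.0  # assumption: no flow direction, so no indicator
    directional = sum(v / speed * g for v, g in zip(velocity, grad_u))
    return max(directional, 0.0), min(directional, 0.0)

def classify_shock(quantity, e1, e2):
    """Apply the rules of the text; quantity is 'P', 'rho', or 'M'."""
    if quantity == "P":
        positive, negative = "compression", "expansion"
    elif quantity in ("rho", "M"):
        positive, negative = "expansion", "compression"
    else:
        raise ValueError("unknown quantity: " + quantity)
    if e1 > 0.0:
        return positive
    if e2 < 0.0:
        return negative
    return None  # gradient has no component along the flow

# Hypothetical sample: flow along +x, U increasing along +x.
e1, e2 = shock_indicators((2.0, 0.0, 0.0), (3.0, 1.0, 0.0))
label = classify_shock("P", e1, e2)
```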

5 INTERACTIVE RENDERING

All of our rendering techniques are based on the capability of modern graphics adapters to perform arbitrary arithmetic operations on the GPU. The principal idea is presented in Figure 2. We exploit the high memory bandwidth and parallel processing capability of modern graphics hardware by downloading the coefficients of the RBF encoding as texture maps to the graphics card. This allows the fragment unit to access all data required for the RBF reconstruction and to perform the decoding during rasterization.

Since the RBFs are evaluated by the GPU for each rendered fragment, the encoding of the data is hidden from the rendering and, therefore, our approach allows for a variety of visualization algorithms. For RBF encoded scalar fields, direct volume rendering, volume-rendered isosurfaces and arbitrarily oriented cutting planes have been demonstrated in [11]. Further possibilities include the mapping of the reconstructed data onto the surface of related geometry, e.g., color coded pressure on the body of an airplane. Reconstruction and visualization of vector data, however, requires not only more sophisticated data handling, but even more importantly, vector field specific visualization techniques as demonstrated in Sections 5.3 and 5.4.

In this paper we present our new RBF-based visualization and feature detection system for the nVidia GeForce6 chip series, which is the first graphics chip to provide dynamic branching on the fragment level, allowing for fragment programs with up to 65535 instructions. This feature enables us to overcome certain limitations of the solution presented in [11]. First of all, the need for multipass rendering is eliminated in most cases since the RBF reconstruction, the feature detection, and the final mapping can be computed in one single fragment program. Moreover, this dynamic branching allows us to adapt the fragment program to the number of basis functions per cell without expensive switching between different shaders with fixed numbers of accumulation steps. The latter not only avoids sophisticated shader maintenance but also simplifies the texture layout for the RBF data as described in Section 5.1.

5.1 Texture Layout

The performance of the GPU-based RBF reconstruction heavily depends on the number of basis functions that have to be evaluated per fragment. Fortunately, due to their limited spatial extent, only a subset of all Gaussian RBFs in the encoding contribute to a single fragment and, therefore, need to be considered in the reconstruction process. In order to efficiently access these subsets, we exploit a spatial decomposition of the data domain into cells which can be rendered independently. We use an octree-like hierarchical structure splitting each cell into eight subcells. For each cell in the tree, the list of contributing basis functions is calculated by determining if their radius of influence $r_i$ intersects the cell. Solving Equation (1) for the Gaussian basis function yields

$$ r_i = \sigma_i \cdot \sqrt{2 \cdot \ln\left( \frac{|\lambda_i|}{\varepsilon} \right)} \tag{12} $$

where $\varepsilon$ is a user defined error tolerance. Our subdivision terminates when the number of basis functions per cell is less than a threshold $n$ or when further subdivision does not significantly reduce the number of basis functions.

The organization of the RBF coefficients into textures has to allow efficient access to all contributing RBFs of a particular cell. Therefore, coefficients of all RBF basis functions influencing a cell are stored consecutively along the row of the texture maps. Thus, our fragment program only needs to know the index of the first RBF for the current cell and can access the remaining coefficients by applying an increasing offset to the x-texture coordinate. Note that this may result in some data duplication, since the spatial decomposition generates several instances of the same RBF basis function in different cells. However, as the size of a single set of RBF coefficients is small, this overhead is acceptable.
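The radius of influence of Equation (12) and the cell test can be sketched as below. The text does not spell out the intersection test, so the sphere versus axis-aligned box check is an assumption, as is returning a zero radius for Gaussians whose weight magnitude never exceeds $\varepsilon$. The function names are hypothetical.

```python
import math

def influence_radius(sigma, weight, eps):
    """Distance at which |lambda| * exp(-r^2 / (2 sigma^2)) drops to eps."""
    ratio = abs(weight) / eps
    if ratio <= 1.0:
        return 0.0  # assumption: contribution is below eps everywhere
    return sigma * math.sqrt(2.0 * math.log(ratio))

def sphere_intersects_box(center, radius, box_min, box_max):
    """True if the sphere around center touches the axis-aligned cell."""
    d2 = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        nearest = min(max(c, lo), hi)  # closest point of the cell on this axis
        d2 += (c - nearest) ** 2
    return d2 <= radius * radius

# Hypothetical Gaussian and unit cell.
r = influence_radius(sigma=0.5, weight=-2.0, eps=1e-3)
touches = sphere_intersects_box((2.0, 0.5, 0.5), 1.0, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

Substituting the returned radius back into the Gaussian reproduces the tolerance, which is a convenient sanity check for an implementation.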

[Figure 2 labels: RBF Parameters; Texture 0, Texture 1, Texture 2; Fragment Program]

Figure 2: The rendering system is based on the programmable fragment unit evaluating the RBF encoded dataset on the fly during rasterization. The RBF coefficients are accessed directly from multiple two-dimensional texture maps storing the compact RBF representation in the local memory of the graphics card.

In [11], a rather sophisticated texture layout was required due to the multipass reconstruction that was applied to overcome the limited number of instruction slots of the chosen graphics platform. With the GeForce6’s dynamic branching and long fragment programs, we can use a simple but effective greedy layout algorithm: we process all cells sorted by descending number of basis functions, starting with the cell containing the most RBFs, and assign each cell to the texture line with the smallest but sufficient number of free slots. For fast computation of this slot, we utilize a free space list with one entry for each texture row. This list is sorted by ascending free slots after the placement of each cell. Texels in the texture map are filled in left-to-right order. Using this simple approach, we achieve a minimal overhead of unused texels.
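The greedy layout can be sketched with a sorted free space list, as below. Using `bisect` to keep the list ordered, the tuple representation of the list entries, and raising an error when no row fits are choices made for this illustration; the cell names and sizes are hypothetical.

```python
import bisect

def greedy_layout(cell_counts, row_width, num_rows):
    """Map each cell to (row, first_column) using best-fit, largest cell first."""
    # Free space list: one (free_slots, row) entry per row, ascending by free slots.
    free = [(row_width, row) for row in range(num_rows)]
    placement = {}
    for cell, count in sorted(cell_counts.items(), key=lambda kv: -kv[1]):
        # First entry with at least `count` free slots is the smallest sufficient row.
        k = bisect.bisect_left(free, (count, -1))
        if k == len(free):
            raise ValueError("no texture row can hold cell %r" % (cell,))
        slots, row = free.pop(k)
        placement[cell] = (row, row_width - slots)    # rows fill left to right
        bisect.insort(free, (slots - count, row))     # keep the list sorted
    return placement

# Hypothetical cells with their numbers of contributing RBFs.
layout = greedy_layout({"a": 5, "b": 3, "c": 3, "d": 2}, row_width=8, num_rows=2)
```

In this example the first row is filled completely by cells `a` and `b`, and the remaining cells share the second row, leaving three unused texels in total.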

We encode the RBF vector data at full precision into three floating point texture maps as indicated in Figure 2. All maps share the same basic layout to enable the use of the same index to address all components of the RBF parameter set. Our first texture is an RGB map holding the positions $\mu_i$ of the RBF centers. A second and third map store the weights $\lambda_i$ and the widths $\sigma_i$ of the RBF functions, respectively. Note that we actually store $(2\sigma_i^2)^{-1}$ instead of the width in order to reduce the number of required fragment operations. For efficiency, the texture format is adapted to the dimensionality of the input data using either an R, RG, RGB, or RGBA texture map. Thus, our RBF reconstruction algorithm supports vector data up to four dimensions, multifield datasets with up to four different scalar values, or any combination with up to four data components.
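To illustrate why storing $(2\sigma_i^2)^{-1}$ saves work per basis function, the following Python sketch evaluates a Gaussian RBF sum from three parallel arrays that stand in for the three texture maps. It is a CPU-side sketch under assumptions: the names `pack_rbfs` and `reconstruct` are hypothetical, and a single width per RBF is assumed here for simplicity.

```python
import math


def pack_rbfs(centers, widths, weights):
    """Build three parallel arrays: centers, (2*sigma^2)^-1, and weights."""
    inv_two_sigma_sq = [1.0 / (2.0 * s * s) for s in widths]
    return list(centers), inv_two_sigma_sq, list(weights)


def reconstruct(x, textures, first, count):
    """Evaluate sum_i lambda_i * exp(-|x - mu_i|^2 * (2*sigma_i^2)^-1).

    `first` is the index of the cell's first RBF; the remaining RBFs are
    read at increasing offsets, as in the consecutive row layout.
    Each weight is a tuple, so up to four components share one loop.
    """
    centers, inv_widths, weights = textures
    value = [0.0] * len(weights[first])
    for i in range(first, first + count):
        dist_sq = sum((xc - mc) ** 2 for xc, mc in zip(x, centers[i]))
        # Only a multiply is needed here; the division was done at pack time.
        basis = math.exp(-dist_sq * inv_widths[i])
        for c, lam in enumerate(weights[i]):
            value[c] += lam * basis
    return value
```

Because the basis value is shared by all components of a weight tuple, adding components costs one extra multiply-add each, which matches the observation that multifield reconstruction is barely more expensive than scalar reconstruction.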

5.2 Slicing Planes and Volume Visualization

Our first implemented class of visualization methods for multifield data is centered around the evaluation of the original encoded data properties. Here our fragment programs can take advantage of the fact that almost all fragment instructions work on a four-component vector; thus, the number of fragment operations required for a multifield reconstruction is essentially the same as for a single data component. Additional instructions, however, are introduced, since for a multifield RBF encoding, three texture lookup operations are necessary in order to determine the coefficients for one basis function. The five coefficients for an RBF encoded scalar can always be stored in two RGBA texels.

We can apply these fragment programs to the rendering of a single slicing plane through the volume domain. The color mapping of the slice is performed through a lookup in a transfer function implemented as a one-dimensional texture map. The lookup is based either on a single component of the multifield dataset or on the magnitude of the encoded vector. Interactive switching between these behaviors can be achieved by utilizing a dynamically assigned component mask, implemented as a parameter of the reconstruction fragment program. Alternatively, we can apply a three-dimensional texture as the transfer function, allowing for a more sophisticated mapping based on all components of the encoded vector or multifield dataset. During the rendering of the slice, we have to account for the domain decomposition, since a different list of RBF centers has to be considered for each cell. Therefore, we clip the slice at the cell boundaries and render the resulting slice portions with indices pointing to the RBF coefficients of the corresponding cell.
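The component mask and the one-dimensional transfer function lookup can be sketched as follows. This is a hedged illustration only: the function names, the nearest-texel sampling, and the choice to always take the magnitude of the masked vector (which yields the absolute value when a single component is selected) are assumptions, not details given in the text.

```python
import math


def select_scalar(value, mask):
    """Reduce a reconstructed vector to one scalar using a 0/1 component mask.

    Assumption: the magnitude of the masked vector is used in all cases, so
    a mask with a single 1 yields the absolute value of that component.
    """
    return math.sqrt(sum((v * m) ** 2 for v, m in zip(value, mask)))


def apply_transfer_function(scalar, table, lo, hi):
    """Nearest-texel lookup of `scalar`, clamped to [lo, hi], in a 1D RGBA table."""
    t = (scalar - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)
    index = min(int(t * len(table)), len(table) - 1)
    return table[index]
```

Switching the mask between, for example, `(1, 0, 0)` and `(1, 1, 1)` changes the displayed quantity without touching the reconstruction loop.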

Increasing the number of slicing planes and blending them back-to-front leads to direct volume renderings of the datasets, similar to the well-established texture-based volume rendering approach [2]. The technique can also be extended to render shaded isosurfaces as demonstrated in [19]. In the latter case, we use the analytically reconstructed gradient for the lighting.
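Back-to-front blending of slices amounts to repeated application of the standard "over" operator. A minimal Python sketch for a single pixel follows, assuming non-premultiplied RGBA samples ordered from the farthest to the nearest slice; the function name is hypothetical.

```python
def composite_back_to_front(samples):
    """Blend (r, g, b, a) samples ordered from farthest to nearest slice."""
    r = g = b = 0.0
    for sr, sg, sb, sa in samples:
        # Each new slice covers what lies behind it in proportion to its opacity.
        r = sa * sr + (1.0 - sa) * r
        g = sa * sg + (1.0 - sa) * g
        b = sa * sb + (1.0 - sa) * b
    return (r, g, b)
```

On graphics hardware the same recurrence is obtained with alpha blending enabled while the slices are drawn in back-to-front order.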

5.3 Feature Extraction

All the features relevant for this work are based on the velocity gradient tensor J of the vector field. Therefore, if features of the RBF encoded vector datasets are to be extracted, a fragment program is needed that is capable of analytically calculating the nine-component Jacobian matrix. On a GeForceFX graphics board, which was used in [11], this requires four rendering passes (one for each column of the Jacobian matrix and an additional pass for the shading), since the limited number of instructions enforces multipass reconstruction with the help of an accumulating p-buffer and only four 32 bit floating point values per pixel can be written into the p-buffer. However, as mentioned before, the GeForce6 chip series not only allows the reconstruction of a dynamic number of RBFs, but also the computation of all nine elements of the Jacobian matrix and the final evaluation in one single fragment program. Therefore, only a single rendering pass is now required, resulting in significantly improved performance. Sample code that shows the RBF reconstruction with gradient computation in a dynamic loop is given in Figure 3.

# dynamic loop over all RBFs of the current cell
REP numfunc;
  # fetch center position, variances and weights
  TEX center.xyz, texpos.xyxx, texture[0], RECT;
  TEX variances.xyz, texpos.xyxx, texture[1], RECT;
  TEX lambdas.xyz, texpos.xyxx, texture[2], RECT;
  # expval = -||x - mu_i||^2 / (2 * sigma_i^2)
  ADD vec.xyz, center.xyzx, -fragment.texcoord[0];
  DP3 expval.xyz, vec.xyzx, vec.xyzx;
  MUL expval.xyz, expval.xyzx, -variances.xyzx;
  # compute exponents to the base of two
  EX2 expres.x, expval.x;
  EX2 expres.y, expval.y;
  EX2 expres.z, expval.z;
  # factor = (x - mu_i) * lambda_i / (2 * sigma_i^2) - see correction below
  MUL factors, lambdas, variances;
  MUL factorX, vec, factors.x;
  MUL factorY, vec, factors.y;
  MUL factorZ, vec, factors.z;
  # partial derivative for every direction
  MAD gradientX, factorX, expres.x, gradientX;
  MAD gradientY, factorY, expres.y, gradientY;
  MAD gradientZ, factorZ, expres.z, gradientZ;
  # reconstructed vector property
  MAD value, expres, lambdas, value;
  # increment texture coordinate
  ADD texpos, texpos, texinc;
ENDREP;
# correction term since variances store 1 / (2 * sigma_i^2)
MUL gradientX, gradientX, 2.x;
MUL gradientY, gradientY, 2.x;
MUL gradientZ, gradientZ, 2.x;
# feature calculation and shading
...

Figure 3: Fragment program for the combined calculation of the vector field and the velocity gradient tensor. The program exploits the possibility of dynamic loops.
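For readers without access to the fragment program toolchain, the same single-pass accumulation can be mirrored on the CPU. The Python sketch below is an illustration under assumptions, not the authors' code: it uses the natural exponential instead of the base-two `EX2` instruction, assumes one shared width per RBF, and the function name is hypothetical. As in Figure 3, the factor of two is applied once after the loop because the stored quantity is $(2\sigma_i^2)^{-1}$.

```python
import math


def reconstruct_with_jacobian(x, centers, sigmas, weights):
    """Return (v, J) for v(x) = sum_i lambda_i * exp(-|x - mu_i|^2 / (2 sigma_i^2)).

    J[r][c] holds the partial derivative of component r with respect to x_c.
    """
    v = [0.0, 0.0, 0.0]
    J = [[0.0] * 3 for _ in range(3)]
    for mu, sigma, lam in zip(centers, sigmas, weights):
        inv = 1.0 / (2.0 * sigma * sigma)          # stored value (2 sigma^2)^-1
        vec = [m - xc for m, xc in zip(mu, x)]     # mu_i - x, as computed by the ADD
        basis = math.exp(-sum(d * d for d in vec) * inv)
        for r in range(3):
            v[r] += lam[r] * basis
            for c in range(3):
                # Accumulate lambda * (mu_c - x_c) / (2 sigma^2) * exp(...);
                # the missing factor 2 is applied once after the loop.
                J[r][c] += lam[r] * inv * vec[c] * basis
    J = [[2.0 * e for e in row] for row in J]
    return v, J
```

A quick way to validate such a routine is to compare `J` against central finite differences of `v`, which should agree to several digits.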

Based on the reconstructed velocity gradient tensor, we have implemented a series of important feature calculations (vorticity, helicity, and λ2 vortex detection) to demonstrate the flexibility and use of this approach for feature detection.

Vorticity Calculation. To gain an understanding of the local flow, a good first approach is to calculate the vorticity of the vector field. Given the Jacobian, the vorticity can be easily extracted from the rotational part Ω (cf. Eq. 10) of the matrix [7], which makes vorticity an attractive choice for a GPU-based implementation:

$$\vec{\Omega} = \frac{1}{2}\, \nabla \times V . \qquad (13)$$

Adding the required computation to the fragment program presented in Figure 3 is straightforward. Since vorticity is a vector quantity, our implementation allows the user to interactively define a bitmask for masking out single components of $\vec{\Omega}$ and visualizing the magnitude of the resulting vector.
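Given a Jacobian with entries $J_{rc} = \partial v_r / \partial x_c$ (an indexing convention assumed for this sketch), Eq. 13 reduces to three differences of off-diagonal entries. The Python sketch below also shows the masked magnitude; both function names are hypothetical.

```python
import math


def vorticity(J):
    """Return 0.5 * curl(V) from the Jacobian J[r][c] = d v_r / d x_c."""
    return (
        0.5 * (J[2][1] - J[1][2]),
        0.5 * (J[0][2] - J[2][0]),
        0.5 * (J[1][0] - J[0][1]),
    )


def masked_magnitude(vec, mask):
    """Magnitude of `vec` after zeroing the components whose mask entry is 0."""
    return math.sqrt(sum((v * m) ** 2 for v, m in zip(vec, mask)))
```

For a rigid rotation about the z axis with angular velocity w, the Jacobian is `[[0, -w, 0], [w, 0, 0], [0, 0, 0]]` and the sketch returns `(0, 0, w)`, as expected from the factor of one half in Eq. 13.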

λ2 Vortex Detection. Vorticity already gives a good impression of where vortices can be found in the vector field. However, advanced vortex detection algorithms, like the λ2 criterion explained in Section 4.3, give even better results and are also suitable for GPU-based implementations due to their local nature.

As before, a fragment program like the one sketched in Figure 3 is used to retrieve the partial derivatives. After this reconstruction, the Jacobian is decomposed into a symmetric part S and an antisymmetric part Ω. To determine the eigenvalues of the matrix $S^2 + \Omega^2$, and to find the relevant eigenvalue $\lambda_2$, we adopted the approach proposed in [17]. The basic idea is to use a modified version of Cardan's solution to analytically determine the roots of the characteristic polynomial. By pre-computing coefficients and storing them into texture maps, no trigonometric functions need to be evaluated. As a result, the computation is very efficient despite its higher complexity.
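The decomposition and eigenvalue selection can be sketched in Python as follows. Note the swapped-in technique: instead of the texture-based variant of Cardan's solution from [17], this sketch evaluates the well-known trigonometric closed form for the eigenvalues of a symmetric 3×3 matrix directly. All function names are hypothetical.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def sym_eigenvalues(A):
    """Eigenvalues of a symmetric 3x3 matrix in descending order."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted((A[0][0], A[1][1], A[2][2]), reverse=True)
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(A[r][c] - (q if r == c else 0.0)) / p for c in range(3)]
         for r in range(3)]
    det = (B[0][0] * (B[1][1] * B[2][2] - B[1][2] * B[2][1])
           - B[0][1] * (B[1][0] * B[2][2] - B[1][2] * B[2][0])
           + B[0][2] * (B[1][0] * B[2][1] - B[1][1] * B[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3  # the trace equals the sum of the eigenvalues
    return [e1, e2, e3]


def lambda2(J):
    """Middle eigenvalue of S^2 + Omega^2; negative values indicate a vortex."""
    S = [[0.5 * (J[r][c] + J[c][r]) for c in range(3)] for r in range(3)]
    O = [[0.5 * (J[r][c] - J[c][r]) for c in range(3)] for r in range(3)]
    S2, O2 = matmul3(S, S), matmul3(O, O)
    M = [[S2[r][c] + O2[r][c] for c in range(3)] for r in range(3)]
    return sym_eigenvalues(M)[1]
```

For a rigid rotation the sketch yields a negative value, while a pure strain field yields a positive one, consistent with the criterion of Jeong and Hussain [12].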

5.4 Particle Advection

Particle tracking is another well-known technique for understanding flows. RBF encoded vector fields are particularly well-suited for this technique since the vector field has to be reconstructed only at a small number of positions. Although the positions may be distributed across the 3D volume, the compact representation with RBFs can be used to accomplish this task. We have implemented a particle advection routine that is capable of tracking a large number of particles simultaneously, exploiting the parallel rendering pipelines of current graphics cards.

The initial positions of the particles have to be defined by the user as a set of 3D coordinates. The particle coordinates are then stored in a 2D floating point texture with as many texels as are required for storing the positions. In the next step, a quadrilateral of the size of the texture is rendered to a p-buffer. For each generated fragment, the velocity is then reconstructed as described in Section 5, using the appropriate particle coordinate for evaluating the sum of basis functions. The particle position is updated using an Euler integration step based on the reconstructed velocity. These steps are iterated using the new particle positions as the input texture until a user-defined number of timesteps has been reached.
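The iteration described above is an explicit Euler scheme applied to all particles at once. A minimal CPU-side Python sketch follows; the `velocity` callback stands in for the RBF reconstruction at a particle position, and the function name and the returned per-step history are assumptions made for illustration.

```python
def advect(positions, velocity, dt, steps):
    """Advance all particles with explicit Euler steps.

    positions: list of (x, y, z) tuples.
    velocity:  callable returning the velocity vector at a position.
    Returns the list of position sets, one per timestep (including the seeds).
    """
    trace = [list(positions)]
    for _ in range(steps):
        current = trace[-1]
        # Each particle is updated independently, which is what allows the
        # GPU version to process one particle per fragment.
        trace.append([
            tuple(p + dt * v for p, v in zip(pos, velocity(pos)))
            for pos in current
        ])
    return trace
```

Keeping the position set of every step corresponds to the per-timestep textures from which the particle traces are rendered.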

After each timestep, the updated particle positions are stored in a 2D floating point texture. Ideally, this texture should be used directly for rendering particle traces as OpenGL vertex arrays without the need to read back the particle positions from the graphics card. However, this functionality is currently only available for ATI cards by means of the so-called uberbuffer extension. A more flexible and more generally supported render buffer management has already been proposed by means of the OpenGL extension GL_EXT_framebuffer_object, which will probably be extended to allow for the direct rendering into vertex arrays as well. This functionality should soon be available in new or even current graphics adapters. Currently, however, rendering the particle traces on the GeForce6 requires a costly glReadPixels for each timestep to read the particle positions to main memory and a transfer back to the GPU for the rendering. We consider this only a temporary solution, which should improve at least with the next graphics chip generation.

Table 1: Accuracy and compression for RBF encodings.

Dataset             # Cells     # RBFs   Avg. norm error (%)   Max. norm error (%)
X38 Compr. Shock    1,943,483   4,883    0.12                  4.99
X38 Exp. Shock      1,943,483   6,789    0.21                  4.99
Oil Reservoir       156,642     141      0.58                  4.99
Tornado             32,768      2,100    1.82                  6.82
Channel             32,805      2,105    1.46                  4.73
MHD                 35,937      2,145    0.72                  6.23

6 RESULTS

We have tested our system on an Intel Pentium 4 3.60 GHz processor with 2 GB memory, and a 256 MB nVidia GeForce 6800GT graphics board. Our implementation supports both MS Windows and Linux systems and has been tested on a variety of datasets.

The X38 dataset is based on a tetrahedral finite element viscous calculation computed on geometry configured to emulate the X38 Crew Return Vehicle. The geometry and the simulation were computed at the Engineering Research Center at Mississippi State University by the Simulation and Design Center. This dataset represents a single time step in the reentry process into the atmosphere. The simulation was computed on an unstructured grid containing 1,943,483 tetrahedra at a 30 degree angle of attack.

The Channel dataset is a time-dependent dataset obtained in an experiment studying laminar-turbulent boundary layer transitions in a water channel. This 81 × 45 × 9 hexahedra-based dataset was provided by the Institute for Aerodynamics and Gasdynamics of the University of Stuttgart.

The Magnetohydrodynamics dataset (MHD) is a simulation of plasma flow in the outer heliosphere of the sun computed by D. Aaron Roberts at NASA Goddard Space Flight Center. The norm of the curl of the velocity field is used as a measure of vorticity, showing the alternating vortices in the plasma flow.

The oil reservoir data was computed by the Center for Subsurface Modeling at The University of Texas at Austin. This 156,642 tetrahedra dataset is a simulation of a black-oil reservoir model used to predict placement of water injection wells to maximize oil from production wells. The last dataset we used for the evaluation is the synthetic tornado dataset, courtesy of Roger Crawfis of The Ohio State University.

Table 1 shows the size of the datasets, the number of RBFs used in the encoding process, and both the average and maximum norm error percentages. These results clearly show the compact storage of the RBF representation while meeting the maximum specified error criteria of either 5% or 7%.

Figure 4: Slices of features extracted directly from the RBF encoding. The left image shows vorticity magnitude of the MHD dataset encoded with 2,145 RBFs. In the right image λ2 values have been extracted from the tornado dataset. Note that the same transfer functions as in Figure 6 have been applied, in order to allow direct comparison with the corresponding volume renderings.

Figure 5: A volume rendering of the X38 compression shock (left) and expansion shock (right) using 4,883 and 6,789 RBFs, respectively.

Figure 1 (left) shows a volume rendering of the tornado dataset as described in Section 5.2. For this image, the vorticity was used to determine color and opacity. The third image of Figure 1 illustrates pressure values for water injection into the black-oil reservoir to maximize conveyor capability. This dataset has been encoded with 141 RBFs and renders at approximately 2.0 fps on a GeForce 6800GT graphics board using 64 slices. The right image of Figure 1 shows extracted helicity values of the MHD dataset. Figure 5 shows a volume rendering of the compression shock (left) and expansion shock (right) for the X38 spacecraft as described in Section 4.3.

Two slices of features detected on RBF encoded datasets are shown in Figure 4. The left image shows vorticity magnitude of the MHD dataset. The right image shows a slice of λ2 values extracted from the tornado dataset. Computation time for these slices is about 0.04 s and almost independent of the number of basis functions reconstructed per rendering pass. Single slices are only useful to the knowledgeable fluid dynamics engineer. Volume renderings based on a stack of slices reveal more structure. Figure 6 (left) illustrates this with an example of the volume rendered MHD dataset with extracted vorticity magnitude. This dataset has been encoded with 2,145 RBFs and is rendered with 400 slices. The right image of Figure 6 shows λ2 values computed for the tornado dataset and visualized with 256 slices. Since feature detection involves expensive gradient calculations, volume rendering of dynamically extracted features provides only limited interactivity. Rendering the tornado dataset with 32 slices and λ2 extraction reaches an average of 1.2 fps. Nevertheless, GPU-based feature detection in radial basis space is a very promising technique that should approach interactive rates with the next generation of graphics hardware.

The new features of nVidia's GeForce6 hardware had a major effect on our implementation and therefore on the rendering results. Compared to a solution without dynamic loops as presented in [11], most of the multipass rendering could be removed, which for vector datasets resulted in a speedup factor of three. For scalar datasets, no significant speedup has been observed because the dynamic loop in the fragment program requires three additional instructions per iteration. Considering that there are only eleven instructions in the loop body of the scalar reconstruction program, this is an overhead of 27% (3/11), which costs more computation time than switching between different fragment programs with a fixed number of unrolled loop iterations.

Figure 6: Volume rendered features extracted from the RBF encoded vortex and tornado datasets. Left image: Vorticity magnitude of the plasma flow (MHD). Right image: computed λ2 values of the synthetic tornado.

Figure 7: Traces of 110 particles tracked over 400 timesteps in the Channel dataset. Upper image: RBF encoded dataset and GPU-based computation. Lower image: Original Cartesian grid and software implementation.

Figure 7 shows two screenshots of 110 particles traced over 400 timesteps. The upper image shows the particle traces calculated on the GPU for the RBF encoded dataset, using Euler integration with fixed stepsize. The lower image shows the traces computed by the commercial flow visualization package PowerVIZ [5] using the original Cartesian grid and fourth order Runge-Kutta integration with adaptive stepsize. As expected, the Euler integration employed for the GPU implementation produces less accurate but nevertheless comparable results. Regarding the performance, we observed almost constant computation times of about 3 s on a GeForce 6800GT for up to 400 particles and sublinear growth of the computation time with the number of particles beyond that.

7 CONCLUSION

We have demonstrated that RBF encoding of scalar, vector, and multifield data provides a compact representation of large datasets, enabling them to be efficiently stored and reconstructed on commodity graphics cards. This representation is very well suited to GPU processing and many traditional feature detection techniques can be more efficiently computed using the radial basis space representation. Performing these computations on the GPU allows pixel-accurate feature detection and provides a flexible framework for interactive feature exploration, where the feature parameters can be interactively adjusted. We have also demonstrated the flexibility in visualizing these RBF encoded datasets using texture-based volume rendering, cutting planes, isosurfaces, and particle traces.

In the future, we plan to extend this work by applying these techniques to the emerging field of meshless simulation techniques for computational fluid dynamics, which already employ RBFs for their simulation. With our visualization techniques, we can directly visualize and detect features from the simulation RBF representation, without the need to resample the data to an underlying grid. We are also exploring better error measurement methods for vector encoding, including techniques using a comparison of resulting streamlines.

REFERENCES

[1] M. A. Branch, T. F. Coleman, and Y. A. Li. A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems. SIAM Journal on Scientific Computing, 21(1):1–23, 1999.

[2] Brian Cabral, Nancy Cam, and Jim Foran. Accelerated Volume Rendering and Tomographic Reconstruction Using Texture Mapping Hardware. 1994 Symposium on Volume Visualization, pages 91–98, October 1994.

[3] Jonathan C. Carr, Richard K. Beatson, Jon B. Cherrie, Tim J. Mitchell, W. Richard Fright, Bruce C. McCallum, and Tim R. Evans. Reconstruction and Representation of 3D Objects With Radial Basis Functions. In Proceedings of ACM SIGGRAPH 2001, Computer Graphics Proceedings, Annual Conference Series, pages 67–76, August 2001.

[4] Christopher S. Co, Bjoern Heckel, Hans Hagen, Bernd Hamann, and Kenneth I. Joy. Hierarchical Clustering for Unstructured Volumetric Scalar Fields. In Proceedings of IEEE Visualization 2003, October 2003.

[5] Exa Corporation. PowerVIZ specifications, 2001. http://www.exa.com/pdf/PowerVIZscreen.pdf.

[6] A. Ardeshir Goshtasby. Grouping and parameterizing irregularly spaced points for curve fitting. ACM Transactions on Graphics (TOG), 19(3):185–203, 2000.

[7] Robert Haimes and David Kenwright. On the Velocity Gradient Tensor and Fluid Feature Extraction, 1999.

[8] R.L. Hardy. Theory and applications of the multiquadric-biharmonic method. Computers and Mathematics with Applications, 19:163–208, 1990.

[9] J.L. Helman and L. Hesselink. Visualizing Vector Field Topology in Fluid Flows. IEEE Computer Graphics & Applications, pages 36–46, 1991.

[10] L. Hesselink and J. Helman. Evaluation of Flow Topology from Numerical Data, AIAA-87-1811, 1987.

[11] Yun Jang, Manfred Weiler, Matthias Hopf, Jingshu Huang, David S. Ebert, Kelly P. Gaither, and Thomas Ertl. Interactively Visualizing Procedurally Encoded Scalar Fields. In Proceedings Joint EUROGRAPHICS-IEEE TCVG Symposium on Visualization 2004, 2004.

[12] J. Jeong and F. Hussain. On the Identification of a Vortex. Journal of Fluid Mechanics, 285:69–94, 1995.

[13] E.J. Kansa. A scattered data approximation scheme with applications to computational fluid-dynamics - I: Surface Approximations and Partial Derivative Estimates. Computers and Mathematics with Applications, 19(8):127–145, 1990.

[14] D. Lovely and R. Haimes. Shock Detection from Computational Fluid Dynamics Results, 1999.

[15] D. Marcum and K. Gaither. Solution Adaptive Unstructured Grid Generation Using Pseudo Pattern Recognition Techniques, 1997.

[16] Gregory M. Nielson. Scattered data modeling. IEEE Comput. Graph. Appl., 13(1):60–70, 1993.

[17] Simon Stegmaier and Thomas Ertl. A Graphics Hardware-based Vortex Detection and Visualization System. In Proceedings of IEEE Visualization 2004, pages 195–202, 2004.

[18] Greg Turk and James F. O'Brien. Modelling with implicit surfaces that interpolate. ACM Transactions on Graphics (TOG), 21(4):855–873, 2002.

[19] R. Westermann and T. Ertl. Efficiently Using Graphics Hardware in Volume Rendering Applications. Computer Graphics (SIGGRAPH '98), 32(4):169–179, 1998.