### Lecture 1.2: Data

• Data Sources
• Simulations

•       ex: CFD, environmental modeling, virtual crash tests
• Sensors/Scanners

•       ex: medical diagnosis, satellites, emissions monitors
• Surveys/Records

•       ex: census, consumer tracking, polls, observational studies
• Equations

•       ex: math, health effects models
• Data Characteristics
• Continuity
• Continuous: nature is continuous (for most purposes), but has only implicit representations (e.g., equations)
• Discrete: anything sampled or stored on digital media
•       issues: representation error, possible aliasing, artifacts of sampling
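
As a small illustration of the aliasing risk noted for discrete data, the sketch below samples two sine waves at the same rate and checks that the stored values cannot distinguish them. The sampling rate, the frequencies, and the helper name `sample_sine` are assumptions chosen for the example, not values from the lecture.

```python
import math

def sample_sine(freq_hz, rate_hz, n_samples):
    """Sample sin(2*pi*freq*t) at t = n / rate for n = 0 .. n_samples - 1."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]

RATE_HZ = 10.0                         # hypothetical sampling rate
fast = sample_sine(9.0, RATE_HZ, 20)   # 9 Hz signal, above the 5 Hz Nyquist limit
slow = sample_sine(1.0, RATE_HZ, 20)   # 1 Hz signal, well below the limit

# The 9 Hz samples coincide with a sign-flipped 1 Hz sine, so the
# discrete record alone cannot tell which signal was measured.
aliased = all(math.isclose(f, -s, abs_tol=1e-9) for f, s in zip(fast, slow))
print(aliased)
```

Sampling faster than twice the highest frequency present is the usual way to avoid this ambiguity.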
• Structure
• Definitions
• Topology: connectivity (triangle)
• Geometry: realization of topology (coordinates)
• Elements
• Points: located where data value known (geom)
• Cells: set up interpolation parameters (topology)

•       common types: point, line, triangle, quad, tetra, voxel
• Structured: inherent spatial relationship among points

•       relatively efficient storage: topology is implicit
• regular

•       can be represented implicitly (3x3: dimension, origin, aspect)
ex: medical data
• rectilinear

•       can be represented semi-implicitly (nx + ny + nz)
ex: CFD -- refinement around objects
• curvilinear

•       geometry represented explicitly (3*nx*ny*nz)
ex: CFD -- flow along river

•       advantages (all structured types): ease of computation, wide array of visualization algorithms
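
The storage counts quoted for the three structured grid types can be sketched as follows. This is a minimal illustration, assuming "aspect" means per-axis spacing; the function names are hypothetical.

```python
def regular_point(i, j, k, origin, spacing):
    """Regular grid: coordinates follow from origin and spacing alone."""
    return tuple(o + idx * s for o, idx, s in zip(origin, (i, j, k), spacing))

def rectilinear_point(i, j, k, xs, ys, zs):
    """Rectilinear grid: one stored coordinate array per axis."""
    return (xs[i], ys[j], zs[k])

def geometry_storage(nx, ny, nz):
    """Number of stored geometry values for each structured grid type."""
    return {
        "regular": 3 * 3,                  # dimensions, origin, spacing (3 values each)
        "rectilinear": nx + ny + nz,       # per-axis coordinate arrays
        "curvilinear": 3 * nx * ny * nz,   # explicit (x, y, z) for every point
    }

print(geometry_storage(10, 20, 30))
print(regular_point(2, 0, 1, origin=(0.0, 0.0, 0.0), spacing=(0.5, 1.0, 2.0)))
```

In all three cases the topology (which points are neighbors) is implied by the (i, j, k) indices and is never stored.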
• Unstructured: no (or unknown) spatial relationship among points

•       ex: FEM, structural analysis, census, monitor devices
•       advantages: flexibility, often matches reality
•       disadvantage: more limited array of visualization algorithms
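
To make the split between geometry (points) and topology (cells) concrete, here is a minimal sketch of an unstructured triangle mesh. The two-triangle square and the helper names are made up for illustration.

```python
# Geometry: point coordinates (the corners of a unit square).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Topology: each cell lists the indices of the points it connects.
triangles = [(0, 1, 2), (0, 2, 3)]

def triangle_area(cell):
    """Area of one triangle cell, looked up through its point indices."""
    (x0, y0), (x1, y1), (x2, y2) = (points[i] for i in cell)
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

def shared_points(cell_a, cell_b):
    """Point indices that two cells have in common."""
    return sorted(set(cell_a) & set(cell_b))

total_area = sum(triangle_area(c) for c in triangles)
print(total_area, shared_points(triangles[0], triangles[1]))
```

Unlike the structured case, connectivity must be stored explicitly here, but points shared between cells are stored only once.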
• Dimension: # of independent variables (2D, 3D, etc)

•       usually means number of spatial/temporal dimensions
• Multiple
• scalar: single value per position
• multivariate: multiple values per position
multiple scalars
vector
tensor
• Type
• char
• int
• real
• Scale
• Nominal: just names or categories or identifiers

•       can say "this one is different from that one"
ex: county, land use, ethnicity or race, tissue type
• Ordinal: values are ordered

•       can say "this one is bigger than that one"
ex: preference, ranking
• Interval: constant step size

•       can say "the difference between these two is the same as the difference between those two"
ex: test scores, degrees Fahrenheit
• Ratio: meaningful zero

•       can say "this one is twice as big as that one"
ex: temperature in kelvins, income, percent below poverty line, wind speed
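
The difference between interval and ratio scales can be checked numerically. The sketch below uses three arbitrary Fahrenheit readings (values chosen only for illustration) and the standard Fahrenheit-to-kelvin conversion.

```python
import math

def fahrenheit_to_kelvin(deg_f):
    """Standard conversion from degrees Fahrenheit to kelvins."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

a_f, b_f, c_f = 20.0, 40.0, 60.0
a_k, b_k, c_k = (fahrenheit_to_kelvin(t) for t in (a_f, b_f, c_f))

# Interval property: equal steps in Fahrenheit remain equal steps in kelvins.
equal_steps = math.isclose(b_k - a_k, c_k - b_k)

# Ratio property: only the scale with a meaningful zero supports "twice as big".
ratio_f = b_f / a_f   # 2.0, but 40 F is not "twice as hot" as 20 F
ratio_k = b_k / a_k   # about 1.04, the physically meaningful ratio

print(equal_steps, ratio_f, round(ratio_k, 3))
```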
• Data Representation
• Compact: efficient memory use

•       structured schemes, unstructured schemes, sparse matrices, shared verts
• Efficient: computationally accessible; retrieve and store in constant time

•       structured schemes
• Mappable: straightforward conversions

•       native -> rep: simple conversion, no lost info
rep -> graphics prim: esp for interactive display
• Minimal coverage: manageable # of options

•       few variants which work for a wide range of data sets
• Simple

•       easier to use
easier to optimize
errors less likely
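
The constant-time access that makes structured schemes "efficient" can be sketched with a flat array and index arithmetic. The x-fastest ordering and the function names below are assumptions for the example; other layouts work the same way.

```python
def flat_index(i, j, k, nx, ny):
    """Map grid indices (i, j, k) to a position in a flat array, x varying fastest."""
    return i + nx * (j + ny * k)

def grid_index(idx, nx, ny):
    """Inverse mapping: recover (i, j, k) from a flat position."""
    k, rem = divmod(idx, nx * ny)
    j, i = divmod(rem, nx)
    return i, j, k

nx, ny, nz = 4, 3, 2
values = [0.0] * (nx * ny * nz)             # one scalar per grid point

values[flat_index(1, 2, 1, nx, ny)] = 7.5   # store in constant time
print(values[flat_index(1, 2, 1, nx, ny)])  # retrieve in constant time
```

No search and no stored connectivity are needed: a point's neighbors are found by adding or subtracting 1 from i, j, or k.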
• Data Transformations
• Interpolation
• Aggregation
• Smoothing
• Simplification
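
Three of the transformations listed above can be sketched in one dimension. These are minimal illustrative versions with hypothetical function names; simplification (e.g., mesh decimation) is not shown.

```python
def lerp(x, x0, y0, x1, y1):
    """Interpolation: linear estimate at x between samples (x0, y0) and (x1, y1)."""
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

def moving_average(values, window):
    """Smoothing: mean over a sliding window (output is shorter by window - 1)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def aggregate(values, block):
    """Aggregation: mean of consecutive, non-overlapping blocks."""
    return [sum(values[i:i + block]) / block
            for i in range(0, len(values) - block + 1, block)]

print(lerp(2.5, 2.0, 10.0, 3.0, 20.0))
print(moving_average([1.0, 2.0, 6.0, 2.0, 1.0], 3))
print(aggregate([1.0, 3.0, 5.0, 7.0], 2))
```

Interpolation adds estimated values between samples, while smoothing and aggregation both discard detail: smoothing keeps roughly the original resolution, aggregation reduces it.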
• Data Quality
• Missing data
• Uncertain data
• Representation error
• Sampling artifacts
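
One simple way to handle missing data is to skip absent samples and report how much of the record was usable. The sketch below assumes missing values are marked as `None` or NaN; that convention and the function name are illustrative, not from the lecture.

```python
import math

def valid_mean(samples):
    """Return (mean of present samples, fraction present); None or NaN count as missing."""
    present = [s for s in samples if s is not None and not math.isnan(s)]
    if not present:
        return None, 0.0
    return sum(present) / len(present), len(present) / len(samples)

mean, coverage = valid_mean([1.0, None, 3.0, float("nan")])
print(mean, coverage)
```

Reporting the coverage alongside the statistic keeps the gap visible instead of silently hiding it.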