A data model is a mental model of the nature of some data. It answers such questions as the following:
Data is often classified as follows according to measurement level:
| Name of Level | Description | Valid Operations | Appropriate Measure of Central Tendency | Examples |
|---|---|---|---|---|
| nominal | values denote categories which have no order | = ≠ | mode | color; postal code (e.g. zip); chemical species (e.g. CO2) |
| ordinal | values are ordered; differences are meaningless | = ≠ < ≤ > ≥ | median | Richter earthquake scale; dates in form YYYYMMDD |
| interval | differences are valid; quotients are meaningless | = ≠ < ≤ > ≥ + − | arithmetic mean | temperature in °C |
| ratio | quotients are valid | = ≠ < ≤ > ≥ + − × ÷ | geometric mean | temperature in K (kelvin) |
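
The table can be made concrete with a small sketch. The Python below is only an illustration, not part of any particular library: the `CENTRAL_TENDENCY` mapping and the `summarize` helper are hypothetical names, and `median_low` stands in for the median so that ordinal values are never averaged. The temperature lines show why quotients are meaningless at the interval level but valid at the ratio level.

```python
from statistics import geometric_mean, mean, median_low, mode

# Hypothetical mapping from measurement level to the summary the table suggests.
# median_low is used for ordinal data because it always returns an actual data
# value and never averages two values (averaging would need valid differences).
CENTRAL_TENDENCY = {
    "nominal": mode,
    "ordinal": median_low,
    "interval": mean,
    "ratio": geometric_mean,
}


def summarize(level, values):
    """Return the central-tendency measure suggested for the given level."""
    return CENTRAL_TENDENCY[level](values)


colors = ["red", "blue", "red", "green"]   # nominal: only = and != apply
rankings = [3, 1, 2, 4]                    # ordinal: order matters, gaps do not

most_common_color = summarize("nominal", colors)
middle_ranking = summarize("ordinal", rankings)

# Interval versus ratio: the same two temperatures on both scales.
celsius = [10.0, 20.0]
kelvin = [c + 273.15 for c in celsius]

diff_c = celsius[1] - celsius[0]    # differences agree on both scales
diff_k = kelvin[1] - kelvin[0]
ratio_c = celsius[1] / celsius[0]   # 2.0, an artifact of the arbitrary zero of °C
ratio_k = kelvin[1] / kelvin[0]     # roughly 1.035, the physically meaningful quotient
```

The difference of 10 degrees is the same on both scales, whereas the quotient changes with the choice of zero, which is why only the kelvin scale supports statements such as "twice as hot".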
A measurement error is the difference between the true value and the measured value. Measured values can differ from true values due to:
A time series consists of values located along the time dimension. Geographic data is located along spatial dimensions such as latitude, longitude and altitude, and may also have a time dimension. Note that longitude is cyclic: 180°E and 180°W denote the same meridian.
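
Because longitude is cyclic, ordinary subtraction can give misleading results near the ±180° meridian. The following Python sketch shows one common way to handle this with modular arithmetic. The function names `normalize_lon` and `lon_diff` are hypothetical, and the choice of the range [-180, 180) is an assumption; some datasets use [0, 360) instead.

```python
def normalize_lon(lon):
    """Map any longitude in degrees onto the range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def lon_diff(lon_a, lon_b):
    """Signed shortest difference lon_b - lon_a in degrees, in [-180, 180)."""
    return normalize_lon(lon_b - lon_a)


# Two points on either side of the date line are 2 degrees apart,
# not 358 degrees as naive subtraction would suggest.
naive = -179.0 - 179.0               # -358.0
wrapped = lon_diff(179.0, -179.0)    # 2.0
```

Latitude and altitude need no such treatment, since they are not cyclic.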
The dimensions of the space in which data is located can themselves have a nominal measurement level. For example, an accounting spreadsheet might have columns corresponding to charge codes and rows corresponding to company divisions.
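
A minimal Python sketch of such a table follows. The division names, charge codes and amounts are all invented for illustration. Cells are looked up by label equality only, so no order or distance between labels is implied.

```python
# Hypothetical labels for the two nominal dimensions.
divisions = ["Research", "Sales"]         # rows
charge_codes = ["travel", "equipment"]    # columns

# Made-up amounts, keyed by (division, charge code).
amounts = {
    ("Research", "travel"): 1200.0,
    ("Research", "equipment"): 5000.0,
    ("Sales", "travel"): 3400.0,
    ("Sales", "equipment"): 800.0,
}


def row_total(division):
    """Sum one division's amounts over all charge codes."""
    return sum(amounts[(division, code)] for code in charge_codes)


def column_total(code):
    """Sum one charge code's amounts over all divisions."""
    return sum(amounts[(division, code)] for division in divisions)
```

Reordering the rows or columns changes nothing about the data, which is the practical meaning of a nominal dimension.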
Data located in a continuous space can be either gridded or scattered. Both types are discussed in NAP Grids.
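
The distinction can be sketched in plain Python as follows. This is an illustrative layout only, not the representation used by NAP; the variable names and all numeric values are made up. Gridded data stores one coordinate vector per dimension plus an array of values, while scattered data stores the coordinates alongside every value.

```python
# Gridded: one coordinate vector per dimension plus a 2-D array of values.
lats = [-10.0, 0.0, 10.0]
lons = [100.0, 110.0]
grid = [
    [25.1, 25.4],
    [27.0, 27.3],
    [26.2, 26.5],
]  # grid[i][j] is the value at (lats[i], lons[j])

# Scattered: every value carries its own coordinates (lat, lon, value).
scattered = [
    (-9.7, 103.2, 25.3),
    (1.4, 108.8, 27.1),
    (8.9, 101.5, 26.0),
]


def grid_to_scattered(lats, lons, grid):
    """Flatten a grid into (lat, lon, value) triples, one per grid cell."""
    return [
        (lat, lon, grid[i][j])
        for i, lat in enumerate(lats)
        for j, lon in enumerate(lons)
    ]


points = grid_to_scattered(lats, lons, grid)
```

Converting a grid to scattered points is a simple flattening, as above. Going the other way generally requires interpolation, because scattered points need not fall on any regular set of coordinates.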