Glossary of Terms
Analytical model A structure and process for
analyzing a dataset. For example, a decision tree is a model for the
classification of a dataset.
Artificial neural networks Non-linear
predictive models that learn through training and resemble biological neural
networks in structure.
Classification The process of dividing a
dataset into mutually exclusive groups such that the members of each group are
as “close” as possible to one another, and different groups are as “far” as
possible from one another, where distance is measured with respect to the
specific variable(s) you are trying to predict. For example, a typical
classification problem is to divide a database of companies into groups that
are as homogeneous as possible with respect to a creditworthiness variable
with values “Good” and “Bad.”
Clustering The process of dividing a dataset
into mutually exclusive groups such that the members of each group are as
“close” as possible to one another, and different groups are as “far” as
possible from one another, where distance is measured with respect to all
available variables.
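As a rough illustration of "distance measured with respect to all available variables", the sketch below assigns each record to the nearest of a set of group centers using Euclidean distance. It shows only the assignment step used by methods such as k-means; the function name `assign_to_clusters` and the assumption that centers are already given are choices made for this example, not part of the definition.

```python
import math

def assign_to_clusters(points, centers):
    """Put each point in the group whose center is closest (Euclidean distance)."""
    clusters = [[] for _ in centers]
    for point in points:
        # math.dist uses every coordinate, i.e. all available variables.
        nearest = min(range(len(centers)),
                      key=lambda i: math.dist(point, centers[i]))
        clusters[nearest].append(point)
    return clusters

# Hypothetical two-variable records and two hand-picked centers.
groups = assign_to_clusters([(1, 1), (1, 2), (9, 9), (8, 9)],
                            [(0, 0), (10, 10)])
```

Each record lands in exactly one group, so the groups are mutually exclusive, as the definition requires.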
Data cleansing The process of ensuring that all
values in a dataset are consistent and correctly recorded.
Data mining The extraction of hidden predictive
information from large databases.
Data navigation The process of viewing
different dimensions, slices, and levels of detail of a multidimensional
database. See OLAP.
Data visualization The visual interpretation of
complex relationships in multidimensional data.
Data warehouse A system for storing and
delivering massive quantities of data.
Decision tree A tree-shaped structure that
represents a set of decisions. These decisions generate rules for the
classification of a dataset.
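To make the link between decisions and rules concrete, here is a minimal sketch of a hand-built tree for the creditworthiness example used under Classification. The variables `debt_ratio` and `years_in_business` and both thresholds are invented for illustration; a real tree would be learned from data.

```python
def classify_company(debt_ratio: float, years_in_business: int) -> str:
    """Walk a tiny hand-built decision tree and return "Good" or "Bad"."""
    # Root decision: is the company heavily indebted? (hypothetical threshold)
    if debt_ratio > 0.6:
        # Second decision, reached only on the high-debt branch.
        if years_in_business >= 10:
            return "Good"
        return "Bad"
    # Low-debt branch is a leaf.
    return "Good"
```

Each path from the root to a leaf reads as one rule, for example: "if debt ratio is above 0.6 and the company is less than 10 years old, then Bad."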
Genetic algorithms Optimization techniques that
use processes such as genetic combination, mutation, and natural selection in a
design based on the concepts of natural evolution.
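The three processes named above can be sketched on a toy problem: evolving a bit string toward all ones. The function name `evolve`, the fitness function (count of 1 bits), the population size, and the mutation rate are all assumptions chosen to keep the example small.

```python
import random

def evolve(bits=12, population_size=20, generations=30, seed=0):
    """Return the best fitness seen at the start of each generation."""
    rng = random.Random(seed)
    fitness = sum  # fitness = number of 1 bits in the individual
    population = [[rng.randint(0, 1) for _ in range(bits)]
                  for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        # Natural selection: only the fitter half survives.
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            # Genetic combination: splice two parents at a random cut point.
            cut = rng.randrange(1, bits)
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally flip one bit.
            if rng.random() < 0.1:
                flip = rng.randrange(bits)
                child[flip] = 1 - child[flip]
            children.append(child)
        population = survivors + children
    return best_per_generation

history = evolve()
```

Because the survivors are carried over unchanged, the best fitness in `history` never decreases from one generation to the next.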
Logistic regression A generalization of linear
regression that predicts the proportions of a categorical target variable,
such as type of customer, in a population.
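One common way to read this entry is that a linear score is passed through the logistic function, which maps it to a value between 0 and 1 that can be interpreted as a proportion or probability. The sketch below shows that mapping for a single predictor; the function name and the parameters `intercept` and `slope` are illustrative, and fitting them from data is not shown.

```python
import math

def predict_probability(x: float, intercept: float, slope: float) -> float:
    """Map a linear score to a value in (0, 1) with the logistic function."""
    linear_score = intercept + slope * x
    # Note: for very large negative scores math.exp can overflow;
    # a production version would guard against that.
    return 1.0 / (1.0 + math.exp(-linear_score))
```

A linear score of zero corresponds to a predicted proportion of 0.5, and larger scores push the prediction toward 1.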
Multidimensional database A database designed
for on-line analytical processing. Structured as a multidimensional hypercube
with one axis per dimension.
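The hypercube idea can be sketched with a dictionary whose keys hold one coordinate per dimension. The dimensions (product, region, quarter), the figures, and the helper names `slice_cube` and `roll_up` are all invented for this example; real multidimensional databases use specialized array storage.

```python
from collections import defaultdict

# Each cell is addressed by one coordinate per axis: (product, region, quarter).
sales_cube = {
    ("widget", "east", "Q1"): 120,
    ("widget", "west", "Q1"): 80,
    ("gadget", "east", "Q1"): 50,
    ("widget", "east", "Q2"): 140,
}

def slice_cube(cube, axis, value):
    """Return the cells whose coordinate on `axis` equals `value`."""
    return {key: v for key, v in cube.items() if key[axis] == value}

def roll_up(cube, axis):
    """Sum over all other axes, leaving one total per value on `axis`."""
    totals = defaultdict(int)
    for key, v in cube.items():
        totals[key[axis]] += v
    return dict(totals)
```

Here `slice_cube(sales_cube, 2, "Q2")` looks at a single quarter, while `roll_up(sales_cube, 0)` moves to a coarser level of detail by totaling sales per product.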
Multiprocessor computer A computer that
includes multiple processors connected by a network. See parallel processing.
OLAP On-line analytical processing. Refers to
array-oriented database applications that allow users to view, navigate
through, manipulate, and analyze multidimensional databases.
Outlier A data item whose value falls outside
the bounds enclosing most of the other corresponding values in the sample. May
indicate anomalous data. Should be examined carefully; may carry important
information.
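The definition does not say how "the bounds enclosing most of the other values" are drawn. One widely used convention, assumed here, is the interquartile-range rule: flag values more than 1.5 times the interquartile range below the first quartile or above the third. The function name `find_outliers` is hypothetical.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; needs at least 2 values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: one reading is far from the rest.
flagged = find_outliers([10, 12, 11, 13, 12, 11, 95])
```

Flagged values are candidates for review rather than automatic deletion, since they may carry important information.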
Parallel processing The coordinated use of
multiple processors to perform computational tasks. Parallel processing can
occur on a multiprocessor computer or on a network of workstations or PCs.
Retrospective data analysis Data analysis that
provides insights into trends, behaviors, or events that have already occurred.
Time series analysis The analysis of a sequence
of measurements made at specified time intervals. Time is usually the
dominating dimension of the data.
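As one small example of analyzing measurements ordered in time, the sketch below smooths a series with a simple moving average. The choice of technique, the function name, and the sample values are assumptions for illustration; time series analysis covers many other methods.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive measurements."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical measurements taken at equal time intervals.
smoothed = moving_average([10, 20, 30, 40], window=2)
```

The order of the values matters here: shuffling the measurements would change the result, which is what makes time the dominating dimension.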