DOE Dictionary

#:

2-level design - an experiment where all factors are set at one of two levels, denoted as low and high (-1, 1 or 1, 2)
2-tailed test - also known as a two-sided test; it is a hypothesis test with a two-sided alternative hypothesis. That is, one could possibly err on either side of the center.
3-level design - an experiment where all factors are set at one of three levels, denoted as low, medium, and high (-1, 0, 1, or 1, 2, 3)

A:

alpha risk - the probability of concluding the alternative hypothesis (H1) when the null hypothesis (H0) is true.
aliasing - when two factors or interaction terms are set at identical levels throughout the entire experiment (i.e., the two columns are 100% correlated).
alternative hypothesis - the hypothesis to be accepted if the null hypothesis is rejected. It is denoted by H1.
analysis of variance (ANOVA) - a procedure for partitioning the total variation. It is often used to compare more than two population means.
assignable cause (of variation) - significant, identifiable change in a response which is caused by some specific variable from the cause and effect diagram.
attributes data (quality) - data coming basically from GO / NO-GO, pass/fail determinations of whether units conform to standards. Also includes noting presence or absence of a quality characteristic.
average (x̄) (of a statistical sample) - also called the sample mean, it is the arithmetic average value of all of the sample values. It is calculated by adding all of the sample values together and dividing by the number of elements (n) in the sample.
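The aliasing entry above can be illustrated with a short sketch. It assumes a hypothetical half-fraction of a 3-factor, 2-level design in which factor C is deliberately set equal to the A x B interaction; the names `base`, `runs`, `ab_column`, and `c_column` are illustrative only and do not come from this dictionary.

```python
from itertools import product

# Full 2-level factorial in two factors A and B, coded -1 (low) and +1 (high).
base = list(product([-1, 1], repeat=2))

# Assumed half-fraction: the level of the third factor C is the product A*B.
runs = [(a, b, a * b) for a, b in base]

# The column for C and the column for the AB interaction are identical in
# every run, so their effects cannot be separated (they are aliased).
ab_column = [a * b for a, b, _ in runs]
c_column = [c for _, _, c in runs]

for run in runs:
    print(run)
print("C aliased with AB:", ab_column == c_column)
```

Because the two columns match in all four runs, any effect estimated from this design for C is really the combined effect of C and AB.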

B:

balanced design - a 2-level experimental design is balanced if each factor is run the same number of times at the high and low levels.
bar chart - a graphical method which depicts how data fall into different categories.
β (beta) risk - the probability of concluding the null hypothesis (H0) when the alternative (H1) is true.
bias (in measurement) - systematic error which leads to a difference between the average result of a population of measurements and the true, accepted value of the quantity being measured.
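Box-Behnken design - a 3-level response surface design used to estimate linear effects, quadratic effects, and 2-way interactions. Runs are placed at the midpoints of the edges of the design space, together with center points, so no run sets all factors at their extreme levels at the same time.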
brainstorming - a group activity which generates a list of possible factors and levels, and the method by which the results may be evaluated.

C:

calibration (of instrument) - adjusting an instrument using a reference standard to reduce the difference between the average reading of the instrument and the "true" value of the standard being measured, i.e., to reduce measurement bias.
capability (of process) - a measure of quality for a process usually expressed as sigma capability, Cpk, or defects per million (dpm). It is obtained by comparing the actual process with the specification limit(s).
causality - the assertion that changes to an input factor will directly result in a specified change in an output.
cause-and-effect diagram - a pictorial diagram showing possible causes (process inputs) for a given effect (process output).
center points - experimental runs with all factor levels set halfway between the low and high settings.
central composite design - a 3-level design that starts with 2-level fractional factorial and some center points. If needed, axial points can be tested to complete quadratic terms. Used typically for quantitative factors and designed to estimate all linear effects plus desired quadratics and 2-way interactions.
central tendency - a measure of the point about which a group of values is clustered; some measures of central tendency are mean, mode, and median.
characteristic - a process output which can be measured and monitored for control and capability.
chi-square distribution - the distribution of chi-square statistics.
chi-square - the test statistic used when testing the null hypothesis of independence in a contingency table or when testing the null hypothesis of a set of data following a prescribed distribution.
classical methods - statistical experimental design thoughts and processes originally developed by Fisher and others as early as the 1920s. Uses ANOVA as the primary analysis tool, along with orthogonal designs such as fractional factorials, latin squares, Plackett-Burman, Box-Behnken, central composite, and D-optimal.
coefficient of variation - the ratio of the standard deviation to the mean. It is a standardized method of looking at variation.
coefficient of determination (r-squared) - the square of the sample correlation coefficient; it represents the strength of a model. (1 - r-squared) x 100% is the percentage of noise in the data not accounted for by the model.
common causes of variation - those sources of variability in a process which are truly random, i.e., inherent in the process itself.
confidence interval - range within which a parameter of a population (e.g., mean, standard deviation, etc.) may be expected to fall, on the basis of measurement, with some specified confidence level or confidence coefficient.
confidence limits - the upper and lower boundaries of a confidence interval.
control (of process) - a process is said to be in a state of statistical control if the process exhibits only random variations (as opposed to systematic variations and / or variations with known sources). When monitoring control with control charts, a state of control is exhibited when all points remain between set control limits without any abnormal (non-random) patterns.
control chart - the basic tool of statistical process control. It consists of a run chart, together with statistically determined upper and lower control limits and a centerline.
correlation coefficient (R) - a measure of the linear relationship between two random variables. Not as useful as r-squared, the coefficient of determination.
controllable factors - factors the experimenter has control of during all phases, i.e., experimental, production, and operational phases.
Cp - during process capability studies, Cp is a capability index which shows the process capability potential but does not consider how centered the process is. Cp may range in value from 0 to infinity, with a large value indicating greater potential capability. A value of 1.33 or greater is usually desired.
Cpk - during process capability studies, Cpk is an index used to compare the natural tolerance of a process with the specification limits. Cpk has a value equal to Cp if the process is centered on the nominal; if Cpk is negative, the process mean is outside the specification limits; if Cpk is between 0 and 1, then part of the natural tolerance of the process falls outside the spec limits. If Cpk is larger than 1, the natural tolerances fall completely within the spec limits. A value of 1.33 or greater is usually desired.
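The Cp and Cpk entries above can be made concrete with a short sketch. The dictionary does not state the formulas, so this assumes the commonly used definitions Cp = (USL - LSL) / (6 sigma) and Cpk = min(USL - mean, mean - LSL) / (3 sigma); the function names and the process numbers are hypothetical.

```python
def cp(usl, lsl, sigma):
    """Potential capability: spec width divided by the 6-sigma natural tolerance."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Capability measured against the spec limit nearest to the process mean."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Hypothetical process: spec limits 9.0 to 11.0, standard deviation 0.25.
usl, lsl, sigma = 11.0, 9.0, 0.25

cp_value = cp(usl, lsl, sigma)
cpk_centered = cpk(usl, lsl, 10.0, sigma)    # mean on the nominal
cpk_shifted = cpk(usl, lsl, 10.5, sigma)     # mean shifted toward the USL
cpk_outside = cpk(usl, lsl, 11.5, sigma)     # mean beyond the USL

print(cp_value, cpk_centered, cpk_shifted, cpk_outside)
```

Under these assumptions the centered case gives Cpk equal to Cp, the shifted case gives a smaller Cpk, and a mean outside the spec limits gives a negative Cpk, matching the behavior described in the entry.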

D:

D-optimal design - an experimental design in which the minimum number of runs is based on degrees of freedom needed to analyze the desired effects. Not necessarily orthogonal or balanced, it does minimize the correlation (confounding) between factors.
defect - departure of a quality characteristic from its acceptable level or state, i.e., the measured value of the characteristic is outside of specification. Also referred to as non-conformance to requirements.
defective unit - a sample (part) which contains one or more defects, making the sample unacceptable for its intended, normal usage.
defining relationship - a statement of one or more factor word equalities used to determine the aliasing structure in a fractional factorial design.
defining words - factor word equalities in building a defining relationship.
degrees of freedom - a parameter in the t, F, and chi-square distributions. It is a measure of the amount of independent information available for estimating the population variance, σ². It is the number of independent observations minus the number of parameters estimated.
design array - (also referred to as design matrix) - an array representing the experimental settings. Usually contains values ranging from -1 to +1, but the range could be wider if using a central composite design (CCD). The rows represent the runs and the columns represent the factors.
deviation - the difference between an observed value and the mean or average of all observed values.
discrete random variable - a variable that is based on count data (number of defects, number of births, number of deaths, etc.). It supposedly has a countable number of possible outcomes.
dispersion (of a statistical sample) - the tendency of the values of the elements in a sample to differ from each other. Dispersion is commonly expressed in terms of the range of the sample (difference between the lowest and highest values) or by the standard deviation.
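The deviation and dispersion entries above can be sketched in a few lines. The sample values below are made up for illustration, and the sample standard deviation uses the usual n - 1 divisor (an assumption, since the dictionary does not specify the divisor).

```python
from statistics import mean, stdev

# Hypothetical sample of five measurements.
sample = [12.0, 15.0, 11.0, 14.0, 13.0]

x_bar = mean(sample)

# Deviation: difference between each observed value and the sample mean.
deviations = [x - x_bar for x in sample]

# Two common expressions of dispersion.
sample_range = max(sample) - min(sample)   # highest value minus lowest value
s = stdev(sample)                          # sample standard deviation (n - 1 divisor)

print(deviations, sample_range, s)
```

The deviations always sum to zero, which is why dispersion is summarized with the range or the standard deviation rather than with the average deviation.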

E:

estimation - an approach to making inference about population parameters. This includes both point estimates and interval estimates (confidence intervals).
experimental design - purposeful changes to the inputs (factors) to a process (or activity) in order to observe corresponding changes in the outputs (responses). A means of extracting knowledge from a process (or activity).
exponential distribution - a probability distribution mathematically described by an exponential function. Used to describe the probability that a product survives a length of time t in service, under the assumption that the probability of a product failing in any small time interval is independent of time.

F:

F table - provides a means for determining the significance of a factor to a specified level of confidence by comparing calculated F ratios to those from the F distribution (see Appendix F). If the F ratio is greater than the table value, there is a significant effect.
factor - an input to a process which can be manipulated during experimentation.
failure rate - the average number of failures per unit time. Used for assessing reliability of a product in service.
fault tree analysis (FTA) - a technique for evaluating the possible causes which might lead to the failure of a product. For each possible failure, the possible causes of the failure are determined; then the situations leading to those causes are determined; and so forth, until all paths leading to possible failures have been traced. The result is a flow chart for the failure process. Plans to deal with each path can then be made.
feedback - using the results of a process to control it. The feedback principle has wide application. An example would be using control charts to keep production personnel informed of the results of a process. This allows them to make suitable adjustments to the process. Some form of feedback on the results of a process is essential in order to keep the process under control.
fishbone diagram - see cause-and-effect diagram.
flow chart or diagram (for programs, decision making, process development) - a pictorial representation of a process indicating the main steps, branches, and eventual outcomes of the process.
foldover design - a way to obtain a resolution four (RIV) design based on two designs of RIII. Used when the confirmation runs from a resolution III design differ substantially from their prediction and the experimenter desires to de-alias the 2-way interactions from the main effects.
fractional factorials - instead of using a full factorial, a subset or fraction of it can be used if the experimenter can assume some interactions will not occur.
freehand regression line - the best-fit line drawn by "eyeballing."
full factorial - all possible combinations of the factors and levels. Given k factors, all with two levels, there will be 2^k runs. If the factors have three levels, then 3^k runs would be needed.
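As a rough illustration of the full factorial entry, the sketch below enumerates every combination of coded levels and confirms that the run count is (number of levels) raised to the power k. The helper name `full_factorial` and the coded levels are assumptions made for this example.

```python
from itertools import product


def full_factorial(levels, k):
    """Return every combination of the given levels across k factors."""
    return list(product(levels, repeat=k))


two_level = full_factorial([-1, 1], 3)       # 3 factors at 2 levels
three_level = full_factorial([-1, 0, 1], 2)  # 2 factors at 3 levels

for run in two_level:
    print(run)
print(len(two_level), len(three_level))
```

A fractional factorial would use only a chosen subset of these rows.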

G:

Gaussian distribution - the engineering name for the normal distribution.
gradient - the slope at a point on a surface.
grand average - overall average of data.

H:

histogram - a bar chart that depicts the frequencies of numerical or measurement data.
hypothesis test - a procedure whereby one of two mutually exclusive and exhaustive statements about a population parameter is concluded. Information from a sample is used to infer something about a population from which the sample was drawn.

I:

inner array - a Taguchi term used in parameter design to identify the combinations of controllable factors to be studied in a designed experiment. Also called a "design array" or "design matrix."
interaction - 2 factors (input variables) are said to interact if one factor's effect on the response is dependent upon the level of the other factor.
Ishikawa diagram - see cause-and-effect diagram.

J:

just-in-time (JIT) manufacturing - a strategy that coordinates scheduling, inventory, and production to move away from the batch mode of production in order to improve quality and reduce inventories.

L:

latin squares - experimental designs in which each treatment level appears exactly once in each row and each column of a square array, allowing two blocking (nuisance) factors to be controlled while studying the treatment.
LCL (lower control limit) - for control charts: the limit above which the process subgroup statistics (X, R, etc.) must remain when the process is in control. Typically 3 standard deviations below the center line.
least squares - a method of curve-fitting that defines the "best" fit as the one that minimizes the sum of the squared deviations of the data points from the fitted curve.
level - a setting or testing value of a factor.
level of significance - a measure of the outcome of a hypothesis test. It is the P-value or probability of making a Type 1 error.
linear graph - a tool used by Taguchi to identify sets of interacting columns in orthogonal arrays.
loss function - a technique for quantifying loss due to product deviations from target values.
lower confidence limit - the smaller of the two numbers that form a confidence interval.
LSL (lower specification limit) - the lowest value of a product dimension or measurement which is acceptable.

M:

main effect - the influence a single factor has on the response when it is changed from one level to another. Often used to represent the "linear effect" associated with a factor.
mean square error - a weighted average of the variances for each run.
mean - the average of a set of values. We usually use x̄ or ȳ to denote a sample mean, whereas we use the Greek letter μ to denote a population (or true) mean.
measure of central tendency - numerical measures that depict the center of a data set. The most commonly used measures are the mean and the median.
modeling designs - designed experiments used to build a linear and / or non-linear relationship of the output(s) with a reasonable number (< 5) of inputs.
MTBF (mean time between failures) - mean time between successive failures of a repairable product. This is a measure of product reliability.
multicollinearity - the existence of strong correlations between input factors or independent variables. High levels of multicollinearity will prevent terms from being evaluated independently.
multiple regression - a model where several independent variables are used to predict one dependent variable.

N:

natural tolerances (of a process) - 3 standard deviations on either side of the center point (mean value). In a normally distributed process, the natural tolerances encompass 99.73 % of all measurements.
noise - unexplained variability in the response. Typically due to variables which are not controlled.
nominal - the desired mean value for a particular product dimension; the target value.
nonconforming unit - a sample (part) which has one or more nonconformities, making the sample unacceptable for its intended use.
normal distribution - the distribution characterized by the smooth, bell-shaped curve.
null hypothesis (H0) - the conclusion that typically includes equality, i.e., H0: μ1 = μ2 or H0: σ1 = σ2.
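The 99.73% figure quoted in the natural tolerances entry above can be checked numerically. This sketch uses the standard relationship between the normal distribution and the error function; the helper name `normal_coverage` is hypothetical.

```python
import math


def normal_coverage(n_sigma):
    """Fraction of a normal distribution lying within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2))


coverage = normal_coverage(3)
print(f"Within 3 standard deviations: {coverage:.2%}")
```

The same function gives the coverage for other widths, for example 1 or 2 standard deviations on either side of the mean.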

O:

one-at-a-time approach - a popular but inefficient way to conduct a designed experiment.
outer array - a Taguchi term used in parameter design to identify the combinations of noise factors to be studied in a robust designed experiment.
out of control (of a process) - a process is said to be out of control if it exhibits variations larger than its control limits, or shows a systematic pattern of variation.

P:

P (2 tail) - the probability that a term does not belong in a regression model. Typically, Rules of Thumb state that P(2 tail) < .1 indicates the term should be placed in the model.
P-value - the probability of making a Type 1 error. This value comes from the data itself. It also provides the exact level of significance of a hypothesis test.
Pareto chart - a bar chart for attribute (or categorical) data that is presented in descending order of frequency.
percent defective - for acceptance sampling; the percentage of units in a lot which are defective, i.e., of unacceptable quality.
population - a set or collection of objects or individuals. It can also be the corresponding set of values which measure a certain characteristic of a set of objects or individuals.
predicted value of y - a value of the response (dependent) variable y that is computed from the prediction equation for some particular value of the input (independent) variable x. It is labeled ŷ and is computed from ŷ = b0 + b1 x.
primary reference standard (for measurements) - a standard maintained by the National Institute of Standards and Technology (NIST) for a particular measuring unit. The primary reference standard duplicates as nearly as possible the international standard and is used to calibrate other (transfer) standards, which in turn are used to calibrate measuring instruments for industrial use.
probability - a measure of the likelihood of a given event occurring. It is a measure that takes on values between 0 and 1 inclusive, with 1 being the certain event and 0 meaning that there is no chance at all of the event occurring. How probabilities are assigned is another matter. The relative frequency approach to assigning probabilities is one of the most common.
process capability - comparing actual process performance with process specification limits. There are various measures of process capability, such as Cpk, σ-level, and dpm (defects per million).
process control - a process is said to be in control or it is a stable predictable process if all special causes of variation have been removed. Only common causes or natural variation remains in the process.

Q:

quality assurance - the function of assuring that a product or service will satisfy given needs. The function includes necessary verification, audits, and evaluations of quality factors affecting the intended usage and customer satisfaction. This function is normally the responsibility of one or more upper management individuals overseeing the quality assurance program.
quality characteristic - a particular aspect of a product which relates to its ability to perform its intended function.
quality function - the function of maintaining product quality levels; i.e., the execution of quality control.
quality specifications - particular specifications of the limits within which each quality characteristic of a product is to be maintained in order to meet the minimum functional requirements of the customer.
quality control - the process of maintaining an acceptable level of product quality.

R:

random sample - a sample selected from a population in such a way that every element of the population has an equally likely chance of being selected.
random - varying with no discernable pattern.
range - a measure of the variability in a data set. It is a value, namely the difference between the largest and smallest values in a data set.
regression analysis - a statistical technique for determining the mathematical relation between a measured quantity and the variables it depends on.
regression line - the line that is fit to a set of data points by using the method of least squares.
reliability - the probability that a product will function properly for some specified period of time under specified conditions.
repeatability (of a measurement) - the extent to which repeated measurements of a particular object with a particular instrument produce the same value.
reproducibility - the variation between individual people taking the same measurement and using the same gaging.
residual - the difference between an observed value and a predicted value: residual = y - ŷ.
resolution (of a measuring instrument) - the smallest unit of measure which an instrument is capable of indicating.
Rule of Thumb (ROT) - a simplified, practical procedure that can be used in place of a formal statistical test that will produce approximately the same result.
run chart - a basic graphical tool that charts a process over time, recording either individual readings or averages over time.

S:

sample size - the number of elements, or units, in a sample.
sample - a set of values or items selected from some population.
sampling variation - the variation of a sample's properties from the properties of the population from which it was drawn.
sampling - the process of selecting a sample of a population and determining the properties of the sample. The sample is chosen in such a way that its properties are representative of the population.
scatterplot - a 2-dimensional plot for displaying bivariate data.
screening designs - designed experiments used to identify the "vital few" factors from a large number of factors (> 6) to be tested.
short-run SPC - a set of techniques used for SPC in low-volume, short duration manufacturing or service.
sigma limits - for histograms; lines marked on the histogram showing the points n standard deviations above and below the mean.
sigma (σ) - the standard deviation of a statistical population.
Sigma capability - a commonly used measure of process capability that represents the number of standard deviations between the center of a process and the closest specification limit.
signal-to-noise (s/n) - a comparison of the influence of controllable factors (signals) to the influence of noise factors. The higher the s/n value, the better.
significance level - see level of significance.
simple linear regression - a model where one independent variable is used to predict one dependent variable.
simulation (modeling) - using a mathematical model of a system or process to predict the performance of the real system. The model consists of a set of equations or logic rules which operate on numerical values representing the operating parameters of the system. The result of the equation is a prediction of the system's output.
skewed distribution - a distribution that (graphically) has a longer tail on the right than it does on the left (or vice versa), i.e., it is not symmetric about its center point. Numerically, a distribution is skewed if the mean and the median are not the same.
slope - the term b1 in the prediction equation ŷ = b0 + b1 x.
special causes of variation - those nonrandom causes of variation that can be detected by the use of control charts and good process documentation.
specification (spec) limits - the bounds of acceptable values for a given product or process. They should be customer driven.
SSE - sum of squares due to error, or sum of squared residuals.
SST - total sum of squares, or the sum of squared deviations of the observed values from the mean.
stability (of a process) - a process is said to be stable if it shows no recognizable pattern of change.
standard (measurement) - a reference item providing a known value of a quantity to be measured. Standards may be primary - i.e., the standard essentially defines the unit of measure - or secondary (transfer) standards, which have been compared to the primary standard (directly or by way of an intermediate transfer standard). Standards are used to calibrate instruments which are then employed to make routine measurements.
standard deviation - one of the most common measures of variability in a data set or in a population.
standardized normal distribution - a normal distribution or random variable having a mean and standard deviation of 0 and 1, respectively. It is denoted by the symbol Z and is also called the Z distribution.
stationary point - the counterpart in single-variable calculus would be a critical point. A stationary point is a point where the gradient is zero, i.e., no slope at the point.
statistic - a value calculated from a random sample which is used to estimate a population parameter.
statistical inference - the process of drawing conclusions about a population on the basis of statistics.
statistical quality control (SQC) - the application of statistical methods for measuring and improving the quality of processes. SPC is one method included in SQC.
statistical control (of a process) - a process is said to be in a state of statistical control when it exhibits only random variation and is otherwise stable and predictable.
statistical process control - the use of basic graphical and statistical methods for analyzing and controlling the variation of a process, and thus continuously improving the process.
statistics - numerical measures obtained from a sample (as opposed to parameters, which are numerical measures of a population).
system design - the selection of materials, parts, products, factors, equipment, and process parameters.
systematic variation (of a process) - variation which exhibits a predictable pattern. The pattern may be cyclic (i.e., a recurring pattern) or may progress linearly (i.e., a trend).
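Several entries in this section (simple linear regression, slope, SSE, SST) fit together in one small calculation. The sketch below is only an illustration: the data are invented, and it assumes the standard least-squares formulas for b0 and b1 and the identity r-squared = 1 - SSE / SST, none of which are spelled out in this dictionary.

```python
# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates for the prediction equation y_hat = b0 + b1 * x.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx               # slope
b0 = y_bar - b1 * x_bar      # y-intercept

# Predicted values and residuals (observed minus predicted).
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

sse = sum(r ** 2 for r in residuals)          # sum of squared residuals
sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
r_squared = 1 - sse / sst                     # coefficient of determination

print(b0, b1, sse, sst, r_squared)
```

The closer SSE is to zero relative to SST, the closer r-squared is to 1 and the less noise is left unexplained by the model.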

T:

t-distribution - a symmetric, bell-shaped distribution that resembles the standardized normal (or Z) distribution, but it typically has more area in its tails than does the Z distribution. That is, it has greater variability than the Z distribution.
t-test - a hypothesis test of population means when small samples are involved.
Taguchi methods - experimental design thoughts and processes as developed by Taguchi. Based on philosophy of the loss function, he uses signal-to-noise ratios as the primary analysis tool. Orthogonal design matrices are tabled and originate from fractional factorials, Plackett-Burman, and latin square designs.
test statistic - a single value which combines the evidence obtained from sample data. The P-value in a hypothesis test is directly related to this value.
tolerance design - the specification of appropriate tolerances, product parameters, and process factors.
total quality management (TQM) - a management philosophy of integrated control, including engineering, purchasing, financial administration, marketing and manufacturing, to ensure customer satisfaction and economical cost of quality.
trend - a gradual, systematic change with time or some other variable.
Tukey test - a statistical test to measure the difference between several mean values and tell the user which ones are statistically different from the rest.
Type II error - concluding H0 (or failing to reject H0) when H1 is really true.
Type I error - concluding H1 (or rejecting H0) when H0 is really true.

U:

UCL (upper control limit) - for control charts: the upper limit below which a process statistic (X, R, etc.) must remain to be in control. Typically this value is 3 standard deviations above the center line.
uncontrollable factors - factors that are difficult, undesirable, or impossible to change. Also called "noise factors."  The uncontrollable factors included in a design matrix must be controllable in the experimental phase.
uniform distribution - a distribution in which all outcomes are equally likely.
USL (upper specification limit) - the highest value of a product dimension or measurement which is acceptable.

V:

variability - the property of exhibiting variation, i.e., changes or differences, in key measurements of a process.
variables data - concerning the values of a variable, or measurement data, as opposed to attribute data. A dimensional value can be recorded and is only limited in value by the resolution of the measurement system.
variables - quantities which are subject to change or variability.
variance - a measure of variability in a data set or population. It is the square of the standard deviation.

X:


x̄ and R charts - for variables data; control charts for the average and range of subgroups of data.
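A minimal sketch of how limits for the average and range charts might be computed is given below. It assumes the widely tabulated control chart constants for subgroups of size 5 (A2 = 0.577, D3 = 0, D4 = 2.114) and uses three made-up subgroups purely for illustration; a real study would normally use many more subgroups.

```python
# Commonly tabulated control chart constants for subgroups of size 5 (assumed).
A2, D3, D4 = 0.577, 0.0, 2.114

# Hypothetical subgroups of 5 measurements each.
subgroups = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.1, 9.9, 10.0, 10.1],
    [9.8, 10.0, 10.2, 10.1, 9.9],
]

means = [sum(s) / len(s) for s in subgroups]
ranges = [max(s) - min(s) for s in subgroups]

grand_average = sum(means) / len(means)   # center line of the average chart
r_bar = sum(ranges) / len(ranges)         # center line of the range chart

# Limits for the chart of subgroup averages.
xbar_ucl = grand_average + A2 * r_bar
xbar_lcl = grand_average - A2 * r_bar

# Limits for the chart of subgroup ranges.
r_ucl = D4 * r_bar
r_lcl = D3 * r_bar

print(xbar_lcl, grand_average, xbar_ucl)
print(r_lcl, r_bar, r_ucl)
```

Subgroup averages or ranges that fall outside these limits would be treated as signals that the process may be out of control.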

Y:

y-intercept - the term b0 in the prediction equation ŷ = b0 + b1 x.

Z:

z-value - a standardized value formed by subtracting the mean and then dividing this difference by the standard deviation.
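The z-value calculation can be sketched as follows. The data are invented and, as an assumption for this example, are treated as a complete population, so the population standard deviation (`pstdev`) is used.

```python
from statistics import mean, pstdev

# Hypothetical data, treated here as a complete population.
data = [4.0, 6.0, 8.0, 10.0, 12.0]

mu = mean(data)
sigma = pstdev(data)

# Subtract the mean, then divide the difference by the standard deviation.
z_values = [(x - mu) / sigma for x in data]

print(z_values)
```

After standardizing, the z-values have a mean of 0 and a standard deviation of 1, which is what allows them to be compared against the standardized normal (Z) distribution.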
