The invisible precision: Measurement technology, quality assurance and statistical certainty. In the world of digital measurement technology, where precision and reliability are crucial, the quality assurance of measurements is of paramount importance.
Much like in gaming, where every pixel and every calculation counts, the results in technical projects must also be reliable. This is where measurement technology comes into play: standardized procedures and statistical analyses ensure the reliability of data.
Dr. Wolfgang Kessel of PTB wrote about this in his publication ‘General explanation of measurement uncertainty’:
The [...] definition of measurement uncertainty expresses the well-known fact that measurements do not, indeed cannot, provide an exact value. Measurements are subject to shortcomings and imperfections that cannot be accurately quantified. Some of them are caused by random effects, such as short-term fluctuations in the temperature, humidity and air pressure of the environment. The uneven performance of the observer carrying out the measurement can also be a source of random effects: for example, when deviations from a scale value must be estimated while reading a value, or when a parameter has to be set in a measurement process. Measurements repeated under the same conditions therefore show different results due to these random influences.
Other inadequacies and imperfections are due to the fact that certain systematic effects cannot be corrected exactly or are only approximately known. These include, among other things, the zero-point deviation of a measuring instrument, the change in a reference standard’s characteristic values between two calibrations (drift), the observer’s bias towards reproducing a previously obtained value on reading, or the uncertainty with which the value of a reference standard or reference material is stated in a certificate or manual.
In addition to statistical confidence levels, it is above all the standardized procedures of measurement system analysis (MSA) that make the reliability of measurements tangible. Without a capable measuring system, machine or process capability indices such as Cm, Cmk, Pp or Ppk cannot provide reliable statements. Measurement capability therefore forms the basis of every reliable quality statement.
For laypeople, this can be summarized as follows: A measuring device is like a camera. If the lens is sharp (accuracy), the image is repeatably the same (repeatability), consistent across photographers (reproducibility) and stable over time, then we can trust the shots.
A basic tool in metrology is Procedure 1 for assessing measuring equipment capability (MSA Procedure 1). This method evaluates the suitability of a measuring device for a specific measurement task by focusing on repeatability. It examines how close individual measured values are to each other when the same specimen with the same characteristic is measured by the same operator with the same measuring equipment under identical conditions.
An important prerequisite for this method is that the measuring equipment has sufficient resolution, typically no more than 5% of the tolerance of the characteristic to be measured, in order to ensure reliable and legible readings.
Procedure 1 is a short-term assessment and is often used as part of routine audits or interim checks to assess measurement stability under conditions as realistic as possible.
The reliability of measurements is expressed by the measurement uncertainty, which is often quantified by means of confidence levels. Many measurements, especially in quality assurance, follow a normal distribution (also called a Gaussian distribution), graphically represented as a bell curve. Within this normal distribution, defined percentages of the data points lie within certain standard deviations (sigma, σ) of the mean:
68.27% confidence level (±1σ): About 68% of the measured values lie within one standard deviation of the mean. This can already be useful for quality assurance: out of, say, 10 test specimens, about 7 are ‘good enough’ or dimensionally accurate.
95.45% confidence level (±2σ): Approximately 95.45% of the measured values lie within two standard deviations of the mean. This is a commonly used confidence level, including in opinion polls, where it underlies the stated ‘margin of error’.
99.73% confidence level (±3σ): An even higher level of confidence, where 99.73% of the measured values lie within three standard deviations of the mean. This ‘three-sigma rule’ is often regarded as near certainty in the empirical sciences. In practice it is often ‘nice to have’, but should not be pursued at all costs.
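The three coverage percentages above follow directly from the normal distribution’s cumulative function; a minimal sketch using only the Python standard library:

```python
import math

def coverage(k: float) -> float:
    """Probability that a normally distributed value lies within ±k·σ of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"±{k}σ: {coverage(k) * 100:.2f}%")
# → ±1σ: 68.27%, ±2σ: 95.45%, ±3σ: 99.73%
```

The same function also answers intermediate questions, e.g. `coverage(1.96)` gives the familiar 95% level used in many confidence intervals.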
These statistical certainties form the basis for advanced quality management methods. They make it possible to quantify the reliability and consistency of a process or measurement.
The ‘Level 92’ philosophy is confirmed here once again: It is about achieving a high, statistically sound level of quality and reliability that is ‘good enough’ for the intended purpose, without getting lost in an unattainable pursuit of absolute perfection.
Measuring equipment capability: The basis for the most reliable data possible
A measuring instrument is ‘capable’ if it depicts reality so accurately and precisely that the values obtained can be reliably used for process decisions. The most important criteria are:
- Accuracy (bias): Deviation from the reference value.
- Repeatability: How close are several measurements of the same specimen to each other?
- Reproducibility: How much do different operators influence the results?
- Linearity: Is the measuring system reliable over the entire measuring range?
- Stability: Does the measurement system provide consistent results over long periods of time?
- Discrimination (NDC): Does the measuring system reliably detect differences between test specimens?
Methods of Measuring System Analysis (MSA)
Procedure 1 – Short-term assessment of measuring capability
Procedure 1 is used to evaluate new or modified measurement systems. It focuses on repeatability and, where a reference standard exists, also on systematic deviation.
- Practice: 20-50 repeat measurements on a reference standard by the same inspector under identical conditions.
- Key figures: Cg (spread) and Cgk (spread plus deviation from the nominal value).
- Criteria: Resolution ≤ 5% of the tolerance; calibration uncertainty of the reference standard significantly smaller than the tolerance.
Procedure 1 is therefore a kind of ‘quick test’ before a measuring device goes into productive use.
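The Cg/Cgk key figures can be sketched in a few lines. Note that the tolerance share (here 20%) and the spread factor (here 4·s) differ between company guidelines, so the constants below are one common convention, not the definitive formula:

```python
import statistics

def cg_cgk(readings, nominal, tolerance, share=0.2, k=4):
    """Cg/Cgk under one common convention: a share of the tolerance compared
    against k standard deviations of the repeat readings on the reference
    standard. `share` and `k` are assumptions; guidelines differ (e.g. 6·s)."""
    s = statistics.stdev(readings)                      # spread of repeats
    bias = abs(statistics.mean(readings) - nominal)     # systematic deviation
    cg = (share * tolerance) / (k * s)                  # spread only
    cgk = (share * tolerance / 2 - bias) / (k * s / 2)  # spread + bias
    return cg, cgk
```

With zero bias, Cgk equals Cg; acceptance thresholds (often Cg, Cgk ≥ 1.33) again depend on the applicable guideline.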
Procedure 2 – Gauge R&R (repeatability and reproducibility)
While Procedure 1 checks pure repeatability, Procedure 2 goes further: It also examines the influence of several operators.
- Practice: Several inspectors measure several specimens, each several times.
- Key figure: The R&R value (the measurement system’s share of the total variation).
- Targets: An R&R value below 10% of the tolerance is ideal, 10-30% acceptable, above 30% critical.
This method is particularly important when human influence plays a role, for example in manual measurements with gauges or micrometers.
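The idea behind the R&R value can be illustrated with a simplified variance-components sketch. This is not the full AIAG average-and-range procedure, and the function name and data layout are illustrative; it assumes at least two operators, parts and trials, and expresses %GRR relative to the total observed variation (one of the two common reference bases, the other being the tolerance):

```python
import statistics

def gauge_rr(data):
    """Simplified Gauge R&R: data maps (operator, part) -> list of repeats.
    Returns %GRR of total variation. Illustrative approximation only."""
    cells = list(data.values())
    trials = len(cells[0])
    operators = sorted({o for o, _ in data})
    parts = sorted({p for _, p in data})

    # Repeatability (EV): pooled within-cell variance
    ev_var = statistics.mean(statistics.variance(c) for c in cells)
    # Reproducibility (AV): spread of operator means, corrected for EV
    op_means = [statistics.mean([x for (o, _), c in data.items() if o == op for x in c])
                for op in operators]
    av_var = max(statistics.variance(op_means) - ev_var / (len(parts) * trials), 0.0)
    # Part variation (PV): spread of part means, corrected for EV
    part_means = [statistics.mean([x for (_, p), c in data.items() if p == part for x in c])
                  for part in parts]
    pv_var = max(statistics.variance(part_means) - ev_var / (len(operators) * trials), 0.0)

    grr_var = ev_var + av_var
    return 100 * (grr_var / (grr_var + pv_var)) ** 0.5
```

A perfect measuring system (identical repeats, identical operators) yields 0%; real systems land somewhere in between, and the 10%/30% targets above then apply.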
Procedure 3 – For automated systems
Procedure 3 is a special form of Procedure 2 and tests measuring systems without direct operator influence, such as coordinate measuring machines or automated inline inspections.
- Practice: Repeat tests on the system, usually automated.
- Key figure: also R&R value.
- Objective: Demonstrate that the system measures stably and capably even without operator influence.
Procedure 7 – Attributive tests
Not all characteristics can be expressed in numbers. Often it is about good/bad decisions, such as visual inspections or checks with go/no-go gauges. This is where Procedure 7 comes in.
- Practice: Inspectors or systems evaluate the same parts multiple times.
- Key figures: Agreement rate, kappa value or error rate.
- Objective: Ensure that audit decisions are clear, reproducible and reliable.
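For attributive tests, the kappa value corrects the raw agreement rate for agreement that would occur by pure chance. A minimal sketch of Cohen’s kappa for two raters judging the same parts:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' judgements (e.g. 'good'/'bad') of the
    same parts: 1.0 = perfect agreement, 0.0 = no better than chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's label frequencies
    expected = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)
```

Two inspectors who always agree score 1.0; two who agree only as often as coin flips would score around 0, which is why kappa is more informative than the raw agreement rate alone.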
Measurement uncertainty and confidence levels
The reliability of measurements is expressed by the measurement uncertainty. It describes the range in which the true value lies with a certain probability. Statistical confidence levels again play a key role here:
- 68.27% (±1σ): A first estimate; about 7 out of 10 test specimens are within tolerance.
- 95.45% (±2σ): A commonly used level, also known from surveys as the ‘margin of error’.
- 99.73% (±3σ): Near certainty, often sought in highly critical applications.
The two faces of measurement uncertainty: Type A and Type B
A physical measurement never provides the ‘true value’ with absolute certainty. Instead, the measurement result is always accompanied by a measurement uncertainty (u), which defines a range in which the true value lies with a certain probability. This uncertainty arises from two main sources:
Systematic deviations (Type B): These are predictable, often constant deviations that come from external sources. Examples are deviations stated in the measuring instrument’s calibration certificate, environmental influences such as temperature fluctuations, or the individual influence of the operator.
Random deviations (Type A): These uncertainties result from the scatter of repeated measurements under the same conditions. They are unpredictable and are captured statistically, for example by the standard deviation of a measurement series. This can later help to center processes.
The combined measurement uncertainty results from both contributions. Important: It is not an error, but a measure of confidence in a measurement result.
Example: If a caliper reading is stated as 10.00 mm ± 0.05 mm, this means that the true value lies between 9.95 mm and 10.05 mm with high probability. This joint statement makes confidence in a measurement result tangible.
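Combining Type A and Type B contributions follows the GUM’s root-sum-of-squares rule. A minimal sketch, assuming the Type B components are already expressed as standard uncertainties:

```python
import math
import statistics

def combined_uncertainty(repeat_readings, type_b):
    """GUM-style combined standard uncertainty: Type A from the spread of
    repeated readings (standard uncertainty of the mean), combined in
    quadrature with the given Type B standard uncertainties."""
    u_a = statistics.stdev(repeat_readings) / math.sqrt(len(repeat_readings))
    return math.sqrt(u_a ** 2 + sum(u ** 2 for u in type_b))
```

Multiplying the result by a coverage factor k = 2 then gives the expanded uncertainty U for a roughly 95% confidence level, matching the ±2σ level above.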
Again, I would like to quote Dr. Kessel:
The definition of measurement uncertainty shows that it is a quantitative measure of the quality of the respective measurement result. It answers the question of how well the obtained result reflects the value of the measurand. It allows the user to assess the reliability of the measurement result, for example to compare the results of different measurements of the same measurand with each other or with reference values. Confidence in the comparability of measurement results is important in national and international trade in goods. It helps to break down trade and economic barriers.
A measurement value is often to be compared with limit values specified in a specification or normative regulation. In this case, the measurement uncertainty can be used to determine whether the measurement result is clearly within the specified limits or whether the requirements are only barely met. If the measured value is very close to a limit value, there is a great risk that the measured value will not meet the requirements. The associated measurement uncertainty is an important help in this case to assess this risk realistically.
Linear regression and measurement uncertainty for measurement series
In practice, entire series of measurements are often recorded. To analyze trends or relationships, one uses linear regression: a straight line is fitted through the measuring points, describing the relationship between two variables.
Uncertainty plays a role here too: The greater the scatter of the points around the regression line, the more uncertain the prediction. Statistics provides tools such as the coefficient of determination (R²) or confidence intervals to capture this uncertainty quantitatively.
For laypeople, this can be explained as follows: The closer the measuring points are to the line, the better the prediction. If they scatter widely, trust in the statement decreases.
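The fit and its quality measure can be computed by hand; a minimal ordinary-least-squares sketch returning the line y = a + b·x and R²:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x plus the coefficient of
    determination R² (1.0 = points exactly on the line)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                     # slope
    a = my - b * mx                   # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot  # R² from residual vs. total scatter
```

Points lying exactly on a line give R² = 1; the more they scatter around the fitted line, the closer R² drops towards 0, which is exactly the ‘trust’ described above.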
Best practices for the application
- Always run an MSA before machine or process capability analyses.
- Repeat the analysis for new measuring systems, after repairs, process changes or at regular intervals.
- Use realistic conditions: Same environment, same inspectors, same components.
- Document the results – they are central to audits and customer approvals.
- Aim for ‘good enough’: The goal is not absolute perfection, but reliable, purposeful results.
Conclusion: The Level 92 Philosophy in Measurement Technology
Measurement technology is not about forcing unattainable perfection. It is crucial to achieve a high, statistically assured level of quality and reliability.
A measuring system must be ‘capable’: not flawless, but reliable for the intended purpose. Therein lies the art: the balance between precision, effort and pragmatism. Or, in the language of the Level 92 philosophy: ‘good enough to really get ahead’.