Difference between revisions of "User:YasuakiH/Weibull distribution"

Edit summary: removed the draft of the proportional hazards model
{{short description|Frequency with which an engineered system or component fails}}
'''Failure rate''' is the [[frequency]] with which an [[systems engineering|engineered system]] or component fails, expressed in failures per unit of time. It is usually denoted by the [[Greek alphabet|Greek letter]] [[λ]] (lambda) and is often used in [[reliability engineering]].
 
The failure rate of a system usually depends on time, with the rate varying over the life cycle of the system. For example, an automobile's failure rate in its fifth year of service may be many times greater than its failure rate during its first year of service. One does not expect to replace an exhaust pipe, overhaul the brakes, or have major [[Transmission (mechanics)|transmission]] problems in a new vehicle.
 
In practice, the [[mean time between failures]] (MTBF, 1/λ) is often reported instead of the failure rate. This is valid and useful if the failure rate may be assumed constant, as is often done for complex units and systems and for electronics, and it is the agreed convention in some reliability standards (military and aerospace). In this case it relates ''only'' to the flat region of the [[bathtub curve]], which is also called the "useful life period". Because of this, it is incorrect to extrapolate MTBF to estimate the service lifetime of a component, which will typically be much shorter than the MTBF suggests because of the much higher failure rates in the "end-of-life wearout" part of the bathtub curve.
 
MTBF figures are often preferred because large positive numbers (such as 2000 hours) are more intuitive and easier to remember than very small numbers (such as 0.0005 per hour).
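
As a minimal sketch of the conversion described above, assuming a constant failure rate (the function names are illustrative and are not taken from any reliability standard):

```python
def mtbf_from_failure_rate(failure_rate_per_hour: float) -> float:
    """Return the MTBF in hours for a constant failure rate (failures per hour)."""
    if failure_rate_per_hour <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate_per_hour


def failure_rate_from_mtbf(mtbf_hours: float) -> float:
    """Return the constant failure rate (failures per hour) for an MTBF in hours."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 / mtbf_hours


# The two figures quoted in the text describe the same component.
print(mtbf_from_failure_rate(0.0005))   # approximately 2000 hours
print(failure_rate_from_mtbf(2000.0))   # approximately 0.0005 per hour
```

The conversion is only meaningful in the constant-rate ("useful life") region discussed above.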
 
The MTBF is an important system parameter in systems where failure rate needs to be managed, in particular for safety systems. The MTBF appears frequently in [[engineering]] design requirements and governs the frequency of required system maintenance and inspections. In special processes called [[renewal process]]es, where the time to recover from failure can be neglected and the likelihood of failure remains constant with respect to time, the failure rate is simply the multiplicative inverse of the MTBF (λ = 1/MTBF).
 
A similar ratio used in the [[transport industry|transport industries]], especially in [[railway]]s and [[Truck driver|trucking]], is "mean distance between failures", a variation which attempts to [[Correlation|correlate]] actual loaded distances to similar reliability needs and practices.
 
Failure rates are important factors in the insurance, finance, commerce and regulatory industries and fundamental to the design of safe systems in a wide variety of applications.
 
==Failure rate data==
Failure rate [[data]] can be obtained in several ways. The most common means are:
;Estimation:From field failure rate reports, statistical analysis techniques can be used to estimate failure rates. For accurate failure rates the analyst must have a good understanding of equipment operation, procedures for data collection, the key environmental variables impacting failure rates, how the equipment is used at the system level, and how the failure data will be used by system designers.
;Historical data about the device or system under consideration: Many organizations maintain internal databases of failure information on the devices or systems that they produce, which can be used to calculate failure rates for those devices or systems. For new devices or systems, the historical data for similar devices or systems can serve as a useful estimate.
;Government and commercial failure rate data: Handbooks of failure rate data for various components are available from government and commercial sources. [[#Online|MIL-HDBK-217F]], ''Reliability Prediction of Electronic Equipment'', is a [[United States Military Standard|military standard]] that provides failure rate data for many military electronic components. Several failure rate data sources are available commercially that focus on commercial components, including some non-electronic components.
;Prediction: Time lag is one of the serious drawbacks of all failure rate estimations. Often by the time the failure rate data are available, the devices under study have become obsolete. Because of this drawback, failure-rate prediction methods have been developed. These methods may be used on newly designed devices to predict the device's failure rates and failure modes. Two approaches have become well known: cycle testing and FMEDA.
; Life Testing: The most accurate source of data is to test samples of the actual devices or systems in order to generate failure data. This is often prohibitively expensive or impractical, so that the previous data sources are often used instead.
;Cycle Testing: Mechanical movement is the predominant failure mechanism causing mechanical and electromechanical devices to wear out. For many devices, the wear-out failure point is measured by the number of cycles performed before the device fails, and can be discovered by cycle testing. In cycle testing, a device is cycled as rapidly as practical until it fails. When a collection of these devices is tested, the test runs until 10% of the units fail dangerously.
;FMEDA: [[Failure modes, effects, and diagnostic analysis]] (FMEDA) is a systematic analysis technique to obtain subsystem / product level failure rates, failure modes and design strength. The FMEDA technique considers:
* All components of a design,
* The functionality of each component,
* The failure modes of each component,
* The effect of each component failure mode on the product functionality,
* The ability of any automatic diagnostics to detect the failure,
* The design strength (de-rating, safety factors) and
* The operational profile (environmental stress factors).
Given a component database calibrated with field failure data that is reasonably accurate<ref>{{cite manual
| title = Electrical & Mechanical Component Reliability Handbook
| publisher = exida
| year = 2006
| url = http://www.exida.com
}}</ref>, the method can predict product level failure rate and failure mode data for a given application. The predictions have been shown to be more accurate<ref>{{cite manual
| last = Goble
| first = William M.
|author2= Iwan van Beurden
| title = Combining field failure data with new instrument design margins to predict failure rates for SIS Verification
| publisher = Proceedings of the 2014 International Symposium - BEYOND REGULATORY COMPLIANCE, MAKING SAFETY SECOND NATURE, Hilton College Station-Conference Center, College Station, Texas
| year = 2014
}}</ref> than field warranty return analysis or even typical field failure analysis, given that these methods depend on reports that typically lack sufficiently detailed information in their failure records.<ref>W. M. Goble, "Field Failure Data – the Good, the Bad and the Ugly," exida, Sellersville, PA [http://www.exida.com/resources/whitepapers.asp]</ref>
 
==Failure rate in the discrete sense==
 
The failure rate can be defined as the following:
 
:The total number of failures within an item [[statistical population|population]], divided by the total time expended by that population, during a particular measurement interval under stated conditions. (MacDiarmid, ''et al.'')
 
Although the failure rate, <math>\lambda (t)</math>, is often thought of as the [[probability]] that a failure occurs in a specified interval given no failure before time <math>t</math>, it is not actually a probability because it can exceed 1. Expressing the failure rate as a percentage can therefore give a misleading impression of the measure, especially when it is measured on repairable systems or on multiple systems with non-constant failure rates or different operating times. The failure rate can be defined with the aid of the [[reliability function]], also called the survival function, <math>R(t)=1-F(t)</math>, the probability of no failure before time <math>t</math>.
 
::<math>\lambda(t) = \frac{f(t)}{R(t)}</math>, where <math>f(t)</math> is the time to (first) failure distribution (i.e. the failure density function).
 
::<math>\lambda(t) = \frac{R(t_1)-R(t_2)}{(t_2-t_1) \cdot R(t_1)}
= \frac{R(t)-R(t+\Delta t)}{\Delta t \cdot R(t)} \!</math>
 
over a time interval <math>\Delta t</math> = <math>(t_2-t_1)</math> from <math>t_1</math> (or <math>t</math>) to <math>t_2</math>. Note that <math>\tfrac{R(t)-R(t+\Delta t)}{R(t)}</math> is a [[conditional probability]]: the probability of failure in the interval given that no failure has occurred before time <math>t</math>. Hence the <math>R(t)</math> in the denominator.
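
The interval formula can be evaluated numerically. The following is a sketch only: it assumes an exponential reliability function with an arbitrarily chosen rate of 0.001 per hour, and the helper names are illustrative.

```python
import math


def interval_failure_rate(reliability, t1, t2):
    """Evaluate (R(t1) - R(t2)) / ((t2 - t1) * R(t1)) for a reliability function R."""
    if t2 <= t1:
        raise ValueError("t2 must be greater than t1")
    r1 = reliability(t1)
    return (r1 - reliability(t2)) / ((t2 - t1) * r1)


ASSUMED_RATE = 0.001  # failures per hour; an assumption made for this illustration


def exponential_reliability(t):
    """R(t) = exp(-rate * t) for the assumed constant rate."""
    return math.exp(-ASSUMED_RATE * t)


# Shrinking the interval moves the result towards the assumed constant rate.
for delta_t in (100.0, 10.0, 1.0):
    rate = interval_failure_rate(exponential_reliability, 500.0, 500.0 + delta_t)
    print(f"delta_t = {delta_t:6.1f} h  ->  {rate:.7f} per hour")
```

For this reliability function the value depends only on the interval length and not on the starting time, which reflects the constant hazard of the exponential distribution.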
 
Hazard rate and ROCOF (rate of occurrence of failures) are often incorrectly seen as the same and equal to the failure rate. The two differ for repairable items: the more promptly items are repaired, the sooner they can fail again, so the higher the ROCOF. The hazard rate, however, is independent of the time to repair and of the logistic delay time.
 
==Failure rate in the continuous sense==
[[File:Loglogistichaz.svg|thumb|right|300px|Hazard function <math>h(t)</math> plotted for a selection of [[log-logistic distribution]]s.]] Calculating the failure rate for ever smaller intervals of time results in the '''{{visible anchor|hazard function}}''' (also called '''hazard rate'''), <math>h(t)</math>. This becomes the ''instantaneous'' failure rate, or instantaneous hazard rate, as <math>\Delta t</math> approaches zero:
 
:<math>h(t)=\lim_{\Delta t \to 0} \frac{R(t)-R(t+\Delta t)}{\Delta t \cdot R(t)}.</math>
 
A continuous failure rate depends on the existence of a '''failure distribution''', <math>F(t)</math>, which is a [[cumulative distribution function]] that describes the probability of failure (at least) up to and including time ''t'',
 
:<math>\operatorname{Pr}(T\le t)=F(t)=1-R(t),\quad t\ge 0, \!</math>
 
where <math>{T}</math> is the failure time.
The failure distribution function is the integral of the failure [[probability density function|''density'' function]], ''f''(''t''),
 
:<math>F(t)=\int_{0}^{t} f(\tau)\, d\tau. \!</math>
 
The hazard function can be defined now as
 
:<math>h(t)=\frac{f(t)}{1-F(t)}=\frac{f(t)}{R(t)}.</math>
 
[[File:Exponential pdf.svg|thumb|right|300px|Exponential failure density functions. Each of these has a (different) constant hazard function (see text).]]
Many probability distributions can be used to model the failure distribution (''see [[List of probability distributions|List of important probability distributions]]''). A common model is the '''exponential failure distribution''',
 
:<math>F(t)=\int_{0}^{t} \lambda e^{-\lambda \tau}\, d\tau = 1 - e^{-\lambda t}, \!</math>
 
which is based on the [[exponential distribution|exponential density function]]. The hazard rate function for this is:
 
:<math>h(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda .</math>
 
Thus, for an exponential failure distribution, the hazard rate is a constant with respect to time (that is, the distribution is "[[memorylessness|memory-less]]"). For other distributions, such as a [[Weibull distribution]] or a [[log-normal distribution]], the hazard function may not be constant with respect to time. For some such as the [[deterministic distribution]] it is [[monotonic]] increasing (analogous to [[Wear and tear|"wearing out"]]), for others such as the [[Pareto distribution]] it is monotonic decreasing (analogous to [[Burn-in|"burning in"]]), while for many it is not monotonic.
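
As an illustration of these cases, the sketch below evaluates the hazard function of a two-parameter Weibull distribution, <math>h(t) = \tfrac{k}{\eta}\left(\tfrac{t}{\eta}\right)^{k-1}</math>. The shape/scale parameterisation and the numeric values are assumptions chosen for the example.

```python
def weibull_hazard(t, shape, scale):
    """Hazard of a two-parameter Weibull distribution for t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


SCALE_HOURS = 1000.0  # assumed characteristic life, for illustration only

for shape in (0.5, 1.0, 2.0):
    values = [weibull_hazard(t, shape, SCALE_HOURS) for t in (10.0, 100.0, 1000.0)]
    print(shape, [f"{v:.6f}" for v in values])
```

A shape parameter below 1 gives a decreasing hazard, a shape of exactly 1 reproduces the constant hazard of the exponential distribution, and a shape above 1 gives an increasing hazard.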
 
Solving the differential equation
:<math>h(t)=\frac{f(t)}{1-F(t)}=\frac{F'(t)}{1-F(t)}</math>
 
for <math>F(t)</math>, it can be shown that
 
:<math>F(t) = 1 - \exp{\left(-\int_0^t h(\tau)\, d\tau \right)}.</math>
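
The intermediate steps can be sketched as follows, writing <math>\tau</math> for the variable of integration and assuming <math>F(0)=0</math>. The right-hand side of the differential equation is a logarithmic derivative,

:<math>h(\tau) = \frac{F'(\tau)}{1-F(\tau)} = -\frac{d}{d\tau}\ln\bigl(1-F(\tau)\bigr),</math>

so integrating from 0 to <math>t</math> gives

:<math>\int_0^t h(\tau)\, d\tau = -\ln\bigl(1-F(t)\bigr) + \ln\bigl(1-F(0)\bigr) = -\ln\bigl(1-F(t)\bigr),</math>

and exponentiating both sides yields <math>1-F(t)=\exp\left(-\int_0^t h(\tau)\, d\tau\right)</math>, which rearranges to the expression for <math>F(t)</math> above.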
 
==Decreasing failure rate==
A decreasing failure rate (DFR) describes a phenomenon where the probability of an event in a fixed time interval in the future decreases over time. A decreasing failure rate can describe a period of "infant mortality" where earlier failures are eliminated or corrected<ref>{{Cite book | doi = 10.1007/978-1-84800-986-8_1 | chapter = Introduction | first = Maxim | last = Finkelstein| title = Failure Rate Modelling for Reliability and Risk | series = Springer Series in Reliability Engineering | pages = 1–84 | year = 2008 | isbn = 978-1-84800-985-1 }}</ref> and corresponds to the situation where λ(''t'') is a [[decreasing function]].
 
Mixtures of DFR variables are DFR.<ref name="brown1980">{{Cite journal | last1 = Brown | first1 = M. | title = Bounds, Inequalities, and Monotonicity Properties for Some Specialized Renewal Processes | doi = 10.1214/aop/1176994773 | journal = The Annals of Probability | volume = 8 | issue = 2 | pages = 227–240 | jstor = 2243267| year = 1980 | doi-access = free }}</ref> Mixtures of [[exponential distribution|exponentially distributed]] random variables are [[Hyperexponential distribution|hyperexponentially distributed]].
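
A small numerical sketch shows the effect for a hyperexponential distribution. The mixing weights and rates are arbitrary values assumed for the illustration, and the function name is not from any library.

```python
import math


def hyperexponential_hazard(t, weights, rates):
    """Hazard f(t)/R(t) of a mixture of exponential distributions."""
    density = sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates))
    survival = sum(w * math.exp(-r * t) for w, r in zip(weights, rates))
    return density / survival


WEIGHTS = (0.5, 0.5)    # assumed mixing probabilities
RATES = (0.01, 0.001)   # assumed constant rates of the two sub-populations, per hour

for t in (0.0, 100.0, 1000.0, 10000.0):
    print(t, hyperexponential_hazard(t, WEIGHTS, RATES))
```

Each sub-population has a constant hazard, but the mixture's hazard falls over time because the units with the higher rate tend to fail first, leaving a surviving population dominated by the lower rate.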
 
===Renewal processes===
 
For a [[renewal process]] with DFR inter-renewal times, the renewal function is concave.<ref name="brown1980" /><ref name="shanthikumar">{{Cite journal | last1 = Shanthikumar | first1 = J. G. | doi = 10.1214/aop/1176991910 | title = DFR Property of First-Passage Times and its Preservation Under Geometric Compounding | journal = The Annals of Probability | volume = 16 | issue = 1 | pages = 397–406 | year = 1988 | jstor = 2243910| doi-access = free }}</ref> Brown conjectured the converse, that DFR inter-renewal times are also necessary for the renewal function to be concave,<ref>{{Cite journal | last1 = Brown | first1 = M. | title = Further Monotonicity Properties for Specialized Renewal Processes | doi = 10.1214/aop/1176994317 | journal = The Annals of Probability | volume = 9 | issue = 5 | pages = 891–895 | year = 1981 | jstor = 2243747| doi-access = free }}</ref> but it has been shown that this conjecture holds neither in the discrete case<ref name="shanthikumar" /> nor in the continuous case.<ref>{{Cite journal | last1 = Yu | first1 = Y. | title = Concave renewal functions do not imply DFR interrenewal times | doi = 10.1239/jap/1308662647 | journal = Journal of Applied Probability | volume = 48 | issue = 2 | pages = 583–588 | year = 2011 | arxiv = 1009.2463 }}</ref>
 
===Applications===
 
An increasing failure rate is an intuitive concept: it arises when components wear out. A decreasing failure rate describes a system that improves with age.<ref name="proschan" />
Decreasing failure rates have been found in the lifetimes of spacecraft, Baker and Baker commenting that "those spacecraft that last, last on and on."<ref>{{Cite journal | last1 = Baker | first1 = J. C. | last2 = Baker | first2 = G. A. S. | doi = 10.2514/3.28040 | title = Impact of the space environment on spacecraft lifetimes | journal = Journal of Spacecraft and Rockets | volume = 17 | issue = 5 | pages = 479 | year = 1980 | bibcode = 1980JSpRo..17..479B }}</ref><ref>{{Cite book | doi = 10.1002/9781119994077.ch1 | chapter = On Time, Reliability, and Spacecraft | first1 = Joseph Homer | last1 = Saleh | first2 =Jean-François | last2 =Castet| title = Spacecraft Reliability and Multi-State Failures | pages = 1 | year = 2011 | isbn = 9781119994077 }}</ref> The failure times of aircraft air conditioning systems were individually found to have an [[exponential distribution]], and thus the pooled population has a DFR.<ref name="proschan">{{Cite journal | last1 = Proschan | first1 = F. | title = Theoretical Explanation of Observed Decreasing Failure Rate | doi = 10.1080/00401706.1963.10490105 | journal = Technometrics | volume = 5 | issue = 3 | pages = 375–383 | jstor = 1266340| year = 1963 }}</ref>
 
===Coefficient of variation===
 
When the failure rate is decreasing the [[coefficient of variation]] is ⩾&nbsp;1, and when the failure rate is increasing the coefficient of variation is ⩽&nbsp;1.<ref>{{Cite journal | last1 = Wierman | first1 = A. | author-link1 = Adam Wierman| last2 = Bansal | first2 = N. | last3 = Harchol-Balter | first3 = M.|author3-link=Mor Harchol-Balter | title = A note on comparing response times in the M/GI/1/FB and M/GI/1/PS queues | doi = 10.1016/S0167-6377(03)00061-0 | journal = Operations Research Letters | volume = 32 | pages = 73–76 | url = http://users.cms.caltech.edu/~adamw/papers/fbnote.pdf| year = 2004 }}</ref> Note that this result only holds when the failure rate is defined for all t&nbsp;⩾&nbsp;0<ref>{{cite book | title = Analysis of Queues: Methods and Applications | first = Natarajan | last =Gautam | publisher = CRC Press | year = 2012 | page = 703 | isbn = 978-1439806586}}</ref> and that the converse result (coefficient of variation determining nature of failure rate) does not hold.
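
The relationship can be checked for one family of distributions. The sketch below assumes a Weibull distribution, whose hazard is decreasing for shape parameter <math>k<1</math> and increasing for <math>k>1</math>, and uses the closed-form moments <math>\operatorname{E}[T^n]=\eta^n\,\Gamma(1+n/k)</math>; the coefficient of variation then does not depend on the scale <math>\eta</math>.

```python
import math


def weibull_coefficient_of_variation(shape):
    """Standard deviation divided by mean for a Weibull distribution."""
    if shape <= 0:
        raise ValueError("shape must be positive")
    first_moment = math.gamma(1.0 + 1.0 / shape)    # E[T] / scale
    second_moment = math.gamma(1.0 + 2.0 / shape)   # E[T^2] / scale^2
    return math.sqrt(second_moment / first_moment ** 2 - 1.0)


for shape in (0.5, 1.0, 2.0):
    print(shape, weibull_coefficient_of_variation(shape))
```

The shape value of 1 is the exponential case, where the coefficient of variation equals 1.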
 
===Units===
Failure rates can be expressed using any measure of time, but '''hours''' is the most common unit in practice. Other units, such as miles, revolutions, etc., can also be used in place of "time" units.
 
Failure rates are often expressed in [[engineering notation]] as failures per million hours, or 10<sup>−6</sup> per hour, especially for individual components, since their failure rates are often very low.
 
The '''Failures In Time''' ('''FIT''') rate of a device is the number of failures that can be expected in one [[1000000000 (number)|billion]] (10<sup>9</sup>) device-hours of operation.<ref>
Xin Li; Michael C. Huang; Kai Shen; Lingkun Chu.
[http://www.cs.rochester.edu/~kshen/papers/usenix2010-li.pdf "A Realistic Evaluation of Memory Hardware Errors and Software System Susceptibility"].
2010.
p. 6.
</ref>
(E.g. 1000 devices for 1 million hours, or 1 million devices for 1000 hours each, or some other combination.) This term is used particularly by the [[semiconductor]] industry.
 
The relationship of FIT to MTBF may be expressed as: MTBF (in hours) = 1,000,000,000 × 1/FIT.
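
A minimal sketch of these unit conversions, assuming a constant failure rate; the function names and the 100 FIT figure are illustrative only.

```python
DEVICE_HOURS_PER_FIT_UNIT = 1e9  # FIT counts failures per billion device-hours


def fit_to_mtbf_hours(fit):
    """Convert a FIT rate to an MTBF in hours."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return DEVICE_HOURS_PER_FIT_UNIT / fit


def failure_rate_to_fit(failures_per_hour):
    """Convert a failure rate in failures per hour to FIT."""
    return failures_per_hour * DEVICE_HOURS_PER_FIT_UNIT


print(fit_to_mtbf_hours(100.0))     # a hypothetical 100 FIT part
print(failure_rate_to_fit(1e-6))    # one failure per million hours, expressed in FIT
```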
 
===Additivity===
Under certain [[engineering]] assumptions (e.g. besides the above assumptions for a constant failure rate, the assumption that the considered system has no relevant [[Redundancy (engineering)|redundancies]]), the failure rate for a complex [[system]] is simply the sum of the individual failure rates of its components, as long as the units are consistent, e.g. failures per million hours. This permits testing of individual components or [[subsystem]]s, whose failure rates are then added to obtain the total system failure rate.<ref>
[http://www.weibull.com/hotwire/issue108/relbasics108.htm "Reliability Basics"].
2010.
</ref><ref>
Vita Faraci.
[http://src.alionscience.com/pdf/1Q2006.pdf "Calculating Failure Rates of Series/Parallel Networks"].
2006.
</ref>
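
The additivity rule can be sketched as follows. The component names and rates are hypothetical, and the calculation assumes constant failure rates, no redundancy, and a common unit (failures per million hours).

```python
import math


def series_failure_rate(component_rates):
    """Sum constant component failure rates that share the same unit."""
    rates = list(component_rates)
    if any(rate < 0 for rate in rates):
        raise ValueError("failure rates must be non-negative")
    return math.fsum(rates)


# Hypothetical component rates in failures per million hours.
HYPOTHETICAL_RATES = {"power supply": 12.0, "controller": 3.5, "connector": 0.8}

system_rate = series_failure_rate(HYPOTHETICAL_RATES.values())
system_mtbf_hours = 1e6 / system_rate  # rates are per million hours

print(system_rate)
print(system_mtbf_hours)
```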
 
Adding "redundant" components to eliminate a [[single point of failure]] improves the mission failure rate but makes the series failure rate (also called the logistics failure rate) worse: the extra components improve the mean time between critical failures (MTBCF), even though the mean time before something fails is worse.<ref>
[https://www.quanterion.com/mission-reliability-and-logistics-reliability-a-design-paradox/ "Mission Reliability and Logistics Reliability: A Design Paradox"].
</ref>
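
The trade-off can be illustrated with the simplest case. The sketch assumes two identical, independent units in active parallel redundancy, each with the same constant failure rate and no repair; under those assumptions the mean time until the first unit fails is <math>1/(2\lambda)</math> and the mean time until both have failed is <math>1/(2\lambda)+1/\lambda</math>. The function names and the rate used are hypothetical.

```python
def single_unit_mtbf(unit_rate):
    """Mean time to failure of one unit with a constant failure rate."""
    return 1.0 / unit_rate


def redundant_pair_metrics(unit_rate):
    """Logistics and mission figures for two identical units in active parallel."""
    logistics_rate = 2.0 * unit_rate                   # either unit failing needs service
    mean_time_to_first_failure = 1.0 / logistics_rate
    # Mission failure needs both units down: first failure, then the survivor.
    mtbcf = mean_time_to_first_failure + 1.0 / unit_rate
    return {
        "logistics_rate": logistics_rate,
        "mean_time_to_first_failure": mean_time_to_first_failure,
        "mtbcf": mtbcf,
    }


UNIT_RATE = 0.001  # hypothetical failures per hour

print(single_unit_mtbf(UNIT_RATE))
print(redundant_pair_metrics(UNIT_RATE))
```

With these assumptions the pair needs attention sooner than a single unit would, while the time to a mission-critical failure is longer.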
 
===Example===
Suppose it is desired to estimate the failure rate of a certain component. Ten identical components are each tested until they either fail or reach 1000 hours, at which time the test is terminated for that component. (The level of statistical [[confidence interval|confidence]] is not considered in this example.) In total, 6 components fail and the ten components accumulate 7502 hours on test.

The estimated failure rate is
 
: <math>\frac{6\text{ failures}}{7502\text{ hours}} = 0.0007998\, \frac{\text{failures}}{\text{hour}} = 799.8 \times 10^{-6}\, \frac{\text{failures}}{\text{hour}}, </math>
 
or 799.8 failures for every million hours of operation.
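
The same estimate can be computed from per-component records. In the sketch below the individual hours are hypothetical, chosen only so that they add up to the 6 failures and 7502 hours used above; the function name is illustrative.

```python
def estimate_failure_rate(observations):
    """Failures divided by total time on test.

    observations: list of (hours_on_test, failed) pairs, where failed is False
    for a component that reached the test limit without failing.
    """
    total_hours = sum(hours for hours, _ in observations)
    failures = sum(1 for _, failed in observations if failed)
    if total_hours <= 0:
        raise ValueError("total time on test must be positive")
    return failures / total_hours


# Hypothetical records consistent with the totals in the example.
HYPOTHETICAL_RESULTS = [
    (1000, False), (1000, False), (467, True), (1000, False), (630, True),
    (590, True), (1000, False), (285, True), (648, True), (882, True),
]

rate_per_hour = estimate_failure_rate(HYPOTHETICAL_RESULTS)
print(rate_per_hour)          # failures per hour
print(rate_per_hour * 1e6)    # failures per million hours
```

Components that survive to the 1000-hour limit still contribute their hours to the denominator, which is why the total time on test is used rather than only the failure times.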
 
==See also==
{{Portal|Mathematics}}
{{Div col|colwidth=22em}}
*[[Annualized failure rate]]
*[[Burn-in]]
*[[Failure]]
*[[Failure mode]]
*[[Failure modes, effects, and diagnostic analysis]]
*[[Force of mortality]]
*[[Frequency of exceedance]]
*[[Reliability engineering]]
*[[Reliability theory]]
*[[Reliability theory of aging and longevity]]
*[[Survival analysis]]
*[[Weibull distribution]]
{{div col end}}
 
==References==
{{Reflist|30em}}
 
==Further reading==
 
*{{Citation
| last = Goble
| first = William M.
| year = 2018
| title = Safety Instrumented System Design: Techniques and Design Verification
| publisher = International Society of Automation
| location = Research Triangle Park, NC 27709
}}
*{{cite book |last=Blanchard |first=Benjamin S. |year=1992 |title=Logistics Engineering and Management |edition=Fourth |pages=26–32 |publisher=Prentice-Hall |location=Englewood Cliffs, New Jersey |isbn=0135241170 }}
*{{cite book |last=Ebeling |first=Charles E. |year=1997 |title=An Introduction to Reliability and Maintainability Engineering |pages=23–32 |publisher=McGraw-Hill |location=Boston |isbn=0070188521 }}
*[[Federal Standard 1037C]]
*{{cite book |last=Kapur |first=K. C. |last2=Lamberson |first2=L. R. |year=1977 |title=Reliability in Engineering Design |pages=8–30 |publisher=John Wiley & Sons |location=New York |isbn=0471511919 }}
*{{cite journal |last=Knowles |first=D. I. |year=1995 |title=Should We Move Away From 'Acceptable Failure Rate'? |journal=Communications in Reliability Maintainability and Supportability |volume=2 |issue=1 |page=23 |publisher=International RMS Committee, USA }}
*{{cite book |last=MacDiarmid |first=Preston |last2=Morris |first2=Seymour |date=n.d. |title=Reliability Toolkit |edition=Commercial Practices |pages=35–39 |publisher=Reliability Analysis Center and Rome Laboratory |location=Rome, New York |display-authors=etal}}
*{{cite book |last=Modarres |first=M.|author-link=Mohammad Modarres |last2=Kaminskiy |first2=M. |last3=Krivtsov |first3=V. |year=2010 |title=[[Reliability Engineering and Risk Analysis: A Practical Guide]] |edition=2nd |publisher=CRC Press |isbn=9780849392474 }}
*{{cite journal |url=http://www.mitre.org/work/best_papers/02/mondro_approx/mondro_approx.pdf |last=Mondro |first=Mitchell J. |date=June 2002 |title=Approximation of Mean Time Between Failure When a System has Periodic Maintenance |journal=IEEE Transactions on Reliability |volume=51 |issue=2 |pages= 166–167|doi= 10.1109/TR.2002.1011521}}
*{{cite book |last=Rausand |first=M. |last2=Hoyland |first2=A. |year=2004 |title=System Reliability Theory; Models, Statistical methods, and Applications |publisher=John Wiley & Sons |location=New York |isbn=047147133X }}
*{{cite book |last=Turner |first=T. |last2=Hockley |first2=C. |last3=Burdaky |first3=R. |year=1997 |title=The Customer Needs A Maintenance-Free Operating Period |work=1997 Avionics Conference and Exhibition, No. 97-0819, P. 2.2 |publisher=[[ERA Technology Ltd]]. |location=Leatherhead, Surrey, UK }}
*U.S. Department of Defense (1991). ''Military Handbook: Reliability Prediction of Electronic Equipment'', MIL-HDBK-217F, 2
 
==External links==
*[http://www.asq.org/reliability/articles/bathtub.html Bathtub curve issues], ASQC
*[http://lamspeople.epfl.ch/kirrmann/Pubs/FaultTolerance/Fault_Tolerance_Tutorial_HK.pdf ''Fault Tolerant Computing in Industrial Automation''] by Hubert Kirrmann, ABB Research Center, Switzerland
 
{{Statistics|analysis}}
 
{{DEFAULTSORT:Failure Rate}}
[[Category:Actuarial science]]
[[Category:Engineering failures]]
[[Category:Reliability engineering]]
[[Category:Survival analysis]]
[[Category:Maintenance]]
[[Category:Statistical ratios]]
[[Category:Error measures]]
[[Category:Rates]]