Inspection optimization model with imperfect maintenance based on a three-stage failure process

Ruifeng Yang, Fei Zhao, Jianshe Kang, Haiping Li, Hongzhi Teng


Introduction
Rolling element bearings are widely used in industrial rotating machinery such as wind turbines and helicopters, and they are regarded as critical components. Unexpected bearing faults are among the most frequent causes of machine breakdown and may result in significant economic losses. Appropriate and effective maintenance is therefore required to achieve higher availability and lower operational cost. Numerous maintenance policies have been applied to bearings to prevent failures [10]. Preventive maintenance (PM) is perhaps one of the most popular: maintenance activities are executed at planned intervals with the aim of preventing potential failures [3,13]. In most practical situations PM remains the dominant maintenance policy because it is easy to implement.
Inspection, as an important PM activity, can reveal the status of the system being inspected and thus helps maintenance engineers make decisions that avoid failures [20]. Inspections may be performed discretely, at periodic or aperiodic intervals using an inspection instrument, or continuously by modern condition monitoring devices. In industrial applications, inspectors mostly inspect bearings with an instrument such as the SPM (Shock Pulse Method) instrument. SPM, developed in Sweden in the early 1970s by SPM Instrument AB, is a patented technique that has achieved wide acceptance as a quantitative method for efficiently inspecting the condition of bearings [6,12]. By sampling the shock pulse amplitude of a bearing over a period of time and displaying the maximum value dBm and the carpet value dBc, SPM provides a direct normalized shock value that indicates the bearing operating state and lubrication condition [7]. Maintenance engineers can then make decisions according to the final shock value.
However, how often engineers should inspect the bearings, that is, the determination of the inspection interval, is still a key issue. In most industries the inspection interval is traditionally determined by engineers' experience or by the manufacturer's recommendation [5]. Such a determination is easy to implement but conservative and undesirable. Many researchers have developed PM models to find the optimal inspection interval under various modeling assumptions [1,8,9]. In contrast with other PM models [3,13], models based on the delay-time concept have been shown to have clear advantages for optimizing the inspection interval, since the delay-time technique directly models the relationship between the inspection intervention and the system performance, see [15,17,22].
The delay time concept was first introduced by Christer in 1976 [2]. It describes the failure process as a two-stage process consisting of a normal stage and a delay time stage. The normal stage runs from new to the initial point at which a defect can first be identified by an inspection, and the delay time stage runs from this initial point to failure [19]. By this definition, the plant can be in one of two states before failure, namely normal and defective. A defect can be identified if an inspection is carried out during the delay-time stage. Many inspection and PM models based on the two-stage failure process, including successful case studies, have been reported with actual applications in industry [18,19].
However, in industrial applications some systems are better described by more than two states before failure. For example, SPM reports the bearing condition before failure on a condition scale (the Green-Yellow-Red scale) that depends on the shock value. Green, with the shock value range 0-20, denotes a normal bearing; Yellow (20-35) denotes a minor defective bearing; and Red (35-60) denotes a severe defective bearing. The bearing before failure may therefore be in one of three states, namely normal, minor defective, and severe defective. Considering this industrial scenario, Wang [16] first extended the two-stage failure process into a three-stage failure process, which is closer to the practical situation and provides more decision options for the different states. In [16], when the minor defective stage is identified by inspection, the inspection interval is halved so that the system is inspected more frequently, whereas the system is replaced immediately if it is found in the severe defective stage. Wang et al. [23] further extended [16] by considering a two-level inspection policy with PM, in which maintenance is delayed once the severe defective stage is identified and the time to the next PM is less than a threshold level. However, perfect maintenance of the defective system is assumed in [16,23].
After the condition of a bearing is measured with the SPM instrument, maintenance engineers take different decisions depending on the shock value. When the shock value is less than 20, that is, the bearing is in good condition, nothing is done until the next inspection. If the bearing is found to be in the minor defective stage, i.e., the shock value is within the interval (20, 35), lubrication is needed. Once the shock value exceeds 35, replacement is carried out immediately, since replacement is generally the direct measure for bearings and can be viewed as renewing the bearing. Lubricating a bearing found in the minor defective (Yellow) stage is a common industrial practice for prolonging the life of the system. The lubrication problem in PM models based on the delay time concept has been presented by Wang [21], who jointly considered routine service (RS), inspection, condition monitoring and preventive replacement. In industry, however, lubrication is only carried out when the bearing is identified to be in the minor defective stage, rather than as a part of inspection, repair or replacement. Lubrication aims at preventing defects from arising and affects the instantaneous rates of defects and failure, so it can be regarded as imperfect maintenance. Wang et al. [14] have considered imperfect maintenance based on the two-stage failure process. An inspection model that considers imperfect maintenance based on the three-stage failure process is closer to reality but has not yet been developed. To model the imperfect maintenance carried out when an inspection identifies the minor defective stage, we borrow the concept of proportional age reduction (PAR), which has been widely used in imperfect maintenance optimization modeling [4].
The contributions of this work are as follows: (1) the system deterioration is subject to the three-stage failure process; (2) imperfect maintenance is considered after the system is found to be in the minor defective stage; (3) an inspection model with imperfect maintenance based on the three-stage failure process is presented.
The remainder of the paper is organized as follows. Section 2 presents the problem description and modeling assumptions. Section 3 formulates all possible renewal probabilistic models based on the three-stage failure process. Section 4 develops the cost model used to find the optimal inspection interval by minimizing the expected cost per unit time. Section 5 gives a numerical example and some discussion. Finally, Section 6 concludes the paper and suggests future research.

Problem description
The bearing is regarded as a single-component system subject to a single failure mode, and in the following we refer to it simply as the system. The system is inspected periodically with the interval T to obtain the shock value and measure the operating condition. When the shock value of the system falls into Green, namely less than 20, nothing is done. If the shock value indicates that the system is in the minor defective stage, lubrication is carried out to prolong the system life; this is regarded as imperfect maintenance. The system is replaced immediately once an inspection reveals it to be in the severe defective stage, with the shock value in the interval (35, 60). A failure is detected as soon as it occurs, and the system is then replaced immediately with an identical one. Replacement restores the system to a new condition. Imperfect maintenance carried out when an inspection identifies the minor defective stage, by contrast, affects the instantaneous rates of the minor defective stage, the severe defective stage and failure in the next maintenance stage [14]. Fig. 1 illustrates the three-stage failure process based on SPM.
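The inspection policy described above can be sketched as a small decision rule. This is a minimal illustration, not part of the original model: the function and enum names are hypothetical, and the handling of the exact boundary values 20 and 35 (treated here as Yellow) is an assumption, since the text only states "less than 20", "(20, 35)" and "exceeds 35".

```python
from enum import Enum


class State(Enum):
    NORMAL = "green"
    MINOR_DEFECTIVE = "yellow"
    SEVERE_DEFECTIVE = "red"


# Thresholds taken from the SPM condition scale quoted in the text.
GREEN_LIMIT = 20.0
YELLOW_LIMIT = 35.0
SCALE_MAX = 60.0

# Maintenance action attached to each state, as described in the policy.
ACTIONS = {
    State.NORMAL: "do nothing",
    State.MINOR_DEFECTIVE: "lubricate",
    State.SEVERE_DEFECTIVE: "replace",
}


def classify(shock_value: float) -> State:
    """Map a normalized shock value to one of the three pre-failure states."""
    if not 0.0 <= shock_value <= SCALE_MAX:
        raise ValueError("shock value outside the 0-60 scale")
    if shock_value < GREEN_LIMIT:
        return State.NORMAL
    # ASSUMPTION: the boundary values 20 and 35 are counted as Yellow.
    if shock_value <= YELLOW_LIMIT:
        return State.MINOR_DEFECTIVE
    return State.SEVERE_DEFECTIVE


def recommended_action(shock_value: float) -> str:
    """Return the action the policy prescribes for a measured shock value."""
    return ACTIONS[classify(shock_value)]
```

For instance, `recommended_action(27.5)` falls in the Yellow band and therefore maps to lubrication, the imperfect maintenance action of the model.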

Modeling assumptions
The following assumptions are made for modeling purposes. Most of them have been explained in the previous section or in the problem description [14,18,19].
(1) The failure process of the system is divided into three stages, namely the normal, minor defective and severe defective stages. These three stages are assumed to be independent.
(2) The system is inspected periodically with an interval T. There is no downtime caused by inspection since the system is checked online.
(3) Inspection is perfect, such that the system condition can be identified with a probability of 100%.
(4) When the inspection detects the system being in the normal stage, nothing is done.
(5) If the system is found to be in the minor defective stage, lubrication is implemented. It is regarded as imperfect maintenance, which affects the instantaneous rates of the minor defective stage, the severe defective stage and failure in the next maintenance stage.
(6) Once the system is found to be in the severe defective stage, replacement is always carried out immediately.
(7) The failure of the system can be observed immediately, and replacement is carried out at once.
(8) The system after replacement is viewed as new.
The following notation will be used in the subsequent modelling:

Probabilistic models of a failure renewal and an inspection renewal considering imperfect maintenance
In order to establish the cost model using the renewal theorem, two renewal scenarios at the end of a renewal cycle must be considered, namely a failure renewal and an inspection renewal. The probability models for both renewals are formulated as in [16,19]. Since imperfect maintenance is assumed to be applied when the system is found to be in the minor defective stage, we first introduce the PAR model.

The PAR model
The PAR model assumes that the ith imperfect maintenance at t_i shortens the length of the last operation time from t_i − t_{i−1} to ρ(t_i − t_{i−1}) [4]. The effective age after the ith imperfect maintenance is t − ρt_i (t > t_i), which indicates that it has no relationship with the maintenance history prior to t_i. The difference between t − ρt_i and t − t_i, Δ_i = (1 − ρ)t_i, is defined as the accumulative age, which affects the instantaneous rate of each stage for the (i+1)th maintenance stage [14].
If the ith imperfect maintenance for the minor defective stage is triggered at t_i, then the accumulative age is: Δ_i = (1 − ρ)t_i. The instantaneous rate of the minor defective stage after the ith imperfect maintenance can be defined as: where x = t − t_i.
The instantaneous rate of the severe defective stage after the ith imperfect maintenance can be defined as: where y = t − t_i − x.
Furthermore, the failure rate after the ith imperfect maintenance can be defined as: where z = t − t_i − x − y.
Here, ρ = 1 means that the maintenance at the minor defective stage is perfect and ρ = 0 means that the maintenance is minimal. The maintenance is imperfect if 0 < ρ < 1.
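The age bookkeeping of the PAR model can be illustrated with a short sketch. The first two functions follow the definitions in the text directly. The last two are assumptions made only for illustration: a Weibull hazard in the shape/scale parameterization, and a rate after maintenance obtained by evaluating that hazard at the elapsed time plus the accumulative age. All function names are hypothetical.

```python
def accumulative_age(t_i: float, rho: float) -> float:
    """Accumulative age (1 - rho) * t_i carried over after maintenance at t_i."""
    return (1.0 - rho) * t_i


def effective_age(t: float, t_i: float, rho: float) -> float:
    """Effective age t - rho * t_i at time t >= t_i after maintenance at t_i."""
    if t < t_i:
        raise ValueError("t must not precede the maintenance time t_i")
    return t - rho * t_i


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """ASSUMPTION: Weibull hazard (shape/scale) * (t/scale)**(shape-1), shape >= 1."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def shifted_hazard(x: float, t_i: float, rho: float,
                   shape: float, scale: float) -> float:
    """ASSUMPTION: rate after maintenance = baseline hazard at x + accumulative age."""
    return weibull_hazard(x + accumulative_age(t_i, rho), shape, scale)
```

With ρ = 1 the accumulative age is zero and the shifted rate coincides with the baseline, matching the perfect-maintenance case; with ρ = 0 the full age t_i is carried over, matching minimal maintenance.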
Because the probability density function (pdf) of the system can
In a similar way, the pdf of the severe defective stage after the ith imperfect maintenance is given as:

Probabilities of a failure renewal and an inspection renewal
Since imperfect maintenance is performed when the minor defective stage is identified by inspection, and the instantaneous rates of both defective stages and of failure change afterwards, the probabilities of a failure renewal and an inspection renewal must be modeled on the basis of the PAR model.
(1) Probability of imperfect maintenance due to the identification of the minor defective stage
The minor defective stage is identified at iT, and the last imperfect maintenance was at t j, when the system restarted the three-stage failure process, as shown in Fig. 2. It follows from assumption (3), that inspection is perfect, that the duration of the normal stage lies between (i-j-1)T and (i-j)T. The duration of the minor defective stage is more than (i-j)T-x, where x is the duration of the normal stage. Accordingly, the probability of identifying the minor defective stage at iT since the last imperfect maintenance at jT is given as:
Summing over all possibilities in Eq. (9) for the last imperfect maintenance, namely j = 0,…,i-1, the probability of the minor defective stage being identified at iT, P_m(iT), is given as: where P_m(0)=1, j=0 means there is no imperfect maintenance before renewal, and i=1,….
(2) Probability of a failure renewal
If a failure occurs at f T , since the last imperfect maintenance at jT, the system is renewed immediately according to assumption (7). Since inspection is assumed to identify the state of the system perfectly, the minor defective stage must have ended within ((i-1)T, iT) before the failure, as shown in Fig. 3. Similar to the derivation of Eq. (9), the probability of a failure renewal since the last imperfect maintenance at jT is given by:
Then the probability of a failure renewal in ((i-1)T, iT), P_f((i-1)T, iT), is given as:
The pdf of a failure in ((i-1)T+z, (i-1)T+z+dz) can be derived from Eq. (12) as:

Fig. 3. The system fails in ((i-1)T, iT) since the last imperfect maintenance at jT
(3) Probability of an inspection renewal
If inspection detects the system being in the severe defective stage at iT since the last imperfect maintenance at jT, then from assumption (3), the minor defective stage must end within the interval ((i-1)T, iT), as shown in Fig. 4. The probability of an inspection renewal due to identifying the severe defective stage at iT is then obtained under the condition that the last imperfect maintenance was carried out at jT.
Since the time of the last imperfect maintenance jT may range from 0 to (i-1)T, using Eq. (14), the probability of an inspection renewal at iT, P_s(iT), is formulated as:
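As a hedged numerical illustration of item (1) in this section, the probability of identifying the minor defective stage can be approximated by integrating over the duration x of the normal stage, as the verbal description states. The sketch below is not the paper's equations. It assumes Weibull stage durations in the shape/scale parameterization, an age shift Δ = (1 − ρ)jT applied to both stages after the last imperfect maintenance at jT, and a recursion over j with P_m(0) = 1. All names are hypothetical.

```python
import math


def survival(x, delta, shape, scale):
    """P(stage lasts longer than x), for a Weibull stage already aged by delta."""
    return math.exp(-(((x + delta) / scale) ** shape - (delta / scale) ** shape))


def pdf(x, delta, shape, scale):
    """Density of the stage duration under the same age shift (shape >= 1)."""
    hazard = (shape / scale) * ((x + delta) / scale) ** (shape - 1.0)
    return hazard * survival(x, delta, shape, scale)


def prob_minor_found(i, j, T, rho, normal, minor, steps=2000):
    """P(minor defective stage found at iT | process restarted at jT).

    Midpoint-rule integral over the normal-stage duration x in
    ((i-j-1)T, (i-j)T); the minor stage must outlast (i-j)T - x.
    `normal` and `minor` are (shape, scale) tuples.
    """
    delta = (1.0 - rho) * j * T  # ASSUMED accumulative age after maintenance at jT
    lower = (i - j - 1) * T
    width = T / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        total += pdf(x, delta, *normal) * survival((i - j) * T - x, delta, *minor)
    return total * width


def prob_minor_at(n, T, rho, normal, minor):
    """Return [P_m(0), P_m(T), ..., P_m(nT)] by summing over j, with P_m(0) = 1."""
    p = [1.0]
    for i in range(1, n + 1):
        p.append(sum(p[j] * prob_minor_found(i, j, T, rho, normal, minor)
                     for j in range(i)))
    return p
```

When both stages are exponential (shape 1) the age shift has no effect, and the integral for i = 1, j = 0 has the closed form a(e^{-bT} − e^{-aT})/(a − b) with rates a and b, which gives a convenient check of the numerical integration.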

The Cost model
In order to calculate the expected cost per unit time using the renewal theorem, the expected renewal cycle cost and length need to be formulated from the renewal probabilities given in Eqs. (12) and (15) and from the cost parameters for inspection, imperfect maintenance and replacement.

The expected renewal cycle cost EC(T)
(1) If the system is renewed due to a failure in ((i-1)T, iT), the cost of the failure renewal cycle includes the cost of inspection, the cost of failure replacement and the cost of any earlier imperfect maintenance. The cost caused by imperfect maintenance can be formulated by summing over all the possibilities before iT. Therefore, the cost of such an event is:
Since a failure could fall into any inspection interval, the expected renewal cycle cost caused by a failure renewal, EC_f(T), is obtained from Eqs. (12) and (16) by summing over all possible values of i:
(2) If the severe defective stage is detected at an inspection iT, then in the same way as for Eq. (16), the corresponding cost is given by:

The expected renewal cycle length EL(T)
Based on the pdf of a failure in ((i-1)T+z, (i-1)T+z+dz) shown in Eq. (13), the expected renewal cycle length caused by a failure renewal, EL_f(T), is formulated as:
The expected renewal cycle length caused by an inspection renewal, EL_s(T), is given as:

The expected cost per unit time
Based on the expected cycle cost and length for a failure renewal and an inspection renewal, the expected cost per unit time, taking the inspection interval T as the decision variable, can be calculated using the renewal reward theorem [11] as:
EC(T)/EL(T) = (EC_f(T) + EC_s(T)) / (EL_f(T) + EL_s(T))    (22)

Numerical example
In this section a numerical example is presented to show how the proposed model is applied to minimize the expected cost per unit time and find the optimal inspection interval. Since the Weibull distribution is one of the most commonly used distributions in reliability [16], this paper assumes that the three stages of the failure process follow Weibull distributions with a non-decreasing failure rate. The occurrence rate of the minor defective stage, the severe defective stage and failure with the Weibull distribution is given by:
Since complete experimental data are not available at present, the distribution parameters for these stages are assumed as in Table 1. The cost parameters used in the cost model are shown in Table 2.
Fig. 5 shows the expected renewal cycle cost in terms of different values of T using Eqs. (17) and (19). It can be seen that the renewal cycle cost first decreases and then increases as T increases, and finally becomes relatively stable. When the inspection interval is small, the expected renewal cycle cost is high because of the frequent inspections. Once the inspection interval exceeds the expected failure time, failure occurs more easily and changing the inspection interval no longer affects the failure process, so the expected renewal cycle cost tends to a constant. Moreover, in order to study the effect of imperfect maintenance on the expected renewal cycle cost, three different values of ρ are selected. One is ρ=1, which means that the maintenance carried out when the minor defective stage is identified is perfect; the other two values, ρ=0.9 and ρ=0.8, represent imperfect maintenance. From Fig. 5, the expected renewal cycle cost rises as ρ increases, which shows that imperfect maintenance decreases the expected renewal cycle cost: the smaller the value of ρ, the smaller the expected renewal cycle cost.
Fig. 6 shows the expected renewal cycle length in terms of different values of T using Eqs. (20) and (21). The expected renewal cycle length decreases monotonically and finally becomes relatively stable. The explanation is that once the inspection interval exceeds the expected failure time, the failure tends to occur before the first inspection, so the expected renewal cycle length becomes approximately constant. Moreover, the expected renewal cycle length also decreases as ρ is reduced: the smaller the value of ρ, the smaller the cycle length.
Fig. 7 shows the expected cost per unit time in terms of the inspection interval under different values of ρ using Eq. (22). As the inspection interval increases, the expected cost per unit time first falls and then rises. This is as expected: a smaller inspection interval leads to frequent inspections and hence a higher inspection cost, while if the inspection interval exceeds the expected failure time, a failure renewal is required and the expected cost per unit time tends to a constant. For the three values of ρ in Fig. 7, the trend of the expected cost per unit time is the same. The results confirm that the smaller the value of ρ, the larger the expected cost per unit time. The reason is that decreasing ρ reduces the expected cycle cost and the expected cycle length simultaneously, but the expected cycle length falls faster. For the given distribution and cost parameters, Fig. 7 shows that the optimal inspection interval T*=3 is the same for all three improvement factors. The optimal expected cost per unit time is 12.9009, 14.9409 and 17.9949 when the improvement factor is 1, 0.9 and 0.8, respectively. Therefore, the optimal inspection interval T*=3 with the minimal expected cost per unit time can be adopted to implement inspection and maintenance activities.
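The trade-off discussed in this example can also be explored by simulation. The following Monte Carlo sketch is not the analytical model of the paper: it simulates the inspection policy directly and estimates the cost per unit time as total cost divided by total time over many renewal cycles. The Weibull shape/scale parameterization, the age shift Δ = (1 − ρ)kT applied to all three stages after lubrication at kT, and every parameter name and value are assumptions for illustration only; they are not the values of Tables 1 and 2.

```python
import math
import random


def sample_stage(rng, delta, shape, scale):
    """Draw a stage duration from a Weibull law already aged by delta."""
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
    return scale * ((delta / scale) ** shape - math.log(u)) ** (1.0 / shape) - delta


def simulate_cycle(rng, T, rho, stages, costs):
    """Simulate one renewal cycle; return (cycle cost, cycle length)."""
    c_insp, c_lub, c_prev, c_fail = costs  # hypothetical cost parameters
    start, delta, cost, k = 0.0, 0.0, 0.0, 1
    while True:
        # Durations of the normal, minor defective and severe defective stages.
        x, y, z = (sample_stage(rng, delta, *s) for s in stages)
        failure_time = start + x + y + z
        while True:
            t = k * T  # time of the next inspection
            if t >= failure_time:
                return cost + c_fail, failure_time  # failure renewal
            cost += c_insp
            k += 1
            if t >= start + x + y:
                return cost + c_prev, t  # severe stage found: inspection renewal
            if t >= start + x:
                # Minor stage found: lubricate and restart the three stages
                # with the ASSUMED accumulative age (1 - rho) * t.
                cost += c_lub
                start, delta = t, (1.0 - rho) * t
                break


def simulate_cost_rate(T, rho, stages, costs, cycles=20000, seed=0):
    """Estimate the long-run cost per unit time for inspection interval T."""
    rng = random.Random(seed)
    total_cost = total_time = 0.0
    for _ in range(cycles):
        cost, length = simulate_cycle(rng, T, rho, stages, costs)
        total_cost += cost
        total_time += length
    return total_cost / total_time
```

A candidate interval could then be picked by a simple grid search, for example `min(range(1, 11), key=lambda T: simulate_cost_rate(T, 0.9, stages, costs))` with `stages` a list of three (shape, scale) tuples and `costs` a tuple of inspection, lubrication, preventive replacement and failure replacement costs. When T is far larger than the lifetime, no inspection takes place and the estimate approaches the failure cost divided by the mean lifetime, which is a useful sanity check.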

Conclusions
In this paper, an inspection optimization model based on a three-stage failure process is proposed to optimize the inspection interval of bearings by minimizing the expected cost per unit time. The three states before failure in the three-stage failure process correspond to the three-color scheme used for bearings by the SPM technique.
The maintenance at the minor defective stage is regarded as imperfect maintenance, which affects the instantaneous rates of the minor defective stage, the severe defective stage and failure. The maintenance at the severe defective stage and at failure can be viewed as perfect maintenance. The proportional age reduction model is used to model the effect of the imperfect maintenance carried out when the minor defective stage is identified. The results of the numerical example show that the optimal inspection interval can be found using the proposed model. Moreover, imperfect maintenance decreases the expected renewal cycle cost and length but increases the expected cost per unit time.
Further research along this line could include: (1) considering a finite time horizon, (2) considering the availability of spare parts, and (3) carrying out case studies to validate the model. These issues will be addressed in future work.