Page 46 - MSDN Magazine, May 2019
address and relax some of these assumptions. You can read more about such models and techniques in the book, “The Statistical Analysis of Failure Time Data” by Kalbfleisch and Prentice (Wiley-Interscience, 2002), at bit.ly/2TACdLR.
I’ll make the assumption that each maintenance operation performed on a machine component completely resets that component and can therefore be treated independently. It’s then possible to use survival regression on two types of intervals (depicted in Figure 1):
• The interval between a failure and the preceding maintenance operation (time to event).
• The interval between subsequent maintenance operations (censoring).
Each interval in Figure 1 starts with a maintenance operation. The first type of interval ends with X, denoting a failure. The second type ends with O, denoting another maintenance operation performed before any failure (essentially a proactive maintenance operation), which in this case means a censored observation.
Therefore, the original data needs to be transformed into this format with the two required fields. The “time_to_event” field represents the time in hours until either failure or the next maintenance occurs. The “event” field is set to one for a failure and to zero for a maintenance operation before failure.
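The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the event log layout (machine ID, timestamp, and a "maintenance" or "failure" label) and the function name to_survival_intervals are assumptions made for the example. Intervals that are still open at the end of the log are simply dropped here, which is a simplification.

```python
from datetime import datetime

# Hypothetical event log: (machine_id, timestamp, kind), where kind is
# "maintenance" or "failure". This layout is assumed for illustration only.
event_log = [
    (1, datetime(2019, 1, 1, 0), "maintenance"),
    (1, datetime(2019, 1, 3, 12), "failure"),
    (1, datetime(2019, 1, 5, 0), "maintenance"),
    (2, datetime(2019, 1, 1, 0), "maintenance"),
    (2, datetime(2019, 1, 2, 6), "maintenance"),
]

def to_survival_intervals(log):
    """Turn an event log into rows with time_to_event (hours) and event (1/0)."""
    rows = []
    # machine_id -> timestamp of the maintenance that opened the current interval
    interval_start = {}
    for machine_id, ts, kind in sorted(log, key=lambda r: (r[0], r[1])):
        start = interval_start.pop(machine_id, None)
        if start is not None:
            hours = (ts - start).total_seconds() / 3600.0
            rows.append({
                "machine_id": machine_id,
                "time_to_event": hours,
                # 1 = failure (event observed), 0 = maintenance first (censored)
                "event": 1 if kind == "failure" else 0,
            })
        # Only a maintenance operation opens a new interval.
        if kind == "maintenance":
            interval_start[machine_id] = ts
    return rows

intervals = to_survival_intervals(event_log)
for row in intervals:
    print(row)
```

In this sample log, machine 1 contributes one failure interval and machine 2 contributes one censored interval.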
It’s frequently desirable to perform additional transformations on the covariates, which is often called “feature engineering.” The purpose of this process is to generate covariates with better predictive power. For example, you can create another covariate that will calculate the mean of the pressure in the 10 hours prior to failure. There are many different options for functions and possible time windows to create such covariates, and there are a few tools you can use to help automate this process, such as the open source Python package tsfresh (tsfresh.readthedocs.io/en/latest).
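As a rough illustration of the pressure example, the following sketch computes a trailing 10-hour mean with plain Python rather than tsfresh. The hourly readings and the helper name trailing_mean are hypothetical, and the sketch assumes one reading per hour, so a window of 10 readings stands in for 10 hours.

```python
def trailing_mean(values, window):
    """Mean of the most recent `window` readings at each step.

    Early steps use however many readings are available so far.
    """
    means = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        means.append(sum(recent) / len(recent))
    return means

# Hypothetical hourly pressure readings for a single interval.
pressure = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0,
            107.0, 111.0, 115.0, 120.0, 125.0, 130.0]

# New covariate: mean pressure over the trailing 10 hours.
pressure_mean_10h = trailing_mean(pressure, window=10)

# The value at the end of the interval is the one that would be attached
# to the interval's row alongside time_to_event and event.
print(pressure_mean_10h[-1])
```

Other aggregation functions (maximum, standard deviation, slope) and other window lengths can be produced the same way, which is the space of options that tools like tsfresh aim to automate.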
Now I’m going to discuss the two survival regression models: the Cox proportional hazards model (or Cox PH model) available in h2o.ai and the Weibull Accelerated Failure Time model available in Spark MLlib.
Cox Proportional Hazards Regression
Recall that a hazard function determines the event rate at time t for objects or individuals that are alive at time t. For the predictive maintenance example, it can be described as the probability of failing in the next hour, for a given time t and for all the machines where component 1 failure hasn’t occurred since their last maintenance. Higher hazard rates imply higher risk of experiencing failure. The Cox PH regression estimates the effects of covariates on the hazard rate as specified by the following model:
$$h(t) = h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_p X_p}$$
Here, h(t) is the hazard function at time t, h0(t) is the baseline hazard at time t, the Xi variables are the different covariates and the betas are the coefficients of the corresponding covariates (more on that a bit later). The baseline hazard is the hazard when all covariates are equal to zero. Note that this is closely related to the intercept in other regression models, such as linear or logistic regression.
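One way to see the intercept analogy is to take the logarithm of both sides of the model. The short derivation below is not part of the original article, but it follows directly from the model as defined: the log of the baseline hazard plays the role of a time-dependent intercept, and increasing a single covariate Xj by one unit, with the others held fixed, multiplies the hazard by e raised to the corresponding beta.

```latex
\begin{align*}
\log h(t) &= \log h_0(t) + \beta_1 X_1 + \dots + \beta_p X_p \\
\frac{h(t \mid X_j + 1)}{h(t \mid X_j)} &= e^{\beta_j}
\end{align*}
```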
According to this model, there’s no direct relationship between the covariates and the survival time. This model is called semi-parametric because the hazard rate at time t is the product of two parts: a baseline hazard rate, which is estimated from the data and doesn’t have a parametric closed form, and a multiplicative component, which is parameterized.
This model is called a proportional hazards model because it allows you to compare the ratio of two hazard functions. In this case, given an estimated model, the ratio between the hazards of two different data points, with covariates X and X', is:
$$\frac{h_1(t)}{h_2(t)} = \frac{h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_p X_p}}{h_0(t)\, e^{\beta_1 X'_1 + \dots + \beta_p X'_p}} = e^{\beta_1 (X_1 - X'_1) + \dots + \beta_p (X_p - X'_p)}$$
The baseline hazard rate cancels out and the resulting ratio between the hazards is only a function of coefficients and covariates and again doesn’t depend on time. This is closely related to logistic regression where the log of the odds is estimated. Also, the Cox PH regression model doesn’t directly specify the survival function, and the information it provides focuses on the ratio or proportion of hazard functions. Therefore, it’s primarily used to understand the effects of covariates on survivability, rather than to directly estimate the survival function.
Figure 1 Survival Representation of Machine Failures (timelines for Machine X and Machine Y along a Time axis)
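The cancellation of the baseline hazard can be checked numerically. The sketch below is illustrative only: the coefficients, the covariate values for the two machines and the form of baseline_hazard are all made up for the example, and a fitted Cox PH model would estimate the coefficients and the baseline from data instead.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
betas = [0.8, -0.3]

def baseline_hazard(t):
    """Stand-in for h0(t); its exact shape doesn't matter for the ratio."""
    return 0.01 + 0.002 * t

def hazard(t, x):
    """h(t) = h0(t) * exp(beta_1 * x_1 + ... + beta_p * x_p)."""
    linear = sum(b * xi for b, xi in zip(betas, x))
    return baseline_hazard(t) * math.exp(linear)

def hazard_ratio(x, x_prime):
    """exp(beta_1 * (x_1 - x'_1) + ... + beta_p * (x_p - x'_p))."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x, x_prime)))

# Hypothetical covariate vectors for two machines.
machine_a = [1.5, 2.0]
machine_b = [0.5, 1.0]

# The ratio computed from the full hazards at several times...
ratios_over_time = [hazard(t, machine_a) / hazard(t, machine_b)
                    for t in (1, 10, 100)]
# ...matches the closed form, which never touches the baseline hazard.
closed_form = hazard_ratio(machine_a, machine_b)

print(ratios_over_time, closed_form)
```

Swapping in any other positive baseline_hazard leaves the ratio unchanged, which is why the model can rank machines by relative risk without specifying the survival function itself.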
Machine Learning