Abstract: This paper presents a method for the predictive maintenance of distribution transformers, that is, a method for predicting which transformers are most likely to fail soon. Once identified, such transformers can be maintained or replaced. This practice reduces costs and increases the reliability of power distribution systems. The practice is common in transmission systems, where physical methods such as dissolved gas analysis (DGA) achieve excellent results. Data-driven techniques that use DGA data are also popular. Such methods, however, are cost prohibitive for distribution systems. Instead, this paper proposes a data-driven framework that uses only readily available data. Such data include transformer specifications, loading, location, and weather-related information. These data motivate the use of two suitable machine learning algorithms. The first is the random forest. The second is the Random Undersampling with AdaBoost (RUSBoost) algorithm. These algorithms are tested on over 700,000 distribution transformers in Southern California. The test finds that both algorithms outperform the current state of practice. Further, it finds that the RUSBoost algorithm
performs better than the random forest.

Index Terms: Data-driven method, distribution transformer, predictive maintenance, random forest.

I. INTRODUCTION

An aging infrastructure is the undoing of a reliable electric grid. Unhealthy hardware can result in power outages, raise the cost of power, and start fires. Equipment failure caused 15% of the electric disturbances reported to the U.S. Department of Energy in 2015. The current electric transmission and distribution infrastructure in the United States is aging. Much electric grid equipment is approaching or has surpassed its useful life: 70% of power transformers are 25+ years old, 60% of circuit breakers are 30+ years old, and over 60% of distribution poles are 30-50 years old. These ages approach or surpass the respective useful lives of 25 years, 20 years, and 50 years [1].

One critical hardware component susceptible to failure is the distribution transformer. There are many ways for a transformer to fail. For example, high ambient temperatures and excessive loading may damage a transformer. A deficient power supply or exposure to a hostile environment can destroy one. Something as simple as poor workmanship can lead to a transformer's demise [2]. Yet the most common cause of transformer failure is age. The average age of distribution transformers in the United States is even higher than that of the transformers in the transmission system. Thus, proper maintenance of distribution transformers is essential.

Manuscript received June 25, 2018. Authors are with the Department of Electrical and Engineering, University of California, Riverside, CA 92507, USA (e-mail: fkabi001@ucr.edu; bfogg002@ucr.edu; nyu@ucr.edu).

Current equipment maintenance strategies fall into three main categories. The first is run-to-failure, in which interventions occur only after a transformer has already failed. The second is preventive maintenance, in which maintenance actions are carried out according to a planned schedule. The final category, predictive maintenance, is the most cost effective. Predictive maintenance attempts to assess the health condition of each device. This allows for the advance detection of pending failures [3]. The detection, in turn, allows maintenance to be targeted at the devices most in need.

Currently, electric utilities practice run-to-failure maintenance management for distribution transformers. Employing predictive maintenance instead would be beneficial. It would help to achieve more reliable system operations and reduce the number of sudden power supply interruptions. These benefits are shared by predictive and preventive maintenance, but predictive maintenance further reduces costs by avoiding unnecessary maintenance operations.

Existing predictive maintenance research and practice focus on large power transformers. The methods assess transformer health via dissolved gas analysis (DGA). DGA is a well-known diagnostic technique in the industry [4]. It works by monitoring the concentration of certain gases in the insulation oil of a transformer.
The concentration of the dissolved gases is characteristic of the insulation's decomposition. Gases used in DGA include hydrogen, methane, ethane, acetylene, ethylene, carbon monoxide, and carbon dioxide.

DGA has also been combined with data-centric machine learning techniques. Tested techniques include artificial neural networks (ANN) [5]-[7] and fuzzy logic [7]. Support vector machines, the extreme learning machine (ELM), and deep belief networks have been employed as well [8]-[10]. These methods identify patterns in historical DGA data to assess transformer health.

Data Driven Predictive Maintenance of Distribution Transformers
Farzana Kabir, Brandon Foggo, Student Member, IEEE, and Nanpeng Yu, Senior Member, IEEE
2018 China International Conference on Electricity Distribution, Tianjin, 17-19 Sep. 2018
CICED2018 Paper No. 201805280000166

Many such studies formulate the failure prediction problem as a supervised classification task, and the reported results are excellent. An evaluation of 15 standard machine learning algorithms was performed in [4]. The authors separated their results by false alarm rate. At a false alarm rate of 1%, they were able to detect between 30% and 50% of faulty transformers. When allowed a false alarm rate of 10%, they could detect 80% to 85% of faulty transformers.

DGA, however, requires semiconductor gas sensors on each transformer. Installing these is feasible for transmission systems, which do not have many transformers: high-voltage power transformers make up 3% of all transformers in the United States. Distribution systems have far more, so these installations are prohibitively expensive for them.

There are, however, less direct ways of predicting transformer failure. For example, environmental conditions play a causal role in transformer failure. Thus, data related to these conditions contain information about a transformer's health. This is partially verified in [4], which supplements DGA data with transformer-specific features such as age and nominal power. Such data are low cost and readily available. They thus enable inexpensive predictive maintenance.

This study focuses on predictive maintenance of distribution transformers. Machine learning techniques are applied to model the dependency between low-cost data and transformer health. The random undersampling with boosting (RUSBoost) algorithm is adopted to handle data imbalance. The unique contribution of this paper is that it uses only low-cost transformer-specific and environment-related features.

The rest of this paper is organized as follows. In Section II, an overall framework of the failure prediction problem is presented. Section III describes the technical methods used in the study. Section IV presents the case study by describing the dataset and the application of the machine learning algorithms to the dataset. The performance of the failure prediction models is reported in Section V. Finally, Section VI concludes the paper.

II. FRAMEWORK

The aim of this study is
to predict whether a distribution transformer will fail within a given horizon. The prediction is made from transformer specification, loading, location, and weather-related data. The dataset is first divided by year into a training set, a validation set, and a test set. Transformer failure information within each period acts as a binary label, with the convention that 1 indicates a failure and 0 indicates a non-failure. Thus, the failure prediction problem is formulated as a supervised binary classification task. The dataset is denoted as $(X, Y)$. It consists of pairs $(x_i, y_i)$ of features and failure labels.

As with most real data, there are a few challenges involved in dealing with this dataset. First, some data are missing, so imputation is necessary. Second, the dimensionality of the data is high, so feature selection is important for obtaining better learning performance. Third, the dataset is of mixed type, i.e., the features can be either continuous or categorical, so a tree-based model may be useful. Lastly, transformer failures are rare events. This creates an imbalance in the dataset. As a result, traditional algorithms can create suboptimal classification models [11]. Random undersampling with boosting is employed to ease the class imbalance problem.

The study focuses on keeping the number of false failure predictions small. If this number is high, the cost of prematurely replacing healthy transformers will exceed the cost of sudden failures. As a result, the match in top N (MITN) metric is suitable for assessing the quality of a given method. To calculate this metric, transformers are first ranked by predicted likelihood of failure. The N transformers deemed most likely to fail are placed in a set $L$. Transformers that actually failed within the given horizon are placed in a set $F$. The MITN metric is then the cardinality of $L \cap F$, i.e., $|L \cap F|$. The workflow is summarized in Fig. 1.

Fig. 1. Workflow for failure prediction of distribution transformers.

III. TECHNICAL METHODS

A. Data Preprocessing

1) Treating Missing Values

Existing methods for dealing with missing values can be divided into two categories. The first simply removes instances with missing data, but this has drawbacks such as substantial data loss and biased instance sampling. The second instead attempts to impute the missing data [12]. Some popular single imputation strategies are mean imputation, hot-deck imputation, and predictive imputation [12]. In mean imputation, missing values are replaced by the mean of the observed values of that variable. In hot-deck imputation, missing values are replaced by nearby data values from the same dataset. Predictive imputation encompasses more sophisticated procedures: it treats a variable with missing values as the target of a new classification or regression problem, with all other relevant variables as predictors. Commonly used techniques are decision trees, artificial neural networks, and random forests. However, single imputation methods might ignore the variance associated with the imputation process.
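As a minimal illustration of mean imputation and of the variance issue noted above, the following Python sketch fills missing entries of a single column with the mean of the observed entries. The function name `mean_impute` and the loading values are hypothetical and chosen only for illustration; they are not taken from the study's dataset or code.

```python
from statistics import mean, pvariance


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed_values = [v for v in values if v is not None]
    if not observed_values:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed_values)
    return [fill if v is None else v for v in values]


# Hypothetical loading readings; None marks a missing value.
loading = [10.0, None, 14.0, 18.0, None, 22.0]

observed = [v for v in loading if v is not None]
imputed = mean_impute(loading)

print("imputed column:", imputed)
print("variance of observed values:", pvariance(observed))
print("variance after mean imputation:", pvariance(imputed))
```

In this toy column, every imputed entry sits exactly at the mean, so the variance of the completed column is smaller than the variance of the observed values. This understated variability is the weakness of single imputation described above.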
Multiple imputation schemes can address this problem [13]. Using a random forest as the prediction model for imputation is a promising approach. It can handle mixed data types and high dimensionality, and it can capture complex interactions. A random forest also forms a multiple imputation scheme intrinsically, owing to the averaging over the many trees in the forest. The MissForest method [14] is an iterative imputation method based on random forests. It has been shown to outperform well-known methods such as parametric MICE [15]. Imputation error can be determined from the out-of-bag error estimates of the random forests.

2) Feature Selection

High-dimensional data have always presented a challenge to existing machine learning methods. Feature selection reduces the dimensionality by choosing a subset of the features. This increases learning accuracy, lowers computational costs, and improves model interpretability. Supervised feature selection methods are used in this study. Existing methods can be classified into filter models and wrapper models [16].

In filter methods, the relevance of each feature is ranked, and the highly ranked features are selected for inclusion in the dataset. Filter methods can also rank feature subsets instead of individual features. Popular ranking metrics include the Pearson correlation coefficient (PCC) and mutual information. The PCC is easily calculated from the dataset. Mutual information, however, must be estimated. A common nonparametric estimation method is based on nearest-neighbor distances [17].

Wrapper models rely on an interaction between feature selection and a predetermined classification algorithm. These models include sequential forward and backward selection [16]. In sequential forward selection, features are added until classification performance converges. In sequential backward selection, features are removed instead of added. Though wrapper methods have better performance, they are computationally expensive.

Decision trees inherently estimate the suitability of features. The features found at the top of a binary decision tree
are the best at separating instances for the task at hand. This characteristic can be exploited for feature selection.

B. Learning Algorithms

The random forest classification algorithm [18] is used in this study. A random forest is an ensemble of decision trees. Each tree is formed by randomly sampling features iteratively.

1) Dealing with Imbalanced Dataset

When a dataset is imbalanced, learning algorithms tend to underperform on the minority class. Data resampling and boosting are two techniques that ease the data imbalance problem. Undersampling removes examples from the majority class. It has the benefit of reduced training time, owing to the smaller number of training data points, but the drawback of losing useful information. Boosting builds an ensemble of models by assigning higher weights to difficult instances. In imbalanced problems, these difficult instances are the minority examples. Predictions are then made using a weighted average of the separate models. Random undersampling with boosting (RUSBoost) [19] integrates these methods. Instances are removed randomly from the majority class until the classes are balanced. An iteration of the boosting method is then performed. The undersampled training data are then resampled according to the instances' assigned weights. This process is repeated for several iterations. RUSBoost with the AdaBoost.M2 boosting algorithm [20] is adopted in this study. The random forest classifier is selected as the base learner in the AdaBoost.M2 algorithm.

IV. CASE STUDY

Predictive maintenance is performed for one of the largest utility companies, Southern California Edison. This company's distribution transformers are aging: 35% of them were approaching or had surpassed the useful life of 35 years by 2016. Thus, applying predictive maintenance to these transformers would be beneficial for the company. The prediction horizon in this study is two years.

A. Dataset Description

The predictive maintenance dataset contains over 700,000 transformers in the Los Angeles, Mono, Fresno, Riverside, San Bernardino, Orange, Kern, Tulare, and Ventura counties of California. The dataset covers the years 2012 to 2016. There are 42 categorical and 30 continuous variables. Features fall into four broad categories. The first is data related to transformer specification. These include line and phase voltages, kVA ratings, ages, manufacturers, models, subtypes, primary ratings, overhead/underg