It is well established that the human contribution to the risk of operation of complex technological systems is significant, with typical estimates lying in the range of 60-85%. Human errors have been a contributor to many significant catastrophic technological accidents. Examples are 1) the termination of safety injection during the Three Mile Island accident, leading to extensive damage to the reactor core; 2) the introduction of water into the methyl isocyanate storage tank at the Union Carbide facility in Bhopal, India, which led to a large uncontrolled release and thousands of offsite fatalities; 3) the series of deliberate violations, leading to an explosion, combustion of the graphite moderator, and uncontrolled release of radioactivity at the Chernobyl nuclear plant in Ukraine (Reason, 1990). Therefore, in order to adequately characterize and quantify the risk of complex technological systems, the human contribution must be included in the risk assessment. Human reliability analysis, a component of an integrated probabilistic risk assessment, is the means by which the human contribution to risk is assessed, both qualitatively and quantitatively. Human reliability analysis as a discipline has as its goals the identification, modeling, and quantification of human failure events in the context of an accident scenario. There are literally dozens of human reliability analysis methods to choose from, good practices have been developed for human reliability analysis, many of the methods have been evaluated against these good practices, and new methods are still being developed in the U.S. and other countries around the world. However, many difficulties remain. A principal difficulty, and one that hampers use of human reliability analysis results in risk-informed decision-making, is the large variability associated with the analysis results, from one method to another, and between analysts for a given method. 
An important part of any comprehensive human reliability analysis is a task analysis. Task analysis is the name given to a range of techniques that can be used to examine the ways in which humans undertake particular tasks. Some of these techniques focus directly upon task performance, while others consider how specific features of the task, such as the interfaces, operating procedures, and team organization or training, can influence task performance. An important ingredient of the task analysis, however it is performed, is observations from system simulators. These observations allow the analysis team to realistically model procedure implementation, interactions between the crew and the system, and interactions among the crew members themselves during low-frequency high-consequence scenarios, for which direct observational data on human performance are lacking. Without such observations, the human reliability analysis (HRA) is likely to deviate significantly from reality. Simulator observations are also a major source of information for some of the newer human reliability analysis methods, and guidelines promulgated by the U.S. nuclear industry emphasize the importance of gathering simulator data. However, the industry guidelines do not provide guidance on how analysts could benefit from the wealth of information that can be obtained from observing simulator exercises to support understanding of crew characteristics and behavior, and other general plant-specific factors that could influence performance in particular scenarios. A current use of simulator studies is to inform the development of new human reliability analysis methods, and this effort is faced with the same lack of guidance on effectively and efficiently employing the abundance of information produced by these simulator studies. Put another way, what needs to go into an HRA is well understood. The problem the analysis community has faced for many years is how.
The resources required to analyze the resulting information are likely a reason for the infrequent use of simulator observations in support of the human reliability task analysis. The goals of the qualitative analysis are to provide insights about process improvements that reduce risk, and to produce a model of operator performance for later quantification, in particular the principal process (and deviations from this process) followed by operators in responding to a plant upset condition. What is missing are tools to allow analysts to more efficiently and consistently make use of the sometimes vast amount of information gathered in the qualitative analysis, particularly during observations of operator responses in plant simulators. This research illustrates how select process mining tools, applied to event logs from a facility simulator, can be used to efficiently develop a model of operator performance in an accident scenario, including both the nominal process and significant deviations from this process, which could lead to risk-significant errors of commission. Such errors are known to be important contributors to risk, but have heretofore been largely absent from risk analyses of complex technological systems. This represents an advance in human reliability task analysis, which requires input from simulator observations.
The dissertation explores the following four research questions:
1. What are the requirements for a tool to aid in the analysis of large amounts of simulator data in support of human reliability analysis?
2. How do current human reliability analysis methods approach the issue of simulator observations and are these approaches suitable for incorporating simulator observations into the human reliability task analysis?
3. Are there tools in other domains that are more suitable and which, if adopted (and adapted to their new domain), could improve the state of the art in human reliability modeling and task analysis?
4. What are the limits of applicability of these tools from other domains, and what improvements are needed in order to make them practical for use by an analyst who is not a specialist in using such tools?
The first question is explored in Ch. 2, which examined the overall human reliability analysis process. The following characteristics were identified from experience and a literature review as factors to be considered in human reliability modeling:
• Plant behavior and conditions
• Timing of events and the occurrence of human action cues
• Parameter indications used by the operators and changes in those parameters as the scenario proceeds
• Time available and locations necessary to implement the human actions
• Equipment available for use by the operators based on the sequence
• Environmental conditions under which the decision to act must be made and the actual response must be performed
• Degree of training, guidance, and procedure applicability
The first three and the last of these can be informed by simulator observations. However, simulators can produce very large output files, in a variety of formats. Manually analyzing such output data is very resource intensive, and has in the past limited the use of simulator experiments and observations in support of human reliability analysis. Thus, one requirement for an analysis tool is that it be capable of accepting data in a flexible format, and that it be able to handle large amounts of data. A second requirement is that the analysis cannot be purely statistical, because the human reliability analysis is concerned with the process followed by the operators, and not solely with statistical variables, such as the time at which a certain action is performed.
Ch. 2 also examined the task analysis guidance provided by two representative human reliability analysis methods, THERP and ATHEANA, both of which are considered complete methods, in that they address all three aspects of the analysis: identification, modeling, and quantification. In addition to these two methods, Ch. 2 also examined other approaches for human reliability task analysis. The conclusion of these examinations was that there are no extant tools in the human reliability community of practice that are suitable for analyzing large amounts of simulator data in the context of a human reliability task analysis.
In examining the third research question, Ch. 3 provided an overview of business process mining tools, along with some selected industrial applications of these tools, and concluded that these tools have potential for application in support of human reliability task analysis specifically, and simulator data analysis more generally.
To begin answering the fourth research question, Ch. 3 examined some of the process mining tools and techniques in the context of analyzing simulator data. The most promising tool appeared to be the fuzzy mining algorithm developed by Christian Guenther as part of his PhD research at TU/e. Because simulator log files are typically very large, traditional process mining approaches can be expected to produce an overly complex "spaghetti model" that would be quite opaque to analysis. The fuzzy model abstracts away irrelevant details, leaving the salient aspects of interest for the task analysis.
Ch. 4 began exploring how the tools of process mining might be applied to simulator data, beginning with a relatively small set of logs collected at the Halden Reactor Project simulator in Norway. A number of difficulties were encountered during conversion of the data files to the format required by the process mining software. These problems became even more severe in the application of Ch. 6, which involved much larger event logs.
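The contrast drawn above between an unreadable "spaghetti model" and an abstracted model can be illustrated with a minimal sketch. It is not the fuzzy miner, which relies on significance and correlation measures; it is a much simpler stand-in that counts how often one action directly follows another and keeps only the frequent transitions. The event log, the crew labels, the action names, and the threshold are all hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical event log: one ordered list of high-level operator actions
# per crew. These names are illustrative assumptions, not simulator data.
EVENT_LOG = {
    "crew_A": ["alarm", "open_procedure", "check_pressure", "start_pump", "verify_flow"],
    "crew_B": ["alarm", "open_procedure", "check_pressure", "start_pump", "verify_flow"],
    "crew_C": ["alarm", "check_pressure", "open_procedure", "stop_pump", "verify_flow"],
}


def directly_follows(log):
    """Count how often action b immediately follows action a across all crews."""
    counts = Counter()
    for trace in log.values():
        counts.update(zip(trace, trace[1:]))
    return counts


def prune(counts, min_count):
    """Keep only transitions seen at least min_count times (a crude abstraction step)."""
    return {edge: n for edge, n in counts.items() if n >= min_count}


def deviations(trace, nominal_edges):
    """List the transitions in one crew's trace that are absent from the nominal model."""
    return [edge for edge in zip(trace, trace[1:]) if edge not in nominal_edges]


if __name__ == "__main__":
    full_model = directly_follows(EVENT_LOG)
    nominal = prune(full_model, min_count=2)  # threshold is an arbitrary assumption
    print(len(full_model), "transitions before pruning,", len(nominal), "after")
    for crew, trace in EVENT_LOG.items():
        print(crew, "deviations:", deviations(trace, nominal))
```

In this toy log the pruned model retains the path followed by two of the three crews, and the third crew's departures from it surface as deviations, which is the kind of contrast between nominal process and deviation that the task analysis is after.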
Despite the problems with file conversion, Ch. 4 concluded that certain process mining tools, especially the fuzzy miner, had the potential to be of use in support of human reliability task analysis, because they could clearly highlight differences in the underlying process governing each crew’s performance.
Ch. 6 continued the examination of the fourth research question by exploring the application of process mining tools to a much larger set of simulator data collected at a U.S. plant. File conversion was found to be a particularly severe problem, worse than for the Halden data analyzed in Ch. 4, and considerable time had to be spent in writing a file conversion routine. Following file conversion, considerable up-front manual filtering of the simulator action logs to remove low-level actions was still necessary to reduce the complexity of the mined models. Such filtering has the potential to introduce errors into the resulting mined models, and so the analyst who does the filtering must have detailed knowledge of facility procedures and operations, or have access to someone who does, to ensure that such errors are not introduced. Applying the fuzzy miner to the filtered logs provided some especially useful insights for human reliability task analysis, and particularly for construction of the crew response trees being considered for use in the new hybrid human reliability analysis method described in Ch. 5. This method is not being developed as part of the research described herein, although the author is part of the team that is developing the method.
The scientific contributions of this research are as follows.
• This research illustrates how process mining, applied to event logs from a facility simulator, can be used to efficiently develop a model of operator performance in an accident scenario, including both the nominal process and significant deviations from this process, which could lead to risk-significant errors of commission. Such errors are known to be important contributors to risk, but have heretofore been largely absent from risk analyses of complex technological systems. This represents an advance in human reliability task analysis, which requires input from simulator observations.
• This research illustrates how process mining can aid in the construction of crew response trees, which are a proposed framework for task analysis and quantification in a new hybrid human reliability analysis method being developed by the U.S. Nuclear Regulatory Commission, in collaboration with a team (of which the author is a member) comprising researchers from Sandia National Laboratories, Idaho National Laboratory, the University of Maryland, the Electric Power Research Institute, and the Paul Scherrer Institute.
• This research illustrates the potential for process mining to improve data reduction and analysis for future simulator experiments at the Halden Reactor Project and elsewhere. It also illustrates some of the limitations in current process mining tools, which will need to be overcome before these techniques can be applied broadly by analysts in the field who are not process mining specialists.
Several potential future contributions of process mining to human reliability analysis and facility analysis more generally have been identified in this research, although these are not explored in detail in this dissertation.
These are 1) the potential to employ process mining in post-processing of data produced by dynamic simulation models being developed by the risk analysis research community, 2) expanding the process model by incorporating process variables such as pressure and temperature, which are often collected at very short intervals (e.g., 1 msec), 3) use of the underlying Petri net model produced by process mining to simulate operator performance in an accident scenario, 4) use of process mining to post-process data produced by dynamic PRA simulation tools, and 5) use of process mining to identify process deviations in a nuclear reprocessing facility, where such deviations could be indicative of an attempt to divert special nuclear material from the facility.
|Qualification|Doctor of Philosophy|
|Award date|31 Aug 2011|
|Place of Publication|Eindhoven|
|Publication status|Published - 2011|