Hollnagel: What is Resilience Engineering?

From Erik Hollnagel.

“A system is resilient if it can adjust its functioning prior to, during, or following events (changes, disturbances, and opportunities), and thereby sustain required operations under both expected and unexpected conditions.”

Origins

‘Resilience’ is a term that has been used for a long time and in several different ways. It was first used in to describe a property of timber, and to explain why some types of wood were able to accommodate sudden and severe loads without breaking. Almost four decades later, a report to the Admiralty referred to a measure called the modulus of resilience as a means of assessing the ability of materials to withstand severe conditions.

Many years later, Holling (1973) referred to the resilience of an ecosystem as the measure of its ability to absorb changes and still exist. He further contrasted resilience with stability, defined as the ability of a system to return to its equilibrium state after a temporary disturbance, but also argued that resilience and stability were two important properties of ecological systems. This later led to a distinction between engineering resilience and ecological resilience. Engineering resilience considers ecological systems to exist close to a stable steady-state. Resilience is here the ability to return to the steady-state following a perturbation. Ecological resilience emphasizes conditions far from any stable steady-state, where instabilities can flip a system from one regime of behaviour into another. Resilience is here the system’s ability to absorb disturbances before it changes the variables and processes that control behaviour.

In the early 1970s, the term ‘resilience’ began to be used as a synonym for stress resistance in psychological studies of children. It soon became a frequently used term in psychology, and many years later, in 2007, it was defined as: ‘The capacity to withstand traumatic situations and the ability to use a trauma as the start of something new’. At the beginning of the 21st century, it was picked up by the business community and used to describe the ability to dynamically reinvent business models and strategies as circumstances change.

The Resilience Engineering understanding of ‘resilience’

As the above brief history shows, the thinking about resilience has typically referred to a dichotomy of sorts: on the one hand materials, systems, or situations where resilience was absent and where adverse outcomes therefore might happen, and on the other hand materials, systems, or situations where resilience was present and where adverse outcomes could be avoided. This was also the case in the early 2000s when resilience engineering was proposed as an alternative (or as a complement) to the conventional view of safety. This led to early discussions about resilience versus robustness, resilience versus brittleness, etc.

But resilience (or more accurately, the ability to perform in a resilient manner, although this is too long to write every time) is not about avoiding failures and breakdowns, i.e., it is not just the opposite of a lack of safety. When it was said, in ‘Resilience Engineering: Concepts and Precepts’, that ‘failure is the flip side of success’, the intention was not to propose a binary universe, but rather to point out that things that go wrong happen in (more or less) the same way as things that go right. (This has later been elaborated in ‘The ETTO principle’ and in ‘Safety-I and Safety-II’.) This is by no means the so-called ‘new view’ (which, by the way, was not new at all even when it was touted as such) but rather the realisation that humans always try to do what they think is right in the situation. (Remember Mach’s dictum: “Erkenntnis und Irrtum fließen aus denselben psychischen Quellen; nur der Erfolg vermag beide zu scheiden.” In English: “Knowledge and error flow from the same mental sources; only success can tell the one from the other.”)

The focus of resilience engineering is thus resilient performance, rather than resilience as a property (or quality) or resilience in an ‘X versus Y’ dichotomy. This can be seen in how the definition of resilience has changed over the years.

In the first book (Resilience Engineering: Concepts and Precepts, 2006) the following definition was given. “The essence of resilience is therefore the intrinsic ability of an organisation (system) to maintain or regain a dynamically stable state, which allows it to continue operations after a major mishap and/or in the presence of a continuous stress.”

This definition reflects the historical context by its juxtaposition of two states: one of stable functioning and one where the system has broken down. The definition is also limited to situations of threat, risk or stress.

In the fourth book (Resilience Engineering in Practice, 2010), depending on how one counts, the definition reads as follows: “The intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions.”

In this definition the emphasis on risks and threats has been reduced, and the reference is instead to ‘expected and unexpected conditions’. The focus has also changed from ‘maintaining or regaining a dynamically stable state’ to the ability to ‘sustain required operations’. The logical continuation of these developments is a definition like the following (not yet documented in a book):

A system is resilient if it can adjust its functioning prior to, during, or following events (changes, disturbances, and opportunities), and thereby sustain required operations under both expected and unexpected conditions.

The change in the definitions has been to broaden the scope of resilient performance. It is not just to be able to recover from threats and stresses, but rather to be able to perform as needed under a variety of conditions – AND TO RESPOND APPROPRIATELY TO BOTH DISTURBANCES AND OPPORTUNITIES.

The emphasis on opportunities is important for the change from protective safety to productive safety – and ultimately for the dissociation of resilience from safety, thereby leaving the sterile discussions and the stereotypes of the past behind. Resilience is about how systems perform, not just about how they remain safe. (And even here Safety-II would mean something quite different from Safety-I, but even Safety-II is just another step on the road ahead.) A system that is unable to make use of opportunities is not in a much better position than a system that cannot respond to threats and disturbances – at least not in the long term.

The above definition is probably not the last and final one. Although resilience engineering started as a contrast to conventional safety thinking (Safety-I), it should become something in its own right. Resilience engineering must free itself from the frame of reference that might have been of some value ten years ago (yet even that is doubtful), but which surely will impede any further development. Resilience engineering is about the characteristics of resilient performance per se, how we can recognise it, how we can assess (or measure) it, how we can improve it. The discussions should therefore focus on what resilience (or rather, resilient performance) IS, rather than on what it IS NOT.

For more see http://erikhollnagel.com/ideas/resilience-engineering.html