Blog

Welcome to the Mangan Software blog page. Here you will find lots of useful articles regarding functional safety, process safety and news and events here at Mangan. Please feel free to have a browse around and don’t forget to sign up to the Newsletter to receive updates when a new blog post is added. If there is a blog post you would like to see then please feel free to get in contact and we will aim to add it to the Mangan Software blog page.

What is Data Integrity?

Many of us may remember playing a game as a child, commonly referred to as Telephone, where everyone would sit in a circle with the sole responsibility of passing along a message to the next player. The goal of this game was to successfully pass the original message back to the first player without any changes. If your experiences were anything like mine, you would agree that the final message rarely made it back to the first player in the same state that it left in. In some cases, the final message was so far from the original that it would induce laughter throughout the whole group. Although this game was supposed to provide laughter and enjoyment during our childhood, it was also a good teaching moment to reinforce the importance of attention to detail. This exercise is a simple demonstration of the importance of data integrity and communication and their reliance on each other.

Data Integrity in the Process Industry

In the human body, blood transports oxygen absorbed through your lungs to your body’s cells with assistance from your heart, while the kidneys are continuously filtering the same blood for impurities. In this example, three systems (heart, kidneys, lungs) are working together to ensure adequate maintenance of the body. Much like the human body, the process industry is complex and requires multiple systems working together simultaneously to achieve their goal. If any system were to break, it would result in reduced performance and, possibly, eventual failure. The data integrity challenges are very similar, regardless of whether you are tasked with designing a new site or maintaining an existing facility.

Chemical plants, refineries, and other process facilities maintain multiple documents that are required to operate the facility safely. Any challenges with maintaining these documents and work processes could result in process upsets, injuries, downtime, production loss, environmental releases, lost revenue, increased overhead, and many more negative outcomes. Below is just a small sample of the critical documents that must be updated to reflect actual engineering design:

  • P&IDs
  • Electrical One-Lines
  • Cause & Effects
  • Instrument Index
  • Loop Diagrams
  • Control Narratives
  • Wiring Diagrams
  • Process Control Logic

There are many processes and workflows that may trigger required changes to the above documentation, such as PHAs, LOPAs, HAZOPs, MOCs, SRSs, Maintenance Events, and Action Items, to name a few. Each of these processes requires specific personnel from multiple groups to complete. As the example earlier in this blog pointed out, it can be a challenge to communicate efficiently and effectively in a small group, much less across multiple groups and organizations. Data integrity can easily be compromised by having multiple processes and multiple workgroups involved in decisions affecting multiple documents.

Data Integrity Considerations

When starting a new project or becoming involved in a new process, it is essential to consider how the requested changes will affect other workgroups and their respective documentation. Will your change impact others? Could understanding how your changes affect other data and workgroups minimize rework or prevent incidents? Could seeing the full picture help you to make better decisions for your work process? Below are some approaches to consider to improve data integrity and communication in your workspace:
  • Understand how changes you make may affect others
  • Identify duplicated data that exists across multiple databases or files (a minimal audit sketch follows this list)
  • Look for ways to consolidate data and processes
  • Create procedures to audit required changes
  • Designate Systems of Record (SOR) for all data
  • Implement roles to follow guidelines and maintain integrity and communication
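To make the duplicated-data point concrete, here is a minimal audit sketch in Python. It assumes two hypothetical .csv exports (instrument_index.csv and sis_config.csv) that both carry Tag and TripSetpoint columns; the file and column names are illustrative only, not from any particular system.

```python
import csv

def load_tags(path, tag_col, value_col):
    """Load a {tag: value} map from a .csv export of one data source."""
    with open(path, newline="") as f:
        return {row[tag_col].strip().upper(): row[value_col].strip()
                for row in csv.DictReader(f)}

# Hypothetical exports from two systems that both store trip setpoints.
index_data = load_tags("instrument_index.csv", "Tag", "TripSetpoint")
sis_data = load_tags("sis_config.csv", "Tag", "TripSetpoint")

# Tags present in both sources whose values disagree are data integrity
# findings, as are tags that appear in only one source.
for tag in sorted(index_data.keys() & sis_data.keys()):
    if index_data[tag] != sis_data[tag]:
        print(f"{tag}: index={index_data[tag]} vs SIS={sis_data[tag]}")
for tag in sorted(index_data.keys() ^ sis_data.keys()):
    print(f"{tag}: present in only one system")
```

Every mismatch found this way is exactly the kind of discrepancy that designating a single System of Record is meant to eliminate.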

Digital Transformation of Control and Safety Systems

The Digital Transformation of Control and Safety Systems has come a long way. These systems used to be simple, yet they were unreliable, not very robust, or died from neglect. In the past, the term Safety System generally wasn’t used very much; rather, you would see terms such as ESD and Interlock. The technologies of the time were often process-connected switches and relays that were difficult to monitor, troubleshoot, and maintain. Field instrumentation used 3-15 psig air or 4-20 mA signals. Things have changed since then. Systems have become far more effective, yet with that, a lot more complicated as well.

As control systems, safety systems, and field instrumentation were digitized, the amount of data a user has to specify and manage grew by orders of magnitude. Things that were defined by hardware design, that were generally unchangeable after components were specified, became functions of software and user configuration data which could be changed with relatively little effort.  This caused the management of changes, software revisions, and configuration data to become a major part of ownership.
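As an illustration of why configuration data now needs managed change control, here is a minimal sketch. It assumes device configurations can be exported as simple parameter/value snapshots; the parameter names are hypothetical.

```python
def diff_config(baseline: dict, current: dict) -> list[str]:
    """Report parameters added, removed, or changed between two
    device configuration snapshots."""
    findings = []
    for param in sorted(baseline.keys() | current.keys()):
        if param not in current:
            findings.append(f"{param}: removed (was {baseline[param]})")
        elif param not in baseline:
            findings.append(f"{param}: added = {current[param]}")
        elif baseline[param] != current[param]:
            findings.append(f"{param}: {baseline[param]} -> {current[param]}")
    return findings

# Hypothetical transmitter snapshots taken before and after maintenance.
baseline = {"URV": "250 psig", "LRV": "0 psig", "Damping": "0.5 s"}
current = {"URV": "300 psig", "LRV": "0 psig", "Damping": "0.5 s"}
for line in diff_config(baseline, current):
    print(line)  # URV: 250 psig -> 300 psig -- flag for MOC review
```

Any drift between the approved baseline and the as-found configuration then becomes a Management of Change question rather than a silent change.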

The problem is that the market is dominated by proprietary systems that apply only to each manufacturer’s line of products, so the user is required to have multiple software packages to support the wide variety of instrumentation, control systems, safety systems, and maintenance management support systems that exist in any of today’s process plants. Here’s an overview of the evolution and landscape of these systems and the relative chaos that still exists.


Field Instrumentation 

Back in the early 1980’s, an operating company was involved in the first round of process control system upgrades to the first generation of DCSs that were available. There were projects for field testing prototypes of new digital transmitters from major manufacturers. The transmitters being tested were similar to the 4-20 mA transmitters, but the digital circuitry that replaced the old analog circuitry was programmed by a bulky handheld communicator. It took about 10 parameters to set up a transmitter.

Now you can’t buy anything other than a digital transmitter, and instead of a few parameters, there are dozens available. Digital valve controllers have also become common, and their available parameters number in the hundreds. Device types with digital operation have also exploded, including the adoption of wireless and IoT devices. The functionality and reliability of these devices far exceed those of their prior analog circuit-based relatives. The only cost is that someone has to manage all of that data. A binder full of instrument data sheets just doesn’t work anymore.

Field Instrumentation Management Systems 

When digital field instrumentation was first introduced, the only means of managing configuration data for each device was through a handheld communications device, and the configuration data resided only on the device. This was simple enough when the parameters mirrored the settings on non-smart devices. However, these devices got more sophisticated and the variety of devices available grew. Management of their configuration data became more demanding, and the need for tools to manage that data became fairly obvious.

The market responded with a variety of Asset Management applications that extended functionality from basic configuration data management to include calibration and testing records and device performance monitoring. The systems were great, but there was a major problem: each manufacturer had packages that were proprietary to their lines of instrumentation.

There have been attempts to standardize instrument Asset Management, such as the efforts of the FDT Group, but to date most users have gravitated towards specific manufacturers’ software based upon their Enterprise or Site standard suppliers. This leaves a lot of holes when devices from other suppliers are used, especially when niche devices or exceptionally complex instruments, such as analyzers, are involved. Most users end up with one package for the bulk of their instrumentation and then a mix of other packages to address the outliers, or no management system at all for some devices. Unfortunately, manufacturers aren’t really interested in one standard.

Communications Systems 

As digital instrumentation developed, the data available was still constrained to a single process variable transmitted over the traditional 4-20 mA circuit. This led to the development of digital communications methods that would transmit considerable device operation and health data on top of, or in replacement of, the 4-20 mA PV signal. The first of these was the HART protocol, developed by one manufacturer but released to the industry as an open protocol. However, other manufacturers developed their own protocols that were incompatible with HART. As with Asset Management software, the market is divided up into competing proprietary offerings, and a user has to make choices on what to use.

In the 1990’s, in an attempt to standardize something, the Fieldbus Foundation was established to define interoperable protocols. Maneuvering for competitive advantage led some companies to establish their own consortiums, such as Profibus and WorldFIP, that used their own protocols. The field instrument communications world has settled on a few competing and incompatible systems. Today a user basically has to choose between HART, Fieldbus, Profibus, and DeviceNet, and then use the appropriate, often proprietary, support software and hardware.

Distributed Control Systems and PLC’s 

Back in 1980, programming devices required customized hardware. The PLC had its own suitcase-sized computer that could only be used for that PLC. Again, data was reasonably manageable, but crude by today’s standards.

Over the years the power of the modules has evolved from the original designs, which could handle 8 functions, period, to modules that can operate all or most of a process plant. The industry came up with a new term, ICSS, for Integrated Control and Safety System, to describe DCSs that had been expanded to include PLC functions as well as Safety Instrumented Systems.

The data involved in these systems has likewise exploded, as have the tools and procedures for managing that data. The manufacturers of DCS, PLC, and SIS systems have entire sub-businesses devoted to the management of the data associated with their systems.

As with other systems software, the available applications are usually proprietary to specific manufacturers. Packages that started out as simpler (relatively speaking) configuration management software were extended to include additional functions such as alarm management, loop tuning and optimization, and varying degrees of integration with field device Asset Management Systems.

Safety Instrumented Systems 

Safety Instrumented System logic solvers were introduced in the early 1980’s, first as rather expensive and difficult-to-own stand-alone systems. SISs evolved and became more economical. While there are still stand-alone SISs available, some of the DCS manufacturers have moved to offering Integrated Control and Safety Systems (ICSS), in which hardware and software for Basic Process Control (BPCS), SIS, and higher-level functions such as Historians and Advanced Control applications are offered within integrated product lines.

As with all of the other aspects of support software, the packages available for configuration and data management of SIS hardware and software are proprietary to the SIS manufacturers.

Operation and Maintenance Systems 

The generalized Operation and Maintenance Systems that most organizations use to manage their maintenance organizations are well developed for what they do. Typically, these packages are focused on management of work orders, labor, and warehouse inventory, and they aren’t at all suitable for management of control and safety systems.

Most of the currently available packages started out as offerings by smaller companies but have been absorbed into large corporations that have focused on extending what were plant-level applications into full Enterprise Management Systems that keep the accountants and bean counters happy, but make life miserable for line operations, maintenance, and engineering personnel. I recall attending an advanced control conference in which Tom Peters (In Search of Excellence) was the keynote speaker. He had a sub-text in his presentation that he hated EMS, especially SAP. His mantra was “SAP is for saps”, which was received with much head nodding in the audience of practicing engineers.

Some of the Operations and Maintenance Systems have attempted to add bolt-on functionality, but in my view, they are all failures. As described above, the management tools for control and safety systems are fragmented and proprietary, and attempting to integrate them into generalized Operation and Maintenance Systems just doesn’t work. These systems are best left to the money guys who don’t really care about control and safety systems (except when they don’t work).

Process Safety System Data and Documentation 

The support and management software for SISs addresses only the nuts and bolts of programming and maintaining SIS hardware. It has no, or highly limited, functionality for managing the overall Safety Life Cycle from initial hazard identification through testing and maintenance of protective functions such as SIFs and other Independent Protection Layers (IPLs). Some of the Operation and Maintenance System suppliers have attempted to bolt on some version of Process Safety Management functionality, but I have yet to see one that was any good. In the last decade a few engineering organizations have released various versions of software that integrate the overall Safety Lifecycle phases. The approach and quality of these packages varies. I’m biased and think that Mangan Software Solutions’ SLM package is the best of the available selections. The ARC Advisory Group agrees.

Conclusions  

The Digital Transformation of Control and Safety Systems has resulted in far more powerful and reliable systems than their analog and discrete component predecessors. However, the software required to support and manage these systems is a balkanized mix of separate, proprietary, and incompatible software packages, each of which has a narrow scope of functionality. A typical plant user is forced to support multiple packages based upon the control and safety systems that are installed in their facilities. The selection of those systems needs to consider their support requirements, and once systems are selected it is extremely difficult to consider alternatives, as switching usually requires a complete set of parallel support software which carries its own set of plant support requirements. Typically, a facility will require a variety of applications, which include:

  • Field device support software and handheld communicators
  • Field device Asset Management software, typically multiple packages if the User uses multiple suppliers
  • DCS/BPCS/PLC/ICSS support software for configuration, alarm management, and optimization functions as used by the Site. If a Site has multiple suppliers, then multiple parallel packages are required
  • SIS support software for configuration and software management, if not integrated with an ICSS software package. If a Site has multiple suppliers, then multiple parallel packages are required
  • Operations and Maintenance Management packages – selected by others and not within the control of personnel responsible for Process Control and Safety Systems
  • Safety Lifecycle Management software – preferably an integrated package that includes Hazard Analysis; Safety Function and System design; and Safety Function testing, event data collection, and performance analysis and management functions

So choose wisely.  

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software and has performed numerous Functional Safety Assessments for both existing and new SISs. 

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee for Instrumentation and Control Systems. 

Moving Existing Data into the SLM® solution

When considering whether to move Safety Lifecycle Management into the SLM® solution, the question “What do I do with my existing data?” arises. This was a significant concern when the SLM® software was being developed and has been addressed. SLM® software has an Adapter Module that provides the tools for importing data into the SLM® system and exporting data to external systems. Import Adapters use an intermediate .csv file, typically created in Excel, to organize data so that the SLM® software can read the data, create the correct object hierarchy, and then import the data into SLM® software data fields. The software import process is illustrated in the figure below.

[Figure: SLM® software data import process]

During planning for an SLM® software installation, the user and Mangan Software Solutions staff will review the data that is available for import and identify what Adapters are needed to support data import. During this review, the linkages between Modules and data objects should be reviewed to ensure that, after import, objects such as HAZOP Scenarios, LOPAs, IPL Assets, and Devices are properly linked. If large amounts of data come from applications for which an Adapter has not yet been created, it is usually advisable to have the MSS team create a suitable Adapter instead of attempting to use a Generic Import Adapter.

Once the user’s data has been exported to the intermediate .csv file, a data quality review and cleanup step is advisable. Depending upon the data source, there are likely to be many internal inconsistencies that are much easier to correct prior to import. These may be things as simple as spelling errors, completely wrong data, or inconsistent data stored in the source application. I recall a colleague noting after a mass import from a legacy database to a SmartPlant Instrumentation database – “I didn’t realize how many ways there were to incorrectly spell Fisher.”

Once the data has been imported, correcting such things can be very tedious unless you are able to get into the database itself. For most users, errors like these get corrected one object at a time. However, editing these problems out of the .csv file is quick and simple compared to post-import cleanup.
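As an example of the kind of pre-import check that pays for itself, the sketch below scans a hypothetical Manufacturer column in the intermediate .csv for spelling variants. The file and column names are assumptions for illustration, not part of the SLM® Adapter Module.

```python
import csv
from collections import Counter

# Count the distinct spellings in a hypothetical "Manufacturer" column
# of the intermediate .csv, so variants can be corrected before import.
with open("slm_import.csv", newline="") as f:
    spellings = Counter(row["Manufacturer"].strip() for row in csv.DictReader(f))

# Group spellings that collapse to the same normalized form.
groups = {}
for name in spellings:
    groups.setdefault(name.lower().replace(" ", "").replace(".", ""), []).append(name)

for variants in groups.values():
    if len(variants) > 1:  # e.g. ["Fisher", "FISHER", "fisher"] -> fix before import
        print("Possible duplicates:", variants)
```

A few minutes with a script like this on the .csv saves hours of object-by-object correction after import.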

To import the data, the User goes to the Adapter Module, chooses the desired Import Adapter, and identifies the .csv file that contains the data. The SLM® solution does the rest. It should also be noted that SLM® software is capable of exporting data too. The User selects the data types to export along with the scope (e.g. a Site or Unit). The exported data is in the form of a .csv file, which can be used to import data into a 3rd party application or as a template for importing more data.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software and has performed numerous Functional Safety Assessments for both existing and new SISs.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee for Instrumentation and Control Systems.

Digitalizing Safety Information into Intelligence

What is Digital Transformation and how can the SLM® system help?
Digital Transformation is the process of converting non-digital or manual information into a digital (i.e. computer-readable) format. For an organization, digital transformation is a great opportunity to take a step back and evaluate everything it does, from basic operations to complex workflows.

Digital transformation is a time to understand the potential opportunity in a technology investment. It’s an ideal time to ask questions such as ‘Can we change our processes to allow for greater efficiencies that potentially allow for better decision making and cost savings?’ A perfect example could be trending data to identify optimum test intervals based on degradation over time, which could provide cost savings through fewer required tests.
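As a worked illustration of that example: under the simplified low-demand approximation PFDavg ≈ λDU × TI / 2, a proof-test interval can be backed out from a target PFDavg and a failure rate trended from test records. The numbers below are hypothetical, and a real assessment would also account for architecture, proof-test coverage, and mission time.

```python
def max_test_interval_years(pfd_avg_target: float, lambda_du_per_year: float) -> float:
    """Largest proof-test interval that still meets the PFDavg target,
    using the simplified single-channel approximation
    PFDavg = lambda_DU * TI / 2."""
    return 2.0 * pfd_avg_target / lambda_du_per_year

# Hypothetical numbers: a SIL 2 target PFDavg of 5e-3 and an observed
# dangerous-undetected failure rate of 4e-3 per year from test records.
print(max_test_interval_years(5e-3, 4e-3))  # 2.5 years
```

If trended in-service data shows a lower failure rate than originally assumed, the same arithmetic may justify a longer interval, and hence fewer tests.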

Advantages of Digital Transformation

The key tactical benefit of digital transformation is to improve the efficiency of core business processes. In the image below, you can see the efficiencies provided by digital data broken down into three key module areas:

[Figure: efficiencies provided by digital data, broken down by module area]

Let SLM® be your Digital Twin

Digital twins are powerful virtual representations that drive innovation and performance. Imagine a digital replica of your most talented product technicians with the most advanced monitoring, analytical, and predictive capabilities at their fingertips. It is estimated that companies that invest in digital twin technology will see a 30 percent improvement in the cycle times of critical processes.

A digital twin captures a virtual model of an organization and helps accelerate strategy. This could be in products, operations, or services, and it can even help drive the innovation of new business. The model can identify elements that are hindering or enabling strategy execution and suggest specific recommendations based on embedded pattern recognition. Digital twin technology is used to collect more dots and connect them faster, so you can drive to better solutions with more confidence.

Today’s organizations are complex, evolving systems built on the collective ambitions and talents of real people operating in a dynamic culture. The world is increasingly defined by data and machine learning; however, there is no simple way to measure human motivation and no clear-cut formula for building an effective future. In a nutshell, a digital twin is a tool that can be used to analyze your business to identify potential concerns in any area and show you how those issues link together. Armed with that information, you can build solutions immediately and overcome the most important obstacles – all before they happen. Get in touch and let our Safety Lifecycle Management tools manage your digital needs.

SLM® for Process Safety Solution

Mangan Software Solutions (MSS) is a leading supplier in the Process Safety and Safety Lifecycle software industry. For the past decade, MSS has been leading the market in innovative technologies for the Refining, Upstream Oil & Gas, Chemical, Pipeline, and Biopharmaceutical industries, transforming Process Safety Information into Process Safety Intelligence. MSS’ engineers and programmers are experts in the fields of Safety Lifecycle Management and Safety Instrumented Systems. With a scalable software platform and years of experience working with the premier energy companies in the world, MSS has established itself as the leader in software solutions engineered specifically to clients’ needs.

SLM® HAZOP Module

With our market-leading SLM® software, our clients are able to conduct, review, report, and approve HAZOP studies in one place without tedious work in Excel or other closed toolsets that keep you from your data.

The SLM® HAZOP module ensures HAZOP Study uniformity across the enterprise and ensures that reporting is standardized and consistent. It allows direct comparison of hazard and risk assessments between sites or units. The SLM® Dynamic Risk Matrix visually identifies enterprise hazards and risks. The HAZOP Study data can be filtered based on site, unit, health & safety, commercial, or environmental criteria.


SLM® LOPA Module

The SLM® LOPA module now provides intuitive worksheets to standardize your LOPA process and conduct IPL assessments. The Dynamic Risk Matrix is configurable to your risk ranking system and severities and offers real-time risk monitoring and identification. Dynamic reports and KPIs reveal unmitigated risks to allow for IPL gap closure scheduling and progress status. These reports offer unprecedented review of risk mitigation strategies.
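To illustrate the general idea of a configurable risk ranking (this is a conceptual sketch, not the actual SLM® Dynamic Risk Matrix implementation), a severity-by-likelihood matrix can be thought of as a simple lookup table with categories chosen by the user:

```python
# Illustrative only: a configurable severity x likelihood ranking table.
RISK_MATRIX = {
    # (severity, likelihood) -> risk rank
    ("Minor", "Unlikely"): "Low",
    ("Minor", "Likely"): "Medium",
    ("Major", "Unlikely"): "Medium",
    ("Major", "Likely"): "High",
    ("Severe", "Unlikely"): "High",
    ("Severe", "Likely"): "Intolerable",
}

def risk_rank(severity: str, likelihood: str) -> str:
    """Look up the configured risk rank for a scenario."""
    return RISK_MATRIX[(severity, likelihood)]

print(risk_rank("Major", "Likely"))  # High -> drives IPL gap closure priority
```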


SLM® Action Item Tracker Module

Identify risks and safeguards and track them with action items from HAZOP meetings through to the implementation of an IPL. The SLM® Action Item Tracker module is a centralized area where users can access assigned action item information pulled from all modules for action or reporting. Data relating to the action item is linked across modules and readily available for reference purposes. Customized reports and KPIs are available with a click of the mouse.


SLM® Functional Safety Assessment Module

The SLM® Functional Safety Assessment (FSA) module allows you to readily complete a Stage 1 through Stage 5 FSA in a standardized format – ensuring consistency throughout your organization. This tool allows you to define requirements for an FSA and then use the application to improve the effectiveness and efficiency of execution.

Part 2 – Hazard Identification and Allocation of Safety Functions

In Part 1 of this blog series, Digitalization Demands An Integrated Safety Lifecycle Management System, the general organization of the Safety Lifecycle, as described in IEC 61511, was discussed. Part 1 also highlighted the difficulties that the tools typically used in day-to-day operations have in effectively administering the Safety Lifecycle.

In Part 2 of this blog series, the discussion moves on to a more detailed view of Safety Lifecycle Management for the Requirements Identification phases of the Safety Lifecycle, as illustrated in the modified IEC 61511 Figure 1 below.

[Figure: Safety Lifecycle framework, adapted from IEC 61511 Figure 1]

Hazard Identification and Allocation of Safety Functions

While IEC 61511 does not specify procedures, it does require that a hazard and risk assessment be performed and that protective functions that prevent the hazard be identified and allocated as appropriate to Safety Instrumented Functions.

In practice this is usually accomplished by performing a hazard assessment using HAZOP or similar techniques. Scenarios that have a high consequence are then further evaluated using LOPA or similar techniques. The LOPA studies identify protective functions or design elements that prevent the consequences of the scenario from occurring. These functions and design elements are generally designated as Independent Protection Layers (IPLs) and may take the form of instrumented functions such as Alarms, BPCS and Interlock functions, physical design elements, or Safety Instrumented Functions.
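The LOPA arithmetic behind this is straightforward: the mitigated event frequency is the initiating event frequency multiplied by the probability of failure on demand (PFD) of each credited IPL. A minimal sketch, with hypothetical numbers:

```python
import math

def mitigated_frequency(initiating_freq_per_year: float, ipl_pfds: list[float]) -> float:
    """Mitigated event frequency: initiating event frequency times the
    PFD of each credited independent protection layer."""
    return initiating_freq_per_year * math.prod(ipl_pfds)

# Hypothetical scenario: BPCS loop failure at 0.1/yr, with an alarm
# (PFD 0.1) and a relief valve (PFD 0.01) credited as IPLs.
f = mitigated_frequency(0.1, [0.1, 0.01])
target = 1e-5          # hypothetical tolerable frequency, per year
print(f, f <= target)  # 1e-4, False -> gap remains
```

When the mitigated frequency still exceeds the tolerable frequency, the remaining gap sets the required risk reduction for a new Safety Instrumented Function; here a SIF with PFD of 0.1 or better would close the gap.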

The Traditional Way

The market has a number of Process Hazards Analysis (PHA) software packages available. However, these tools are all focused on performing PHAs or associated studies such as LOPAs, and they are almost always stand-alone tools. Their capabilities have generally met the needs of Process Safety Engineers, yet they have their limitations. Some of the available packages have attempted to extend their functionality to other phases of the Safety Lifecycle, yet they still tend to fall short of providing complete Safety Lifecycle Management due to their original PHA focus.

Problems

Stand Alone

The biggest issue with stand-alone PHA and LOPA software packages is the fact that they are “stand alone”. They are self-contained, and some of them have such draconian licensing restrictions that sharing of PHA and LOPA data is extremely limited, often limited to the transfer of paper copies of reports. Licensing costs are extremely high, which results in organizations restricting the number of licenses that are available. Usually, the PHA and LOPA data can only be accessed from a very limited number of computers (often only one or two within an organization), even in view mode.

Difficult to Link PHA and LOPA

A second major issue is that it is difficult, if not impossible, to link PHA and LOPA data for a series of PHA and LOPA studies done on the same process. The typical life cycle of PHA and LOPA studies is that initial studies are done during the initial design of a process plant, and then a revalidation of those studies is done every 5 years. Within the 5-year cycle, multiple sub-studies may be done if there are any significant revisions to the process.

HAZOP of Record

Larger projects may use the same HAZOP tools as used for the HAZOP of Record, but they are usually considered in complete isolation from the HAZOP of Record. Often new nodes are defined that are numbered quite differently than the HAZOP of Record and may not contain the same equipment. As many of these studies are done at an engineering contractor’s office, the same licenses may not be used either. Many smaller modifications may be made that do not use the formal PHA procedure but use perceived simpler methods such as checklists and what-if analysis. The simpler methods are usually resorted to because of the extreme licensing limitations noted above.


The Independence Mess of Traditional HAZOP Tools

Over a typical 5-year HAZOP cycle, a large number of additional hazard assessments are done, each independent of, and often inconsistent with, the HAZOP of Record. Project-based HAZOPs may be performed on sections of the process with completely different node identifications and node scopes. In effect, there is no current HAZOP of Record, as it is partially superseded by these incremental HAZOPs and other hazard assessments. At the time of the 5-year revalidation, integration of all of these independent studies with the prior HAZOP of Record is a major undertaking.

 

As these applications are stand-alone applications, any associations of Safeguards and IPLs identified during Hazard Analysis with the real Plant Assets used to implement those items must be done externally, if done at all. This results in a layer of documentation that is often difficult to manage, of limited availability, and not very useful to the operations and maintenance personnel who really need the data.

Top 3 Issues with traditional Hazard Identification methods:

Licensing restrictions

Licensing restrictions often severely limit access to the data. Furthermore, personnel that need to understand the reasons for various IPLs do not have access to the necessary data.

No Clearly Defined Data

IPLs and other Safeguards are usually identified in general terms and often do not clearly define which Plant Assets, such as Alarms, BPCS Functions, Interlock Functions, and Safety Instrumented Functions, correspond to the identified IPLs. This is even more of a gap when a User needs to link an existing Plant Asset back to a specific IPL and PHA scenario.
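For illustration, the missing linkage can be pictured as a simple traceability chain from hazard scenario to IPL to asset tag. This is a hypothetical data-model sketch, not SLM®'s actual schema; all tags and names are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    description: str
    ipls: list["IPL"] = field(default_factory=list)

@dataclass
class IPL:
    name: str
    pfd: float
    scenario: Scenario
    asset_tags: list[str] = field(default_factory=list)

# Hypothetical linkage: an asset tag can be traced back to the IPL it
# implements and the hazard scenario that justifies its existence.
s = Scenario("High pressure in separator V-101 leads to loss of containment")
sif = IPL("SIF-101 high-pressure trip", pfd=0.01, scenario=s,
          asset_tags=["PT-101", "XV-101"])
s.ipls.append(sif)

def why_does_this_asset_exist(tag: str, ipls: list[IPL]) -> str:
    for ipl in ipls:
        if tag in ipl.asset_tags:
            return f"{tag} implements {ipl.name}: {ipl.scenario.description}"
    return f"{tag}: no linked IPL found -- a data integrity gap"

print(why_does_this_asset_exist("PT-101", [sif]))
```

With stand-alone tools, this chain lives in people's heads or scattered spreadsheets; the point of an integrated system is that the chain is queryable in both directions.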

Separate HAZOP and LOPA files

There is no way to integrate HAZOPs and LOPAs of Record with incremental HAZOPs, LOPAs, and MOC hazard assessments. This leads to multiple, inconsistent versions of the HAZOP and LOPA, which then need to be manually reconciled and often are not integrated with the HAZOPs and LOPAs of Record.

5 Major Benefits of Digitalization

 

An Integrated Safety Lifecycle System provides functionality that addresses the shortcomings of a system based upon single-purpose HAZOP and LOPA software. Among the functions not provided by traditional PHA and LOPA software are:

  • The HAZOP and LOPA modules in the software provide functionality to link HAZOPs and LOPAs that are performed as part of Management of Change activities back to the current HAZOP of Record. This assures that Management of Change PHA’s are consistent with the HAZOP of Record in that the same Nodes, Equipment and Scenarios are copied to the MOC PHA’s and become the basis for the hazard assessments.
  • MOC hazard assessment data may be easily integrated back into the HAZOP of Record when the changes are actually integrated. The original versions are kept as archive records, but the HAZOP of Record may be kept up to date and reflect the actual state of the process, and not what it was several years ago. As the incremental HAZOPs and LOPAs are integrated back into the HAZOP and LOPAs of Record as changes are implemented, there is no large task of sorting out all of the studies done since the last HAZOP of Record into a new HAZOP of Record.
  • Integrated Safety Lifecycle Management applications have global access. Licensing restrictions do not limit access to HAZOP and LOPA data to a few licensed computers. However, the Integrated Safety Lifecycle Management applications do contain security functions that allow restriction of data access to authorized Users.
  • IPLs identified by LOPAs are linked directly to the HAZOP scenarios and may also be linked directly to the Plant Assets that implement the IPLs. This means that the Process Safety basis for all IPLs is immediately available to all authorized personnel.
  • Checklists may be associated with IPLs to provide validation of the IPLs ability to mitigate the hazard and its independence from causes and other IPLs. Checklists are available at both the IPL functional level (when an IPL is identified by a LOPA) and a design level (when the Plant Assets that perform the IPLs functions are designed).

Conclusion

The traditional tools used for Process Hazards Analysis severely limit access to Process Hazards data and do not support other activities required to manage the Safety Lifecycle. Process Hazards data is fragmented and requires major efforts to keep the data current.

In an integrated Safety Lifecycle Management application, HAZOP and LOPA data is readily available to any authorized User. This includes the current HAZOP and LOPAs of Record as well as a full history of prior risk assessment studies. The linking of LOPA-identified IPLs to real Plant Assets allows the risk assessment basis for all Plant Assets that perform IPL functions to be accessed from the Plant Asset data, so an operations or maintenance user can clearly understand why various IPL functions exist and the risks that they are mitigating.

Digitalization Demands An Integrated Safety Lifecycle Management System (part 1)

An integrated safety lifecycle management system is crucial to properly manage the entire safety lifecycle from cradle to grave. Anyone who has attempted to manage the Safety Lifecycle has quickly realized that the tools that a typical processing facility uses are wholly unsuited to meet the requirements of the Safety Lifecycle.

Most tools available are single purpose and don’t exchange or share information. The tools available are directed towards managing things such as costs, labor management, warehouse inventory management, and similar business-related functions. The systems upon which these functions are based generally use a rigid hierarchy of data relationships and have little flexibility.

An Integrated Safety Lifecycle Management program must supplement or replace the traditional tools to even be considered. Otherwise, the result is a mix of paper files (or image files on network drives) and a variety of independent word processor and spreadsheet files, not to mention procedures for data collection that fall outside of what the traditional plant management tools will do. This places an unreasonable and unsustainable burden on plant personnel. These systems may be forced to work for a while, but they don't perform well over time, especially given the changes of personnel in various positions that always occur.

Safety Lifecycle Management

The Safety Lifecycle is a continuous process that originates with the conceptual design of a processing facility and continues throughout the entire service life of that process. Process Safety related functions start their life during the initial Hazard Assessments, when potential hazards and their consequences are evaluated. Protective functions are designed to prevent the consequences of the hazards from occurring, and their lifecycle proceeds through design, implementation, and operation. As plant modifications occur, existing functions may need to be modified, may be found to no longer be necessary, or new functions may be identified as being required. This results in another trip through the lifecycle, as illustrated below.

[Figure: The Safety Lifecycle]

The Safety Lifecycle and IEC 61511

IEC 61511 defines the processes to be followed when developing, implementing, and owning Safety Instrumented Systems (SIS). While the scope of IEC 61511 is limited to SIS, the concepts also apply to other identified Protective Functions, such as Basic Process Control Functions, Interlocks, Alarms, or physical Protective Functions such as barriers, drainage systems, vents, and other similar functions.

The Safety Lifecycle as described in IEC 61511 is shown in the figure below. This figure has been excerpted from IEC 61511 and annotated to tie the various steps to how Process Safety work is typically executed. These major phases represent work that is often executed by separate organizations and then passed on to the organizations responsible for the subsequent phase.

[Figure: Safety Lifecycle management process diagram, annotated from IEC 61511]

1.) Requirements Identification

This phase involves conducting Process Hazards Analyses and identifying the Protective Functions required to prevent the consequences of process hazards from occurring.

The tools typically used for these activities are a Process Hazards Analysis application and Layers of Protection Analysis (LOPA). The CCPS publication Layer of Protection Analysis: Simplified Process Risk Assessment describes the process of identification and qualification of Protective Functions, identified as Independent Protection Layers (IPL’s).

2.) Specification, Design, Installation and Verification

This phase is typically thought of as “Design”, but it is so much more:

The Specification step involves specifying the functional requirements for the identified IPL’s. When IPL’s are classified as Safety Instrumented Functions (SIF), they are defined in a Safety Requirements Specification (SRS) as defined by IEC 61511. Other non-SIF IPL’s are defined as described in the CCPS LOPA publication, although the concepts defined in IEC 61511 are also an excellent guide.

Once requirements are specified, physical design is performed. The design must conform to the functional, reliability, and independence requirements defined in the SRS or non-SIF IPL requirements specifications.
The designs of the Protective Functions are installed and then validated by inspection and functional testing. For SISs, a Functional Safety Assessment as described by IEC 61511 is performed prior to placing the SIS into service.

3.) The Ownership Phase

This is the longest duration phase, lasting the entire life of the process operation. This phase includes:

  • Operation of the process and its Protective Functions, including capture of operational events such as Demands, Bypasses, Faults, and Failures.
  • Periodic testing of Protective Functions at the intervals defined by the original SRS or IPL requirements. This involves documentation of test results and inclusion of those results in the periodic performance evaluations (a scheduling sketch follows this list).
  • Periodic review of Protective Function performance and comparison of in-service performance with the requirements of the original SRS or IPL requirements. If performance is not meeting the original specifications, identification and implementation of corrective measures is required.
  • Management of Change in Protective Functions as process modifications occur during the process lifetime. This starts a new loop in the Safety Lifecycle where modifications, additions, or deletions of Protective Functions are identified, specified, and implemented.
  • Final decommissioning, where the hazards associated with decommissioning are assessed and suitable Management of Change processes are applied.
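As a minimal sketch of the periodic-testing bookkeeping referred to above (the tags, dates, and intervals are hypothetical):

```python
from datetime import date, timedelta

def next_test_due(last_test: date, interval_months: int) -> date:
    """Next proof-test due date from the last test and the SRS interval.
    Uses a coarse 30-day month for illustration."""
    return last_test + timedelta(days=interval_months * 30)

# Hypothetical test records keyed by SIF tag: (last test, interval in months).
records = {"SIF-101": (date(2023, 1, 10), 12), "SIF-102": (date(2021, 6, 1), 24)}
today = date(2023, 9, 1)
for tag, (last, interval) in records.items():
    due = next_test_due(last, interval)
    status = "OVERDUE" if due < today else "ok"
    print(tag, due, status)  # SIF-102 comes back OVERDUE
```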

Execution Challenges

Execution of the Safety Lifecycle interacts with numerous process management tools. Some of the tools that are typically available are illustrated in the figure below. All of these tools share the characteristic that they are generally suitable for the single purposes for which they were chosen, but all have limitations that make them unsuitable for use with a Safety Lifecycle Management process.

The Safety Lifecycle involves numerous complex relationships that cross traditional organizational boundaries and require sharing of data across those boundaries. The tools traditionally used in process operational management just don’t fit the requirements of managing the Safety Lifecycle. Attempts to force-fit them to Safety Lifecycle Management result in fragmented information that is difficult to access and maintain, or that is just missing, which in turn results in excessive costs and highly ineffective Safety Lifecycle Management. The workarounds become so fragmented and complex that they rapidly become unsustainable.

[Figure: Execution Challenges – typical process management tools interacting with the Safety Lifecycle]

The Value of an Integrated Safety Lifecycle Management System

An Integrated Safety Lifecycle Management System provides the benefits that an organization expects from the protective systems installed in a facility. The System provides fit-for-purpose work processes that account for the multiple relationships among the various parts of the Safety Lifecycle, which traditional tools do not provide. A few of the high-level benefits are:
  • Consistency and quality of data is vastly improved by using common processes, data selection lists, data requirements and procedures that have been thought out and optimized for the needs of managing protective systems.
  • Design of Protective Functions is made much more efficient due to standardization of the information needed and the ability to copy SRS and non-SIF IPL data from similar applications that exist elsewhere in an organization. Design data is readily available to all authorized Users that need that data.
  • Process Safety awareness is enhanced because the Safety Lifecycle Management System provides links between the originating hazard assessments, PHA Scenarios, LOPA’s, LOPA IPL’s and the Plant Assets used to implement the Protective Functions. Authorized users can readily identify Protective Functions and Plant Assets that implement them, and directly access the process hazards for which the functions were installed to prevent.
  • Protective Function and associated Plant Asset performance events can be readily captured with a minimum of effort. The Safety Lifecycle Management System collects all of the event data and automatically produces performance data such as Tests Overdue, Tests Upcoming, Demand Rates, Failure Rates, and Prior Use statistics on a real-time basis (a sketch of the underlying arithmetic follows this list). The performance can be reported on a Unit, Site, or Enterprise basis and can be categorized by Protective Function type, Device Type, Device manufacturer, or similar categories. This allows Users to fully understand the conformance of Protective Function and Device performance relative to their Safety Requirements and identify any performance issues.
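The arithmetic behind a metric like a prior-use failure rate is simple; the sketch below uses hypothetical fleet numbers and an assumed SRS failure rate for comparison.

```python
def observed_failure_rate(failures: int, device_count: int, service_years: float) -> float:
    """Point estimate of the failure rate from in-service (prior use)
    data: failures per device-year of operation."""
    return failures / (device_count * service_years)

# Hypothetical fleet: 40 identical transmitters, 5 years in service, 2 failures.
rate = observed_failure_rate(2, 40, 5.0)
srs_assumed_rate = 0.02  # hypothetical failure rate assumed in the SRS, per year
print(rate, rate <= srs_assumed_rate)  # 0.01, True -> performance meets assumption
```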
Rick Stanley has over 45 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software and has performed numerous Functional Safety Assessments for both existing and new SISs. Rick has a BS in Chemical Engineering from the University of California, Santa Barbara where he majored in beach and minored in Chemical Engineering… and has the grade point to prove it. He is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee for Instrumentation and Control Systems.

A Holistic Approach to the Safety Lifecycle

Holistic (adj): relating to or concerned with wholes or with complete systems rather than with the analysis of, treatment of, or dissection into parts.

A lot of factors enter into how a Process Safety Culture develops in an organization, but the net result is that either an organization has a positive, effective Safety Life Cycle culture or becomes exposed to major incidents that can cause a business to fail. The history of major incidents in process plants is littered with root causes related to failed Safety Cultures.

A robust Process Safety Management culture in a facility also leads to multiple other improvements. In an operation where Process Safety has a major focus, operators tend to be more attentive to keeping their units stable and on spec and the entire organization tends to be more focused on quality of work. If there is a lax Process Safety Culture, then it is easy for operations to become sloppy and for other groups to just let things slide.

When I stand back a bit and think of what factors determine whether a facility has an effective Safety Culture, the following come to mind. All of these are complex subjects, so only a bit is discussed here. However, the combined effect of these items has a deep impact upon whether a Process Safety Culture is positive or toxic. In the end, people do the work for which they are rewarded, even if the reward is just a positive performance review. If Process Safety performance is not a key item in the expectations for an employee’s performance, it’s probably not going to be something that gets a lot of effort.

Management Attitude

Unfortunately, the number one factor in determining how successful a Process Safety Culture becomes is the attitude of the management of an Enterprise or Site. I’ve had the fortune and misfortune to work in environments where the management had some level of appreciation of Process Safety, and in environments where Process Safety came right after cost, schedule, and getting my next promotion (and I hope I get out of here before something goes wrong).

A successful Process Safety Culture, and the Process Safety Management structure that evolves from it, starts at the top. In order to have an effective system, the management of an organization has to demonstrate that Process Safety is as important as the quarterly results. Management has to continue to reinforce that commitment. A basic philosophy has to be defined and spread through the organization, and the expectations of that philosophy need to be rigorously applied at all levels of management and supervision. Failure to meet those expectations has to have real consequences.

Management has to demonstrate a basic knowledge of, and high and continuous interest in the Process Safety Management System. The status of Process Safety needs to be as high on the priority list as more measurable things like production results and costs. Plant staff needs to understand that missing key performance targets for Process Safety functions such as periodic testing, having too many demands or tolerating poor safety function performance have the same consequences as other financially related shortfalls. If management isn’t actively following the Process Safety Life Cycle, they are really telling their staff that they don’t care, and the staff is going to let things slide to pursue things that they think that management cares about.

The systems also have to be robust enough that they become embedded in the organization’s operating culture, so that they can survive the changes in personnel, including management, that always happen. Personnel need to have clearly defined responsibilities and be trained to meet those responsibilities. When an individual takes on a new position, the Process Safety responsibilities and procedures need to be part of the transition process. It’s tough to build a Process Safety Culture, but it’s fairly easy to destroy one. When the first question out of a manager’s mouth is “What does it cost?” or “Why are you doing that?”, it’s a good sign that the Process Safety Culture isn’t doing very well.

Information Availability and Training

Part of implementing a robust Process Safety Management System is making sure that all of the personnel that are expected to perform tasks related to the system are fully trained and have access to the information they need. This extends far beyond just the mechanics of performing their assigned tasks.

The training they receive needs to include a clear identification of how their tasks fit in with the Safety Life Cycle Management System, full training in the underlying process hazards, and access to usable reference data. Training needs to be routinely reinforced. Refresher training should be routine, and training on changes to Process Safety Systems should be an integral part of the Management of Change procedures. As noted above, Process Safety requirements and procedures need to be part of all transition plans.

Operations personnel in particular require comprehensive initial training and periodic refresher training. Operations personnel need to be fully aware of the protective functions that are installed in their units, what process hazards are responsible for their installation, and how they are operated. Operations supervision needs to take an active role in making sure that this knowledge stays current and that operators are routinely drilled in the proper responses to process safety related events. Procedures for collection of event data for demands, failures, bypasses, and similar events need to be reinforced, and the data accurately captured.

Procedures

Written procedures need to be prepared and maintained for Process Safety related activities. This includes validation and periodic testing procedures, operating procedures and procedures for capture and transmittal of Process Safety related events such as Demands, Tests, Failures and Bypasses. These procedures need to be readily available to all individuals whose jobs involve Process Safety, which means just about everybody.

Personal Experience and Biases

Everyone who is part of the Safety Life Cycle comes to the process with their own experiences and biases. The most general categorization is those who have experienced a major incident and those who have not. The members of the “those who have” group seldom need to be convinced of the need for a robust and effective Safety Life Cycle Management process.

The “those who have not” group is often the most difficult to bring into compliance, as its members often do not recognize the critical value of the process. This is an especially difficult problem if members of management at the higher levels believe that “it can’t happen here”. Unfortunately, these folks get multiple opportunities to join the “those who have” group, and it’s just a matter of how severe their lesson is. Trevor Kletz’s books What Went Wrong? and Still Going Wrong! should be mandatory reading for those folks. They need to be convinced that it can happen to them.

Silos, Tribes and Conflict

Every process facility is organized into various departments and work groups. Over time the divisions between these departments and work groups can become tribal with each group working in their own silo and not sharing information. Information becomes power and often isn’t readily shared.

Process Safety Information is unfortunately one class of information that is far too closely held. This is partially due to the isolated nature of the common process hazards analysis software packages, but in some places, especially those with poor Process Safety Cultures, process hazard data is almost treated as a state secret. I recall on multiple occasions attempting to get copies of HAZOP data from a Process Safety Group and getting the equivalent of “who wants to know” before I could force release of the data. Not a healthy environment. Process Safety information was distributed to operations and maintenance personnel in heavily curated forms and very few people had access to the actual HAZOP data.

The same thing can happen between operations, engineering, and maintenance groups. They end up performing day-to-day work in a vacuum, and data sharing is determined only by what is available in the common operation and maintenance tools. It isn’t always intentional; that’s just the way the work processes end up dividing people.

Process Safety Management Systems require a lot of data sharing, so organizational barriers need to be broken down, or at least lowered. In a robust Process Safety Culture these barriers are not as firm, and you see a lot more data sharing than can be observed in organizations that don’t have a good Process Safety Culture.

See how industry leaders like Shell are digitizing their process safety lifecycle!

System Capabilities, Limitations and Performance

I’ve long had a private theory that the operating culture in a plant is set by the design, capabilities, and failures of the plant’s process control systems. It’s not that personnel set out to make it that way, but over time people adapt their behavior to match what the process control system allows them to do, or to what the system’s performance and reliability imposes upon them in forced workarounds or other less-than-optimum practices. Everything an operator sees on a daily basis is viewed through the lens of the information provided by the process control system, and that shapes a lot of culture. This ends up affecting how other organizations behave, as in most facilities operations is king no matter what the organization chart says.

In the same manner, the presence or absence of Process Safety Systems, and the importance that plant management and supervision place on those systems, shapes a plant’s process safety culture and determines how effective these systems are. This determines whether they become the assets they were intended to be or come to be perceived as an obstacle to operations.

Poorly designed systems may fail to provide the protection with which they have been credited. Even worse, poorly designed systems result in loss of credibility with the staff that have to work with them. Operators will not tolerate a system that causes false trips, operating difficulty or is just too hard to understand. Before long the systems are disabled, and nobody asks why.

I’ve seen lots of skepticism, some well-earned, from operators when a new safety system was installed. Often, they get handed a system designed by a contractor that had little guidance other than a Project Engineer beating them up for cost and schedule. Upon the first operational difficulties, the criticism starts. In an organization that has a poor Safety Life Cycle management system, the criticism is often just accepted, and management starts hearing the complaints and decides that the safety systems don’t really have much value.

The first requirement is that the design of all Safety Related functions gets adequate direction and review from qualified engineering staff who are skilled in design for reliability and design of human interfaces, and who understand how the plant operators view things. When performance issues do occur, the design needs to be examined to determine where the problem occurred. In some cases it's a learning experience, as prior poor operating practices may have made the operators careless and allowed the process to go where it should not have gone. In other cases the protective system operated exactly as it should have, and the operators don't initially appreciate the bullet they dodged.

Well-designed systems can have the opposite effect. Engineering and Process Safety personnel need to take the performance of the installed protective systems very seriously. These are not install-and-forget systems. Operations often needs considerable hand holding for quite a while after commissioning. This involves continued contact with operations personnel about their experiences and seriously listening to their feedback. Sometimes there are explanations, clarifications and follow up training, but just as often there is something that needs to be fixed.  All trips that occur need to be investigated to determine if a trip was valid and then operations needs to be brought into the loop on the findings.

Sometimes they just have to learn by being saved by a process safety system. I recall installing a rather complex protective system on an FCCU. The operators were very afraid of the system (first question during training – How do I turn it off? Answer – You don't. Second question – What do I do if it trips? Answer – Secure the unit, calm down and then start the restart procedure). It took a lot of convincing to get them to turn on the system, and more than a few questions over time about what it really would do.

You could tell it was always on their mind, as I seldom could walk through the control room without someone having a question or complaint, but I did make it a point to wander by fairly regularly and start a conversation before I got hijacked. One day they had an equipment failure that resulted in the system tripping the unit. The first response was that the trip system had caused it. After a couple of days of investigation, one operator realized that it really was a valid trip, and that it had saved them from a lot of equipment damage and people getting hurt. The operator passed on his epiphany to others on his crew. The questions stopped and there wasn't any more grumbling. I knew we had broken through when the operators were reminding each other about putting the system into service before they started back up.

A lot of factors affect how a Process Safety Culture develops in an organization.

 Rick Stanley has over 45 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP Rick has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software and has performed numerous Functional Safety Assessments for both existing and new SISs.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara where he majored in beach and minored in Chemical Engineering… and has the grade point to prove it. He is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee for Instrumentation and Control Systems.

Non-Instrumented Independent Protection Layers (IPLs) – Hiding in Plain Sight

Non-Instrumented independent protection layers (IPLs) are hiding in plain sight, and several types are routinely forgotten in the real world. This blog is not a complete list of non-Instrumented IPLs; instead it highlights how easy it is for these types of protective features to be overlooked and unintentionally disabled. An organization needs to rigorously manage these "invisible" IPLs to assure that they receive the maintenance and management of change attention they require to remain able to perform their functions.

When an Organization conducts Layer of Protection Analyses, IPLs are identified by the LOPA teams. As described in the CCPS publication Layer of Protection Analysis: Simplified Process Risk Assessment, there are a wide variety of functions and design features that may be credited as an IPL, provided that other criteria such as the ability to prevent a consequence, independence and availability are also met.

Many Organizations tend to focus on Instrumented IPLs such as Alarms and Basic Process Control System functions. However, there are a significant number of other, non-Instrumented IPLs for which credit may be taken. Many of these IPLs are passive or mechanical functions that fade from the Organization's attention because they often look just like non-IPL equipment.

Tank Berms and Dikes

One of the most common non-Instrumented IPLs is installation of Berms and Dikes (Bunds if you are outside of the US) that contain the contents of a storage tank or vessel should there be a loss of containment event. Berms and Dikes get a fair amount of attention during their initial design, but soon become just a background feature in the tank farm. Over time, they can degrade or be compromised by ongoing operations.

One of the more recent and spectacular failures of containment IPLs occurred during the Buncefield storage facility fire in the UK in 2005. As with most major incidents, there were a number of contributing causes, but one of them was the failure of the tank containment walls to hold the liquid released by a tank failure, which allowed inventory to escape secondary containment. The investigation found that seals in the concrete wall containment system had not been maintained, and significant material flowed beyond containment.

Drainage Systems

When sizing pressure relief systems, credit is often taken for the presence of drainage systems that will prevent the accumulation of flammable liquids around process vessels. This allows the designers to eliminate or reduce the size of the pressure relief systems for fire cases. A drainage system consists of physical grading or installation of drainage trenches or lines that carry away flammable material to a “safe location”. These systems are usually dry for years and decades and aren’t that hard to compromise. Drains and trenches can become plugged with debris or the “safe containment” area gets compromised or even built over. The Buncefield fire and explosion mentioned above was aggravated by the fact that the drainage systems failed to function as they were designed, and material that leaked from the tank containment did not flow away from the area as intended.

 

Frangible Tank Roofs

Storage tanks are subject to overpressure from a variety of sources as described in API RP-2000. For the more extreme cases, such as external fire, designers may choose to specify that the tank be constructed with a weak roof-to-wall connection, or frangible roof. The design is intended to provide a failure point that allows a path to relieve vapors generated before the tank fails at more catastrophic locations, such as the floor-to-wall seam.

The difficulty with constructing tanks with a frangible roof specification is that externally it is extremely difficult to verify that the welds at the roof seam meet the requirements for a weak seam. In tank over pressure audits conducted many years after tank construction, it was found that it was basically impossible to verify that the existing tank roof to wall welds qualified as a frangible roof. During the study, a few reports of welds not meeting frangible roof specifications were found. There is no practical means of testing the seam, so there was little alternative other than to not take credit for a frangible roof, which resulted in retrofit installation of some very large emergency roof vents.

Excess Flow Valves

Excess flow valves are typically installed to prevent the uncontrolled flow of hazardous material from a vessel to the environment should an external failure occur, such as failure of light ends pump seals or other loss of containment events involving equipment downstream of the process vessel. They are also found in transportation applications such as truck loading racks or in pipelines.

In regulated industries such as transportation and pipelines, excess flow valves typically have high visibility and usually get tested and maintained. However, in process applications this isn't necessarily the case. Process excess flow valves are often installed at the outlet of a process vessel and are of a design that uses a check valve installed in a reversed position. The check valve is held open by a mechanical linkage that is released either by a fusible link that melts when exposed to a fire or by a solenoid valve, and sometimes both.

Once installed, these valves look remarkably ordinary. They look like most any other check valve and often get ignored and sometimes forgotten. I recall being in an older process unit on other business when I happened to notice a couple of wires hanging from an open conduit. In itself this was a big issue: if those wires were energized, an ignition event could occur. So I started to look around and found a nearby, older solenoid-operated excess flow valve that was missing its wires. Worse yet, the excess flow valve hadn't operated. A bit of inspection showed that the solenoid was indeed de-energized, but the mechanical latch mechanism was severely corroded and had not allowed the valve to operate. Even more interesting, when I reported this to the Operations group they had no idea the excess flow valve was there. No wonder it never got looked at. The wiring disconnection appeared to be some casual modification; no one had any idea of when or who did it, or why. This incident started a hunt for other excess flow valves in the plant, which turned up another handful of issues. After decades of neglect, the excess flow valves got added to an inspection and maintenance list.

Check Valves

In high pressure applications, such as feed pumps for hydrocrackers, other services where liquid is being pumped from a very low pressure to a very high pressure, or when a high pressure process can back flow into a low pressure process, check valves are often depended on to provide some level of overpressure protection for the low pressure system. API RP-521 recognizes this practice and recommends that credit only be taken for installations consisting of two check valves of differing designs installed in series and describes considerations that should be used in assessing potential leakage through the check valves.

The difficulty in operating these systems is that almost every pump in a process plant has at least one check valve installed in its discharge line, so keeping track of which check valves are being credited for overpressure protection can be a challenge. It's quite easy to lose track of these valves and not give them the routine inspection and leak testing required for services where they are being used as IPLs or to reduce the low-pressure relief system requirements. The check valves used for these purposes are usually high-pressure designs (2500 or 1500 pound class) and are difficult to maintain due to their weight and, sometimes, welded ends. At the same time, the hazards of a failed check valve in these services are quite high, as high-pressure backflow will generally result in the rapid unscheduled disassembly of the low-pressure equipment.

Flame Arrestors

Flame arrestors are static devices, usually consisting of some form of metal mesh or similar convoluted flow passages, installed at the location where a tank or other vessel is vented to atmosphere. Flame arrestors are designed to prevent flame propagation from the vent outlet back into the tank or vessel, usually by cooling an external flame and reducing the flame propagation velocity.

Flame arrestors are passive devices and may remain in place for many years without any attention. This often results in the functionality being compromised due to build up of dirt, insect nests, corrosion or other degradation. Flame arrestor design is also based upon a very specific set of conditions such as the flammable material contained in the tank and environmental conditions. It is not that difficult to compromise or plug up a flame arrestor, and there are reports of them failing to function when needed or being found in an inoperable condition when inspections were eventually performed.

Block Valve Lock Systems

In some process designs, safe operation depends upon block valves installed for maintenance, startup or shutdown operations being kept in specific positions, either open or closed. For example, pressure relief system designs are often dependent upon block valves installed under pressure relief devices being kept open at all times, or upon other block valves required to isolate parallel equipment being kept open whenever process fluids are in the system.

Often block valve lock systems are manually managed with only manual monitoring. The physical "lock" varies with the operation, ranging from simple aluminum car seals such as those used on rail cars or truck doors, to newer plastic designs, to designs that use metal cables or chains with physical locks. In some cases an organization will forgo physical barriers entirely and rely only upon warning tags hung on valves.

Use of block valve lock systems requires a robust administration system whereby the status of all locked-open or locked-closed valves is continuously tracked and logged, and procedures to follow when removing or installing a block valve lock/seal and changing valve position are clearly specified and followed. If locking systems are used, an additional layer of key tracking is also required.

For a process plant of any size, there may be a large number of block valves that are designated as CSO, CSC, LO, LC etc. (Car Seal Open, Car Seal Closed, Locked Open, Locked Closed). Administration of these valve seals or locks is no small task and more than a few units have failed surveys of their valve lock systems.
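
As a sketch of the bookkeeping such an administration system has to do, consider a minimal valve seal/lock registry in Python. The tags and field names here are hypothetical, and a real system would live in a maintained database with proper audit trails:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class LockedValve:
    tag: str                  # e.g. "PSV-101 inlet block" (hypothetical)
    lock_type: str            # "car seal" or "lock"
    required_position: str    # "open" (CSO/LO) or "closed" (CSC/LC)
    as_found_position: str    # position recorded on the last survey
    last_verified: datetime
    history: list = field(default_factory=list)

    def record_move(self, new_position: str, authorized_by: str) -> None:
        # Every seal removal or position change gets a logged, authorized entry
        self.history.append((datetime.now(), self.as_found_position,
                             new_position, authorized_by))
        self.as_found_position = new_position

def survey_exceptions(registry):
    # Valves found out of their required position fail the survey
    return [v.tag for v in registry
            if v.as_found_position != v.required_position]
```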

Captive Key Systems

Captive key systems are a step above the use of simple valve seals and locks. In most cases, captive key systems are used in applications where a number of valves or other equipment must have their status changed in a specific order. In these systems, the valves or other operating equipment are provided with a mechanism that requires that a key be used to unlock the valve or system for operation. The mechanism captures the initiating key when the operation is performed and releases another key that is used to operate the next valve or system in the sequence. The system has multiple keys, all of which are different. When using a captive key system, the operator starts with an initiating key that is used to operate the first device in the chain. Keys are trapped and released in sequence, with the final device releasing a key that then is stored in a safe location. When the sequence is to be reversed, the operator starts with the final key and the sequence is reversed.
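
The trapped-key sequence described above is enforced mechanically, but its bookkeeping is easy to illustrate in code. A minimal sketch, with hypothetical keys and actions:

```python
# A trapped-key sequence: each step traps the key used and releases the
# next key in the chain, so steps cannot be performed out of order.
# Keys and actions are hypothetical; real systems enforce this mechanically.
SEQUENCE = [
    ("K1", "close and lock the reactor inlet valve"),
    ("K2", "open and lock the vent valve"),
    ("K3", "unlock the spare reactor isolation valve"),
]

def run_sequence(initiating_key: str) -> None:
    if initiating_key != SEQUENCE[0][0]:
        raise ValueError("the sequence must start with the initiating key")
    for i, (key, action) in enumerate(SEQUENCE):
        print(f"{key}: {action} ({key} is now trapped)")
        if i + 1 < len(SEQUENCE):
            print(f"   releases {SEQUENCE[i + 1][0]}")
    print("final key released; store it in a safe location")

run_sequence("K1")
```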

Captive key systems are often used to assure that equipment is safely isolated for entry or maintenance, such as in high voltage electrical systems, or in systems that require a large number of sequential valve movements to isolate equipment such as a spare reactor. The challenges of ownership are administering the starting and ending keys so they do not get lost, and keeping the various locking mechanisms clean and operable. The use of these systems is often very infrequent, and it's not difficult to lose track of keys or to find that the locking mechanisms aren't working when needed.

Conclusions

Non-Instrumented IPLs have process safety roles that are every bit as important as Instrumented IPLs. However, as they are often passive design features and may be so similar to other equipment, they often fall out of view and fail due to age, neglect or modifications. It is of critical importance that these Non-Instrumented IPLs are clearly documented and that their process safety functions are clearly communicated to Operations and Maintenance personnel so they can be taken into account during Management of Change activities. A system that manages only Instrumented IPLs and does not allow management of Non-Instrumented IPLs is incomplete and can be an obstacle to effective IPL and Process Safety Management.

 

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software and has performed numerous Functional Safety Assessments for both existing and new SISs. 

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee for Instrumentation and Control Systems. 

Bring Back the Basics – A deeper Dive into Practical Field Instrumentation in Process Safety Service

(Blog Series Part 2) General Requirements: Field Instrumentation

When it comes to specification of field instrumentation used for Process Safety related applications, designers are often left on their own. SIF input and output performance may be specified in a Safety Requirements Specification, but all too often I've observed that these specifications are minimal or jump directly to a manufacturer and model number without defining what the performance requirements actually are. I've also observed some level of carelessness in specifying field instrumentation for Process Safety related applications. Engineers, being Engineers, are often attracted to shiny things, and that attraction has no place in selecting field instruments for Safety Related services. I've seen quite a few instances where inappropriate selections were made in pursuit of the bright and shiny.

Field instrumentation must be specified and installed so that it meets the requirements of the IPL or the SIF Safety Requirements Specification. Some of the considerations to include when selecting the field instrumentation and its installation are:

  • Ability to make the required measurements with the required accuracy and response time
  • Independence of the measurement from IPL Initiating Causes and from other IPL’s
  • Ability of final elements to execute the required Safe State actions with the required response time and performance
  • Availability of the instruments and the ability to deliver performance that meets the IPL or SRS Integrity Level requirements
  • Reliability of the instruments and their installation to deliver the SIF Spurious Trip Rate specified in the SRS.
  • Maintainability and Testability of the instruments

 Making the Measurement

Quality input measurements are critical to the successful implementation of a SIF or other instrumented IPL. Input field instruments must be reliable and provide the measurement accuracy needed for a well-performing IPL or SIF.

Input measurement instruments should be selected based upon a User Organization's experience with the measurement methods and the devices specified. They also only need to perform well enough to meet the requirements. Over-specifying accuracy, or attempting to use measurement methods that don't have an established track record, can result in poor IPL performance. This isn't always obvious to a lot of SIS engineers or to the engineers specifying the field instrumentation. Specification of new, untried measurement techniques has gotten more than one engineer in trouble in pursuit of the new, bright and shiny. A few applications where this resulted in problems are highlighted below.

  • A Site was moving towards greater use of guided wave radar level measurements. Early experience in a few test applications had been limited, but positive. A decision was made to apply them to a retrofit of a set of reciprocating compressor protective shutdowns. The design proved unreliable for shutdown applications: the measurements would drop out for short periods, which the shutdown systems saw as trip conditions, shutting down the compressors repeatedly. Since then, guided wave instrument manufacturers have refined their designs to address the problems that were causing the signals to drop out.
  • A Site was moving towards the use of vortex shedding flow meters as an alternative to traditional orifice plate and DP transmitter flow measurements. A decision was made to apply them to a process heater low feed shutdown. Further, to save the cost of multiple flow meters a two-head vortex shedder design was specified. During an FSA for the SIS, the selection of this flow meter was questioned. After some investigation it was determined that the two-headed design was a prototype for the manufacturer, and that there was no data at all on its performance. The design had not even been flow tested in the lab. The FSA report recommended that this device not be used in SIF applications.
  • A Process Licensor specified a feed gas specific gravity analyzer be used in correcting a feed flow measurement, and that the corrected value be used in the unit master shutdown. The system was installed as specified by the licensor, and within the first week of operation a spurious trip occurred due to a brief power interruption to the analyzer. It was suspected that a technician had mistakenly shut off power and then turned it back on after realizing their mistake. The shutdown system saw the dip and tripped the unit on low flow. As a result, the analyzer was taken out of the flow correction calculation.

Some general thoughts on specification of input instrumentation

  • The SRS for SIF’s should specify functional requirements for input instrumentation such as the basic instrument type, required accuracy of the instrument and the trip point, and the measured range. The SRS should also identify the service severity of the process measurement (high temperature, high pressure, corrosive, high vibration, etc.).
  • Input instrumentation should be selected from types with which the User has significant experience and should be as basic as possible. Process Safety related instrumentation is not the place to try out new stuff. Complex instrumentation, especially analyzers, should be kept out of shutdown applications. Where possible, the User should have a list of qualified instrumentation, vetted either through prior use experience, manufacturers' service data or third party certification.
  • Input instrument ranges should be selected to provide a live measurement at normal operating conditions and sufficient signal margin around the trip point to give a reliable indication of a trip condition. This can be a problem in applications such as a low flow trip that uses an orifice plate and DP transmitter. There is a square relation between signal and flow, so a trip set point of 10% of the instrument's flow range is actually only 1% of the DP transmitter signal. This is generally too low a signal to reliably indicate a trip condition, so either the trip point needs to be raised or a flow measurement method that doesn't have this behavior should be considered.

It should be noted that a trip point is often specified to maintain a comfortable margin between normal operation and trip conditions, and sometimes too wide a margin is specified. There is a balance between maximizing this margin and specifying a trip point that is too low to measure reliably.
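
The square relationship is easy to quantify. For an orifice meter, flow is proportional to the square root of the measured DP, so the DP signal at a given flow fraction is the square of that fraction. A minimal illustration:

```python
def dp_signal_fraction(flow_fraction: float) -> float:
    """DP transmitter signal, as a fraction of span, for an orifice
    meter at a given fraction of full-range flow: DP = k * Q**2."""
    return flow_fraction ** 2

# A low-flow trip at 10% of the flow range sits at only 1% of DP span,
# generally too small a signal to measure reliably.
for flow in (0.10, 0.25, 0.50):
    print(f"{flow:.0%} of flow range -> "
          f"{dp_signal_fraction(flow):.1%} of DP span")
```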

Final Element Design


Final elements need to be carefully evaluated for their availability and reliability in bringing a process to a safe state. Shutdown valves need to be selected for high reliability, so they should be of a design where critical parts are not overly exposed to severe process conditions.

For SIF’s the allowable leakage across closed shutdown valves needs to be specified and needs to recognize the realities of how valves behave in real services. Those with operational experience know that no leakage specification survives the first day in process service.

Dedicated shutdown valves with open flow paths, whose critical seating surfaces are protected during the 99.99% of the time the valve is open, should be considered. Standard control valve designs, whose plugs and seats may be subjected to process wear and high pressure drop, typically should not be used. Sharing of a control valve in a shutdown design is often considered in order to save the cost of a trip valve, but in my view this is usually false economy.

Stroking speed needs to be carefully considered, and oversized actuators that can break any sticking forces that develop over time are not a bad idea. Actuator designs should be fail-safe; use of double acting actuators or motor actuators should be avoided. Where non-fail-safe actuator designs are used, the loss of motive power such as air supply or motor power must be included in PFD calculations, and the motive power sources fall under the testing and MOC requirements for SIFs.

Most shutdown valves use a solenoid valve to vent air off the actuator. Single solenoid valves are subject to spurious trips should a coil burn out or a circuit problem occur. A common solution is to use 2oo2 solenoid valves for shutdown valve services, which provides high reliability and allows for frequent online testing. Some of the original home-grown designs with which I was involved were fairly expensive to implement but were justified considering the costs of spurious trips. Commercial packages with full diagnostic and testing functionality are now readily available on the market.

Shutdown services may include shutting down motor-driven equipment. Usually a SIF or Interlock that shuts down a motor uses 1oo1 or 2oo2 interposing relays, but often the designer doesn't consider the reliability of the motor starter itself. This is a particular issue with high voltage, high power motors that use full-blown switchgear to operate the motor. These applications usually have an energize-to-trip motor trip coil, so the performance of the SIF is often dominated by the availability of switchgear power. When an energize-to-trip motor controller is used, the power systems for that controller fall under the testing and MOC requirements that apply to any SIF.

Independence

The design needs to be independent from other IPL’s or from the IPL’s Initiating Causes. For example, a high-pressure alarm that warns of failure of a pressure control loop requires its own pressure transmitter and process connection.

Instruments associated with a SIF should not share services with BPCS functions, although some organizations allow minimal sharing, such as allowing one of three transmitters in a 2oo3 voting scheme to share inputs with a BPCS loop. This requires special attention, as there are now two functions involved when maintaining or testing the field instrumentation. In these designs, the transmitter should directly drive the SIS input using SIS power, and the input to the BPCS should be taken from an isolator that will not affect SIS loop performance if the BPCS circuit is broken.

Reliability

While HAZOP’s and LOPA’s concentrate on the required availability of protective functions, in most real plants, reliability is every bit as important. IEC 61511 says almost nothing about reliability and leaves it to the User to figure it out. A key part of specification of any protective function is determining the effect of spurious trips upon a process unit. A spurious trip introduces both process safety risks and commercial risks. It is an adage in the process industries that the highest risk periods are during a unit startup or shutdown. When a unit experiences a spurious trip, the lost production while the unit is secured and restarted, even if no damage has occurred, can be significant. Some processes can take days to restart, and loss of production is measured in the hundreds of thousands to millions of dollars. When an SRS is prepared for SIF’s, the costs and safety risks associated with spurious trips should be identified and specific reliability requirements included.

Process Safety risks are even more of an issue when the shutdown is a crash induced by a protective system's spurious trip. When a unit is crashed, all sorts of bad things can happen. Hot joints that cool down too quickly can leak or even cause significant loss of containment. Equipment can also be damaged. Some process types, such as Fluid Catalytic Crackers, are especially intolerant of unit crashes. A sudden shutdown can result in refractory damage or in systems plugged with catalyst or heavy oil. It's bad enough that an FCC may take days to restart, but if refractory damage occurs that interruption can stretch to weeks, and the repair costs can be significant.

The net result is that the costs associated with spurious trips can justify a lot of reliability features in field instrument design. The most common method is the use of voting logic that tolerates a level of faults in the field instrument. Schemes such as 2oo3 or 2oo2d can provide high availability and are tolerant of failures in field instrumentation. I’ve seen places where it was decided to not provide robustness in order to save a transmitter or reduce SIS programming. Those kinds of decisions are usually false economies. Usually the cost of adding robustness to a trip function is covered in the first few minutes after a false trip occurs.
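
The availability benefit of voting can be seen in the commonly used simplified average-PFD approximations for identical channels. The sketch below ignores common cause, diagnostics and repair time, so it is a first-pass illustration only, not a substitute for the full IEC 61508 calculations; the failure rate and test interval are example numbers:

```python
# Simplified average-PFD approximations for identical channels with
# dangerous undetected failure rate lam (per hour) and proof test
# interval T (hours). Common cause, diagnostics and repair time are
# ignored -- a first-pass sketch, not full IEC 61508 math.
def pfd_1oo1(lam, T): return lam * T / 2
def pfd_1oo2(lam, T): return (lam * T) ** 2 / 3
def pfd_2oo2(lam, T): return lam * T
def pfd_2oo3(lam, T): return (lam * T) ** 2

lam = 2e-6   # example dangerous undetected failure rate, per hour
T = 8760     # annual proof test interval, hours
for name, f in [("1oo1", pfd_1oo1), ("1oo2", pfd_1oo2),
                ("2oo2", pfd_2oo2), ("2oo3", pfd_2oo3)]:
    print(f"{name}: PFDavg ~ {f(lam, T):.1e}")
```

Note that 2oo3 approaches the availability of 1oo2 while also tolerating a single failed channel without a spurious trip, which is why it shows up so often in practice.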

Another aspect of selection of instrumentation for Process Safety Services is to avoid the use of switches. At one time process switches such as float levels, pressure and DP switches, etc. were very common. Their long service history has demonstrated their unreliability. A process switch sits dormant for extended periods and there is no way other than frequent testing to verify whether a process switch is operable or not. As a result of the many improvements made in continuous process measurements and of programmable logic solvers, the justification to use what were perceived as cheaper instruments has become moot. Almost every current standard or recommended practice discourages, or even forbids the use of process switches in Process Safety Services.

Spurious trips can also be significantly reduced by selecting the proper field instrumentation. As discussed above, the field instruments should be as simple and robust as possible. Where an orifice plate and DP transmitter will do, it's far preferable to use that relatively simple and proven design over a complex flow transmitter that may experience signal drop-outs or failure modes that are not well understood.

Accuracy is another area where instrumentation in shutdown services gets over specified. The basis for the specification of a trip point in a SIF or Interlock needs to be clearly understood when field instrumentation is being specified. If the trip point is specified to protect against a specific equipment limit, high accuracy may be required. But if the trip point is needed only to detect when a process has had a failure that can result in hazards, such as a low flow caused by the failure of a pump or compressor, the trip point may be specified only to provide a comfortable operating margin. In these cases, accuracy may not be such a big deal. Attempting to provide highly accurate field instrumentation in a situation where that accuracy isn’t needed can result in designs that are not as reliable as they should be.

Maintain and Test 

All field instrumentation in Process Safety Services needs to be periodically tested. The frequency of testing is based upon the failure rates of the types of devices being used and the performance requirements of the service. Intervals can vary from quarterly to several years apart. In many instances, the test interval is more frequent than the Unit turnaround or outage frequency and provisions must be made for testing while the Unit is in operation.
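
As a rough illustration of how the required test interval falls out of the device failure rate and the availability target, using the same simplified 1oo1 approximation as in the sketch above (the numbers are hypothetical):

```python
# Rearranging the 1oo1 approximation PFDavg ~ lam * T / 2 gives the
# longest proof test interval that still meets a given PFD target.
def max_test_interval_years(lam_du: float, pfd_target: float) -> float:
    return (2 * pfd_target / lam_du) / 8760   # hours -> years

lam_du = 5e-6                    # example failure rate, per hour
for target in (1e-1, 1e-2):      # SIL 1 and SIL 2 band limits
    print(f"PFD target {target}: test at least every "
          f"{max_test_interval_years(lam_du, target):.1f} years")
```

With these example numbers a SIL 2 target forces roughly twice-yearly testing, well inside a typical turnaround cycle, which is exactly why on-stream testing provisions matter.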


Testing provisions can take a number of forms, including

  • Provisions for bypassing of Input signals within the Safety Function logic
  • Provisions for bypassing of commands to final devices
  • Provisions for bypassing of final devices themselves
  • Other on-stream testing provisions such as partial stroke testing

My preference has always been to design testing systems so that the final elements such as valves or motor starters can be bypassed for testing, and that the SIS logic includes detection of when a system is properly isolated and safe for testing. This isolates the trip functions from the inputs and allows for full testing from end to end without running the risk of false trips. Motor starters are difficult to test as generally they can’t be tested without shutting down the motor. Fortunately, the failure rates of motor controllers are low relative to other devices and seldom does the motor controller (other than power for energize to trip switch gear) factor into test interval calculations. However, testing of motor starters at every unit shutdown should be part of any testing program.

Voting systems are usually used in SIF services to provide both availability and reliability. Voting systems can also simplify maintenance of field instrumentation while the SIF and its process are in service. My preference has always been to use 2oo3 voting on inputs, with degradation to 2oo2 during the periods when one of the input instruments has failed. The 2oo3 scheme allows for one instrument to fail and allows one instrument to be removed from service for maintenance with minimum risk of a spurious trip occurring. The fall back to 2oo2d tends to protect against maintenance errors that might take out a second transmitter. In any event, detailed procedures for working on Safety Related instrumentation should be developed during their design.
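
A minimal sketch of the 2oo3-degrading-to-2oo2 input vote described above (trip decision only; real SIS logic adds deviation alarms, bypass permissives and timers, and the tags here are hypothetical):

```python
def vote_trip(readings, trip_point, bypassed):
    """2oo3 high-trip vote that degrades to 2oo2 when one transmitter
    is bypassed or failed. Tags are hypothetical."""
    active = {tag: v for tag, v in readings.items() if tag not in bypassed}
    votes = sum(v >= trip_point for v in active.values())
    if len(active) >= 3:
        return votes >= 2   # normal 2oo3
    if len(active) == 2:
        return votes >= 2   # degraded 2oo2: both remaining must agree
    return True             # fewer than 2 healthy inputs: fail safe

# One transmitter out for maintenance: the remaining pair must both vote
print(vote_trip({"PT-1": 98, "PT-2": 101, "PT-3": 55},
                trip_point=100, bypassed={"PT-3"}))   # False
```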

I also prefer that physical bypass piping and a bypass valve be used with trip valves. In most installations the cost is nominal compared to the costs of a false trip caused during testing. The bypass valve should be equipped with position indication that allows the logic solver to verify that the valves are in bypass before allowing testing to proceed, and which provides an alarm whenever the bypass valve is not fully closed.

Many valve manufacturers offer functionality to support partial stroke testing during ongoing operations. This function commands the valve to move to some partially closed position, typically 10 to 20% of stroke (90 to 80% open), and then return to the full open position. This hasn't been a particularly attractive alternative for me. It complicates the installation, often requiring an expensive smart valve positioner to facilitate the testing in addition to the trip solenoid valves. Partial stroke testing also makes life harder for instrument technicians, who require special training to perform partial stroke tests. Newer designs do allow partial stroke tests to be automatically scheduled and executed, which reduces the probability of human error.

There are installations where partial stroke testing is justified, such as a requirement for very frequent testing, or for large or very high-pressure valves where installation of a bypass for testing is impractical or just too expensive (that high-pressure piping takes up a lot of space). Some organizations have embraced partial stroke testing, but this requires commitment to fully understanding how these functions work, and to owning and using the additional software required to set up the tests and analyze the results.

Conclusions

The performance and reliability of field instrumentation is critical to the safe performance of Process Safety Protective Functions. Proper and complete performance requirements need to be clearly defined in documents such as Safety Requirements Specifications. This includes specification of all set points and the basis for the selection of those set points.

The selection of field instrumentation should be based upon proven performance and well-known characteristics and failure mechanisms. The designs must include provisions for maintenance and testing over the life cycle of the instrumentation. Overall designs should be as simple as possible while including provisions for high reliability and for simple, safe maintenance and testing. Overly complex selections, or systems not designed for high reliability, typically result in poor performance, excessive spurious trips and the accompanying high costs of operation, and overall reduced safety of operation.

 

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Bring Back the Basics- Practical Field Instrumentation in Process Safety Service

(Blog Series Part 1) 

The selection of field instrumentation in Process Safety Service can make or break the performance of safety instrumented functions. Poor selection and design can result in unreliable and costly failures and unscheduled shutdowns. In some processes, an unscheduled shutdown can cost millions of dollars as well as expose operating personnel and the community to hazards associated with equipment failure or damage caused by uncontrolled shutdowns. This discussion is directed towards the more practical aspects of specifying and owning field instrumentation that is used in IPL services such as Alarm, BPCS functions and Safety Instrumented System (SIS) Safety Instrumented Functions (SIFs).   

Get a demo of Safety Lifecycle Manager  

The Safety Lifecycle for Process Safety applications runs from the initial conceptual designs for a process unit throughout its life to decommissioning. 

  • The first part of the Safety Lifecycle involves the Hazards Analysis that identifies the need for specific Protective Functions that act as Independent Protection Layers (IPLs). Most of these are instrumented functions that have specified availability requirements. 
  • The second part of the Safety Lifecycle is the detailed specification and design of the field instrumentation sensors and final elements that implement the IPL functions.
  • The third part, and longest duration, of the Safety Life Cycle is the ownership phase where the devices selected are operated, maintained and tested over the life of a process.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Process Safety Data for Pressure Relief Systems

Pressure Relief Systems have been around a long time, but the documentation of their Process Safety Data has a pretty tortured past. Due to the relatively long history, dating back to the early boilers in locomotives and steamboats, there are lots of regulations that grew a bit haphazardly in response to early incidents.

Most States in the US have regulations for pressure relief devices, however those regulations are all over the place in defining Process Safety Data requirements. Some are focused on Relief Device Inspection, Testing and Repair while others attempt to address other aspects of Pressure Relief System Process Safety Data. Some states hardly address Pressure Relief Systems, while others have provisions that focus on single applications in great detail. Some cities have their own rules independent of State rules.

The National Board of Boiler and Pressure Vessel Inspections has published a summary of the various US and Canada regulations. These regulations are inconsistent from jurisdiction to jurisdiction, are usually incomplete, and often are well out of date.

 Click here to review The National Board Synopsis

While most jurisdictions address basics such as identifying the applicable ASME Section I and Section VIII Code (often an obsolete, dated version rather than the current one), they generally address only a few of the things an owner should be doing to properly administer the design, installation, operation and maintenance of Pressure Relief Systems. They can focus on the oddest aspects of relief systems while not addressing other critical requirements. This can result in organizations deciding they only have to comply with a few requirements when, really, they should be doing much more.

The overall structure of a robust Pressure Relief System Process Safety Data program is illustrated in the figure below. Each of these items is described in the sections below.

[Figure: structure of a robust Pressure Relief System Process Safety Data program]

Pressure Relief System Definition

The first consideration in maintaining Process Safety Information is to have a definition of the scope of a Pressure Relief System. A single Pressure Relief System consists of one or more pressure relief devices, the equipment and piping that the Pressure Relief System protects, and any other relevant equipment such as block valves or supplemental devices. Some jurisdictions require this to be clearly documented but others do not specifically address this requirement.

The Process Safety Information for a Pressure Relief System should include material that clearly defines the scope and boundaries of each System. This may take the form of a sketch that shows all relevant equipment, piping and pressure relief devices or a list that contains the same information.

However, an issue with sketches or lists is that it can be difficult to verify that all equipment in a facility is adequately protected from over pressure. The sketches and lists can effectively document what is protected, but it is a much more difficult task to find what may have been left out. Ideally, a database that allows linking of Pressure Relief System definition to a complete equipment data base should be used. Query of this database can rapidly identify equipment that may not have over pressure protection.
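
The query itself is conceptually simple once that linkage exists; finding potentially unprotected equipment is just a set difference. A sketch with hypothetical tags (a real implementation would query the linked plant equipment database):

```python
# Given a complete equipment register and the equipment claimed by each
# Pressure Relief System definition, unclaimed items are simply the
# set difference. Tags are hypothetical.
equipment_register = {"V-101", "V-102", "V-103", "E-201", "C-301"}

relief_systems = {
    "PSV-1001": {"V-101", "E-201"},
    "PSV-1002": {"V-102"},
}

protected = set().union(*relief_systems.values())
for tag in sorted(equipment_register - protected):
    print(f"{tag}: no Pressure Relief System claims this equipment -- review")
```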

Process Sizing Basis

When a Pressure Relief System is designed, a large number of potential Pressure Relief System sizing scenarios, usually based upon the descriptions contained in API RP-521, are evaluated by a Process Engineering group. The case that requires the largest pressure relieving device sizes is recorded as the sizing case. This is the process data that makes it to the pressure relief device data sheets. The full process analysis often then goes into a box that gets sent to the owner with all of the other design documentation. Often this box goes into long term storage somewhere and is seldom seen again.

Processing facilities are dynamic, and modifications and improvements often affect the required pressure relief capacity. If the original Relief System sizing calculations are stored in an inaccessible box, every change requires a full re-evaluation of the Pressure Relief System from scratch. This quickly becomes prohibitive, and the required evaluations often don't get done, are highly inefficient, or are inconsistent with the initial evaluations.

Pressure Relief System sizing evaluations and calculations for all of the pressure relief causes should be retained in records that are readily available when the Pressure Relief System needs to be re-evaluated. These can take the form of paper records or scanned records stored digitally, but the key point is that a complete, progressive and up to date set of Pressure Relief System sizing evaluations and calculations needs to be kept in a format and location that allows for efficient recall, review and update.

Relief Device Specifications

The physical specifications for Pressure Relief Devices are typically contained on data sheets used to purchase the devices. These sheets contain physical requirements such as manufacturer, model number, materials of construction, body and process connection ratings, required relief area and the process conditions for the relief device sizing case. Often these data sheets are the only Pressure Relief Process Safety data that is readily available to a Site’s staff and are often erroneously considered to be full documentation.

What is often not appreciated is that the information contained on a Pressure Relief Device data sheet is only a subset of the full information required to own and maintain the devices. These data sheets are used as a tool for procurement and contain only the information needed to procure a specific device. They are an integral part of the Relief System Process Safety Data, but only a small part. In addition to the Pressure Relief System evaluation records described above, a complete set of manufacturer's manuals needs to be available to identify key parts and dimensional tolerances.

Relief Device Installation

The proper functioning of Pressure Relief Devices requires that they be properly installed, and the systems around them such as inlet and output piping, block valves and other support devices be properly designed. For example:

Section VIII spring opposed pressure relief valves have rather narrow requirements for the dynamic inlet losses (the 3% rule) and have similar requirements for both static and dynamic back pressures. Inlet and outlet block valves also have size and design restrictions, such as requirements that all block valves be at least full ported valves that do not restrict flow to the pressure relief device. There are also specific limitations of the static and dynamic piping loads that can be imposed upon pressure relief device bodies.

The Process Safety Data for a Pressure Relief System should include full installation specifications and supporting calculations. Supporting calculations should include inlet and outlet pressure losses, imposed back pressures, and piping load calculations for both relieving and non-relieving conditions. These calculations should be kept in a readily accessible format and location, similar to the requirements for Process Sizing evaluations and calculations.
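
As a simple illustration of one of these checks, the "3% rule" for dynamic inlet losses on a spring-opposed Section VIII valve can be expressed in a few lines (hypothetical numbers; the real calculation requires rated relief flow and detailed line hydraulics):

```python
def inlet_loss_ok(inlet_dp_psi: float, set_pressure_psig: float) -> bool:
    """Dynamic inlet loss at rated relief flow should not exceed 3% of
    set pressure for a spring-opposed Section VIII relief valve."""
    return inlet_dp_psi <= 0.03 * set_pressure_psig

print(inlet_loss_ok(2.1, set_pressure_psig=100))  # True: 2.1 psi < 3.0 psi
print(inlet_loss_ok(4.5, set_pressure_psig=100))  # False: valve may chatter
```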

Associated Relief System Equipment

When the design of a pressure relief system depends upon the capacity or status of other equipment, that equipment must be in service at all times the pressure relief system may have a demand placed upon it. This equipment and these functions must be identified. Some examples are:

  • The capacity of process equipment such as fired heater duties, pump impeller sizes, etc.
  • Relief systems subject to fire cases require that the assumed vessel insulation is in place and of a design that can withstand external fire
  • HIPPS or other equipment installed to reduce the required relief load are in service
  • Block valve locks and required block valve positions

Pressure Relief System Procedures

Any Pressure Relief System Process Safety Data system should include the procedures required to safely operate the systems. This includes items such as:

  • Safe Operating Limits within which the Pressure Relief System is properly sized
  • Lock Lists and Lock/Car Seal status
  • Permissible line-ups for system with multiple relief devices and installed spares
  • Other operating restrictions required to safely operate the Pressure Relief System

Relief Device Inspection, Testing and Maintenance

Almost all jurisdictions require that Pressure Relief Devices be inspected, tested and repaired at regular intervals, and that the results of these activities be kept in a progressive record. The National Board of Boiler and Pressure Vessel Inspectors VR stamp program defines a set of minimum qualifications, procedures and documentation for the inspection, testing and repair of pressure relief devices. Some jurisdictions make it mandatory that any organization doing inspection, testing and repair of pressure relief devices hold a VR stamp. Other jurisdictions may have other requirements. In any event, a pressure relief system management program should include a robust device inspection, testing and repair program including complete documentation of all events associated with the Pressure Relief System and its devices.

Management of Change

All Pressure Relief Systems are subject to Management of Change procedures whenever modifications are made to process equipment or to the Relief System components. Often de-bottlenecking projects will increase equipment capacity, which then can impact the sizing criteria for the Pressure Relief Systems. Modifications to the pressure relief devices can also affect pressure relief system performance or affect the physical installation of the devices and their associated piping and equipment. Even apparently small changes such as modifying a pump impeller or vessel insulation can result in an improperly sized Pressure Relief System. Therefore, it is imperative that whenever a change is made to a process that its effect on Pressure Relief System design be fully evaluated and the required changes to the Pressure Relief System also be implemented and documented.

 

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

How accurate are safety calculations?

If you have ever sat in a LOPA, inevitably there is someone who questions the accuracy of one factor or another. Usually they are trying to justify making an initiating cause less frequent or taking more credit for an Independent Protection Layer (IPL).

As unsatisfying as it is, assessment of Process Safety Systems is a statistical adventure. And when you get down to it, people just don't like, or understand, statistics. They find them a playground in which to argue that their number is the "right" number. Statistics are real, but people don't like to believe them; otherwise casinos would not be in business.

Evaluation of protective function performance requirements, and of the performance of the designs for those functions, requires establishing probabilities for things like how often an initiating event may occur and how effective mitigating functions are likely to be. Deciding what probabilities to use is the hard part. The problem when it comes to Process Safety Systems is that these probabilities are very fuzzy numbers. Unlike a pair of dice, which have very precisely defined odds of a 7 or snake eyes coming up, real process-related risk and failure data is far less precise.

The Process Safety Standards and Practices used by the Process Industries have developed over the past 20-30 years and the various factors used in Process Safety analysis have tended to converge on consensus values. The AIChE CCPS publication, Layers of Protection Analysis, provides a fairly complete set of values for LOPA related factors, and various publications on Safety Instrumented Systems provide representative failure rates for commonly used devices. In these instances, the publications note that the factors actually used are up to the User.

One of the things these publications generally state is that, absent any hard data supporting another value, all of the factors used should be no more precise than a power of 10. So the factors used are values of 10, 100, 1000, or their inverses (10⁻¹, 10⁻², etc.). Attempting to use any values with more precision is usually an exercise in self-delusion. Even the factor-of-10 practice is only an approximation. However, the recommended values in reputable publications are based upon the collective experience and judgement of some very experienced and pragmatic people. Unless you have lots of actual in-service data, you don't know anything better. Use the experts' numbers.

When working with input data that is only good to powers of 10, you also have to be cognizant that the answer you get after stringing a bunch of them together in a Risk Reduction or PFD calculation isn't going to be any more precise than the input data. So the RRF calculation that tells you that you need a SIF with an RRF of 186 is giving you a false sense of precision. It's not actually 186; it could be 100 or could be 1000.

This is why ISA 84 ed. 2 and IEC 61511 only recognize Safety Integrity Levels (SILs) specified in decades – RRFs of 10, 100, and 1000. When you are calculating the PFD of a SIF design, that 186 is often used as a hard requirement, when in reality it is a very, very fuzzy target. There is absolutely no basis to say that a calculated SIF RRF of 130 doesn't meet the requirements of a LOPA target RRF of 186. Given the accuracy of the input values used, 130 and 186 are the same number.
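
The arithmetic behind these statements is easy to sketch. A LOPA target RRF is the ratio of the mitigated event frequency to the tolerable frequency, and since every input is only good to a power of ten, the only defensible statement of the result is its decade. Hypothetical numbers:

```python
import math

# Hypothetical LOPA: initiating event 0.1/yr, two credited IPLs with
# PFD 0.1 each, tolerable frequency 1e-5/yr. Every input is an
# order-of-magnitude value, so the answer is too.
initiating_freq = 1e-1
ipl_pfds = [1e-1, 1e-1]
tolerable_freq = 1e-5

mitigated = initiating_freq * math.prod(ipl_pfds)   # 1e-3 per year
required_rrf = mitigated / tolerable_freq           # 100

# Stating the result any finer than its decade is false precision.
decade_rrf = 10 ** math.ceil(math.log10(required_rrf))
print(f"required RRF {required_rrf:.0f} -> decade target {decade_rrf}")
```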

This doesn’t say that a practice of requiring a SIF design to meet a higher (but not precise) target is wrong.  It does give a design target and tends to result in more thought about the SIF designs. However, you shouldn’t fool yourself into thinking that you are being very precise. If it’s a major expense to get a SIF from 130 to 186, think about whether that really is making a difference.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Assessing SIF and IPL Performance using SLM

ISA 84.01.00 and IEC 61511 Part 1 require that the performance of Safety Functions be assessed at regular intervals. The Operate-Maintain Module in Mangan Software Solutions’ SLM application provides the user with real-time performance data through availability and Probability of Failure on Demand (PFD) calculations performed on a daily basis. This eliminates the need to periodically pull records by hand (or spend excessive time hunting for them, perhaps without success) and calculate performance manually.

Performance data is presented in reports and displays at the Enterprise, Site, Unit, and Function levels:

  • At the Enterprise Level, data is rolled up by Site
  • At the Site Level, data is rolled up by Unit
  • At the Unit Level, data is rolled up by the SIF or IPL Function
Why use Views and Reports?

They allow users to easily monitor and identify trends, such as unusually good or poor performance in a Unit or Site, and drill down to the bad actors. The API Tier 3 Management View/Report provides a summary of Independent Protection Layer (IPL) performance by IPL type, using values calculated from Events: Demand Rates, Failure Rates on Demand or Test, overdue or upcoming tests, and the overall availability of SIFs and IPLs. SLM also provides performance reports at each level for categories of Bad Actors (items with poor performance indicators) and the Top 5 Highest Risk SIFs or HIPPS (functions with the worst performance indicators). All of these reports are driven by SLM’s powerful Event capture capability. Every Safety Instrumented System (SIS), SIF, IPL, or Device contained in the SLM database may have many Events of various types recorded against it. For example, a SIF may have Demand, Bypass, Fault/Failure, and Test Events recorded for it; a Device may have Test, Fault/Failure, Maintenance, Calibration, and Demand Events.
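As a rough illustration of how KPIs like these fall out of an Event log, here is a minimal sketch in Python; the Event fields and KPI names are assumptions for illustration, not SLM’s actual schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Event:
    asset: str        # SIF or IPL tag
    kind: str         # "demand", "test", "bypass", "fault"
    failed: bool      # did the function fail on this demand or test?
    when: date

def kpis(events: list, years_in_service: float) -> dict:
    demands = [e for e in events if e.kind == "demand"]
    tests = [e for e in events if e.kind == "test"]
    failures = [e for e in demands + tests if e.failed]
    return {
        "demand_rate_per_year": len(demands) / years_in_service,
        "failure_fraction_on_demand_or_test":
            len(failures) / max(1, len(demands) + len(tests)),
    }

log = [
    Event("SIF-101", "demand", False, date(2020, 3, 2)),
    Event("SIF-101", "test", False, date(2020, 9, 1)),
    Event("SIF-101", "test", True, date(2021, 9, 3)),
]
print(kpis(log, years_in_service=2.0))
# -> demand rate of 0.5/year; 1 failure in 3 demands/tests
```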

[Figure: SIF performance metrics]
[Figure: Demand rate performance summary]
Event Entry

Events are entered into SLM using a guided approach that simplifies the process. Events are usually entered at the Function or Test Group level, and the User is guided to identify any Devices associated with the Event and whether a failure was associated with the Function or Device. Data for Events that did not involve a failure is usually entered automatically by the system to reduce repetitive data entry. SLM is also capable of accepting data from a Process Historian to automatically generate Events such as Demands, Faults/Failures, and Bypasses. The system is designed to allow the Users closest to an Event to record it. For example:
  • Operators may be assigned Event entry responsibilities for Demand, Fault/Failure, and Bypass Events
  • A Maintenance Technician or Supervisor may be assigned Event Entry responsibilities for Testing and Maintenance Events
  • Engineering may handle other events such as Status changes or Test Deferrals

SLM allows the User to define whether an Event requires Approval before being used in performance calculations. For Event types that require Approval, the primary and secondary Approvers for each Event Type can be independently defined at the Site or Unit level. Each Event record has a check box used to identify whether an Event that involved a failure was a Safety Related Failure. For example, on a test of a SIF, a shutdown valve was found not to close when commanded to do so. When the test data is entered into SLM, the test on the SIF would be identified as a failed test, and the Device Event for the valve would also be identified as a failed test. Both Events would be identified as Safety Related Failures.
[Figure: Probability of Failure on Demand]

All of this provides the user with a continually updated view of the performance of Safety Functions at whatever granularity the user needs. Event entry provides an efficient way to assure that performance information is captured at the source. The overall result is an unprecedented level of continuous, highly efficient Safety Lifecycle monitoring.

How to Manage Bypasses Using SLM

IEC 61511 Part 1 contains extensive discussions of the design and operating procedures for SIF bypasses. Clause 16.2 describes operational requirements such as:

  • Performing a hazard analysis prior to initiating a bypass
  • Having operational procedures in place for when a protective function has been bypassed
  • Logging of all bypasses.

SLM supports bypass management through a robust Bypass management and logging function that meets the requirements of IEC 61511 and integrates with the performance analysis and reporting functions of the SLM Operate-Maintain (OM) Module. The OM Module uses the Bypass Authorization object and Work Flow to initiate, approve, and record the execution of Protective Function Bypasses. SLM does not limit the use of Bypasses to SIFs; in SLM, any protective function may have a Bypass Authorization associated with it.

The figure below illustrates how the Work Flow supports the tasks required to manage bypasses using SLM:

[Figure: Bypass management work flow]

A Bypass Authorization is initiated by any authorized SLM user. In practice this will usually be a member of the Operations Staff for a Unit, often a shift foreman or Operations Engineer. The originator is guided to enter the information required to support the Bypass Authorization. This includes:

  • Basic Bypass information, such as the reason for the Bypass, the anticipated start date and time, and the maximum time for which the Protective Function is to be Bypassed
  • Hazard Assessment information – this includes an identification and assessment of the potential severity of hazards that may occur while the Protective Function is Bypassed and the corrective measures that should be taken if the hazard occurs
  • Operating Procedures that must be used to mitigate potential hazards while the Protective Function is Bypassed
  • Identification of the Devices associated with the Protective Function that will be Bypassed

Once this information is provided, the user may submit the Bypass Authorization for Approval. The Bypass Authorization is submitted to the designated Approver, typically an Operations Supervisor or Manager. The Approver reviews the Bypass Authorization and either Approves or Disapproves the request.

Once the Bypass Authorization is Approved, the Operations team may execute the Bypass as planned. The originator or another authorized user may then record the start date and time of the Bypass in the Bypass Authorization. If a Bypass is expected to exceed the time requested in the original submittal, the user may request approval for a Bypass Extension by filling in the Bypass Extension section of the Bypass Authorization object. When approved, this updates the authorized Bypass time and prevents the Bypass from being reported as exceeding its authorized time.

When the Bypass is complete, the originator or another authorized user enters the date and time at which the Protective Function was returned to service and closes the Bypass Authorization. Closing the Bypass Authorization adds the Bypass data to the SLM Events used to analyze Protective Function and Device performance.
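To make the bookkeeping concrete, here is a minimal sketch in Python of what a Bypass Authorization record might track; the class and field names are assumptions for illustration, not SLM’s actual object model:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class BypassAuthorization:
    function_tag: str
    authorized_hours: float            # from the original submittal
    extension_hours: float = 0.0       # approved extensions, if any
    started: Optional[datetime] = None
    ended: Optional[datetime] = None

    def hours_in_bypass(self) -> float:
        if self.started is None or self.ended is None:
            raise ValueError("Bypass has not been started and closed yet")
        return (self.ended - self.started) / timedelta(hours=1)

    def exceeded_authorized_time(self) -> bool:
        return self.hours_in_bypass() > self.authorized_hours + self.extension_hours

ba = BypassAuthorization("SIF-101", authorized_hours=8.0)
ba.started = datetime(2021, 5, 1, 8, 0)
ba.ended = datetime(2021, 5, 1, 19, 30)    # 11.5 hours in bypass
print(ba.exceeded_authorized_time())       # True unless an extension was approved
```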

SLM contains performance views that capture all Bypasses for a Function, Unit, and Site. The analysis functions include reporting on the number of bypasses, the time in bypass, and identification of Bypass Events that exceeded their authorized bypass time. SLM will also compute the effect of Bypasses on a Protective Function’s in-service performance, such as the in-service reduction in the Function’s RRF or PFD. Functions with excessive Bypasses will also show up on the Unit and Site bad actor lists.


Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

I like logic diagrams

I like using logic diagrams to do the detailed design of a Safety Instrumented Function (SIF), although I’m often in the minority. Others believe that Cause and Effect (C&E) Diagrams are all that is needed. My view is that C&Es are an effective overview tool for familiarizing people who aren’t part of the design team with a SIF’s function. However, when it comes to clearly and unambiguously defining how a SIF, or any other logic function, works, I find the logic diagram to be the best tool.

C&Es are just not very good at conveying things like timing or sequencing; they quickly become confusing. C&Es are good at saying that if an input condition exists, this is the output condition. But more complex things such as timing, permissives, and the other features that usually exist can’t be readily defined in a C&E. Suddenly you are trying to figure out what Notes 6, 12, and 19 actually mean. Users programming from C&Es, particularly for larger systems, can easily make mistakes or misinterpret the intent. I find myself quite capable of misinterpreting all those notes.

Logic diagrams, on the other hand, when done well, are easy to interpret. Diagram symbols like those identified in a standard such as ISA 5.2 allow the user to clearly define almost everything needed. You can clearly and precisely cover set points, time delays, complex relationships, and so on. Personally, I’ve used an amended version where inputs and outputs are more directly defined: power sources, dead bands for switches, whether the logic operates on an open or closed switch, whether it energizes or de-energizes a solenoid valve, etc.

Logic diagrams can also become a one-to-one road map for the Safety Instrumented System (SIS) programming. Most SIS programming languages have a function block option, which makes programming, quality checking, and field testing a whole lot easier and faster. It’s true that preparing a good logic diagram takes more time than putting together a C&E, but it’s my belief that you get your money back, and then some, in simplicity and reduced errors. It’s simple to put a note on a C&E that someone might easily misunderstand; with a logic diagram you have to sweat the details up front rather than passing them on for a programmer (and everyone thereafter) to figure out.

I think of logic diagrams as an investment. They take time to prepare, yet they start paying back at the time of programming. The real payout comes at pre-startup validation. With well-reviewed logic diagrams, the potential errors in programming are pretty much beaten out of the system by the time you get to checkout, loop testing, and validation. Well-checked and tested systems can radically reduce startup time. An example of what a logic diagram looks like: [Figure: example logic diagram]
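To give a feel for the one-to-one mapping to function blocks, here is a rough sketch in Python of the kind of logic a diagram might specify; the tags, trip limit, voting arrangement, and time delay are all made up for illustration:

```python
def vote_2oo3(a: bool, b: bool, c: bool) -> bool:
    """2-out-of-3 voting block, as a logic diagram would draw it."""
    return (a and b) or (a and c) or (b and c)

class OnDelayTimer:
    """On-delay (TON) block: output goes true only after the input
    has been continuously true for delay_s seconds."""
    def __init__(self, delay_s: float):
        self.delay_s = delay_s
        self.elapsed = 0.0

    def update(self, input_true: bool, dt_s: float) -> bool:
        self.elapsed = self.elapsed + dt_s if input_true else 0.0
        return self.elapsed >= self.delay_s

TRIP_LIMIT_KPA = 150.0                 # minimum stable firing pressure (illustrative)
low_timer = OnDelayTimer(delay_s=3.0)

def sif_scan(pt_a: float, pt_b: float, pt_c: float, dt_s: float) -> bool:
    """One logic solver scan: True means de-energize the fuel gas valve (trip)."""
    low = vote_2oo3(pt_a < TRIP_LIMIT_KPA,
                    pt_b < TRIP_LIMIT_KPA,
                    pt_c < TRIP_LIMIT_KPA)
    return low_timer.update(low, dt_s)
```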

A case study I cite is a process unit for which I was the lead control systems engineer. The process required over 100 distinct interlocks, sequences, and SIFs, which took over 400 logic diagrams to represent. We spent a ton of time developing the logic diagrams and a bunch more putting together test procedures and executing them. We taught the Operators to read the logic diagrams, and during the first 6 months of operation there was always one set or another open on a desk.

It took a lot of effort and had real costs, but when all was said and done, we had only two logic glitches during startup (and one of those was an unauthorized logic change made by a contract engineer at the last minute). The net result was that we were up and making on-spec product in the first week of operation. The licensor’s startup engineers were amazed; their experience was that there would be 6 to 8 weeks of logic debugging before reliable production could be established. The extra 5 weeks of production paid for all the time we spent on logic diagrams and testing, many times over. That’s why I like logic diagrams.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.
Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Promoting a sustainable process safety culture

Promoting a sustainable process safety culture can be difficult. The Chemical Safety Board just issued their report on the November 2014 incident at the DuPont La Porte plant in which 4 people were killed:

Click here to read the full report.

[Figure: CSB Toxic Chemical Release Investigation Report]

Summary of the Report 

The report reads like so many others, highlighting issues with promoting a sustainable process safety culture. The root cause was that site and corporate management failed to implement Process Safety Management systems, and the report cites multiple failures in the overall management of Process Safety. The core failure in this incident was that the facility had a very poor safety culture, as summarized in the excerpt from the CSB Report below. The CSB Report devotes an entire section to how the Process Safety deficiencies in this incident parallel the findings of the earlier CSB report on the 2005 BP Texas City Refinery explosion, despite the site being aware of that report and its findings.

DuPont La Porte did not use any type of robust, formal process safety culture assessment. Process safety culture can affect whether a site has a sense of vulnerability in terms of process safety management risks; complacency or overconfidence; and transparent, timely, and thorough responses to PSM concerns, action items, and issues, including leadership measures to prevent a “check-the-box” mentality (i.e., simply accomplishing a task as the objective rather than ensuring a high degree of focus on risk management and prevention). The Safety Perception Surveys conducted at the DuPont La Porte facility before the November 2014 incident were designed to lower OSHA total recordable injury rates [59, p. 8]. DuPont did not intend for these surveys to measure or address the perception of process safety performance.

One of the issues with promoting a sustainable Process Safety culture is that it’s hard without the right tools:

  • The tools that an organization typically has to manage day to day operations don’t fit the needs of Process Safety Management.
  • The tools that do exist are typically single purpose and result in isolation of data and limited availability across organizational boundaries.

It’s extremely easy for individuals to become disconnected: they often are not involved in the day-to-day management of Process Safety, and the perception develops that it’s taken care of by others. Add a management attitude like the one that existed at La Porte, where there was a misplaced emphasis on measuring personnel safety with no corresponding concern for Process Safety, and establishing a sustainable Process Safety culture becomes impossible.

An effective Process Safety Program needs to make Process Safety information widely available to all employees involved in design, operation, and maintenance of the processes in order to keep them all involved on a daily basis. When a site is using the tools traditionally available for plant operations, information becomes isolated from personnel and is difficult to access. 

Safety Lifecycle Manager (SLM®)                                   

The SLM® application was developed to address the many shortcomings relative to Process Safety that exist in many plants today. The origins of the application were a direct result of the CSB and Baker Panel findings relative to the BP Texas City Refinery explosion.

The objectives of the development of SLM were:

  • Unify Process Safety data in one location so it is accessible to all personnel that need it. This includes PHA, LOPA, IPLs, Plant Assets that perform the functions required by IPLs or other functions, and Field Devices that are used to implement the functions and performance records for Plant Assets and Devices.
  • Provide the tools to perform PHA and LOPA studies within the SLM application.
  • Link Plant Assets to PHA, LOPA, and IPL data so the Process Safety requirements, such as the LOPA IPLs or PHA Safeguards that resulted in the Plant Assets being installed, can be directly accessed from the Plant Assets and vice versa. This provides a very clear linkage between the real functions and devices in the plant and the Process Safety reasons they exist. It addresses the very real issue of IPLs not being clearly identified by LOPA Teams, or of multiple LOPA teams identifying multiple IPLs that correspond to the same Plant Asset. The figure below illustrates these relationships (a minimal code sketch of the same linkage appears after this list).
[Figure: Relationships linking PHA, LOPA, and Plant Assets]
  • Provide engineering tools that support design of safety instrumented systems (SIS) and safety instrumented functions (SIF) and maximize consistency of design. This includes tools to rapidly develop Safety Requirements Specifications (SRS) and evaluate SIF Probability of Failure upon Demand (PFD). Functional requirements for Safety Related Devices are clearly defined as are testing requirements and any operational requirements.
  • Provide the ability to define functional requirements for other non-SIF Protective Functions and Safeguards and their field Devices including alarms, interlocks, BPCS functions and non-instrumented protective functions.
  • Provide the ability to upload or link key documents directly with Process Safety related Assets. These may be operation and testing procedures or other documents or drawings.
  • Provide a complete Life Cycle management capability that allows for management of Protective Functions during operation and maintenance after initial design and installation. Events associated with the operation of Safety Related Plant Assets such as Demands, Testing, Bypasses and Failures may be recorded using simplified and consistent Work Flows. The figure below illustrates the relationship of Events to Plant Assets and Devices.
[Figure: Relationship of Operate-Maintain Events to Plant Assets and Devices]
  • Provide a real-time view of the performance of Protective Functions and their field Devices based upon the Event data contained in the database. Various Key Performance Indicators (KPIs) allow the plant to understand the current state of performance, manage periodic testing, and identify bad actors for which corrective action is required. Examples of various KPIs are shown below.
[Figure: Failure on demand and bypass performance graphs]
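As a rough sketch of the linkage described above, the snippet below shows how such an object model might hang together; the class and field names are assumptions for illustration, not SLM’s actual schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Device:
    tag: str

@dataclass
class PlantAsset:                  # e.g. a SIF or other protective function
    tag: str
    devices: List[Device] = field(default_factory=list)

@dataclass
class IPL:
    description: str
    asset: PlantAsset              # the real plant function implementing this IPL

@dataclass
class LOPAScenario:
    hazard: str
    ipls: List[IPL] = field(default_factory=list)

# From a hazard scenario you can navigate down to the asset and its devices,
# making the Process Safety reason for each asset's existence explicit.
sif = PlantAsset("SIF-101", [Device("PT-101A"), Device("XV-101")])
scenario = LOPAScenario("Heater fuel gas pressure low",
                        [IPL("Fuel gas low pressure trip", sif)])
print(scenario.ipls[0].asset.tag)  # -> SIF-101
```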

How SLM® can help when promoting a sustainable Process Safety Culture

  • Instead of separation of various aspects of Process Safety, they are integrated in one web-based application that makes Process Safety information available to all personnel who have Process Safety interests.
  • SLM® allows personnel with Process Safety responsibilities to participate in Process Safety Management on a routine basis. Furthermore, personnel that are closest to various Events may be assigned responsibility to record those Events.

For example:

Operations personnel are responsible for entering Events for operational items such as Demands, Bypasses, and Failures, while maintenance personnel are responsible for monitoring testing that is approaching its due date and for entering the results of testing and any other maintenance Events. This keeps personnel involved in the day-to-day management of Process Safety related Assets.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

How to Use Your Process Historian to Generate Automated Safety Lifecycle Manager (SLM) Events

Capturing Events in SLM generally requires manual entry of data by a user. However, this doesn’t need to be the case: it is possible to automatically extract Event data from a Process Historian. Setting it up takes some initial work, but once the setup has been done, Event generation can be automated. First, the user must have tags in the basic process control system (BPCS) and Historian from which the Historian can capture changes in status. These are status tags that signify that an Event has occurred. A few examples of this are:
  • Alarm Activation
  • Safety Instrumented Function (SIF) Demands
  • Manual Trip Commands
  • SIF Bypasses
  • Fault/Failure Diagnostics
  • BPCS demands
In order to leverage this functionality, the underlying BPCS and Safety Instrumented System (SIS) alarm and status tags need to be developed. See the example in the figure below:

Then, once the necessary status data is available in the Historian, an external scanning program needs to be developed that scans the Historian data for a set of tags on some routine basis – typically daily, though other intervals may be chosen.

The scanning program exports a file listing all status changes that occurred over the scan interval. Typically, this file contains the tag number associated with each status change, the status change itself (e.g. from Normal to Tripped, Normal to Bypass, Bypass to Normal, etc.), and the time stamp of the status change. On the SLM side, another program, the SLM Import Adapter, examines the Historian export file and generates the associated Event in SLM. In order to do this, SLM needs a table of the tags that may have a status change and enough information to generate the Event (a sketch of how these pieces might fit together follows the list below). Some of the information required is:

  • The Historian tag name and the SLM object name – These should be the same, but there is no guarantee they will be.
  • The type of Event with which the Historian tag is to be associated (e.g. Demand, Bypass, etc.)
  •  A list of Devices associated with the SLM object name for which SLM should create Device Events
  • Whether the Event is to be directly logged in SLM or submitted for Approval.
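Here is that sketch – a hypothetical import adapter in Python. The export file columns (tag, change, timestamp) follow the description above, but the mapping table, field names, and function are illustrative assumptions, not the actual SLM Import Adapter interface:

```python
import csv

TAG_MAP = {
    # Historian tag -> (SLM object name, Event type, requires Approval?)
    "PSLL_101_TRIP": ("SIF-101", "Demand", True),
    "HS_101_BYPASS": ("SIF-101", "Bypass", True),
}

def events_from_export(path: str) -> list:
    """Translate a Historian export file into SLM-style Event records."""
    events = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):          # columns: tag, change, timestamp
            mapping = TAG_MAP.get(row["tag"])
            if mapping is None:
                continue                       # tag not configured for import
            slm_object, event_type, needs_approval = mapping
            events.append({
                "object": slm_object,
                "type": event_type,
                "status_change": row["change"],    # e.g. "Normal -> Tripped"
                "timestamp": row["timestamp"],
                "requires_approval": needs_approval,
            })
    return events
```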

The SLM Import Adapter is then used to generate the SLM Events. The Adapter handles the messy behind-the-scenes details of creating the Events and any linkages to SLM Parents or Children.

However, it should be noted that Historian tag status data cannot always provide all the data that a user may want to support SLM’s performance analysis and reporting functions.

For example, a SIF Demand Event generated from Historian data will record a SIF Demand in SLM, but probably won’t have enough data available to verify whether the Demand was executed successfully or identify what Devices were involved in the Demand.

It will usually be necessary for a user to review the automatically generated Event data and supplement it with additional information, such as Pass/Fail status, or to create or edit the Device Events that should be associated with a Demand. This can be addressed by requiring that all automatically generated Events enter the SLM database as requiring Approval. This clearly identifies that new Events have been created and allows for review and completion before the Event is finalized.

While we have been discussing how SLM Events can be generated from Historian data, the same concepts can be applied to other Events, such as Testing and Maintenance Events, where data can be extracted from a Site’s Maintenance Management System and imported into SLM.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software.
Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Functional Safety Assessment (FSA) – “A” is for Assessment

Functional Safety Assessment VS Functional Safety Audit

When I chair a Functional Safety Assessment (FSA) for a Safety Instrumented System (SIS), there is usually a brief kickoff meeting with the personnel that will be involved in the assessment such as Engineering, Operations, Maintenance and Process Safety. They are often under the impression that they are being audited. However, that isn’t really the case.

IEC 61511 ed. 2 contains the following definitions:

3.2.24 Functional Safety Assessment (FSA):
Investigation, based on evidence, to judge the functional safety achieved by one or more SIS and/or other protection layers.

3.2.25 Functional Safety Audit:
Systematic and independent examination to determine whether the procedures specific to the functional safety requirements comply with the planned arrangements, are implemented effectively and are suitable to achieve the specified objectives.

An FSA is not intended to be a systematic deep dive into all aspects of the execution of Safety Life Cycle requirements. It is intended to be a review of the evidence that an organization can present to demonstrate that their activities, procedures and plans comply with the Safety Life Cycle requirements of the IEC/ISA Standards.

The Standards say that the FSA team shall include at least one senior competent person not involved in the project design team or in the operation and maintenance of the SIS. That is an incredibly important requirement. The “senior competent person” needs to have the experience and judgment to know what to look for and to be able to assess what is found.

As that “senior competent person” for most of the FSAs I’ve chaired, I tend to start with a high-level review of the documentation I’ve been provided. I’m not checking all the details. However, over my career I’ve been bitten enough times (sometimes by myself) to be able to sniff out where something is missing, or where an organization that produced a portion of the documentation hasn’t really thought about a particular part of the Life Cycle. That is when it may be time for a selective deep dive.

The important point is that I’m not cross-checking every single detail in each document. I’m assessing the overall quality of the documentation given to me, noting what documentation may be missing, and weighing the answers I get when discussing the SIS with the various personnel involved. Only when I can ascertain that something is off do I begin to devote the time to look with more attention to detail. When I find things that are incorrect or incomplete, I will identify them, but I’m not going so deep as to say things like “Step 45 in the proof test procedure isn’t correct”. I’m going to review the proof test procedure to verify, first, that it exists and, second, that if executed it will meet its stated objective. The FSA team doesn’t have the time to do a detailed design and documentation quality audit – that is the job of the organization that designs and owns the SIS, and it’s an activity that should be done prior to the FSA.

Another aspect is that in most instances the FSA team has little or no enforcement authority. The team can only identify issues of concern in the FSA report and recommend actions to address gaps. Sometimes the recommendations are very specific things to fix; sometimes there are long-term organizational or procedural issues to address. The management of the organization that will own and operate the SIS has the responsibility to determine how and when to address the recommendations, and how seriously to take a finding of “Functional Safety has not been achieved”.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s Safety Lifecycle Management software. Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

What really should be in a Safety Requirements Specification (SRS)?

IEC 61511 ed.2 and ISA 84.00.01 require that a Safety Requirements Specification (SRS) be prepared for each Safety Instrumented System (SIS). Clause 10 describes the requirements for the SRS. Clause 10.3.2 lists the minimum items that shall be addressed in the SRS. A total of 29 items are listed.

In my experience (over 40 years) reviewing SRS’s produced by multiple organizations, the authors typically haven’t read or understood the requirements, nor do they understand the overall intent of the SRS. Many SRS’s I have reviewed have missed the mark badly. In many cases, the SRS has been treated as an after-the-fact document, bloated with tons of detailed design information while missing a majority of the standard’s requirements. When you actually dig into a typical SRS, you find all sorts of things left out, while the main focus is on documenting what the engineers did. The sad thing is that some of these SRS’s were produced by reputable companies that market themselves as SIS experts.

The first thing to know is that the SRS is required to be a “before design starts” document. The intent of the committees that wrote IEC 61511 ed. 2 and ISA 84 was to ensure that the SIS requirements are laid out before design starts, defining the things that must be addressed during detailed design. The SRS is NOT a detailed design document. The SRS is used to guide the detailed design of the SIS and should then be used to verify that the design actually meets the requirements. Any organization that does not require a complete SRS to be prepared and approved prior to the start of detailed design isn’t in compliance with RAGAGEP and is likely spending far too much money during the detailed design phase. If you have an SRS that conforms to the standards, the subsequent detailed design becomes much less expensive and more effective.

That said, an SRS isn’t necessarily a short document, but it also doesn’t need to be the huge pile of paper that most become. An SRS is not easy to write the first time around – I look back at some of my first efforts and cringe a bit. To be effective, there is a lot of learning that needs to happen. It’s a good idea to have a quality SRS example to work from if you are developing your first one.

Click here to read more about how Safety Requirements Specifications don’t have to be hard or expensive!

An SRS should be focused on the following broad areas. Coincidentally, many of these areas are exactly what’s missing from the SRS’s I’ve seen. Within each of these, the items defined in IEC 61511/ISA 84 Clause 10.3 need to be included.

  • Hazard Prevention – The SRS needs to clearly identify the Hazard that each of its Safety Instrumented Functions (SIF) is intended to prevent and the functions that the SIFs must perform.
  • Operating Modes – The SRS needs to define when the SIF’s are required to be available and when they are not. The SRS needs to describe how and when a SIF is put into service and also how and when it is bypassed or removed from service. These descriptions need to be explicit and define what the detailed design needs to enable.
  • SIF Performance – The SRS must define the performance requirements for each SIF. IEC 61511 ed. 2 and ISA 84 are big on making sure that the SIF activates upon demand (usually characterized as a Probability of Failure upon Demand, or PFD). They aren’t as focused on the reverse: making sure that false trips don’t occur. The owner of the SIS needs to make sure the SRS addresses design requirements for both Availability (PFD) and Reliability (prevention of false trips). This means defining requirements for redundancy, voting groups, and similar design features that promote reliability without compromising availability (see the sketch at the end of this section).
  • Device Functional Requirements – The SRS needs to define the performance expectation for field devices such as:
    • Range
    • Accuracy
    • Response time
    • Shutdown valve stroke time
    • Leakage
    • Certifications for use

These are performance requirements and are not procurement specifications.

  • SIS Design Requirements – The SRS needs to identify the specific SIS and SIF design requirements, which need to address organization and site practices such as:
    • Acceptable component selection
    • Installation requirements
    • Wiring requirements

Note: It’s best for an organization to produce a SIS Design Standard as a reference rather than trying to cram this data into the SRS. Some organizations have two standards: one for SIS physical design and installation and a second for SIS application software and programming.

  • Operation and Maintenance Requirements – The SRS needs to define testing and verification requirements for the SIS and its SIF’s over the SIS life. The SRS should define what procedures must exist such as:
    • Operating Procedures
    • Initial Validation
    • Periodic Test Procedures
    • Bypass Procedures
    • Performance Data Records
    • Periodic evaluations

The SRS doesn’t need to include these procedures, but it does need to identify the requirement that the procedures be developed and used.

What you don’t see in the items listed above is anything about data sheets, design drawings, and the like. Those all come later and should never show up in an SRS. If an organization wants to produce a design book, that’s fine, but it’s separate from the SRS.
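As for the Availability-versus-Reliability trade-off mentioned under SIF Performance, here is the promised sketch: a back-of-the-envelope comparison of a 1oo1 versus 1oo2 sensor arrangement using the standard simplified PFDavg equations (ignoring common cause and diagnostics). The failure rates and test interval are illustrative assumptions, not recommendations:

```python
LAMBDA_DU = 2.0e-6      # dangerous undetected failure rate, per hour (illustrative)
LAMBDA_S = 5.0e-6       # safe (spurious) failure rate, per hour (illustrative)
TI_HOURS = 8760.0       # one-year proof test interval

# Standard simplified equations: PFDavg(1oo1) = lambda_DU * TI / 2,
# PFDavg(1oo2) = (lambda_DU * TI)^2 / 3
pfd_1oo1 = LAMBDA_DU * TI_HOURS / 2
pfd_1oo2 = (LAMBDA_DU * TI_HOURS) ** 2 / 3

print(f"1oo1: PFDavg = {pfd_1oo1:.1e}, spurious trip rate = {LAMBDA_S:.1e}/h")
print(f"1oo2: PFDavg = {pfd_1oo2:.1e}, spurious trip rate = {2 * LAMBDA_S:.1e}/h")
# 1oo2 buys nearly two orders of magnitude on PFD (availability) but doubles the
# spurious trip rate (reliability) -- exactly the trade-off the SRS should pin down.
```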

[Figure: SRS checklist]
Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

5 Reasons You Should Invest in a Safety System (other than Safety)

When it comes to Safety Systems, there are two perspectives:

Those who have experienced a major incident  VS  those who have not.

Most of those who have experienced a major incident have no desire to experience another one. Those who have not are often unable or unwilling to recognize that they could be next. Unfortunately, these people far outnumber the enlightened, and more unfortunately, they tend to overpopulate corporate management levels, meaning they also control the budgets.

If you think “it won’t happen to me”, the perspective you’re taking is most likely a financial one, and a short-term financial one at that. Therefore, aside from the “it’s illegal” and “safety is a requirement” arguments, a financial argument also needs to be made. Safety can be looked at as an investment for many reasons (the list below is in no particular order of importance):

  • Reputation: Major incidents, while infrequent, cost lots and lots of money and can damage your reputation for years. They can put you out of business and may have severe personal impacts.
  • Compliance: HAZOP and LOPA procedures will tell you the consequences you face, how likely those incidents are to occur, and how much impact they could have. You save more money by proactively preventing losses than by reacting after the fact, which can be far more expensive and time consuming.
  • Responsibility: The argument may be made that the likelihood is “remote”. Often this is code for “won’t happen while I am here”. The real value is realized when you add up the returns for all of the Safety Systems at a Site or across an Enterprise: the probabilities become a near-certainty that one or more of the Safety Systems will have to function. So it WILL happen while you are here.
  • Return on Investment (ROI): If you multiply the cost of a consequence by the unmitigated probability of it occurring, you get the expected annual cost of that incident. Do the same thing with the probability adjusted for the presence of a Safety System and you get the mitigated cost. The difference is the value of the Safety System. From that value you can calculate a Return on Investment; for major consequences, it’s usually the best ROI you will find anywhere (see the sketch after this list).
  • You save money!: When a significant number of systems are considered, the value of Safety Systems is certain and can be calculated. It’s likely that a few Safety Systems will pay for the installation and maintenance of all the others, possibly several times over.
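Here is that sketch – the ROI arithmetic in miniature, with every number invented purely for illustration:

```python
consequence_cost = 50_000_000    # cost of the major incident, $ (illustrative)
unmitigated_freq = 1.0e-2        # events/year without the safety system (illustrative)
rrf = 100                        # risk reduction factor of the safety system

expected_loss_without = consequence_cost * unmitigated_freq   # $500,000/year
expected_loss_with = expected_loss_without / rrf              # $5,000/year
annual_value = expected_loss_without - expected_loss_with     # $495,000/year

annualized_cost = 60_000         # install + maintain, $/year (illustrative)
print(f"Annualized value: ${annual_value:,.0f}")
print(f"ROI: {annual_value / annualized_cost:.1f}x")          # about 8x
```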

If the management of an Enterprise or Site remains intransigent, it’s probably time to document who made the decision and then go find somewhere else to work. It’s probably not safe where you work now. It’s a sad thing, but very real, that some people just “have to have an incident”.

FREE DOWNLOAD: Learn more about the business reasons supporting investment in an integrated Safety Lifecycle Management program.  

[Figure: Investment in safety]

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Thoughts on Prescriptive Design – It Doesn’t Solve Everything, and Sometimes Doesn’t Solve Anything

Some organizations feel they can address the requirements of the Safety Lifecycle by developing prescriptive requirements. This can be effective in enforcing some level of conformance with Safety Lifecycle requirements, but it can also have the opposite effect if not done properly.

 1.) Are the prescriptive requirements complete?

Compliance with the Safety Lifecycle requires far more than a company standard simply stating that “all fired heaters shall be equipped with a system that shuts down the heater upon unsafe conditions”. That is not very useful. The requirements need to be very specific and based upon real hazard assessments. For a prescriptive design program to be effective, the required designs need to address:

  • Anything that constitutes a robust design, including specific required Safety Instrumented Functions (SIF) (e.g. heater fuel gas is shut off when the fuel gas pressure falls below the value required for minimum stable firing)
  • Details such as voting inputs and outputs, physical configuration, component selection, testing, etc.
  • A complete detailed Safety Requirements Specification (SRS)

Additionally, when a Safety Instrumented System (SIS) is designed based upon the prescriptive requirements, it still needs its own application-specific SRS. An SRS in a standard can be a good starting point, but it still needs to be adapted to a Site’s practices.

2.) Do the prescriptive design standards fall short?

Ownership requirements typically do not address Site organizations and procedures. However, these need to be addressed in order to assure that post-design Safety Lifecycle functions (such as testing, performance reporting, performance reviews, training, etc.) are performed. Even an organization with good prescriptive design standards has to make sure it follows up on the post-design requirements.

 3.) Is your overall Safety Lifecycle really complete?

Prescriptive design standards that don’t focus on the overall Safety Lifecycle requirements are often perceived by a Site as the end of the requirements. It’s very easy to get into a “we did what they told us to” culture instead of one that understands the entire Safety Lifecycle and makes it a part of their day to day best practices.

If an organization chooses to use prescriptive requirements, they cannot be thought of as a complete solution. They are only a small part of the overall requirements – a starting point, with a lot more to consider.

The Next Step – Operations:

Make sure all prescriptive design standards are accompanied by very specific Safety Lifecycle requirements for the Operation phase of the Lifecycle. This includes requirements for meeting all of the other specific requirements, as well as identification of who is responsible for which tasks and how they should report data. This can be difficult, because every Site will want to do things its own way unless provided with lots of incentive. Without some level of enforcement, it’s far too easy for the Operations phase to fall apart, with missed or incomplete testing, bypassed systems, poor or no data retention or reporting, and no continuous process of reviewing performance and making the necessary improvements.

Learn more about the different roles and responsibilities in the safety Lifecycle.

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Two Major Issues with Managing Safety Lifecycle Data

There are many issues with managing Safety Lifecycle data in a real plant environment. The fundamental problem is that the tools available in the plant are traditional and unsuited to an effective Safety Lifecycle Management program, especially for Instrumented Systems, much less other functions such as non-instrumented systems or relief systems.

Plants that manage the Safety Lifecycle generally have to resort to manual data gathering and retention procedures outside of what management perceives as their core tools. The result is systems that are very inefficient, prone to gaps, and easily out of date. This is a very common issue in the process industry.

  • Maintenance Management System (MMS)

Many Maintenance Management Systems have been implemented by cost-driven projects transitioning old home-grown systems to large commercially available systems. Implementation teams are usually given direction like “if the old system didn’t do it, the new one shouldn’t either”. However, the old systems did not cover instrumentation, or newer concepts such as Safety Systems, that are now needed.

The system is typically driven by Work Order management, warehouse stock management, and maintenance management of major equipment. Attempts to add Safety Instrumented Systems (SIS), Safety Instrumented Functions (SIF), field instruments, and the like are exercises in futility: the system can’t handle the sheer volume and really can’t handle things that have many inter-relationships. Furthermore, management typically does not support the effort required to input additional data into the MMS, even when the system is known to handle it.

  • Instrument Database

Commonly used commercial Instrument Database applications define things such as instrument data sheets, loop diagrams, wiring, etc. They are typically used for large engineering projects, yet they have issues such as rudimentary maintenance functions and an inability to support ongoing events. Attempts to force them to fit these needs usually don’t work very well.

In effect, maintenance data for instrumentation is only as good as the individual records. The MMS can be used for Work Orders and warehouse stock management, but not much else. Work Order feedback, where it exists, is usually manually entered text and seldom contains useful instrument work information.

  • Process Hazard Analysis (PHA) Records

The Process Safety Group is usually responsible for facilitating PHAs for the facility, including the initial PHA/HAZOP, 5-year revalidations, projects, and in-house Management of Change (MOC). They use a combination of commercial PHA/HAZOP applications, Excel spreadsheets, and both paper and electronic MOC checklists. All of this is typically kept in the group’s records, yet it is exceptionally hard to use for other purposes. The PHA/HAZOP applications also usually have draconian license restrictions that allow only the Process Safety Group to have access.

Every PHA/HAZOP and MOC checklist is usually kept in a separate file, which makes finding anything a major effort. Requests for information can be met with a “Who wants to know?” response, causing substantial delays in actually getting the information, if it is ever received.

Sometimes the master lists of Independent Protection Layers (IPL) identified in the LOPAs do not correlate to actual plant assets, or don’t exist at all. Operations and maintenance personnel then have no real knowledge of what the IPLs are, and they tend to lack knowledge of the hazards that led to the requirements for the IPLs in the first place.

  • Document Management

Some facilities have a centralized document system that seems to work. Being able to access scanned or source files for just about any drawing or document in the facility can be useful. However, there is sometimes an unspoken rule that the document system contains “engineering data only”, with documents stored only by Unit and document type. That works if that’s all that is needed; if not, don’t even think about asking for a list of the documents associated with a piece of equipment or a Safety Function.

  • Independent Protection Layer (IPL), Safety Instrumented System (SIS) and Safety Instrumented Function (SIF) Management

It is becoming clearer that traditional plant management tools are not able to manage the Safety Lifecycle for Instrumented and non-Instrumented protective functions. Previously, there were no commercially available data management tools, so the effort got reduced to setting up a series of folders on a facility network drive. To capture a “dossier” for each protective system, scanned copies of widely dispersed data such as PHA/HAZOP/LOPA documents, SRS’s, test procedures, design documents, and data sheets were stored there. Other folders were typically created to store operationally related items, such as scanned copies of completed test procedures and Excel spreadsheets of various events.

This process was used in parallel with existing documentation systems because it was the only way all the relevant information could be collected and made accessible. In theory, these documents were available in other systems, yet finding them would have been a scavenger hunt if they hadn’t been collected separately. The system is very labor intensive, as manual effort is required to collect all the relevant documents and file them electronically. It was something of an underground effort, as site management didn’t really appreciate the value of the data, and the system’s longevity depended heavily upon not having a poor quarter of financial performance.

It can be very difficult to manage the Safety Lifecycle within a plant that has only the traditional commercial Process Safety, Maintenance Management, and Documentation applications found in any general operation. Some facilities have structured their own system by filing the relevant documents in a parallel network drive folder, but that isn’t a permanent solution. Proper Safety Lifecycle Management requires a separate, purpose-built application.

Read more about Justifying Investment in a Safety Lifecycle Management Platform

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

Benefits of Effective Bypass Management

ISA TR84.01.00 and IEC 61511 ed. 2, Part 1, contain extensive discussions of the design and operating procedures for Safety Instrumented Function (SIF) bypasses. Clause 16.2 describes operational requirements such as:

  •  Performing a hazard analysis prior to initiating a bypass
  • Having operational procedures in place for when a protective function has been bypassed
  • Logging of all bypasses

In addition to the need to manage process hazards while a protective function is bypassed, the time a protective function spends in bypass affects the in-service performance of the SIF. While bypassed, the protective function is unavailable, so every hour of bypass time increases the effective Probability of Failure upon Demand (PFD) of the function.

The fault tree excerpt below illustrates how bypassing a SIF’s shutdown valves for 20 hours in a year can significantly affect the PFD. Without any bypasses, the Risk Reduction Factor (1/PFD) of the SIF is 306. The 20 hours of bypass reduce the in-service Risk Reduction Factor (RRF) to 180 – about a 40% reduction in performance.
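A simple time-weighted-average model reproduces those numbers; this sketch is an illustration of the arithmetic, not the full fault tree calculation:

```python
HOURS_PER_YEAR = 8760.0
baseline_rrf = 306.0          # RRF with no bypasses, from the fault tree above
bypass_hours = 20.0           # hours per year the shutdown valves are bypassed

pfd_base = 1.0 / baseline_rrf
frac_bypassed = bypass_hours / HOURS_PER_YEAR

# While bypassed the function is unavailable (PFD = 1); average over the year.
pfd_in_service = (1 - frac_bypassed) * pfd_base + frac_bypassed * 1.0
print(f"In-service RRF: {1 / pfd_in_service:.0f}")    # -> about 180
```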

The Why:

  1. Compliance – The standards governing the Safety Lifecycle require that bypasses be tracked, and they define the specific information that should be associated with each bypass. This is crucial to ensuring overall safety.
  2. Process Safety Management – Excessive bypassing of protective functions has a substantial impact on overall process safety: performance can be significantly reduced by even moderate levels of bypassing. An effective bypass log will help identify bad actors – most bypasses occur for a reason, and if a function is bypassed frequently, it’s typically for the same repetitive reason.

The Benefits:

  1. Improves safety and overall availability through transparent and effective safeguard stewardship – Key Performance Indicators support effective process safety management of safety functions, ensuring that the designed integrity is not compromised.
  2. Reduces operational risk through effective evaluation and mitigation of occurrences where safety-critical functions or equipment are bypassed – visibility of risk, tracking of active bypasses, and override risk assessments performed prior to bypassing.

 Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

What Went Wrong With The Process Safety System? Lessons From The 737 Max Crash

When there is a major accident somewhere, you have to investigate what errors might have been made that contributed to it. Boeing’s problem with the 737 Max highlights a few fundamental issues with process safety systems that need to be examined.

4 Lessons Learned: 

1.)  An extremely important part of specifying a Process Safety System is to seriously consider the effects of a spurious trip on the overall safety of the process. Safety systems should be designed to keep you out of trouble, not to put you into it. If a spurious trip could at any point drive a process to an unsafe condition, there needs to be some careful thinking about how that unsafe condition can be avoided. In the 737 Max case, there are indications that operation of the Maneuvering Characteristics Augmentation System (MCAS) at low altitudes may not have been examined as carefully as it should have been.

2.)  The second issue is the lack of a robust system. From reports so far, it appears that MCAS operated based on only one sensor, which made the system much more exposed to a spurious trip. The failure of that one and only sensor resulted in behavior that drove two planes into the ground. When designing a Process Safety System whose spurious trip could produce unsafe behavior, robustness is essential in order to cope with potential sensor failures (see the sketch after this list). A system that can command the airplane toward the ground deserves more than one sensor. From the reports, this appears to have finally dawned on Boeing’s engineers after two crashes in 5 months.

3.)  This might not be entirely the fault of the engineers who designed the system – a second sensor was known to be available as an option. This suggests that management may have pushed the less robust system through to save money.

4.)  The last issue is relying on the people who operate the plane rather than on the safety system itself. That leaves room for human error. It appears a robust system wasn’t required because the pilots were expected to be able to turn off the system if they needed to. This seems to have worked in some of the incidents reported by US airlines. However, in the crashes that occurred, it is being reported that the flight crews either could not or did not turn off the system. There is some speculation that their training wasn’t sufficient, but in any case, people under a lot of stress tend to forget things and make mistakes.

People can’t be expected to respond to unexpected events reliably. It’s worse if they haven’t been well trained, or it’s been a long time since they were trained. Expecting operator response comes with a burden to train well and train often.
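As a purely illustrative sketch (in Python, with made-up limits; this is not MCAS logic), here is how a second sensor changes the failure behavior: a single implausible reading produces a fault alarm and inhibits the function instead of causing a spurious trip.

```python
def cross_checked_trip(sensor_a: float, sensor_b: float,
                       trip_limit: float = 15.0,
                       max_disagreement: float = 5.0) -> tuple[bool, bool]:
    """Two-sensor trip decision with a plausibility check.

    All limits are made-up illustrative values. Returns (trip, fault):
    trip only when both sensors agree and both exceed the limit;
    fault when the sensors disagree, so the function can be inhibited
    and an alarm raised instead of acting on one bad reading.
    """
    if abs(sensor_a - sensor_b) > max_disagreement:
        return False, True   # implausible input: inhibit and alarm
    trip = sensor_a > trip_limit and sensor_b > trip_limit
    return trip, False

# One failed sensor (a stuck 75-degree reading against a healthy
# 5-degree reading) raises a fault instead of tripping:
print(cross_checked_trip(75.0, 5.0))   # (False, True)
```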

Summary:

The 737 Max crashes are a stark reminder that designing a safety system responsible for people’s lives deserves healthy portions of realism, pessimism, and all-around risk consideration. You really need to spend time thinking before deciding that a design is acceptable, even when there are other pressures from management.

[Infographic: lessons from a plane crash]

Issues With Managing Process Hazard Analysis (PHA) Data

National and local regulations require that all process operations have a formal Hazards Analysis performed on the original installation as well as on all modifications to the facility. Most regulations also require that the Process Hazard Analysis (PHA) of record be re-validated at regular intervals, such as the 5-year re-validation cycle required in the US.

A PHA is a complex tool used throughout the lifecycle of a facility, and two of the biggest issues with PHAs are coordination and consistency (see Figure 1 below). A PHA of Record represents a point in time, but plants are not static. They are very dynamic, with multiple independent modifications in progress at any given moment. Some modifications are implemented while the plant is in operation, while a backlog of others is scheduled for the next turnaround. Changes start accumulating the day the plant starts up after its last turnaround. Every time a plant is modified, some form of PHA is performed. The scope of these modifications can range from a small in-house change to large projects that expand, de-bottleneck, or fix the process.

Figure 1: Issues to consider (infographic)

So, in facilities, the Process Safety Management (PSM) teams are faced with the almost impossible task of monitoring and collecting all of the completed Hazard Assessments and incorporating them into the PHA of Record as the modifications are implemented. If this hasn’t been done all along, the PSM team has an even bigger job of collecting all the incremental changes and identifying how they relate to the PHA of Record before the re-validation process can even start. All of this is a lot of work, consuming several full-time equivalents just to keep up. Most places don’t have these resources, so they make do as best they can.
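At its core, this is a bookkeeping problem. Here is a minimal sketch, in Python, assuming each modification’s hazard assessment is logged with a flag recording whether it has been folded back into the PHA of Record (the field names are illustrative):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class HazardAssessment:
    """An incremental hazard assessment triggered by a modification."""
    moc_id: str                 # the MOC or project that triggered it
    completed: date             # when the assessment was finished
    incorporated: bool = False  # folded into the PHA of Record yet?

def revalidation_backlog(assessments: list[HazardAssessment],
                         pha_of_record: date) -> list[HazardAssessment]:
    """Assessments completed since the PHA of Record that are not yet in it."""
    return [a for a in assessments
            if a.completed > pha_of_record and not a.incorporated]
```

Running a query like this ahead of a re-validation at least tells the PSM team the size of the backlog before the work begins.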


The Biggest Problem with Safety Lifecycle Management Roles and How You Can Fix It

The problem: Many organizations are trying to figure out how to manage the Safety Lifecycle. Leadership will often end up just appointing someone as “the SIS guy”. Sometimes the role is given a fancy title, but the intent is to hand the entire Safety Lifecycle to a single person. That someone is usually in engineering and may or may not actually have the skills to take on the task. Leadership then thinks they’ve done all that is necessary. The poor person who gets handed the responsibility usually doesn’t have any authority to go along with it, but is somehow expected to “make it happen”. Organizations that try this approach typically fail.

There are enough responsibilities to go around. An effective Safety Lifecycle Management program recognizes this and clearly identifies who holds which responsibilities in each area of the Safety Lifecycle. If there is a central authority, that authority is given a very big management stick to hold every department fully accountable.

 Three main phases to consider:

 1.) Requirements Identification

This is where the Process Safety function in an organization has responsibilities that include: 

  • Define Risk Management Standards
  • Facilitate HAZOP and LOPA Studies
  • Clearly communicate results to the Engineering, Operations and Maintenance personnel

 The Requirements Identification process continues as modifications are made, new processes are added, and periodic re-validations are performed. Other groups with Safety Lifecycle responsibilities also participate fully in the identification of protective system requirements, but the Process Safety function owns the process.

2.) Specification, Design, Installation

This is typically an Engineering Group that translates the basic requirements from the Process Safety function into real designs and implements them. The responsibilities include:

  • Prepare the Safety Requirements Specification (SRS)
  • Assure that the design meets the SRS
  • Follow the detailed design
  • Inspect and validate testing

Along the way, they are also responsible for assuring that all testing and maintenance procedures are prepared and approved. Engineering personnel will also often be responsible for monitoring in-service performance to identify any changes needed to continue meeting the requirements of the design.

This requires that events such as faults, failures, bypasses, demands, and tests be reported to the personnel responsible for evaluating performance. The personnel who assess protective system performance are in turn responsible for reporting the results to all groups that have related responsibilities, including Site Management.
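As a rough illustration of the kind of record this reporting implies, here is a minimal Python sketch; the event types come straight from the list above, while the class and field names are assumptions:

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from collections import Counter

class EventType(Enum):
    FAULT = "fault"
    FAILURE = "failure"
    BYPASS = "bypass"
    DEMAND = "demand"
    TEST = "test"

@dataclass
class ProtectiveSystemEvent:
    """One reportable event against a protective function."""
    tag: str               # the SIF or protective function involved
    event_type: EventType
    occurred: datetime
    details: str

def performance_rollup(events: list[ProtectiveSystemEvent]) -> Counter:
    """Tally events by (tag, type) for the performance assessment report."""
    return Counter((e.tag, e.event_type) for e in events)
```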

3.) Ownership

Ownership starts during design, while Operations and Maintenance procedures are being prepared and Operations and Maintenance personnel are being trained and qualified.

The Operations responsibilities include:

  • Ensure that protective systems are operated in accordance with SRS requirements and operating procedures
  • Record all operations-related events
  • Report all events to the personnel responsible for assessing protective system performance
  • Monitor testing requirements to make sure that all required testing is performed on schedule and according to testing procedures

The Maintenance responsibilities include: 

  • Perform testing and repairs as required by the schedule and procedures
  • Perform any repairs needed between testing intervals
  • Maintain all testing and repair records and report them to the personnel responsible for assessing protective system performance
  • Schedule and plan periodic testing (a scheduling sketch follows this list)
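As a small illustration of the scheduling half of that job, here is a Python sketch that flags overdue proof tests; the monthly interval and the 30-day month approximation are simplifying assumptions:

```python
from datetime import date, timedelta

def next_test_due(last_test: date, interval_months: int) -> date:
    """Nominal next proof-test date (approximates a month as 30 days)."""
    return last_test + timedelta(days=interval_months * 30)

def overdue_tests(last_tests: dict[str, date],
                  interval_months: int, today: date) -> list[str]:
    """Tags whose periodic proof test is past due as of 'today'."""
    return [tag for tag, last in last_tests.items()
            if next_test_due(last, interval_months) < today]

# Example: with a 12-month interval, SIF-101's test has lapsed
print(overdue_tests({"SIF-101": date(2018, 1, 15),
                     "SIF-102": date(2019, 3, 1)},
                    interval_months=12, today=date(2019, 6, 1)))
# ['SIF-101']
```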

The Fix

Because the Safety Lifecycle is a multi-organization endeavor, a collaborative approach is ideal. The need for communication among the various responsible groups requires clearly identified roles and responsibilities for each area of the Safety Lifecycle. The Safety Lifecycle is not something that can just be handed off to anyone. It’s a deep organizational commitment that involves qualified personnel doing specific parts of the job. Management also needs to provide proper oversight to assure that the process is followed as required.

When Should You Conduct a Functional Safety Assessment (FSA)?

The ISA 84.01.00 and IEC 61511 ed. 2 (Part 1, clause 5.2.6) safety standards require that every safety instrumented system (SIS) have a Functional Safety Assessment (FSA) performed prior to being placed into service. An FSA is required in order to provide assurance that a SIS has been specified, designed, and tested in accordance with all phases of the Safety Lifecycle. These Standards identify 5 stages at which an FSA may be performed.

 The 5 STAGES

[Infographic: the 5 FSA stages]

However, as is often the case with safety standards, they don’t say exactly when to perform each assessment; they leave the actual scheduling to the User. Factors such as project size, complexity, and organizational experience are important to consider.

Important Factors to Consider:

  • If an organization is new to managing the Safety Lifecycle, it is a really good idea to conduct an FSA at each of the Stages identified in the Standards:

1.) First, after the HAZOPs and LOPAs have been performed and the Safety Requirements Specification (SRS) has been developed, perform the Stage 1 FSA on those items.

2.) Next, after the SIS design has been completed, perform the Stage 2 FSA on the design.

3.) Then, prior to startup, perform a Stage 3 FSA to assess the installation, testing, and validation of the SIS and its Safety Instrumented Functions (SIF).

This incremental process lets the newbie organization learn about FSAs and close any gaps identified with minimal impact on the overall project.

  • The same multiple-stage procedure should also be followed on large projects where the SIS is only part of a larger design. Large projects develop momentum; timely checks on SIS specification and design compliance are necessary to avoid substantial impacts on the project schedule and expensive re-work.
  • An organization that is very experienced with SIS design and ownership may choose to defer the FSA until just prior to startup. This scheduling assumes that the organization has well-defined Safety Lifecycle procedures and standards, and is therefore confident that an FSA performed late in the SIS specification and design process will not identify serious gaps that might delay the startup.
  • If a new SIS is being installed on an existing process, or an existing SIS is being modified, the work usually takes place during a unit turnaround. SIS and SIF testing and validation are usually the last steps, so it is important to keep in mind that the operations team driving the startup will not appreciate waiting on an FSA.

This is why it’s a really good idea to have the FSA essentially complete beforehand, except for final items such as testing, validation, and training assessments. This allows the FSA team to do a quick final assessment of those items and provide the necessary “Functional Safety has been achieved” guidance.

ADDITIONAL CONSIDERATIONS with STAGE 4:

It’s important to recognize that throughout the service life of a SIS, multiple Stage 4 FSAs will need to be performed to assess in-service performance. Once again, the standards don’t define how often this should occur. A nominal 4-5 years between Stage 4 FSAs is a reasonable figure, but it all depends upon the process requirements and actual experience. A poorly performing SIS or SIF may need more frequent assessments. Stage 4 FSAs should generally be scheduled ahead of turnarounds to allow time for any needed corrective measures to be implemented during the turnaround. All of this requires that the performance data of the SIS (demands, failures, faults, testing history, etc.) be kept, managed, and organized so it is quickly accessible.
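As a minimal sketch of that scheduling logic in Python, using the nominal interval above and an assumed shorter interval for poor performers (the 2-year figure is illustrative, not taken from the standards):

```python
from datetime import date, timedelta

NOMINAL_INTERVAL_YEARS = 5   # nominal gap between Stage 4 FSAs (see text)
POOR_PERFORMER_YEARS = 2     # assumed shorter interval; not from the standards

def stage4_fsa_due(last_fsa: date, poor_performer: bool) -> date:
    """Next Stage 4 FSA due date; poorly performing systems are assessed sooner."""
    years = POOR_PERFORMER_YEARS if poor_performer else NOMINAL_INTERVAL_YEARS
    return last_fsa + timedelta(days=365 * years)

# A SIS with a clean record last assessed mid-2015 would be due around mid-2020:
print(stage4_fsa_due(date(2015, 6, 1), poor_performer=False))  # 2020-05-30
```

Scheduling the assessment well ahead of the turnaround window, as suggested above, leaves time for corrective measures.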
