NERC | Southwest Cold Weather Event of February 1-5, 2011 – NERC Event Analysis Cause Codes | December 31, 2012 1 of 14

Southwest Cold Weather Event of February 1-5, 2011 NERC Event Analysis Cause Codes

December 31, 2012

Table of Contents

Executive Summary
Purpose
The Southwest Cold Weather Event
Cause Code Assignment
    Root Cause
    Contributing Causes
Appendix: About This Report

Executive Summary

The North American Electric Reliability Corporation (NERC) Reliability Risk Management (RRM) Group reviewed the February 2, 2011 Southwest Cold Weather Event Report (SW_Cold_Weather_Event_Final_Report, identified in the NERC tracking system as TR20110202_Winter_Weather_Event) to determine the appropriate cause codes for trending purposes. While no single root cause was determined, a number of contributing causes were found to exist in multiple situations. Overall, NERC affirms that management and organization challenges were some of the largest contributing factors involved in this event. In addition, corrective actions that were previously recommended, but were not institutionalized, played a major role. There were also maintenance issues, communication and training problems, and overall human performance issues that contributed to this major generation and load shedding event on the North American Bulk-Power System (BPS).

Purpose

Each Event Analysis Report is reviewed to determine, as appropriate, a cause code or codes. These codes provide a means to effectively label and catalogue various causes. These codes may then be analyzed and trended across events with similar causes or factors. Analysis should lead to the proper application of risk management procedures to develop and implement appropriate corrective and proactive actions.

The Southwest Cold Weather Event

The arctic cold front that descended on the Southwest during the first week of February 2011 was unusually severe in terms of temperature, wind, and duration. In many cities in the Southwest, temperatures remained below freezing for four days, and winds gusted in places to 30 mph or more. The geographic area affected by this weather extreme was also extensive, complicating efforts to obtain power and natural gas from neighboring regions.

This winter storm, however, was not without precedent. There were prior severe cold weather events in the Southwest in 1983, 1989, 2003, 2006, 2008, and 2010. The worst of these prior storms was in 1989, and it was most comparable to many of the circumstances realized in the 2011 cold weather event. The cold weather event in 1989 marked the first time ERCOT resorted to system-wide rolling blackouts to prevent more widespread customer outages. In all of those prior years, the natural gas delivery system experienced production declines; however, curtailments to natural gas customers in the region were essentially limited to the years 1989 and 2003.

Going into the February 2011 storm, neither ERCOT, nor the other electric entities that initiated rolling blackouts during the event, expected to have a problem meeting customer demand. They all had adequate reserve margins, based on anticipated generator availability. But those reserves proved insufficient for the extraordinary amount of capacity that was lost during the event from trips, derates, and failures to start.

In the case of ERCOT, where rolling blackouts affected the largest number of customers (3.2 million), there were 3100 MW of responsive reserves available on the first day of the event, compared to a minimum requirement of 2300 MW. But over the course of that day and the next, a total of 193 ERCOT generating units failed or were derated, representing a cumulative loss of 29,729 MW. Combining forced outages with scheduled outages, approximately one-third of the total ERCOT fleet was unavailable at the lowest point of the event. These extensive generator failures quickly depleted ERCOT’s reserves. If ERCOT had not acted promptly to shed load, it would very likely have suffered widespread, uncontrolled blackouts throughout the entire ERCOT Interconnection.

ERCOT also experienced generator outages in the Rio Grande Valley on February 3, again due to the cold weather. This area is transmission constrained, and the loss of local generation led to voltage concerns that necessitated localized load shedding.

El Paso Electric Company (EPE) and Salt River Project (SRP) likewise suffered numerous generator outages, necessitating load shed of 1023 MW in EPE’s case, and 300 MW in SRP’s case. As with ERCOT, many of these generators failed because of weather-related reasons. A number of entities within Southwest Power Pool (SPP) also experienced outages during the event. In their case, however, load shedding was not required, principally because the utilities were able to purchase emergency energy.

The majority of the problems experienced by the many generators that tripped, suffered derates, or failed to start during the event were attributable, either directly or indirectly, to the cold weather itself, including frozen sensing lines, frozen equipment, frozen water lines, frozen valves, blade icing, low temperature cutoff limits, and the like.
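
The ERCOT figures quoted above can be put side by side with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are assumptions made for this example, and the cumulative loss is a two-day total rather than a simultaneous loss, so the ratio indicates scale and is not a measure of the shortfall at any instant.

```python
# Figures quoted in the text above (ERCOT, first two days of the event).
RESPONSIVE_RESERVES_MW = 3100   # responsive reserves available on day one
MINIMUM_REQUIREMENT_MW = 2300   # minimum responsive reserve requirement
CUMULATIVE_LOSS_MW = 29_729     # cumulative capacity lost (not simultaneous)
UNITS_AFFECTED = 193            # units that failed or were derated


def reserve_headroom(available_mw: float, required_mw: float) -> float:
    """Return the reserves held above the minimum requirement, in MW."""
    return available_mw - required_mw


def loss_to_reserve_ratio(loss_mw: float, reserves_mw: float) -> float:
    """Return how many times larger the lost capacity is than the reserves."""
    return loss_mw / reserves_mw


headroom = reserve_headroom(RESPONSIVE_RESERVES_MW, MINIMUM_REQUIREMENT_MW)
ratio = loss_to_reserve_ratio(CUMULATIVE_LOSS_MW, RESPONSIVE_RESERVES_MW)
average_loss_per_unit = CUMULATIVE_LOSS_MW / UNITS_AFFECTED

print(f"Headroom above minimum requirement: {headroom} MW")
print(f"Cumulative loss relative to reserves: {ratio:.1f}x")
print(f"Average loss per affected unit: {average_loss_per_unit:.0f} MW")
```

On these numbers, the 800 MW of headroom above the minimum requirement was small compared with a cumulative loss roughly ten times the size of the available responsive reserves.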

Cause Code Assignment

In this document, where “evidence” citations are pulled directly from the report, the page number where the citation can be found is cited. The initiating event is captured first in this report, followed by contributing causes grouped together under their “A-node” titles, in alphanumeric sequence; sequence therefore does not necessarily imply importance. While cause sequence is instrumental in a cause analysis, to determine “what caused what”, the complexity and magnitude of this event make that pursuit, at least for this cause code report, very difficult to document.

The NERC cause coding process was still very new at the time the event report was completed, and no cause codes were submitted for this event by any of the associated registered entities.

As with the registered entities, the NERC cause coding process was still new at the time the event report was completed, and no cause codes were submitted for this event by any of the associated Regional Entities.

After a month of review and analysis, and in consultation with the Regional Entities involved, the RRM group determined that the following codes were associated with this event.

Root Cause

No root cause identified. There are far too many entities, generators and problems associated with this event to identify a single root cause. In an effort to maintain objectivity in the prioritization of causes, the initiating event is the first cause code, with the remaining cause codes listed in alphanumeric sequence.
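
The ordering rule described above (initiating event first, remaining codes in alphanumeric sequence under their A-nodes) can be sketched in a few lines. This is a minimal illustration, not NERC's tooling: the regular expression is inferred from the shape of codes such as A7B1C01 and A3 that appear in this report, and the names `CauseCode`, `parse_cause_code` and `order_for_report` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed pattern, inferred from codes like "A7B1C01" and "A3": an A-node
# digit, then an optional B-node digit and an optional two-digit C-node.
CODE_PATTERN = re.compile(r"^A(\d)(?:B(\d)(?:C(\d{2}))?)?$")


class CauseCode(NamedTuple):
    a_node: str
    b_node: Optional[str]
    c_node: Optional[str]


def parse_cause_code(code: str) -> CauseCode:
    """Split a code into its cumulative A-, B- and C-node identifiers."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognized cause code: {code!r}")
    a, b, c = match.groups()
    a_node = f"A{a}"
    b_node = f"{a_node}B{b}" if b is not None else None
    c_node = f"{b_node}C{c}" if c is not None else None
    return CauseCode(a_node, b_node, c_node)


def order_for_report(codes: List[str], initiating: str) -> List[str]:
    """Put the initiating code first, then the rest in alphanumeric order.

    A plain string sort is enough here only because every node number in
    the assumed pattern has a fixed width.
    """
    rest = sorted(code for code in codes if code != initiating)
    return [initiating] + rest


codes = ["A4B1C06", "A2B3C02", "A7B1C01", "A1B2C01", "A3", "A2B2C03"]
ordered = order_for_report(codes, initiating="A7B1C01")
print(ordered)
print(parse_cause_code("A7B1C01"))
```

Keeping the sort purely alphanumeric mirrors the report's stated intent: position in the list carries no information about importance.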

Contributing Causes

A7B1C01 – Weather or ambient conditions LTA; Definition - Unusual weather or ambient conditions, including hurricanes, tornadoes, flooding, earthquake, and lightning. Note: This is actually a “nature of occurrence” rather than a true apparent cause. In other words, this is “what” happened rather than “why” it happened. If the event did not take into account the effects of weather or ambient conditions on the facility, try Design Input LTA [A1B1], Operability of Design/Environment LTA [A1B5], or Change Management LTA [A4B5].

This is identified as the “Initiating Action” – what caused the event to start, though not necessarily the cause of why the event grew beyond its start.

Temperatures were considerably lower (15 degrees plus) than average winter temperatures, and represented the longest sustained cold spell in 25 years. Steady winds also accelerated equipment heat loss. However, such a cold spell was not unprecedented (Page 195).

A1 Design/Engineering Problem

A1B2C01 - Design output scope LTA; Definition - The design did not consider all the possible scenarios. Not all of the operating conditions [normal and emergency] were included in the design.

(Report recommendation #12, page 204). Consideration should be given to designing all new generating plants and designing modifications to existing plants (unless committed solely for summer peaking purposes) to be able to perform at the lowest recorded ambient temperature for the nearest city for which historical weather data is available, factoring in accelerated heat loss due to wind speed.

o The ideal time to prepare a generating unit to withstand cold temperatures is in the design stage. For that reason, the low temperatures and wind chills that can occur during the occasional severe storm should be incorporated in the design process.

(Report recommendation #13, page 204). The temperature design parameters of existing generating units should be assessed.

o The task force found that for existing generating units, it is often not known with any specificity at what temperature the unit will be able to operate, or to what temperature heat tracing and insulation can prevent the water or moisture in its critical components from freezing. For that reason, Generator Owner/Operators should conduct engineering analyses to ascertain each unit’s operating parameters, and then take appropriate steps to ensure that each unit will be able to achieve the optimum level of performance of which it is capable.

A1B3C01 - Design / documentation not complete; Definition - The designs and other documentation for equipment were incomplete. Items were missing from the documentation. A complete baseline did not exist.

Balancing authorities, reliability coordinators and generators often lacked adequate knowledge of plant temperature design limits, and thus did not realize the extent to which generation would be lost when temperatures dropped. (page 196)

A2 Equipment/Material Problem

A2B2C03 - Corrective maintenance LTA; Definition - Corrective maintenance was performed but failed to correct the originating problem. The equipment or component was reassembled improperly during corrective maintenance. Other problems were noted during maintenance activities that were not corrected. The actual job of performing a maintenance activity was complete, but was not performed correctly.

(Report recommendation #14, page 205). Generator Owner/Operators should ensure that adequate maintenance and inspection of their freeze protection elements are conducted on a timely and repetitive basis.

o The task force found a number of inadequacies in generating units’ preparations for winter performance. These included a lack of accountability and senior management review, lack of an adequate inspection and maintenance program, and failure to perform engineering analyses to determine the correct capability needed for their protection equipment.

(Report recommendation #20, page 209) Transmission Operators should ensure that transmission facilities are capable of performing during cold weather conditions.

o Transmission Operators reported several incidents of unplanned outages during the February 2011 event as a result of circuit breaker trips, transformer trips, and other transmission line issues. Although these outages did not generally contribute materially to any transmission limitations, some transmission breaker outages did lead to the loss of generating units. Many breaker trips were the result of low air in the breaker, low sulfur hexafluoride (SF6) gas pressure, failed or inadequate heaters, bad contacts, and gas leaks.

Generators were generally reactive as opposed to being proactive in their approach to winterization and preparedness. The single largest problem during the cold weather event was the freezing of instrumentation and equipment. Many generators failed to adequately prepare for winter, including the following: failed or inadequate heat traces, missing or inadequate wind breaks, inadequate insulation and lagging (metal covering for insulation), failure to have or to maintain heating elements and heat lamps in instrument cabinets, failure to train. (page 196)

A2B3C02 – Inspection / testing LTA; Definition - Required testing / inspection was not established or performed for the equipment involved in the incident. The required testing / inspection was performed at an incorrect frequency. The acceptance criteria for the required testing / inspection were inadequately defined. Not all essential components were included in the required testing / inspection.

(Report recommendation #6, page 200) Transmission Operators, Balancing Authorities, and Generation Owner/Operators should consider developing mechanisms to verify that units that have fuel switching capabilities can periodically demonstrate those capabilities.

o Sixteen percent of ERCOT’s generation capacity is listed as having fuel switching capabilities. During the February cold weather event, a quarter of the 20 units that attempted to switch fuel were unsuccessful. If a unit represents itself as having fuel switching capability, verification of the adequacy of its capability would provide useful information to the Balancing Authority or Transmission Operator as to the availability of that unit in the event of natural gas curtailments.

o Fuel switching verification might consist of the following:

– Documented time required to switch equipment,

– Documented unit capacity while on alternate fuel,

– Operator training and experience,

– Fuel switching equipment problems, and

– Boiler and combustion control adjustments needed to operate on alternate fuel.
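
The verification items listed above map naturally onto a simple record per generating unit. The sketch below is one possible layout, not a format defined by the report: the class name, field names, the `is_verified` rule and the sample values for "Unit A" are all assumptions made for illustration. The last lines reproduce the arithmetic behind "a quarter of the 20 units" quoted earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FuelSwitchVerification:
    """Hypothetical record covering the verification items listed above."""
    unit_name: str
    switch_time_minutes: float        # documented time required to switch
    capacity_on_alternate_mw: float   # documented capacity on alternate fuel
    operators_trained: bool           # operator training and experience
    equipment_problems: List[str] = field(default_factory=list)
    control_adjustments: List[str] = field(default_factory=list)

    def is_verified(self) -> bool:
        # Assumption: a unit counts as verified only when operators are
        # trained and no fuel switching equipment problems remain open.
        return self.operators_trained and not self.equipment_problems


def unsuccessful_units(attempted: int, failed_fraction: float) -> int:
    """Number of units that failed to switch, given the failed fraction."""
    return round(attempted * failed_fraction)


# Invented sample values, for illustration only.
example = FuelSwitchVerification(
    unit_name="Unit A",
    switch_time_minutes=90.0,
    capacity_on_alternate_mw=150.0,
    operators_trained=True,
    control_adjustments=["retune combustion controls for alternate fuel"],
)
print(example.is_verified())

# From the text: a quarter of the 20 units that attempted to switch failed.
print(unsuccessful_units(attempted=20, failed_fraction=0.25))
```

A record of this kind would give a Balancing Authority or Transmission Operator a consistent basis for judging whether a unit's claimed fuel switching capability can be relied on during natural gas curtailments.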

(Report recommendation #6, page 201). Balancing Authorities, Transmission Operators and Generator Owners/Operators should take the steps necessary to ensure that black start units can be utilized during adverse weather and emergency conditions.

o The task force determined that a combination of scheduled and forced outages of ERCOT’s black start units would have put ERCOT’s ability to restore the system in jeopardy, had an uncontrolled blackout not been averted by the implementation of load shedding. Balancing Authorities and Transmission Operators should take steps to ensure the availability and reliability of their black start units during adverse weather and emergency conditions, particularly to prevent a gap in this function before 2013, when the provisions of Reliability Standard EOP-005-2 on System Restoration from Blackstart Resources become mandatory. These steps should ideally include auditing Generator Owner/Operators, random testing of black start units during temperature extremes (both hot and cold), determining the ambient operating temperature limitations of the black start units, evaluating the effects of extreme temperatures on implementation of the entity’s black start plan, and ensuring that operators are trained to start the black start units during extreme weather conditions. ERCOT is presently considering Protocol revisions that would provide for unannounced testing of black start units and “claw back” payments for black start units that fail testing or fail to perform.

A2B3C03 – Post-maintenance / post-modification testing LTA; Definition - The post-maintenance or post-modification testing specified was not performed or was performed incorrectly. The post-maintenance or post-modification testing was completed, but the testing requirements were less than adequate. The post-maintenance or post-modification testing was not performed in accordance with the schedule for testing.

Many generators failed to adequately prepare for winter, including the following: failed or inadequate heat traces, missing or inadequate wind breaks, inadequate insulation and lagging (metal covering for insulation), failure to have or to maintain heating elements and heat lamps in instrument cabinets. (page 196)

A2B6C01 – Defective or failed part; Definition - A part/instrument that lacked something essential to perform its intended function. The degraded performance of a part or a component contributed to the failure of the component, equipment, or system. Note: This does not explain why the object failed or was defective. Therefore, this node should be multiple coded.

The reason blackouts had to be initiated was that over 29,000 MW of generation that was committed in the day-ahead market or held in reserve either tripped, was derated, or failed to start.

A3 Individual Human Performance LTA

A3 - Human Performance LTA; Definition - An event or condition resulting from the failure, malfunction, or deterioration of the human performance associated with the process. Note: Strictly speaking, A3B1, A3B2, & A3B3 nodes are only applicable when “problem-solving,” although this does not have to be conscious. These are not the intended coding when not engaged in solving a problem, e.g., falling asleep because of prescription medication [which might be A3B4C01 or A5B4C06]. Further, these codes are for individual actions or lack thereof. If an event has multiple occurrences of the same A3 C node[s], it is time to look for other rationale behind the behavior. Yes, there are single examples of group performance that is LTA. However, when it is multiple examples, there is usually another explanation. For example, the control room operators at Three Mile Island mutually incorrectly diagnosed several of the accident indications and also mutually avoided application of several potential recovery paths. These errors were eventually traced to how their training had treated these potentialities.

(Report recommendation #23, page 211). WECC should review its Reliability Coordinator procedures for providing notice to Transmission Operators and Balancing Authorities when another Transmission Operator or Balancing Authority within WECC is experiencing a system emergency (or likely will experience a system emergency), and consider whether modification of those procedures is needed to expedite the notice process.

o The Task Force observed a lag in communicating a declared system emergency in WECC. In one instance, a Reliability Coordinator did not issue an EEA 3 declaration until seven minutes after the decision had been made to do so; the delayed declaration appeared to have been the first official notice by the Reliability Coordinator to other WECC entities of the seriousness of the generation failures on the system of the Balancing Authority in question.

(Report recommendation #26, page 212). Transmission Operators should train operators in proper load shedding procedures and conduct periodic drills to maintain their load shedding skills.

o The task force found that at least one Transmission Operator in WECC experienced a minor delay in initiating its load shedding sequence, due to problems notifying the concerned Distribution Provider. Another Transmission Operator experienced delay in executing its load shedding because the individual operators had never shed load before and had not had recent drills. These incidents underscore the necessity of adequate training in load shedding procedures.

A4 Management/Organizational Challenges

A4B1C03 - Management direction created insufficient awareness of the impact of actions on safety / reliability; Definition - Management failed to provide direction regarding safeguards against non-conservative actions by personnel concerning quality, safety or reliability.

(Report recommendation #1, page 197). Balancing Authorities, Reliability Coordinators, Transmission Operators and Generation Owner/Operators in ERCOT and in the southwest regions of WECC should consider preparation for the winter season as critical as preparation for the summer peak season.

o The large number of generating units that failed to start, tripped offline or had to be derated during the February event demonstrates that the generators did not adequately anticipate the full impact of the extended cold weather and high winds. While plant personnel and system operators, in the main, performed admirably during the event, more thorough preparation for cold weather could have prevented many of the weather-related outages.

(Report recommendation #4, page 198). ERCOT should reconsider its protocol that requires it to approve outages if requested more than eight days before the outage, consider giving itself the authority to cancel outages previously scheduled, and expand its outage evaluation criteria.

o ERCOT’s Protocols provide that it may not forbid an outage request submitted more than eight days prior to the scheduled outage, unless the outage would keep ERCOT from meeting applicable Reliability Standards or Protocol requirements. The Protocols further limit review of outage requests made earlier than eight days before the outage to the following three things: load forecast, other known outages of both generation and transmission, and the results of a contingency analysis to indicate whether the outages would cause overloads or voltage problems.
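
The eight-day rule described above can be read as a small decision function. The sketch below is a simplified interpretation of the text, not ERCOT's actual Protocol logic: the function name and parameters are hypothetical, and the treatment of requests made eight days or fewer before the outage is an assumption, since the text does not say how those are handled.

```python
from datetime import date

ADVANCE_NOTICE_DAYS = 8  # threshold quoted in the text above


def may_forbid_outage(requested_on: date, outage_start: date,
                      breaches_requirements: bool) -> bool:
    """Return True if, under this simplified reading, a request may be refused.

    breaches_requirements means the outage would keep ERCOT from meeting
    applicable Reliability Standards or Protocol requirements.
    """
    days_ahead = (outage_start - requested_on).days
    if days_ahead > ADVANCE_NOTICE_DAYS:
        # Requests made more than eight days ahead can only be refused
        # when the outage would breach those requirements.
        return breaches_requirements
    # Assumption (not stated in the text): later requests are discretionary.
    return True


# A request made a month ahead that breaches no requirement cannot be refused.
print(may_forbid_outage(date(2011, 1, 1), date(2011, 2, 1), False))
```

Read this way, the rule leaves no room to weigh a forecast of severe cold against an outage that was requested early, which is the gap recommendation #4 asks ERCOT to reconsider.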

Some entities failed to take action to safeguard their own equipment by ensuring that proper winterization procedures were used to protect critical equipment and to ensure that the equipment was in working order (lack of winterization procedures, not running or pre-warming units during the night). [analysis comment]

A4B1C04 - Management follow-up or monitoring of activities did not identify problems; Definition - Management's methods for monitoring the success of initiatives were ineffective in identifying shortcomings in the implementation.

Management follow-up or monitoring of activities did not identify problems, as evidenced by: lack of accountability and senior management review, lack of an adequate inspection and maintenance program, and failure to perform engineering analyses to determine the correct capability needed for their protection equipment. [analysis comment]

These winterization techniques are well known and were recommended for use after the 1989 event. There was a failure to ensure that the initiatives that began after that event continued. It is possible that these initiatives were lost as companies sold off plants or as experienced personnel left. That is why it is critical that management keep track of these issues and ensure they are passed on and continued by succeeding management. [analysis comment]

A4B1C06 – Previous industry or in-house experience was not effectively used to prevent recurrence; Definition - Industry or in-house experience relating to a current problem that existed prior to the event, but was not assimilated by the organization. Note: This code is not necessarily limited to the site’s formal lessons learned program. It can apply to any event of which the facility had been made aware.

During the February event, temperatures were considerably lower (15 degrees plus) than average winter temperatures, and represented the longest sustained cold spell in 25 years. Steady winds also accelerated equipment heat loss. However, such a cold spell was not unprecedented. The Southwest also experienced temperatures considerably below average, accompanied by generation outages, in December 1989. Less extreme cold weather events occurred in 2003 and 2010. Many generators failed to adequately apply and institutionalize knowledge and recommendations from previous severe winter weather events, especially as to winterization of generation and plant auxiliary equipment. (page 195)

Several other cold weather events had occurred before this event, both in Texas and in other locations. The report also notes that recommendations made after the 1989 event, had they been implemented and kept in place, might have prevented or at least moderated many of the challenges faced during the event. [analysis comment]

It is also noted that newer plants had a slightly higher rate of failure than older plants, pointing to a lack of learning from industry experience. [analysis comment]

A4B1C09 - Corrective action for previously identified problem or event was not adequate to prevent recurrence; Definition - Management failed to take meaningful corrective action for consequential or non-consequential events.

(Report recommendation #5, page 199) ERCOT should consider modifying its procedures to (i) allow it to significantly raise the 2300 MW responsive reserve requirement in extreme low temperatures, (ii) allow it to direct generating units to utilize preoperational warming prior to anticipated severe cold weather, and (iii) allow it to verify with each generating unit its preparedness for severe cold weather, including operating limits, potential fuel needs and fuel switching abilities.

o ERCOT data on forced outages during the 50 coldest days between 2005 and 2011 show a correlation between low temperatures and forced outages. This was demonstrated not only by the February 2011 event but also by the 1989 event; in both cases, extremely low temperatures led to the loss of large amounts of generation and the implementation of rolling blackouts.

o Increasing the amount of responsive reserves going into a cold weather event would compensate for the probability that a number of generating units might fail, and would provide better response to system instability in the event of such losses.

o Additionally, pre-operational warming would help prevent freezing and identify other operational problems. Running a unit prior to the start of extreme cold weather would utilize the unit’s own radiant heat to help prevent freezing. And starting it up would permit correction of any problems that otherwise would not be noticed until the unit was called upon for performance.

o While pre-operational warming has considerable value, issues of whether or how generators are to be compensated for taking such actions at ERCOT’s direction would need to be addressed.

During the February event, temperatures were considerably lower (15 degrees plus) than average winter temperatures, and represented the longest sustained cold spell in 25 years. Steady winds also accelerated equipment heat loss. However, such a cold spell was not unprecedented. The Southwest also experienced temperatures considerably below average, accompanied by generation outages, in December 1989. Less extreme cold weather events occurred in 2003 and 2010. Many generators failed to adequately apply and institutionalize knowledge and recommendations from previous severe winter weather events, especially as to winterization of generation and plant auxiliary equipment. (page 195)

A4B2C07 - Means not provided for assuring adequate availability of appropriate materials / tools; Definition - A process for supplying personnel with appropriate materials or tools did not exist.

(Report recommendation #19, page 208) Each Generator Owner/Operator should take steps to ensure that winterization supplies and equipment are in place before the winter season, that adequate staffing is in place for cold weather events, and that preventative action in anticipation of such events is taken in a timely manner.

o Specifically, the task force recommends:

– Each Generator Owner/Operator should maintain a sufficient inventory of supplies at each generating unit necessary for extreme weather preparations and operations.

– Each Generator Owner/Operator should place thermometers in rooms containing equipment sensitive to cold and in freeze protection enclosures to ensure that temperature is being maintained above freezing and to determine the need for additional heaters or other freeze protection devices.

– During extreme cold weather events, each Generator Owner/Operator should schedule additional personnel for around-the-clock coverage.

– Each Generator Owner/Operator should evaluate whether it has sufficient electrical circuits and capacity to operate portable heaters, and perform preventive maintenance on all portable heaters prior to cold weather.

– Each Generator Owner/Operator should drain any non-critical service water lines in anticipation of severe cold weather.
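The thermometer recommendation in the list above amounts to a threshold check on each monitored location. A minimal sketch in Python follows. The location names, the readings, the Fahrenheit units and the 5-degree safety margin are all hypothetical illustrations and do not come from the report.

```python
FREEZING_F = 32.0  # freezing point of water in degrees Fahrenheit


def locations_needing_heat(readings_f, margin_f=5.0):
    """Return the locations whose reading is at or below freezing plus a margin.

    readings_f maps a location name to its temperature in degrees Fahrenheit.
    margin_f is a hypothetical safety margin, not a value from the report.
    """
    threshold = FREEZING_F + margin_f
    return sorted(name for name, temp in readings_f.items() if temp <= threshold)


# Hypothetical readings from rooms and freeze protection enclosures.
readings = {
    "instrument cabinet A": 41.0,
    "sensing line enclosure": 30.5,
    "pump room": 36.0,
}
print(locations_needing_heat(readings))
```

Locations returned by a check of this kind would be candidates for additional heaters or other freeze protection devices.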

A4B3C07 - Job scoping did not identify potential task interruptions and/or environmental stress; Definition - The work scoping process was not effective in detecting reasonable obstructions to work flow (e.g., shift changes) or the impact of environmental conditions. Note: This code applies to disruptions of circadian rhythms [biological functions based on 24-hour schedule] caused by scheduling of work.

Balancing authorities, reliability coordinators and generators often lacked adequate knowledge of plant temperature design limits, and thus did not realize the extent to which generation would be lost when temperatures dropped. (page 196)

Transmission operators and distribution providers generally did not identify natural gas facilities such as gathering facilities, processing plants or compressor stations as critical and essential loads. (page 196)

A4B3C08 - Job scoping did not identify special circumstances and/or conditions; Definition - The work scoping process was not effective in detecting work process elements having a dependency upon other circumstances or conditions.

Transmission operators and distribution providers generally did not identify natural gas facilities such as gathering facilities, processing plants or compressor stations as critical and essential loads. (page 196)

The scoping processes for starting up the units did not take the cold into account; doing so might have led companies to pre-warm units or to allow them to run overnight at minimum. [analysis comment]

A4B3C09 - Work planning not coordinated with all departments involved in task; Definition - Interdepartmental communication and teamwork did not support the work flow being planned. Note: The key word is “coordinated.” By not getting input from affected departments, the work plan is likely not to succeed.

ERCOT and the generators within ERCOT could better coordinate generator scheduled outages, both in terms of the total amount of scheduled outages at a given time and their location. A substantial amount of generation (11,566 MW) was on scheduled outage going into the cold weather event. ERCOT’s current Protocols provide that requests for scheduled outages submitted earlier than eight days before the outage is to begin are automatically approved, unless they would violate a Reliability Standard. (page 196)
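The automatic approval rule described above can be written as a short predicate. This is a sketch under stated assumptions, not ERCOT's implementation. The function and parameter names are hypothetical, and "earlier than eight days before" is read here as a lead time strictly greater than eight days, which the Protocols may define differently.

```python
from datetime import date, timedelta

# Assumed reading of the rule: the lead time must exceed eight days.
AUTO_APPROVAL_LEAD = timedelta(days=8)


def is_auto_approved(submitted, outage_start, violates_reliability_standard):
    """Return True if a scheduled outage request would be approved automatically."""
    lead_time = outage_start - submitted
    return lead_time > AUTO_APPROVAL_LEAD and not violates_reliability_standard


# Hypothetical requests for an outage starting February 1, 2011.
print(is_auto_approved(date(2011, 1, 10), date(2011, 2, 1), False))  # True
print(is_auto_approved(date(2011, 1, 25), date(2011, 2, 1), False))  # False
print(is_auto_approved(date(2011, 1, 10), date(2011, 2, 1), True))   # False
```

As sketched, the check looks at each request on its own. It does not consider the total amount of generation already on scheduled outage or its location, which are the two aspects of coordination this finding singles out.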

The scoping processes for starting up the units did not take the cold into account; doing so might have led companies to pre-warm units or to allow them to run overnight at minimum. [analysis comment]

A4B5C05 - System interactions not considered; Definition - Changes to processes or physical systems caused interactions with other processes or physical systems that were not identified prior to implementation.

NERC-FERC report recommendations 3, 22, 25

(Report recommendation #3, page 198) Balancing Authorities and Reserve Sharing Groups should review the distribution of reserves to ensure that they are useable and deliverable during contingencies.

o This recommendation is designed to ensure that Balancing Authorities take into account transmission constraints, other demands on reserve sharing resources, the possibility that more than one reserve sharing group member might experience simultaneous emergencies, and other factors that might affect the availability or deliverability of reserves. ERCOT is currently considering a similar recommendation, which was presented to its Board of Directors in March, 2011.

(Report recommendation #22, page 210) ERCOT should review and modify its Protocols as needed to give Transmission Service Providers and Distribution Service Providers in Texas access to information about loads on their systems that could be curtailed by ERCOT as Load Resources or as Emergency Interruptible Load Service.

o Some ERCOT Transmission Service Providers expressed concern that they have virtually no information regarding loads on their own systems that may be deployed by ERCOT as Load Resources or Emergency Interruptible Load Service resources. These loads contract directly with ERCOT, and the Transmission Service Provider does not receive information about their status. When these loads are shed by ERCOT without prior notification to the Transmission Service Providers and Distribution Service Providers, they have the potential to cause localized imbalances in line flows, voltages, and other system parameters that may be problematic.

(Report recommendation #25, page 211) Transmission Operators and Distribution Providers should conduct critical load review for gas production and transmission facilities, and determine the level of protection such facilities should be accorded in the event of system stress or load shedding.

o Keeping gas production facilities in service is critical to maintaining an adequate supply of natural gas, particularly in the Southwest where there is a relatively small amount of underground gas storage. And keeping electric-powered compressors running can be important in maintaining adequate pressure in gas transmission lines.

Transmission operators and distribution providers generally did not identify natural gas facilities such as gathering facilities, processing plants or compressor stations as critical and essential loads. (page 196)

A6 Training Deficiency

A6B2C01 - Practice or “hands-on” experience LTA; Definition - The on-the-job training did not provide opportunities to learn skills necessary to perform the job. There was insufficient on-the-job training. There was an inadequate amount of preparation before performing the activity. The employee had not previously performed the task under direct supervision.

(Report recommendation #26, page 212) Transmission Operators should train operators in proper load shedding procedures and conduct periodic drills to maintain their load shedding skills.

o The task force found that at least one Transmission Operator in WECC experienced a minor delay in initiating its load shedding sequence, due to problems notifying the concerned Distribution Provider. Another Transmission Operator experienced delay in executing its load shedding because the individual operators had never shed load before and had not had recent drills. These incidents underscore the necessity of adequate training in load shedding procedures.

For the Southwest as a whole, 67 percent of the generator failures (by MWh) were due directly to weather-related causes, including frozen sensing lines, frozen equipment, frozen water lines, frozen valves, blade icing, low temperature cutoff limits, and the like. At least another 12 percent were indirectly attributable to the weather (occasioned by natural gas curtailments to gas-fired generators and difficulties in fuel switching). Generators were generally reactive as opposed to being proactive in their approach to winterization and preparedness. The single largest problem during the cold weather event was the freezing of instrumentation and equipment. Many generators failed to adequately prepare for winter, including the following: failed or inadequate heat traces, missing or inadequate wind breaks, inadequate insulation and lagging (metal covering for insulation), failure to have or to maintain heating elements and heat lamps in instrument cabinets, failure to train operators and maintenance personnel on winter preparations, lack of fuel switching training and drills, and failure to ensure adequate fuel. (page 196)

A6B2C03 - Refresher training LTA; Definition - Training updates were not performed. Continuing training was not performed to keep employees equipped to perform non-routine tasks. The frequency of continuing training was inadequate. The frequency of refresher training was not sufficient to maintain the required knowledge and skills.

(Report recommendation #18, page 208) Each Generator Owner/Operator should develop and annually conduct winter-specific and plant-specific operator awareness and maintenance training.

o Operator training should include awareness of the capabilities and limitations of the freeze protection monitoring system, proper methods to check insulation integrity and the reliability and output of heat tracing, and prioritization of repair orders when problems are discovered.

Appendix: About This Report

Authority

This report was prepared by NERC in its capacity as the electric reliability organization (ERO), and provides an assessment of the reliability issues surrounding a major event that occurred on the BPS. Section 215(g) of the Energy Policy Act of 2005 provides that “The ERO is to conduct periodic assessments of the reliability and adequacy of the BPS in North America.”

Cause Coding Methodology for Larger Events

Larger events, or events that involve multiple entities, demand that the analyst approach cause coding differently. For a more complete review of the NERC cause coding process see NERC_Cause_Code_Assignment_Process. This report has been cause coded as a single large event, so as not to concentrate the analysis on any single entity or portion of the event. While a contributing cause may be identified more than once, the cause code is captured only once, with potentially several instances of supporting evidence. It is often useful to see if the causes of a larger event are present in the emerging trends of smaller events. Using this approach, it is likely that the overall event’s root cause cannot be determined, while smaller portions of the event or individual entities might be able to ascertain the root cause of their particular contribution to the overall event.
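The "captured once, with several instances of supporting evidence" approach maps naturally onto a dictionary keyed by cause code. The sketch below is illustrative only. The class and function names are hypothetical, and the code pattern (an A level, optionally followed by B and C levels) is inferred from the codes that appear in this report rather than taken from the NERC cause coding process documentation.

```python
import re
from collections import defaultdict

# Pattern inferred from codes in this report, e.g. "A6" or "A4B3C08".
CODE_PATTERN = re.compile(r"^A(\d)(?:B(\d)C(\d{2}))?$")


def parse_cause_code(code):
    """Split a cause code into its A, B and C levels (B and C may be absent)."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognized cause code: {code!r}")
    a, b, c = match.groups()
    return {"A": int(a), "B": int(b) if b else None, "C": int(c) if c else None}


class CauseCodeRegister:
    """Hold each cause code once, with a list of supporting evidence."""

    def __init__(self):
        self._evidence = defaultdict(list)

    def record(self, code, evidence):
        parse_cause_code(code)  # reject malformed codes before storing
        self._evidence[code].append(evidence)

    def codes(self):
        return sorted(self._evidence)

    def evidence_for(self, code):
        return list(self._evidence.get(code, []))


register = CauseCodeRegister()
register.record("A4B3C08", "Natural gas facilities not identified as critical loads (page 196)")
register.record("A4B3C08", "Unit start-up scoping did not account for the cold [analysis comment]")
register.record("A6B2C01", "Report recommendation #26, page 212")

print(register.codes())                       # ['A4B3C08', 'A6B2C01']
print(len(register.evidence_for("A4B3C08")))  # 2
```

Recording a second piece of evidence against A4B3C08 extends that code's evidence list without creating a second entry for the code.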

For More Information

If you have specific questions about this report, or the processes used to create it, please contact Jule Tate, NERC Manager of Event Analysis, at [email protected] or (919) 550-3993.