The Operations and Support Discipline: Scaling Agile Software Development
The Enterprise Unified Process™ (EUP) extends iterative/agile processes such as the Rational Unified Process (RUP), Extreme Programming (XP), or Scrum with an Operations and Support discipline, which reflects many of the best practices described within the Information Technology Infrastructure Library (ITIL) for IT Service Management (ITSM). Like any phase or discipline within the EUP, your organization will apply the activities contained within this discipline in a manner that reflects your environment. Organizations that develop and deploy systems in-house do more in terms of operational support than companies that produce shrink-wrapped software. The latter may spend more time and effort on support to keep their diversified base of paying customers satisfied, and may have no operational staff at all, while the former may have an entire team dedicated just to operations.
The high-level workflow for the Operations and Support discipline is depicted in Figure 1 and the detailed amalgamated workflow in Figure 2. The primary goal of the Operations and Support discipline is to operate and support your software in a production environment. The focus of operations is to ensure that software is running properly, that the network is available and monitored, and that the appropriate data is backed up and restored as needed. Disaster plans are created, and in the event a disaster occurs, they are executed to restore primary systems. The focus of support is to assist end users by answering their questions, analyzing the problems that they are encountering with production systems, recording requests for new functionality, and making and applying fixes. Furthermore, an important message of this discipline is that in order for it to succeed your organization must be as agile as possible: it is possible for enterprise-level professionals (including operations and support staff) to work in an agile manner, but they must choose to do so and be allowed to do so.
Figure 1. The Operations and Support discipline workflow.

Figure 2. The amalgamated workflow of the Operations and Support discipline.

A critical success factor within this discipline is planning the deployment of a system into your production environment. This effort augments the Deployment discipline to include planning how a system will be operated and supported after it is deployed. The support manager must define the support plan prior to deploying a system into production. The system support plan should address:
Similarly, the operations manager, working closely with the project teams, creates the operations plan. The system operations plan defines how the system will be operated while it is in production.
There are two basic strategies for delivering support: an escalation strategy and a touch-and-hold strategy. The escalation strategy is based on the idea that most support requests are fairly basic and can therefore be handled quickly, whereas a small minority of requests are complicated and must be assigned or escalated to more knowledgeable staff members. This approach scales well, although hand-offs between support staff can be frustrating for the person being supported. With the touch-and-hold strategy, the person who initially took the support request follows it through to the end, although this person may have to work with other people to fulfill the request. The touch-and-hold strategy typically results in greater customer satisfaction because the support requester only needs to deal with one support engineer. However, this approach requires highly skilled support people and is difficult to scale because it can be hard to hire and retain such staff. More information about incident management, problem reporting, and service desks can be found in their associated ITIL Fact Sheets.
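The difference between the two support strategies comes down to who owns a request over its lifetime. The following is a minimal Python sketch of that distinction, not part of the EUP itself: the `SupportRequest` class, its fields, and the names `escalate` and `touch_and_hold` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SupportRequest:
    """Hypothetical record of one support request (illustrative only)."""
    summary: str
    owner: str                                         # person the requester deals with
    helpers: List[str] = field(default_factory=list)   # staff consulted by the owner
    handoffs: int = 0                                  # times ownership has changed


def escalate(request: SupportRequest, more_senior_engineer: str) -> None:
    """Escalation strategy: ownership moves to more knowledgeable staff."""
    request.owner = more_senior_engineer
    request.handoffs += 1


def touch_and_hold(request: SupportRequest, specialist: str) -> None:
    """Touch-and-hold strategy: the original owner stays and pulls in help."""
    request.helpers.append(specialist)


# The same complicated request handled under each strategy.
escalated = SupportRequest("Report export fails", owner="alice")
escalate(escalated, "bob")

held = SupportRequest("Report export fails", owner="alice")
touch_and_hold(held, "bob")
```

Under escalation the requester ends up dealing with a second person and the hand-off count rises; under touch-and-hold the requester keeps a single point of contact, which is why that strategy depends on the first responder being highly skilled.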
The goal of this activity, depicted in Figure 3, is to operate systems in a production environment. Two main roles are associated with this activity:
Operator. This person is responsible for keeping systems running, backing up and restoring data based on the operations plan and requirements of the system, managing any problems, performing periodic cleanup, performing fine tuning and any system reconfigurations, monitoring systems, and redeploying systems as necessary. Because operations support is usually a 24/7 activity, a hand-off protocol should be defined and followed to address the process of handing off a problem if it occurs between shifts, along with a definition of what each team is responsible for completing both prior to and at the completion of a shift.
Support Developer. This person is responsible for applying maintenance fixes to the system via hot fixes (also known as service packs or patches). This is often a member of the development team. As with your development environment, always test any fine tuning or reconfiguration of programs or systems in a test area prior to deploying it to the production environment, and have your back-out plan ready to put into place if serious system problems occur.
Figure 3. Operate systems workflow details.
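The shift hand-off protocol mentioned for the Operator role could be captured as a simple checklist that must be clear before the incoming shift takes over. The Python sketch below is one possible shape for such a record, under the assumption that each shift has a list of end-of-shift duties and that every open problem needs a status note. The class name `ShiftHandoff`, its fields, and the example task names are hypothetical and are not defined by the EUP.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShiftHandoff:
    """Hypothetical hand-off record between two operations shifts."""
    outgoing_operator: str
    incoming_operator: str
    # End-of-shift duties mapped to whether each one was completed.
    end_of_shift_tasks: Dict[str, bool] = field(default_factory=dict)
    # Problems still open at shift change, each mapped to a status note.
    open_problems: Dict[str, str] = field(default_factory=dict)

    def blockers(self) -> List[str]:
        """Return the reasons this hand-off is not yet complete."""
        reasons = [
            f"task not done: {name}"
            for name, done in self.end_of_shift_tasks.items()
            if not done
        ]
        reasons += [
            f"problem lacks status note: {problem}"
            for problem, note in self.open_problems.items()
            if not note.strip()
        ]
        return reasons


handoff = ShiftHandoff(
    outgoing_operator="night-shift",
    incoming_operator="day-shift",
    end_of_shift_tasks={
        "verify nightly backup": True,
        "review monitoring alerts": False,
    },
    open_problems={"disk usage high on app server": ""},
)

for reason in handoff.blockers():
    print(reason)
```

An empty list from `blockers()` would mean the outgoing shift has met its obligations and every open problem carries enough context for the incoming operator to continue working on it.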

Disaster recovery defines the steps you will follow to get your critical systems back up and running in case something catastrophic happens. A disaster in this context could be a natural one, such as a hurricane or a tornado that destroys your entire Network Operations Center (NOC). It could also be a man-made one, such as the electrical blackout that struck the American northeast in August 2003. More information about IT Service Continuity Management can be found in the associated ITIL Fact Sheet.
Your organization may ultimately have to execute the disaster recovery plan. Be sure to have the disaster recovery plan in hard-copy form kept with multiple people in multiple locations: when a disaster occurs you may not have access to the electronic versions. The Operations Manager is responsible for executing the disaster recovery plan, and that person will work with the various project teams when necessary in order to recover the systems. The final output of a successful disaster recovery is to have your systems running, according to the plan, on a contingency platform. The operations manager is also responsible for reviewing the recovery effort to identify and then act on the lessons learned.
Copyright © 2004-2012 Scott W. Ambler
This site owned by Ambysoft Inc.