Sunday, October 4, 2009

Who Owns Monitoring?

Large companies are often able to support a 24x7 operations center, which make a natural place for monitoring to occur though perhaps not for the administration of those tools. Many of our smaller customers have struggled with the question of where their monitoring and its administration belongs organizationally. By default monitoring tasks generally end up the responsibility of either a few developers or with the administrators responsible for the systems in the production environment. Naturally just because something is the default doesn't make it right.

Monitoring tools generally come into the organization via three sources: operations, support, or development. How tools come into an organization can say a lot about organizational culture. For example, operations and support will generally bring in a tool because they feel constrained by the visibility they have into the production applications and environments -- they have angry customers, but no way to appease them. Developers tend to focus on monitoring tools that enable them to build better applications, but often overlook the importance of tools in the production environment. The team that brings in the tools often ends up as the administrators of the tools, regardless of whether they have the appropriate skills or resources to provide adequate support.

For our customers who are trying to make the most of restricted budgets and headcount, we've found some common trends that lead to success.

- Appoint a champion: Find someone within the organization who has adequate time, skills, and organizational knowledge to become the dedicated subject matter expert for service management.
- Define a process for monitoring and triage: Make it clear who is responsible for operational monitoring and the process for escalation.
- Administration of monitoring tools and actual the monitoring may be separate teams or personnel.
- Don't overlook support: Adding a centralized support engineering team within a support organization can streamline problem resolution.
- Defining a virtual triage team made from experts from database, networking, systems, and application teams that can be immediately activated in crisis situations is a better approach than waiting for the crisis and then taking action.

Ultimately who owns monitoring in an organization is driven by organizational needs and culture, but should be driven by a concerted decision rather than indecision.

No comments: