Sunday, October 4, 2009

Can we virtualize the virtual servers?

I'm actually surprised that I haven't heard this question yet. For the last few months it seems everyone has been budget conscious to the point of the ridiculous. It's been difficult to convince customers that somewhere out there, real hardware still exists. BAC is supported on VMware, but based on our field experience we cannot recommend anything less than separate, dedicated DPS and Gateway hardware. Even getting virtual servers allocated has been a struggle. It seems like many organizations, in their zeal to reap the benefits of virtualization, have forgotten that under the hood there has to be sufficient hardware to actually support the number of virtual machines, especially if all the VMs are going to operate at full capacity.

What's perhaps really surprising to me is that companies continue to put full PCs on the desks of all of their employees rather than equipping them with ($400) netbooks or even just Blackberries or iPhones. So a $5,000 server for a core business process is out, but a $1,000 PC belongs on everyone's desk, even those employees who work remotely or who only need a machine for basic word processing and email tasks.

I have to admit, I don't understand the penny-wise, pound-foolish thinking about physical hardware. Four weeks ago my laptop, a two-year-old MacBook Pro, suffered a hardware failure at 5:00pm on a Friday night. I knew that it would likely be at least a week before I could get it back from Apple - and a week's worth of lost productivity would easily be more than the cost of a new machine. I barely hesitated to drop the three grand for a new laptop (out of my own pocket), yet companies with a national footprint, who need that hardware to run mission critical systems, aren't able or willing to do the same. Certainly there is a difference between the hard drive in my laptop and RAID 5 network attached storage, but nonetheless disk is cheap by all accounts. Utility computing via "cloud" providers is driving costs down as well, and it turns the argument that "we can't get hardware for this project because we don't have budget" into a non-starter. There's really no good excuse for the view that "hardware" has to be treated as a precious resource.

Who Owns Monitoring?

Large companies are often able to support a 24x7 operations center, which makes a natural place for monitoring to occur, though perhaps not for the administration of those tools. Many of our smaller customers have struggled with the question of where their monitoring and its administration belong organizationally. By default, monitoring tasks generally end up as the responsibility of either a few developers or the administrators responsible for the systems in the production environment. Naturally, being the default doesn't make it right.

Monitoring tools generally come into the organization via three sources: operations, support, or development. How tools come into an organization can say a lot about organizational culture. For example, operations and support will generally bring in a tool because they feel constrained by the visibility they have into the production applications and environments -- they have angry customers, but no way to appease them. Developers tend to focus on monitoring tools that enable them to build better applications, but often overlook the importance of tools in the production environment. The team that brings in the tools often ends up as the administrators of the tools, regardless of whether they have the appropriate skills or resources to provide adequate support.

For our customers who are trying to make the most of restricted budgets and headcount, we've found some common trends that lead to success.

- Appoint a champion: Find someone within the organization who has adequate time, skills, and organizational knowledge to become the dedicated subject matter expert for service management.
- Define a process for monitoring and triage: Make it clear who is responsible for operational monitoring and the process for escalation.
- Administration of monitoring tools and the actual monitoring may be handled by separate teams or personnel.
- Don't overlook support: Adding a centralized support engineering team within a support organization can streamline problem resolution.
- Defining a virtual triage team made up of experts from the database, networking, systems, and application teams that can be immediately activated in crisis situations is a better approach than waiting for the crisis and then taking action.

Ultimately, who owns monitoring in an organization is shaped by organizational needs and culture, but it should be the result of a deliberate decision rather than indecision.

Friday, October 2, 2009

Duct Tape Programming

Joel Spolsky has hit the mark again with his recent blog posting "The Duct Tape Programmer" - http://joelonsoftware.com/items/2009/09/23.html. There are two key quotes: "you're not here to write code; you're here to ship products" and "Overengineering seems to be a pet peeve of yours." That's the premise in a nutshell -- quit over-engineering, start shipping. The only thing I fear, and some of the commentary across the net expresses this, is that too many will miss his message and instead read Spolsky as justifying sloppy decisions. That's not it at all. Instead, he's taking the experienced, pragmatic approach, which could best be summarized as "just simple enough." A good read, as are his other postings.

Management by Flying Around

Being up at 4:30 am for a long drive to Portland gave me a chance to reflect on a problem I've been mulling over for a while now.

For decades, thinking about business and management has been driven by sports and military analogies and experiences. The post-war generation that built the United States into the world's largest economy brought practices and organizational structures from their military experiences. Even within technology we are not immune to this. When I first saw Scrum, an "agile" method for developing software, my immediate reaction was "This is exactly how the Romans structured their military command, 2000 years ago!" We intuitively understand command-control management, work in "teams," "quarterback" meetings, and of course what executive doesn't play golf?

J9's consultants are located all across the United States - I've never met in person some of the people I work closely with, and others I see in person only rarely. The tactics commonly deployed and many of the management techniques of the past quickly fall apart when you don't have the proverbial water-cooler. The inter-personal issues -- health, relationships, personal interests -- become difficult to track, and yet plenty of research has shown management empathy to personal needs to be a significant factor in employee retention and job satisfaction. Career planning and reviews, especially when criticism needs to be leveled, are lost when time zones and thousands of miles separate your staff.

It isn't simply a problem in personnel management either. I recall vividly the first time I saw a Gantt chart, at age 13. Those colorful bars and perfectly placed diamond milestones sparkled with their organizational efficiency. Perfection, yet completely useless if your project consists of loosely related tasks without strict dependencies, especially one where the personnel ebb and flow in and out of the project. Installing a piece of software -- there's something you can put on a Gantt chart. Whether the customer has successfully developed the skills to support the software? A less well-defined task.

So here comes the summary: Companies are ever more virtualized, global, and 24x7, and it isn't just at the largest companies or in the executive office that these demands appear. The management practices of the past, with their roots in industrialism, simply aren't working. I don't yet know what the answer is, but change is imminent.

Wednesday, September 30, 2009

Happy Days are Here Again

Recession? What recession? J9 is actively seeking Solution Architects. Are you an experienced consultant who understands the benefits of life at a smaller firm, where you can direct your career? Check out our posting here: http://www.j9tech.com/careers.html and apply today.

Monday, July 6, 2009

But did you do the phosphorus test?

I heard the phone clang down and my colleague Steve distraughtly mumble "She's going to kill the fish." His wife called to tell him about a phosphorus problem in their fish tank at home. She's a medical researcher, a biologist by training. Steve's first reaction when she told him there was a phosphorus problem was to ask if she had in fact done a phosphorus test. No, she said, but she'd run through all of the other chemical and algae tests, so of course it had to be the phosphorus and thus she'd started adding more phosphorus to the tank -- they'd know in a few days if that was the problem. Steve, imagining coming home to a tank of dead fish, was not impressed that his scientist wife had failed to use the scientific method at home.

It's so often like that in technology as well. Despite years of rigorous training to use the scientific method to guide our actions (it is called "computer science" for a reason), it's easy to throw all that away when faced with a challenge. A customer came to me the other day asking about monitoring tools to help with a production triage situation for a failing web service. A developer assigned to the task interrupted us saying that a fix had been deployed ten minutes prior and it looked like it was working. Let's reflect upon that:

a) No load or performance testing scripts existed for this web service.
b) No monitoring or profiling tools had been deployed with this service in either a pre-production or production setting.
c) A hopeful fix had been hot-deployed to production and left to run for a mere ten minutes before victory was declared.
d) No permanent monitoring was put in place to prevent the next occurrence of the problem.
e) Apart from a few manual executions of the service and a face-value assessment by one individual, no further validation to correlate the fix with the perceived problem occurred.
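Point (e) is easy to make concrete. Here is a minimal sketch, in Python, of the kind of check that could have been run before declaring victory: a one-sided two-proportion z-test comparing the failure rate before and after the fix. The function name and all of the request counts are hypothetical, invented purely for illustration; nothing here comes from the customer's actual system.

```python
import math

def failure_rate_dropped(fail_before, total_before, fail_after, total_after):
    """One-sided two-proportion z-test.

    Returns (z, p_value), where p_value is the probability of seeing a drop
    at least this large if the fix actually changed nothing.
    """
    p_before = fail_before / total_before
    p_after = fail_after / total_after
    # Pooled failure rate under the assumption that the fix had no effect.
    pooled = (fail_before + fail_after) / (total_before + total_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_before + 1 / total_after))
    if se == 0:
        return 0.0, 0.5  # no failures at all (or only failures): nothing to compare
    z = (p_before - p_after) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for a standard normal
    return z, p_value

# Hypothetical numbers: 30 failures in 200 calls before the fix.
# A handful of manual executions afterwards, all successful:
_, p_few = failure_rate_dropped(30, 200, 0, 5)
# A scripted run of 500 calls afterwards, all successful:
_, p_many = failure_rate_dropped(30, 200, 0, 500)

print(f"5 clean calls:   p = {p_few:.3f}")
print(f"500 clean calls: p = {p_many:.2e}")
```

With these made-up counts, five clean manual calls are not convincing evidence of anything, while a few hundred scripted calls would be. The normal approximation is rough at tiny sample sizes, so treat this as a sanity check rather than a rigorous test, but even a rough number beats a face-value assessment.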

Chances are good that Steve's fish will be fine, but can the same be said for those cases where we play roulette with mission critical IT systems? Just as in the case of Steve's fish, there is no legitimate reason for a lack of objective, quantitative analysis except basic human apathy. Anyone who has ever taken a statistics course or been face-to-face with a serious production issue knows that having ruled out many options with other tests does not mean it's safe to jump ahead and make assumptions on a gut feeling -- why abandon a working method for one that brings doubt, risk, and exposure to criticism? Run the phosphorus test and let the results be your guide.

Friday, July 3, 2009

A video speaks a thousand words




It is nothing new for us to be constantly developing new educational tools: demos and lab materials for on-site trainings, or content for our evolving KnowledgeBase that augments the HP software support we provide to our customers. But the videos are the biggest hits so far. They pack a three-minute punch of information without leaning on those lazy PowerPoint icons. Check 'em out.


Business Transaction Management in palatable terms (no yawning required):
http://www.youtube.com/watch?v=49tQ9BpnrT0

In case you missed the first one, here it is:
Why J9? Well, since you asked...
http://www.youtube.com/watch?v=FjPlvO01SmA

Please rate them! We'd love to get some feedback on how well these videos connect with you and, for God's sake, if they are still boring, please let us know.