To achieve high uptime, we sysadmins
have to collaborate. High uptime is difficult, if not impossible, unless
more than one person is working on it.
To explain this, let's first think about
building a simplified, highly reliable web site. We employ RAID of
some sort on the disks. We have multiple physical servers. They may be in a cluster behind a cluster of load balancers. There will be multiple
links to the Internet via multiple providers. We will have primary
and redundant DNS servers. There will be multiple power supplies, UPSes,
and generators. I could go on, but the pattern is obvious. The point
is that we aggressively engineer around all the single points of failure
(SPOF) to maintain uptime. We assume failure will happen and build
our systems to see failure as “normal” rather than surprising.
High Availability (HA) is just this.
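To put rough numbers on why redundancy pays off, here is a minimal sketch in Python. It assumes the redundant copies fail independently and that the service stays up as long as at least one copy is up. Real failures are often correlated (shared power, shared bugs, shared people), so treat the result as an optimistic upper bound. The function names and the 99% figure are mine, picked for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years


def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components in parallel.

    Assumes independent failures and that one working copy is enough.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for copies in (1, 2, 3):
        a = combined_availability(0.99, copies)
        print(f"{copies} copies: {a:.6f} available, "
              f"{downtime_minutes_per_year(a):.2f} min/year down")
```

Under those assumptions, one 99% server is down about 5,256 minutes a year, while two of them together are down under an hour. That gap is the whole argument for engineering out each SPOF.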
What about our people? They are just
as crucial a part of the system: caring for it, patching,
monitoring, etc. Just like servers, people are fallible. We get sick,
need to sleep, and, as a favorite professor put it, get hit by the
beer truck on occasion. To achieve reliability we need to eliminate
SPOF in people as well. So long the hero, hello the team.
Sure, we want
to be heroes. But as our vision
and ambitions grow, we become limited by the reach of our own arms. With
others we can tackle larger problems and do it better, faster, and more
consistently. If I am asleep, I know my fellow team member is watching
my systems. If my team member goes on vacation or is sick, I don't
fear looking after their systems.
How do we become team players? Study
people! Learn psychology, speech communication, improve your writing,
and so on. These subjects should be taught in college but seldom are;
that does not make them any less important. By understanding what
motivates people (yourself included), how to communicate clearly, and
how to negotiate in collaborative settings, you improve both the quality and quantity of your work.
With this knowledge you will also need
to set aside a not insignificant amount of your day to work on the
people, be it your team, your manager, or others within and
outside your organization who make your work possible. Just as
important as coming up with a good system design is spending time
memorizing your peers' children's names, being supportive when they
screw up, buying coffee for the team on occasion, interrupting your
work to lend a helping hand, teaching what you do, documenting what
you know, and so on. Of course this needs to be kept in balance with
your own needs. However, without giving yourself over to spending
significant time (25% or more) supporting others, how can you expect the
same in return? If you don't set up your junior team member with good
documentation to read at 3 AM when they get woken up, they will be left
with nothing to do but call you when an alert comes up.
Almost all large technical
achievements are built on a foundation of people working together. What
are you doing to make your foundation strong?