(Central)Disconnecting the reporter from the problem: machine A can report
a problem involving machine B, and the notification etc. will be for
machine B.
This will enable, e.g., a central web monitor, or intelligent
reporting of an NFS server outage.
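
A minimal sketch of the idea, assuming an event record that carries both
the reporting machine and the machine the problem is about. The Event
class, its field names, and the host names are hypothetical, chosen only
for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    reporter: str  # machine that observed the problem (hypothetical field)
    subject: str   # machine the problem is about (hypothetical field)
    summary: str


def notification_target(event):
    """Notifications follow the subject of the event, not its reporter."""
    return event.subject


def group_by_subject(events):
    """Collect reports from many machines under the machine they concern."""
    grouped = {}
    for ev in events:
        grouped.setdefault(ev.subject, []).append(ev)
    return grouped


# Two NFS clients report trouble with the same (made-up) server.
events = [
    Event(reporter="clientA", subject="nfs1", summary="NFS mount timeout"),
    Event(reporter="clientB", subject="nfs1", summary="NFS mount timeout"),
]
by_subject = group_by_subject(events)
```

Grouping by subject is what would let a central monitor treat many client
reports as one NFS server problem.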
(Central)Configurable reporting, e.g.:
For machine foo.u, database problems go to the DBA and system problems go
to the host oncall.
Temporarily, all alerts from the newly deployed service FOOMON
go to KenM.
Call KenM between 0400 and 0800, because he is the early bird.
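
The examples above could be expressed as an ordered rule table, sketched
here under some assumptions: the Rule class, its field names, the
first-match-wins ordering, and the default contact are all hypothetical,
and the list does not say how overlapping rules should be resolved.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    contact: str
    host: Optional[str] = None       # None means "any host"
    category: Optional[str] = None   # None means "any category"
    service: Optional[str] = None    # None means "any service"
    hours: Optional[Tuple[int, int]] = None  # (start, end) as HHMM, end exclusive

    def matches(self, host, category, service, hhmm):
        if self.host is not None and self.host != host:
            return False
        if self.category is not None and self.category != category:
            return False
        if self.service is not None and self.service != service:
            return False
        if self.hours is not None and not (self.hours[0] <= hhmm < self.hours[1]):
            return False
        return True


# Assumed precedence: the first matching rule wins.
RULES = [
    Rule(contact="KenM", service="FOOMON"),   # temporary override
    Rule(contact="KenM", hours=(400, 800)),   # the early bird
    Rule(contact="DBA", host="foo.u", category="database"),
    Rule(contact="host oncall", host="foo.u", category="system"),
]


def route(host, category, service, hhmm, default="host oncall"):
    """Return the contact for an alert; the default is an assumption."""
    for rule in RULES:
        if rule.matches(host, category, service, hhmm):
            return rule.contact
    return default
```

Keeping the rules as data rather than code is one way to make temporary
overrides, such as the FOOMON one, easy to add and remove.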
(Central)Configurable contacts, like pilot, but with plugins.
(Central)Monitoring tools that work from home/everywhere (even for systems on p172)
(Central)Event correlation
(Central)Tie-in between the 'dashboard' display of events and the current
ownership/status of an event (who's working on it, page sent and waiting
for a callback, reviewed and downgraded, etc.)
(Central)Authentication/authorization for who can see/change what.
(Global)Built-in diagnostics, i.e. an app that produces a web page can embed
explicit diagnostic comments instead of relying on a probe parsing the output.
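
As an illustration only, the app could append HTML comments in an
agreed-upon format and the probe could read those instead of scraping the
visible page. The "DIAG key=value" comment format and both function names
are assumptions made up for this sketch, not an existing convention:

```python
import re

# Hypothetical comment format: <!-- DIAG key=value -->
DIAG_RE = re.compile(r"<!--\s*DIAG\s+(\w+)=(\S+)\s*-->")


def embed_diag(html, **status):
    """App side: append one diagnostic comment per status item."""
    comments = "".join(f"<!-- DIAG {k}={v} -->" for k, v in status.items())
    return html + comments


def read_diag(page):
    """Probe side: collect the diagnostic items; values come back as strings."""
    return dict(DIAG_RE.findall(page))


page = embed_diag("<html><body>Welcome</body></html>", db="ok", queue_depth=17)
diag = read_diag(page)
```

The probe then depends only on the comment format, not on the layout of
the page.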
(Local)Retain and report on the collected performance metrics, for
capacity planning, trend analysis, and recent problem forensics.
(Local)Proactive diagnostics, e.g. low page space triggering a hunt through
the process list.
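
A rough sketch of the page space example, assuming the process list has
already been collected as a list of dicts. The function names, the
"rss_kb" key, and the 10% threshold are hypothetical:

```python
def top_memory_users(processes, n=3):
    """Return the n processes with the largest resident size."""
    return sorted(processes, key=lambda p: p["rss_kb"], reverse=True)[:n]


def check_page_space(free_pct, processes, threshold_pct=10.0):
    """If page space is low, attach likely suspects to the report.

    Returns None when page space is above the threshold.
    """
    if free_pct >= threshold_pct:
        return None
    return {
        "problem": f"page space low ({free_pct:.0f}% free)",
        "suspects": [p["name"] for p in top_memory_users(processes)],
    }


# Made-up sample data for illustration.
processes = [
    {"name": "httpd", "rss_kb": 40_000},
    {"name": "leaky_app", "rss_kb": 900_000},
    {"name": "sshd", "rss_kb": 3_000},
    {"name": "db", "rss_kb": 500_000},
]
report = check_page_space(4.0, processes)
```

The point is that the follow-up hunt runs at the moment of the problem, so
the report already names suspects when it reaches a person.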