Sunday 27 January 2008

Infrastructure Manager's checklist for monitoring


  1. Automated management
    • Notification
    • Escalation
    • Auto correction
  2. Configuration management
    • Correct initial server deployment
    • ITIL practices
    • Network provisioning
    • Auto discovery
  3. Capacity management
    • Applications resource utilisation
    • Systems resource utilisation
    • Systems and application resource optimisation
    • Storage usage
    • Memory usage
    • CPU usage
    • Login capacities
  4. Workforce management
    • Hands-off automated systems management
    • Identify problem areas
    • Skills shortage
    • Resource shortage
    • Cross-skill staff for emergencies and stand-in support where necessary
  5. Service continuity management
    • Creation of standards for systems maintenance
    • Cross-skill resources
    • Extensive usage of automation capabilities
    • Automate routine maintenance & administration tasks
  6. Applications management
    • Use integrated application abilities where possible
    • Monitor processes in application for systems availability
    • Automate batch & other recurring processes in applications
    • Test systems holistically (from login to command execution)
    • Monitor inter-systems dependencies
  7. Policy management
    • Base deployment standards for reactive & proactive monitoring
    • Base policy inclusion by technology type
    • Custom-made policies
  8. Report management
    • Create useful management information
    • Identify & analyze trends
    • Problem area identification in current & historic data
  9. Systems administration
    • Simplified administration based on standards
    • Automated release management of agents
    • Alert customization based on business requirements
    • Data retention for at least two consecutive years
  10. Continuous improvement
    • Hierarchical management of systems
    • Service management level improvement
    • Policy evolution
    • Reporting evolution & addition of graphs
    • Agent updating & refinement
    • Grow the knowledge base & reuse knowledge gained
    • Integration with Service Desk
    • Compliance measurement for server management & monitoring
  11. Problem management
    • History review
    • Resolution recommendation
    • Tracking
  12. Inventory management
    • Comprehensive server inventory
    • Commissioning & decommissioning status checking
    • Maximize hardware availability for redeployment post decommissioning
  13. Documentation
    • Monitoring manual updating
    • Educate users on correct systems usage to gain maximum benefit
    • Create monitoring standards
    • Disseminate monitoring documentation to Business Unit users
  14. Proactive management
    • Prevent firefighting
    • Automate problem resolution & identify trends before issues arise
    • Maintain service uptime
  15. Fault management
    • Alarm handling
    • Trouble detection
    • Trouble correction
    • Test & acceptance
    • Network recovery
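
To make a few of these items concrete, here is a minimal Python sketch of how notification and escalation (item 1) might sit on top of the CPU, memory and storage usage checks (item 3). It is only an illustration under stated assumptions: the thresholds, the escalation delay, the recipient names and the function names are all hypothetical, and a real monitoring product will have its own policy mechanism.

```python
from dataclasses import dataclass

# Hypothetical values; set these from your own monitoring standards.
WARN_PCT = 80.0           # percent used that raises a warning
CRIT_PCT = 95.0           # percent used that raises a critical alert
ESCALATE_AFTER_MIN = 30   # minutes an alert may stay unacknowledged


@dataclass
class Alert:
    resource: str    # e.g. "cpu", "memory", "storage"
    used_pct: float  # measured utilisation in percent
    level: str       # "warning" or "critical"


def check_usage(readings):
    """Return an Alert for each reading at or above the warning threshold."""
    alerts = []
    for resource, used_pct in readings.items():
        if used_pct >= CRIT_PCT:
            alerts.append(Alert(resource, used_pct, "critical"))
        elif used_pct >= WARN_PCT:
            alerts.append(Alert(resource, used_pct, "warning"))
    return alerts


def route(alert, minutes_unacknowledged):
    """Choose a recipient: escalate critical alerts, and any alert
    that has gone unacknowledged for too long."""
    if alert.level == "critical" or minutes_unacknowledged >= ESCALATE_AFTER_MIN:
        return "duty-manager"   # hypothetical escalation contact
    return "on-call-admin"      # hypothetical first-line contact


if __name__ == "__main__":
    # Sample readings; in practice these would come from an agent.
    sample = {"cpu": 42.0, "memory": 86.5, "storage": 97.2}
    for alert in check_usage(sample):
        print(alert.level, alert.resource, "->", route(alert, 0))
```

Keeping the thresholds and routing rules in one place is one way to support the base policies under item 7, since the same rules can then be applied to every server of a given type.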
