Network Fault Management

Network fault management, a key part of today's network management architecture, covers detecting, isolating, diagnosing, and correcting malfunctions in a network. Its objectives are to increase network availability, reduce downtime, and restore service quickly after a failure.

The basic requirements for a fault management system are:

  • Monitor network devices, traffic conditions, and usage in real time, and collect statistics on them, to forecast and avoid potential faults
  • Set thresholds and alarms that warn the network administrator of conditions that may cause network failure
  • Set alarms that warn of performance degradation on network devices and links
  • Set alarms for network resource usage and limits (such as hard disk space)
  • Remotely control network devices, for example to reboot them or shut them down
  • Provide a centralized console for performing all of the above functions
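The threshold and alarm requirements above can be illustrated with a minimal Python sketch. It assumes that device statistics arrive as a plain dictionary of readings. The metric names (`disk_used_pct`, `link_util_pct`), the threshold values, and the `Threshold` and `check_thresholds` names are hypothetical and chosen only for illustration; a real system would collect the statistics through its own monitoring mechanism.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str      # hypothetical metric name, e.g. "disk_used_pct"
    warn: float      # level that signals degradation
    critical: float  # level that signals a likely failure


def check_thresholds(sample, thresholds):
    """Return (metric, severity, value) alarms for one sample of statistics."""
    alarms = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            continue  # this metric was not reported in the sample
        if value >= t.critical:
            alarms.append((t.metric, "critical", value))
        elif value >= t.warn:
            alarms.append((t.metric, "warning", value))
    return alarms


# Illustrative thresholds and readings (assumed values, not from any real device)
thresholds = [
    Threshold("disk_used_pct", warn=80.0, critical=95.0),
    Threshold("link_util_pct", warn=70.0, critical=90.0),
]
sample = {"disk_used_pct": 96.5, "link_util_pct": 72.0, "cpu_pct": 15.0}

alarms = check_thresholds(sample, thresholds)
for metric, severity, value in alarms:
    print(f"{severity.upper()}: {metric} = {value}")
```

Using two levels per metric is one possible design: the lower level covers performance degradation warnings, and the higher level covers conditions likely to cause a failure.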

A typical fault management system follows these steps:

Detection -> Analysis -> Action Taking
  • Error Detection
  • Data Gathering
  • Error Handling
  • Diagnosis
  • Event Recording
  • Action
  • Service Restart
  • Black-Listing

When an error occurs, a report is generated and sent to the fault analyzer. The fault analyzer diagnoses and records the problem. Finally, a system or a person uses the information from the fault analyzer to take appropriate action, such as isolating the error, black-listing failing or failed components, automatically restarting or restoring services, and alerting the system administrator.
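The flow from error report to fault analyzer to action can be sketched roughly as follows. This is an illustrative outline, not a reference implementation: the `FaultReport`, `FaultAnalyzer`, and `take_action` names are hypothetical, and the policy of black-listing a component after three recorded failures is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FaultReport:
    component: str  # hypothetical component identifier
    error: str      # short description of the detected error


@dataclass
class FaultAnalyzer:
    blacklist_after: int = 3  # assumed policy: black-list on the 3rd failure
    event_log: list = field(default_factory=list)
    failure_counts: dict = field(default_factory=dict)

    def analyze(self, report):
        """Record the event and diagnose which action the report calls for."""
        self.event_log.append(report)  # event recording
        count = self.failure_counts.get(report.component, 0) + 1
        self.failure_counts[report.component] = count
        if count >= self.blacklist_after:
            return "blacklist"
        return "restart"


def take_action(decision, report, blacklist):
    """Act on the analyzer's decision and return a description of the action."""
    if decision == "blacklist":
        blacklist.add(report.component)
        return f"alert admin: {report.component} black-listed after repeated '{report.error}'"
    return f"restart service on {report.component}"


analyzer = FaultAnalyzer()
blacklist = set()
actions = []
for _ in range(3):  # simulate the same component failing three times
    report = FaultReport("router-7", "link down")
    actions.append(take_action(analyzer.analyze(report), report, blacklist))

for action in actions:
    print(action)
```

In this sketch the first two failures trigger a service restart, and the third causes the component to be black-listed and the administrator to be alerted.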

Related Terms: Network management, Performance management, Configuration management, Security management