Server monitoring using Agent

Monitor the availability, performance, and health of all physical servers managed by Cloudmon agents. Track CPU, memory, disk, and network metrics in real time across your entire server fleet.

Overview

Servers are the backbone of most IT environments, running critical workloads, databases, and services that teams depend on around the clock. Unlike laptops, servers are expected to be always on, which makes any performance degradation or unexpected downtime immediately impactful. Cloudmon addresses this by deploying a lightweight agent directly on each server, giving IT teams continuous visibility into every monitored endpoint.

The agent collects performance data locally and stores it in an offline database for a configurable retention period. If the server loses connectivity, collection continues; when connectivity is restored, all locally stored data is automatically uploaded and processed. Your monitoring history therefore has no gaps, provided the interruption is shorter than the retention period.
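The store-and-forward behaviour described above can be pictured with a minimal Python sketch. This is not Cloudmon's agent code: the class name OfflineBuffer, the retention_seconds parameter, and the upload callback are hypothetical, chosen only to illustrate buffering with a retention window.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Sample = Dict[str, float]
Entry = Tuple[float, Sample]  # (timestamp in seconds, metric values)


class OfflineBuffer:
    """Hypothetical store-and-forward buffer; not the actual Cloudmon agent."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._queue: Deque[Entry] = deque()

    def record(self, timestamp: float, sample: Sample) -> None:
        """Store a sample locally and drop anything older than the retention window."""
        self._queue.append((timestamp, sample))
        cutoff = timestamp - self.retention_seconds
        while self._queue and self._queue[0][0] < cutoff:
            self._queue.popleft()

    def flush(self, upload: Callable[[List[Entry]], bool]) -> int:
        """Try to upload everything buffered; keep the data if the upload fails."""
        batch = list(self._queue)
        if batch and upload(batch):
            self._queue.clear()
            return len(batch)
        return 0


# Simulate 90 minutes offline with a 60-minute retention window.
buffer = OfflineBuffer(retention_seconds=3600)
for minute in range(90):
    buffer.record(minute * 60.0, {"cpu_percent": 10.0 + minute})

sent: List[Entry] = []


def fake_upload(batch: List[Entry]) -> bool:
    """Stand-in for the real upload; always succeeds."""
    sent.extend(batch)
    return True


uploaded = buffer.flush(fake_upload)
print(uploaded, "samples uploaded; oldest timestamp:", sent[0][0])
```

In this sketch the first 29 minutes of samples fall outside the retention window and are discarded, which is the situation that produces a gap in the metric history.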

Navigate to Agents → Servers to access this view. The summary bar shows Total devices, Down, Critical, High CPU Usage, High Memory Usage, and High Disk Usage counts, giving you an immediate picture of which servers are under pressure. Distribution charts break down servers by State, OS, and Vendor. The Top 5 charts highlight the servers under the heaviest CPU, Memory, and Disk load.

Server List

The server list shows all discovered servers with the following details:

Column | Description
Name | The server hostname and the time it was last seen.
Agent Vendor | The hardware manufacturer, for example HPE or Dell Inc.
Boot Time | How long ago the server was last booted. Useful for identifying servers that have been running for extended periods without a restart, which can sometimes mask memory leaks or deferred updates.
IP Info - Location | The public IP location of the server, useful for confirming the data centre or region it is operating from.
OS | The operating system running on the server, useful for identifying servers running outdated or unsupported OS versions.
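As an illustration of how the Boot Time column can be used, the following Python sketch flags servers whose uptime exceeds a chosen threshold. The hostnames, boot times, and the 90-day threshold are made-up example values, and the function is hypothetical rather than part of Cloudmon.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def long_running(boot_times: Dict[str, datetime], now: datetime,
                 max_uptime: timedelta) -> List[str]:
    """Return hostnames whose uptime exceeds max_uptime, longest uptime first."""
    overdue = [(now - booted, host)
               for host, booted in boot_times.items()
               if now - booted > max_uptime]
    return [host for _, host in sorted(overdue, reverse=True)]


# Example data (assumed values, as they might be read from the server list).
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
boot_times = {
    "db01": datetime(2024, 1, 10, tzinfo=timezone.utc),
    "web01": datetime(2024, 5, 28, tzinfo=timezone.utc),
    "app01": datetime(2024, 3, 1, tzinfo=timezone.utc),
}

print(long_running(boot_times, now, timedelta(days=90)))
```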

Server Overview

Selecting a server opens its detail page. The overview shows key identity and status information including the hostname, IP address, group, customer, state, and status. It also shows availability percentage, total downtime, and current system time.
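The relationship between availability percentage and total downtime can be sketched as follows. This uses the common definition (time up divided by the observation window); how Cloudmon computes the figure internally is not stated here, so treat the function as an assumption-based illustration.

```python
from datetime import timedelta


def availability_percent(window: timedelta, downtime: timedelta) -> float:
    """Availability over a window, using the common (window - downtime) / window definition."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    # Clamp downtime into [0, window] so the result stays within 0..100.
    downtime = min(max(downtime, timedelta(0)), window)
    return 100.0 * (1 - downtime / window)


# Example: 43 minutes 12 seconds of downtime in a 30-day window.
result = availability_percent(timedelta(days=30), timedelta(minutes=43, seconds=12))
print(round(result, 3))  # prints 99.9
```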

Hardware details include the processor, memory, partitions, disk, operating system, BIOS, and chassis, among other configuration information. Public IP details show the ISP, country, timezone, and organisation, helpful for confirming the hosting environment of the server.

Interfaces

Displays the operational state of all network interfaces on the server, including physical NICs, virtual adapters, and bonded interfaces. A searchable table lists each interface with its name, IP address, MAC address, speed, TX and RX rates, and operational state, among other details. This is particularly useful for servers with multiple interfaces, helping identify misconfigured or underperforming network paths, unexpected interface failures, or unusual traffic volumes on a specific NIC.

Installed Software

Lists all software installed on the server, showing the application name, version, publisher, and installation date. This is useful for auditing server software, identifying outdated application versions that may carry security vulnerabilities, confirming that required server software is present, or spotting unauthorised installations on production systems.

Processes and Services

Displays real-time CPU and memory usage across all running processes and services, alongside a list of monitored system services and their current state. This is subject to a limit of 10 processes or 10 services at any one time. For servers specifically, this view is most useful for:

  • Identifying a runaway process consuming excessive CPU or memory, affecting other workloads running on the same server.
  • Confirming that critical server services such as web servers, database engines, or application runtimes are running and not in a failed or degraded state.
  • Detecting unexpected processes that should not be running on a production server, which could indicate unauthorised activity or a misconfigured deployment.
  • Correlating a service outage or performance incident with a specific process that spiked resource usage at the time of the issue.
  • Monitoring the number of instances of a process to detect abnormal duplication or spawning behaviour.
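Two of the checks listed above, finding the heaviest CPU consumers and spotting abnormal instance counts, can be sketched in Python. The Proc record, the sample data, and the expected-instance map are hypothetical; they stand in for whatever process data you export or read from the view.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Proc(NamedTuple):
    name: str
    pid: int
    cpu_percent: float
    memory_mb: float


def top_by_cpu(procs: List[Proc], limit: int = 10) -> List[Proc]:
    """Return the `limit` processes with the highest CPU usage."""
    return sorted(procs, key=lambda p: p.cpu_percent, reverse=True)[:limit]


def abnormal_instance_counts(procs: List[Proc], expected: Dict[str, int]) -> Dict[str, int]:
    """Return process names running more instances than expected.

    Names with no entry in `expected` are never flagged.
    """
    counts = Counter(p.name for p in procs)
    return {name: n for name, n in counts.items() if n > expected.get(name, n)}


# Example snapshot (made-up values).
procs = [
    Proc("nginx", 101, 3.0, 40.0),
    Proc("nginx", 102, 2.5, 38.0),
    Proc("postgres", 210, 35.0, 900.0),
    Proc("backup.sh", 301, 60.0, 15.0),
    Proc("backup.sh", 302, 55.0, 15.0),
    Proc("backup.sh", 303, 1.0, 15.0),
    Proc("backup.sh", 304, 0.5, 15.0),
]
expected = {"nginx": 2, "backup.sh": 1}

print([p.pid for p in top_by_cpu(procs, limit=2)])
print(abnormal_instance_counts(procs, expected))
```

Here backup.sh is flagged because four instances are running where one is expected, the kind of duplication the last bullet refers to.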

Enabling Process Monitoring

All running processes are listed under this tab but are not individually monitored by default. To monitor a specific process and track its resource usage over time:

  1. Navigate to Agents and select the Server you want to configure.
  2. Open the Processes and Services tab.
  3. Click All Processes to view every process currently running on the device.
  4. Click the Monitor icon next to any process you want to track. The process is added to the Monitored Processes table and Cloudmon begins collecting detailed CPU usage, memory usage, instance count, PID, and installation path for it.

Enabling Service Monitoring

Services are not monitored by default. To monitor specific services and receive alerts when they stop, you need to enable monitoring for each service individually:

  1. Navigate to Agents and select the Server you want to configure.
  2. Open the Processes and Services tab.
  3. Click All Services to view every service currently running on the device.
  4. Click the Monitor icon next to any service you want to track. The service is added to the Monitored Services table and Cloudmon begins tracking its state continuously.

The service list refreshes automatically at every reporting interval. For each monitored service you can see its running state, start mode (manual or automatic), CPU usage, memory usage, instance count, and virtual memory size.

Enabling Process and Service Alarms

Cloudmon can raise a Critical alarm automatically when a monitored process or service becomes inactive. There are two ways to enable this:

Option 1: Enable the alarm directly from the monitored list. In the Monitored Processes or Monitored Services table, click the alarm icon next to a process or service to toggle its alarm on. While the icon is active, a Critical alarm is raised if that process or service stops. Clicking the icon again disables the alarm without removing the process or service from monitoring.

Option 2: Configure an alarm rule at the device or group level. Use this option to apply consistent service and process alarm policies across multiple devices; refer to Alarm Rule Configuration for full details.
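The alarm logic described in this section, where a Critical alarm is raised only for monitored items whose alarm toggle is on, can be sketched in Python. The MonitoredItem class, the state strings, and the message format are hypothetical and do not reflect Cloudmon's internal data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MonitoredItem:
    name: str
    alarm_enabled: bool  # mirrors the alarm icon toggle (hypothetical field)


def evaluate_alarms(items: List[MonitoredItem], states: Dict[str, str]) -> List[str]:
    """Return Critical alarm messages for alarm-enabled items that are not running."""
    alarms = []
    for item in items:
        # Assumption: an item missing from the latest report is treated as stopped.
        state = states.get(item.name, "stopped")
        if item.alarm_enabled and state != "running":
            alarms.append(f"CRITICAL: {item.name} is {state}")
    return alarms


items = [
    MonitoredItem("nginx", alarm_enabled=True),
    MonitoredItem("postgresql", alarm_enabled=True),
    MonitoredItem("cron", alarm_enabled=False),  # monitored, but alarm toggled off
]
states = {"nginx": "running", "postgresql": "stopped", "cron": "stopped"}

print(evaluate_alarms(items, states))
```

Note that cron stays monitored but raises nothing, matching the behaviour where disabling the alarm does not remove the item from monitoring.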

System Metrics

Displays time-series charts for key performance metrics collected from the server. For servers, the most operationally critical metrics include CPU Utilisation, Memory Utilisation, Disk Usage, and network throughput, among other hardware and performance indicators. Sustained high CPU or memory on a server often signals a workload capacity issue or a misbehaving service, making these metrics essential for proactive incident prevention.
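To make the distinction between a brief spike and sustained high utilisation concrete, here is a minimal Python sketch. The threshold of 90% and the requirement of three consecutive samples are illustrative assumptions, not Cloudmon defaults.

```python
from typing import Iterable


def sustained_breach(values: Iterable[float], threshold: float, min_consecutive: int) -> bool:
    """Return True if at least `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in values:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Example CPU utilisation series in percent (made-up values).
spiky = [20.0, 95.0, 30.0, 25.0, 96.0, 22.0]
sustained = [40.0, 91.0, 93.0, 97.0, 92.0, 60.0]

print(sustained_breach(spiky, threshold=90.0, min_consecutive=3))
print(sustained_breach(sustained, threshold=90.0, min_consecutive=3))
```

The first series contains two isolated spikes and is not flagged; the second stays above the threshold for four samples in a row and is.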


Troubleshooting

Symptom | Likely Cause | Fix
Server is showing as Down | The agent service may have stopped, or the server may have lost network connectivity | Verify the agent service is running on the server and that it can reach the Cloudmon probe
High CPU or Memory alarms firing unexpectedly | A runaway process or service may be consuming excessive resources | Review Processes and Services to identify the responsible process and take action
Gaps in metric history | The server lost connectivity longer than the configured offline data retention window | Adjust the offline data retention window in the agent configuration to match expected maintenance windows
Server not appearing in the list | The agent may not be installed or has not checked in yet | Install the Cloudmon agent on the server and ensure it can reach the probe
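When investigating gaps in metric history, it can help to locate them in exported sample timestamps. The following Python sketch is an illustration only: the function name, the 60-second reporting interval, and the tolerance factor are assumptions, not Cloudmon settings.

```python
from typing import List, Sequence, Tuple


def find_gaps(timestamps: Sequence[float], interval: float,
              tolerance: float = 1.5) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where consecutive samples are further apart
    than `interval * tolerance` seconds."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > interval * tolerance]


# Example: samples expected every 60 seconds, with one missing stretch.
timestamps = [0.0, 60.0, 120.0, 600.0, 660.0]
gaps = find_gaps(timestamps, interval=60.0)
longest = max((end - start for start, end in gaps), default=0.0)

print(gaps)
print("longest gap in seconds:", longest)
```

The length of the longest gap gives a rough indication of how much longer the outage was than the data the agent was able to retain.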