Monitor the availability, performance, and health of all physical servers managed by Cloudmon agents. Track CPU, memory, disk, and network metrics in real time across your entire server fleet.
Servers are the backbone of most IT environments, running critical workloads, databases, and services that teams depend on around the clock. Unlike laptops, servers are expected to be always on, which makes any performance degradation or unexpected downtime immediately impactful. Cloudmon addresses this by deploying a lightweight agent directly on each server, giving IT teams continuous visibility into every monitored endpoint.
The agent collects performance data locally and stores it in an offline database for a configurable retention period. If the server loses connectivity, collection continues on the server itself. When connectivity is restored, all locally stored data is automatically uploaded and processed, so interruptions shorter than the retention period leave no gaps in your monitoring history.
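The store-and-forward behaviour described above can be pictured with a small sketch. This is not the Cloudmon agent's actual implementation: the `OfflineBuffer` class, its `retention_seconds` parameter, and the `upload` callback are all hypothetical names chosen for illustration, and an in-memory queue stands in for the offline database.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OfflineBuffer:
    """Hypothetical store-and-forward buffer (illustration only)."""

    retention_seconds: float
    # (timestamp, payload) pairs, oldest first
    _samples: deque = field(default_factory=deque, init=False, repr=False)

    def store(self, timestamp: float, payload: dict) -> None:
        """Keep a sample locally, then drop samples older than the retention window."""
        self._samples.append((timestamp, payload))
        cutoff = timestamp - self.retention_seconds
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def pending(self) -> int:
        """Number of samples still waiting to be uploaded."""
        return len(self._samples)

    def flush(self, upload) -> int:
        """Upload buffered samples oldest first; stop at the first failed upload."""
        sent = 0
        while self._samples:
            timestamp, payload = self._samples[0]
            if not upload(timestamp, payload):
                break  # still offline: keep the remaining samples for later
            self._samples.popleft()
            sent += 1
        return sent
```

In this sketch, a sample that ages past the retention window before the next successful flush is discarded, which is the situation that would show up as a gap in metric history.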
Navigate to Agents → Servers to access this view. The summary bar shows Total devices, Down, Critical, High CPU Usage, High Memory Usage, and High Disk Usage counts, giving you an immediate picture of which servers are under pressure. Distribution charts break down servers by State, OS, and Vendor. The Top 5 charts highlight the servers under the heaviest CPU, Memory, and Disk load.
The server list shows all discovered servers with the following details:
| Column | Description |
| --- | --- |
| Name | The server hostname and the time it was last seen. |
| Agent Vendor | The hardware manufacturer, for example HPE or Dell Inc. |
| Boot Time | How long ago the server was last booted. Useful for identifying servers that have been running for extended periods without a restart, which can sometimes mask memory leaks or deferred updates. |
| IP Info - Location | The public IP location of the server, useful for confirming the data centre or region it is operating from. |
| OS | The operating system running on the server, useful for identifying servers running outdated or unsupported OS versions. |
Selecting a server opens its detail page. The overview shows key identity and status information including the hostname, IP address, group, customer, state, and status. It also shows availability percentage, total downtime, and current system time.
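As a rough guide to how availability percentage and total downtime relate, the sketch below uses the conventional definition: the share of the reporting period during which the server was not down. The document does not state how Cloudmon computes the figure, so the function name `availability_percent` and the formula are assumptions for illustration.

```python
def availability_percent(period_seconds: float, downtime_seconds: float) -> float:
    """Conventional availability: share of the period the server was up, as a percentage."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    # Clamp downtime so the result always falls between 0 and 100.
    downtime = min(max(downtime_seconds, 0.0), period_seconds)
    return 100.0 * (period_seconds - downtime) / period_seconds
```

For example, under this definition 864 seconds of downtime in a 24-hour period (86,400 seconds) corresponds to 99% availability.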
Hardware details include the processor, memory, partitions, disk, operating system, BIOS, and chassis, among other configuration information. Public IP details show the ISP, country, timezone, and organisation, helpful for confirming the hosting environment of the server.
Displays the operational state of all network interfaces on the server, including physical NICs, virtual adapters, and bonded interfaces. A searchable table lists each interface with its name, IP address, MAC address, speed, TX and RX rates, and operational state, among other details. This is particularly useful for servers with multiple interfaces, helping identify misconfigured or underperforming network paths, unexpected interface failures, or unusual traffic volumes on a specific NIC.
Lists all software installed on the server, showing the application name, version, publisher, and installation date. This is useful for auditing server software, identifying outdated application versions that may carry security vulnerabilities, confirming that required server software is present, or spotting unauthorised installations on production systems.
Displays real-time CPU and memory usage across all running processes and services, alongside a list of monitored system services and their current state. A maximum of 10 processes or 10 services can be monitored at any one time. For servers specifically, this view is most useful for:
All running processes are listed under this tab but are not individually monitored by default. To monitor a specific process and track its resource usage over time:
Services are not monitored by default. To monitor specific services and receive alerts when they stop, you need to enable monitoring for each service individually:
The service list refreshes automatically at every reporting interval. For each monitored service you can see its running state, start mode (manual or automatic), CPU usage, memory usage, instance count, and virtual memory size.
Cloudmon can raise a Critical alarm automatically when a monitored process or service becomes inactive. There are two ways to enable this:
Option 1: Enable the alarm directly from the service list. In the Monitored Services table, click the alarm icon next to a process or service to toggle its alarm on. While the icon is active, a Critical alarm is raised if that service or process stops. Clicking the icon again disables the alarm without removing the service or process from monitoring.
Option 2: Configure an alarm rule at the device or group level. Use this option to apply consistent service and process alarm policies across multiple devices. See Alarm Rule Configuration for full details.
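The alarm behaviour described above reduces to a simple rule: a Critical alarm applies only to items that are both monitored with the alarm enabled and currently not running. The sketch below illustrates that rule. `MonitoredItem` and `critical_alarms` are hypothetical names and do not correspond to a Cloudmon API.

```python
from dataclasses import dataclass


@dataclass
class MonitoredItem:
    """Hypothetical record for a monitored process or service."""

    name: str
    running: bool
    alarm_enabled: bool  # mirrors the alarm icon toggle in the service list


def critical_alarms(items: list[MonitoredItem]) -> list[str]:
    """Return one message per stopped item whose alarm is enabled."""
    return [
        f"CRITICAL: {item.name} is not running"
        for item in items
        if item.alarm_enabled and not item.running
    ]
```

An item with the alarm disabled stays in the monitored list and keeps reporting its state; it just never contributes an alarm.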
Displays time-series charts for key performance metrics collected from the server. For servers, the most operationally critical metrics include CPU Utilisation, Memory Utilisation, Disk Usage, and network throughput, among other hardware and performance indicators. Sustained high CPU or memory on a server often signals a workload capacity issue or a misbehaving service, making these metrics essential for proactive incident prevention.
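To make "sustained" concrete, one common approach is to flag a metric only when it stays above a threshold for several consecutive samples, which avoids reacting to a single spike. The sketch below assumes evenly spaced samples. The function name, the threshold, and the sample count are illustrative choices, not Cloudmon settings.

```python
def sustained_breach(samples: list[float], threshold: float, min_consecutive: int) -> bool:
    """Return True if `samples` exceeds `threshold` for `min_consecutive` samples in a row."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0  # length of the current streak above the threshold
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With a threshold of 90 and three consecutive samples required, the series `[50, 95, 96, 97, 60]` counts as a sustained breach, while `[95, 50, 95, 50, 95]` does not.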
| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Server is showing as Down | The agent service may have stopped, or the server may have lost network connectivity | Verify the agent service is running on the server and that it can reach the Cloudmon probe |
| High CPU or Memory alarms firing unexpectedly | A runaway process or service may be consuming excessive resources | Review Processes and Services to identify the responsible process and take action |
| Gaps in metric history | The server was disconnected for longer than the configured offline data retention window | Increase the offline data retention window in the agent configuration so that it covers the longest expected outage or maintenance window |
| Server not appearing in the list | The agent may not be installed or has not checked in yet | Install the Cloudmon agent on the server and ensure it can reach the probe |