Monitoring a computer's temperature with lm_sensors
by Alpha01
One of the main reasons I use SSDs in both of the Mac Minis that I run as hypervisors (besides speed) is that, compared to regular hard drives, SSDs consume far less power and, more importantly, generate less heat. Before I switched to SSDs, the fan noise both machines made in the middle of summer was far more noticeable than at any other time of the year.
Although at the time I did little research into proactively monitoring the temperature of my machines, thanks to the Nagios book I'm currently reading I have now learned about lm-sensors, a tool for monitoring hardware temperature on Linux.
Installing lm-sensors on Ubuntu Server 12.04 is really simple.
sudo apt-get install libsensors4 libsensors4-dev lm-sensors
Since lm-sensors requires low-level hooks to monitor hardware temperature, it comes with the utility sensors-detect, which can be used to automatically detect and load the appropriate kernel modules for lm-sensors to function on the hardware in question.
tony@mini02:~$ sudo sensors-detect
# sensors-detect revision 5984 (2011-07-10 21:22:53 +0200)
# System: Apple Inc. Macmini5,1
# Board: Apple Inc. Mac-8ED6AF5B48C039E1
This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.
Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no): YES
[...]
In the case of my mid-2011 Apple Mac Minis, it was only able to use the coretemp module. File /etc/modules:
# Generated by sensors-detect on Sat Feb 2 21:22:20 2013
# Chip drivers
coretemp
After the module has been added, it's just a matter of loading the newly added modules.
[....]
Do you want to add these lines automatically to /etc/modules? (yes/NO)yes
Successful!
Monitoring programs won't work until the needed modules are
loaded. You may want to run 'service module-init-tools start'
to load them.
Unloading i2c-dev... OK
Unloading i2c-i801... OK
Unloading cpuid... OK
tony@mini02:~$ sudo service module-init-tools start
module-init-tools stop/waiting
tony@mini02:~$
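Before moving on, it can be worth confirming that the module really is loaded. The following is a minimal sketch, not part of lm-sensors: module_loaded is a hypothetical helper that looks for a module name in the first column of a /proc/modules-style listing. It assumes coretemp was built as a loadable module; a driver compiled directly into the kernel would not show up there.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named module appears in a
# /proc/modules-style listing (the module name is the first field).
# The optional second argument overrides the listing file.
module_loaded() {
    module="$1"
    listing="${2:-/proc/modules}"
    [ -r "$listing" ] || return 1
    awk -v m="$module" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

if module_loaded coretemp; then
    echo "coretemp is loaded"
else
    echo "coretemp is not loaded; try: sudo modprobe coretemp"
fi
```

Running lsmod and looking for coretemp by eye gives the same information; the function form is only convenient if the check ends up in a script.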
Now that the appropriate kernel modules have been loaded, I have everything needed to check the temperature.
tony@mini02:~$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +49.0°C (high = +86.0°C, crit = +100.0°C)
Core 0: +48.0°C (high = +86.0°C, crit = +100.0°C)
Core 1: +50.0°C (high = +86.0°C, crit = +100.0°C)
applesmc-isa-0300
Adapter: ISA adapter
Exhaust : 1801 RPM (min = 1800 RPM)
TA0P: +36.0°C
TA0p: +36.0°C
TA1P: +34.8°C
TA1p: +34.8°C
TC0C: +47.0°C
TC0D: +44.8°C
TC0E: +57.5°C
TC0F: +58.5°C
TC0G: +94.0°C
TC0J: +0.8°C
TC0P: +42.5°C
TC0c: +47.0°C
TC0d: +44.8°C
TC0p: +42.5°C
TC1C: +50.0°C
TC1c: +50.0°C
TCFC: +0.2°C
TCGC: +49.0°C
TCGc: +49.0°C
TCPG: +98.0°C
TCSC: +50.0°C
TCSc: +50.0°C
TCTD: +255.5°C
TCXC: +49.5°C
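Only the coretemp lines carry the high and crit limits, so those are the ones worth extracting for monitoring. As a rough sketch, assuming the line layout shown above, the hypothetical coretemp_readings function below reads sensors output on stdin and prints one label=temperature pair per CPU line, ignoring the applesmc entries.

```shell
#!/bin/sh
# Print "label=temperature" for each coretemp line read from stdin.
# Assumes lines shaped like the output above, e.g.
#   Core 0: +48.0°C (high = +86.0°C, crit = +100.0°C)
coretemp_readings() {
    awk -F': *' '
        /^(Physical id [0-9]+|Core [0-9]+):/ {
            t = $2
            sub(/^\+/, "", t)          # drop the leading plus sign
            sub(/[^0-9.].*$/, "", t)   # drop the unit and the limits
            print $1 "=" t
        }'
}

# Example with three lines copied from the output above.
printf '%s\n' \
    'Physical id 0: +49.0°C (high = +86.0°C, crit = +100.0°C)' \
    'Core 0: +48.0°C (high = +86.0°C, crit = +100.0°C)' \
    'TC0P: +42.5°C' | coretemp_readings
# Prints:
#   Physical id 0=49.0
#   Core 0=48.0
```

On a live machine the same function would be fed with: sensors | coretemp_readings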
Of course, I just had to write a Nagios plugin to monitor them:
#!/usr/bin/env perl
# Tony Baltazar. root[@]rubyninja.org
use strict;
use warnings;

# Nagios plugin exit codes.
use constant OK       => 0;
use constant WARNING  => 1;
use constant CRITICAL => 2;
use constant UNKNOWN  => 3;

# Temperatures (in °C) at or above which each state is reported.
# Anything below WARNING counts as OK.
my %THRESHOLDS = (WARNING => 75, CRITICAL => 86);

# Sample output
# Physical id 0: +55.0°C (high = +86.0°C, crit = +100.0°C)
# Core 0: +54.0°C (high = +86.0°C, crit = +100.0°C)
# Core 1: +55.0°C (high = +86.0°C, crit = +100.0°C)
my @get_current_heat = split "\n", `sensors 2>/dev/null|grep -E -e '(Physical id 0|Core [0-1])'`;

my $counter = 0;
my $output_string = '';

for my $heat_usage_per_core (@get_current_heat) {
    # Skip lines that do not look like "<label>: +<temperature>".
    next unless $heat_usage_per_core =~ /(.*):\s+\+([0-9]{1,3})/;
    my ($core, $temp) = ($1, $2);

    if ($temp >= $THRESHOLDS{CRITICAL}) {
        print "CRITICAL! $core temperature: $temp\n";
        exit(CRITICAL);
    } elsif ($temp >= $THRESHOLDS{WARNING}) {
        print "WARNING! $core temperature: $temp\n";
        exit(WARNING);
    }

    $output_string .= "$core - temperature : $temp" . 'C | ';
    $counter++;
}

# Expect three readings: the package (Physical id 0) plus two cores.
if ($counter == 3) {
    print "$output_string\n";
    exit(OK);
} else {
    print "Unable to get all CPU's temperature.\n";
    exit(UNKNOWN);
}
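Nagios decides the state of a check from the plugin's exit code, which is why the script exits with 0, 1, 2 or 3. To try the plugin by hand before wiring it into Nagios, something like the sketch below can help. The file name check_cpu_temp.pl is an assumption (use whatever name the script was saved under), nagios_status_name is a hypothetical helper, and mapping every other code to UNKNOWN is a simplification made here.

```shell
#!/bin/sh
# Translate a Nagios plugin exit code into its status name.
# Codes outside 0-2 are all reported as UNKNOWN for simplicity.
nagios_status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical path; adjust to wherever the plugin was saved.
plugin="./check_cpu_temp.pl"
if [ -x "$plugin" ]; then
    "$plugin"
    rc=$?
    echo "status: $(nagios_status_name "$rc")"
fi
```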