You can run smartd, the Smartmontools daemon, in the background and have it check your drives and send email when there are issues.

Open the file /etc/default/smartmontools with your favourite text editor. For example, using vim:

    sudo vim /etc/default/smartmontools

Uncomment the line start_smartd=yes.

How smartd scans the disks and what it does in case of errors is controlled by the daemon configuration file, /etc/smartd.conf. Again, use your favourite text editor to open this file. There should be one uncommented line, similar to:

    DEVICESCAN -m root -M exec /usr/share/smartmontools/smartd-runner

In this example (which is the default for Karmic), smartd will:

- scan for all ATA and SCSI devices (DEVICESCAN)
- address its reports to the root account (-m root)
- run the smartd-runner script instead of the mail command whenever it has a report to send (-M exec /usr/share/smartmontools/smartd-runner)
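/usr/share/smartmontools/smartd-runner is a script that saves the report to a temporary file and then runs everything it finds in /etc/smartmontools/run.d/. Take a look in that directory to see what you already have (there should be a script that mails the report).

There are several -M directives that change when and how often reports are sent. To use any of them you must also specify an address (-m something), even if you are not sending any mail.

You may include some more useful options:

    DEVICESCAN -H -l error -l selftest -f -s (O/../../5/11|L/../../5/13|C/../../5/15) -m root -M exec /usr/share/smartmontools/smartd-runner

In this example, smartd will:

- check the SMART health status (-H)
- report increases in the number of errors in the SMART error log and of failed tests in the self-test log (-l error -l selftest)
- check for failure of any Usage Attributes (-f)
- schedule an offline immediate test every Friday at 11:00, a long self-test every Friday at 13:00 and a conveyance self-test every Friday at 15:00 (the -s expression)
- address its reports to the root account and deliver them through smartd-runner (-m root -M exec /usr/share/smartmontools/smartd-runner)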
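The -s argument is an extended regular expression that smartd compares against a string of the form T/MM/DD/d/HH (test type, month, day of month, day of the week with 1 for Monday, hour). The following is a minimal sketch, not part of smartmontools, for checking by hand which time slots such an expression would match. The helper name schedule_matches and the sample date are only illustrations.

```shell
#!/bin/sh
# Sketch: check whether a smartd-style schedule regex matches a given slot.
# The regex is copied from the smartd.conf example; schedule_matches is a
# hypothetical helper name, not something shipped with smartmontools.
SCHEDULE='(O/../../5/11|L/../../5/13|C/../../5/15)'

# Usage: schedule_matches TYPE MONTH DAY WEEKDAY HOUR
# WEEKDAY is 1 (Monday) to 7 (Sunday); MONTH, DAY and HOUR use two digits.
schedule_matches() {
    # -x requires the whole 12-character string to match the expression.
    printf '%s/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5" | grep -Eqx "$SCHEDULE"
}

# Friday 26 February, 13:00: the long self-test slot of the example.
if schedule_matches L 02 26 5 13; then
    echo "long self-test would be scheduled"
fi
```

Changing the weekday field to 4 (Thursday) makes the check fail, which is a quick way to confirm that an expression only fires on the days you intended.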
You may also replace DEVICESCAN with the path of the device you would like to monitor (e.g. /dev/sda), and the daemon will monitor only that drive. You will need one such line for each device.

Actions in case of trouble

You'll want to configure the actions smartd will take in case of trouble. If all you want is a notification shown on your desktop, skip to "Personal computer" below.

Most of the time, you only need to place a script in /etc/smartmontools/run.d/. Whenever smartd wants to send a report, it executes smartd-runner, which in turn runs your script. Several variables are available to your script (again, see the smartd manpage). These come from a test run:

    SMARTD_MAILER=/usr/share/smartmontools/smartd-runner
    SMARTD_SUBJECT=SMART error (EmailTest) detected on host: XXXXX
    SMARTD_ADDRESS=root
    SMARTD_TFIRSTEPOCH=1267409738
    SMARTD_FAILTYPE=EmailTest
    SMARTD_TFIRST=Sun Feb 28 21:45:38 2010 VET
    SMARTD_DEVICE=/dev/sda
    SMARTD_DEVICETYPE=sat
    SMARTD_DEVICESTRING=/dev/sda
    SMARTD_FULLMESSAGE=This email was generated by the smartd daemon running on:
    SMARTD_MESSAGE=TEST EMAIL from smartd for device: /dev/sda

Your script also has a temporary copy of the report available as "$1". The copy is deleted after your script finishes, but the same content is written to /var/log/syslog.

Personal computer

For a visual notification, you may just install smart-notifier. You will see a large popup with the report.

Alternatively, you may create a custom notification (bubble) as seen in other GNOME programs. You will need to install the libnotify-bin package:

    sudo aptitude install libnotify-bin

Now create a text file called 60notify in /etc/smartmontools/run.d:

    sudo vi /etc/smartmontools/run.d/60notify

and add the following to the file:

    #!/bin/sh
    DISPLAY=:0.0 notify-send --icon=important "Possible disk failure" "$SMARTD_DEVICE may have a problem"

(The DISPLAY=:0.0 part is a variable assignment that helps programs locate your X server. It is already set in your terminal, but the script lacks it because it runs in a different session.)

Now give it execute permissions:

    sudo chmod +x /etc/smartmontools/run.d/60notify

This produces a libnotify bubble with a warning icon.

You may also experiment with Zenity, using this line instead of the notify-send one:

    DISPLAY=:0.0 zenity --text-info --filename="$1" --title="smartd: $SMARTD_DEVICE may have a problem"

Notice: Be very careful with these scripts, as they run under the root account.

Server

Here, you may wish to handle things differently. In this example we want to mail an admin and shut down the server.

Comment out the line in /etc/smartd.conf that contains DEVICESCAN by adding # to the beginning of the line. Then add this to the end of the file:

    /dev/hda -H -l error -l selftest -f -s (O/../../5/11|L/../../5/13|C/../../5/15) \
    -m admin@somewhere.com -M exec /usr/share/smartmontools/smartd-runner

(Be sure not to add any whitespace after the "\")

Next, create the script that shuts down the computer *after* the admin has been mailed. Create a text file called 99shutdown in /etc/smartmontools/run.d and add the following to the file:

    #!/bin/sh
    sleep 40
    shutdown -h now

Make it executable in the same way (sudo chmod +x /etc/smartmontools/run.d/99shutdown). The number 99 at the start of the filename ensures that it is called last when smartd-runner runs. It waits 40 seconds and then shuts down the computer. Of course, you may customize this at will; you may not wish to turn off the server.

Now it is time to start the daemon:

    sudo service smartmontools start

Testing

If you want to test all these actions, add -M test after exec /usr/share/smartmontools/smartd-runner in /etc/smartd.conf and restart the daemon (sudo service smartmontools restart). When the daemon comes up, it executes the script immediately with a test message.

Notice: If you included the shutdown -h line, the script will shut down the computer as soon as the service starts. To fix this, you will have to start the computer in recovery mode and remove the -M test option from /etc/smartd.conf.
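As an illustration of the run.d mechanism described above, here is a minimal sketch of a script that appends each report to its own log file, using the SMARTD_* variables and the temporary report passed as "$1". The file name 50logreport, the LOGFILE path and the log_report function are assumptions made for this example; none of them ships with smartmontools.

```shell
#!/bin/sh
# Sketch of a script for /etc/smartmontools/run.d/ (hypothetical name: 50logreport).
# smartd-runner passes the temporary report file as "$1"; smartd exports the
# SMARTD_* variables. LOGFILE is an assumed location, override it as needed.
LOGFILE="${LOGFILE:-/var/log/smartd-reports.log}"

log_report() {
    report_file="$1"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        echo "device:  ${SMARTD_DEVICE:-unknown}"
        echo "failure: ${SMARTD_FAILTYPE:-unknown}"
        echo "message: ${SMARTD_MESSAGE:-none}"
        # Copy the temporary report before smartd-runner deletes it.
        if [ -n "$report_file" ] && [ -r "$report_file" ]; then
            cat "$report_file"
        fi
    } >> "$LOGFILE"
}

# Only act when called with a report file, as smartd-runner does.
if [ $# -gt 0 ]; then
    log_report "$1"
fi
```

To try it without waiting for a real failure, you could run it by hand with the variables set, for example: sudo SMARTD_DEVICE=/dev/sda SMARTD_FAILTYPE=EmailTest ./50logreport /tmp/some-report. Like the other scripts in run.d, it needs execute permissions.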
Based on Gentoo Wiki: HOWTO Monitor your hard disk(s) with smartmontools.

Note

Before running this, be sure to check that you have a "mail" command, and send a test message to your address first. On my default Feisty:

    jim@beorn:~$ mail
    The program 'mail' can be found in the following packages:
    Try: sudo apt-get install <selected package>
    Make sure you have the 'universe' component enabled
    bash: mail: command not found

Utility: Checking all disks at once

Note: Following the Gentoo Wiki, I made a modified script which checks all the disks in /dev/disk/by-id/. Just invoke the script as follows:

    ./smart.sh short|long|offline

The script creates a directory named smart-logs and stores all the files there.

    #!/bin/bash
    # Script by Meliorator. irc://irc.freenode.net/Meliorator
    # modified by Ranpha

    # Require at least one test type.
    [ $# -eq 0 ] && echo "Usage: $0 type [type] [type]" && exit 1

    # Make sure the log directory exists.
    [ ! -e smart-logs ] && mkdir smart-logs
    [ ! -d smart-logs ] && echo "Can not create smart-logs dir" && exit 1

    a=0   # longest wait reported so far, in minutes
    for t in "$@"; do
        # Pick the log that holds the results of this test type.
        case "$t" in
            offline) l=error;;
            short|long) l=selftest;;
            *) echo "$t is an unrecognised test type. Skipping..." && continue;;
        esac

        # Start the test on every disk and remember the longest wait.
        for hd in /dev/disk/by-id/ata*; do
            r=$(( $(smartctl -t "$t" -d ata "$hd" | grep 'Please wait' | awk '{print $3}') ))
            echo "Check $hd - $t test in $r minutes"
            [ "$r" -gt "$a" ] && a=$r
        done

        echo "Waiting $a minutes for all tests to complete"
        sleep "${a}m"

        # Append the results (stdout and stderr) to one log per disk.
        for hd in /dev/disk/by-id/ata*; do
            smartctl -l "$l" -d ata "$hd" >> "smart-logs/smart-${t}-${hd##*/}.log" 2>&1
        done
    done

    # Ring the terminal bell ten times.
    for i in {1..10}; do
        sleep .01
        echo -n -e \\a
    done
    echo "All tests have completed"

(Remember to give execute permissions to the script with chmod +x smart.sh.)
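The script above depends on pulling the estimated duration out of the "Please wait" line that smartctl prints when a test starts. That step can be tried on its own without touching a drive. This is only a sketch: the sample text is illustrative rather than captured from a real disk, and wait_minutes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract the wait time from a smartctl "Please wait" line.
# sample_output is illustrative text, not output captured from a real drive.
sample_output='Testing has begun.
Please wait 2 minutes for test to complete.
Use smartctl -X to abort test.'

# wait_minutes is a hypothetical helper. It prints the third word of the
# first "Please wait" line, or 0 when no such line is present.
wait_minutes() {
    printf '%s\n' "$1" |
        awk '/Please wait/ { print $3; found = 1; exit } END { if (!found) print 0 }'
}

echo "Estimated wait: $(wait_minutes "$sample_output") minutes"
```

Printing 0 when no estimate is found keeps later arithmetic such as [ "$r" -gt "$a" ] from failing on an empty value.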