Using a Cron job and ‘du’ to generate a daily report on disk usage for all users
Here’s a bash script that lists the users who consume the most disk space on a Linux server, saves the results in a file, and implements log rotation. The script must be run as root and can be scheduled to run daily as a cron job.
The logger command is called with the custom tag user_disk_usage, so every journal entry the script writes is logged under that category.
Script: user_disk_usage.sh
#!/bin/bash
# Directory for storing logs
LOG_DIR="/var/log/user_disk_usage"
LOG_FILE="${LOG_DIR}/user_disk_usage.log"
CSV_LOG_FILE="${LOG_DIR}/user_disk_usage.csv"
# Function to log to the system journal with a specific category (tag)
log_to_journal() {
local log_message=$1
logger -t user_disk_usage "$log_message"
}
# Check if the script is run as root
if [ "$EUID" -ne 0 ]; then
echo "Error: This script must be run as root."
log_to_journal "Error: Script attempted to run without root privileges."
exit 1
fi
# Check if '--csv' argument is passed
OUTPUT_CSV=false
if [[ "$1" == "--csv" ]]; then
OUTPUT_CSV=true
LOG_FILE=$CSV_LOG_FILE
fi
# Ensure the log directory exists
mkdir -p "$LOG_DIR"
# Rotate logs: delete date-stamped report files older than 7 days
find "$LOG_DIR" -name "user_disk_usage_*.log" -type f -mtime +6 -exec rm {} \;
find "$LOG_DIR" -name "user_disk_usage_*.csv" -type f -mtime +6 -exec rm {} \;
# Create a new log file with a date stamp
TODAY=$(date +"%Y-%m-%d")
if [ "$OUTPUT_CSV" = true ]; then
LOG_FILE="${LOG_DIR}/user_disk_usage_${TODAY}.csv"
else
LOG_FILE="${LOG_DIR}/user_disk_usage_${TODAY}.log"
fi
# Log start of the script to the system journal under the 'user_disk_usage' category
log_to_journal "Script started: Generating user disk usage report."
# Get the disk usage for each user and save it to the log file
if [ "$OUTPUT_CSV" = true ]; then
echo "Directory,Size" > "$LOG_FILE"
else
echo "Disk usage report for $(date):" > "$LOG_FILE"
echo "===================================" >> "$LOG_FILE"
fi
# Capture the disk usage results
du_output=$(du -sh /home/* 2>/dev/null | sort -rh)
# Write the results to the log file
if [ "$OUTPUT_CSV" = true ]; then
# Convert to CSV format. du separates size and path with a tab,
# so split on tabs to keep paths that contain spaces intact.
echo "$du_output" | awk -F'\t' '{print $2 "," $1}' >> "$LOG_FILE"
else
echo "$du_output" >> "$LOG_FILE"
echo "===================================" >> "$LOG_FILE"
echo "End of report" >> "$LOG_FILE"
fi
# Log results to system journal under the 'user_disk_usage' category
log_to_journal "User disk usage results: $du_output"
# Log end of the script to system journal under the 'user_disk_usage' category
log_to_journal "Script finished: User disk usage report generated."
# Optional: Set proper permissions on the log file
chmod 600 "$LOG_FILE"
exit 0
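The CSV conversion step can be tried on its own. The following is a minimal sketch, assuming du's usual SIZE<TAB>PATH output; du_to_csv is a hypothetical helper name and is not part of the script above. Splitting on the tab alone keeps a path that contains spaces in a single field.

```shell
#!/bin/bash
# Sketch: turn du output ("SIZE<TAB>PATH" per line) into CSV rows.
# du_to_csv is a hypothetical helper, not part of the original script.
du_to_csv() {
    echo "Directory,Size"
    # Split on the tab only, and skip lines that do not have two fields.
    awk -F'\t' 'NF == 2 { print $2 "," $1 }'
}
```

It reads from standard input, so it could be used as: du -sh /home/* 2>/dev/null | sort -rh | du_to_csv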
Notes
- du -sh /home/*: Calculates the disk usage of each user's directory under /home. You may need to adjust this path based on where user directories are located on your system.
- Log rotation: Each run writes a new date-stamped report (for example user_disk_usage_YYYY-MM-DD.log), and reports older than 7 days are deleted.
- Log directory: Logs are stored in /var/log/user_disk_usage. You can adjust this to any preferred location.
- Permissions: The log file is secured with chmod 600 so that only root can read it.
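The age-based cleanup can be sketched as a small function. This is an illustration under assumptions: rotate_reports is a hypothetical helper, and it expects the date-stamped file names the script creates. find's -mtime +N test matches files last modified more than N whole days ago, so keeping 7 days means testing for +6.

```shell
#!/bin/bash
# Sketch: delete date-stamped reports older than a given number of days.
# rotate_reports is a hypothetical helper, not part of the original script.
rotate_reports() {
    local dir=$1 keep_days=$2
    # -mtime +N matches files modified more than N whole days ago,
    # so keep_days=7 becomes -mtime +6.
    find "$dir" -maxdepth 1 -type f \
        \( -name 'user_disk_usage_*.log' -o -name 'user_disk_usage_*.csv' \) \
        -mtime +"$((keep_days - 1))" -delete
}
```

Calling rotate_reports /var/log/user_disk_usage 7 would then remove only matching reports that are at least a week old and leave other files in the directory untouched.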
Features
- System Journal Logging: The logger command includes the tag user_disk_usage, which categorizes all log entries under this tag. The function log_to_journal handles all logging with this specific tag.
- Log Rotation: The script keeps the last 7 days of reports; files older than 7 days are deleted.
- Output: The output can be either plain text or CSV, depending on whether the --csv argument is passed.
- Root Privileges Check:
  - Before running the main logic, the script checks whether it is being run by the root user with if [ "$EUID" -ne 0 ].
  - If not, it logs an error to the system journal using logger and exits with status code 1.
- CSV Output Option:
  - When the --csv argument is passed, the script writes the daily disk usage in CSV format with the columns Directory and Size. The script as listed does not compute a difference against the previous report (Directory, Previous Size, Current Size, Difference).
  - If no argument is passed, it defaults to the regular text output.
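A per-directory difference between two reports could be computed along the following lines. This is a sketch under stated assumptions: diff_reports is a hypothetical helper, and both input files are assumed to contain headerless directory,size rows with sizes as plain KiB numbers (for example from du -sk), because human-readable sizes such as 4.0K cannot be subtracted directly.

```shell
#!/bin/bash
# Sketch: compare two headerless CSV reports of "directory,size_in_kib" rows.
# diff_reports and the KiB input format are assumptions, not part of the
# original script, which writes human-readable sizes from du -sh.
diff_reports() {
    local previous=$1 current=$2
    echo "Directory,Previous Size,Current Size,Difference"
    awk -F',' -v OFS=',' '
        # First file: remember the previous size of each directory.
        FILENAME == ARGV[1] { prev[$1] = $2; next }
        # Second file: directories with no previous entry start from 0.
        {
            old = ($1 in prev) ? prev[$1] : 0
            print $1, old, $2, $2 - old
        }
    ' "$previous" "$current"
}
```

Directories that exist only in the previous report are not listed; handling them would need an extra pass over the remembered entries.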
Setting up the cron job:
- Make the script executable:
chmod +x /path/to/user_disk_usage.sh
- Open the root crontab file:
sudo crontab -e
- Add the following line to schedule the script to run daily at midnight:
0 0 * * * /path/to/user_disk_usage.sh
Command to View Journal Entries:
To view the journal entries for this script, use the following journalctl command, which filters on the user_disk_usage tag:
journalctl -t user_disk_usage
This displays all the entries that the script has logged in the system journal under the user_disk_usage category.
Example Usage:
- Without CSV (default output):
/path/to/user_disk_usage.sh
- With CSV Output:
/path/to/user_disk_usage.sh --csv
- View Journal Entries:
journalctl -t user_disk_usage
This way, you can easily track the disk usage reports and any errors or important messages related to this script by querying the system journal using the specified tag.
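For day-to-day checks, the journal query can be narrowed to a time range and a maximum number of entries. The sketch below wraps that in a function; show_report_entries and its defaults are hypothetical, while -t, --since, -n, and --no-pager are standard journalctl options.

```shell
#!/bin/bash
# Sketch: narrow the journal query for this script's tag.
# show_report_entries is a hypothetical helper, not part of the original script.
show_report_entries() {
    local since=${1:-today} max_lines=${2:-20}
    # -t filters on the syslog tag, --since bounds the time range,
    # -n limits the number of entries, --no-pager prints straight to stdout.
    journalctl -t user_disk_usage --since "$since" -n "$max_lines" --no-pager
}
```

For example, show_report_entries yesterday 50 would print up to 50 of the most recent entries logged since the start of the previous day.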