Automatic Reports on Disk Usage

Category : bash   scripts   du


Using a Cron job and ‘du’ to generate a daily report on disk usage for all users

Here’s a bash script that lists the users who consume the most disk space on a Linux server, saves the results to a date-stamped file, and deletes reports older than seven days. The script must be run as root (it checks for this and exits otherwise) and can be scheduled as a daily cron job.

Every message the script sends to the system journal goes through the logger command with the custom tag user_disk_usage, so all of its journal entries can be filtered by that tag.

Script: user_disk_usage.sh

#!/bin/bash

# Directory for storing logs
LOG_DIR="/var/log/user_disk_usage"
LOG_FILE="${LOG_DIR}/user_disk_usage.log"
CSV_LOG_FILE="${LOG_DIR}/user_disk_usage.csv"

# Function to log to the system journal with a specific category (tag)
log_to_journal() {
    local log_message=$1
    logger -t user_disk_usage "$log_message"
}

# Check if the script is run as root
if [ "$EUID" -ne 0 ]; then
    echo "Error: This script must be run as root."
    log_to_journal "Error: Script attempted to run without root privileges."
    exit 1
fi

# Check if '--csv' argument is passed
# (the date-stamped LOG_FILE name is chosen further down, based on this flag)
OUTPUT_CSV=false
if [[ "$1" == "--csv" ]]; then
    OUTPUT_CSV=true
fi

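# Ensure the log directory exists
mkdir -p "$LOG_DIR"

# Rotate logs: keep roughly the last 7 days of reports.
# The patterns match the date-stamped names created below
# (user_disk_usage_YYYY-MM-DD.log and user_disk_usage_YYYY-MM-DD.csv).
find "$LOG_DIR" -name "user_disk_usage_*.log" -type f -mtime +6 -exec rm {} \;
find "$LOG_DIR" -name "user_disk_usage_*.csv" -type f -mtime +6 -exec rm {} \;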

# Create a new log file with a date stamp
TODAY=$(date +"%Y-%m-%d")
if [ "$OUTPUT_CSV" = true ]; then
    LOG_FILE="${LOG_DIR}/user_disk_usage_${TODAY}.csv"
else
    LOG_FILE="${LOG_DIR}/user_disk_usage_${TODAY}.log"
fi

# Log start of the script to the system journal under the 'user_disk_usage' category
log_to_journal "Script started: Generating user disk usage report."

# Get the disk usage for each user and save it to the log file
if [ "$OUTPUT_CSV" = true ]; then
    echo "Directory,Size" > "$LOG_FILE"
else
    echo "Disk usage report for $(date):" > "$LOG_FILE"
    echo "===================================" >> "$LOG_FILE"
fi

# Capture the disk usage results, largest first
# (sort -h understands human-readable sizes such as 1.2G)
du_output=$(du -sh /home/* 2>/dev/null | sort -rh)

# Write the results to the log file
if [ "$OUTPUT_CSV" = true ]; then
    # Convert to CSV format. du separates size and path with a tab,
    # so splitting on tabs keeps paths that contain spaces intact.
    echo "$du_output" | awk -F'\t' '{print $2 "," $1}' >> "$LOG_FILE"
else
    echo "$du_output" >> "$LOG_FILE"
    echo "===================================" >> "$LOG_FILE"
    echo "End of report" >> "$LOG_FILE"
fi

# Log results to system journal under the 'user_disk_usage' category
log_to_journal "User disk usage results: $du_output"

# Log end of the script to system journal under the 'user_disk_usage' category
log_to_journal "Script finished: User disk usage report generated."

# Optional: restrict the report so only its owner (root) can read or write it
chmod 600 "$LOG_FILE"

exit 0
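
The heart of the script is the du and sort pipeline. The following is a minimal sketch of that step that can be tried without root in a throwaway directory. The function name top_usage and the demo directory are hypothetical, and it swaps du -sh with sort -rh for du -sk with sort -rn, so that sizes are plain KiB numbers.

```shell
#!/bin/bash
# Hypothetical helper: print the N largest first-level subdirectories of a base directory.
top_usage() {
    local base=$1 count=${2:-5}
    # -s summarizes each argument; -k reports sizes in KiB so a numeric sort works.
    # The trailing slash in the glob restricts the match to directories.
    du -sk "$base"/*/ 2>/dev/null | sort -rn | head -n "$count"
}

# Demonstration data: one small and one large directory (names are illustrative)
demo=$(mktemp -d)
mkdir "$demo/small" "$demo/large"
head -c 10240   /dev/urandom > "$demo/small/file"
head -c 2097152 /dev/urandom > "$demo/large/file"

# Show only the single largest directory
top_usage "$demo" 1
```

On a real server the equivalent call would be something like top_usage /home 10, assuming user directories live under /home.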

Notes

  1. du -sh /home/*: Calculates the disk usage for each user in the /home directory. You may need to adjust this path based on where user directories are located on your system.
  2. Log rotation: Each run writes a new date-stamped file (for example user_disk_usage_YYYY-MM-DD.log) and uses find with -mtime +6 to delete reports that are about seven days old or older.
  3. Log directory: Logs are stored in /var/log/user_disk_usage. You can adjust this to any preferred location.
  4. Permissions: The log file is secured with chmod 600 to ensure that only root can read it.
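
The rotation rule from note 2 can be tried safely outside /var/log. This is a minimal sketch, assuming GNU find and GNU touch (for the -d option). The function name rotate_reports and the file names in the temporary directory are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: delete date-stamped reports last modified 7 or more days ago.
rotate_reports() {
    local dir=$1
    # -mtime +6 matches files whose age, counted in whole 24-hour periods, exceeds 6
    find "$dir" -type f \
        \( -name 'user_disk_usage_*.log' -o -name 'user_disk_usage_*.csv' \) \
        -mtime +6 -exec rm -f {} +
}

# Demonstration in a throwaway directory
demo_dir=$(mktemp -d)
touch -d '10 days ago' "$demo_dir/user_disk_usage_old.log"   # backdated file
touch "$demo_dir/user_disk_usage_new.log"                    # fresh file
rotate_reports "$demo_dir"
ls "$demo_dir"
```

After the call, only the fresh file should remain in the demo directory.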

Features

  1. System Journal Logging: The logger command is called with the tag user_disk_usage, which groups all of the script's journal entries under that tag. The function log_to_journal handles all logging with this tag.
  2. Log Rotation: The script keeps roughly the last 7 days of reports; older files are deleted.
  3. Output: The report is written as plain text by default, or as CSV when the --csv argument is passed.
  4. Root Privileges Check:
    • Before running the main logic, the script checks if it’s being run by the root user with if [ "$EUID" -ne 0 ].
    • If not, it logs an error to the system journal using logger and exits with status code 1.
  5. CSV Output Option:
    • When the --csv argument is passed, the script writes the daily disk usage in CSV format with two columns: Directory and Size.
    • If no argument is passed, it defaults to the regular text output.
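
The CSV conversion can be sketched on its own, without running du. The example below assumes the input has the shape du prints (size, a tab, then the path). The function name du_to_csv and the sample paths are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn "size<TAB>path" lines into CSV rows with a header.
du_to_csv() {
    echo "Directory,Size"
    # Split on tabs only, so a path containing spaces stays in one field
    awk -F'\t' '{ print $2 "," $1 }'
}

# Sample input standing in for: du -sh /home/* | sort -rh
printf '1.2G\t/home/alice\n340M\t/home/bob smith\n' | du_to_csv
```

A path that itself contains a comma would still break this simple format; quoting the first column is one way to handle that if it matters on your system.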

Setting up the cron job:

  1. Make the script executable:
    chmod +x /path/to/user_disk_usage.sh
    
  2. Open the root crontab file:
    sudo crontab -e
    
  3. Add the following line to schedule the script to run daily at midnight:
    0 0 * * * /path/to/user_disk_usage.sh
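
If you prefer not to edit the crontab by hand, the entry can also be added from a script. This is a minimal sketch, not part of the original setup: the function name with_cron_entry is hypothetical, and it only transforms text, so it can be tested without touching any real crontab.

```shell
#!/bin/bash
# Hypothetical helper: read a crontab on stdin and print it with the given
# entry appended, unless an identical line is already present.
with_cron_entry() {
    local entry=$1 current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string (so * is literal), -x: match the whole line, -q: quiet
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

entry='0 0 * * * /path/to/user_disk_usage.sh'
# Dry run against an empty crontab: prints the entry once
printf '' | with_cron_entry "$entry"
```

To apply it for real, as root, something like crontab -l 2>/dev/null | with_cron_entry "$entry" | crontab - should work on cron implementations that accept a crontab on standard input. Running it twice does not duplicate the line.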
    

Command to View Journal Entries:

To view the journal entries for this script, you can use the following journalctl command that filters based on the user_disk_usage tag:

journalctl -t user_disk_usage

This will display all the entries that the script has logged in the system journal under the user_disk_usage category.

Example Usage:

  • Without CSV (default output):
     /path/to/user_disk_usage.sh
    
  • With CSV Output:
     /path/to/user_disk_usage.sh --csv
    
  • View Journal Entries:
     journalctl -t user_disk_usage
    

This way, you can easily track the disk usage reports and any errors or important messages related to this script by querying the system journal using the specified tag.


About Guillaume Plante

A developer with a passion for technology, music, astronomy and art. Coding range: hardware/drivers, security, AI, C/C++, PowerShell.

Email : guillaumeplante.qc@gmail.com

Website : https://arsscriptum.ddns.net
