Automagically Graphing on Linux

So my last post was about administration on Linux. After running Oracle on Windows for way too many years we migrated our Oracle systems to 64-bit Linux over the past year.

Now that we're on Linux I needed a lightweight way to capture and graph some performance data. The performance data was a mix of OS utilities and 3rd party utilities, but all the utilities output text, so I knew I could screen scrape the output, massage it a little, and save the data to a file. After I had the data in a file I knew I could run it through GnuPlot to graph it.

There is a fantastic gnuplot information page here: here. I started my project with a basic ugly graph and using the tips on the page I was able to create a very pretty and readable graph with only a few updates to a format file.

Here are the basic steps:

1) Capture the raw data

2) Massage the data and write it to a file

3) Graph the data

4) Put it all together and Crontab it!

In this example I will graph output of a simple query. The data is collected 1 file per day. Every time the getData.sh is run, data is added to the current day's datafile. The query determines how many users are logged into an instance. Beware of formatting, the "less than" and "greater than" symbol is being eaten by WordPress in the code examples below.

Implementation:

1) Prepare the SQL Query:

file: query.sql
SET FEEDBACK OFF
SET HEADING OFF
SET PAGESIZE 0
select to_char(sysdate,'YYYY-MM-DD-HH24:MI') || ' ' || username ||
' ' || UPPER(machine) || ' ' || COUNT(*) as text
FROM   v$session
WHERE  LENGTH(USERNAME) > 0
group by username, machine
union all
select to_char(sysdate,'YYYY-MM-DD-HH24:MI') ||
' -TOTAL- ALL ' || COUNT(*) as text
FROM   v$session;
EXIT
If you run this script in SQLPlus this produces output with 4 columns:

Col1: Date and time stamp of data collection. Formatting is modified to make it easier for GnuPlot to figure out.
Col2: The username of the logged in user.
Col3: The machine the user is logged in from.
Col4: The total count of users with username in Col2 that are logged in from Col3.

2) Set up format for Graph

file: format.gnu
set terminal png
set output "DATE-DBConnections.png"set xdata time
set xlabel "Time Of Day"
set xtics rotateset ylabel "Number of Connections"
set ytics 10
set mytics 2
set yrange [0:200]set grid ytics xtics

set timefmt "%Y-%m-%d-%H:%M"

plot "> cat conndataDATE.txt | grep TOTAL" using 1:4 with lines title "Total Oracle Sessions"

Note that "DATE" is going to be substituted when we run the script.


3) Get data and graph it

This shell script runs the show. It creates a variable called TODAY which is a date in format "YYYY-MM-DD". It runs SQLPLUS with the script we created above and appends it to the conndata$TODAY.txt file. Finally it runs GnuPlot with the format.gnu file and conndata$TODAY.txt file as input. The tricky part of this is that SED is used to substitute the date contained in the TODAY shell variable for any occurrences of "DATE" in the format.gnu. This allows us to read today's data file and write out a .png with today's date in the filename. file: getData.sh
#!/bin/sh
# Variables
TODAY=$(date +%Y-%m-%d)
# Gather the data
sqlplus -S perfstat/perfstat @query.sql >> conndata$TODAY.txt
# Generate the graph
cat format.gnu | sed s/DATE/$TODAY/ | gnuplot

4) Crontab it!

Now you just need to make this sucker run automagically with Cron. It will run, append the data to the file, and then generate the graph. The first time it is run after 12:01am a new datafile and graph will be created. This is just how I set up archiving because it worked well for me. If you get fancy with the shell script you could have it gather data for weeks, months, or years before starting a new datafile or graph. You can add a line to crontab like this to have it run every 15 minutes:
0,15,30,45 * * * * /monitor/getData.sh
Here is an example of the output :
Example Output

Add new comment