Phasor Burn

Warning: Do not look into phasor with remaining eye.

About

Yet another collection of random links and rantings of a greying unix geek with a photography bent. Pass the Guinness and Grecian Formula.

Archive for the 'Open Source' Category

Nostalgia

Friday, April 26th, 2013

762748652.jpg

garden-caffe

Friday, April 30th, 2010


garden-caffe, originally uploaded by palkieu.


This guy is rocking the “coffee and bathrobe in the back yard” look. Something I can aspire to when the weather turns a bit warmer, LOL.

Apache mod_rewrite cache

Wednesday, July 22nd, 2009

Let’s pretend you have a stock ticker application that is getting hammered. A lot of people are interested in one particular stock for some reason.

The poor little app server is running some lumbering, hulking piece of code written in a legacy language (Java). It can’t keep up with all the requests for this same stock symbol over and over.

Furthermore, the developers haven’t been able to put their own caching code into the application just yet.

How do we fix this, from a system administrator’s perspective?

Perhaps the better way would be to use mod_cache.

Unfortunately, this option did not exist on our web servers and we needed something PDQ to take the load off the app servers. We went with mod_rewrite instead and a small script to do the caching.

Cron runs a script every 5 minutes that calls the app server and caches the result in a file. mod_rewrite rules then tell Apache to serve that file instead of passing the request to the app server (via the existing rewrite rules).

In the Apache virtual host entry :

RewriteCond %{REQUEST_URI} ^/ticker/s=AAPL$
RewriteCond /app/ticker/cache/AAPL -f
RewriteRule ^(.*) /app/ticker/cache/AAPL [L]

A script to maintain the cache :

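#!/bin/bash

# /bin is included because mv and rm live there on many systems
export PATH=/bin:/usr/bin

CACHEFILE=/app/ticker/cache/AAPL
TMPFILE=${CACHEFILE}.$$
# Clean up the temp file on exit; -f because it is already gone after a successful mv
trap 'rm -f "${TMPFILE}"' 0 1 15

# -s: no progress meter, -f: exit non-zero on an HTTP error so a failed
# fetch never replaces the existing cache file
curl -sf -D - -H "host: www.example.com" \
"http://app01/ticker/s=AAPL" > "${TMPFILE}" \
&& mv "${TMPFILE}" "${CACHEFILE}"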
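
The write-to-a-temp-file-then-rename step is what keeps Apache from ever serving a half-written cache file. Here is a minimal sketch of just that pattern, with the fetch swapped for an arbitrary command so it can be tried without an app server. The `refresh_cache` name and the scratch directory are made up for illustration and are not part of the real setup :

```shell
#!/bin/bash
# Sketch only: refresh_cache is a hypothetical helper, not the production script.
# Usage: refresh_cache CACHEFILE COMMAND [ARGS...]
refresh_cache() {
    local cachefile=$1
    shift
    local tmpfile="${cachefile}.$$"

    # Run the fetch command into a temp file next to the cache file,
    # so the final mv is a rename on the same filesystem.
    if "$@" > "${tmpfile}"; then
        mv "${tmpfile}" "${cachefile}"
    else
        # Fetch failed: drop the partial output, keep the old cache.
        rm -f "${tmpfile}"
        return 1
    fi
}

# Demo in a scratch directory, with printf and false standing in for curl.
demo_dir=$(mktemp -d)
refresh_cache "${demo_dir}/AAPL" printf 'price=1\n'
refresh_cache "${demo_dir}/AAPL" false || echo "fetch failed, old cache kept"
cat "${demo_dir}/AAPL"
```

The second call fails on purpose. The cache file still holds the result of the first call, which is the behaviour you want when the app server is having a bad day.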

The cron entry to make it go :

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /app/ticker/bin/update-ticker-cache >/dev/null 2>&1

Now when the outside url http://www.example.com/ticker/s=AAPL is hit, Apache looks for the file /app/ticker/cache/AAPL and delivers its contents instead of passing the request through to the app server (via other rewrite rules).

Cron and the script keep this cache updated every 5 minutes.

The graph speaks for itself.

aapl.png

This is based on a real world example. Details have been changed to protect the guilty parties.

Excel 0, GnuPlot 1

Sunday, July 12th, 2009

excel-blows.png

Excel computed its brains out for 30+ seconds before giving me a partial graph and the complaint that I had too much data for it to continue. I guess you can’t really do anything *serious* with Excel for charting data, such as a month of 1-minute interval data points.

(60 minutes x 24 hours x 30 days = 43,200 points)
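
The arithmetic is easy to check from the shell. The logs graphed here are from May, which has 31 days, so the real data set is slightly larger still :

```shell
# Data points at one sample per minute.
points_30=$((60 * 24 * 30))
points_31=$((60 * 24 * 31))
echo "30 days: ${points_30} points"   # 43200
echo "31 days: ${points_31} points"   # 44640
```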

I had to resort to gnuplot, which is a bit funky but easily handles large datasets such as this.

What was I trying to graph anyways?

Web server hits per minute and average response time per minute, over a month.

I had custom apache log lines that looked like this :
63.214.229.120 - - [01/May/2009:00:00:00 +0000] "GET /s HTTP/1.1" 302 381 - - - "http://muy/url/here" "MOT-SPARK/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0" "-" "-" 922us 0s -

I piped the logs through distillation code :

#!/bin/sh

bzcat logs/access_log_2009-05-??.bz2 |\
sed -e 's/\[//g' -e 's/\]//g' |\
awk ' {
day=substr($4,1,2)
month=substr($4,4,3)
year=substr($4,8,4)
hhmm=substr($4,13,5)

if ( month =="Jan" ) month="01"
if ( month =="Feb" ) month="02"
if ( month =="Mar" ) month="03"
if ( month =="Apr" ) month="04"
if ( month =="May" ) month="05"
if ( month =="Jun" ) month="06"
if ( month =="Jul" ) month="07"
if ( month =="Aug" ) month="08"
if ( month =="Sep" ) month="09"
if ( month =="Oct" ) month="10"
if ( month =="Nov" ) month="11"
if ( month =="Dec" ) month="12"

timestamp=year"-"month"-"day"-"hhmm

hits[timestamp]++
time[timestamp] += int($(NF-2)/1000)
timeavg[timestamp] = int(time[timestamp] / hits [timestamp])
}

END {
for (timestamp in hits) {
print timestamp " " hits[timestamp] " " timeavg[timestamp]
}
}
' | sort

Which after about a minute resulted in data looking like :

Time, Hits, Average Response (ms)
2009-05-01-00:00 357 508
2009-05-01-00:01 363 607
2009-05-01-00:02 357 589
2009-05-01-00:03 381 693
2009-05-01-00:04 405 576
2009-05-01-00:05 391 369

( That was the data set that Excel choked on… )
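
To try the distillation logic without a month of logs, here is a cut-down sketch that runs the same kind of aggregation over three made-up log lines (shortened versions of the sample above). The IP address, the timings and the `distill` function name are all invented for the demo. It differs from the script above in two ways: it splits the date with `split()` instead of `substr()`, and it sums microseconds and converts to milliseconds once per minute, so sub-millisecond requests are not rounded down to zero before averaging :

```shell
#!/bin/sh
# Sketch only: distill is a hypothetical name and the log lines are invented.
distill() {
    sed -e 's/\[//g' -e 's/\]//g' |
    awk '{
        # After sed, $4 looks like 01/May/2009:00:00:00
        # d[1]=day d[2]=month name d[3]=year d[4]=hour d[5]=minute
        split($4, d, "[/:]")

        # Position of the month name in the packed string gives its number.
        month = sprintf("%02d", (index("JanFebMarAprMayJunJulAugSepOctNovDec", d[2]) + 2) / 3)
        timestamp = d[3] "-" month "-" d[1] "-" d[4] ":" d[5]

        hits[timestamp]++
        usec[timestamp] += $(NF-2)    # "922us" converts to 922 in numeric context
    }
    END {
        for (timestamp in hits)
            print timestamp, hits[timestamp], int(usec[timestamp] / hits[timestamp] / 1000)
    }' | sort
}

sample_output=$(printf '%s\n' \
    '10.0.0.1 - - [01/May/2009:00:00:00 +0000] "GET /s HTTP/1.1" 302 381 "-" "-" 922us 0s -' \
    '10.0.0.1 - - [01/May/2009:00:00:30 +0000] "GET /s HTTP/1.1" 302 381 "-" "-" 3078us 0s -' \
    '10.0.0.1 - - [01/May/2009:00:01:05 +0000] "GET /s HTTP/1.1" 200 512 "-" "-" 5000us 0s -' |
    distill)

echo "$sample_output"
# 2009-05-01-00:00 2 2
# 2009-05-01-00:01 1 5
```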

Now, create a gnuplot script for hits per minute :

#!/opt/local/bin/gnuplot
set terminal png enhanced size 1024,768
set xdata time
set timefmt "%Y-%m-%d-%H:%M"
set format x "%d-%m-%Y"
set xlabel "time"
set grid
set style data points

set xrange [ "2009-05-01-00:00" : "2009-05-31-23:59" ]

set output "web01-05-hits.png"
set ylabel "Hits per minute"
set title "Web01 - May - Hits per minute"
plot "web01-05.dat" using 1:2 title ""

which created this graph in under a second :
web01-05-hits.png

Finally, the response time averages per minute :

#!/opt/local/bin/gnuplot
reset
set terminal png enhanced size 1024,768
set xdata time
set timefmt "%Y-%m-%d-%H:%M"
set format x "%d-%m-%Y"
set xlabel "time"
set grid
set style data points

set xrange [ "2009-05-01-00:00" : "2009-05-31-23:59" ]

set output "web01-05-response.png"
set ylabel "Average Response Time (ms)"
set title "Web01 - May - Average Response"
plot "web01-05.dat" using 1:3 title ""

web01-05-response.png
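
The two gnuplot scripts differ only in the column plotted, the labels and the output file, so they could be generated from one template. Here is a sketch, assuming a hypothetical helper named `gnuplot_script` that only prints the script text. Feeding that text to gnuplot is left to the caller :

```shell
#!/bin/sh
# Sketch only: gnuplot_script is a hypothetical helper that prints a gnuplot
# script to stdout. Usage: gnuplot_script COLUMN OUTPUT_PNG YLABEL TITLE
gnuplot_script() {
    cat <<EOF
set terminal png enhanced size 1024,768
set xdata time
set timefmt "%Y-%m-%d-%H:%M"
set format x "%d-%m-%Y"
set xlabel "time"
set grid
set style data points
set xrange [ "2009-05-01-00:00" : "2009-05-31-23:59" ]
set output "$2"
set ylabel "$3"
set title "$4"
plot "web01-05.dat" using 1:$1 title ""
EOF
}

hits_script=$(gnuplot_script 2 web01-05-hits.png "Hits per minute" "Web01 - May - Hits per minute")
response_script=$(gnuplot_script 3 web01-05-response.png "Average Response Time (ms)" "Web01 - May - Average Response")

# Print one to inspect it. To render, pipe it into gnuplot instead:
#   printf '%s\n' "$hits_script" | gnuplot
printf '%s\n' "$hits_script"
```

For two one-off charts this is overkill, but it starts to pay off once there is a web02 or a June.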


So yeah. I tried to use Excel as these were to be one-off charts and it seemed like the easiest way to do it at first.

Seduced by the simplistic (and simpleton) Microsoft way at the start…

In the end, I learned a bit about gnuplot and how dead simple it really is to ask it to chew through a decent sample size and produce a nice enough looking graph in under a second.


Too Much Innovation

Monday, June 22nd, 2009

Does The Linux Desktop Innovate Too Much?

<kosh>Yes.</kosh>

I’ve been waiting for this kind of article to come from inside the FOSS camp. I hope it is just one of many pebbles voting for the usability avalanche that leads to stability and greater traction.

In the meantime, I’ll stick with my proprietary, locked-down GUI, which sits on top of a certified UNIX platform. Neither gets in my way; both just help me get my daily work done without getting in my face.