Visualize your logs and business data by using Google Chart
Well, if you already have Splunk or LogStash, and you know how to use them to visualize and analyze your logs and business data, then you can stop reading here. (I will write separate posts about LogStash later, but probably not Splunk, as I am not a big fan of it.)
This post is written for the following circumstances:
- You don't have fancy tools such as Splunk or LogStash (due to insufficient funds, politics, or whatever other reason)
- You don't have enough resources to run fancy tools
- You want to integrate this with your web applications
The example I am going to show you today visualizes the access log of my Apache server: we simply extract useful data out of the access log and then feed that data into Google Chart (you could pick your own data-visualization framework, such as D3).
Before we dive into the details, let me show you what we are going to achieve.
To view the live example (the page is re-generated every hour), click here.
If you would like to set up something similar (or better) for your own server, then carry on reading.
WARNING: or your lunch box will be stolen tomorrow...
Format the Access Log
To visualize data, first of all you obviously have to collect whatever is meaningful to you or your business, so before you carry on, you should really think twice about what information you need.
For example, I wanted more visibility into, and a deeper understanding of, my web server, so I decided that the following information is important to me:
- HTTP response codes, as percentages: if more than 10 percent of the responses returned by my server are 404s, I will be concerned.
- HTTP request methods, as percentages: if you see lots of OPTIONS, TRACE, etc. coming through, it might be an indication that someone is attacking you.
- Client IPs, as percentages: if you see thousands of connections coming from a single IP, you had better learn what that dude/bot is trying to do sooner rather than later.
- URLs, as percentages: this lets you find your most popular pages, so you can either refine them or write something similar to gain more traffic, and also spot the pages that are not so popular. (You can use Google Analytics to learn more about visitors and their behaviour.)
- Traffic per minute: this lets you save your traffic pattern and use it later as a baseline, so that you can compare against it when things really hit the fan.
- Time consumption by URL, as percentages: this tells you which URLs consume most of the server's time overall, which helps you prioritize your performance-tuning list.
- Bandwidth consumption by URL, as percentages: this tells you which URLs consume most of the server's bandwidth overall, again helping you prioritize tuning work.
- Top 25 slowest URLs: this gives you a better sense of the customer experience; work on this list along with the most popular URLs to improve overall performance and response time.
Based on the requirements above, I decided to format my access log accordingly.
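As an illustration only (the directive below is an assumption, not the author's original format), an Apache `LogFormat` that captures everything listed above might be the standard combined format plus `%D`, the request duration in microseconds, which is needed for the "time consumption" and "slowest URLs" charts:

```
# Combined log format with request duration (%D, microseconds) appended;
# the name "timed_combined" is arbitrary.
LogFormat "%h %l %u %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"" timed_combined
CustomLog "logs/access_log" timed_combined
```

Note that `%D` reports microseconds on Apache 2.x, while `%T` reports whole seconds; the finer-grained `%D` is the better choice for ranking slow URLs.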
Extract data and feed Google Chart
Use your favorite programming language (Perl, Python, or even shell scripts) to parse the access log into the format required by Google Chart.
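As a minimal sketch in Python (the field layout is an assumption: a combined-style log with the request duration in microseconds appended after the byte count; adjust the regular expression to your own `LogFormat`):

```python
import re
from collections import Counter

# Matches a combined-style log line with the request duration (microseconds)
# appended after the byte count; adjust this to your own LogFormat.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) (?P<micros>\d+)'
)

def parse_log(lines):
    """Tally status codes, methods, client IPs and URLs from log lines."""
    stats = {"status": Counter(), "method": Counter(),
             "ip": Counter(), "url": Counter()}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not match the expected format
        stats["status"][m.group("status")] += 1
        stats["method"][m.group("method")] += 1
        stats["ip"][m.group("ip")] += 1
        stats["url"][m.group("url")] += 1
    return stats

def percentages(counter):
    """Convert raw counts into (label, percent) rows, largest first."""
    total = sum(counter.values())
    return [(label, round(100.0 * n / total, 2))
            for label, n in counter.most_common()]
```

For example, `percentages(parse_log(open("access_log"))["status"])` yields rows such as `("200", 85.5)`, ready to be dropped into a pie chart.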
All the charts in this example only require you to parse the data into a simple two-column format: a label and a numeric value per row.
Once you have the data, you can feed it into the charts. If you are not sure how, go to the live example page and view its source code (or visit the online Google Chart docs).
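To make the idea concrete, here is a small generator that embeds label/value rows into a standalone pie-chart page. The helper name and page layout are illustrative, but `google.charts.load`, `setOnLoadCallback`, `arrayToDataTable`, and `PieChart` are the standard Google Charts loader API:

```python
import json

# Page template; google.charts.load / setOnLoadCallback / arrayToDataTable
# are the standard Google Charts loader calls.
PAGE = """<html><head>
<script src="https://www.gstatic.com/charts/loader.js"></script>
<script>
google.charts.load('current', {packages: ['corechart']});
google.charts.setOnLoadCallback(function () {
  var data = google.visualization.arrayToDataTable(%s);
  new google.visualization.PieChart(
      document.getElementById('chart')).draw(data, {title: %s});
});
</script></head>
<body><div id="chart" style="width:900px;height:500px"></div></body></html>"""

def render_pie_page(title, rows):
    """Render [label, value] rows into a self-contained pie-chart page."""
    table = [["Label", "Percent"]] + [list(r) for r in rows]
    return PAGE % (json.dumps(table), json.dumps(title))
```

Writing `render_pie_page("HTTP Response Codes", [("200", 85.5), ("404", 14.5)])` to a file under your web root gives you a chart page you can serve directly.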
Depending on your requirements, you can parse your data once a day, or keep it fresh by scheduling a cron job.
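For example, a crontab entry (the script and output paths are hypothetical) that rebuilds the page hourly, matching the live example's refresh interval, could look like this:

```
# Regenerate the report at minute 0 of every hour
0 * * * * /usr/local/bin/parse_access_log.py > /var/www/html/access_report.html
```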
After practicing with your Apache log, you should be able to visualize other things: logs from your application servers, your integration servers, and so on, and perhaps your business data (though you might have to cooperate with your in-house developers or vendors to generate parse-friendly logs). You could even go one step further and define KPIs and SLAs based on the data you collect.
Posted by: leaonow on: July 28, 2016
- In: devops