Sunday, April 27, 2008

Lightening the metric load to increase performance!

It's easy to monitor everything in Introscope; the key is finding the right balance! With most tools, as you increase the metrics collected you incur some expense, usually in performance. Unfortunately it is hard to work out where metrics are coming from, and therefore how useful they are! Introscope now has a metric count type view extension to assist!

The new metric count type view dynamically breaks down your metrics into different categories based on what it's collecting. If you have a lot of metrics enabled it may take a minute or two to render, but it's great to see where your metrics are coming from.

As you grow with Introscope you'll realize that PMI metrics become pretty useless, and you'll look to start reducing them pretty quickly! If you want this type view, ask your CA/Wily rep to provide it to you! It's a must-have for agent tuning!


Tuesday, April 15, 2008

Chelsea Piers hockey schedule into your Outlook or Gmail calendars

I get annoyed at sites that provide schedules but no way to easily import them into my Outlook or Google calendar. I play hockey at Chelsea Piers in NYC and decided to finally write up something to do this for me! Feel free to use it as well.
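The tool itself is not reproduced here. As a rough sketch of the general idea only, schedule entries can be turned into an iCalendar (.ics) file, which both Outlook and Google Calendar can import. The function names, the game data, and the `example.invalid` identifiers below are all made up for illustration, and the 90-minute game length is an assumption.

```python
from datetime import datetime, timedelta, timezone

def escape_text(value):
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))

def to_ics(games, duration_minutes=90, now=None):
    """Build an iCalendar document from (start, summary, location) tuples."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//hockey-schedule//EN",  # placeholder product id
    ]
    for index, (start, summary, location) in enumerate(games):
        end = start + timedelta(minutes=duration_minutes)
        lines += [
            "BEGIN:VEVENT",
            f"UID:{start:%Y%m%dT%H%M%S}-{index}@example.invalid",
            f"DTSTAMP:{stamp}",
            # No trailing "Z": these are floating local times.
            f"DTSTART:{start:%Y%m%dT%H%M%S}",
            f"DTEND:{end:%Y%m%dT%H%M%S}",
            f"SUMMARY:{escape_text(summary)}",
            f"LOCATION:{escape_text(location)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"

# Hypothetical game, purely for illustration.
games = [(datetime(2008, 4, 20, 21, 30), "Hockey: Team A vs Team B", "Chelsea Piers, NYC")]
print(to_ics(games))
```

Saving the output to a file ending in .ics should be enough for either calendar to import it; a real version would also need to scrape or parse the published schedule, which is not shown.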



Friday, February 15, 2008

Monitoring and Tuning Verbose GC with Introscope EPAgent

Tuning garbage collection (GC) can be a difficult task without the right data, analysis tools and understanding of the runtime. With the help of the Introscope EPAgent and verbose GC logging I'll show you how to collect the data you need to be able to monitor and tune your runtime's GC!

A number of tools exist to analyze verbose GC log files; my personal favorite is IBM PMAT. PMAT gives some pretty graphs showing GC pattern analysis, allocation failures and more. Feel free to read more about PMAT here. The problem with PMAT is the time it takes to collect all the log files and process them. Even on my loaded laptop with multiple cores and GigE connections it takes a few hours to get a report.

Our environment is pretty big: multiple cells, with hundreds of app servers, across multiple OS's and environments with high volumes. Our GC logs are a few hundred MB on average with the level of GC we output. We needed a way to monitor the GC logs in real time as well as the ability to view historical data for baselines.

Introscope EPAgent came to the rescue! The EPAgent gives us the ability to monitor log files and report the data back to the Enterprise Manager in real time. As GC data is written to the log files we can collect it using Introscope. This gives us the ability to pull up historical information to use as baseline data to improve upon. As applications evolve we can easily review the data as part of our performance tuning process and report on it! As with any Introscope metrics, we can also alert on them.
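The actual Perl script is in the download further down. To illustrate just the parsing idea, here is a minimal Python sketch that pulls numbers out of one verbose GC line and turns them into metric lines. Both the log line shape (modelled on IBM 1.4.x style output) and the `name=value` output format are assumptions for illustration; the pattern and the output should be adapted to your JVM's log and to whatever format your EPAgent plugin expects.

```python
import re

# Assumed shape of a verbose GC line; adjust the pattern to your JVM's output.
GC_LINE = re.compile(
    r"<GC\((?P<cycle>\d+)\): freed (?P<freed>\d+) bytes, "
    r"(?P<pct_free>\d+)% free \((?P<free>\d+)/(?P<total>\d+)\), "
    r"in (?P<ms>\d+) ms>"
)

def parse_gc_line(line):
    """Return the numeric fields of a GC line as ints, or None if it doesn't match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}

def to_metric_lines(stats, prefix="GC"):
    """Format parsed stats as hypothetical name=value metric lines."""
    return [
        f"{prefix}|Freed Bytes={stats['freed']}",
        f"{prefix}|Percent Free={stats['pct_free']}",
        f"{prefix}|Pause Time (ms)={stats['ms']}",
    ]

# Made-up sample line for illustration.
sample = "<GC(3): freed 1048576 bytes, 45% free (4718592/10485760), in 12 ms>"
for line in to_metric_lines(parse_gc_line(sample)):
    print(line)
```

Lines that do not match return `None`, so a caller tailing a large log can simply skip them.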

Here is a screenshot of the metrics this app server was collecting:

The setup is simple, and I've put together a zip file of the Perl script and EPAgent config file needed to get it done!

Download Here


Friday, January 18, 2008

Portal 5.1 NlsCannotInterpretStateException

Recently I started to poke around our log files when we started having performance issues shortly after our upgrade of portal. I started to see a number of unfamiliar errors:

2008.01.25 15:02:33.111 W doGet()
class Unspecified message ()

I traced it back to before the upgrade, which was good news in that the upgrade hadn't introduced it. But what is it?

Searching the IBM site didn't give me very much on it at all. Over the years I've learned not to ignore any errors from WAS or Portal until you can safely identify what they are! On some of our lower volume portal instances we were getting a few hundred of these a day.

Using Introscope I was able to see that one user navigating a synthetic transaction (SiteScope) was actually executing portlets multiple times, some of which were not even on the same page but on that user's homepage. Using the Introscope transaction tracer and error detector we were able to see this pretty clearly.

With some tracing we were able to see we were missing some images (arrow.gif and a few others). Each time portal attempted to pull one that didn't exist, it would throw back a home page rather than something light such as a 404. It seems the default behavior for portal if it cannot decode a URL is to throw the home page back. ICK!
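One way to spot this kind of problem from the outside is to request each image URL and flag any that come back as a successful HTML page instead of an image or a 404. The sketch below is a minimal illustration under that assumption; the function names and status labels are made up, and it is not how we found the issue (that was Introscope tracing).

```python
import urllib.error
import urllib.request

def classify(status, content_type, expected_prefix="image/"):
    """Label a response for an asset URL based on status and Content-Type."""
    if status == 404:
        return "missing (clean 404)"
    if status == 200 and not content_type.startswith(expected_prefix):
        return "suspect (page served in place of asset)"
    if status == 200:
        return "ok"
    return f"other ({status})"

def check_url(url):
    """Fetch a URL and classify the response; HTTP errors are classified too."""
    try:
        with urllib.request.urlopen(url) as resp:
            return classify(resp.status, resp.headers.get("Content-Type", ""))
    except urllib.error.HTTPError as err:
        content_type = err.headers.get("Content-Type", "") if err.headers else ""
        return classify(err.code, content_type)

# A 200 response with an HTML body for an image URL is the expensive case.
print(classify(200, "text/html; charset=UTF-8"))
print(classify(200, "image/gif"))
print(classify(404, "text/html"))
```

Running `check_url` over the image URLs referenced by a theme or page would list the ones that trigger a full page render.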

That means a few bad images could really degrade performance for your portal. This is a pretty tough lesson to learn about missing images, especially for high volume sites! Hopefully in future releases portal will understand lightweight error pages (maybe even custom ones) and throw back something like a 404!

I won't hold my breath though!


Friday, December 14, 2007

Introscope EJBs not being displayed

When we added Introscope on a number of WebSphere Application Servers we noticed some of the EJBs were not coming up. As we poked around we saw the EJBs in the PMI data from the app server, but not flagged inside Introscope itself.

After some research we found that Introscope identifies EJBs using the directives below:

IdentifyInheritedAs: javax.ejb.SessionBean SessionBeanTracing
IdentifyInheritedAs: javax.ejb.EntityBean EntityBeanTracing

The directives tell Introscope that any class that directly inherits from javax.ejb.SessionBean or javax.ejb.EntityBean is a session or entity EJB respectively. The limitation is that it must *directly* inherit.

For example, if class B inherits from javax.ejb.SessionBean, then Introscope will know that class B is a session bean. However, if class C inherits from class B, Introscope will not trace class C.
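The difference between a direct-only check and a multi-level check can be sketched with plain Python classes standing in for the Java types. This is only an analogy for the matching rule described above, not Introscope's implementation; the class and function names are illustrative.

```python
class SessionBean:       # stand-in for javax.ejb.SessionBean
    pass

class B(SessionBean):    # inherits directly: a direct-only check matches it
    pass

class C(B):              # inherits via B: a direct-only check misses it
    pass

def directly_inherits(cls, base):
    """True only when base is an immediate parent of cls."""
    return base in cls.__bases__

def inherits_at_any_level(cls, base):
    """True when base appears anywhere above cls in the hierarchy."""
    return cls is not base and issubclass(cls, base)

for cls in (B, C):
    print(cls.__name__,
          "direct:", directly_inherits(cls, SessionBean),
          "any level:", inherits_at_any_level(cls, SessionBean))
```

Class C only shows up under the multi-level check, which is the case our abstract superclasses fell into.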

This was a big hit for us, as we have a number of abstract superclasses for our EJBs, so the majority of our EJBs were not being tagged. Fortunately this was an easy fix once they were identified. We created a pbd which contains a number of directives of this form:

IdentifyInheritedAs: {abstract_superclass} SessionBeanTracing
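With more than a handful of abstract superclasses, the directive lines can be generated rather than typed. A minimal sketch, assuming you already have the list of superclass names; the `com.example` class names are hypothetical and the line format simply mirrors the directives shown earlier.

```python
def build_directives(superclasses, tracer_group="SessionBeanTracing"):
    """Return one IdentifyInheritedAs line per abstract superclass."""
    return [f"IdentifyInheritedAs: {name} {tracer_group}" for name in superclasses]

# Hypothetical superclass names, for illustration only.
abstract_superclasses = [
    "com.example.ejb.AbstractSessionBean",
    "com.example.ejb.AbstractFacadeBean",
]
print("\n".join(build_directives(abstract_superclasses)))
```

Entity bean superclasses would use `EntityBeanTracing` as the second argument.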

One cool thing when we move to WebSphere 6.1 that we will be able to take advantage of is "if you are using 1.5 JVM, Introscope now supports ProbeBuilding for Multi-Level Class Hierarchies. In pre-5.0 JVMs (we are using 1.4.2 JVM), Introscope does NOT instrument classes in the deeper levels of an class hierarchy—only the classes that explicitly extend a probed class. On JVM 5.0, you can configure Introscope to instrument multiple levels of subclasses of a probed class."


Sunday, November 04, 2007


I'm still on my Introscope kick, trying to get us up to speed as quickly as I can. As the lone soldier in this process it's a lot of work, between doing the grunt work and looking ahead to get the visibility we need. When we upgraded to 7.2.1, the workstation web start seemed to use a newer version of Java, which resulted in the following error: unknown protocol: socket

This hit both IE and Firefox users. Upgrading to the latest version of web start resolved the issue!


Friday, October 26, 2007

Wily EPAgent Stopping Unexpectedly

So I started playing with the Wily EPAgent. It's basically a way to run scripts on a remote server and report those metrics back into the Introscope EM product. It's incredibly useful when you want to correlate data. Unfortunately installing it wasn't as smooth as I had hoped!

For starters, I'm on Windows. Easy guys, I'm trying to get off it, really I am. I have to work with what I've got for now! One issue under Windows is that by default the EPAgent doesn't have a way to register as a service (it's a Java program).

Luckily someone on the community site had already provided a service wrapper, thanks! I installed the service wrapper and started playing with a few of my WMI scripts. I had never written VBS in my life other than in my VB 6 programming class in college, and I just found out why I never used it beyond that.

WMI is so complex. I needed to go to the MS site and do a lot of research on how to get accurate numbers from the WMI data. I've provided some of my scripts, but please don't blame me if they aren't ideal.
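One common source of odd-looking WMI numbers, assuming the scripts read raw performance counter classes rather than pre-formatted ones, is that a raw "per second" counter is really a running total: the rate has to be computed from two samples and their timestamps. The sketch below shows that arithmetic only; the function name and sample values are made up, and no WMI calls are made.

```python
def per_second_rate(count0, count1, ticks0, ticks1, ticks_per_second):
    """Rate between two raw samples: delta count divided by elapsed seconds."""
    elapsed_seconds = (ticks1 - ticks0) / ticks_per_second
    if elapsed_seconds <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (count1 - count0) / elapsed_seconds

# Made-up samples: 12,500 context switches over 5 seconds of timer ticks.
rate = per_second_rate(
    count0=1_000_000, count1=1_012_500,
    ticks0=0, ticks1=50_000_000,
    ticks_per_second=10_000_000,
)
print(rate)
```

In a real script the counts, tick values and tick frequency would come from two successive reads of the same counter instance.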

I'm a geek. I wanted things like context switches per second, processor queue length, network information, and more... Stuff I could have easily gotten with an awk command took days of figuring out via the WMI interfaces. Anyway, check out the scripts here:

System Information -
Physical Disk Information -

So I attempted to start the service and it came up fine, but shortly afterwards I would get a "user logged off" message and then it would shut down. This was annoying! It turns out Windows sends a signal when a user logs off and Java acts on it. You can read more on this at the Sun site.

The simple solution to this problem was to tell Java to ignore the signal with the -Xrs option. I just added it to the wrapper conf file and was good to go.

# Java Additional Parameters
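Assuming the wrapper in question is the Java Service Wrapper, whose wrapper.conf uses numbered `wrapper.java.additional.N` properties under that comment, the entry might look like the line below. The index 1 is an assumption; use the next free number in your own file.

```
wrapper.java.additional.1=-Xrs
```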

Hope this helps.

