Thursday, September 16, 2010

Oracle redo log slows down application

Symptom:
A J2EE application on a Tomcat server suddenly shows high latency and very slow server-side processing, and all 150 threads are in use. A regular request takes more than 30 seconds, while it normally takes around 100 ms.

Production Info:
  1. Software: Oracle 10g RAC, Tomcat 6.0, JDK 1.6, CentOS 4.4
  2. No production outage or HA failover/failback
  3. No stress test or peak load
Root Cause:
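The NFS mount point used for archived logs hung, which slowed the archiving of the redo logs to that mount point. Oracle cannot reuse an online redo log until it has been archived, so when the redo logs were not archived fast enough, database sessions had to wait, and that wait caused the application latency.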
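
A hung NFS mount is easy to miss because any process touching it simply blocks. Below is a minimal sketch of a probe that reports whether a path answers within a time limit. It is an illustration, not what was used in this incident: the function name `mount_responds`, the 5-second default, and the example path `/mnt/archivelog` are all assumptions.

```python
import os
import threading


def mount_responds(path, timeout_s=5.0):
    """Return True if os.stat(path) succeeds within timeout_s seconds.

    The stat call runs in a daemon thread because a call against a hung
    NFS mount can block indefinitely; the caller only waits timeout_s.
    """
    result = {"ok": False}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            # Path missing or otherwise inaccessible.
            result["ok"] = False

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    # Still alive after the timeout means the stat call is blocked.
    return (not worker.is_alive()) and result["ok"]
```

Calling `mount_responds("/mnt/archivelog")` (a hypothetical archive destination) from a cron job or monitoring check and alerting on `False` would flag this kind of hang before the redo logs back up.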
