Database Troubleshooting - Automation and Troubleshooting

So far, scripts have been provided to help avoid issues with the database. Even the brief discussion of Autonomous Database covered self-repair, because some tasks can be run automatically to fix problems and keep the database up and available.

But some things still need to be investigated, such as connectivity issues, performance problems, or ways to load data faster.

The overall health of the database environment needs to be monitored so that it can be assessed quickly when something is not quite right or errors are being thrown.

We have already described many of the tools used for troubleshooting: querying the data dictionary views and verifying that the monitoring and maintenance scripts are working.

Error messages appear in the database logs, but they are also captured in the output of the regular scripts run against the database, which helps with the investigation.
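As an illustration of pulling error messages out of log or script output, here is a minimal Python sketch. It assumes the text has already been read into a string and that errors follow the usual Oracle pattern of `ORA-` plus five digits. The function name `summarize_ora_errors` and the sample text are hypothetical, not part of any Oracle tooling.

```python
import re
from collections import Counter

# Oracle error codes look like ORA-00600 or ORA-01555: the prefix plus five digits.
ORA_ERROR = re.compile(r"\bORA-\d{5}\b")


def summarize_ora_errors(log_text):
    """Count each ORA- error code found in the given log or script output."""
    return Counter(ORA_ERROR.findall(log_text))


if __name__ == "__main__":
    # Hypothetical sample text standing in for alert log or job output.
    sample = (
        "ORA-00600: internal error code\n"
        "Job completed with warnings\n"
        "ORA-01555: snapshot too old\n"
        "ORA-00600: internal error code\n"
    )
    # Print the most frequent error codes first.
    for code, count in summarize_ora_errors(sample).most_common():
        print(code, count)
```

A summary like this could be run over the output of the regular jobs so that recurring errors stand out before someone has to read the full log.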

Quickly Triaging

When you get the call that there is an issue with the database, it is critical to know the right questions to ask.

Understanding the issue is the first step and must be done quickly to get to the other troubleshooting steps.

DBAs are called upon to troubleshoot both database and nondatabase issues, such as server, connection, and network problems, because these are all part of the database system.

Or maybe the data is not being returned quickly enough.

Here are a handful of questions that are useful to understand the issue:

•     Is this in the application or with a direct connection to the database?

•     Is this a new process, query, or application code?

•     How long has this been slow? Has it happened before, or is this the first time?

•     Is anything being returned, such as an error message, or is the session hanging?

•     Do you have the exact text of any error messages that you are receiving?

As the answers come in, you can check the alert logs, review the script output from the regular jobs, ping the database and the database server, and see whether you are able to log in.

This should give you a good start for troubleshooting the issue.
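One of these first checks, whether the database server is reachable at all, can be sketched in Python. This is only a minimal example under stated assumptions: it tests whether a TCP connection to the listener port can be opened, which does not prove that a login will succeed. The function name `can_connect` is hypothetical, and the port 1521 is the default Oracle listener port, which may differ in your environment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and attempts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumed values for illustration: replace with your own host and port.
    host, port = "localhost", 1521
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"Listener port {port} on {host} is {status}")
```

If the port is not reachable, the problem is likely at the server, network, or listener level rather than inside the database, which narrows down where to look next.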

Because you have been automating jobs that perform tasks such as verifying database availability, you should already know about some issues.

Automated jobs help you proactively handle issues so that they do not turn into database downtime.

