Functions in the WHERE clause...
Business logic. Sometimes it lives in the application, sometimes it lives in the database. I'll save the application/db argument for each individual case that I encounter. But if you do have significant business logic in your database, there will probably come a time when someone wants to incorporate some of that logic into a query they are writing.
I am often asked to improve the speed of queries that have User Defined Functions (UDF) in the where clause. Sometimes I am successful at tuning them, sometimes a major structural or logic change is required to get good performance.
Because this comes up so often I wanted to get my analysis points on paper (or pixels as the case may be).
Keep this in mind when users complain about performance of these constructs or how different they perform with more data: The Database Is Doing Exactly What You Asked It To Do.
In some cases this means executing and evaluating the result of that function every time a row is visited in the WHERE clause. Does your query visit a million more rows than it needs to in order to generate the result? Guess what, you are running that UDF a million extra times and then throwing away the result.
On to the analysis points...
1) Is the performance of the UDF tuned?
If you are dealing with a call to a PL/SQL function you can use the PL/SQL Profiler to determine the number of times the function is executed and how long each line of the function takes to execute. If you see that a part of the function is inefficient, you may be able to address this and get the original query performing fine.
This is often the case when the original query goes to Production for the first time. The data volume in Dev/Test didn't cause the user to wait, but the data volume combined with the complexity of the PL/SQL code can be an issue in production.
Locate the PL/SQL that runs the longest and determine if it can be tuned to perform better. If the PL/SQL calls further queries this may simply be another SQL tuning exercise that can provide better performance with indexes, refining the logic, or refactoring the SQL.
More information about the PL/SQL Profiler is available here: Profiling PL/SQL Code - How to find out where your PL/SQL is spending its time
.
If you are calling a Java function I don't have much advice except isolate what is taking longest and then work to reduce the time to a reasonable level.
2) Is the UDF in a non-correlated subquery?
If the function is buried in a non correlated subquery it may be executed more times than necessary. In some cases the subquery is fired each time a row is evaluated in the WHERE clause.
Consider pulling the subquery out using the SQL-99 WITH clause and use the MATERIALIZE hint. This allows to execute only once and have its results saved, reducing the number of times the function is executed. This saved result is referenced while the query completes instead of executing the subquery fresh each time.
3) Reduce the amount of times the UDF is run
Make sure WHERE clause is as restrictive as possible on table that the function references. This will ensure othe function runs the minimum amount of times necessary to satisfy the query.
This may involve adding seemingly redundant criteria to the WHERE clause, but if the number of rows is reduced significantly before the UDF has to fire, much time may be saved.
4) Run the UDF before the query and store results.
Use a materialized view or trigger to record the output of the function in a table so it can be referenced by queries. If the data is pre computed and lives in a table, a table join can be used to access the data instead of a function in the WHERE clause.
Materialized Views have their own set of implementation decisions and issues, but if you can leverage them to speed up your query you may avoid waiting for the UDF to run during the query.
-Dave
Add new comment