15 Controlling the Use of Optimizer Statistics
Using DBMS_STATS
, you can specify when and how the optimizer uses statistics.
15.1 Locking and Unlocking Optimizer Statistics
You can lock statistics to prevent them from changing.
After statistics are locked, you cannot make modifications to the statistics until the statistics have been unlocked. Locking procedures are useful in a static environment when you want to guarantee that the statistics and resulting plan never change. For example, you may want to prevent new statistics from being gathered on a table or schema by the DBMS_STATS_JOB
process, such as highly volatile tables.
When you lock statistics on a table, all dependent statistics are locked. The locked statistics include table statistics, column statistics, histograms, and dependent index statistics. To overwrite statistics even when they are locked, you can set the value of the FORCE
argument in various DBMS_STATS
procedures, for example, DELETE_*_STATS
and RESTORE_*_STATS
, to true
.
15.1.1 Locking Statistics
The DBMS_STATS
package provides two procedures for locking statistics: LOCK_SCHEMA_STATS
and LOCK_TABLE_STATS
.
Assumptions
This tutorial assumes the following:
-
You gathered statistics on the
oe.orders
table and on thehr
schema. -
You want to prevent the
oe.orders
table statistics andhr
schema statistics from changing.
To lock statistics:
-
Start SQL*Plus and connect to the database as the
oe
user. -
Lock the statistics on
oe.orders
.For example, execute the following PL/SQL program:
BEGIN DBMS_STATS.LOCK_TABLE_STATS('OE','ORDERS'); END; /
-
Connect to the database as the
hr
user. -
Lock the statistics in the
hr
schema.For example, execute the following PL/SQL program:
BEGIN DBMS_STATS.LOCK_SCHEMA_STATS('HR'); END; /
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn about the DBMS_STATS.LOCK_TABLE_STATS
procedure
15.1.2 Unlocking Statistics
The DBMS_STATS
package provides two procedures for unlocking statistics: UNLOCK_SCHEMA_STATS
and UNLOCK_TABLE_STATS
.
Assumptions
This tutorial assumes the following:
-
You locked statistics on the
oe.orders
table and on thehr
schema. -
You want to unlock these statistics.
To unlock statistics:
-
Start SQL*Plus and connect to the database as the
oe
user. -
Unlock the statistics on
oe.orders
.For example, execute the following PL/SQL program:
BEGIN DBMS_STATS.UNLOCK_TABLE_STATS('OE','ORDERS'); END; /
-
Connect to the database as the
hr
user. -
Unlock the statistics in the
hr
schema.For example, execute the following PL/SQL program:
BEGIN DBMS_STATS.UNLOCK_SCHEMA_STATS('HR'); END; /
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn about the DBMS_STATS.UNLOCK_TABLE_STATS
procedure
15.2 Publishing Pending Optimizer Statistics
By default, the database automatically publishes statistics when the statistics collection ends.
Alternatively, you can use pending statistics to save the statistics and not publish them immediately after the collection. This technique is useful for testing queries in a session with pending statistics. When the test results are satisfactory, you can publish the statistics to make them available for the entire database.
15.2.1 About Pending Optimizer Statistics
The database stores pending statistics in the data dictionary just as for published statistics.
By default, the optimizer uses published statistics. You can change the default behavior by setting the OPTIMIZER_USE_PENDING_STATISTICS
initialization parameter to true
(the default is false
).
The top part of the following graphic shows the optimizer gathering statistics for the sh.customers
table and storing them in the data dictionary with pending status. The bottom part of the diagram shows the optimizer using only published statistics to process a query of sh.customers
.
Figure 15-1 Published and Pending Statistics
Description of "Figure 15-1 Published and Pending Statistics"
In some cases, the optimizer can use a combination of published and pending statistics. For example, the database stores both published and pending statistics for the customers
table. For the orders
table, the database stores only published statistics. If OPTIMIZER_USE_PENDING_STATS = true
, then the optimizer uses pending statistics for customers
and published statistics for orders
. If OPTIMIZER_USE_PENDING_STATS = false
, then the optimizer uses published statistics for customers
and orders
.
See Also:
Oracle Database Reference to learn about the OPTIMIZER_USE_PENDING_STATISTICS
initialization parameter
15.2.2 User Interfaces for Publishing Optimizer Statistics
You can use the DBMS_STATS
package to perform operations relating to publishing statistics.
The following table lists the relevant program units.
Table 15-1 DBMS_STATS Program Units Relevant for Publishing Optimizer Statistics
Program Unit | Description |
---|---|
|
Check whether the statistics are automatically published as soon as |
|
Set the |
|
Set the |
|
Publish valid pending statistics for all objects or only specified objects. |
|
Delete pending statistics. |
|
Export pending statistics. |
The initialization parameter OPTIMIZER_USE_PENDING_STATISTICS
determines whether the database uses pending statistics when they are available. The default value is false
, which means that the optimizer uses only published statistics. Set to true
to specify that the optimizer uses any existing pending statistics instead. The best practice is to set this parameter at the session level rather than at the database level.
You can use access information about published statistics from data dictionary views. Table 15-2 lists relevant views.
Table 15-2 Views Relevant for Publishing Optimizer Statistics
View | Description |
---|---|
|
Displays optimizer statistics for the tables accessible to the current user. |
|
Displays column statistics and histogram information extracted from |
|
Displays column statistics and histogram information for the table partitions owned by the current user. |
|
Describes column statistics and histogram information for subpartitions of partitioned objects owned by the current user. |
|
Displays optimizer statistics for the indexes accessible to the current user. |
|
Describes pending statistics for tables, partitions, and subpartitions accessible to the current user. |
|
Describes the pending statistics of the columns accessible to the current user. |
|
Describes the pending statistics for tables, partitions, and subpartitions accessible to the current user collected using the |
See Also:
-
Oracle Database PL/SQL Packages and Types Reference to learn about the
DBMS_STATS
package -
Oracle Database Reference to learn about
USER_TAB_PENDING_STATS
and related views
15.2.3 Managing Published and Pending Statistics
This section explains how to use DBMS_STATS
program units to change the publishing behavior of optimizer statistics, and also to export and delete these statistics.
Assumptions
This tutorial assumes the following:
-
You want to change the preferences for the
sh.customers
andsh.sales
tables so that newly collected statistics have pending status. -
You want the current session to use pending statistics.
-
You want to gather and publish pending statistics on the
sh.customers
table. -
You gather the pending statistics on the
sh.sales
table, but decide to delete them without publishing them. -
You want to change the preferences for the
sh.customers
andsh.sales
tables so that newly collected statistics are published.
To manage published and pending statistics:
-
Start SQL*Plus and connect to the database as user
sh
. -
Query the global optimizer statistics publishing setting.
Run the following query (sample output included):
sh@PROD> SELECT DBMS_STATS.GET_PREFS('PUBLISH') PUBLISH FROM DUAL; PUBLISH ------- TRUE
The value
true
indicates that the database publishes statistics as it gathers them. Every table uses this value unless a specific table preference has been set.When using
GET_PREFS
, you can also specify a schema and table name. The function returns a table preference if it is set. Otherwise, the function returns the global preference. -
Query the pending statistics.
For example, run the following query (sample output included):
sh@PROD> SELECT * FROM USER_TAB_PENDING_STATS; no rows selected
This example shows that the database currently stores no pending statistics for the
sh
schema. -
Change the publishing preferences for the
sh.customers
table.For example, execute the following procedure so that statistics are marked as pending:
BEGIN DBMS_STATS.SET_TABLE_PREFS('sh', 'customers', 'publish', 'false'); END; /
Subsequently, when you gather statistics on the
customers
table, the database does not automatically publish statistics when the gather job completes. Instead, the database stores the newly gathered statistics in theUSER_TAB_PENDING_STATS
table. -
Gather statistics for
sh.customers
.For example, run the following program:
BEGIN DBMS_STATS.GATHER_TABLE_STATS('sh','customers'); END; /
-
Query the pending statistics.
For example, run the following query (sample output included):
sh@PROD> SELECT TABLE_NAME, NUM_ROWS FROM USER_TAB_PENDING_STATS; TABLE_NAME NUM_ROWS ------------------------------ ---------- CUSTOMERS 55500
This example shows that the database now stores pending statistics for the
sh.customers
table. -
Instruct the optimizer to use the pending statistics in this session.
Set the initialization parameter
OPTIMIZER_USE_PENDING_STATISTICS
totrue
as shown:ALTER SESSION SET OPTIMIZER_USE_PENDING_STATISTICS = true;
-
Run a workload.
The following example changes the email addresses of all customers named Bruce Chalmers:
UPDATE sh.customers SET cust_email='ChalmersB@company.example.icom' WHERE cust_first_name = 'Bruce' AND cust_last_name = 'Chalmers'; COMMIT;
The optimizer uses the pending statistics instead of the published statistics when compiling all SQL statements in this session.
-
Publish the pending statistics for
sh.customers
.For example, execute the following program:
BEGIN DBMS_STATS.PUBLISH_PENDING_STATS('SH','CUSTOMERS'); END; /
-
Change the publishing preferences for the
sh.sales
table.For example, execute the following program:
BEGIN DBMS_STATS.SET_TABLE_PREFS('sh', 'sales', 'publish', 'false'); END; /
Subsequently, when you gather statistics on the
sh.sales
table, the database does not automatically publish statistics when the gather job completes. Instead, the database stores the statistics in theUSER_TAB_PENDING_STATS
table. -
Gather statistics for
sh.sales
.For example, run the following program:
BEGIN DBMS_STATS.GATHER_TABLE_STATS('sh','sales'); END; /
-
Delete the pending statistics for
sh.sales
.Assume you change your mind and now want to delete pending statistics for
sh.sales
. Run the following program:BEGIN DBMS_STATS.DELETE_PENDING_STATS('sh','sales'); END; /
-
Change the publishing preferences for the
sh.customers
andsh.sales
tables back to their default setting.For example, execute the following program:
BEGIN DBMS_STATS.SET_TABLE_PREFS('sh', 'customers', 'publish', null); DBMS_STATS.SET_TABLE_PREFS('sh', 'sales', 'publish', null); END; /
15.3 Creating Artificial Optimizer Statistics for Testing
To provide the optimizer with user-created statistics for testing purposes, you can use the DBMS_STATS.SET_*_STATS
procedures. These procedures provide the optimizer with artificial values for the specified statistics.
15.3.1 About Artificial Optimizer Statistics
For testing purposes, you can manually create artificial statistics for a table, index, or the system using the DBMS_STATS.SET_*_STATS
procedures.
When stattab
is null, the DBMS_STATS.SET_*_STATS
procedures insert the artificial statistics into the data dictionary directly. Alternatively, you can specify a user-created table.
Caution:
TheDBMS_STATS.SET_*_STATS
procedures are intended for development testing only. Do not use them in a production database. If you set statistics in the data dictionary, then Oracle Database considers the set statistics as the “real” statistics, which means that statistics gathering jobs may not re-gather artificial statistics when they do not meet the criteria for staleness.
Typical use cases for the DBMS_STATS.SET_*_STATS
procedures are:
-
Showing how execution plans change as the numbers of rows or blocks in a table change
For example,
SET_TABLE_STATS
can set number of rows and blocks in a small or empty table to a large number. When you execute a query using the altered statistics, the optimizer may change the execution plan. For example, the increased row count may lead the optimizer to choose an index scan rather than a full table scan. By experimenting with different values, you can see how the optimizer will change its execution plan over time. -
Creating realistic statistics for temporary tables
You may want to see what the optimizer does when a large temporary table is referenced in multiple SQL statements. You can create a regular table, load representative data, and then use
GET_TABLE_STATS
to retrieve the statistics. After you create the temporary table, you can “deceive” the optimizer into using these statistics by invokingSET_TABLE_STATS
.
Optionally, you can specify a unique ID for statistics in a user-created table. The SET_*_STATS
procedures have corresponding GET_*_STATS
procedures.
Table 15-3 DBMS_STATS Procedures for Setting Optimizer Statistics
DBMS_STATS Procedure | Description |
---|---|
SET_TABLE_STATS |
Sets table or partition statistics using parameters such as numrows , numblks , and avgrlen .
If the database uses the In-Memory Column store, you can set The optimizer uses the cached data to estimate the number of cached blocks for index or statistics table access. The total cost is the I/O cost of reading data blocks from disk, the CPU cost of reading cached blocks from the buffer cache, and the CPU cost of processing the data. |
SET_COLUMN_STATS |
Sets column statistics using parameters such as distcnt , density , nullcnt , and so on.
In the version of this procedure that deals with user-defined statistics, use |
SET_SYSTEM_STATS |
Sets system statistics using parameters such as iotfrspeed , sreadtim , and cpuspeed .
|
SET_INDEX_STATS |
Sets index statistics using parameters such as numrows , numlblks , avglblk , clstfct , and indlevel .
In the version of this procedure that deals with user-defined statistics, use |
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn more about DBMS_STATS.SET_TABLE_STATS
and the other procedures for setting optimizer statistics
15.3.2 Setting Artificial Optimizer Statistics for a Table
This topic explains how to set artificial statistics for a table using DBMS_STATS.SET_TABLE_STATS
. The basic steps are the same for SET_INDEX_STATS
and SET_SYSTEM_STATS
.
-
For an object not owned by
SYS
, you must be the owner of the object, or have theANALYZE ANY
privilege. -
For an object owned by
SYS
, you must have theANALYZE ANY DICTIONARY
privilege or theSYSDBA
privilege. -
When invoking
GET_*_STATS
for a table, column, or index, the referenced object must exist.
-
You have the required privileges to use
DBMS_STATS.SET_TABLE_STATS
for the specified table. -
You intend to store the statistics in the data dictionary.
15.3.3 Setting Optimizer Statistics: Example
This example shows how to gather optimizer statistics for a table, set artificial statistics, and then compare the plans that the optimizer chooses based on the differing statistics.
-
You are logged in to the database as a user with DBA privileges.
-
You want to test when the optimizer chooses an index scan.
-
Create a table called
contractors
, and index thesalary
column.CREATE TABLE contractors ( con_id NUMBER, last_name VARCHAR2(50), salary NUMBER, CONSTRAINT cond_id_pk PRIMARY KEY(con_id) ); CREATE INDEX salary_ix ON contractors(salary);
-
Insert a single row into this table.
INSERT INTO contractors VALUES (8, 'JONES',1000); COMMIT;
-
Gather statistics for the table.
EXECUTE DBMS_STATS.GATHER_TABLE_STATS( user, tabname => 'CONTRACTORS' );
-
Query the number of rows for the table and index (sample output included):
SQL> SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = 'CONTRACTORS'; NUM_ROWS ---------- 1 SQL> SELECT NUM_ROWS FROM USER_INDEXES WHERE INDEX_NAME = 'SALARY_IX'; NUM_ROWS ---------- 1
-
Query contractors whose salary is 1000, using the
dynamic_sampling
hint to disable dynamic sampling:SELECT /*+ dynamic_sampling(contractors 0) */ * FROM contractors WHERE salary = 1000;
-
Query the execution plan chosen by the optimizer (sample output included):
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR); SQL_ID cy0wzytc16g9n, child number 0 ------------------------------------- SELECT /*+ dynamic_sampling(contractors 0) */ * FROM contractors WHERE salary = 1000 Plan hash value: 5038823 ---------------------------------------------------------------------- | Id | Operation | Name |Rows|Bytes|Cost (%CPU)| Time| ---------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 2 (100)| | |* 1 | TABLE ACCESS FULL| CONTRACTORS | 1 | 12 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("SALARY"=1000) 19 rows selected.
Because only 1 row exists in the table, the optimizer chooses a full table scan over an index range scan.
-
Use
SET_TABLE_STATS
andSET_INDEX_STATS
to simulate statistics for a table with 2000 rows stored in 10 data blocks:BEGIN DBMS_STATS.SET_TABLE_STATS( ownname => user , tabname => 'CONTRACTORS' , numrows => 2000 , numblks => 10 ); END; / BEGIN DBMS_STATS.SET_INDEX_STATS( ownname => user , indname => 'SALARY_IX' , numrows => 2000 ); END; /
-
Query the number of rows for the table and index (sample output included):
SQL> SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = 'CONTRACTORS'; NUM_ROWS ---------- 2000 SQL> SELECT NUM_ROWS FROM USER_INDEXES WHERE INDEX_NAME = 'SALARY_IX'; NUM_ROWS ---------- 2000
Now the optimizer believes that the table contains 2000 rows in 10 blocks, even though only 1 row actually exists in one block.
-
Flush the shared pool to eliminate possibility of plan reuse, and then execute the same query of
contractors
:ALTER SYSTEM FLUSH SHARED_POOL; SELECT /*+ dynamic_sampling(contractors 0) */ * FROM contractors WHERE salary = 1000;
-
Query the execution plan chosen by the optimizer based on the artificial statistics (sample output included):
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR); SQL_ID cy0wzytc16g9n, child number 0 ------------------------------------- SELECT /*+ dynamic_sampling(contractors 0) */ * FROM contractors WHERE salary = 1000 Plan hash value: 996794789 --------------------------------------------------------------------------------- |Id| Operation | Name |Rows|Bytes|Cost(%CPU)|Time| --------------------------------------------------------------------------------- | 0| SELECT STATEMENT | | | |3(100)| | | 1| TABLE ACCESS BY INDEX ROWID BATCHED|CONTRACTORS|2000|24000|3 (34)|00:00:01| |*2| INDEX RANGE SCAN |SALARY_IX |2000| |1 (0)|00:00:01| --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("SALARY"=1000) 20 rows selected.
Based on the artificially generated statistics for the number of rows and block distribution, the optimizer considers an index range scan more cost-effective.