Saturday, November 20, 2004

Analyze table revisited / Gather Statistics

DBMS_STATS Package
Description
The PL/SQL package DBMS_STATS lets you generate and manage statistics for cost-based optimisation. You can use this package to gather, modify, view, export, import, and delete statistics.
The DBMS_STATS package can gather statistics on indexes, tables, columns, and partitions, as well as statistics on all objects in a schema or database. The statistics-gathering operations can run either serially or in parallel (at the DATABASE, SCHEMA, and TABLE levels only).
Procedure Name           Description
-----------------------  -------------------------------------------------
GATHER_TABLE_STATS       Collects table, column, and index statistics.
GATHER_INDEX_STATS       Collects index statistics.
GATHER_SCHEMA_STATS      Collects statistics for all objects in a schema.
GATHER_DATABASE_STATS    Collects statistics for all objects in a database.
GATHER_SYSTEM_STATS      Collects CPU and I/O statistics for the system.
Prior to 8i, you would have used the ANALYZE ... commands for this. From 8i onwards, however, using ANALYZE for this purpose is not recommended because of various restrictions; for example:
1. ANALYZE always runs serially.
2. ANALYZE calculates global statistics for partitioned tables and indexes instead of gathering them directly. This can lead to inaccuracies for some statistics, such as the number of distinct values.
3. ANALYZE cannot overwrite or delete some of the values of statistics that were gathered by DBMS_STATS.
4. Most importantly, in the future, ANALYZE will not collect statistics needed by the cost-based optimiser.


ANALYZE can gather additional information that is not used by the optimiser, such as information about chained rows and the structural integrity of indexes, tables, and clusters. DBMS_STATS does not gather this information.

SQL Source

set echo on
set feed on
set timing on

-- gather statistics for a single table (and its indexes, via cascade)
execute dbms_stats.gather_table_stats (ownname => 'SCOTT' -
, tabname => 'DEPT' -
, partname => null -
, estimate_percent => 20 -
, degree => 5 -
, cascade => true);

-- gather statistics for every object in a schema
execute dbms_stats.gather_schema_stats (ownname => 'SCOTT' -
, estimate_percent => 20 -
, degree => 5 -
, cascade => true);

-- gather statistics for the whole database
execute dbms_stats.gather_database_stats (estimate_percent => 20 -
, degree => 5 -
, cascade => true);
SQL Source - Dynamic Method

DECLARE
  sql_stmt VARCHAR2(1024);
BEGIN
  -- loop over every table owned by the schema passed in as &1
  FOR tab_rec IN (SELECT owner, table_name
                  FROM all_tables
                  WHERE owner LIKE UPPER('&1'))
  LOOP
    sql_stmt := 'BEGIN dbms_stats.gather_table_stats (ownname => :1, '
             || 'tabname => :2, partname => null, estimate_percent => 20, '
             || 'degree => 5, cascade => true); END;';
    EXECUTE IMMEDIATE sql_stmt USING tab_rec.owner, tab_rec.table_name;
  END LOOP;
END;
/
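
DBMS_STATS can also back up and restore statistics, which is useful insurance before re-analysing a production schema. A minimal sketch (the statistics table name STATS_BACKUP and the statid label are illustrative):

-- create a holding table for the saved statistics
execute dbms_stats.create_stat_table (ownname => 'SCOTT', stattab => 'STATS_BACKUP');

-- copy the current schema statistics into it before re-gathering
execute dbms_stats.export_schema_stats (ownname => 'SCOTT' -
, stattab => 'STATS_BACKUP' -
, statid => 'BEFORE_REGATHER');

-- ... re-gather, test the application ...

-- put the old statistics back if plans degrade
execute dbms_stats.import_schema_stats (ownname => 'SCOTT' -
, stattab => 'STATS_BACKUP' -
, statid => 'BEFORE_REGATHER');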

DBMS_STATS in 9i

http://www.oracledba.co.uk/tips/9i_dbms_stats.htm

Oracle Optimizer: Moving to and working with CBO and other articles from Amar Kumar Padhi

http://www.databasejournal.com/article.php/1558281

http://www.databasejournal.com/features/oracle/article.php/10893_3098241_1

Monday, November 15, 2004

SQL Tuning Troubleshooting Suggestions

This document contains a number of potentially useful pointers for tuning an individual SQL statement. SQL tuning is a vast topic, and this is just a drop in the ocean.


Contents: Possible Causes of Poor SQL Performance
=================================================

1. Poorly tuned SQL
2. Poor disk performance/disk contention
3. Unnecessary sorting
4. Late row elimination
5. Over parsing
6. Missing indexes/use of 'wrong' indexes
7. Wrong plan or join order selected
8. Import estimating statistics on tables
9. Insufficiently high sample rate for CBO
10. Skewed data
11. New features forcing use of CBO
12. ITL contention


Diagnostics/Remedies
====================

1. Poorly tuned SQL

Often, part of the problem is finding the SQL that is causing the problems.
If you are seeing problems on a system, it is usually a good idea to start
by eliminating database setup issues by using the statspack (or older
UTLBSTAT & UTLESTAT) reports. See:

[NOTE:61998.1] Introduction to Tuning
[NOTE:94224.1] FAQ- STATSPACK COMPLETE REFERENCE
Tuning using BSTAT/ESTAT

for much more on this.

Once the database has been tuned to a reasonable level, the most
resource-hungry selects can be determined as follows
(a very similar report can be found in the Enterprise Manager Tuning Pack):

SELECT address, SUBSTR(sql_text,1,20) Text, buffer_gets, executions,
buffer_gets/executions AVG
FROM v$sqlarea
WHERE executions > 0
AND buffer_gets > 100000
ORDER BY 5;

Remember that the 'buffer_gets' value of > 100000 needs to be varied for the
individual system being tuned. On some systems no queries will read more than
100000 buffers, while on others most of them will. This value allows you to
control how many rows you see returned from the select.

The ADDRESS value retrieved above can then be used to lookup the whole
statement in the v$sqltext view:

SELECT sql_text FROM v$sqltext WHERE address = '...' ORDER BY piece;

Once the whole statement has been identified it can be tuned to reduce
resource usage.

If the problem relates to CPU-bound applications then CPU information
for each session can be examined to determine the culprits. The v$sesstat
view can be queried to find the sessions using the most CPU, and their SQL
can then be listed as before. Steps:

1. Verify the reference number for the 'CPU used by this session'
statistic:

SELECT name ,statistic#
FROM v$statname
WHERE name LIKE '%CPU%session';

NAME STATISTIC#
----------------------------------- ----------
CPU used by this session 12

2. Then determine which session is using most of the cpu:

SELECT * FROM v$sesstat WHERE statistic# = 12;

SID STATISTIC# VALUE
---------- ---------- ----------
1 12 0
2 12 0
3 12 0
4 12 0
5 12 0
6 12 0
7 12 0
8 12 0
9 12 0
10 12 0
11 12 0
12 12 0
16 12 1930

3. Lookup details for this session:

SELECT address ,SUBSTR(sql_text,1,20) Text, buffer_gets, executions,
buffer_gets/executions AVG
FROM v$sqlarea a, v$session s
WHERE sid = 16
AND s.sql_address = a.address
AND executions > 0
ORDER BY 5;

4. Use v$sqltext to extract the whole SQL text.
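
For example, the full text for session 16 can be pulled back directly by
joining v$session to v$sqltext:

SELECT t.sql_text
FROM v$sqltext t, v$session s
WHERE s.sid = 16
AND s.sql_address = t.address
ORDER BY t.piece;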

5. Explain the queries and examine their access paths. Autotrace is
a useful tool for examining access paths. See [NOTE:43214.1]


2. Poor disk performance/disk contention

Use of statspack (or BSTAT/ESTAT) and/or operating system i/o reports can
help in this area. Remember that you may be able to capture the activity of
a single statement by taking report snapshots immediately before and after
running the statement, with no other activity on the system.

Another good way of monitoring IO is to run a 10046 Level 8 trace to
capture all the waits for a particular session. 10046 can be turned on at
the session level using:

alter session set events '10046 trace name context forever, level 8';

Excessive i/o can be found by examining the resultant trace file and
looking for i/o related waits such as:

'db file sequential read' (Single-Block i/o - Index, Rollback Segment or Sort)
'db file scattered read' (Multi-Block i/o - Full table Scan).

Remember to set TIMED_STATISTICS = TRUE to capture timing information
otherwise comparisons will be meaningless. See:

[NOTE:39817.1] SQL_TRACE interpretation

If you are also interested in viewing bind variable values then a level 12
trace can be used.
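
Putting it together, a complete trace of one problem statement might look
like the sketch below (the trace file ends up in the directory pointed to
by the user_dump_dest parameter):

alter session set timed_statistics = true;
alter session set events '10046 trace name context forever, level 12';

-- run the statement(s) under investigation here

alter session set events '10046 trace name context off';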


3. Unnecessary sorting

The first question to ask is 'Does the data REALLY need to be sorted?'
If sorting does need to be done then try to allocate enough memory to
prevent the sorts from spilling to disk and causing i/o problems.

Sorting is a very expensive operation:

- High CPU usage
- Potentially large disk usage

Try to make the query sort the data as late in the access path as possible.
The idea behind this is to make sure that the smallest number of rows
possible are sorted.

Remember that:

- Indexes may be used to provide presorted data.

- Sort merge joins inherently need to do a sort.

- Some operations reported as sorts don't actually need a sort to be
performed (for example, when the rows already arrive in order). In this
case the explain plan should show NOSORT for the operation.

In summary:

- Increase sort area size to promote in-memory sorts.

- Modify the query to process fewer rows -> less to sort.

- Use an index to retrieve the rows in order and avoid the sort.

- Use SORT_DIRECT_WRITES to avoid flooding the buffer cache with sort
blocks.

- In Pro*C, use RELEASE_CURSOR=YES, as this frees up any temporary
segments being held open.
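
For example, a session about to do a large one-off sort can be given more
sort memory for its duration, and an ORDER BY on a leading index column can
sometimes be satisfied by the index with no sort at all. A sketch (the 10MB
figure is illustrative):

-- more sort memory for this session only
alter session set sort_area_size = 10485760;

-- if EMP has an index on EMPNO, the optimiser can walk the index
-- in order and the plan should show no SORT ORDER BY step
SELECT empno, ename
FROM emp
WHERE empno > 7500
ORDER BY empno;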



4. Late row elimination

Queries are more likely to perform well if the bulk of the rows can be
eliminated early in the plan. If this does not happen, unnecessary
comparisons may be made on rows that are simply eliminated later.
This tends to increase CPU usage with no performance benefit.

If these rows can be eliminated early in the access path using a selective
predicate then this may significantly enhance the query performance.
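
As an illustrative sketch (the ORDERS and CUSTOMERS tables are
hypothetical), the selective predicate here should be applied when ORDERS
is first visited, so that only the few surviving rows are joined to
CUSTOMERS:

SELECT o.order_id, c.cust_name
FROM orders o, customers c
WHERE o.status = 'OPEN'        -- selective: eliminates most rows early
AND o.cust_id = c.cust_id;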


5. Over parsing

Over parsing implies that cursors are not being shared.

If statements are referenced multiple times then it makes sense to share
them rather than fill up the shared pool with multiple copies of
essentially the same statement. See:

[NOTE:62143.1] Main issues affecting the Shared Pool on Oracle 7 and 8
[NOTE:70075.1] Use of bind variables with CBO
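
For instance, the two forms below parse very differently: the literal
version creates a new cursor for every distinct employee number, while the
bind version is parsed once and re-executed. A minimal SQL*Plus sketch:

-- hard parse for each distinct literal
SELECT ename FROM emp WHERE empno = 7839;
SELECT ename FROM emp WHERE empno = 7844;

-- one shareable cursor, re-executed with different bind values
VARIABLE n NUMBER
EXECUTE :n := 7839
SELECT ename FROM emp WHERE empno = :n;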


6. Missing indexes/use of 'wrong' indexes

If indexes are missing on key columns then queries will have to use full
table scans to retrieve data. Indexes added for performance should usually
be chosen to support the selective predicates included in queries.

If an unselective index is chosen in preference to a selective one then
potential solutions are:

RBO
- indexes have an equal ranking so row cache order is used. See [NOTE:73167.1]

CBO
- reanalyze with a higher sample size
- add histograms if column data has an uneven distribution of values
- add hints to force use of the index you require

Remember that index usage in joins can be compromised by the join type and
join order chosen. For more information on the use of indexes see
[NOTE:67522.1].
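
For example, a sketch of forcing a particular index with a hint (the index
name EMP_DEPT_IDX is illustrative):

-- tell the optimiser to use this specific index on EMP
SELECT /*+ INDEX(e emp_dept_idx) */ e.ename, e.sal
FROM emp e
WHERE e.deptno = 20;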



7. Wrong plan or join order selected

If the wrong plan has been selected then you may want to force the correct
one.

If the problem relates to an incorrect join order, then it often helps to
draw out the tables, linking them together to show how they join, e.g.:

A-B-C-D

E-F

This can help with visualisation of the join order and identification of
missing joins. When tuning a plan, try different join orders, examining
the number of rows returned at each step to get an idea of how good each
order may be.
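
If you have worked out a better join order than the one being chosen, the
ORDERED hint makes the optimiser join the tables in the order they appear
in the FROM clause. A sketch with hypothetical tables:

-- join A to B, then the result to C, in exactly this order
SELECT /*+ ORDERED */ a.col1, b.col2, c.col3
FROM a, b, c
WHERE a.id = b.a_id
AND b.id = c.b_id;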


8. Import estimating statistics on tables

Prior to 8i, import runs an ANALYZE ... ESTIMATE STATISTICS on all tables
that were analyzed when the tables were exported. This can result in
different performance after an export/import.

8i introduced more sampling functionality, including the facility to
extract statistics on export.
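
One way to sidestep the problem is to control statistics handling
explicitly, for example by exporting without statistics and gathering
fresh ones after the import. A sketch, assuming 8i exp/imp:

exp scott/tiger file=scott.dmp tables=(emp,dept) statistics=none
imp scott/tiger file=scott.dmp tables=(emp,dept)

then, in SQL*Plus:

execute dbms_stats.gather_schema_stats (ownname => 'SCOTT', cascade => true);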


9. Insufficiently high sample rate for CBO

If the CBO does not have the correct statistical information then it
cannot be expected to produce accurate results. Usually a sample size of
5% will be sufficient; however, in some cases it may be necessary to have
more accurate statistics at its disposal. Please see
[NOTE:44961.1] for analysis recommendations.
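
A sketch of re-gathering a problem table with a larger sample (the 50%
figure is illustrative; passing estimate_percent => null makes DBMS_STATS
compute exact statistics):

execute dbms_stats.gather_table_stats (ownname => 'SCOTT' -
, tabname => 'EMP' -
, estimate_percent => 50 -
, cascade => true);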


10. Skewed data

If column data distribution is non-uniform, then the use of column statistics
in the form of histograms should be considered. Histogram statistics do not
help with uniformly distributed data, or where no information about the
column predicate is available, such as with bind variables.
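
For example, a sketch of building a histogram on a skewed column with
DBMS_STATS (the ORDERS table, STATUS column, and bucket count are
illustrative):

-- 10 histogram buckets on the skewed STATUS column
execute dbms_stats.gather_table_stats (ownname => 'SCOTT' -
, tabname => 'ORDERS' -
, method_opt => 'FOR COLUMNS status SIZE 10' -
, cascade => true);
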
11. New features forcing use of CBO

A number of new features are not implemented in the RBO and their presence
in queries will force the use of the CBO. These include:

- Degree of parallelism set on any table in the query

- Index-only tables

- Partitioned tables

- Materialised views

See [NOTE:66484.1] for a more extensive list.


12. ITL contention

ITL contention can occur when there are not enough interested transaction
list (ITL) slots in each block to support the volume of concurrent updates
required. This can often occur after an export and import, especially when
no free space has been left in the blocks and INITRANS has not been
increased.
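
The usual remedy is to raise INITRANS and rebuild the segment, since
altering the attribute only affects newly formatted blocks. A sketch
(segment names and values illustrative):

-- affects new blocks only
ALTER TABLE sales INITRANS 10;
-- rebuild so existing blocks pick up the new setting
ALTER TABLE sales MOVE INITRANS 10;
-- the move invalidates the table's indexes; rebuild them too
ALTER INDEX sales_pk REBUILD INITRANS 10;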

Wednesday, November 10, 2004

Sizing Extents for Performance

www.kevinloney.com/free/extperf.doc

This is an edited excerpt from ORACLE8 Advanced Tuning and Administration, by Eyal Aronoff, Kevin Loney, and Noorali Sonawalla, published under Osborne/McGraw-Hill's Oracle Press imprint. This edited version first appeared on http://www.kevinloney.com.

During a full table scan, Oracle uses its multiblock read capability to scan multiple blocks at a time. The number of blocks read at a time is determined by the database's DB_FILE_MULTIBLOCK_READ_COUNT setting in the init.ora file and by the limitations of the operating system's read buffers. For example, if you are limited to a 64KB buffer for the operating system, and the database block size is 4KB, you can read no more than 16 database blocks during a single read.
Consider the SALES table again, with an 8MB initial extent and a 4MB second extent. For this example, assume that the highwatermark of SALES is located at the end of the second extent. In the first extent, there are 2048 database blocks, 4KB each in size (8 MB/4 KB = 2048). During a full table scan, those blocks will be read 16 at a time, for a total of 128 reads (2048/16 = 128). In the second extent, there are 1024 database blocks. During a full table scan, those blocks will be read 16 at a time, for a total of 64 reads (1024/16=64). Thus, scanning the entire table will require 192 reads (128 for the first extent, plus 64 for the second extent).
What if the two extents were combined, with the table having the same highwatermark? The combined extent would be 12MB in size, consisting of 3072 database blocks. During a full table scan, those blocks will be read 16 at a time, for a total of 192 reads. Despite the fact that the extents have been compressed into a single extent, the exact same number of reads is required because of the way the extents were sized. As you will see in the next section, the location of the extents may influence the efficiency of the reads, but the number of reads is the same in both the single- and two-extent examples.
What if SALES had 192 extents, each one 16 blocks in length? A full table scan would read 16 blocks at a time, for a total of 192 reads. Thus, whether the SALES table had 1, 2, or 192 extents, the exact same number of reads was required to perform the full table scan. The size of the extents is critical—the size of each extent must be a multiple of the number of blocks read during each multiblock read (set via the DB_FILE_MULTIBLOCK_READ_COUNT init.ora parameter value).
In the 192-extent example, the extent size matched the setting of the DB_FILE_MULTIBLOCK_READ_COUNT value (16). If each extent had been 20 blocks (instead of 16 blocks), how many reads would be required?
The SALES table contains 3072 blocks of data (12MB total). If each extent is 20 blocks (80KB), you'll need 154 extents to store the 3072 blocks of data. When reading the first extent during a full table scan, Oracle will read the first 16 blocks of the extent (as dictated by the DB_FILE_MULTIBLOCK_READ_COUNT). Because there are four blocks left in the extent, Oracle will issue a second read for that extent. Reads cannot span extents, so only four blocks are read by the second read. Therefore, the 20-block extent requires two reads. Since each extent is 20 blocks in length, each extent will require two reads. Because there are 154 extents, a full table scan of SALES will now require 308 reads, a 60 percent increase over the 192 reads previously required!
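
A quick way to see how many multiblock reads a full scan of a table
currently needs, given its extent layout, is a sketch like this (assuming
DB_FILE_MULTIBLOCK_READ_COUNT is 16 and the highwatermark sits at the end
of the last extent):

-- one CEIL term per extent, because a read cannot span extents
SELECT segment_name,
       SUM(CEIL(blocks / 16)) reads_for_full_scan
FROM dba_extents
WHERE owner = 'SCOTT'
AND segment_name = 'SALES'
GROUP BY segment_name;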

dbazine: Tuning and Performance

http://www.dbazine.com/foot3.shtml

Changing DB_FILE_MULTIBLOCK_READ_COUNT
This is a neat trick I learned from one of Rich Niemiec's articles. Rich has written numerous articles and given dozens of presentations on Oracle. If you don't have his latest book on tuning, Oracle Performance Tuning Tips & Techniques, you need to buy it.
If you are thinking about changing the DB_BLOCK_SIZE parameter to increase database performance, consider increasing the DB_FILE_MULTIBLOCK_READ_COUNT parameter. More blocks will be read during sequential read operations. In some cases, it will provide benefits similar to a larger DB_BLOCK_SIZE parameter and you won't have to rebuild the entire database.
Increasing the DB_FILE_MULTIBLOCK_READ_COUNT may have an impact on access path selection. Full table scans use multiblock reads, so the cost of a full table scan depends on the number of multiblock reads required to read the entire table, which in turn depends on how many blocks a single multiblock read covers, as specified by the DB_FILE_MULTIBLOCK_READ_COUNT parameter.
For this reason, the optimizer may be more likely to choose a full table scan when the value of this parameter is high.
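
The parameter can be experimented with at session level, without touching
init.ora; a sketch:

-- double the multiblock read size for this session and compare plans
alter session set db_file_multiblock_read_count = 32;

set autotrace traceonly explain
SELECT COUNT(*) FROM scott.sales;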