ORION is a standalone tool for calibrating the I/O performance for storage systems that are intended to be used for Oracle databases.
To find out the detail and play with it, you can find the detail here.
Monday, January 05, 2009
Friday, January 02, 2009
DBA 2.0
While the whole world is talking about the web 2.0, DBA 2.0 is also not a NEW thing. To learn more about it, here is a good article: Performing an Oracle DBA 1.0 to DBA 2.0 Upgrade. Actually, it was a dialogue between 2 Oracle gurus.
Tuesday, December 23, 2008
pga_aggregate_target and db_cache_size
In his blog article Analysing Statspack(8), Jonathan Lewis has talked about 2 rules:
pga_aggregate_target - A comfortable figure for an OLTP system would be in the 2M - 4M range per process.
db_cache_size - is always at least as large as 1M per process.
Take a note first and check reason later.
pga_aggregate_target - A comfortable figure for an OLTP system would be in the 2M - 4M range per process.
db_cache_size - is always at least as large as 1M per process.
Take a note first and check reason later.
Wednesday, November 26, 2008
Oracle 9.2.0.8 and 10.2.0.4 Patch download list
Oracle 9.2.0.8 patch:
ftp://updates.oracle.com/4547809/p4547809_92080_AIX64-5L.zip
ftp://updates.oracle.com/4547809/p4547809_92080_AXP.zip
ftp://updates.oracle.com/4547809/p4547809_92080_HP64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_HPUX-IA64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_LINUX-S390.zip
ftp://updates.oracle.com/4547809/p4547809_92080_LINUX.zip
ftp://updates.oracle.com/4547809/p4547809_92080_Linux-IA64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_Linux-x86-64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_MVS.zip
ftp://updates.oracle.com/4547809/p4547809_92080_SOLARIS.zip
ftp://updates.oracle.com/4547809/p4547809_92080_SOLARIS64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_TRU64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_WINNT.zip
ftp://updates.oracle.com/4547809/p4547809_92080_WINNT64.zip
Oracle 10.2.0.4 patch:
ftp://updates.oracle.com/6810189/p6810189_10204_AIX5L.zip
ftp://updates.oracle.com/6810189/p6810189_10204_HPUX-IA64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_HPUX-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Solaris-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Linux-x86-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Linux-x86.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Win32.zip
ftp://updates.oracle.com/6810189/p6810189_10204_MSWIN-x86-64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_AIX64-5L.zip
ftp://updates.oracle.com/4547809/p4547809_92080_AXP.zip
ftp://updates.oracle.com/4547809/p4547809_92080_HP64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_HPUX-IA64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_LINUX-S390.zip
ftp://updates.oracle.com/4547809/p4547809_92080_LINUX.zip
ftp://updates.oracle.com/4547809/p4547809_92080_Linux-IA64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_Linux-x86-64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_MVS.zip
ftp://updates.oracle.com/4547809/p4547809_92080_SOLARIS.zip
ftp://updates.oracle.com/4547809/p4547809_92080_SOLARIS64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_TRU64.zip
ftp://updates.oracle.com/4547809/p4547809_92080_WINNT.zip
ftp://updates.oracle.com/4547809/p4547809_92080_WINNT64.zip
Oracle 10.2.0.4 patch:
ftp://updates.oracle.com/6810189/p6810189_10204_AIX5L.zip
ftp://updates.oracle.com/6810189/p6810189_10204_HPUX-IA64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_HPUX-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Solaris-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Linux-x86-64.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Linux-x86.zip
ftp://updates.oracle.com/6810189/p6810189_10204_Win32.zip
ftp://updates.oracle.com/6810189/p6810189_10204_MSWIN-x86-64.zip
Thursday, November 20, 2008
Search Oracle Book on Google Book and Read online
Nowadays there is no much time to read Oracle books (I mean the real books). We rather to search in Metalink, Google or other Oracle related sites. As we all know, those books are not cheap. If we do want to buy them, we would like the company to pay for the bills:-)
Actually, we can find lots of Oracle books for online reading. Here is an example. You can search Oracle book by using Google Book:
http://books.google.com/books?q=Oracle&lr=&sa=N&start=0
You can even fine-tune your search. Here are some books for your reference:
Oracle Essentials: Oracle Database 10g
Oracle Performance Tuning
Oracle High Performance Tuning for 9i and 10g
Oracle Performance Troubleshooting: With Dictionary Internals SQL & Tuning
Be aware: you can NOT read full pages of the above books. If you feel those books really good, you will have to buy by yourself or company. Anyway, enjoy!
Actually, we can find lots of Oracle books for online reading. Here is an example. You can search Oracle book by using Google Book:
http://books.google.com/books?q=Oracle&lr=&sa=N&start=0
You can even fine-tune your search. Here are some books for your reference:
Oracle Essentials: Oracle Database 10g
Oracle Performance Tuning
Oracle High Performance Tuning for 9i and 10g
Oracle Performance Troubleshooting: With Dictionary Internals SQL & Tuning
Be aware: you can NOT read full pages of the above books. If you feel those books really good, you will have to buy by yourself or company. Anyway, enjoy!
Thursday, April 27, 2006
How to Misuse SQL's FROM Clause
http://www.onlamp.com/lpt/a/5220
Stéphane Faroult first discovered the relational model and the SQL language in 1983.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsqlpro05/html/sp05a1.asp
http://searchoracle.techtarget.com/ateQuestionNResponse/0,289625,sid41_cid611283_tax301455,00.html
Stéphane Faroult first discovered the relational model and the SQL language in 1983.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsqlpro05/html/sp05a1.asp
http://searchoracle.techtarget.com/ateQuestionNResponse/0,289625,sid41_cid611283_tax301455,00.html
Wednesday, February 01, 2006
Default Kernel Parameters for Sun OS 9.X Database
Metalink Note:187273.1 Type: PROBLEM
Last Revision Date: 21-OCT-2005 Status: PUBLISHED
Solution Description:
---------------------
This article is to define the default kernel parameter settings for the
Sun OS/Solaris Operating system running Oracle 8.X Enterprise Edition.
Solution Explanation:
---------------------
In Sun OS the /etc/system file is the kernel parameter file.
The current kernel parameters values can be displayed by using the
'sysdef -i' command.
- Refer to Note 70688.1"Location and Dispay of kernel parameters" for more
information on uderstanding the 'sysdef -i'output.
======================================================================
UNIX Kernel Parameters:
-----------------------
SEMMNI 100
- Defines the maximum number of semaphore sets in the entire system.
SEMMNS 256
- Defines the maximum semaphores on the system. This setting is a
minimum recommended value, for initialinstallation only. The SEMMNS
parameter should be set to the sum of the PROCESSES parameter for each Oracle
database, adding the largest one twice, and then adding an additional 10 for
each database.
SEMMSL 256
- Defines the minimum recommended value, for initial installation only.
SHMMAX 4294967295
- Defines the maximum allowable size of one shared memory segment.
(4 GB = 4294967295)
SHMMIN 1
- Defines the minimum allowable size of a single shared memory segment.
SHMMNI 100
- Defines the maximum number of shared memory segments in the entire system.
SHMSEG 10
- Defines the maximum number of shared memory segments one process can attach.
Note: These are minimum kernel requirements for Oracle9i. If you have previously
tuned your kernel parameters to levels equal to or higher than these values,
continue to use the higher values. A system restart is necessary for kernel
changes to take effect.
------------------------------------------------------------------------------
* Example of Oracle required parameters:
set shmsys: shminfo_shmseg = 10
set shmsys: shminfo_shmmin = 1
set shmsys: shminfo_shmmni = 100
set semsys: seminfo_semmns = 256
set semsys: seminfo_semmni = 4096
set semsys: seminfo_semmsl = 256
set shmsys: shminfo_shmmax = 4294967295
Last Revision Date: 21-OCT-2005 Status: PUBLISHED
Solution Description:
---------------------
This article is to define the default kernel parameter settings for the
Sun OS/Solaris Operating system running Oracle 8.X Enterprise Edition.
Solution Explanation:
---------------------
In Sun OS the /etc/system file is the kernel parameter file.
The current kernel parameters values can be displayed by using the
'sysdef -i' command.
- Refer to Note 70688.1"Location and Dispay of kernel parameters" for more
information on uderstanding the 'sysdef -i'output.
======================================================================
UNIX Kernel Parameters:
-----------------------
SEMMNI 100
- Defines the maximum number of semaphore sets in the entire system.
SEMMNS 256
- Defines the maximum semaphores on the system. This setting is a
minimum recommended value, for initialinstallation only. The SEMMNS
parameter should be set to the sum of the PROCESSES parameter for each Oracle
database, adding the largest one twice, and then adding an additional 10 for
each database.
SEMMSL 256
- Defines the minimum recommended value, for initial installation only.
SHMMAX 4294967295
- Defines the maximum allowable size of one shared memory segment.
(4 GB = 4294967295)
SHMMIN 1
- Defines the minimum allowable size of a single shared memory segment.
SHMMNI 100
- Defines the maximum number of shared memory segments in the entire system.
SHMSEG 10
- Defines the maximum number of shared memory segments one process can attach.
Note: These are minimum kernel requirements for Oracle9i. If you have previously
tuned your kernel parameters to levels equal to or higher than these values,
continue to use the higher values. A system restart is necessary for kernel
changes to take effect.
------------------------------------------------------------------------------
* Example of Oracle required parameters:
set shmsys: shminfo_shmseg = 10
set shmsys: shminfo_shmmin = 1
set shmsys: shminfo_shmmni = 100
set semsys: seminfo_semmns = 256
set semsys: seminfo_semmni = 4096
set semsys: seminfo_semmsl = 256
set shmsys: shminfo_shmmax = 4294967295
9.2.0.4 databases to 9.2.0.6, 9.2.0.7 or wait for testing?
http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1157663,00.html?track=sy40
My name is Steve and I'm a Lead Oracle DBA. My question revolves around the quarterly security patches and the best 9i release of Oracle to be on. I have over 300 9.2.0.4 databases. We just started creating 9.2.0.6 databases with an eye toward going to 10.2 in the first quarter next year. My question/issue is this: Oracle released 9.2.0.7 recently, and my concern is that they will stop patching 9.2.0.6 in the near future (not sure of that date). I'd like to be on the terminal release of 9i but that's been a moving target. Anyway, should I upgrade my existing 9.2.0.4 databases to 9.2.0.6, 9.2.0.7 or wait for testing, etc. of 10.2 and just move them all to that release? Any advice would be helpful. Thanks.
This question posed on 10 January 2006
>This is a subject I have quite an interest in so I could probably spend hours discussing it. This patching issue is still relatively new to most DBAs and it can be especially painful if there are a large number of databases to support, as is the case here. I work for an IT consulting company, but I just spent the last two years at a large client and we had to tackle this exact problem. This client had more than 200 Oracle databases that had to be patched on a regular basis. With Sarbanes-Oxley (and other) regulatory compliance legislation, patching databases is no longer an option, but a necessity.
Your concern that patches will no longer be available for Oracle 9.2.0.6 is quite valid -- this, in fact, will happen one day soon. So, in my opinion, you need to develop a strategy that balances the practicality of patching but can tolerate some risk. For example, you can apply the appropriate CPU patches once per year (say, in late spring) and then plan to upgrade databases to the next release in the fall. In theory, the latest release will contain the latest CPUs. It's nearly impossible to upgrade 300 databases more than once per year; as well, it would be impossible to apply all four CPUs to these databases considering you would require outages which can be very difficult to obtain on production systems.
Whatever strategy you choose, make sure that it works for your organization and that you can justify it. Also, document the strategy and your rationale for choosing it.
My name is Steve and I'm a Lead Oracle DBA. My question revolves around the quarterly security patches and the best 9i release of Oracle to be on. I have over 300 9.2.0.4 databases. We just started creating 9.2.0.6 databases with an eye toward going to 10.2 in the first quarter next year. My question/issue is this: Oracle released 9.2.0.7 recently, and my concern is that they will stop patching 9.2.0.6 in the near future (not sure of that date). I'd like to be on the terminal release of 9i but that's been a moving target. Anyway, should I upgrade my existing 9.2.0.4 databases to 9.2.0.6, 9.2.0.7 or wait for testing, etc. of 10.2 and just move them all to that release? Any advice would be helpful. Thanks.
This question posed on 10 January 2006
>This is a subject I have quite an interest in so I could probably spend hours discussing it. This patching issue is still relatively new to most DBAs and it can be especially painful if there are a large number of databases to support, as is the case here. I work for an IT consulting company, but I just spent the last two years at a large client and we had to tackle this exact problem. This client had more than 200 Oracle databases that had to be patched on a regular basis. With Sarbanes-Oxley (and other) regulatory compliance legislation, patching databases is no longer an option, but a necessity.
Your concern that patches will no longer be available for Oracle 9.2.0.6 is quite valid -- this, in fact, will happen one day soon. So, in my opinion, you need to develop a strategy that balances the practicality of patching but can tolerate some risk. For example, you can apply the appropriate CPU patches once per year (say, in late spring) and then plan to upgrade databases to the next release in the fall. In theory, the latest release will contain the latest CPUs. It's nearly impossible to upgrade 300 databases more than once per year; as well, it would be impossible to apply all four CPUs to these databases considering you would require outages which can be very difficult to obtain on production systems.
Whatever strategy you choose, make sure that it works for your organization and that you can justify it. Also, document the strategy and your rationale for choosing it.
Monday, January 16, 2006
How to find obsolete parameters in Oracle
By querying V$OBSOLETE_PARAMETER.
If the value in column ISSPECIFIED is 'TRUE' then it is specified in the init.ora file.
select * from V$OBSOLETE_PARAMETER
where ISSPECIFIED = 'TRUE'
/
If the value in column ISSPECIFIED is 'TRUE' then it is specified in the init.ora file.
select * from V$OBSOLETE_PARAMETER
where ISSPECIFIED = 'TRUE'
/
Monday, January 09, 2006
Thursday, December 22, 2005
Salesforce "failover"?
CNET reported on December 21, 2005: A Salesforce.com outage lasting nearly a day cut off access to critical business data for many of the company's customers on Tuesday in what appears to be Salesforce's most severe service disruption to date.
Salesforce, which has been growing rapidly, has undertaken efforts to bolster its computing infrastructure. For instance, it has configured its database to run on four different computers so if a machine fails, others will pick up the slack, Francis said. But the "failover" feature didn't prevent Tuesday's problems.
Salesforce's database supplier helped to restore service, Francis said. While he declined to identify who that supplier was, he did identify Oracle as Salesforce's biggest database supplier.
To see the detail, go to http://news.com.com/Salesforce+outage+angers+customers/2100-1012_3-6004625.html?tag=nefd.top
Salesforce, which has been growing rapidly, has undertaken efforts to bolster its computing infrastructure. For instance, it has configured its database to run on four different computers so if a machine fails, others will pick up the slack, Francis said. But the "failover" feature didn't prevent Tuesday's problems.
Salesforce's database supplier helped to restore service, Francis said. While he declined to identify who that supplier was, he did identify Oracle as Salesforce's biggest database supplier.
To see the detail, go to http://news.com.com/Salesforce+outage+angers+customers/2100-1012_3-6004625.html?tag=nefd.top
Tuesday, December 06, 2005
Manually Resolving In-Doubt Transactions: Different Scenarios
NOTE1: If using Oracle 9i and DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY fails with
ORA-30019: Illegal rollback Segment operation in Automatic Undo mode, use the following workaround
SQL> alter session set "_smu_debug_mode" = 4;
SQL>execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');
select * from dba_2pc_pending
/
SQL> select LOCAL_TRAN_ID, STATE, MIXED, ADVICE from dba_2pc_pending;
LOCAL_TRAN_ID STATE MIX A
---------------------- ---------------- --- -
3.7.99084 prepared no
http://www-rohan.sdsu.edu/doc/oracle/server803/A54647_01/ch4e.htm
COMMIT FORCE '3.7.99084';
SQL> select LOCAL_TRAN_ID, STATE, MIXED, ADVICE from dba_2pc_pending;
LOCAL_TRAN_ID STATE MIX A
---------------------- ---------------- --- -
3.7.99084 forced commit no
SQL> select * from dba_pending_transactions;
FORMATID
----------
GLOBALID
--------------------------------------------------------------------------------
BRANCHID
--------------------------------------------------------------------------------
48801
34A257C2BC134A007FFD
73616D705841436F6E6E506F6F6C
alter session set "_smu_debug_mode" = 4;
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.7.99084');
SQL> select * from dba_pending_transactions;
no rows selected
SQL> select * from dba_pending_transactions;
no rows selected
ORA-30019: Illegal rollback Segment operation in Automatic Undo mode, use the following workaround
SQL> alter session set "_smu_debug_mode" = 4;
SQL>execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');
select * from dba_2pc_pending
/
SQL> select LOCAL_TRAN_ID, STATE, MIXED, ADVICE from dba_2pc_pending;
LOCAL_TRAN_ID STATE MIX A
---------------------- ---------------- --- -
3.7.99084 prepared no
http://www-rohan.sdsu.edu/doc/oracle/server803/A54647_01/ch4e.htm
COMMIT FORCE '3.7.99084';
SQL> select LOCAL_TRAN_ID, STATE, MIXED, ADVICE from dba_2pc_pending;
LOCAL_TRAN_ID STATE MIX A
---------------------- ---------------- --- -
3.7.99084 forced commit no
SQL> select * from dba_pending_transactions;
FORMATID
----------
GLOBALID
--------------------------------------------------------------------------------
BRANCHID
--------------------------------------------------------------------------------
48801
34A257C2BC134A007FFD
73616D705841436F6E6E506F6F6C
alter session set "_smu_debug_mode" = 4;
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.7.99084');
SQL> select * from dba_pending_transactions;
no rows selected
SQL> select * from dba_pending_transactions;
no rows selected
Wednesday, November 09, 2005
Developing Signatures for Data Buffers
Remember, change occurs in the data buffers rapidly, and sometimes a long-term analysis will provide clues that point to processing problems within the database. Almost every Oracle database exhibits patterns that are linked to regular processing schedules, called signatures.
To solve this problem, the DBA might schedule a dynamic adjustment to add more RAM to db_cache_size every day.
http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/burleson_auto_pt2.html
To solve this problem, the DBA might schedule a dynamic adjustment to add more RAM to db_cache_size every day.
http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/burleson_auto_pt2.html
Wednesday, October 26, 2005
Fast guide to finding and keeping an Oracle job
This guide provides tips and advice to help you stand out from the hundreds of other job seekers.
Frequently asked questions and myths about indexes
[SearchOracle.com]
Popular author Tom Kyte tackles some of the most common questions about Oracle indexes and debunks some myths in the process.
Popular author Tom Kyte tackles some of the most common questions about Oracle indexes and debunks some myths in the process.
Monday, October 24, 2005
44% of database devs use MySQL
44% of database devs use MySQL by ZDNet's ZDNet Research -- Open source database deployments are up more than 20% in the last six months, according to Evans Data. MySQL use, has increased by more than 25% in six months and is approaching a majority in the database space, with 44% of developers using the open source database. More than 60% of database developers say their [...]
Wednesday, October 19, 2005
Fast guide to finding and keeping an Oracle job
Elisa Gabbert, Assistant Site Editor
03.16.2005
Looking for your first Oracle DBA or developer job? Concerned about the future of your current position? You're not alone! With the IT market becoming increasingly competitive, it can be very difficult to find a job or keep the one you've got. We know how time-consuming it can be to search the Web for relevant career information, so we've gathered some valuable Oracle-related resources for you. This guide provides tips and advice to help you stand out from the hundreds of other job seekers. If you're interested in brushing up on the latest interviewing techniques, checking out Oracle salaries or simply searching for open positions, you've come to the right place.
03.16.2005
Looking for your first Oracle DBA or developer job? Concerned about the future of your current position? You're not alone! With the IT market becoming increasingly competitive, it can be very difficult to find a job or keep the one you've got. We know how time-consuming it can be to search the Web for relevant career information, so we've gathered some valuable Oracle-related resources for you. This guide provides tips and advice to help you stand out from the hundreds of other job seekers. If you're interested in brushing up on the latest interviewing techniques, checking out Oracle salaries or simply searching for open positions, you've come to the right place.
Essential performance forecasting, part 2: I/O
Craig Shallahamer
SearchOracle.com
As I wrote in the first part of this series, forecasting Oracle performance is absolutely essential for every DBA to understand and perform. When performance begins to degrade, it's the DBA who hears about it, and it's the DBA who's supposed to fix it. Fortunately, low precision forecasting can be done very quickly and it is a great way to get started forecasting Oracle performance. This time, I'll focus on I/O performance forecasting.
The key metrics we want to forecast are utilization, queue time, and response time. With only these three metrics, as a DBA you can perform all sorts of low precision what-if scenarios. To derive the values, you essentially need 3 things:
a few simple formulas
some basic operating system statistics
some basic Oracle statistics
Modern I/O subsystems can be extremely difficult to forecast. Just as with Oracle, there is batching, caching, and a host of complex algorithms centered around optimizing performance. While these features are great for performance, they make intricate forecast models very complex. This may seem like a problem, but actually it's not. I have found that by keeping the level of detail and complexity at a consistently lower level (i.e., less detail), overall system I/O forecasts are typically more than adequate.
At a very basic level, an I/O subsystem is modeled differently than a CPU subsystem. A CPU subsystem routes all transactions into a single queue. All CPUs feed off of this single queue. This is why, with a mature operating system, any one CPU should be just as busy as the next. If you have had I/O performance problems you know the situation is very different.
In contrast to a CPU subsystem, each I/O device has its own queue. A transaction cannot simply be routed to any device. It must go specifically where the data it needs resides or where it has been told to write a specific piece of data. This is why each device needs its own queue and why some I/O queues are longer than others. This is also why balancing IO between devices is still the number one I/O subsystem bottleneck solution.
Today an I/O device can mean just about anything. It could be a single physical disk, a disk partition, a raid array, or some combination of these. The key when forecasting I/O is whatever you call a "device" is a device throughout the entire forecast. If a device is a 5 disk raid array, then make sure whenever a device is referenced, everyone involved understands the two devices are actually two raid arrays, each with five physical disks. If your device definition is consistant, you'll avoid many problems.
The forecasting formulas we'll use below assume the I/O load is perfectly distributed across all devices. While today's I/O subsystems do a fantastic job at distributing I/O activity, many DBAs do not. I have found that while an array's disk activity is nearly perfectly balanced, the activity from one array to the next may not be very well balanced. Hint: If an I/O device is not very active (utilization less than 5%), do not count it as a device. It is better to be conservative then aggressive when forecasting.
Before you are inundated with formulas, it's important to understand some definitions and recognize their symbols.
S : Time to service one workload unit. This is known as the service time or service demand. It is how long it takes a device to service a single transaction. For example, 1.5 seconds per transaction or 1.5 sec/trx. For simplicity sake, this value will be derived.
U : Utilization or device busyness. Commonly shown as a percentage and that's how it works in our formulas. For example, in the formula it should be something like 75% or 0.75, but not 75. This value can be gathered from both sar or iostat.
λ : Workload arrival rate. This is how many transactions enter the system per unit of time. For example, 150 transactions each second or 150 trx/sec. When working with Oracle, there are many possible statistics that can be used for the "transaction" arrival rate. For simplicity sake, this value will be derived and will refer to the general workload.
M : Number of devices. You can get this from the sar or iostat report. Be careful not to count both a disk and a disk's partition, resulting in a double count.
W : Wait time or more commonly called queue time. This is how long a transaction must wait before it begins to be serviced. For simplicity sake, this value will be derived.
R : Response time. This is how long it takes for a single transaction to complete. This includes both the service time and any queue/wait time. This will be gathered from the sar and iostat command (details below).
The IO formulas for calculating averages are as follows:
U = ( S λ ) / M (1)
R = S / (1 - U) (2)
R = S + W (3)
Before we dive into real-life examples, let's check these formulas out by doing some thought experiments.
Thought experiment 1. Using formula (1), if the arrival rate doubles, so will the utilization.
Thought experiment 2. Using formula (1), if we used slower disks, the service time (S) would increase, and therefore the utilization would also increase.
Thought experiment 3. Using formula (2), if we used faster devices, the service time would decrease, then the response time would also decrease.
Thought experiment 4. Using formula (2), if the device utilization decreased, the denominator would increase, which would cause the response time to decrease.
Thought experiment 5. Using formula (3), if we used a faster devices, service time would decrease, then the response time would also decrease.
While gathering I/O subsystem data is simple, the actual meaning of the data and how to apply it to our formulas is not so trivial. One approach, which is fine for low precision forecasting like this, is to gather only the response time, the utilization, and the number of devices. From these values, we can derive the arrival rate and service time.
Gathering device utilization is very simple as both sar –d and iostat clearly label these columns. However, gathering response time is not that simple. What iostat labels as service time is more appropriately the response time. Response time from sar –d is what you would expect, the service time plus the wait time. (For details, see "System Performance Tuning" by Musumeci and Loukides.)
There are many different ways we can forecast I/O subsystem activity. We could forecast at the device level or perhaps at the summary level. While detail level forecasting provides a plethora of numerical data, forecasting at the summary level allows us to easily communicate different configuration scenarios both numerically and graphically. For this article, I will present one way to consolidate all devices into a single representative device.
Capacity Planners like to call this process of consolidating or summarizing aggregation. While there are many ways to aggregate, the better the aggregation, the more precise and reliable your forecasts will be. For this example, our aggregation objective is to derive a single service time representing all devices and also the total system arrival rate. The total system arrival rate is trivial; it's just the sum of all the arrivals. Based upon the table below, the total arrival rate is 0.34 trx/ms.
To aggregate the service time, we should weight the average device service time based upon each respective device's arrival rate. But for simplicity and space, we will simply use the average service time across all devices. Based upon the table below, the average service time is 4.84 ms/trx.
Armed with the number of devices 5, the average service time 4.84 ms, and the system arrival rate of 0.34 trx/ms, we are ready to forecast!
Example 1. Let's say the I/O workload is expected to increase 20% each quarter and you need to now when the I/O subsystem will need to be upgraded. To answer the classic question, "When will we run out of gas?", we will forecast the average queue/wait time, response time, and utilization. The table below shows the raw forecast values.
Here's an example of the calculations with the arrival rate increased by 80% (arrival rate 0.71 trx/ms).
U = ( S λ ) / M = ( 4.84*0.71 ) / 5 = 0.69
R = S / (1 - U) = 4.84 / ( 1 – 0.69 ) = 15.46
W = R – S = 15.46 – 4.84 = 10.62
So what's the answer to our question? Technically speaking the system will operate with a 120% workload increase. But stating that in front of management is what I would call a "career decision." Looking closely at the forecasted utilization, the wait time, and the response time, you can see that once the utilization goes over 57%, the wait time skyrockets! Take a look at the resulting classic response time graph below.
SearchOracle.com
As I wrote in the first part of this series, forecasting Oracle performance is absolutely essential for every DBA to understand and perform. When performance begins to degrade, it's the DBA who hears about it, and it's the DBA who's supposed to fix it. Fortunately, low precision forecasting can be done very quickly and it is a great way to get started forecasting Oracle performance. This time, I'll focus on I/O performance forecasting.
The key metrics we want to forecast are utilization, queue time, and response time. With only these three metrics, as a DBA you can perform all sorts of low precision what-if scenarios. To derive the values, you essentially need 3 things:
a few simple formulas
some basic operating system statistics
some basic Oracle statistics
Modern I/O subsystems can be extremely difficult to forecast. Just as with Oracle, there is batching, caching, and a host of complex algorithms centered around optimizing performance. While these features are great for performance, they make intricate forecast models very complex. This may seem like a problem, but actually it's not. I have found that by keeping the level of detail and complexity at a consistently lower level (i.e., less detail), overall system I/O forecasts are typically more than adequate.
At a very basic level, an I/O subsystem is modeled differently than a CPU subsystem. A CPU subsystem routes all transactions into a single queue. All CPUs feed off of this single queue. This is why, with a mature operating system, any one CPU should be just as busy as the next. If you have had I/O performance problems you know the situation is very different.
In contrast to a CPU subsystem, each I/O device has its own queue. A transaction cannot simply be routed to any device. It must go specifically where the data it needs resides or where it has been told to write a specific piece of data. This is why each device needs its own queue and why some I/O queues are longer than others. This is also why balancing IO between devices is still the number one I/O subsystem bottleneck solution.
Today an I/O device can mean just about anything. It could be a single physical disk, a disk partition, a raid array, or some combination of these. The key when forecasting I/O is whatever you call a "device" is a device throughout the entire forecast. If a device is a 5 disk raid array, then make sure whenever a device is referenced, everyone involved understands the two devices are actually two raid arrays, each with five physical disks. If your device definition is consistant, you'll avoid many problems.
The forecasting formulas we'll use below assume the I/O load is perfectly distributed across all devices. While today's I/O subsystems do a fantastic job at distributing I/O activity, many DBAs do not. I have found that while an array's disk activity is nearly perfectly balanced, the activity from one array to the next may not be very well balanced. Hint: If an I/O device is not very active (utilization less than 5%), do not count it as a device. It is better to be conservative then aggressive when forecasting.
Before you are inundated with formulas, it's important to understand some definitions and recognize their symbols.
S : Time to service one workload unit. This is known as the service time or service demand. It is how long it takes a device to service a single transaction. For example, 1.5 seconds per transaction or 1.5 sec/trx. For simplicity sake, this value will be derived.
U : Utilization or device busyness. Commonly shown as a percentage and that's how it works in our formulas. For example, in the formula it should be something like 75% or 0.75, but not 75. This value can be gathered from both sar or iostat.
λ : Workload arrival rate. This is how many transactions enter the system per unit of time. For example, 150 transactions each second or 150 trx/sec. When working with Oracle, there are many possible statistics that can be used for the "transaction" arrival rate. For simplicity sake, this value will be derived and will refer to the general workload.
M : Number of devices. You can get this from the sar or iostat report. Be careful not to count both a disk and a disk's partition, resulting in a double count.
W : Wait time or more commonly called queue time. This is how long a transaction must wait before it begins to be serviced. For simplicity sake, this value will be derived.
R : Response time. This is how long it takes for a single transaction to complete. This includes both the service time and any queue/wait time. This will be gathered from the sar and iostat command (details below).
The IO formulas for calculating averages are as follows:
U = ( S λ ) / M (1)
R = S / (1 - U) (2)
R = S + W (3)
Before we dive into real-life examples, let's check these formulas out by doing some thought experiments.
Thought experiment 1. Using formula (1), if the arrival rate doubles, so will the utilization.
Thought experiment 2. Using formula (1), if we used slower disks, the service time (S) would increase, and therefore the utilization would also increase.
Thought experiment 3. Using formula (2), if we used faster devices, the service time would decrease, then the response time would also decrease.
Thought experiment 4. Using formula (2), if the device utilization decreased, the denominator would increase, which would cause the response time to decrease.
Thought experiment 5. Using formula (3), if we used a faster devices, service time would decrease, then the response time would also decrease.
While gathering I/O subsystem data is simple, the actual meaning of the data and how to apply it to our formulas is not so trivial. One approach, which is fine for low precision forecasting like this, is to gather only the response time, the utilization, and the number of devices. From these values, we can derive the arrival rate and service time.
Gathering device utilization is very simple as both sar –d and iostat clearly label these columns. However, gathering response time is not that simple. What iostat labels as service time is more appropriately the response time. Response time from sar –d is what you would expect, the service time plus the wait time. (For details, see "System Performance Tuning" by Musumeci and Loukides.)
There are many different ways we can forecast I/O subsystem activity. We could forecast at the device level or perhaps at the summary level. While detail level forecasting provides a plethora of numerical data, forecasting at the summary level allows us to easily communicate different configuration scenarios both numerically and graphically. For this article, I will present one way to consolidate all devices into a single representative device.
Capacity Planners like to call this process of consolidating or summarizing aggregation. While there are many ways to aggregate, the better the aggregation, the more precise and reliable your forecasts will be. For this example, our aggregation objective is to derive a single service time representing all devices and also the total system arrival rate. The total system arrival rate is trivial; it's just the sum of all the arrivals. Based upon the table below, the total arrival rate is 0.34 trx/ms.
To aggregate the service time, we should weight the average device service time based upon each respective device's arrival rate. But for simplicity and space, we will simply use the average service time across all devices. Based upon the table below, the average service time is 4.84 ms/trx.
Armed with the number of devices 5, the average service time 4.84 ms, and the system arrival rate of 0.34 trx/ms, we are ready to forecast!
Example 1. Let's say the I/O workload is expected to increase 20% each quarter and you need to now when the I/O subsystem will need to be upgraded. To answer the classic question, "When will we run out of gas?", we will forecast the average queue/wait time, response time, and utilization. The table below shows the raw forecast values.
Here's an example of the calculations with the arrival rate increased by 80% (arrival rate 0.71 trx/ms).
U = ( S λ ) / M = ( 4.84*0.71 ) / 5 = 0.69
R = S / (1 - U) = 4.84 / ( 1 – 0.69 ) = 15.46
W = R – S = 15.46 – 4.84 = 10.62
So what's the answer to our question? Technically speaking the system will operate with a 120% workload increase. But stating that in front of management is what I would call a "career decision." Looking closely at the forecasted utilization, the wait time, and the response time, you can see that once the utilization goes over 57%, the wait time skyrockets! Take a look at the resulting classic response time graph below.
Subscribe to:
Posts (Atom)