7.09.2013

About the SQL INSERT command for a "large number of rows"

Many friends (software architects, DBAs, QA analysts) ask about "large" SQL inserts.

( Oracle Database )

Question:  What are the steps for tuning an insert SQL?  
It's running far too long and I need to understand how to optimize the insert for performance. 

Answer:  Many factors affect Oracle insert SQL performance, and there are many things you can do to tune an insert statement.  When loading large volumes of data, you have several choices of tools, each with its own costs and performance benefits.

Some insert tuning techniques work across a multitude of workloads.  Remember, you have several choices for doing table inserts, including the Data Pump and SQL*Loader utilities, as well as PL/SQL bulk load tools such as the forall operator.  There are many types of Oracle inserts, each with distinct performance characteristics.
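As a rough illustration of the PL/SQL bulk load option mentioned above, here is a minimal sketch of a forall insert. The table load_demo and its columns are hypothetical names made up for this example, not something from the original notes.

```sql
-- Hypothetical target table, used only for illustration.
create table load_demo (id number, name varchar2(30));

declare
  type t_ids   is table of number       index by pls_integer;
  type t_names is table of varchar2(30) index by pls_integer;
  l_ids   t_ids;
  l_names t_names;
begin
  -- Build the rows in memory first.
  for i in 1 .. 10000 loop
    l_ids(i)   := i;
    l_names(i) := 'row ' || i;
  end loop;

  -- One forall statement sends all 10,000 rows to the SQL engine together,
  -- instead of switching between PL/SQL and SQL once per row.
  forall i in 1 .. l_ids.count
    insert into load_demo (id, name) values (l_ids(i), l_names(i));

  commit;
end;
/
```

The gain comes from fewer context switches between the PL/SQL and SQL engines. For very large row counts, the arrays would normally be filled and flushed in batches rather than all at once.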



=======================================================================

The fastest Oracle table insert rate I've ever seen was 400,000 rows per second, about 24 million rows per minute, using super-fast RAM disk (SSD), but Greg Rahn of Oracle notes SQL insert rates of upwards of 6 million rows per second using the Exadata firmware:

"One of the faster bulk (parallel nologging direct path from external table using direct path compression) load rates I've seen is just over 7.7 billion rows in under 20 minutes which equates to around 385,000,000 per minute or about 6,416,666 per second.
All the CPUs are running at around 99% user CPU during that load. That was loading to spinning rust (Exadata Storage). It would be even faster had compression not been used. That was on a HP Oracle DB Machine (64 Intel Harpertown CPU cores). "
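The rates quoted above can be checked with simple arithmetic. This is just a sanity check written as a query against dual, using only the figures given in the text:

```sql
select 400000 * 60                   as ramdisk_rows_per_min,  -- 24000000
       7700000000 / 20               as exadata_rows_per_min,  -- 385000000
       trunc(7700000000 / 20 / 60)   as exadata_rows_per_sec   -- 6416666
from dual;
```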

My complete notes are found in my book "Oracle Tuning: The Definitive Reference"; here are some general guidelines for tuning insert statements.
a - Disable/drop indexes and constraints - It's far faster to rebuild indexes after the data load, all at once. Indexes also rebuild cleaner, and with less I/O, if they reside in a tablespace with a large block size.
b - Manage segment header contention for parallel inserts - Make sure to define multiple freelists (or freelist groups) to remove contention for the table header. Multiple freelists add additional segment header blocks, removing the bottleneck.  You can also use Automatic Segment Space Management (bitmap freelists) to support parallel DML, but ASSM has some limitations.

c - Parallelize the load - You can invoke parallel DML (i.e. using the PARALLEL and APPEND hints) to have multiple insert processes load the same table. For this INSERT optimization, make sure to define multiple freelists and use the SQL "APPEND" option.  Mark Bobak notes that if you submit several separate jobs that insert into the table at the same time, using the APPEND hint may cause serialization, removing the benefit of parallel jobstreams.
 
d - APPEND into tables - By using the APPEND hint, you ensure that Oracle always grabs "fresh" data blocks by raising the high-water-mark for the table. If you are doing parallel insert DML, the append mode is the default and you don't need to specify an APPEND hint.  Mark Bobak notes "Also, if you're going w/ APPEND, consider putting the table into NOLOGGING mode, which will allow Oracle to avoid almost all redo logging."  Note that the APPEND hint applies to the INSERT ... SELECT form; as of Oracle 11g Release 2 it is ignored on a plain VALUES insert, where the APPEND_VALUES hint is the direct-path equivalent.
insert /*+ append */ into customer select 'hello', 'there' from dual;
 
e - Use a large blocksize - By defining large (e.g. 32k) blocksizes for the target table, you reduce I/O because more rows fit onto a block before a "block full" condition (as set by PCTFREE) unlinks the block from the freelist.
  • See benchmark test of blocksize and inserts here
  • See general benefits of multiple blocksizes here
f - Use NOLOGGING
g - RAM disk - You can use high-speed solid-state disk (RAM-SAN) to make Oracle inserts run up to 300x faster than platter disk.
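Several of the guidelines above (unusable indexes, NOLOGGING, parallel DML with APPEND, then a rebuild) can be combined into one load script. The following is only a sketch under stated assumptions: the customer_stage table and the customer_name_idx index are hypothetical names, the index is assumed to be non-unique, and the degree of 4 is arbitrary.

```sql
-- Hypothetical objects: customer (target), customer_stage (source),
-- customer_name_idx (a non-unique index on customer).

-- a: stop maintaining the index during the load
alter index customer_name_idx unusable;
alter session set skip_unusable_indexes = true;

-- f: minimize redo for the direct-path load
alter table customer nologging;

-- c, d: parallel direct-path insert
alter session enable parallel dml;

insert /*+ append parallel(c, 4) */ into customer c
select /*+ parallel(s, 4) */ *
from   customer_stage s;

-- A direct-path insert must be committed before this session
-- can query the table again.
commit;

-- a: rebuild the index once, after the load
alter index customer_name_idx rebuild nologging parallel 4;

-- Restore normal settings after the load.
alter index customer_name_idx noparallel;
alter index customer_name_idx logging;
alter table customer logging;
```

Because NOLOGGING data cannot be recovered from the redo logs, a backup of the affected tablespace after the load is the usual precaution. If the database runs in FORCE LOGGING mode, the NOLOGGING setting has no effect.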





Insert performance (blocksize)


Here is my small single-CPU, single-user benchmark showing the performance of loads into a larger blocksize:

alter system set db_2k_cache_size=64m scope=spfile;
alter system set db_16k_cache_size=64m scope=spfile;
startup force
create tablespace twok blocksize 2k; -- using ASM defaults to 100m
create tablespace sixteenk blocksize 16k;
create table load2k tablespace twok as select * from dba_objects; -- creates 8k rows
drop table load2k; -- first create was to preload buffers

set timing on;
create table load2k tablespace twok as select * from dba_objects;
create table load16k tablespace sixteenk as select * from dba_objects;


For a larger sample, I re-issued the create processes with:

select * from dba_source; -- (80k rows)
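Spelled out, the larger run presumably looked something like the statements below. The table names src2k and src16k are my own placeholders; the original notes only give the source query.

```sql
set timing on

-- Same CTAS pattern as before, but reading the larger dba_source view.
-- src2k and src16k are hypothetical table names.
create table src2k  tablespace twok     as select * from dba_source;
create table src16k tablespace sixteenk as select * from dba_source;
```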

Even with this super-tiny sample on Linux using Oracle10g (with ASM), the larger blocksize loaded faster in both tests, by roughly 4 to 5 percent.

                    2k blksze     16k blksze
8k table size       4.33 secs     4.16 secs
80k table size      8.74 secs     8.31 secs
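To put the timings above in relative terms, the percentage improvement of the 16k runs over the 2k runs can be worked out directly from the table:

```sql
select round((4.33 - 4.16) / 4.33 * 100, 1) as pct_faster_8k,   -- 3.9
       round((8.74 - 8.31) / 8.74 * 100, 1) as pct_faster_80k   -- 4.9
from dual;
```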

Optimizing Oracle INSERT performance

The fastest Oracle table insert rate I've ever seen was 400,000 rows per second, about 24 million rows per minute, using super-fast RAM disk (SSD).

Speed of inserts is primarily a function of device speed, but NOLOGGING and maximum parallel DML (which is, in turn, a function of the number of CPUs and the layout of the disks) also factor into the equation. When using standard SQL statements to load Oracle data tables, there are several tuning approaches:
Using SSD for insert tablespaces
For databases that require high-speed loads, some shops define the insert table partition on solid-state disk (later moving it to platter disk).  Mike Ault notes in his book "Oracle Solid-State Disk Tuning" a respectable 30% improvement in load speed:
"In the SSD versus ATA benchmark the gains for insert and update processing as shown in the database loading and index build scenarios was a respectable 30%.
This 30% was due to the CPU overhead involved in the insert and update activities.
If the Oracle level processing for insert and update activities could be optimized for SSD, significant performance gains might be realized during these activities."
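The "load on SSD, then move to platter disk" approach described above could be sketched roughly as follows. All names here are hypothetical: a range-partitioned sales table, a tablespace ssd_ts on solid-state storage, and a tablespace disk_ts on conventional disk.

```sql
-- Hypothetical: create the current load partition on the SSD tablespace.
alter table sales
  add partition p_2013_09 values less than (date '2013-10-01')
  tablespace ssd_ts;

-- ... high-speed inserts go into partition p_2013_09 here ...

-- Once the load window is over, relocate the partition to platter disk.
-- UPDATE INDEXES keeps the affected index partitions usable during the move.
alter table sales
  move partition p_2013_09
  tablespace disk_ts
  update indexes;
```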

