| |||
|
free Oracle DBA tutorial
Oracle Jobs Ask A Question SQL Statement Tuning Backup and Recovery Concepts Oracle 11g New Features Oracle E Suite & Others Oracle Data Guard Oracle DBA FAQ |
Until now, the export/import toolset has been the utility of choice for transferring data across multiple platforms with minimal effort, despite common complaints about its lack of speed. Import merely reads each record from the export dump file and inserts it into the target table using the usual INSERT INTO command, so it's no surprise that import can be a slow process.
Enter Oracle Data Pump, the newer and faster sibling of the export/import toolkit in Oracle Database 10g, designed to speed up the process many times over.
Data Pump reflects a complete overhaul of the export/import process. Instead of using the usual SQL commands, it provides proprietary APIs to load and unload data significantly faster. In my tests, I have seen performance increases of 10-15 times over export in direct mode and 5-times-over performance increases in the import process. In addition, unlike with the export utility, it is possible to extract only specific types of objects such as procedures.
create directory dpdata1 as '/u02/dpdata1';
Note the parameter job_name above, a special one not found in the original export. All Data Pump work is done though jobs. Data Pump jobs, unlike DBMS jobs, are merely server processes that process the data on behalf of the main process. The main process, known as a master control process, coordinates this effort via Advanced Queuing; it does so through a special table created at runtime known as a master table. In our example, if you check the schema of the user ANANDA while expdp is running you will notice the existence of a table CASES_EXPORT, corresponding to the parameter job_name. This table is dropped when expdp finishes.
Export>
This approach allows several commands to be entered on that DPE job. To find a summary, use the STATUS command at the prompt:
Worker 1 Status:
State: EXECUTING
Object Schema: DWOWNER
Object Name: CASES
Object Type: TABLE_EXPORT/TBL_TABLE_DATA/TABLE/TABLE_DATA
Completed Objects: 1
Total Objects: 1
Completed Rows: 4687818
expdp ananda/abc123 tables=CASES directory=DPDATA1
In parallel mode, the status screen will show four worker processes. (In default mode, only one process will be visible.) All worker processes extract data simultaneously and show their progress on the status screen.
It's important to separate the I/O channels for access to the database files and the dumpfile directory filesystems. Otherwise, the overhead associated with maintaining the Data Pump jobs may outweigh the benefits of parallel threads and hence degrade performance. Parallelism will be in effect only if the number of tables is higher than the parallel value and the tables are big.
select sid, serial#
This instruction shows the session of the foreground process. More useful information is obtained from the alert log. When the process starts up, the MCP and the worker processes are shown in the alert log as follows:
kupprdp: master process DM00 started with pid=23, OS id=20530 to execute -
kupprdp: worker process DW01 started with worker id=1, pid=24, OS id=20532 to execute -
kupprdp: worker process DW03 started with worker id=2, pid=25, OS id=20534 to execute -
It shows the PID of the sessions started for the data pump operation. You can find the actual SIDs using this query:
select sid, program from v$session where paddr in
The PROGRAM column will show the process DM (for master process) or DW (the worker proceses), corresponding to the names in the alert log file. If a parallel query is used by a worker process, say for SID 23, you can see it in the view V$PX_SESSION to find it out. It will show you all the parallel query sessions running from the worker process represented by SID 23:
select sid from v$px_session where qcsid = 23;
Additional useful information can be obtained from the view V$SESSION_LONGOPS to predict the time it will take to complete the job.
select sid, serial#, sofar, totalwork
The column totalwork shows the total amount of work, of which the sofar amount has been done up until now--which you can then use to estimate how much longer it will take.
impdp ananda/abc123 directory=dpdata1 dumpfile=expCASES.dmp job_name=cases_import
The default behavior of the import process is to create the table and all associated objects, and to produce an error when the table exists. Should you want to append the data to the existing table, you could use TABLE_EXISTS_ACTION=APPEND in the above command line.
As with Data Pump Export, pressing Control-C on the process brings up the interactive mode of Date Pump Import (DPI); again, the prompt is Import.
expdp ananda/iclaim directory=DPDATA1 dumpfile=expprocs.dmp include=PROCEDURE
To export only a few specific objects--say, function FUNC1 and procedure PROC1--you could use
expdp ananda/iclaim directory=DPDATA1 dumpfile=expprocs.dmp
This dumpfile serves as a backup of the sources. You can even use it to create DDL scripts to be used later. A special parameter called SQLFILE allows the creation of the DDL script file.
impdp ananda/iclaim directory=DPDATA1 dumpfile=expprocs.dmp sqlfile=procs.sql
Using the parameter INCLUDE allows you to define objects to be included or excluded from the dumpfile. You can use the clause INCLUDE=TABLE:"LIKE 'TAB%'" to export only those tables whose name start with TAB. Similarly, you could use the construct INCLUDE=TABLE:"NOT LIKE 'TAB%'" to exclude all tables starting with TAB. Alternatively you can use the EXCLUDE parameter to exclude specific objects.
Data Pump can also be used to transport tablespaces using external tables; it's sufficiently powerful to redefine parallelism on the fly, attach more tables to an existing process, and so on (which are beyond the scope of this article; see Oracle Database Utilities 10g Release 1 10.1 for more details). The following command generates the list of all available parameters in the Data Pump export utility:
expdp help=y
Similarly, impdp help=y will show all the parameters in DPI.
While Data Pump jobs are running, you can pause them by issuing STOP_JOB on the DPE or DPI prompts and then restart them with START_JOB. This functionality comes in handy when you run out of space and want to make corrections before continuing.
More Tutorials on Oracle dba ... Want to share or request Oracle Tutorial articles to become a Oracle DBA. Direct your requests to webmaster@oracleonline.info |
| |