Saturday, November 15, 2008

Backup and Recovery in Oracle

Backup and Recovery

In this part, I discuss some of the options that the Oracle architecture gives you when it comes to backing up and recovering your database.

Backup and Recovery Options

This section outlines at a high level some of the options available for backing up and restoring your database. I discuss the types of failure that can occur and the actions to take. The major part of this section describes preventive action to guard against loss of your database files.

This section discusses the available options in theory. The backup and recovery options mentioned here, along with other available options, are discussed in greater detail in later posts.

Different Types of Failure

The major types of failure that can occur are statement failure, user process failure, machine failure, distributed transaction failure, instance failure, and disk failure/file loss.

Statement Failure

In Oracle, a DML statement such as UPDATE operates on either all the rows satisfying its WHERE clause or none at all.

Failure with a statement can occur for myriad reasons. For example, when you insert rows into a table, the table might require more storage; if the database software discovers that no more free storage is available, it returns an error message to the user and the statement fails. Oracle never leaves a statement half done: even if the failure occurs partway through the statement, the rows it has already modified are returned to their previous state. This is known as statement-level rollback.

Note that other DML statements in the transaction remain in a pending state ready for a commit or rollback.
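Here is a minimal sketch of this behavior from SQL*Plus (the table names emp and emp_archive are hypothetical):

    UPDATE emp SET sal = sal * 1.10;            -- succeeds; change is pending
    INSERT INTO emp_archive SELECT * FROM emp;  -- suppose this fails for lack of free storage
    -- Only the failed INSERT is rolled back. The UPDATE above remains
    -- pending, waiting for an explicit COMMIT or ROLLBACK.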

User Process Failure

A user process failure occurs when the user process making the connection to the database terminates abnormally during execution. For example, the system administrator could have killed the user process. If this does occur, the Oracle background process PMON automatically rolls back any changes for the current transaction. All changes already committed by the user process are saved, but inserts, updates, and deletes since the last commit or rollback are reversed.

Also, the PMON background process releases any locks, rollback segments, and other system resources acquired by the user process when it was alive. No database administrator involvement is necessary. The database continues to function as usual, and the tables are accessible to other users. (A slight delay could occur before the locks are released.)

Machine Failure

When the machine on which the database server is running fails and shuts down (the power is turned off, for example), the Oracle instance stops running. As long as no database files are lost, the only action required of the database administrator is restarting the Oracle instance. When you do this, the SMON background process reads the online redo log files and reapplies any changes for committed transactions. Any changes that had not been committed are rolled back.

Remember that a COMMIT statement writes only the changes to the redo log files; it does not write the database blocks back to disk at the point at which the commit was issued. If the database blocks with committed changes were written to the database files before the machine failure, the SMON background process obviously does not need to reapply the changes for those blocks.

Instance Failure

Instance failure occurs when the machine is still up and running but the Oracle instance itself fails (perhaps one of the background processes was killed). This situation is very similar to machine failure in that the database administrator needs only to restart the instance; the SMON process reapplies any changes. When restarting the instance after this kind of failure, you will notice no difference from when the instance is started after a normal shutdown.

Distributed Transaction Failure

A distributed transaction is one that involves changes to more than one database. If a failure occurs during a distributed transaction, the RECO background process (if it's running) automatically synchronizes the rollback of the transaction's changes across all the databases involved. Again, no manual intervention is required in all but the most serious cases. The database administrators of the instances involved in the distributed transaction can manually force the commit or rollback of the changes on their own databases, but I recommend that you leave the recovery to the RECO background process if possible. Manual intervention might be unavoidable, though, if the network links between the databases are down and the RECO processes on the instances cannot communicate.
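If manual intervention is unavoidable, the DBA can inspect the in-doubt transactions and force an outcome. Here is a rough sketch; the transaction ID shown is a made-up example:

    -- List in-doubt distributed transactions (requires DBA privileges).
    SELECT local_tran_id, state FROM dba_2pc_pending;

    -- Force the outcome of a specific in-doubt transaction.
    COMMIT FORCE '1.21.17';
    -- or: ROLLBACK FORCE '1.21.17';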

Disk Failure/File Loss

The only time you really need to concern yourself with recovery is when you lose one or more of the files making up the database: the database files themselves, the control file, or the redo logs. When that happens, some type of manual recovery is necessary.

If you lose one or more of the files (database, redo, control), you have available the options highlighted in the following sections. In every situation, you must work with a previous backup of the database.

Cold Backup

A cold backup is when you copy the three sets of files (database files, redo logs, and control file) when the instance is shut down. This is a straight file copy, usually from the disk directly to tape. You must shut down the instance to guarantee a consistent copy. (It is possible to back up the files without bringing the instance down; see the section titled "Hot Backup.")
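In outline, a cold backup from SQL*Plus might look like the following; the file locations are illustrative and assume a Unix-style host:

    SHUTDOWN IMMEDIATE
    HOST cp /u01/oradata/PROD/*.dbf /backup/PROD/
    HOST cp /u01/oradata/PROD/*.log /backup/PROD/
    HOST cp /u01/oradata/PROD/*.ctl /backup/PROD/
    STARTUP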

If you perform only cold backups, the only option available in the event of data file loss is restoring all the files from the latest backup. All work performed on the database since that backup is lost.

Archiving

If you've set up the database to run in ARCHIVELOG mode (easily done by the DBA), the database changes recorded in the redo logs are archived to an archive destination whenever the redo logs fill. Using this option, you have a complete record of changes made to the database files in the offline and online redo log files.
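Enabling ARCHIVELOG mode looks roughly like this; the parameter values and paths are illustrative:

    -- In INIT.ORA (path illustrative):
    --   log_archive_dest  = /u01/arch/PROD
    --   log_archive_start = TRUE

    STARTUP MOUNT
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;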

If you lose one or more of the database files, you could restore them from the last backup and reapply the changes since the last backup from the online and offline redo log files. You must have some kind of backup of the files and the complete set of online and offline redo logs from which to reapply all the changes made to the database.

With the archiving option, you lose no changes in the database if the complete set of redo logs is available. All the changes committed before the file was lost are reapplied. It is also possible to perform recovery if the control or redo log files are lost.

Hot Backups

Some sites (such as worldwide airline reservation systems) cannot shut down the database while making a backup copy of the files. For them, the cold backup is not an option.

You can use a different means of backing up your database—the hot backup. Issue a SQL command to indicate to Oracle, on a tablespace-by-tablespace basis, that you want to back up the files of the tablespace. The users can continue to make full use of the files, including making changes to the data. Once you have indicated that you want to back up the tablespace files, you can use your operating system to copy those files to your backup destination.

The database must be running in ARCHIVELOG mode for the hot backup option.
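For a single tablespace, the sequence is sketched below; the tablespace and file names are illustrative:

    ALTER TABLESPACE users BEGIN BACKUP;
    HOST cp /u01/oradata/PROD/users01.dbf /backup/PROD/
    ALTER TABLESPACE users END BACKUP;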

If a data loss failure does occur, you can restore the lost database files using the hot backup and the online and offline redo logs created since the backup was done. The database is restored to the most consistent state without any loss of committed transactions.

Export and Import

Along with the RDBMS software, Oracle provides two utilities that you can use to back up and restore the database. These utilities are useful to database administrators for system-wide backups and recoveries, and to application developers for temporary backups of their own data and for recovering objects into their own user accounts.

The Export utility dumps the definitions and data for the specified part of the database to an operating system binary file. The Import utility reads the file produced by an export, recreates the definitions of objects, and inserts the data.
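Both utilities run from the operating system prompt. For example, to export one user's objects and reload them later (the account, password, and file name are illustrative):

    exp scott/tiger FILE=scott.dmp OWNER=scott
    imp scott/tiger FILE=scott.dmp FROMUSER=scott TOUSER=scott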

For a full database import, you must have a template database already created.

If you use Export and Import as a means of backing up and recovering the database, you cannot recover all the changes made to the database since the export was performed. This is similar to the situation with the cold backup. The best you can do is recover the database to the time when the export was last performed.

On large, data-filled databases, the Export and Import utilities can take a long time to run; many hours is not unusual. However, the utilities do provide an option to selectively export and import different user accounts and even objects within an Oracle user account.

Multiplexing

In this part, I discuss the options Oracle provides for duplicating data to give an extra level of protection in the event of data loss.

Control Files

To protect against control file loss, the Oracle RDBMS software can maintain more than one copy of the control file. You do this by making a consistent copy of the existing control file and modifying an INIT.ORA parameter. This does not significantly impact the performance of the database. If all copies of the control file are lost, you can manually recreate them using the CREATE CONTROLFILE command.
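The INIT.ORA entry for multiplexed control files looks something like this; the file names and locations are illustrative, and ideally each copy sits on a separate disk:

    # INIT.ORA
    control_files = (/u01/oradata/PROD/control01.ctl,
                     /u02/oradata/PROD/control02.ctl)

Shut the instance down before copying the existing control file to the new location, then restart.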

Redo Logs

Redo logs record all the changes made to data blocks in the database. If the database is running in ARCHIVELOG mode and only the offline redo logs are lost, you should shut down the database and make another backup of the three sets of files for the database.

If the online redo logs are lost, however, you could lose some work because some of the information required to reapply changes to the database files is in the online redo log files. To guard against this, you can multiplex (mirror) the online redo log files in the same way as the control files. When the RDBMS software writes changes to one redo log, the exact same information is written to an exact copy of the redo log.
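Adding a second member to each online redo log group is sketched below; the group numbers and file names are illustrative:

    ALTER DATABASE ADD LOGFILE MEMBER
      '/u02/oradata/PROD/redo01b.log' TO GROUP 1;
    ALTER DATABASE ADD LOGFILE MEMBER
      '/u02/oradata/PROD/redo02b.log' TO GROUP 2;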

Distributed Databases

A distributed database is one logical database implemented as two or more physical databases, on either the same machine or separate machines thousands of miles apart. The system's designers decide where the tables should physically reside.

Each physical database has its own instance and sets of files, and the machines on which the databases reside are connected over a network. The location of tables can be made transparent to the application using database links and synonyms.

Oracle enables a transaction, and even a single statement, to access tables on two or more distributed databases. This requires no extra coding by the application developers.
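As a rough sketch, a database link and a synonym can make a remote table look local; the link, account, and table names here are all hypothetical:

    CREATE DATABASE LINK sales_link
      CONNECT TO scott IDENTIFIED BY tiger
      USING 'sales';                  -- network alias for the remote database

    CREATE SYNONYM orders FOR orders@sales_link;

    -- A single statement joining a local table to the remote one:
    SELECT c.name, o.total
    FROM customers c, orders o
    WHERE c.custno = o.custno;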

A distributed transaction is a transaction that modifies tables on more than one database and then expects all the changes to be committed. If there is any kind of failure, all the changes on all the databases are rolled back. A distributed transaction can involve many Oracle databases and only one non-Oracle database. The Oracle two-phase commit mechanism controls the synchronization of commits across all databases and can automatically roll back changes on all the databases if any kind of failure should occur. The RECO background process synchronizes this operation.

In addition to this functionality, Oracle also provides the capability to replicate tables from one database to others. This is called creating a snapshot of the table.

You create a snapshot with the CREATE SNAPSHOT command on the database where you want to have the copy of the data. The Oracle RDBMS software automatically propagates any changes made to the master copy of the table to each of the snapshot copies at user-defined intervals, without any manual intervention.
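A sketch of the command, run on the database that is to hold the copy; the names and refresh interval are illustrative, and REFRESH FAST assumes a snapshot log exists on the master table:

    CREATE SNAPSHOT emp_copy
      REFRESH FAST
      START WITH SYSDATE
      NEXT SYSDATE + 1                -- refresh once a day
      AS SELECT * FROM emp@hq_link;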

The snapshot mechanism enables you to make updates to the snapshot copy of the table, in which case the changes are sent from the copy table back to the master table.

What Is an RDBMS?

In recent years, database management systems (DBMS) have established themselves as the primary means of data storage for information systems ranging from large commercial transaction processing applications to PC-based desktop applications. At the heart of most of today's information systems is a relational database management system (RDBMS). RDBMSs have been the workhorse for data management operations for over a decade and continue to evolve and mature, providing sophisticated storage, retrieval, and distribution functions to enterprise-wide data processing and information management systems. Compared to file systems, relational database management systems provide organizations with the capability to easily integrate and leverage the massive amounts of operational data into meaningful information systems. The evolution of high-powered database engines such as Oracle7 has fostered the development of advanced "enabling" technologies including client/server, data warehousing, and online analytical processing, all of which comprise the core of today's state-of-the-art information management systems.

Examine the components of the term relational database management system. First, a database is an integrated collection of related data. Given a specific data item, the structure of a database facilitates access to data related to it, such as a student and all of his registered courses or an employee and his dependents. Next, a relational database is a type of database based on the relational model; non-relational databases commonly use a hierarchical, network, or object-oriented model as their basis. Finally, a relational database management system is the software that manages a relational database. These systems come in several varieties, ranging from single-user desktop systems to full-featured, global, enterprise-wide systems, such as Oracle7.

This blog discusses the basic elements of a relational database management system: the relational database and the software systems that manage it. Also included is a discussion of nonprocedural data access. If you are new to relational database technology, you'll have to change your thinking somewhat when it comes to referencing data nonprocedurally.

The Relational Database Model

Most of the database management systems used by commercial applications today are based on one of three basic models: the hierarchical model, the network model, or the relational model. The following sections describe the various differences and similarities of the models.

Hierarchical and Network Models


The first commercially available database management systems were of the hierarchical and CODASYL (network) types, and many of them are still in use with mainframe-based COBOL applications. Both network and hierarchical databases are quite complex in that they rely on permanent internal pointers to relate records to each other. For example, in an accounts payable application, a vendor record might contain a physical pointer in its record structure that points to purchase order records. Each purchase order record in turn contains pointers to purchase order line item records.

The process of inserting, updating, and deleting records using these types of databases requires synchronization of the pointers, a task that must be performed by the application. As you might imagine, this pointer maintenance requires a significant amount of application code (usually written in COBOL) that at times can be quite cumbersome.

Elements of the Relational Model


Relational databases rely on the actual attribute values as opposed to internal pointers to link records. Instead of using an internal pointer from the vendor record to purchase order records, you would link the purchase order record to the vendor record using a common attribute from each record, such as the vendor identification number.

Although the academic theory underlying the relational model is somewhat complex, there are some basic concepts and terminology you should be familiar with. Essentially, the relational model has three basic components: relational data structures, constraints that govern the organization of the data structures, and operations that are performed on the data structures.

Relational Data Structures

The relational model supports a single, "logical" structure called a relation, a two-dimensional data structure commonly called a table in the "physical" database. Attributes represent the atomic data elements that are related by the relation. For example, the Customer relation might contain such attributes about a customer as the customer number, customer name, region, credit status, and so on.


Key Values and Referential Integrity


Attributes are grouped with other attributes based on their dependency on a primary key value. A primary key is an attribute or group of attributes that uniquely identifies a row in a table. A table has only one primary key, and as a rule, every table has one. Because primary key values are used as identifiers, they cannot be null. Using the conventional notation for relations, an attribute is underlined to indicate that it is the primary key of the relation. If a primary key consists of several attributes, each attribute is underlined.

You can have additional attributes in a relation with values that you define as unique to the relation. Unlike primary keys, unique keys can contain null values. In practice, unique keys are used to prevent duplication in the table rather than to identify rows. Consider a relation that contains the attribute United States Social Security Number (SSN). In some rows, this attribute may be null, since not every person has an SSN; however, for a row that contains a non-null value for the SSN attribute, the value must be unique to the relation.

Linking one relation to another typically involves an attribute that is common to both relations. The common attributes are usually a primary key from one table and a foreign key from the other. Referential integrity rules dictate that foreign key values in one relation reference the primary key values in another relation. Foreign keys might also reference the primary key of the same relation. The figure illustrates two foreign key relationships.
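In Oracle SQL, these concepts map directly onto declarative constraints. An illustrative sketch, with all names hypothetical:

    CREATE TABLE vendor (
      vendor_id NUMBER       PRIMARY KEY,  -- primary key: unique and not null
      name      VARCHAR2(80) NOT NULL,
      ssn       VARCHAR2(11) UNIQUE        -- unique key: may be null
    );

    CREATE TABLE purchase_order (
      po_id     NUMBER PRIMARY KEY,
      vendor_id NUMBER REFERENCES vendor (vendor_id)  -- foreign key
    );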



Oracle and Client/Server


Oracle Corporation's reputation as a database company is firmly established in its full-featured, high-performance RDBMS server. With the database as the cornerstone of its product line, Oracle has evolved into more than just a database company, complementing its RDBMS server with a rich offering of well-integrated products designed specifically for distributed processing and client/server applications. As Oracle's database server has evolved to support large-scale enterprise systems for transaction processing and decision support, so too have its other products, to the extent that Oracle can provide a complete solution for client/server application development and deployment. This part presents an overview of client/server database systems and the Oracle product architectures that support their implementation.

An Overview of Client/Server Computing

The premise of client/server computing is to distribute the execution of a task among multiple processors in a network. Each processor is dedicated to a specific, focused set of subtasks that it performs best, and the end result is increased overall efficiency and effectiveness of the system as a whole. Splitting the execution of tasks between processors is done through a protocol of service requests; one processor, the client, requests a service from another processor, the server. The most prevalent implementation of client/server processing involves separating the user interface portion of an application from the data access portion.

On the client, or front end, of the typical client/server configuration is a user workstation operating with a Graphical User Interface (GUI) platform, usually Microsoft Windows, Macintosh, or Motif. At the back end of the configuration is a database server, often managed by a UNIX, NetWare, Windows NT, or VMS operating system.

Client/server architecture also takes the form of a server-to-server configuration. In this arrangement, one server plays the role of a client, requesting database services from another server. Multiple database servers can look like a single logical database, providing transparent access to data that is spread around the network.

Designing an efficient client/server application is something of a balancing act, the goal of which is to evenly distribute execution of tasks among processors while making optimal use of available resources. Given the increased complexity and processing power required to manage a GUI and the increased demands for throughput on database servers and networks, achieving the proper distribution of tasks is challenging. Client/server systems are inherently more difficult to develop and manage than traditional host-based application systems because of the following challenges:

The components of a client/server system are distributed across more varied types of processors. There are many more software components that manage client, network, and server functions, as well as an array of infrastructure layers, all of which must be in place and configured to be compatible with each other.

The complexity of GUI applications far outweighs that of their character-based predecessors. GUIs are capable of presenting much more information to the user and providing many additional navigation paths to elements of the interface.

Troubleshooting performance problems and errors is more difficult because of the increased number of components and layers in the system.

Databases in a Client/Server Architecture

Client/server technologies have changed the look and architecture of application systems in two ways. Not only has the supporting hardware architecture undergone substantial changes, but there have also been significant changes in the approach to designing the application logic of the system.

Prior to the advent of client/server technology, most Oracle applications ran on a single node. Typically, a character-based SQL*Forms application would access a database instance on the same machine, with the application and the RDBMS competing for the same CPU and memory resources. Not only was the system responsible for supporting all the database processing, but it was also responsible for executing the application logic. In addition, it was burdened with all the I/O processing for every terminal; each keystroke and display attribute was controlled by the same processor that handled database requests and application logic.

Client/server systems change this architecture considerably by splitting all of the interface management and much of the application processing from the host system processor and distributing it to the client processor.

Combined with the advances in hardware infrastructure, the increased capabilities of RDBMS servers have also contributed to changes in the application architecture. Prior to the release of Oracle7, Oracle's RDBMS was less sophisticated in its capability to support the processing logic necessary to maintain the integrity of data in the database. For example, primary and foreign key checking and enforcement was performed by the application. As a result, the database was highly reliant on application code for enforcement of business rules and integrity, making application code bulkier and more complex. Figure 2.1 illustrates the differences between traditional host-based applications and client/server applications. Client/server database applications can take advantage of the Oracle7 server features for implementation of some of the application logic.