Saturday, November 15, 2008

Oracle Files

In this part, I discuss the different types of files that Oracle uses on the hard disk of the machine it runs on.

Database Files

The database files hold the actual data and are typically the largest files in the database (from a few megabytes to many gigabytes). The other files (control files and redo logs) support the rest of the architecture. Depending on their sizes, the tables (and other objects) for all the user accounts could obviously go in one database file, but that arrangement is not ideal: it makes the database structure inflexible when you want to control storage allocation for different Oracle users, place parts of the database on different disk drives, or back up and restore just part of the database.

You must have at least one database file (adequate for a small or testing database), but usually, you have many more than one. In terms of accessing and using the data in the tables and other objects, the number (or location) of the files is immaterial.

The database files are fixed in size; they never grow beyond the size at which they were created.
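To see the files for yourself (a sketch, assuming you can query the standard DBA_DATA_FILES dictionary view from a DBA account), you can list the database files and their sizes from SQL*Plus:

    -- List each database file with its size in megabytes
    SELECT file_name, bytes/1024/1024 AS size_mb
    FROM   dba_data_files;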

Control Files

Any database must have at least one control file, although you typically have more than one to guard against loss. The control file records the name of the database, the date and time it was created, the location of the database and redo logs, and the synchronization information to ensure that all three sets of files are always in step. Every time you add a new database or redo log file to the database, the information is recorded in the control files.
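A quick way to check which control files your database is using (a sketch, assuming access to the V$CONTROLFILE dynamic performance view) is:

    -- Each row names one copy of the control file
    SELECT name FROM v$controlfile;

The same list appears in the CONTROL_FILES parameter in the instance parameter file; keeping the copies on separate disk drives is what guards against loss.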

Redo Logs

Any database must have at least two redo logs. These are the journals for the database; the redo logs record all changes to user objects and system objects. If any type of failure occurs, such as the loss of one or more database files, you can use the changes recorded in the redo logs to bring the database to a consistent state without losing any committed transactions. In the case of a failure that involves no loss of data, such as a machine crash, Oracle can apply the information in the redo logs automatically without intervention from the database administrator (DBA). The SMON background process automatically reapplies the committed changes in the redo logs to the database files.

Like the other files used by Oracle, the redo log files are fixed in size; they never grow beyond the size at which they were created.

Online Redo Logs

The online redo logs are the two or more redo log files that are always in use while the Oracle instance is up and running. Changes you make are recorded in the redo logs in turn: when one fills, Oracle switches to the next, and when the last one fills, the first is overwritten and the cycle continues.
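You can watch this cycling happen (a sketch, assuming access to the V$LOG and V$LOGFILE views); the group whose STATUS is CURRENT is the one being written to now:

    -- One row per redo log group; CURRENT marks the group in use
    SELECT group#, sequence#, bytes, status
    FROM   v$log;

    -- The file (or files) that make up each group
    SELECT group#, member
    FROM   v$logfile;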

Offline/Archived Redo Logs

The offline or archived redo logs are exact copies of the online redo logs that have been filled. Creating them is optional: Oracle produces them only when the database is running in ARCHIVELOG mode. In that mode, once an online redo log fills, the ARCH background process wakes up and copies it to the offline destination (typically another disk drive); while the copy is in progress, Oracle uses the other online redo log. If you have a complete set of offline redo logs covering the period since the database was last backed up, you have a complete record of the changes that have been made, and you can use that record to reapply the changes to a backup copy of the database files if one or more online database files are lost.
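To check whether a database is producing archived redo logs, and to switch it into ARCHIVELOG mode, you can use something like the following (a sketch; the statements are standard, but the steps assume the instance can be restarted and mounted without being opened):

    -- Reports ARCHIVELOG or NOARCHIVELOG
    SELECT log_mode FROM v$database;

    -- Switching modes requires a mounted but not yet open database
    STARTUP MOUNT
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;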

Other Supporting Files

When you start an Oracle instance (in other words, when the Oracle background processes are initiated and the memory structures allocated), the instance parameter file determines the sizes of the memory structures and the operating modes of the database. This parameter file is known as the INIT.ORA file (the actual filename has the Oracle instance identifier appended to it). It is an ordinary text file containing parameters for which you can override the default settings. The DBA is responsible for creating and modifying the contents of this parameter file.
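The fragment below gives the flavor of such a file (a hypothetical INITORCL.ORA; the parameter names are real Oracle parameters of this era, but the values and paths are purely illustrative and would be tuned per site):

    # INITORCL.ORA -- example instance parameter file (values illustrative)
    db_name          = ORCL
    control_files    = (/disk1/ctrl1ORCL.ctl, /disk2/ctrl2ORCL.ctl)
    db_block_buffers = 550      # number of database block buffers in the SGA
    log_buffer       = 32768    # size of the redo log buffer, in bytes
    processes        = 50       # maximum number of operating system processes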

On some Oracle platforms, an SGAPAD file is also created, which contains the starting memory address of the Oracle SGA.

System and User Processes

In this part, I discuss some of the Oracle system processes that must be running for the database to be usable, including the optional processes and the processes created for users connecting to the Oracle database.

Mandatory System Processes

The four Oracle system processes that must always be up and running for the database to be usable are DBWR (Database Writer), LGWR (Log Writer), SMON (System Monitor), and PMON (Process Monitor).
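You can confirm which background processes are running (a sketch, assuming access to the V$BGPROCESS view; the PADDR <> '00' condition is the conventional filter for started processes):

    -- List the background processes currently running for this instance
    SELECT name, description
    FROM   v$bgprocess
    WHERE  paddr <> '00';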

DBWR (Database Writer)

The database writer background process writes modified database blocks from the SGA to the database files. It writes only the blocks that have changed (for example, blocks containing a new, deleted, or changed record); these blocks are also called dirty blocks. The database writer writes out the least recently used blocks first. The blocks are not necessarily written to the database when a transaction commits; the only thing that always happens on a commit is that the changes are recorded and written to the online redo log files. The modified database blocks are written out later, when there are not enough free buffers in the SGA to read in a new block.

LGWR (Log Writer)

The log writer process writes the entries in the SGA's redo buffer for one or more transactions to the online redo log files. For example, when a transaction commits, the log writer must write the relevant entries from the redo log buffer to the redo log files on disk before the process receives a message indicating that the commit was successful. Once committed, the changes are safe on disk even though the modified database blocks are still waiting in the SGA's database buffer area to be written out by DBWR. SMON can always reapply the changes from the redo logs if the most up-to-date copies of the database blocks in memory are lost.
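One way to see this division of labor in the instance statistics (a sketch, assuming access to V$SYSSTAT; statistic names can vary between versions) is to compare commits with redo activity:

    -- Commits always force redo writes; database file writes happen later
    SELECT name, value
    FROM   v$sysstat
    WHERE  name IN ('user commits', 'redo entries', 'redo writes');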

SMON (System Monitor)

The system monitor process looks after the instance. If two transactions are both waiting for each other to release locks and neither of them can continue (known as a deadlock or deadly embrace), SMON detects the situation and one of the processes receives an error message indicating that a deadlock has occurred.

SMON also releases temporary segments that are no longer in use by the user processes that caused them to be created.

During idle periods, SMON compacts the free-space fragments in the database files, making it easier for Oracle to allocate storage for new database objects or for existing database objects to grow.

In addition, SMON automatically performs recovery when the Oracle instance is first started up (if none of the files have been lost). You won't see a message indicating that instance recovery is occurring, but the instance might take longer to come up.

PMON (Process Monitor)

The process monitor monitors the user processes. If any failure occurs with the user processes (for example, if the process is killed in the middle of a transaction), PMON automatically rolls back the work of the user process since the transaction started (anything since the last COMMIT or ROLLBACK). It releases any locks taken out and other system resources taken up by the failed process.

PMON also monitors the dispatcher and shared server processes, which are part of the multi-threaded server setup, and restarts them if they have died.

Optional System Processes

As well as the four mandatory system processes, there are a number of optional system processes that you can initiate.

ARCH (Archiver)

When the database is running in ARCHIVELOG mode and you've started the Archiver background process, it makes a copy of one of the online redo log files to the archive destination (the exact location is specified in an INIT.ORA parameter). In this way, you can have a complete history of changes made to the database files recorded in the offline and the online redo logs.

There is no point in keeping the Archiver background process running if the database is not running in ARCHIVELOG mode.
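Enabling the archiver typically comes down to a few instance parameters such as these (the parameter names are real for Oracle7/8-era databases, but the values and paths here are purely illustrative):

    log_archive_start  = true           # start ARCH when the instance starts
    log_archive_dest   = /disk3/arch/   # where the offline redo logs go
    log_archive_format = "ORCL_%s.arc"  # %s is the log sequence number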

CKPT (Checkpoint Process)

A checkpoint occurs when one of the online redo log files fills. That file will eventually be overwritten as the other online redo logs fill in turn, and once it is overwritten, the changes recorded in it are no longer available for reapplying after a system failure. At a checkpoint, therefore, the modified database buffer blocks are written down to the relative safety of the database files on disk by the database writer background process, so the record of changes in the filled redo log is no longer needed to recover from a system failure involving lost memory areas. Only after the checkpoint completes can the redo log be reused.

At a checkpoint, all the database file headers and redo log file headers are also updated to record the fact that a checkpoint has occurred. By default, the LGWR background process performs this updating, which can take significant time if there are a large number of database and redo log files; the entire database might have to wait for the checkpoint to complete before the redo logs can record further database changes. To take this header-updating work away from LGWR, you can initiate the optional checkpoint process.

A checkpoint can occur at other times, such as when the entries in the redo log files reach a limit defined by the database administrator.
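The relevant Oracle7-era instance parameters look something like this (real parameter names; illustrative values):

    checkpoint_process      = true   # start the optional CKPT process
    log_checkpoint_interval = 10000  # redo blocks written between checkpoints
    log_checkpoint_timeout  = 1800   # force a checkpoint every 1800 seconds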

What Is an RDBMS?

In recent years, database management systems (DBMS) have established themselves as the primary means of data storage for information systems, ranging from large commercial transaction processing applications to PC-based desktop applications. At the heart of most of today's information systems is a relational database management system (RDBMS). RDBMSs have been the workhorse for data management operations for over a decade and continue to evolve and mature, providing sophisticated storage, retrieval, and distribution functions to enterprise-wide data processing and information management systems. Compared to file systems, relational database management systems give organizations the capability to easily integrate and leverage massive amounts of operational data into meaningful information systems. The evolution of high-powered database engines such as Oracle7 has fostered the development of advanced "enabling" technologies including client/server, data warehousing, and online analytical processing, all of which comprise the core of today's state-of-the-art information management systems.

Consider the components of the term relational database management system. First, a database is an integrated collection of related data. Given a specific data item, the structure of a database facilitates access to data related to it, such as a student and all of his registered courses, or an employee and his dependents. Next, a relational database is a type of database based on the relational model; non-relational databases commonly use a hierarchical, network, or object-oriented model as their basis. Finally, a relational database management system is the software that manages a relational database. These systems come in several varieties, ranging from single-user desktop systems to full-featured, global, enterprise-wide systems such as Oracle7.

This blog discusses the basic elements of a relational database management system: the relational database and the software systems that manage it. Also included is a discussion of nonprocedural data access. If you are new to relational database technology, you'll have to change your thinking somewhat when it comes to referencing data nonprocedurally.

The Relational Database Model

Most of the database management systems used by commercial applications today are based on one of three basic models: the hierarchical model, the network model, or the relational model. The following sections describe the differences and similarities among these models.

Hierarchical and Network Models


The first commercially available database management systems were of the CODASYL type, and many of them are still in use with mainframe-based, COBOL applications. Both network and hierarchical databases are quite complex in that they rely on the use of permanent internal pointers to relate records to each other. For example, in an accounts payable application, a vendor record might contain a physical pointer in its record structure that points to purchase order records. Each purchase order record in turn contains pointers to purchase order line item records.

The process of inserting, updating, and deleting records using these types of databases requires synchronization of the pointers, a task that must be performed by the application. As you might imagine, this pointer maintenance requires a significant amount of application code (usually written in COBOL) that at times can be quite cumbersome.

Elements of the Relational Model


Relational databases rely on actual attribute values, as opposed to internal pointers, to link records. Instead of using an internal pointer from the vendor record to purchase order records, you would link the purchase order record to the vendor record using a common attribute from each record, such as the vendor identification number.
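In SQL, that link is expressed at query time as a join on the shared attribute; no pointer maintenance is involved. A sketch, using hypothetical VENDOR and PURCHASE_ORDER tables:

    -- Join purchase orders to their vendors through the common attribute
    SELECT v.vendor_name, po.po_number, po.po_date
    FROM   vendor v, purchase_order po
    WHERE  po.vendor_id = v.vendor_id;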

Although the academic theory underlying the relational model is somewhat complex, you should be familiar with some basic concepts and terminology. Essentially, there are three basic components of the relational model: relational data structures, constraints that govern the organization of the data structures, and operations that are performed on the data structures.

Relational Data Structures

The relational model supports a single, "logical" structure called a relation, a two-dimensional data structure commonly called a table in the "physical" database. Attributes represent the atomic data elements that are related by the relation. For example, the Customer relation might contain such attributes about a customer as the customer number, customer name, region, credit status, and so on.
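In SQL terms, such a relation might be declared like this (a hypothetical CUSTOMER table; names and sizes are illustrative):

    CREATE TABLE customer (
        customer_number NUMBER,        -- atomic attributes...
        customer_name   VARCHAR2(50),
        region          VARCHAR2(20),
        credit_status   VARCHAR2(10)   -- ...one column per attribute
    );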


Key Values and Referential Integrity


Attributes are grouped with other attributes based on their dependency on a primary key value. A primary key is an attribute or group of attributes that uniquely identifies a row in a table. A table has only one primary key, and as a rule, every table has one. Because primary key values are used as identifiers, they cannot be null. Using the conventional notation for relations, an attribute is underlined to indicate that it is the primary key of the relation. If a primary key consists of several attributes, each attribute is underlined.

You can have additional attributes in a relation with values that you define as unique within the relation. Unlike primary keys, unique keys can contain null values. In practice, unique keys are used to prevent duplication in the table rather than to identify rows. Consider a relation that contains a United States Social Security Number (SSN) attribute. In some rows, this attribute may be null, since not every person has an SSN; however, for a row that contains a non-null value for the SSN attribute, the value must be unique within the relation.

Linking one relation to another typically involves an attribute that is common to both relations. The common attributes are usually a primary key from one table and a foreign key from the other. Referential integrity rules dictate that foreign key values in one relation reference the primary key values in another relation. Foreign keys might also reference the primary key of the same relation. The sketch below illustrates two foreign key relationships.
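A sketch of these ideas in SQL, using hypothetical DEPARTMENT and EMPLOYEE tables, with one foreign key referencing another relation and one referencing the same relation:

    CREATE TABLE department (
        dept_id   NUMBER       CONSTRAINT pk_dept PRIMARY KEY,
        dept_name VARCHAR2(30)
    );

    CREATE TABLE employee (
        emp_id     NUMBER       CONSTRAINT pk_emp PRIMARY KEY,  -- cannot be null
        emp_name   VARCHAR2(50),
        ssn        CHAR(9)      CONSTRAINT uq_emp_ssn UNIQUE,   -- may be null
        dept_id    NUMBER       CONSTRAINT fk_emp_dept
                                REFERENCES department (dept_id),
        manager_id NUMBER       CONSTRAINT fk_emp_mgr
                                REFERENCES employee (emp_id)    -- self-reference
    );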

Oracle and Client/Server


Oracle Corporation's reputation as a database company is firmly established in its full-featured, high-performance RDBMS server. With the database as the cornerstone of its product line, Oracle has evolved into more than just a database company, complementing its RDBMS server with a rich offering of well-integrated products designed specifically for distributed processing and client/server applications. As Oracle's database server has evolved to support large-scale enterprise systems for transaction processing and decision support, so too have its other products, to the extent that Oracle can provide a complete solution for client/server application development and deployment. This part presents an overview of client/server database systems and the Oracle product architectures that support their implementation.

An Overview of Client/Server Computing

The premise of client/server computing is to distribute the execution of a task among multiple processors in a network. Each processor is dedicated to a specific, focused set of subtasks that it performs best, and the end result is increased overall efficiency and effectiveness of the system as a whole. Splitting the execution of tasks between processors is done through a protocol of service requests; one processor, the client, requests a service from another processor, the server. The most prevalent implementation of client/server processing involves separating the user interface portion of an application from the data access portion.

On the client, or front end, of the typical client/server configuration is a user workstation operating with a Graphical User Interface (GUI) platform, usually Microsoft Windows, Macintosh, or Motif. At the back end of the configuration is a database server, often managed by a UNIX, NetWare, Windows NT, or VMS operating system.

Client/server architecture also takes the form of a server-to-server configuration. In this arrangement, one server plays the role of a client, requesting database services from another server. Multiple database servers can look like a single logical database, providing transparent access to data that is spread around the network.
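Oracle implements this transparency with database links over SQL*Net. A sketch, with a hypothetical link name, account, and service name:

    -- Define a link from the local server to a remote sales database
    CREATE DATABASE LINK sales_db
        CONNECT TO scott IDENTIFIED BY tiger
        USING 'sales';

    -- Query a remote table as though it were local
    SELECT * FROM customer@sales_db;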

Designing an efficient client/server application is something of a balancing act, the goal of which is to distribute the execution of tasks evenly among processors while making optimal use of available resources. Given the increased complexity and processing power required to manage a GUI, and the increased demands for throughput on database servers and networks, achieving the proper distribution of tasks is challenging. Client/server systems are inherently more difficult to develop and manage than traditional host-based application systems because of the following challenges:

The components of a client/server system are distributed across more varied types of processors. There are many more software components that manage client, network, and server functions, as well as an array of infrastructure layers, all of which must be in place and configured to be compatible with each other.

The complexity of GUI applications far outweighs that of their character-based predecessors. GUIs are capable of presenting much more information to the user and providing many additional navigation paths to elements of the interface.

Troubleshooting performance problems and errors is more difficult because of the increased number of components and layers in the system.

Databases in a Client/Server Architecture

Client/server technologies have changed the look and architecture of application systems in two ways. Not only has the supporting hardware architecture undergone substantial changes, but there have also been significant changes in the approach to designing the application logic of the system.

Prior to the advent of client/server technology, most Oracle applications ran on a single node. Typically, a character-based SQL*Forms application would access a database instance on the same machine with the application and the RDBMS competing for the same CPU and memory resources. Not only was the system responsible for supporting all the database processing, but it was also responsible for executing the application logic. In addition, the system was burdened with all the I/O processing for each terminal on the system; each keystroke and display attribute was controlled by the same processor that processed database requests and application logic.

Client/server systems change this architecture considerably by splitting all of the interface management and much of the application processing from the host system processor and distributing it to the client processor.

Combined with the advances in hardware infrastructure, the increased capabilities of RDBMS servers have also contributed to changes in application architecture. Prior to the release of Oracle7, Oracle's RDBMS was less sophisticated in its capability to support the processing logic necessary to maintain the integrity of data in the database. For example, primary and foreign key checking and enforcement was performed by the application. As a result, the database was highly reliant on application code for enforcement of business rules and integrity, making application code bulkier and more complex. Client/server database applications can take advantage of Oracle7 server features to move some of this logic into the database itself, as the sketch below illustrates.
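For example, where pre-Oracle7 application code had to check that every new order referenced an existing customer, Oracle7 lets you declare the rule once in the server (a sketch with hypothetical table and column names):

    -- The server now rejects orders for nonexistent customers;
    -- no application code is needed to enforce the rule
    ALTER TABLE sales_order
        ADD CONSTRAINT fk_order_customer
        FOREIGN KEY (customer_number)
        REFERENCES customer (customer_number);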