Sunday, September 11, 2011

Smart Scan


How Standard Queries Work
An Oracle database instance is the process that handles SQL statements, requests and receives data from the storage system, and delivers results back to the requesting process. For Oracle instances running on non-Exadata servers, the process of executing a simple query that selects from a single table proceeds like this:


A query is submitted to the instance.

The query is parsed, and an execution path is determined by the Oracle optimizer.
The execution path is used to identify the extents and request data blocks from the storage system. A block is the smallest amount of data that can be transferred from the storage system to the instance.
The blocks are used to retrieve the rows and columns for that table from the storage system to the database instance.

The database instance processes the blocks, eliminating the rows that do not meet any selection criteria and assembling the requested columns to return to the user in a result set.

One particular point of potential inefficiency should leap out at you from this description—the requirement that all the potential blocks be read by the database instance. The database instance must implement selection criteria, which means many blocks are returned that are not needed to satisfy the
query.

In addition, complete blocks are sent back to the instance. A block, of course, could contain rows of data that are not relevant for satisfying a particular query. And most queries only request a subset of columns from an entire row, making for even more inefficiency.

Blocks are served up from storage systems over internal pathways, which typically operate rapidly, but as database workloads scale, both in numbers of users and in the amount of data requested, even these pathways can become saturated, creating performance bottlenecks.

The Oracle database has a number of features that can reduce the number of blocks returned from the storage system, from the use of indexes for random reads, to partition pruning to eliminate large groups of blocks from consideration, to materialized views to eliminate the need to return large numbers of rows to perform aggregations. One of the main points of query optimization is to find the optimal execution path, which means, among other considerations, reducing the amount of I/O necessary to produce the desired query results. But the Oracle instance, outside the world of Exadata technology, is still bound to retrieve all potential data blocks, ensuring inefficiency in its use of I/O resources, because this method was the only one available for interacting with the storage system

How Smart Scan Queries Work
Smart Scan, like most Oracle technology enhancements, works transparently. A query that can benefit from Smart Scan in the Exadata environment will use the feature without any changes or tuning. But the way that the data is requested and returned to the database server is significantly different from the way these tasks are accomplished with a standard query, resulting in potentially large performance gains for eligible queries. Smart Scan techniques can work on individual tables as well as tables that will be used in a join.

Individual Tables

The initial step in executing a query that will use Smart Scan is the same as with a standard query—the query is parsed and the Oracle optimizer determines an optimal execution plan. This similarity means that queries that use Smart Scan can still benefit from retrieving execution plans from the shared pool of the SGA.

The Exadata Storage Server Software retrieves all the data from the disks (and may also use the Exadata Smart Flash Cache) and uses the predicate information to reject all of the data that does meet the conditions imposed by the predicate.

A query will not use Smart Scan if the columns being requested by the query include a database large object (LOB), or if a table is a clustered table or an index-organized table. If a query would normally qualify for Smart Scan but doesn’t because the query contains a LOB, it’s easy to work around this restriction by breaking the query up into two queries—one without a LOB that can take advantage of Smart Scan, and another to retrieve the LOB based on the results of the first query.
If a query does not require a full scan, the Exadata software works as a normal data block server, sending blocks back to the requesting database instance, just as a standard Oracle database would return data. But if the query can use Smart Scan, the Exadata software goes to work.
The Smart Scan process reads data blocks and keeps the relevant rows to return to the database instance. The relevant rows are identified by filtering based on selection criteria specified in the query. Smart Scan filters on predicates that use most comparison operators, including >, <, =, !=, <=,
=>, IS [NOT] NULL, LIKE, [NOT} BETWEEN, [NOT]IN, EXISTS, IS OF type, NOT, and AND, as well as most SQL functions. The 11.2.0.2 release of the Exadata Storage software allows for the use of OR predicates and IN lists.
Continue.....
Like a “standard” query, the Oracle database instance requests data from storage. Since the Exadata Database Machine always uses Automatic Storage Management the request is for allocation units gathered from the ASM extent map. If the access method for the table uses a full scan for the table or the index, the database node also includes meta-data, which describes predicate information used for the query.

No comments:

Post a Comment