Oracle Exadata performance revealed – SmartScan – Part I
This is the start of yet another series of blog posts. This time it will be about Oracle Exadata and its SmartScan.
I just arrived back home from an Exadata Database Machine Administration Workshop with Joel Goodman as trainer. He did a great job. Have a look at his blog on oratrain.wordpress.com. Fortunately after this training I can still stick by what I wrote in a paper about SmartScan for the SOUG (swiss oracle user group) in November last year. Therefore I will put parts of this paper, which are originally written in German, in English on my blog. As it will not be a simple translation of the original text, it might also be interesting to people who read the original article.
First lets start with a very brief introduction to Oracle Exadata – readers already knowing Exadata can skip this section. Oracle Exadata Database Machine is a database appliance. It contains some servers to run an Oracle Database, usually in a Real Application Cluster and some others used as Storage Servers. The storage server software is developed by Oracle and optimized for the Oracle Database. All systems are very powerful and connected to each other over InfiniBand. The current minimum configuration is a cabinet containing two database servers with 2x 6-core Intel Xeon and 96GB each. Three storage servers, usually called Cells in the context of Exadata, containing 2x 6-core Intel Xeon and 24GB of RAM each. The storage server contain 12 hard disk drives and 384GB Sun Flash Accelerator storage each. Everything is plumbed together with 40Gbit/s InfiniBand. The smallest Database Machine configuration provides a raw read performance from disk of 5.6 GByte/s and 16 GByte/s from the flash storage. For detailed hardware specifications have a look at http://www.oracle.com/us/products/database/exadata/index.html.
Those are quite impressive numbers, but how much does this improve the speed of my application? Oracle claims a performance gain of 10 to 30 times faster. The biggest improvement achieved by a customer was 72 times faster. But how is this possible? Such massive improvements can not be achieved only by putting some very fast hardware together, otherwise customers would have already done this. But how does Oracle achieve its performance gains?
To be faster there are only two things you can do. Either work faster or do less work. We have already seen what Oracle did to work faster, but how can they reduce work and achieve the same result nevertheless?
In todays three tier architectures it is standard, to push a WHERE clause from the application layer into the database layer. Therefore only a subset of the data has to be sent back to the application layer. But what if the storage would already do this filtering? This would drastically reduce the payload on the database server and the network between the storage and database server. Therefore Oracle wrote an intelligent storage server, which is able to apply such filters before sending the data to the database.
Oracle calls this storage level optimizations SmartScan. With SmartScan Oracle can filter the following in the storage layer:
- an equals predicate filter in the WHERE clause
- a WHERE clause which filters using a function
- only returning the needed subset of columns
- pre-joining tables on storage
In subsequent posts I will show how this optimizations work. In the next post I will show what Oracle had to implement or reuse to make such optimizations possible.