<html><!-- == $Id: //open/mondrian-release/3.0/doc/cache_control.html#2 $ == This software is subject to the terms of the Common Public License == Agreement, available at the following URL: == http://www.opensource.org/licenses/cpl.html. == Copyright (C) 2006-2008 Julian Hyde == All Rights Reserved. == You must accept the terms of that agreement to use this software. --><head> <link rel="stylesheet" type="text/css" href="stylesheet.css"/> <title>Pentaho Analysis Services: Cache Control</title></head><body><!-- doc2web start --><!-- page title --><div class="contentheading">Cache Control</div><!-- end page title --><!-- ########################## Contents ############################# --><h3>Contents</h3> <ol> <li><a href="#Introduction">Introduction</a></li> <li><a href="#How_mondrians_cache_works">How mondrian's cache works</a></li> <li><a href="#CacheControl_API">CacheControl API</a><ol> <li><a href="#A_simple_example">A simple example</a></li> <li><a href="#More_about_cell_regions">More about cell regions</a></li> <li><a href="#Merging_and_truncating_segments">Merging and truncating segments</a></li> </ol> </li> <li><a href="#Other_cache_control_topics">Other cache control topics</a><ol> <li><a href="#Flushing_the_dimension_cache">Flushing the dimension cache</a></li> <li><a href="#Cache_consistency">Cache consistency</a></li> <li><a href="#Metadata_cache_control">Metadata cache control</a></li> </ol></li> </ol><!-- ########################### 1. Introduction ############################## --><h3>1. Introduction<a name="Introduction"> </a></h3><p>One of the strengths of mondrian's design is that you don't need to do any processing to populate special data structures before you start running OLAP queries. More than a few people have observed that this makes mondrian an excellent choice for 'real-time OLAP' -- running multi-dimensional queries on a database which is constantly changing.</p><p>The problem is that mondrian's cache gets in the way. 
Usually the cache is a great help, because it ensures that mondrian only goes to the DBMS once for a given piece of data, but the cache becomes out of date if the underlying database is changing.</p>
<p>This is solved with a set of APIs for cache control. Before I explain the API, let's understand how mondrian caches data.</p>
<h3>2. How mondrian's cache works<a name="How_mondrians_cache_works"> </a></h3>
<p>Mondrian's cache ensures that once a multidimensional cell -- say the Unit Sales of Beer in Texas in Q1, 1997 -- has been retrieved from the DBMS using an SQL query, it is retained in memory for subsequent MDX calculations. That cell may be used later during the execution of the same MDX query, and by future queries in the same session and in other sessions. The cache is a major factor in keeping Mondrian responsive for speed-of-thought analysis.</p>
<p>The cache operates at a lower level of abstraction than access control. If the current role is permitted to see only sales of Dairy products, and the query asks for all sales in 1997, then the request sent to Mondrian's cache will be for Dairy sales in 1997. This ensures that the cache can safely be shared among users who have different permissions.</p>
<p>If the contents of the DBMS change while Mondrian is running, Mondrian's implementation must overcome some challenges. The end-user expects a speed-of-thought query response time yielding a more or less up-to-date view of the database. A fast response time necessitates a cache, but this cache will tend to become out of date as the database is modified.</p>
<p>Mondrian cannot deduce when the database is being modified, so we introduce an API that lets the container tell Mondrian which parts of the cache are out of date.
Mondrian's implementation must ensure that the changing database state does not yield inconsistent query results.</p>
<p>Until now, control of the cache has been very crude: applications would typically call</p>
<blockquote><code>mondrian.rolap.RolapSchema.clearCache();</code></blockquote>
<p>to flush the cache which maps connect string URLs to in-memory datasets. The effect of this call is that a future connection will have to re-load metadata by parsing the schema XML file, and then load the data afresh.</p>
<p>There are a few problems with this approach. Flushing all data and metadata is appropriate if the contents of a schema XML file have changed, but if only the data has changed, we have thrown out the proverbial baby with the bath-water and would like to use a cheaper operation.</p>
<p>The final problem with the <code>clearCache()</code> method is that it affects only new connections. Existing connections will continue to use the same metadata and stale data, and will compete for scarce memory with new connections.</p>
<h3>3. CacheControl API<a name="CacheControl_API"> </a></h3>
<p>The <a href="api/mondrian/olap/CacheControl.html">CacheControl</a> API solves all of the problems described above. It provides fine-grained control over data in the cache, and the changes take place as soon as possible while retaining a consistent view of the data.</p>
<p>When a connection uses the API to notify Mondrian that the database has changed, subsequent queries will see the new state of the database. Queries in other connections which are in progress when the notification is received will see the database state either before or after the notification, but in any case, will see a consistent view of the world.</p>
<p>The cache control API uses the new concept of a cache region, an area of multidimensional space defined by one or more members. To flush the cache, you first define a cache region, then tell Mondrian to flush all cell values which relate to that region.
To ensure consistency, Mondrian automatically flushes all rollups of those cells.</p>
<h4>3.1. A simple example<a name="A_simple_example"> </a></h4>
<p>Suppose that a connection has executed a query:</p>
<blockquote><code>import mondrian.olap.*;<br/>
<br/>
// An existing, open connection (obtained elsewhere)<br/>
Connection connection;<br/>
Query query = connection.parseQuery(<br/>
 "SELECT" +<br/>
 " {[Time].[1997]," +<br/>
 " [Time].[1997].Children} ON COLUMNS," +<br/>
 " {[Customer].[USA]," +<br/>
 " [Customer].[USA].[OR]," +<br/>
 " [Customer].[USA].[WA]} ON ROWS" +<br/>
 " FROM [Sales]");<br/>
Result result = connection.execute(query);</code></blockquote>
<p>and that this has populated the cache with the following segments:</p>
<blockquote><table border="1" style="border-collapse: collapse"><tr><th>Segment YN#1</th><td><pre>Year Nation Unit Sales
1997 USA    xxx
Predicates: Year=1997, Nation=USA</pre></td></tr><tr><th>Segment YNS#1</th><td><pre>Year Nation State Unit Sales
1997 USA    OR    xxx
1997 USA    WA    xxx
Predicates: Year=1997, Nation=USA, State={OR, WA}</pre></td></tr><tr><th>Segment YQN#1</th><td><pre>Year Quarter Nation Unit Sales
1997 Q1      USA    xxx
1997 Q2      USA    xxx
Predicates: Year=1997, Quarter=any, Nation=USA</pre></td></tr><tr><th>Segment YQNS#1</th><td><pre>Year Quarter Nation State Unit Sales
1997 Q1      USA    OR    xxx
1997 Q1      USA    WA    xxx
1997 Q2      USA    OR    xxx
1997 Q2      USA    WA    xxx
Predicates: Year=1997, Quarter=any, Nation=USA, State={OR, WA}</pre></td></tr></table></blockquote>
<p>Now suppose that the application knows that a batch of rows from Oregon, Q2 has been updated in the fact table.
The application notifies Mondrian of the fact by defining a cache region:</p>
<blockquote><code>// Lookup members<br/>
Cube salesCube =<br/>
 connection.getSchema().lookupCube(<br/>
 "Sales", true);<br/>
SchemaReader schemaReader =<br/>
 salesCube.getSchemaReader(null);<br/>
Member memberTimeQ2 =<br/>
 schemaReader.getMemberByUniqueName(<br/>
 Id.Segment.toList("Time", "1997", "Q2"),<br/>
 true);<br/>
Member memberCustomerOR =<br/>
 schemaReader.getMemberByUniqueName(<br/>
 Id.Segment.toList("Customer", "USA", "OR"),<br/>
 true);<br/>
<br/>
// Create an object for managing the cache<br/>
CacheControl cacheControl =<br/>
 connection.getCacheControl(null);<br/>
<br/>
// Create a cache region defined by<br/>
// [Time].[1997].[Q2] cross join<br/>
// [Customer].[USA].[OR].<br/>
CacheControl.CellRegion measuresRegion =<br/>
 cacheControl.createMeasuresRegion(<br/>
 salesCube);<br/>
CacheControl.CellRegion regionTimeQ2 =<br/>
 cacheControl.createMemberRegion(<br/>
 memberTimeQ2, true);<br/>
CacheControl.CellRegion regionCustomerOR =<br/>
 cacheControl.createMemberRegion(<br/>
 memberCustomerOR, true);<br/>
CacheControl.CellRegion regionOregonQ2 =<br/>
 cacheControl.createCrossjoinRegion(<br/>
 measuresRegion,<br/>
 regionCustomerOR,<br/>
 regionTimeQ2);</code></blockquote>
<p>and flushing that region:</p>
<blockquote><code>cacheControl.flush(regionOregonQ2);</code></blockquote>
<p>Now let's look at what segments are left in memory after the flush.</p>
<blockquote><table border="1" style="border-collapse: collapse"><tr><th>Segment YNS#1</th><td><pre>Year Nation State Unit Sales
1997 USA    OR    xxx
1997 USA    WA    xxx
Predicates: Year=1997, Nation=USA, State={WA}</pre></td></tr><tr><th>Segment YQN#1</th><td><pre>Year Quarter Nation Unit Sales
1997 Q1      USA    xxx
1997 Q2      USA    xxx