Caching has always been a way to achieve better performance by bringing data closer to where it is consumed, thus avoiding what can be a bottleneck at the original data source (usually the database). Due to the architecture of the Internet itself, caching often takes place outside of the Enterprise: in a user’s browser, on proxy servers, or on Content Distribution Networks (CDNs). This type of caching is great because results are served up without ever entering the infrastructure of the enterprise that hosts the application.
Once a request makes it into the infrastructure of the Enterprise, it is up to us as Enterprise developers and architects to handle these requests and the associated data efficiently, in ways that yield good performance and that scale without overloading the resources of the Enterprise. To make this possible, there are various caching techniques that can be employed.
We are starting to see a trend toward applications that are more data and state driven, especially as we begin the journey into cloud-based computing. At this year’s PDC, Microsoft showed many new and enhanced technologies that follow this trend, including Oslo, Azure, Workflow Foundation, and “Dublin”. To enable the massive scale that these technologies will help to provide, caching will become extremely important and will probably be thought of as a new tier in the application architectures of the future. Caching will be crucial to achieving the scale and performance that users will demand.
Common Caching Scenarios
When thinking about the data in the Enterprise we begin to uncover some very different scenarios that could benefit from caching. We also quickly realize that caching is not a “one size fits all” proposition, because a solution that makes sense in one scenario may not make sense in another. To get a better understanding of this, let’s talk more about the three basic scenarios that cached data tends to fall into:
Reference Oriented Data
Reference data is typically written infrequently but frequently read. This type of data is an ideal candidate for caching. Most of the time, when we think about caching data, this is the type of data that first comes to mind. Examples of this could include: A product catalog, a schedule of flights, etc. Getting this type of data closer to where it is consumed can have huge performance benefits. It also doesn’t overload database resources with queries that are generating similar results.
Activity Oriented Data
Activity data is written and read by only one user as a result of a specific activity and is no longer needed when the activity is completed. While this type of data is not what we typically think of as a good candidate for caching, it can yield scalability benefits if we can find an effective caching strategy. An example of this type of data would be a shopping cart. An appropriate, distributed caching strategy allows better overall scale for an application because requests can easily be served by load-balanced servers that do not require sticky sessions. To handle this, the caching strategy must be able to manage many of these exclusive collections of data in a distributed way.
Resource Oriented Data
The trickiest of all is what’s known as Resource data. Resource data is typically read and written very frequently by many users simultaneously. Because of the volatility of this data it is not often thought of as a candidate for caching, but it can yield big benefits in both performance and scale if an efficient strategy can be found. Examples of this type of data include the inventory for an online bookstore, the seats on a flight, and the bid data for an online auction. In all of these examples, although the data is very volatile, in a high-throughput situation it would be very slow if every request had to result in a database access. The challenge in caching this type of data is finding a strategy that can be distributed in order to scale properly, along with the necessary concurrency and replication so that the underlying data stays consistent across machines.
Current Caching Technologies in .NET
There are existing .NET technologies that can be used to provide caching today. Some of these are tied to the web tier (e.g. ASP.NET Cache, ASP.NET Session and ASP.NET Application Cache) while some are more generic in their usage (e.g. Enterprise Library’s Caching Application Block). These are all great caching technologies for smaller applications but they have limitations that prevent them from being used for large Internet scale applications.
When it comes to larger, Internet-scale caching technologies, there are some 3rd-party products in the space that do an excellent job, one of these being NCache by Alachisoft. Microsoft is now also jumping into this space with a new caching technology codenamed “Velocity.” Velocity can be configured to handle all of the caching scenarios described above in a performant and highly scalable way.
What is “Velocity?”
“Velocity” is Microsoft’s new distributed caching technology. Although it is not scheduled to be released until the middle of 2009, it already offers many impressive caching capabilities, with lots of very useful and ambitious features slated for future releases. In Microsoft’s own words from the “Velocity” project website, they define “Velocity” in the following way:
Microsoft project code named “Velocity” provides a highly scalable in-memory application cache for all kinds of data. By using cache, application performance can improve significantly by avoiding unnecessary calls to the data source. Distributed cache enables your application to match increasing demand with increasing throughput using a cache cluster that automatically manages the complexities of load balancing. When you use “Velocity,” you can retrieve data by using keys or other identifiers, named “tags.” “Velocity” supports optimistic and pessimistic concurrency models, high availability, and a variety of cache configurations. “Velocity” includes an ASP.NET session provider object that enables you to store ASP.NET session objects in the distributed cache without having to write to databases, which increases the performance and scalability of ASP.NET applications.
In a nutshell, Velocity allows a cache to be distributed across servers which has huge scalability benefits. It also allows for pessimistic and optimistic concurrency options. This along with other features is what makes “Velocity” a great choice for the caching needs of large scale applications.
Features in Velocity
There are many features, both existing and planned, that will make Velocity a compelling caching technology that far exceeds the existing caching options offered by Microsoft.
Simple Caching Access Patterns Get/Add/Put/Remove
It is very easy, using Velocity’s API, to perform the standard cache operations: adding new items to the cache, getting items from the cache, putting updates to cached items, and removing items from the cache.
Tag Based Search
When saving a cache item to a specific cache “region” (regions are discussed later in this post), you are able to add one or more string tags to that entry. It is then possible to query the cache for entries that contain one or more tags.
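As a sketch of how this might look in code (the `Tag` type and the `GetByTag`-style method are my assumptions about the CTP API surface and may not match the shipped names):

```csharp
// Hypothetical sketch of tag usage against a Velocity named cache.
// Assumes "music" is a Cache instance and a "Beatles" region exists.
Tag rock = new Tag("rock");

// Store an item in the "Beatles" region, attaching two string tags.
music.Put("Beatles", "B000002UB3", new CD("Abbey Road"),
          new Tag[] { rock, new Tag("1969") });

// Query the region for every entry carrying the "rock" tag.
List<KeyValuePair<string, object>> hits = music.GetByTag("Beatles", rock);
```

Note that tag queries are scoped to a region; this is tied to the region/host tradeoff discussed later in this post.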
Distributed Across Machines
A cache can be configured to exist across machines. The configuration options for this are quite extensive. By allowing cache items to be distributed across machines, the cache can scale out as new cache servers are added. The other existing caching technologies offered by Microsoft only allow for scaling up. Scaling up can be very expensive as caching needs increase, and it often hits a limit on the amount of memory that can be devoted to caching. With “Velocity” you can scale out a cache across hundreds of machines if needed, effectively fusing the memory of these machines together to form one giant cache, all with low-cost commodity hardware.
High Availability
Velocity can be configured to transparently store multiple copies of each cache item when it is stored in the cache. This provides high availability by helping to guarantee that a given cache item will still be accessible even in the event that a caching server fails. Of course, the more backup copies of an item you configure, the greater the guarantee that your data will survive a failure. Velocity is also smart enough to make sure that each backup copy of a cache item exists on a separate cache server.
Optimistic and Pessimistic Concurrency
Velocity supports both optimistic and pessimistic concurrency models when updating an item in the cache. With pessimistic concurrency, you request a lock when retrieving an item from the cache that you are going to update. You are then required to unlock the object after it is updated. In the meantime, no one else can obtain a lock for that item until it is either unlocked or the lock expires. With optimistic concurrency, no lock is needed; instead, the original version information is passed along when an item is updated. During the update, Velocity checks whether the version being updated on the caching server is the same version that was read. An error is passed back to the caller if the versions do not match. For performance reasons it is always better to use optimistic concurrency if the situation can tolerate it.
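A sketch of the two models follows (the lock-handle and version types shown here are my assumptions based on the CTP bits and could change before release):

```csharp
// Pessimistic: lock the item for the duration of the update. Other
// callers cannot obtain a lock until we unlock or the lock times out.
LockHandle handle;
CD cd = (CD)music.GetAndLock("B000002UB3", TimeSpan.FromSeconds(30), out handle);
cd.Price = 9.99m;
music.PutAndUnlock("B000002UB3", cd, handle);

// Optimistic: no lock. The version captured at read time is sent back
// with the update; the Put fails if another caller updated the item
// in the meantime.
CacheItemVersion version;
CD latest = (CD)music.Get("B000002UB3", out version);
latest.Price = 8.99m;
music.Put("B000002UB3", latest, version); // errors if the version is stale
```

On a version conflict with optimistic concurrency, the typical response is to re-read the item, reapply the change, and retry the Put.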
Management and Monitoring
Velocity is managed using PowerShell. There are over 130 functions that can be performed through PowerShell. For example: you can create caches, set configuration info, start a cache server, stop a cache server, etc.
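For illustration, an administration session might look something like the following (the cmdlet names here are drawn from memory of the preview tooling and should be treated as assumptions):

```powershell
# Hypothetical admin session against a Velocity cache cluster.
Start-CacheCluster                      # start all cache host services
Get-CacheHost                           # list hosts and their status
New-Cache -CacheName "music"            # create a new named cache
Get-CacheStatistics -CacheName "music"  # inspect cache usage
Stop-CacheHost -HostName "server01" -CachePort 22233
```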
ASP.NET Session Provider
Velocity comes with a Session State provider that can be “plugged into” ASP.NET so that session state is stored inside Velocity instead of by the standard ASP.NET session provider. Using the Velocity provider is completely transparent; the session object is used just as it is with the standard ASP.NET session provider. This automatically scales session state across servers in a way that does not require sticky sessions.
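The wiring happens in web.config; a sketch along these lines (the provider type name and attributes are my recollection of the CTP documentation, so treat them as assumptions):

```xml
<!-- Hypothetical CTP-era registration of the Velocity session provider. -->
<sessionState mode="Custom" customProvider="Velocity">
  <providers>
    <add name="Velocity"
         type="System.Data.Caching.SessionStoreProvider"
         cacheName="session" />
  </providers>
</sessionState>
```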
Local Cache Option
Velocity can be configured so that when an item is retrieved from the distributed cache, it is also cached locally on the server where it was retrieved. This makes it faster to retrieve the same object if it is asked for again. The real savings here are in network latency and the time it would take to deserialize the cache item. Cache items stored inside the Velocity cache are always serialized in memory, but cache items stored in the local cache (when configured) are always stored natively as objects. So, if the memory space can be afforded, this can be quite a performance boost.
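This is driven by the client configuration; roughly as follows (the element and attribute names are assumed from the CTP samples and may differ):

```xml
<!-- Hypothetical client config: enable the local in-process cache with a
     five-minute time-to-live for locally held copies. -->
<localCache isEnabled="true" ttlValue="300" />
```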
Can add Caching Servers at Runtime
Several times during the sessions on Velocity at the PDC, the presenter would add or remove a caching server at runtime. When this was done, the cluster of Velocity caching servers would react immediately and start to redistribute cache items across the cluster. This intelligence was very impressive, and it is the same intelligence that reacts when a caching server has a hardware failure, again redistributing cached items across the cluster. With this ability it is possible to dynamically add more caching power at runtime without losing any cached data to a system restart.
Future Features
Although no indication was given as to when the additional features described below will make it into the Velocity framework, it was very encouraging to hear about the many features in the queue for a future release. Here are some of the ones mentioned at the PDC:
Security
In future releases of Velocity, security will need to be a greater consideration. Currently, information stored inside the Velocity cache is not secured in any way; having access to the cache means that you have access to anything stored in it. Future versions of Velocity will offer security options that allow you to secure items in the cache using several different techniques. The planned security options are:
- Token-based Security - when storing items in the cache you will be able to specify a security token along with that item. That security token will need to be presented in order to retrieve the item from the cache.
- App ID-based Security – This option will allow you to register a domain account with a named cache. This way only specific users will be able to access a specified cache. Note: “Named caches” are discussed later in this post.
- Transport Level Security – This option will allow the standard transport-level security offered by the various WCF bindings to be used.
Cache Event Notifications
In the future, when anything happens that affects the cache, a notification will be sent across the cache cluster and to any other subscribed listeners. These events, when implemented, will allow a view into all of the actions taken on the cache. This will include notifications sent when items are added, updated, and removed from the cache.
Write Behind
In many scenarios, we want the cache itself to front the data access for our system. This can provide very high performance and throughput. To make this as efficient as possible, it would be great to be able to write to the cache and have the cache write the data through to its backend data store. This is what is referred to as “Write Behind.” The actual write to the backend happens asynchronously, so the caller does not have to wait for it and only has to wait for the item to be written to cache memory. This is possible and safe because of the high availability features offered in Velocity: because the cached data can be backed up within the cache, there is little risk that a machine failure will prevent the data from being written to the backend data store.
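To illustrate the pattern itself (this is a generic sketch of write-behind, not Velocity’s implementation, and all the types here are my own):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Generic write-behind sketch: Put returns as soon as the in-memory
// copy is updated; a background thread drains pending writes to the
// backend store asynchronously.
interface IBackingStore { void Save(string key, object value); }

class WriteBehindCache
{
    private readonly Dictionary<string, object> cache = new Dictionary<string, object>();
    private readonly BlockingCollection<KeyValuePair<string, object>> pending =
        new BlockingCollection<KeyValuePair<string, object>>();

    public WriteBehindCache(IBackingStore store)
    {
        var writer = new Thread(() =>
        {
            // Runs for the life of the process, pushing queued writes
            // through to the backend without blocking callers.
            foreach (var write in pending.GetConsumingEnumerable())
                store.Save(write.Key, write.Value);
        }) { IsBackground = true };
        writer.Start();
    }

    public void Put(string key, object value)
    {
        lock (cache) { cache[key] = value; }                        // fast, in-memory update
        pending.Add(new KeyValuePair<string, object>(key, value));  // deferred backend write
    }
}
```

In Velocity’s case the replication of cache items across hosts is what makes deferring the backend write safe, as described above.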
Read Through
A future release of Velocity will also provide a “Read Through” feature. Read Through allows the cache to fetch an item from the backend data source if it does not currently hold the item. This is both a convenience and a performance enhancement because multiple calls are not needed to retrieve an item that is not already in the cache. This again allows the cache itself to be the data access tier, with the cache handling the communication with the backend data source.
Bulk Operations
Future versions of Velocity will provide access methods that allow bulk operations to be performed on the cache. This again can enhance performance in some scenarios, simply by replacing many chatty calls to the cache with larger, chunkier ones.
LINQ Support
Being able to query the cache using LINQ will open up many very interesting scenarios. Having LINQ support will truly transform caching into its own robust tier in the overall architecture of the Enterprise.
High Performance Computing
Other future features discussed with regard to Velocity involve High Performance Computing (HPC). Up to this point, caching has been all about placing data as close as possible to where it is consumed. When we begin to think about HPC scenarios, we are actually viewing the problem from the opposite point of view; that is, we are trying to put the processing as close to the data as possible. In the PDC sessions, they mentioned that the Velocity team is interested in exploring ways to move calculations and processing close to the items inside the cache.
With the announcement of Windows Azure and other cloud-based initiatives, all major technologies at Microsoft are trying to figure out how they fit into this new paradigm of computing. It is not clear exactly how Velocity will take part in the cloud, but I would imagine that at some point Velocity caching will be available to applications hosted in the Windows Azure cloud.
Velocity’s Architecture
Velocity is architected to run on one or more machines, each running a cache host service. Although you can run more than one cache host service on a single machine, it is generally not advised, as you do not get the full protection of Velocity’s high availability feature when a machine failure occurs. Multiple cache host services are meant to run together as part of a cache cluster. This cache cluster is treated as one large caching service. The services that are part of the cluster interact with each other and are configured as a unit to offer all of the redundancy and scale that Velocity has to offer. When the cluster is started, all of the cache host services that make up the cluster read their configuration information either from SQL Server or from a common XML configuration file stored on a network-accessible file share. A few of the cache hosts are assigned the additional responsibility of being “Lead Hosts.” These special hosts track the availability of the other cache hosts and also perform the necessary load balancing tasks for the cache as a whole.
While the cache clusters and their cache hosts make up the physical view of the Velocity cache, “Named Caches” make up the logical view. Velocity supports multiple named caches, which act as separate, isolated caches within a Velocity cache cluster. Each named cache can span all of the machines in the cluster, storing its items across the various machines. When configured to store items redundantly, a named cache achieves high availability and gains tolerance for machine failure, since any item in the cache can live in multiple places on different machines.
Inside a given named cache, Velocity offers another optional level at which to cache objects, called “Regions.” There can be multiple Regions within any given named cache. When saving cache items into a Region, it is possible to add “Tags” to those items so that they can be searched and retrieved by more than a simple cache key (which is how caches normally work). The tradeoff in using Regions is that all of the items in a given Region are stored on the same cache host. This is done because the cache items need to be indexed in order to provide the searching functionality that Tags enable. Even though all of the items in a Region are stored on the same host, they can still be backed up onto other hosts to provide the high availability that Velocity offers. So, while Regions support great tag-based searching functionality, they do not provide the same distributed scalability as cache items that do not use Regions.
So what does the code look like when getting and saving objects in the Velocity cache?
To access a Velocity cache, you must first create/obtain a cache factory. From the cache factory you can get an instance of the named cache you wish to use (in the example, we are getting the “music” named cache).
CacheFactory factory = new CacheFactory();
Cache music = factory.GetCache("music");
Next, I will put an item into the cache. In the example below, I am caching the “Abbey Road” CD using its ASIN as the cache key.
music.Put("B000002UB3", new CD("Abbey Road", ...));
To retrieve an item from the cache, simply call the cache’s “Get” method passing in the key to the cache item you wish to retrieve.
CD cd = (CD)music.Get("B000002UB3");
To create a region, simply call the cache’s “CreateRegion” method, passing in the desired name for the region you wish to create. In the example below, I create a “Beatles” region:
music.CreateRegion("Beatles");
Below, I show an example where two items are being put into the same region. When using regions, you must always specify the region along with the key and object you wish to cache.
music.Put("Beatles", "B000002UAU", new CD("Sgt. Pepper's...", ...));
music.Put("Beatles", "B000002UB6", new CD("Let It Be", ...));
Lastly, below, I show how to retrieve a cache item from a region. Notice that the region name must be specified along with the key.
CD cd = (CD)music.Get("Beatles", "B000002UAU");
How is High Availability Achieved?
As previously mentioned, Velocity has a feature that helps to promote high availability for the items in the cache. To gain this high availability, caches can be configured to store multiple copies of an object when it is put into the cache. When doing this, Velocity ensures that a given item is stored on multiple cache hosts (which is why it is advisable to run only one cache host per physical server).
In order to achieve high availability without adversely impacting performance, when putting an item into the cache, Velocity will write the cache item to its primary location and to only one secondary location before returning to the caller. Then, after returning to the caller, Velocity will asynchronously write the cache item to other backup locations up to the number of backups configured for that cache. Doing this ensures that the object being cached exists in at least two places (which gives the minimum requirement for high availability) but doesn’t hold up the caller while fulfilling all of the configured backup requirements.
Since Regions are required to live entirely inside one cache host, to achieve “High Availability” Velocity backs up the entire Region to another host.
Just before PDC08, the Velocity team released CTP2 (Community Technology Preview 2). During the PDC they stated the release schedule for Velocity to be the following:
CTP3 will be released during the MIX09 conference (scheduled for mid-March 2009).
RTM is scheduled for mid-2009.
If you would like to know more about Velocity, I would suggest you view the following presentations given at the PDC08:
Project “Velocity”: A First Look – presenter: Murali Krishnaprasad
Project “Velocity”: Under the Hood – presenter: Anil Nori
Also, there was a very informative interview on Scott Hanselman’s podcast “Hanselminutes”: Distributed Caching with Microsoft’s Velocity
I would also recommend the Velocity article on MSDN: “Microsoft Project Code Named Velocity CTP2” as well as the Velocity Team Blog on MSDN.