We use two kinds of caches
Both cache types are provided by implementation EHCache, version 2.10.2, see JavaDoc and reference documentation. For EHCache, We maintain 2 configuration files: ehcache.xml and ehcache-stats.xml. Both should be kept in sync, the only difference is that ‘-stats’ one have statistics enabled, so that its possible to turn on statistics at deploy time via Candlepin configuration.
Current version of Hibernate that we use is 5.1.1 (see documentation). Hibernate docs contain specific chapters on 2LC Chapter 13, Caching. For more in-depth introduction, I recommend the following links:
To enable JMX statistics, use the following settings in candlepin.conf
cache.jmx.statistics=true
jpa.config.net.sf.ehcache.configurationResourceName=ehcache-stats.xml
You also need to enable JMX in your /etc/tomcat/tomcat.conf:
JMX_CONF="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3322 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false "
JAVA_OPTS=" $JMX_CONF ..."
Then you can connect to JMX server, using JConsole: jconsole localhost:3322
Classes that support caching are located in package org.candlepin.cache
If you are not going to use any existing cache, you need to configure a new one in CacheContextListener
To use already configured cache, inject CandlepinCache
. Example usage:
Cache<String, Status> statusCache = candlepinCache.getStatusCache();
Status cached = statusCache.get(CandlepinCache.STATUS_KEY);
...
statusCache.put(CandlepinCache.STATUS_KEY, status);
Note that Cache
interface is from JSR-107. We are trying to use interfaces from JSR-107 as much as possible, so that we can potentially swap caching implementations in the future.
To cache entities using 2LC, it is necessary to use appropriate annotations:
@Cacheable(true)
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Content extends AbstractHibernateObject implements SharedEntity, Cloneable {
You can also cache collections on entities:
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
private Set<String> modifiedProductIds;
Another feature of 2LC is query caching. To use it, you need to indicate that a query should be cached using setCacheable
method:
@Transactional
public Content lookupByUuid(String uuid) {
return (Content) currentSession().createCriteria(Content.class).setCacheable(true)
.add(Restrictions.eq("uuid", uuid)).uniqueResult();
}
To find out if the cache is effective you should either use profiler to see number of SQL queries generated, or you can use JConsole and see cache statistics (cache hits).
There is one warning associated with using 2LC. We are currently not deploying EHCache in clustered mode. That means that in a cluster, cached entities might become stale (invalidation across nodes doesn’t work). We plan to get around this issue by making sure any entity that we cache is immutable across cluster nodes. For example Product
entity is being mutated but only on 1 Candlepin node during import manifest. That is fine, because cache will get invalidated locally.
To make use of Hibernate Second Level Cache, it is important to understand how it works and when Hibernate uses it to retrieve cached entities. This page will try to explain this by using examples in Candlepin sources.
In Candlepin, we mainly cache Product
/Content
entities and all the collections (e.g. attributes, productContent) that are on them. The reason this is a good choice is that these entities are almost completely immutable which means we don’t have to worry about invalidation.
Very simplified view of when Hibernate uses 2LC can be summarized as follows:
Id
columnCache
annotation on the collection), the entities in that collection will be loaded from 2LCOn the flip side, you won’t get cache hits when you have a complex query in which you JOIN cached entity (Product/Content). This happens quite often in Candlepin. The solution is to rewrite the specific piece of code to use more LAZY loading of Product
or Content
.
To realize M to N relationship between Owner
and Product
we have a special third entity OwnerProduct
. It has the following relationship defined
@ManyToOne(fetch=FetchType.LAZY, optional=false)
@JoinColumn(updatable = false, insertable = false)
private Product product;
Imagine one wants to find a Product
for an Owner
. He might use the following query A:
public Product getProductByIdJpqlNoCacheHit(String ownerId, String productId) {
TypedQuery<Product> query =
getEntityManager().createQuery("SELECT op.product FROM OwnerProduct op WHERE op.product.id=:productId and op.owner.id = :ownerId",Product.class)
.setParameter("ownerId", ownerId)
.setParameter("productId", productId);
return query.getSingleResult();
}
This query will populate Product
2LC. However, upon repeated calls, it will NOT hit Product
from the cache. Instead, Hibernate will issue a SQL query:
select
product1_.uuid as uuid1_13_,
product1_.created as created2_13_,
product1_.updated as updated3_13_,
product1_.entity_version as entity_v4_13_,
product1_.product_id as product_5_13_,
product1_.locked as locked6_13_,
product1_.multiplier as multipli7_13_,
product1_.name as name8_13_
from
cp2_owner_products ownerprodu0_
inner join
cp2_products product1_
on ownerprodu0_.product_uuid=product1_.uuid
where
product1_.product_id=?
and ownerprodu0_.owner_id=?
Even though Product
is not taken from cache here, we are still getting benefit from 2LC, because Product.attributes
and Product.productContent
are cached and retrieved from cache. One would still like to utilize cache even for retrieval of Product
entity. There are several ways to do that:
OwnerProduct
entity and set OwnerProduct.product
relationship as LAZY - shown belowOwnerProduct
entity and set OwnerProduct.product
as EAGER with Fetch style SELECTOwnerProduct
entity and then manually query for Product
using its UUIDLets discuss the first option. Because we set OwnerProduct.product
to LAZY, Hibernate is not going to join when we execute the query. Then, because OwnerProduct.product
is a relationship that retrieves just one Product, Hibernate will utilize 2LC. The code might look like this:
public Product getProductByIdJpql(String ownerId, String productId) {
TypedQuery<OwnerProduct> query =
getEntityManager().createQuery("SELECT op FROM OwnerProduct op WHERE op.product.id=:productId and op.owner.id = :ownerId", OwnerProduct.class)
.setParameter("ownerId", ownerId)
.setParameter("productId", productId);
return query.getSingleResult().getProduct();
}
If the Product
is not yet in the cache, the run of the getProductByIdJpql
will produce the following two SQL statements:
select
ownerprodu0_.owner_id as owner_id1_5_,
ownerprodu0_.product_uuid as product_2_5_
from
cp2_owner_products ownerprodu0_ cross
join
cp2_products product1_
where
ownerprodu0_.product_uuid=product1_.uuid
and product1_.product_id=?
and ownerprodu0_.owner_id=?
select
product0_.uuid as uuid1_13_0_,
product0_.created as created2_13_0_,
product0_.updated as updated3_13_0_,
product0_.entity_version as entity_v4_13_0_,
product0_.product_id as product_5_13_0_,
product0_.locked as locked6_13_0_,
product0_.multiplier as multipli7_13_0_,
product0_.name as name8_13_0_
from
cp2_products product0_
where
product0_.uuid=?
With repeated invocations, the second statement will not be issued.