When I was planning a PKI solution for one of my customers, the first thing that came to my mind was how to define the SLA for the PKI solution. That is, how long can a company tolerate a failed CA server, for example?
This led me to another question: what are the consequences of having a CA down? To answer it, I ask myself: what cannot be done while a CA is down?
The answer is simple. If a CA is down, three things are immediately affected:
- Your ability to issue certificates
- Your ability to recover keys
- Your ability to sign CRLs
Well, in small environments you do not care that much about points 1 and 2, so your focus should go to point 3. Let me start by stating the urgency of signing CRLs.
If you do not sign and publish a new CRL before the current CRL expires, any service that performs CRL checking will eventually stop working or stop accepting your internally issued certificates. This is a huge thing!!
So, point 3 is the most important factor to look at when you have a failed CA. You can do one of two things:
- If you possess the private key of the CA, you can manually sign and publish a CRL.
- You have to recover the CA before the current CRL expires.
As the first option is not an easy or routine thing to do, my focus is the second option, which is fixing the CA before the current CRL expires.
So, if I have a 3-day CRL validity period, I can have as much as 3 days until the current CRL expires (when the CA publishes a CRL and fails immediately), or as little as one second (when the CA fails 3 days after publishing a CRL, right before publishing the next one).
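To make the two extremes concrete, here is a minimal sketch of the arithmetic. The function name recovery_window and the sample dates are made up for illustration, and it assumes a new CRL is due exactly when the current one expires (no overlap).

```python
from datetime import datetime, timedelta

def recovery_window(published_at, crl_validity, failed_at):
    """Time left between a CA failure and the expiry of the last CRL it published."""
    expires_at = published_at + crl_validity
    # Never report a negative window: once the CRL has expired, no time is left.
    return max(expires_at - failed_at, timedelta(0))

published = datetime(2024, 1, 1, 9, 0)   # hypothetical publish time
validity = timedelta(days=3)             # 3-day CRL validity, no overlap

# Best case: the CA fails right after publishing.
best = recovery_window(published, validity, published)
# Worst case: the CA fails one second before the next CRL is due.
worst = recovery_window(published, validity, published + validity - timedelta(seconds=1))

print(best)   # 3 days, 0:00:00
print(worst)  # 0:00:01
```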
Well, this leaves far too much variance and inaccuracy for defining an SLA to recover a CA, right? The option that gives you more flexibility is CRL overlap.
“Let us focus on how Base CRLs works with overlap. As mentioned there is some additional configuration that can be performed that can optimize your CRL publishing intervals so that you have adequate time to perform Emergency CRL Signing or to recover your CA. What will need to be configured is the CRL Overlap Period. In order to configure the CRL Overlap Period both the CRLOverlapUnits and CRLOverlapPeriod registry settings need to be configured.
So, in my previous example my CRL had a validity period of 3 days. What I can do now is add a CRL Overlap Period of 3 days. With this configuration, my CRL will be valid for a period of 6 Days. However, at 3 days a new CRL will be published as well. This is illustrated in the graphic below:
In the example illustrated in the graphic above, CRL 1 will be valid for a period of 6 days. CRL 2 will be published at Day 3. So, if my CA fails between Day 1 and Day 3, I would still have 3 Days (Day 3 through Day 6) to perform Emergency CRL signing or to recover my CA in the event of failure. If my CA fails between Day 3 and Day 6, there is a new CRL (CRL 2) that is valid through Day 9. So, in short if my CA fails between Day 3 and Day 6, I still have at least 3 days to perform Emergency CRL Signing or to recover my CA, before revocation checking starts to fail. And the reason that I have the 3 days is the CRL Overlap Period extended out my CRL for 3 days and staggered the Next Publish and Next Update times by 3 days”
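The quoted example can be checked with a small sketch. It assumes CRLs are published every period starting at Day 0 and that each one is valid for period plus overlap; the function name window_with_overlap is hypothetical and this is only a model of the timeline, not the CA's actual scheduler.

```python
from datetime import timedelta

def window_with_overlap(failure_offset, period, overlap):
    """Time left after a CA failure, measured from the first CRL publication.

    Assumes a CRL is published every `period` and is valid for `period + overlap`.
    """
    # The last CRL published before the failure (timedelta // timedelta gives an int).
    last_publish = (failure_offset // period) * period
    expires = last_publish + period + overlap
    return expires - failure_offset

period = timedelta(days=3)
overlap = timedelta(days=3)

# Failure at Day 1: CRL 1 is valid until Day 6.
print(window_with_overlap(timedelta(days=1), period, overlap))  # 5 days, 0:00:00
# Failure at Day 4: CRL 2 (published at Day 3) is valid until Day 9.
print(window_with_overlap(timedelta(days=4), period, overlap))  # 5 days, 0:00:00
# Failure one second before Day 3: still a little over 3 days left.
print(window_with_overlap(period - timedelta(seconds=1), period, overlap))  # 3 days, 0:00:01
```

In this model the recovery window never drops below the overlap period, which is what makes the overlap a usable lower bound for the SLA.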
So far we have covered the concept of CRL overlap. This is an important thing to consider when planning a CA SLA. CRL overlap also helps in the following cases:
- Active Directory replication delays;
- CRL distribution from CA server to revocation server delays;
- Temporary network connectivity issues;
- Unexpected server failure.
CRL Overlap Configuration:
Under the Certification Services configuration hive in the registry, two values control the overlap period for the base CRL and two more define the overlap period for the delta CRL:
- CRLOverlapPeriod=REG_SZ:Hours|Minutes (Units)
- CRLOverlapUnits=REG_DWORD:0x0 (Value)
- CRLDeltaOverlapPeriod=REG_SZ:Hours|Minutes (Units)
- CRLDeltaOverlapUnits=REG_DWORD:0x0 (Value)
You can verify the settings of the above registry values on your CA computer with the following commands:
- certutil -getreg CA\CRLOv*
- certutil -getreg CA\CRLDeltaOv*
Applies to both Windows Server 2008 R2 and Windows Server 2012: Microsoft states that the default overlap is 10 percent of the CRL lifetime and, if not configured manually, it is capped at a maximum of 12 hours. If configured manually, the overlap period cannot exceed the publishing period. [http://technet.microsoft.com/en-us/library/cc731104.aspx]
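Taking that statement at face value, the default can be sketched as below. The function name effective_overlap is hypothetical, and raising an error for an oversized manual value is a choice made for this sketch, not a description of how the CA reacts to such a setting.

```python
from datetime import timedelta

def effective_overlap(crl_period, configured_overlap=None):
    """Overlap per the rule stated above: 10% of the CRL period, capped at 12 hours by default."""
    if configured_overlap is None:
        # Default: 10 percent of the publishing period, at most 12 hours.
        return min(crl_period / 10, timedelta(hours=12))
    if configured_overlap > crl_period:
        # Sketch-only behavior: reject values the text says are not allowed.
        raise ValueError("overlap period cannot exceed the publishing period")
    return configured_overlap

print(effective_overlap(timedelta(days=3)))                       # 7:12:00
print(effective_overlap(timedelta(days=7)))                       # 12:00:00
print(effective_overlap(timedelta(days=3), timedelta(days=3)))    # 3 days, 0:00:00
```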
I could not find a blog post that describes how CRL overlap works better than this one: http://social.technet.microsoft.com/wiki/contents/articles/20652.how-thisupdate-nextupdate-and-nextcrlpublish-are-calculated.aspx
CRL Certificates Extensions
There are three terms used to describe the base and delta CRLs:
Effective Date (aka ThisUpdate):
[The term Effective date is used in the Windows certificate dialog while certutil.exe and the RFC name this field thisupdate.]
Mandatory field. The date that a CRL became effective. The effective time, by default, is set to 10 minutes prior to the current date and time to allow for clock synchronization issues.
ThisUpdate = MaximumOf(CurrentTime – ClockSkewMinutes, CANotBefore)
In other words, the ThisUpdate field value is usually CurrentTime minus ClockSkewMinutes (10 minutes by default). However, there is an exception when the CA certificate has just been renewed. In that case, CurrentTime minus ClockSkewMinutes may fall before the start of the CA certificate's validity, so the ThisUpdate field value equals the NotBefore value of the CA certificate.
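As a small sketch of the ThisUpdate formula, with made-up timestamps and a hypothetical function name this_update:

```python
from datetime import datetime, timedelta

def this_update(current_time, ca_not_before, clock_skew=timedelta(minutes=10)):
    """ThisUpdate = MaximumOf(CurrentTime - ClockSkewMinutes, CANotBefore)."""
    return max(current_time - clock_skew, ca_not_before)

now = datetime(2024, 1, 1, 9, 0)

# Usual case: the CA certificate has been valid for a long time.
print(this_update(now, datetime(2020, 1, 1)))          # 2024-01-01 08:50:00

# Renewal case: the CA certificate became valid only 5 minutes ago,
# so the CRL cannot be back-dated past NotBefore.
print(this_update(now, datetime(2024, 1, 1, 8, 55)))   # 2024-01-01 08:55:00
```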
Next CRL Publish
This is a non-critical (optional) extension, which means that an application is not required to consume it. It indicates the date and time when a Windows CA will publish a new CRL. When a Windows computer uses a CRL for certificate verification, it also examines the Next CRL Publish extension. If the Next CRL Publish date is already in the past, it connects to the CRL distribution points (referenced in the certificate) and attempts to download a newer CRL.
The time after Next CRL Publish and before Next Update is a buffer that allows Windows computers to retrieve a new CRL before the current one has actually expired, and a buffer for you to recover a failed CA.
NextCRLPublish (Base CRL) = MinimumOf(CurrentTime + CRLPeriod, CANotAfter)
NextCRLPublish (Delta CRL) = MinimumOf(CurrentTime + CRLDeltaPeriod, CANotAfter)
Note: There is a feature called CRL pre-fetching that allows certificate consumers to look at the Next CRL Publish extension and fetch newer CRLs if they are available during the time between Next CRL Publish and Next Update. How CRL pre-fetching works is beyond the scope of this blog post, but it is worth knowing that if the CRL is locally cached, then under certain conditions the download of a new CRL might be skipped even if the Next CRL Publish date is already in the past (please see http://technet.microsoft.com/en-us/library/ee619723(v=ws.10).aspx).
If CRLDeltaPeriod is equal to zero, no delta CRL is published. A CRL cannot be valid after the CA certificate expires.
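The NextCRLPublish formulas and the zero-period rule for delta CRLs can be sketched as follows. The function names and dates are hypothetical, and returning None for "not published" is just a convention of this sketch.

```python
from datetime import datetime, timedelta

def next_crl_publish(current_time, crl_period, ca_not_after):
    """NextCRLPublish = MinimumOf(CurrentTime + CRLPeriod, CANotAfter)."""
    return min(current_time + crl_period, ca_not_after)

def next_delta_crl_publish(current_time, delta_period, ca_not_after):
    """Same formula with CRLDeltaPeriod; a zero period means no delta CRL at all."""
    if delta_period == timedelta(0):
        return None  # sketch convention: delta CRLs are not published
    return next_crl_publish(current_time, delta_period, ca_not_after)

now = datetime(2024, 1, 1, 9, 0)
ca_not_after = datetime(2024, 1, 3, 9, 0)   # CA certificate expires in 2 days

# Base CRL: the 3-day period is cut short by the CA certificate expiry.
print(next_crl_publish(now, timedelta(days=3), ca_not_after))        # 2024-01-03 09:00:00
# Delta CRL with a 1-day period fits inside the CA certificate lifetime.
print(next_delta_crl_publish(now, timedelta(days=1), ca_not_after))  # 2024-01-02 09:00:00
# Delta period of zero: no delta CRL.
print(next_delta_crl_publish(now, timedelta(0), ca_not_after))       # None
```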
Next Update
Mandatory field. The date and time that a Windows client considers as the expiration date of the CRL. From an operational viewpoint, this is the most critical information. If this date passes, Windows computers will treat certificates that are checked against this CRL as invalid. You have to recover a failed CA before the date specified in this field.
NextUpdate (Base CRL) = MinimumOf(NextCRLPublish + InterimBaseCRLOverlap, CANotAfter)
NextUpdate (Delta CRL) = MinimumOf(NextCRLPublish + InterimDeltaCRLOverlap, CANotAfter)
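A last sketch ties the NextUpdate formula to the recovery buffer discussed earlier. The interim overlap is not derived in this post, so it is simply passed in as an input here; the function name next_update and the dates are hypothetical.

```python
from datetime import datetime, timedelta

def next_update(next_crl_publish, interim_overlap, ca_not_after):
    """NextUpdate = MinimumOf(NextCRLPublish + InterimCRLOverlap, CANotAfter)."""
    return min(next_crl_publish + interim_overlap, ca_not_after)

next_publish = datetime(2024, 1, 4, 9, 0)   # assumed NextCRLPublish of the current CRL
overlap = timedelta(days=3)                 # assumed interim base CRL overlap

# CA certificate valid for years: the full overlap is available as a buffer.
expiry = next_update(next_publish, overlap, datetime(2030, 1, 1))
print(expiry)                  # 2024-01-07 09:00:00
print(expiry - next_publish)   # 3 days, 0:00:00

# CA certificate expiring soon: the CRL cannot outlive it.
print(next_update(next_publish, overlap, datetime(2024, 1, 5, 9, 0)))  # 2024-01-05 09:00:00
```

The difference between NextUpdate and NextCRLPublish is the buffer you can count on when setting the recovery SLA.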