Worth reading: a series of three articles on how to set up a DAG in Exchange 2010.

Windows 2008 (R2) Enterprise is required, but a Standard edition of Exchange is sufficient for up to 5 databases.

--> http://www.msexchange.org/articles_tutorials/exchange-server-2010/high-availability-recovery/uncovering-exchange-2010-database-availability-groups-dags-part1.html

This series explains what an Exchange 2010 Database Availability Group (DAG) is all about and how this high availability feature may fit into your Exchange 2010 organization. It also provides step-by-step instructions on how to deploy a DAG, along with other best practice recommendations.

A Bit of History

Let us begin with a walk down memory lane. Prior to the release of Exchange 2007, the high availability and disaster recovery features included with the Exchange Server product were quite limited. Previous versions of Exchange (Exchange 2003 and earlier) could take advantage of Microsoft Cluster Services (MSCS), but this only provided redundancy at the hardware level since the nodes shared the same storage subsystem. If the active cluster node suddenly became unavailable, the Exchange Virtual Server (EVS) and any relevant cluster resources would fail over to the passive node and the end users could then continue their work.

But you wouldn’t want the storage subsystem to fail, as it was a single point of failure. In order to achieve redundancy at the storage level, organizations were forced to invest in replication solutions provided by third-party software vendors and/or storage hardware vendors. Since solutions provided by third parties are not supported by Microsoft and are typically quite expensive to implement, the Exchange Product group wanted to provide better high availability and disaster recovery features natively in the Exchange Server product.

Most of us probably agree that with the release of Exchange 2007 those visions became a reality! Exchange 2007 gave us a whole slew of brand new high availability and disaster recovery features, such as Local Continuous Replication (LCR), which targeted small organizations, and Cluster Continuous Replication (CCR), which targeted medium and large organizations. Later on (with Exchange 2007 SP1) came Standby Continuous Replication (SCR), targeted at organizations of pretty much all sizes. All three features used a new asynchronous replication technology, which worked by shipping log files to a passive copy of a storage group and, after inspection, replaying them into this passive copy.

Although LCR provided redundancy at the storage level, the feature never really got much attention. The reason was that, since the storage group copies had to be stored on a volume local to the Mailbox server, the server itself remained a single point of failure at the hardware level. CCR, on the other hand, has been a huge success since Exchange 2007 was released. The interesting thing about CCR was that it combined the new asynchronous replication technology introduced by Exchange 2007 with Windows Failover Clustering technology, thereby providing redundancy at the hardware level as well as at the storage level: a true high availability solution without any single point of failure.

CCR cluster nodes could be located in separate datacenters in order to provide site-level redundancy, but since CCR was not developed with site resiliency in mind, there were too many complexities involved with a multi-site CCR cluster solution (for details on multi-site CCR cluster deployment take a look at a previous article series of mine). This made the Exchange Product group think about how they could provide a built-in feature geared towards offering site resilience functionality with Exchange 2007.

When Exchange 2007 SP1 was released, we got exactly that: a feature called Standby Continuous Replication (SCR), which made it possible to ship log files to another Exchange 2007 Mailbox server. Because SCR did not require Windows Failover Clustering, the log files could be shipped from both clustered and non-clustered Mailbox servers (the SCR source) to clustered and non-clustered Mailbox servers (the SCR target). What was really interesting with SCR was that you could specify a log replay lag time of up to 7 days, which made it possible to fix most database/store related issues before they hit the SCR target located in another datacenter.
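
To make the replay lag idea concrete, here is a minimal Python sketch of a target that holds shipped log files back until they are older than a configured lag. It is a toy model for illustration only, not how Exchange implements SCR: the names `ShippedLog`, `logs_due_for_replay` and `MAX_REPLAY_LAG` are hypothetical, and the only figure taken from the text above is the 7-day upper limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Upper bound on the replay lag mentioned in the article (7 days).
MAX_REPLAY_LAG = timedelta(days=7)


@dataclass
class ShippedLog:
    """A log file that has already been copied to the target (hypothetical model)."""
    generation: int        # sequence number of the log file
    shipped_at: datetime   # when the file arrived on the target


def logs_due_for_replay(logs, now, replay_lag):
    """Return the logs that may be replayed into the passive copy at time `now`.

    Logs are replayed strictly in generation order, so replay stops at the
    first log that is still younger than the configured lag.
    """
    if replay_lag < timedelta(0) or replay_lag > MAX_REPLAY_LAG:
        raise ValueError("replay lag must be between 0 and 7 days")

    due = []
    for log in sorted(logs, key=lambda entry: entry.generation):
        if now - log.shipped_at < replay_lag:
            break  # this log and all later ones are still held back
        due.append(log)
    return due
```

With a lag of two days, a log shipped three days ago would be replayed while one shipped yesterday would still be held back. That held-back window is what gives an administrator time to react before a problem on the source is replayed into the copy on the target.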