Choosing a SharePoint Farm Design
Although SharePoint 2013's social features and the FAST Search Server integration allow for greater scalability, they impose constraints when you're designing a farm. (See "Social feature support in SharePoint Server 2013.")
And there are other constraints (e.g., budgets, operational readiness, network bandwidth) that your SharePoint architecture team must consider too. To help identify those constraints, it's helpful to answer the following questions:
- What is the capital budget (i.e., the funding available for purchasing hardware and software)?
- What is the annual IT budget (i.e., the funding available for IT staffing and other expenses)?
- What are the operational capabilities of the IT staff members? For example, do they have virtualization or SharePoint experience? Do they practice disaster recovery on a regular basis?
- What technological and operational impacts does compliance add? For example, some organizations might not permit the use of Microsoft or might require that certain farms be kept separate from each other because of the nature of the data.
- What security and Freedom of Information and Protection of Privacy (FOIP) controls and policies does the organization need to comply with? For example, in some organizations, the data must physically reside within geographic borders or reside in separate farms.
- What is the desired level of performance? For example, an organization might want pages to load in 10 seconds or less and search queries to return the relevant results within the first three pages of results.
- Where are users located in relation to the data centers? What are the bandwidth and latency of the network connections that users rely on?
- What is the planned useful life of the platform? Technology aside, there can be commercial or contractual obligations or requirements that might need to be considered when determining the platform's lifespan.
- How many users are expected to use the SharePoint farm over its intended lifespan? How much data do you expect to accumulate, archive, and delete?
- What features (e.g., social features, team sites, workflows) do you want to deploy?
- What are the recovery time objectives (RTOs)? In other words, what is the maximum tolerable amount of time between a failure (e.g., site failure, server failure, content database failure) and its recovery?
- What are the recovery point objectives (RPOs)? In other words, what is the acceptable level of data loss due to a failure?
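Once RTOs and RPOs are written down, they can be checked against the results of a recovery drill. The following is a minimal Python sketch of that check, not a prescribed method. The class name, function name, and all the numbers are hypothetical, and it assumes that worst-case data loss equals the interval between backups or replication cycles.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets agreed on with the business (hypothetical structure)."""
    rto: timedelta  # maximum tolerable time between a failure and its recovery
    rpo: timedelta  # maximum tolerable window of lost data


def meets_objectives(objectives, measured_recovery_time, backup_interval):
    """Return (rto_ok, rpo_ok) for one recovery drill.

    Assumption: worst-case data loss equals the interval between
    backups or replication cycles, which is a simplification.
    """
    rto_ok = measured_recovery_time <= objectives.rto
    rpo_ok = backup_interval <= objectives.rpo
    return rto_ok, rpo_ok


# Illustrative numbers only: a 4-hour RTO and a 1-hour RPO.
targets = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
result = meets_objectives(
    targets,
    measured_recovery_time=timedelta(hours=3),
    backup_interval=timedelta(hours=2),
)
print(result)  # (True, False): recovery was fast enough, but backups are too infrequent
```

In this example the drill meets the RTO but not the RPO, which points to more frequent backups or replication rather than faster recovery procedures.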
Note that the answers to these questions will likely be revised as the SharePoint architecture team learns more about the SharePoint technology, the organization's technical and business requirements, and those requirements' impact on IT operations.
As I described in "Architecting SharePoint: Building a Solid Foundation for SharePoint Farms," the process is like peeling through the layers of an onion. As you peel through the layers, you learn more, which means you can discuss and document the constraints in a more detailed manner.
Choosing an architectural model requires that you have a solid understanding of your organization's business and technical constraints. An example of a business constraint is the need to house content within countries' borders. An example of a technical constraint is the connectivity between data centers that reside within the countries in question.
The more distributed the farm, the greater the risk of the architecture not working or being an operational nightmare. At some point, common sense needs to take over and governance needs to be leveraged to make the stakeholders aware of the risks.
In any case, you'll likely be able to design a farm that meets most (if not all) of your organization's requirements as well as its constraints. However, meeting the requirements and constraints might mean major changes or additional funding. For example, if an organization has the business constraint of needing to house content within countries' borders, it might mean doing one of the following:
- Providing additional funding to build (or rent space for) other data centers and provide staff for them
- Outsourcing additional data centers
- Using Microsoft SharePoint Online (which is part of Office 365) or an IaaS solution (e.g., Windows Azure)
- Making policy or commercial arrangement changes to eliminate the business constraint
Needless to say, realizing that you require additional large capital expenses, additional staff, new outsourcing contracts, or policy changes late in a project is a major impediment to the perception of success. So, the sooner you find out the constraints, the better.
A single farm consists of a group of servers that are joined together using a tiered model to provide services and content. Specifically, single-farm environments consist of the traditional three-tier SharePoint model farm. According to Microsoft, the three tiers are:
- Web server role. Web servers respond to user requests for web pages. All web servers in a farm are mirrors of each other. You can load balance web servers using Windows Network Load Balancing (NLB) or a hardware device.
- Application server role. Application servers provide SharePoint's services (e.g., search, profile import). You can have one application server that provides all the services or multiple application servers that each provide a subset of services. You can load balance multiple redundant application servers using Windows Network Load Balancing (NLB) or a hardware device.
- Database server role. Database servers store content and service data. You can assign all the databases to one database server or spread your databases across multiple database servers. You can cluster or mirror the databases for failover protection.
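To make the three tiers concrete, a farm can be written down as a small inventory and checked for tiers that lack redundancy. This is a sketch for planning discussions only. The server names, the `Server` class, and the helper functions are hypothetical and aren't part of any SharePoint API.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    role: str  # "web", "application", or "database"
    services: list = field(default_factory=list)


def servers_by_tier(servers):
    """Group server names by the tier (role) they belong to."""
    tiers = {"web": [], "application": [], "database": []}
    for server in servers:
        tiers[server.role].append(server.name)
    return tiers


def tiers_without_redundancy(servers):
    """Return the tiers that have fewer than two servers."""
    return [tier for tier, names in servers_by_tier(servers).items() if len(names) < 2]


# Hypothetical five-server farm.
farm = [
    Server("WEB01", "web"),
    Server("WEB02", "web"),
    Server("APP01", "application", ["search", "profile import"]),
    Server("SQL01", "database"),
    Server("SQL02", "database"),
]
print(tiers_without_redundancy(farm))  # ['application']
```

Here the web tier can be load balanced and the database tier can be clustered or mirrored, but the single application server would take all of its services down with it.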
All the services (e.g., search, profile import) and content (e.g., site collections, services, databases) reside within the farm as a logical unit. With a single medium or large farm, you have foundational services and as many content databases, web applications, and site collections as needed. In addition, with SharePoint 2013 and SharePoint 2010, you can add services to meet requirements. For example, if you require Business Connectivity Services, you can activate it on a server with enough capacity. Similarly, if you're reaching the limits of your indexer, you can add another index server.
Figure 1 (courtesy of Microsoft) depicts a single mid-sized farm, including the servers (hosts) and services they provide. This is a virtualized topology showing virtual machine (VM) servers sitting on top of physical hosts.
You might want to deploy a single farm if one or more of the following conditions apply:
- Your organization can meet all the business requirements with a single farm.
- Your organization can meet all the security and compliance requirements with a single farm.
- Your organization has financial constraints that don't permit deploying multiple farms in the geographic regions (e.g., North America, Europe, Asia Pacific) where it resides.
- Your organization's capacity requirements can be met over the solution's useful lifespan.
- Most or all of your organization's users reside within one geographic region and the network provides enough bandwidth to meet service level agreements (SLAs).
- Your organization's core users reside within one geographic region and the network provides enough bandwidth to meet SLAs.
- The skilled and experienced IT staff members reside in only one geographic region.
- There aren't multiple global data centers (e.g., data centers in North America, Europe, and Asia Pacific).
Multi-farm architectures consist of services farms, My Site farms, and content farms. In these farms, only certain functions are implemented or they're used for a specific service (e.g., basic collaboration). With multiple farms, you can provide specific services to the business based on scalability, function, and policy requirements. For example, you could have a central search farm that's consumed by regional collaboration farms. Or you could have a central My Sites (social) farm consumed by all users so that all the social features work.
Figure 2 (courtesy of Microsoft) depicts a multi-farm architecture that includes an enterprise services farm, a My Site farm, and a content farm. Note that SharePoint 2013's social features work best with one My Site farm. If multiple content farms are necessary, you should keep all the My Sites on one farm. Social features don't work across multiple My Site farms.
You might want to deploy multiple farms if one or more of the following conditions apply:
- Your organization can't meet all the business requirements with a single farm. For example, you might need a second farm to host a public-facing website, extranet site, or custom application. (Custom applications that are deeply embedded into SharePoint are notoriously difficult to upgrade. Isolating them on dedicated farms makes it easier to upgrade the other workloads.) In addition, an organization might have a business unit that needs a separate farm because it's highly autonomous or has business use cases that are incompatible with a shared farm. You might even decide to deploy SharePoint 2013 on a new, separate farm to take advantage of its new social features but keep the collaboration functionality and the intranet on the existing SharePoint 2010 farm for a while.
- Your organization's security and compliance requirements can't be met by a single farm. For example, regional security and FOIP requirements might require data to be located within the geographic region in question.
- Your organization's strategy requires multiple farms in several geographic regions, and it has the funds and resources to deploy and manage them (including setting up new data centers if needed).
- Your organization's capacity requirements can't be met with a single farm over the solution's useful lifespan.
- Users reside in multiple geographic regions, and network limitations (e.g., bandwidth, latency) require each farm to be deployed on the same continent as the users it serves.
- The network supports multi-farm connected environments, and you have the operational capability to patch them.
- Users in one region don't require access to the users and content in other regions.
- The skilled and experienced IT staff members reside in multiple geographic regions.
- Your organization is moving to the cloud. Having two farms lets you host one farm in a cloud-based service (e.g., SharePoint Online) while maintaining other content in an on-premises farm.
- Your organization needs separate farms for shared development, test, quality assurance (QA), and staging operations. This is the most common reason for having multiple farms. At a minimum, I recommend that organizations have a QA farm for testing the functionality and performance of service packs and customizations to minimize the risk to the production environment.
Note that this isn't an exhaustive list. For more information about multi-farm architecture (as well as single farm architecture), see TechNet's Architecture design for SharePoint 2013 IT pros web page.
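The conditions listed for single-farm and multi-farm architectures can be captured as a rough checklist. The sketch below is one possible way to do that in Python, under heavy simplification. The dictionary keys, the wording of each entry, and the `assess_farm_design` function are all hypothetical, and the result is a conversation starter rather than a design decision.

```python
# Hypothetical checklist keys; each maps to a condition discussed in the article.
MULTI_FARM_DRIVERS = {
    "data_must_stay_in_region": "Compliance requires data to reside in specific regions",
    "capacity_exceeds_one_farm": "Capacity needs can't be met by one farm over its lifespan",
    "network_cannot_meet_slas": "Bandwidth or latency prevents one farm from meeting SLAs",
    "workload_needs_isolation": "A workload (e.g., extranet, custom application) needs its own farm",
}
MULTI_FARM_PREREQUISITES = {
    "has_funding": "Funds to deploy and manage additional farms",
    "has_regional_staff": "Skilled IT staff in each region",
    "can_patch_connected_farms": "Operational capability to patch connected farms",
}


def assess_farm_design(answers):
    """Return (recommendation, drivers, missing_prerequisites).

    `answers` maps checklist keys to True/False; missing keys count as False.
    """
    drivers = [text for key, text in MULTI_FARM_DRIVERS.items() if answers.get(key)]
    if not drivers:
        return "single farm", drivers, []
    missing = [text for key, text in MULTI_FARM_PREREQUISITES.items() if not answers.get(key)]
    if missing:
        return "multiple farms needed; escalate gaps through governance", drivers, missing
    return "multiple farms", drivers, missing


# Illustrative answers for an organization with a data residency requirement.
answers = {
    "data_must_stay_in_region": True,
    "has_funding": True,
    "has_regional_staff": False,
    "can_patch_connected_farms": True,
}
recommendation, drivers, missing = assess_farm_design(answers)
print(recommendation)
print(missing)
```

With these answers, the checklist reports that multiple farms are needed but regional staffing is missing, which is the kind of gap that should be raised with stakeholders early.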
When deciding whether to use a single-farm or multi-farm architecture, you should keep in mind the reasons for using each type of architecture. Here are some scenarios that walk you through this decision process, the reasons for the decision reached, and some design guidance.
Scenario 1. Suppose that your organization resides in various geographic regions in a country. The drivers for this scenario are the location of the users, network bandwidth and latency constraints, and SLAs.
In this case, you need to decide whether it's more financially and technically effective to deploy multiple farms or to deploy only one farm and upgrade the network bandwidth and the WAN compression and caching devices that a single farm would need. When making this decision, you need to consider that you're going to have at least one more farm to manage if you decide to use a multi-farm architecture. Generally, a mid-sized farm has at least five servers, which means there will be additional costs associated with staffing (i.e., the time and labor needed to administer, monitor, and report on the farm), software licensing, and hardware.
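A back-of-the-envelope comparison can help frame this decision before a detailed financial analysis. The Python sketch below compares the cost of an additional five-server farm with the cost of a network upgrade over the same period. Every figure is invented for illustration, and the two functions are hypothetical simplifications that ignore items such as data center space and power.

```python
def additional_farm_cost(servers, hardware_per_server, licensing_per_server,
                         annual_staffing, years):
    """Capital cost of the servers plus staffing over the planning period."""
    capital = servers * (hardware_per_server + licensing_per_server)
    return capital + annual_staffing * years


def network_upgrade_cost(one_time_devices, annual_bandwidth_fee, years):
    """WAN devices bought once plus recurring bandwidth fees."""
    return one_time_devices + annual_bandwidth_fee * years


# Invented figures for a three-year planning period.
farm_total = additional_farm_cost(
    servers=5, hardware_per_server=8000, licensing_per_server=12000,
    annual_staffing=60000, years=3,
)
network_total = network_upgrade_cost(
    one_time_devices=150000, annual_bandwidth_fee=30000, years=3,
)
print(farm_total)     # 280000
print(network_total)  # 240000
print("network upgrade" if network_total < farm_total else "additional farm")
```

With these invented numbers the network upgrade is cheaper, but the outcome reverses easily when staffing costs or bandwidth fees change, so the inputs matter more than the arithmetic.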
No matter whether you decide on a single centrally located farm or multiple farms, you need to make sure the SLAs will be met. If data center space and resources aren't available, you might consider using SharePoint Online or an IaaS solution. Note that externally hosted environments are subject to your organization's record-management, regional, and data-security policies. In this case, you'd need to work with your legal counsel and records management team. Finally, you should publish standards and practices for this solution for others to follow, which will help with governance.
Scenario 2. Suppose that your organization has offices located in different countries throughout the world. Several of those countries (e.g., Canada, Germany) have laws that companies must meet when the companies' data contains FOIP information regarding their citizens. As a result, the driver for this scenario is the compliance requirement dictated by your organization's legal counsel. This requirement mandates that each country's data must reside in a farm that's located within the borders of that country.
To meet this compliance requirement, you must deploy multiple farms. This means you'll have additional farms to manage and subsequently additional costs, as described in Scenario 1. Using SharePoint Online or an IaaS solution might be an option if the vendor can meet the terms dictated by your legal counsel. You should publish standards and practices for this solution for others to follow.
Note that if your organization doesn't have the funds and resources to deploy and manage multiple farms, you could eliminate the FOIP data from SharePoint so that the compliance requirement is no longer a driver. In addition to removing the existing FOIP data, you'd need to change your SharePoint data policy and implement training programs to avoid the inclusion of FOIP data in the future.
Scenario 3. Suppose that your organization resides in several geographic regions. The organization's financial resources are constrained, but you're still required to deliver a solution that meets the SLAs of each region (which is the driver).
To meet the SLAs, you need to deploy more farms, which means you'll have additional farms to manage and subsequently additional costs, as described in Scenario 1. If you don't have data centers in the areas where the additional farms are needed, you can use governance to escalate the need for data center space and staffing. Your options include allocating new space, leasing space in a third-party data center, using SharePoint Online, or using an IaaS solution. You'll likely need the help of a financial analyst to determine which approach is the best to use, given the capital and operational costs associated with adding more data centers and farms. You should publish standards and practices for this solution for others to follow.
Scenario 4. Suppose that your organization is considering deploying another farm for disaster recovery (DR) purposes. Specifically, your primary farm would fail over to the DR farm in case of failure or disaster. The drivers are meeting the SLAs and addressing audit requirements.
By deploying a DR farm, you'll be able to meet the SLAs and address audit requirements, but you'll have another farm to manage. Besides the time and labor needed to administer, monitor, and report on the additional farm, additional time and labor will be needed to run failover and failback tests, resulting in additional costs.
If your organization has been audited for DR, you should use the auditors' recommendations and guidelines to begin designing the DR farm. The biggest challenge will likely be data replication. There are a few options, such as SAN-level replication, SQL Server 2012 AlwaysOn Availability Groups, or a third-party replication product. So, when designing the DR farm, you'll need to work with your server, storage, and network vendors as well as with Microsoft. Make sure that you have the proper amount of network bandwidth. The DR farm shouldn't saturate the switches, lines, and routers, thereby hurting the performance of neighboring applications.
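To get a first feel for whether the network can carry DR replication, you can estimate the throughput needed to ship a day's worth of changed data within a replication window. The following Python sketch shows the arithmetic. The function names, the 50 percent ceiling on link usage, and the sample volumes are assumptions for illustration, and real replication traffic is burstier than an average suggests.

```python
def required_mbps(changed_gb_per_day, replication_window_hours):
    """Average megabits per second needed to replicate one day's changes.

    Uses decimal units (1 GB = 8,000 megabits) and ignores protocol
    overhead and compression.
    """
    megabits = changed_gb_per_day * 8 * 1000
    seconds = replication_window_hours * 3600
    return megabits / seconds


def replication_fits(needed_mbps, link_mbps, max_share=0.5):
    """True if replication uses no more than `max_share` of the link.

    The 0.5 default is an arbitrary placeholder that leaves headroom
    for neighboring applications.
    """
    return needed_mbps / link_mbps <= max_share


# Illustrative values: 50 GB of changes per day, an 8-hour window, a 100 Mbps link.
needed = required_mbps(changed_gb_per_day=50, replication_window_hours=8)
print(f"{needed:.1f} Mbps")           # 13.9 Mbps
print(replication_fits(needed, 100))  # True
```

An estimate like this is only a starting point for the conversation with your storage and network vendors, who can account for peak change rates and the overhead of the replication technology you choose.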
After you decide on the best farm design, you need to develop processes for building and rebuilding your farms or components in them (e.g., servers, storage) as well as processes for failing over and failing back. You also need to develop comprehensive test procedures that will ensure the DR farm will work properly during a failure. You should test the failover and failback processes annually and after any major changes to the environment. Finally, you should publish standards and practices for this solution for others to follow.
The Next Steps
Once the SharePoint architecture team has chosen the best farm design for your organization, the team needs to decide whether or not to virtualize SharePoint and whether or not to use a hosted solution. I discuss what you need to know to make these decisions as well as how to prove your design with a working model in "Architecting SharePoint: Virtualization and Cloud Decisions."