A: It Depends.
Not necessarily the most satisfying answer, but there are many factors that affect the performance of a migration. What can be stated at this point is that Priasoft’s solution typically migrates about 8-15 or more messages per second, per mailbox, on a local LAN (multiple mailboxes can migrate concurrently, and even on multiple workstations). We have seen the solution perform as much as 35+ messages per second in some production environments, and only 1 or 2 messages per second in others. You may notice that, unlike other vendors, we discuss performance in terms of messages per second and not mailboxes per hour or gigabytes per hour. This is because the most critical factor determining the duration of a migration is how many total items there are and how many per second can be moved.
For illustration, consider an environment where there are 3 million messages total (messages in this case are anything found in a mailbox: email, calendar item, contact, etc.). If the aggregate total messages-per-second is 20, you can expect to complete the migration in about 42 hours (3,000,000 messages / 20 messages per second = 150,000 seconds, or roughly 41.7 hours), which is well within a single-event timeframe.
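The arithmetic above can be sketched in a few lines of Python. This is only an illustrative calculation, not part of the Priasoft product; the function name `estimate_migration_hours` is hypothetical, and the inputs are the example figures from the paragraph above.

```python
def estimate_migration_hours(total_items: int, items_per_second: float) -> float:
    """Estimate wall-clock hours from total item count and aggregate throughput."""
    if items_per_second <= 0:
        raise ValueError("items_per_second must be positive")
    seconds = total_items / items_per_second
    return seconds / 3600.0


# Example figures from the text: 3 million items at an aggregate 20 items/sec.
hours = estimate_migration_hours(3_000_000, 20)
print(f"Estimated duration: {hours:.1f} hours")  # 150,000 seconds -> 41.7 hours
```

The same function can be fed the throughput measured in a dry-run instead of an assumed figure.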
So, in order to answer the question, metric tests have to be performed. The metric tests are done using the Priasoft product in Dry-Run mode to determine the throughput between the source and target environments and the overall duration. This provides a nearly guaranteed forecast of how long a migration will take, since the dry-run execution uses the EXACT same codebase as the production migration would. If a dry-run of 3000 mailboxes takes 32.545 hours, you can state with confidence that the production migration will take nearly the same amount of time, plus or minus a few minutes (or maybe an hour), depending on how much the environment changes between when the dry-run was performed and when the production migration is executed.
Many factors affect performance; the most common ones are described below.
Network Latency – this is usually an issue when the source and target environments are separated by a WAN. These network connections inherently introduce latency (not to be confused with bandwidth), and this latency increases with the physical distance between the source and target environments. The effect of latency on migration performance can be severe. Microsoft Exchange and the protocols used to access it are quite “chatty” and were not originally designed with high latency in mind. Some recommendations are, if possible, to bring the source and target Exchange servers as physically close together as possible prior to the migration (one idea is to virtualize one environment and then bring it online locally), introduce a WAN optimizer/accelerator, and/or reduce the amount of data that needs to be migrated.
Another option that can help in high-latency networks is to increase the number of concurrent tasks, that is to say, migrate more mailboxes at the same time and possibly by having more migration workstations running at the same time. Essentially – concurrency combats latency.
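Why concurrency helps can be seen with a deliberately simplified model. The sketch below assumes each item costs a fixed number of network round trips plus a fixed server processing time, and that the server tops out at some maximum rate. All of these numbers (`round_trips_per_item`, `server_ms_per_item`, `server_max_items_per_sec`) are illustrative assumptions, not measurements of Exchange or of the Priasoft tools.

```python
def aggregate_throughput(concurrency: int,
                         rtt_ms: float,
                         round_trips_per_item: int = 4,
                         server_ms_per_item: float = 20.0,
                         server_max_items_per_sec: float = 60.0) -> float:
    """Items/sec for `concurrency` mailboxes migrating in parallel (toy model)."""
    # Time one mailbox stream spends on a single item: waiting on the
    # network for each round trip, plus server processing time.
    per_item_ms = round_trips_per_item * rtt_ms + server_ms_per_item
    per_stream_rate = 1000.0 / per_item_ms
    # Streams overlap each other's waiting time until the server saturates.
    return min(concurrency * per_stream_rate, server_max_items_per_sec)


for rtt in (1, 50):           # LAN-like vs WAN-like round-trip time, in ms
    for streams in (1, 4, 16):
        rate = aggregate_throughput(streams, rtt)
        print(f"rtt={rtt:>2} ms  streams={streams:>2}  ~{rate:.1f} items/sec")
```

In this model a single stream over a 50 ms link manages only about 4.5 items per second, while 16 streams reach the assumed server ceiling. Real environments will differ, which is what the dry-run tests are for.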
Disk Performance – in essence, Microsoft Exchange stores mail data in a specialized database. The data in the database can become fragmented over time. If the source database has not been defragmented in a long time (or ever), reading information from the database can be quite slow. Exchange 2000 and later have automated processes to perform database defragmentation; however, there are situations where these processes are unable to complete or are not able to defragment optimally. Regardless of how it is done, any effort made to compact and defragment the Exchange data will have a positive effect on the migration. It may be valuable to review the Microsoft article about de-fragging.
If performance is of considerable concern, and defragmentation is identified as a way to improve it (perhaps after exhausting all other performance adjustments), the best option is an offline defrag of the Exchange database, which is also referenced in the above article.
Server Load – although obvious, it should not be casually ignored. Server load can affect a migration as much as or more than anything else. Priasoft’s solutions for migration behave and appear as a mail client to Exchange. This is a positive benefit in that Priasoft can never really collapse an Exchange server (much like 3000 users wouldn’t), but due to the optimizations made over the past 10 years, Priasoft can ask for data faster than the server (and ultimately the underlying storage system) can deliver it, which can cause delays for anyone attached to the server, including the Priasoft tools. Additionally, if at the same time some other process monopolizes the Exchange server’s resources, the migration will suffer along with all other clients.
Along with server load comes over-utilization of the underlying disk system. When the disk system is over-committed, queuing occurs – the disk system gets backlogged with read requests – which ultimately slows down the migration. Proper analysis of the disk system during dry-run tests will help determine maximum concurrency.
Of specific importance is analysis of SAN utilization. Often a SAN is a “multi-use playground”, meaning that more than one service can utilize the storage on the SAN. The importance of this centers around “disk spin events”. During a dry-run or a production migration, if some other consumer of the SAN also causes the “disks to spin” at the same time, this will reduce the available I/O for the migration effort. A common example is a SQL database that shares the same physical disks on the SAN as the Exchange database and for which an application or DBA executes a complicated and intensive database event. This will spike the read operations (which are already elevated due to migration activities) and can likely cause queuing and delays for both events (SQL and migration).
There are many things that can affect an Exchange server’s processor load, and some are not as obvious as others. As good practice, a full inspection of each Exchange server should be made, looking for services or applications that work either directly with Exchange or merely run on the server itself. Less obvious are services and applications that are external to the Exchange server but which perform intense operations upon it or the underlying storage system. The least obvious and most difficult to track down are those processes that do work when the server is supposedly ‘idle’. By the time you log in to the server to inspect it, the intense process will have suspended itself because it has determined the server is no longer idle – because you logged in to the server.
Some examples that can increase load on an Exchange server are:
- Backup software
- Antivirus Software
- Antispam Software
- Fax and Unified Messaging software
- Mobile services – like BlackBerry
- Additional server roles – DNS, WINS, DHCP, SQL, etc.
- Screen Savers – it shouldn’t have to be said…
- Archive software
- Terminal Services
Exchange-specific Anti-Virus – an Exchange virus scanner can adversely affect the performance of a migration. If the time window for a migration is of great concern, it is suggested that the Exchange virus scanner be suspended/disabled during the migration. There is obvious concern with changing the state of a virus scanner; however, this should be weighed against the business needs with regard to the performance and desired duration of the migration.
When an Exchange virus scanner is installed, the virus scan engine is loaded directly into the Exchange store process (store.exe). The engine is designed to look at messages as they are being requested and before a client renders them. This design is efficient for daily use as the virus-scanning engine will only perform scans as data is requested versus crawling through the entire database. However, there is a little-known side effect of this: older messages are not scanned except when accessed again.
A virus scanner works from a database of virus ‘signatures’, which it compares against items as they are requested. When a recent message is retrieved, it most likely was scanned within the past few days, and if no new virus signatures have been received from the vendor, the message is quickly handed off to the requester. If new virus signatures have been received, only the new signatures are compared against the item, which is still very quick.
However, the older a message is, with regards to the last time it was accessed, the more signatures that have to be enumerated and compared. In many cases, there are messages that existed BEFORE the Exchange virus scanner was installed. Accessing one of the very old messages means that the item has to be analyzed against ALL of the signatures in the virus scan database.
In day-to-day use, this takes milliseconds or at most 1 or 2 seconds, and is not really perceived by an end user. However, add up 1 or 2 seconds over several million messages in an Exchange server and considerable delay has been introduced.
This delay is not the only issue. A migration can actually fail if an item is complex or simply has a large number of virus signatures to be compared against. The failure comes as a result of a timeout, where our request for data fails because it takes too long. This timeout is not controllable by either Priasoft or Exchange.
Lastly, simply disabling an Exchange virus scanner may not be sufficient. As described earlier, the scan engine is loaded into the store process. Often the scan engine is still scanning items as they are requested but is simply not reporting anything to the virus scan service. If it is suspected that an Exchange virus scanner is affecting performance, the only recourse may be to disable the virus scanner – be sure to do this using the vendor’s provided control panel, as simply stopping the underlying Windows service may not be enough – and then stop and restart the store process. This is the only way to truly ‘unload’ the virus scan engine from the store process.
Mailbox Characteristics – the characteristics of the data to be migrated can have as much to do with the performance of a migration as all of the above combined. To explain this better, an analogy may help.
The file system has much the same performance issue as an Exchange mailbox when it comes to copying data. In the file system it is well known that the time needed to copy a single 1 GB file is dramatically less than the time needed to copy 1 GB of 10 KB files (that would be 100,000 files!). The difference in time is because of the ‘context switching’ that occurs over the many files. For each new file to be copied, there is a bit of delay before the data actually starts copying. This delay consists of various checks and analysis of the file in question: does it still exist, can it be copied, does it have proper permissions, can the target file system accept it, etc. The delay may only be in milliseconds, but even 500 ms over 100,000 files adds up to 50,000 seconds, which is over 13 hours of just waiting!
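The effect of per-item overhead can be made concrete with a small calculation. The sketch below uses the figures from the analogy (100,000 files, 500 ms of overhead each); the function name and the 50 MB/s transfer rate are hypothetical, chosen only for illustration.

```python
def copy_time_seconds(total_bytes: int, item_count: int,
                      overhead_s_per_item: float,
                      bytes_per_second: float) -> float:
    """Total time = fixed per-item overhead + raw data transfer time."""
    return item_count * overhead_s_per_item + total_bytes / bytes_per_second


ONE_GB = 10**9
RATE = 50 * 10**6  # assumed 50 MB/s transfer rate (illustrative only)

one_big_file = copy_time_seconds(ONE_GB, 1, 0.5, RATE)
many_small_files = copy_time_seconds(ONE_GB, 100_000, 0.5, RATE)

print(f"1 x 1 GB file:         {one_big_file:.1f} s")
print(f"100,000 x 10 KB files: {many_small_files / 3600:.1f} h")
```

The amount of data is identical in both cases; only the item count changes, and under these assumptions the overhead term dominates the total.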
In Exchange, a mailbox with thousands of small messages will migrate more slowly than a mailbox of several hundred messages, even if the two mailboxes are the same total size. A complexity that Exchange has over the file system analogy is the fact that items in a mailbox (messages, calendars, etc.) are really containers of information (analogous to a zip file). The complete structure of an item in a mailbox has many parts – properties, attachments, recipients, etc. – that are not necessarily stored contiguously in the Exchange database. This becomes more apparent when you consider the Single Instance Storage feature of Exchange (for versions of Exchange that have it) or, in Exchange 2010, the fact that attachments are stored (internally) separately from the message. As a message is requested, Exchange must gather all the data for the message from various parts of the database (and now further consider the earlier point about fragmentation).
So, the characteristics of a mailbox definitely affect the performance of a migration, AND those same characteristics have an even more profound effect on the previously explained items that affect performance. If the mailbox scheduled for migration is very large and has a large number of items, many of them with attachments, AND the database is fragmented, AND most of the items are old and have to be rescanned by a virus scanner, AND the attachments are large or complex, the performance for that mailbox will be slow.
In summary, if the duration of the migration is of critical importance, any work done to prepare the source Exchange servers will be well worth it. Determining how important the duration is highlights again the value of being able to dry-run the project one or more times before committing to the production execution.
