What is a likely expectation of speed of a migration with Priasoft’s tools?
Anonymous asked 11 years ago

We have what we feel is a fairly large environment and want to know what to expect or for what we can plan with regards to speed.  We have about 2TB of data across approximately 1800 mailboxes.
We’ve seen quite a range of other suggestions of speed from other vendors and from Microsoft as well – 10GB a day, gigabytes per hour, etc.  Where does Priasoft typically fall with regards to such metrics?

1 Answer
Eriq VanBibber Staff answered 8 years ago

This is a fairly common question, but one that does not often have a simple answer.
It may help to cover some fundamentals about how Priasoft processes data before offering any ideas on performance.

Priasoft Migration Process
Priasoft, for a single mailbox, processes folders and items sequentially, one-at-a-time until all items are processed.  However, what is important to consider – especially in comparison to other vendors – is that Priasoft makes a MAPI connection to a source mailbox AND a target mailbox at the same time.  Processing then involves copying items directly from a source folder to a target folder – there is no use of temporary files, PSTs, or other double or triple processing of data.  This design ensures that the fastest processing available is being used.
The data DOES travel through the network adapter of the host running the Priasoft tools so physical placement of the migration tools is an important consideration.  In all cases, network latency (the delay involved in waiting for electrons to flow) has an impact on performance.  It is best to eliminate as much network latency as possible.  As such, placing the migration tools physically near the source data is best; there are many operations such as content analysis, sorting, and grouping of data that happen prior to any actual data copy actions.  Placing the migration tools near the source ensures that the “non data” activities happen with little or no latency.  This is especially important for migration across a WAN connection, such as migrations to Office365.
Network latency occurs between each item copied and therefore higher latency will cause individual mailbox processing to be slower than if the source and target data were both on the same LAN.  In projects where the migration is to Office365, latency is often high (30ms or more).  Compounding this issue is the fact that Microsoft Geo-DNS setup is not an exact science – many customers have seen where ‘outlook.office365.com’ does NOT resolve to the nearest Microsoft datacenter.  In these cases, if left uncorrected, the latency between the on-premises data and Microsoft is sometimes twice as high (60ms instead of 30ms).  The impact of latency is seen in the items-per-second metric.  Higher latency lowers the metric.
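As a rough illustration of why higher latency lowers the items-per-second metric for a single mailbox, the sketch below assumes each item costs one network round trip plus a fixed amount of processing time. The one-round-trip-per-item model and the 5 ms processing figure are assumptions made for illustration only; they are not measured Priasoft numbers.

```python
def items_per_second(latency_ms: float, processing_ms: float = 5.0) -> float:
    """Estimate sequential items/s for ONE mailbox.

    Assumes (hypothetically) one round trip per item plus a fixed
    per-item processing cost. Real values come from a dry-run.
    """
    return 1000.0 / (latency_ms + processing_ms)


# Compare a LAN-like link with the 30 ms and 60 ms cases mentioned above.
for latency in (1, 30, 60):
    rate = items_per_second(latency)
    print(f"{latency:>3} ms latency -> {rate:6.1f} items/s per mailbox")
```

Under these assumptions, doubling latency from 30ms to 60ms cuts the per-mailbox rate nearly in half, which is why correcting a poor Geo-DNS resolution can be worthwhile.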

Data-per-Hour vs. Items-per-Second
It’s not surprising that you have received a range of throughput claims.  The claims differ because data-per-hour is not the correct metric for forecasting duration or gauging performance.  It is misleading because data stored in Exchange is not accessed as “rows of bytes”.  A 50MB mailbox is not 50MB of bytes all in sequence.  Data in a mailbox is a series of items, and accessing each item requires a round trip from the migration host to the Exchange server.
Imagine 2 mailboxes of the same size, 500MB each.  Mailbox #1 has 5,000 items.  Mailbox #2 has 15,000 items.  Even though both mailboxes are the same data size, the second mailbox will likely take about 3 times as long to migrate because it has 3 times as many items.  Therefore, the most important metric to track and understand is “items-per-second”.  This metric is fairly stable over time, unlike data-per-hour.  In fact, 2 folders in the same mailbox can have the same data size but dramatically different item counts.
The Priasoft tools report the items-per-second metric during migration so that it can be tracked.
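To make the two-mailbox comparison concrete, here is a minimal sketch that holds items-per-second constant and shows how the apparent data-per-hour figure swings with item count. The rate of 20 items per second is a hypothetical value chosen only to keep the arithmetic simple.

```python
def migration_seconds(item_count: int, items_per_second: float) -> float:
    """Time to copy a mailbox when throughput is bounded by item count."""
    return item_count / items_per_second


def mb_per_hour(size_mb: float, item_count: int, items_per_second: float) -> float:
    """Apparent data-per-hour for a mailbox of a given size and item count."""
    return size_mb * 3600 / migration_seconds(item_count, items_per_second)


RATE = 20.0  # hypothetical items/s, identical for both mailboxes

for name, items in (("Mailbox #1", 5_000), ("Mailbox #2", 15_000)):
    secs = migration_seconds(items, RATE)
    print(f"{name}: {items} items, {secs:.0f} s, {mb_per_hour(500, items, RATE):.0f} MB/hour")
```

Both mailboxes are processed at the same items-per-second rate, yet the data-per-hour figure differs by a factor of 3, which is the kind of spread seen in vendor claims.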

Concurrency combats Latency
Priasoft’s primary design answer to latency and its impact is concurrency.  Multiple mailboxes can be migrated simultaneously.  Some customers have been able to migrate 50, 75, or more mailboxes at the same time on a single host.  That, combined with the ability to run batches on multiple machines concurrently, means that as long as the source and target environments can support the load, performance can be scaled out very far, thereby reducing the overall timeline.
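The effect of concurrency can be sketched as a simple multiplication, capped by whatever the environment can sustain. This is an idealized model: the per-mailbox rate of 4 items per second, the assumption of linear scaling, and the `environment_cap` parameter are all hypothetical, and only a dry-run reveals the real ceiling.

```python
from typing import Optional


def aggregate_rate(per_mailbox_rate: float,
                   concurrent_mailboxes: int,
                   hosts: int = 1,
                   environment_cap: Optional[float] = None) -> float:
    """Idealized overall items/s across all concurrent mailboxes and hosts.

    environment_cap (hypothetical) is the most items/s the source and
    target environments can support; None means no known limit.
    """
    rate = per_mailbox_rate * concurrent_mailboxes * hosts
    return rate if environment_cap is None else min(rate, environment_cap)


# One host, 50 concurrent mailboxes at an assumed 4 items/s each.
print(aggregate_rate(4.0, 50))
# Two hosts would double that, unless the environment tops out first.
print(aggregate_rate(4.0, 50, hosts=2, environment_cap=300.0))
```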
Determining the concurrency number is therefore a valuable and important first activity.  The dry-run feature of the Priasoft tools allows for a measured and accurate discovery of the maximum concurrency a given environment can support.  Once the concurrency numbers are established, the overall items-per-second value can then be used for an early calculation of duration.  Consider the following imaginary numbers in an environment:

Number of mailboxes: 2,000
Average items per mailbox: 5,000
Average items-per-second found during dry-run: 200
Total items in the source environment: 10 million
Items that can be migrated per hour: 720,000
Items that can be migrated per 10 hours: 7,200,000
Likely duration for 10 million items: ~13.9 hours

Given the above, it should be easy to see how calculations can be made to forecast duration.  The critical factors are total item count and the items-per-second metric.  If the items-per-second value were only 100, the duration would be ~28 hours.
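The forecast above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the imaginary numbers from the example; the function name is made up for illustration and is not part of the Priasoft tools.

```python
def forecast_hours(mailboxes: int,
                   avg_items_per_mailbox: int,
                   items_per_second: float) -> float:
    """Estimated duration in hours: total items / items migrated per hour."""
    total_items = mailboxes * avg_items_per_mailbox
    items_per_hour = items_per_second * 3600
    return total_items / items_per_hour


# Imaginary environment from the example: 2,000 mailboxes x 5,000 items.
for rate in (200, 100):
    hours = forecast_hours(2000, 5000, rate)
    print(f"{rate} items/s -> {hours:.2f} hours")
```

Because the item count is fixed, halving the items-per-second value doubles the forecast, so the dry-run measurement is the number worth getting right.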

Performance Influencers
Given the above explanation, anything that can influence either the items-per-second or the total item count has a considerable impact.  Here are some examples to consider in any environment:

  • Virus Scanners:  Exchange 2007 and earlier allow “on access” scanning, meaning that each request for an item to migrate activates the scanner.
  • Backups:  backups impact disk I/O and can create record locks that can delay (or fail) access to items.
  • Exchange Archiving:  similar to backups, migrating while archiving is occurring is not normally a good idea.
  • Latency:  can be network, disk, or even CPU.  Any delay accessing data influences performance.
  • Firewalls, VPNs, Network Encryption:  such things add extra processing at the network layer.
  • Physical Distance:  the greater the distance between the source and target data the greater the latency.
  • Any other application, service, appliance, etc. that adds processing in between.

Some of the influencers are controllable, and some are not.  One influencer that is controllable, but often ignored, is item count.  Any opportunity to reduce the overall item count has a positive impact.  Priasoft has a first-class feature that specifically addresses this:  Cutover+BackFill.  This feature allows one to migrate a subset of user data in the “Cutover” pass, greatly reducing the total item count that must be processed in order to activate an organization on a new environment.  In the above example, if the last 30 days of data equates to 2 million items, the initial cutover would take only ~3 hours.  The “Backfill” pass can be run while users are in their mailboxes and could be run non-stop until complete, even if that took 30 hours or more.
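The cutover figure can be checked with the same kind of arithmetic as the earlier forecast. The sketch below splits the imaginary 10 million items into a cutover pass and a backfill pass. It assumes both passes run at the 200 items-per-second dry-run rate, which may not hold for a backfill that runs while users are active.

```python
from typing import Tuple


def split_forecast(total_items: int,
                   cutover_items: int,
                   items_per_second: float) -> Tuple[float, float]:
    """Return (cutover_hours, backfill_hours) at a single assumed rate."""
    items_per_hour = items_per_second * 3600
    backfill_items = total_items - cutover_items
    return cutover_items / items_per_hour, backfill_items / items_per_hour


# 2 million recent items in the cutover pass, the remaining 8 million later.
cutover, backfill = split_forecast(10_000_000, 2_000_000, 200)
print(f"Cutover:  {cutover:.2f} hours")
print(f"Backfill: {backfill:.2f} hours")
```

The point of the split is that only the cutover hours sit on the critical path for activating users; the backfill duration matters far less.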

Hopefully this information provides a better perspective on what to expect and why it is hard for Priasoft to offer any “we can do this” type of metrics.  There are just too many variables that influence the result.  This is one of the primary purposes of the dry-run feature of the Priasoft solution.