2.8 Archiving Exchange and Office 365

2.8.1 Prerequisites for Archiving Exchange Data

There are several prerequisites that need to be completed for setting up an Exchange module.

Make Sure That Autodiscover Is Enabled and Working

IMPORTANT:Autodiscover is essential:

  • It lets you skip users or immediately abort a job.

  • If Autodiscover isn’t working, serious errors occur when Retain attempts to archive users’ messages.

  1. Test that Autodiscover is enabled and working for the domain by doing the following:

    1. Browse to the Microsoft Remote Connectivity Analyzer.

    2. On the Office 365 tab, under Microsoft Office Outlook Connectivity Tests, select Outlook Autodiscover.

    3. Enter your credentials and run the test.

  2. If the test succeeds, continue with the next section, Get the SMTP Server URL.

  3. If the test fails, contact Microsoft and have them turn Autodiscover on, then rerun the test until it succeeds.
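
For a quick look at what the analyzer is probing, the following Python sketch lists the two HTTPS Autodiscover addresses that Outlook-style clients conventionally try for a mailbox domain. It only builds the URLs and does not replace the Remote Connectivity Analyzer test. The function name autodiscover_candidates and the example address are illustrative and are not part of Retain.

```python
from __future__ import annotations


def autodiscover_candidates(email_address: str) -> list[str]:
    """Return the conventional HTTPS Autodiscover URLs for an address.

    Assumption: the mailbox domain is the part after the last "@".
    """
    local, sep, domain = email_address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email_address!r}")
    domain = domain.lower()
    return [
        f"https://{domain}/autodiscover/autodiscover.xml",
        f"https://autodiscover.{domain}/autodiscover/autodiscover.xml",
    ]


# Example with a placeholder address.
for url in autodiscover_candidates("user@example.com"):
    print(url)
```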

Get the SMTP Server URL

Retain requires an SMTP server for sending notifications.

  1. In the Microsoft Remote Connectivity Analyzer (https://testconnectivity.microsoft.com/), on the Office 365 tab, under Microsoft Office Outlook Connectivity Tests, select Inbound SMTP Email.

  2. Enter your credentials and run the test.

  3. Record the SMTP server URL.

Preventing the Deletion of Unarchived Exchange Messages

To prevent data loss, you should set a rolling in-place hold so that users cannot remove items before Retain has a chance to archive them.

IMPORTANT:Not all Office 365 licenses allow the setting of a hold. In such cases, there is no way to prevent data loss.

How Message Deletion Works in Exchange

When users delete messages in Outlook, the messages are moved, by default, to the trash.

When users empty their trash, deleted items are moved to the mostly hidden Recoverable Items folder, where they are kept for 14 days before being removed from the disk.

In the interim, users can right-click the Trash to recover items, but they can also purge the items, which immediately deletes them.

If a hold is in place, purged items are moved to a Purged folder that is not user-accessible and kept there until the hold is lifted.

Set Retain Profile/Miscellaneous to Include Recoverable Items

In Retain, set Profile/Miscellaneous to Include user's recoverable items.

Setting Up a Distribution List

  1. Access the Exchange Admin Console.

  2. Set up a distribution list.

    For example, create a list named All_Mailboxes that contains all mailboxes.

  3. Create a policy that adds new users to this distribution list by default.

Placing the Distribution List under a 90-day Hold

  1. Access the Exchange Management Shell.

  2. Enter the following command, replacing All_Mailboxes with the name of the distribution list mailbox that you created in Setting Up a Distribution List above.

    New-MailboxSearch "Retain90DayHold" -ItemHoldPeriod 90 -InPlaceHoldEnabled $true -SourceMailboxes All_Mailboxes

    It takes time for the hold to take effect.
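
To reason about whether a 90-day hold leaves enough room for your archive schedule, a small date calculation can help. The Python sketch below assumes that the hold period is counted from the date an item is received or created, which is how Microsoft documents the ItemHoldPeriod parameter; verify this against your tenant before relying on it. The function names are illustrative only.

```python
from datetime import date, timedelta


def hold_expiry(received: date, hold_days: int = 90) -> date:
    """Last date on which a time-based hold still covers the item.

    Assumption: the hold period starts on the item's received/created date.
    """
    return received + timedelta(days=hold_days)


def is_protected(received: date, archive_run: date, hold_days: int = 90) -> bool:
    """True if the item is still under hold when the archive job runs."""
    return archive_run <= hold_expiry(received, hold_days)
```

For example, an item received on 2024-01-01 is covered through 2024-03-31, so an archive job that first reaches the mailbox in April could miss it if the user purged it in the meantime.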

Finding How Many Mailboxes Were Placed Under Hold

You can determine how many mailboxes were placed under hold with the following script:

((Get-Mailbox).InPlaceHolds).Count

Setting Up Users with a PowerShell Script

NOTE:Retain uses PowerShell to connect to Office 365.

PowerShell does not allow the following special characters in names or passwords: # $ ( ) * + . [ ] ? \ / ^ { } |
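
Before creating the accounts below, you can screen a proposed name or password for the characters listed above. This Python sketch is only a convenience check; the constant and function names are hypothetical, and the character set is copied from the list in this section.

```python
from __future__ import annotations

# Characters this section lists as not allowed in names or passwords.
FORBIDDEN = set("#$()*+.[]?\\/^{}|")


def forbidden_chars(value: str) -> list[str]:
    """Return the forbidden characters found in value, in first-seen order."""
    found: list[str] = []
    for ch in value:
        if ch in FORBIDDEN and ch not in found:
            found.append(ch)
    return found
```

An empty result means the string contains none of the listed characters; anything else shows which characters to replace.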

Create a Retain Impersonation User

In the O365 Exchange Admin Center, create a Retain Impersonation user with a mailbox, making sure to give it a license.

Give the Impersonation user the proper rights. Under Permissions, create a new Admin Role (for example, Retain Impersonation Management), then add the ApplicationImpersonation right and add the Retain Impersonation user as a member.

Create a Retain Administrator User

Retain needs a user with Administrator rights to download the address book from Office 365 every day with the Office 365 Address Book Synchronization Script. This can be an existing administrator account or you can create a separate one. It needs to have sufficient rights to see all the users in the address book.

Setting Up Access to Shared Mailboxes for the Impersonation User

Impersonation rights allow the Retain user to enter other mailboxes but those rights do not extend to shared mailboxes. To access a shared mailbox, the Retain user needs rights to each shared mailbox that is to be archived. These rights can be granted through the Exchange Management Shell.

For example, if the shared mailbox is owned by John Doe and your Retain impersonation account is Retain, you would issue the following command in an Exchange Management Shell (EMS):

Add-MailboxPermission -Identity "John Doe" -User Retain -AccessRights FullAccess -InheritanceType All -AutoMapping $false
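
If you have many shared mailboxes, you may prefer to generate one command per mailbox and review the list before pasting it into the Exchange Management Shell. The Python sketch below only builds the command text shown above; it does not connect to Exchange. The function name and the default account name Retain are assumptions taken from the example.

```python
from __future__ import annotations


def shared_mailbox_commands(shared_mailboxes: list[str],
                            retain_user: str = "Retain") -> list[str]:
    """Build one Add-MailboxPermission command per shared mailbox."""
    commands = []
    for identity in shared_mailboxes:
        # A double quote inside the identity would break the quoting below.
        if '"' in identity:
            raise ValueError(f"identity contains a double quote: {identity!r}")
        commands.append(
            f'Add-MailboxPermission -Identity "{identity}" -User {retain_user} '
            "-AccessRights FullAccess -InheritanceType All -AutoMapping $false"
        )
    return commands


# Example with placeholder mailbox names.
for line in shared_mailbox_commands(["John Doe", "Sales Team"]):
    print(line)
```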

Synchronizing the Exchange Address Book with Retain

For Retain to authenticate users and access mailboxes for archiving, it needs to know what mailboxes are in Office 365. There are two ways to do this:

  • Populating the address book directly from Office 365 by using the Microsoft Graph API (recommended).

    Or

  • Using PowerShell Scripts to download the domain address book as a .csv file.

Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI)

To give Retain access to the Office 365 address book through the Graph API, you must first register it on the Microsoft Azure Portal and then add the registration information to the Retain module:

  1. Register Office 365 on the Microsoft Azure Portal by entering the following URL in your administrative browser:

    https://portal.azure.com/#blade/Microsoft_AAD_RegisteredApps/applicationsListBlade

  2. Create a new app registration pointing it to your Retain Server URL.

    For example:

    https://retain.gwava.com/RetainServer

  3. After creating the app registration, click API permissions for the app > Add a permission > Microsoft Graph > Application permissions.

    NOTE:Adding the following permissions requires Admin credentials, for which you are prompted the next time you log in.

  4. Select the following permissions:

    • User.Read.All

    • Directory.Read.All

  5. Click API permissions for the app > Add a permission > Exchange > Application permissions.

  6. Select the following permission:

    • full_access_as_app (Only needed for archiving data, not address book synchronization)

  7. Go to Certificates & secrets and select New client secret to create a secret for Retain.

    IMPORTANT:Make a record of the Client secret value because it is only visible now.

  8. Add the following information to the record you just made:

    • Application (client) ID (found on the Overview page)

    • Directory (tenant) ID (found on the Overview page)

  9. Access the Service Connection Details tab and configure the module to populate the address book using Office 365.
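
To sanity-check the values recorded in steps 7 and 8 outside of Retain, you can request a token with the OAuth 2.0 client credentials flow and read one page of users from Microsoft Graph. The Python sketch below is an independent check under that assumption and is not how Retain itself is implemented; the function names are hypothetical, and the tenant ID, client ID, and secret are the values from your own app registration.

```python
import json
import urllib.parse
import urllib.request

GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build the client-credentials token request for the given tenant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_SCOPE,
    }).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def count_first_page_of_users(tenant_id: str, client_id: str,
                              client_secret: str) -> int:
    """Fetch a token, then return how many users the first Graph page lists."""
    token_request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(token_request) as resp:
        token = json.load(resp)["access_token"]
    users_request = urllib.request.Request(
        "https://graph.microsoft.com/v1.0/users",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(users_request) as resp:
        return len(json.load(resp)["value"])
```

Calling count_first_page_of_users with your recorded values should return a positive number if the User.Read.All permission was granted; an HTTP 401 or 403 error usually points to a wrong secret or missing admin consent.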

Synchronizing the Address Book Using PowerShell Scripts

Retain includes two PowerShell scripts (under the Tools menu) for extracting address book information from Office 365.

Both scripts download the Office 365 address book and save it in two .csv files.

PowerShell Sync Script 1.0 saves the username and password in plaintext in the script.

PowerShell Sync Script 4.0 encrypts the password to a separate file.

Retain cannot archive members of the distribution lists if the HiddenFromAddressListsEnabled field is set to True.

If using multiple modules, you must create separate folders for the script and the resulting .csv files. The folder location that Retain should pull the .csv file data from is set in the module as detailed below. You must also create a scheduled task for each script.

PowerShell Sync Script 1.0 (sync365.ps1)

IMPORTANT:This script requires that you enter an Office 365 administrative password in cleartext. If this is a concern, use the PowerShell Sync Script 4.0 (sourcesync365.ps1) instead.

The script requires PowerShell 2.0. This script connects to the target system and downloads the address book and distribution group lists into two address book .csv files, exchangeuser.csv and exchangegroup.csv.

Setting Up the PowerShell Scripts

  1. Install PowerShell 2.0 or higher. (Windows 7 and Server 2008 R2 already come with PowerShell 2.0)

  2. Enable Microsoft .NET Framework 3.5.1.

  3. Install the Office 365 PowerShell cmdlets. Two packages are needed which can be downloaded from the Microsoft Azure Active Directory PowerShell Module Version Release History, currently found at: Where Can I Find the Latest Version of AAD PowerShell.

    • Microsoft Online Services Sign-In Assistant for IT Professional RTW (this is the prerequisite to the Azure AD Module)

    • Azure Active Directory Module for Windows PowerShell (64-bit version)

  4. You might need to change the execution policy to allow these scripts to function:

    • Allow PowerShell script execution

      The default execution policy is Restricted. To view the current policy, enter this command in PowerShell:

      Get-ExecutionPolicy

    • To allow the script provided by Micro Focus to run, enter this command in a PowerShell session that is running as Administrator:

      Set-ExecutionPolicy RemoteSigned

  5. In the Retain Management Console, download the script by clicking Tools > O365 Archiving > PowerShell Sync Script 1.0.

    The downloaded script filename is sync365.ps1.

  6. If you plan to run the script on the Retain server itself, move the sync365.ps1 script to the ~\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg folder.

    Otherwise, if you run it on the management workstation, be sure to copy the resulting exchangeuser.csv and exchangegroup.csv files to that directory on the Retain server.

  7. Edit the sync365.ps1 script with the Microsoft Integrated Scripting Environment (ISE) editor.

    1. At the top are 3 settings:

      • $User Set this to the UPN of an administrator account in Office 365.

      • $PlainPassword Set this to the plain text password of the administrator account.

      • $ExportBasePath Set this to a directory where the two resulting .csv files are saved.

        If the path does not yet exist, you must create it manually, making sure to escape the backslashes (\\).

        For example:

        $ExportBasePath="C:\\Program Files\\Beginfinite\\Retain\\RetainServer\\WEB-INF\\cfg"

    2. Execute the script by clicking the play button. This process can take a while if there are many users. When the script finishes, a message displays in the bottom status bar.

    3. Make sure that the exchangeuser.csv and exchangegroup.csv files are in ~\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg for Retain to use for the address book.

  8. Set the Task Scheduler to run the script automatically once per day by doing the following.

    If you create it at the Task Scheduler (Local) level, you can find it after it is created in the Task Scheduler Library folder, center pane.

    1. Create a New Task.

    2. On the General tab, give it a name and description.

    3. Under Security options, choose: Run whether user is logged on or not.

    4. In the Triggers tab, click New....

    5. Under Settings, choose Daily and set the Start Time to an hour before the Exchange archive job is set to begin to guarantee that the script finishes in time.

    6. Choose Do not expire.

    7. Enable the task.

    8. Under the Actions tab: Create a New action.

      1. Set the Action to “Start a program”

      2. Program/script: powershell

      3. Add arguments: -NoProfile -ExecutionPolicy Bypass -file "[drive]:\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg\sync365.ps1" -Verb RunAs

      4. Start In: (leave blank)
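
Step 7 asks for the backslashes in $ExportBasePath to be escaped, which is easy to get wrong by hand. As a small illustration, the Python sketch below doubles each backslash in an ordinary Windows path and prints the line to paste into sync365.ps1. The function name is hypothetical; only the $ExportBasePath setting comes from the script described above.

```python
def export_base_path_line(windows_path: str) -> str:
    """Return the $ExportBasePath assignment with backslashes doubled."""
    escaped = windows_path.replace("\\", "\\\\")
    return f'$ExportBasePath="{escaped}"'


# Example using the default Retain configuration folder.
print(export_base_path_line(
    r"C:\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg"))
```

Pass the path exactly as Windows Explorer shows it, with single backslashes; a path that is already escaped would be escaped a second time.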

PowerShell Sync Script 4.0 (sourcesync365.ps1)

Use this script if exposing the administrator password in plain text is not acceptable under your organization’s security policy.

Requirements

  • The script must be run on the same machine that it was downloaded to because the password encryption is specific to the machine it runs on.

  • This script requires PowerShell 4.0 or higher; otherwise, it aborts.

    You can determine the installed PowerShell version by running the following command:

    $PSVersionTable.PSVersion

  • If you are using Windows Server 2008 R2 or earlier, the script generates errors because the Task Scheduler cmdlets are not supported, and the Scheduled Task must be created manually.

Running the Script

  1. In the Retain Management Console, download the script by clicking Tools > O365 Archiving > PowerShell Sync Script 4.0.

  2. After downloading and extracting the script, open PowerShell and change to the directory the script is in.

  3. Enter .\Save-CredentialsEncrypted.ps1

    If you haven't run a PowerShell script before, you might have to change the Execution Policy to get a script to run using the following command:

    Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process

  4. A dialog box displays.

  5. Enter the user name (example: DomainAdmin@company.onmicrosoft.com) and password of an administrator user with sufficient rights to download the address book from Office 365.

  6. Specify the destination folder of the address book files.

    Provide it with a name for the Scheduled task.

    You can specify any destination folder you want. However, the resulting address book files, exchangeuser.csv and exchangegroup.csv, must be in the ~\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg folder on the Retain Server when Retain refreshes the address book.

    If you are using a Linux-based Retain Server, you must set up a process to move the address book files to the Linux-based Retain Server's /opt/beginfinite/retain/RetainServer/WEB-INF/cfg folder, as outlined in Using PowerShell when Retain Is Linux-based.

  7. Specify a separate task and destination folder for each Exchange module that you create.

  8. The script then sets up a recurring task (SyncO365) in Task Scheduler to download the address book every day at 12:30am.

    The script requests your logon credentials as it sets up the task. The script starts the task before exiting. If you are using Windows Server 2008, multiple errors appear because the Task Scheduler cmdlets do not exist. This is expected, and the task can be created manually in Task Scheduler.

  9. It takes time to download the address book files, exchangeuser.csv and exchangegroup.csv. For a small system (<100 users), it might take a few minutes; for a large system (>10k users), it can take more than half an hour. After the script completes, make sure that the address book files are in the same folder as your PowerShell script.

    If there are no files, it may be an execution policy issue. See Troubleshooting the PowerShell Export Process.

Troubleshooting the PowerShell Export Process

  • Task Scheduler Reliability: Because the scheduler has been known to stop working at times, we recommend monitoring the .csv files to ensure that they are being updated every day.

  • Blank .csv files: Office 365 requires regular password changes. If the wrong credentials are entered or the password has expired, two blank .csv files are created. You must run the script again, entering the Administrator logon name, the new password, and the destination folder. The script starts the task to update the address book files.

  • Red Text Displays and Window Closes: If you see red text and the window closes immediately, there was an error of some kind.

    1. Open a PowerShell window and change to the script folder.

    2. Run the following command in a PowerShell window to allow execution for the current process:

      Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process

    3. Run the script from the same process window using the following command:

      .\Save-CredentialsEncrypted.ps1
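
The first two troubleshooting items (a stalled scheduled task and blank .csv files) can be watched for with a small check that runs after the scheduled download. The Python sketch below is one possible way to do that. It assumes that "blank" means a zero-byte file and that a file older than about a day indicates a missed run; the function name and the 26-hour threshold are illustrative choices, not Retain settings.

```python
from __future__ import annotations

import os
import time

ADDRESS_BOOK_FILES = ("exchangeuser.csv", "exchangegroup.csv")


def address_book_problems(folder: str, max_age_hours: float = 26.0,
                          now: float | None = None) -> list[str]:
    """Return a description of each problem found with the two .csv files."""
    now = time.time() if now is None else now
    problems = []
    for name in ADDRESS_BOOK_FILES:
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            problems.append(f"{name}: missing")
            continue
        # Assumption: a zero-byte file is what a failed logon leaves behind.
        if os.path.getsize(path) == 0:
            problems.append(f"{name}: empty (wrong or expired credentials?)")
        age_hours = (now - os.path.getmtime(path)) / 3600
        if age_hours > max_age_hours:
            problems.append(f"{name}: last updated {age_hours:.0f} hours ago")
    return problems
```

An empty list means both files exist, contain data, and were updated recently. You could run the check from a second scheduled task and send the returned messages to whoever monitors the Retain server.
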

Using PowerShell when Retain Is Linux-based

PowerShell scripts don’t run on Linux.

Instead, run the script on a Windows computer with the required version of PowerShell installed, then copy the resulting .csv files to the Retain server.

Create a Batch File to Transfer the Files

You can create a small Windows VM that only runs the PowerShell script and then copy the exchangeuser.csv and exchangegroup.csv files to the Retain server.

You can use the free program WinSCP to copy from a Windows computer to a Linux server.

You can use winscp.com as the basis of a batch file to copy the resulting .csv files to the Retain server.

For example:

retain.bat
"C:\Program Files (x86)\WinSCP\winscp.com" /command ^
 "option batch abort" ^
 "option confirm off" ^
 "open scp://[user]:[password]@[retain server address]" ^
 "cd /opt/beginfinite/retain/RetainServer/WEB-INF/cfg" ^
 "option transfer binary" ^
 "put [file location on windows]*.csv" ^
 "close" ^
 "exit"

Explanation of what each line does. The # lines are annotations only; do not include them in the batch file, because # is not a comment marker in Windows batch files:

# Specify that all commands are run on the command line, while using ^ to split long lines for readability.
 "C:\Program Files (x86)\WinSCP\winscp.com" /command ^
# Automatically abort script on errors
 "option batch abort" ^
# Disable overwrite confirmations that conflict with the previous option
 "option confirm off" ^
# Connect, substituting your own username, password, and Retain server address
 "open scp://[user]:[password]@[retain server address]" ^
# Change remote directory
 "cd /opt/beginfinite/retain/RetainServer/WEB-INF/cfg" ^
# Force binary mode transfer
 "option transfer binary" ^
# Upload the file to current working directory
 "put [file location on windows]*.csv" ^
# Disconnect
 "close" ^
# Exit WinSCP
 "exit"

Automate the Script

You can now automate the running of this batch file in Task Scheduler as a simple task to run before the Retain archive job. Set the run time of the task so that it completes before the Retain job begins.

Configuring Retain for Archiving Site-Collection Document Links

Exchange Online in Office 365 lets you attach documents that are stored in Site Collections in SharePoint Online/OneDrive. However, Exchange only contains links to the documents, not actual copies of them. The documents still reside only in the Site Collections themselves.

The ApplicationImpersonation right only provides Retain with access to what is stored in Exchange. Therefore, when Retain attempts to archive an attached document, it generates an error similar to the following:

11:04:16, 704[Thread-4920] [ERROR] ExchangeAttachment: error while creating attachment. java.io.IOException: SharePointError - Impersonation has no access to: https://gwava-my.sharepoint.com/personal/user08_gwava_onmicrosoft_com1/Documents/Email attachments/office 365 users(1) (1).txt
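
When many attachments fail this way, it helps to know which site collections are involved before granting access. The Python sketch below scans log lines for the error text shown above and reduces each URL to a likely site collection address. It assumes that the first two path segments identify the collection when the path starts with personal, sites, or teams; the function names are hypothetical, and the log format is taken only from the sample line.

```python
from __future__ import annotations

import re
from urllib.parse import urlsplit

NO_ACCESS = re.compile(r"Impersonation has no access to:\s*(https?://.+?)\s*$")


def site_collection(url: str) -> str:
    """Reduce a document URL to its assumed site collection address."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 2 and segments[0].lower() in ("personal", "sites", "teams"):
        return f"{parts.scheme}://{parts.netloc}/{segments[0]}/{segments[1]}"
    return f"{parts.scheme}://{parts.netloc}"


def inaccessible_sites(log_lines) -> list[str]:
    """Return the distinct site collections named in SharePointError lines."""
    sites = set()
    for line in log_lines:
        match = NO_ACCESS.search(line)
        if match:
            sites.add(site_collection(match.group(1)))
    return sorted(sites)
```

Passing the lines of a Worker log to inaccessible_sites gives a short list of collections to review in the procedure that follows.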

Granting Access to a Site Collection

For Retain to archive an attached document in Exchange, the Retain user must be added as a Site Collection Administrator in SharePoint/OneDrive.

  1. Browse to the Office 365 Admin page.

  2. Select SharePoint, then click Site Collections and select the collection that contains the attachment.

  3. Click the Owners tab and choose Manage Administrators.

  4. Add the Retain user as a Site Collection Administrator.

    The Retain user now has rights to access the documents for archiving.

Granting Access to Multiple Site Collections

If you need to provide site collection administration access to multiple collections, consider a solution such as the SharePoint Online Management Shell script.

2.8.2 Creating an Exchange Module

Office 365

NOTE:If your organization uses Office 365, use this section to configure your Exchange module.

If you use Exchange, go to the following section that applies to your situation:

Providing OpenID Access (Modern Authentication) to Users

Retain 4.9.1 lets users access Retain via a Log in with Office 365 button.

Retain redirects users to Office 365 and they then have 10 minutes to log in.

After successfully authenticating, users can access their Retain archive.

Enabling OpenID Access (Modern Authentication)

To enable this functionality you must do the following:

  1. Complete the instructions in Configure the Exchange Module for Office 365, making sure that you choose to populate the address book directly from Office 365.

    IMPORTANT:Using a .csv file to populate the address book is not compatible with this functionality.

  2. After you have created the module and verified (tested) the Office 365 connection, open the Retain Server Manager.

  3. Click Server Configuration > Accounts tab and go to the Office 365 End User Authentication panel.

  4. Copy and paste the same Tenant ID and Client ID as you used in the Credentials Sub-panel.

  5. In Microsoft Azure, on the Registered Apps page in the Authentication menu, do the following:

    1. Under Web > Redirect URIs, add the URL of the openIdConnect.jsp file on the Retain Server that you are configuring.

      For example, https://your-Retain-server/RetainServer/Server/openIdConnect.jsp

    2. Under Implicit grant, enable the ID tokens option.

    3. Under Supported account types, enable the Accounts in any organizational directory (Any Azure AD directory - Multitenant) option.

Enabling Multiple Retain Servers for OpenID Support

If you have multiple Retain servers archiving from the same Office 365 system, you must do the following on each of those Retain servers.

  1. Edit the /opt/beginfinite/retain/RetainServer/WEB-INF/classes/config/misc.properties file.

  2. Add the following line:

    msopenid.redirecturi=https://your-Retain-server/RetainServer/Server/openIdConnect.jsp

  3. Restart Tomcat.
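
The redirect URI in misc.properties needs to match one of the redirect URIs registered in Azure, so it can help to derive both from the same host name. The Python sketch below builds the URL and the properties line for each server; the function names and the example host names are placeholders.

```python
def redirect_uri(host: str) -> str:
    """Return the openIdConnect.jsp URL for a Retain server host name."""
    if not host or "/" in host or ":" in host:
        raise ValueError(f"expected a bare host name, got {host!r}")
    return f"https://{host}/RetainServer/Server/openIdConnect.jsp"


def misc_properties_line(host: str) -> str:
    """Return the msopenid.redirecturi line for misc.properties."""
    return f"msopenid.redirecturi={redirect_uri(host)}"


# Example with placeholder host names, one per Retain server.
for server in ("retain1.example.com", "retain2.example.com"):
    print(misc_properties_line(server))
```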

Configure the Exchange Module for Office 365

The Exchange module must be configured in the Retain Server before any communication between Retain and an existing Exchange message system can occur. Open the Retain management page on the Retain Server, and select Module Configuration.

Select the ‘Configure’ option in the Exchange module. A new window or tab opens with the module configuration.

NOTE:Ensure that your Retain Server and your Exchange server use the same DNS server.

The Exchange module uses DNS settings to auto discover critical information about Exchange that is stored in Active Directory. It cannot work correctly unless both systems use the same DNS server.

Core Settings Tab

The Core Settings tab lets you disable all jobs and prevent users from logging in to Retain.

The module needs to be enabled on this page to make it active in the Retain system.

The module can be given a name.

The Send Method lets you send Exchange items to an external system using FTP or SMTP.

In most cases this should be disabled so that items are archived in Retain.

To select the SMTP Forwarding or FTP features, you must first add and configure them in the Module Forwarding Tab on the Server Configuration page, otherwise the drop-down list is empty.

Normally all the checkbox options on this tab are enabled. It is rare that you would ever deselect any of them. Two cases where you might, would be: troubleshooting (as instructed by Technical Support), and retiring an old email system.

The Enable Address Book Caching function allows Retain to regularly cache the online email system's address book and synchronize it with Retain. This is critical for administration, authentication, and archiving purposes. It is recommended to cache the Address Book once every 24 hours to keep the Retain storage system up to date. By default, maintenance is set to cache the Address Book once every 24 hours.

The Enable Authentication checkbox determines if end-user authentication is performed when the user logs into Retain. If it is deselected, the Retain system cannot authenticate the user against the email system and the user cannot log in unless another authentication method is enabled.

The Enable Jobs checkbox determines if configured data retrieval jobs are ever passed to the Worker. Even if the individual job is fully configured and enabled, if this checkbox is switched off, no jobs configured for this module can be processed.

The Message body option lets the administrator choose whether to store the HTML message body, the plain text message body, or both.

Impersonation Tab

If the Impersonation and Core Settings tabs are not completely configured with the correct information, the hosted system cannot be archived correctly.

IMPORTANT:The Global Catalog User and Password specified here must be valid in both Exchange/O365 and the SharePoint system.

  1. Enter the Impersonation user credentials.

  2. Click Test Connection to verify that both the user FQDN and password are entered correctly.

Hosted Services Tab

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab

Table 2-21 Using the Hosted Services Tab

Panels and Sub-panels | Information and/or Action
Hosted Services | See Hosted Services Panel.
Address Book Discovery | See Address Book Discovery Sub-panel.
Mailbox Archive Authentication | See Mailbox Archiving Authentication Sub-panel.
Credentials | See Credentials Sub-panel.

Hosted Services Panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel

Table 2-22 Using the Hosted Services Panel

Option, Field, or Sub-panel | Information and/or Action
I am using a Hosted Exchange system option | Enable this option if you use hosted Exchange services. If you select this option, Retain ignores the Exchange Forest, User Forests, and Delegates tabs. IMPORTANT:Once you select this option and save the module, you cannot switch and configure the Exchange Forest, User Forests, and Delegates tabs because they are ignored. You must create a new module instead.
Office 365 option | Select this if you use Office 365. When you select this option, the Hosted Services tab panel expands to let you specify settings for address-book discovery, mailbox archiving authentication methods, and the credentials required for connecting with Office 365. IMPORTANT:Once you select this option and save the module, you cannot switch to the Hosted Exchange without LDAP option (below). You must create a new module instead.
Hosted Exchange without LDAP option | Select this if you use a hosted Exchange service that doesn’t utilize LDAP directory services. Selecting this requires that you import your address book from a PowerShell-generated CSV file. See Synchronizing the Address Book Using PowerShell Scripts.

Address Book Discovery Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Address Book Discovery Sub-panel

Table 2-23 Using the Address Book Discovery Sub-panel

Option, Field, or Sub-panel | Information and/or Action
Import from CSV file option | Path: Specify the path to where the PowerShell Sync script saves the CSV user lists (for example, C:\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg). This requires the procedures described in Synchronizing the Address Book Using PowerShell Scripts.
Populate from Office 365 option | Select this option to populate your archived address book directly from Office 365. This requires the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI).

Mailbox Archiving Authentication Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Mailbox Archiving Authentication Sub-panel

Table 2-24 Using the Mailbox Archiving Authentication Sub-panel

Option, Field, or Sub-panel | Information and/or Action
Use OAuth | This indicates that OAuth is the authentication method that this module uses for mailbox archiving.

Credentials Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Credentials Sub-panel

Table 2-25 Using the Credentials Sub-panel

Option, Field, or Sub-panel | Information and/or Action
Tenant ID field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI).
Client ID field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI).
Client Secret field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI).
Test Connection button | Click this to verify that the credentials you have entered are valid with your Office 365 system. The connections to both Graph API and EWS (if using OAuth) are verified and reported separately.

Refresh the Address Book

After saving changes, return to the Retain Server's Module Configuration page, and trigger a refresh of the Address Book.

Depending on the size of the address book, it may take several minutes to return with information, but a successful configuration returns a correct address book cache date and no errors. The date should reflect the date when the address book refresh was triggered.

The Status may show “Address Book Cache Never Run” or, if the Refresh job fails, may list commonly misconfigured or missed items.

Once the status is configured and the Address Book has been cached, Retain can connect to and archive messages from the Exchange server. The system is ready to have workers, schedules, profiles, and jobs configured, and those options now appear on the main administrative interface.

The Address Book is refreshed whenever the button is pressed, during the nightly maintenance cycle, and before each job.

Exchange without Access to Active Directory

Use this section to configure a module for a hosted Exchange system where you don’t have administrative access to its associated Active Directory services.

Configure the Exchange Module for Exchange without Access to Active Directory

The Exchange module must be configured in the Retain Server before any communication between Retain and an existing Exchange message system can occur. Open the Retain management page on the Retain Server, and select Module Configuration.

Select the ‘Configure’ option in the Exchange module. A new window or tab opens with the module configuration.

NOTE:Ensure that your Retain Server and your Exchange server use the same DNS server.

The Exchange module uses DNS settings to auto discover critical information about Exchange that is stored in Active Directory. It cannot work correctly unless both systems use the same DNS server.

Core Settings Tab

The Core Settings tab lets you disable all jobs and prevent users from logging in to Retain.

The module needs to be enabled on this page to make it active in the Retain system.

The module can be given a name.

The Send Method lets you send Exchange items to an external system using FTP or SMTP. In most cases this should be disabled so that items are archived in Retain. To select the SMTP Forwarding or FTP features, you must first add and configure them in the Module Forwarding Tab on the Server Configuration page, otherwise the drop-down list is empty.

Normally all the checkbox options on this tab are enabled. It is rare that you would ever deselect any of them. Two cases where you might, would be: troubleshooting (as instructed by Technical Support), and retiring an old email system.

The Enable Address Book Caching function allows Retain to regularly cache the online email system's address book and synchronize it with Retain. This is critical for administration, authentication, and archiving purposes. It is recommended to cache the Address Book once every 24 hours to keep the Retain storage system up to date. By default, maintenance is set to cache the Address Book once every 24 hours.

The Enable Authentication checkbox determines if end-user authentication is performed when the user logs into Retain. If it is deselected, the Retain system cannot authenticate the user against the email system and the user cannot log in unless another authentication method is enabled.

The Enable Jobs checkbox determines if configured data retrieval jobs are ever passed to the Worker. Even if the individual job is fully configured and enabled, if this checkbox is switched off, no jobs configured for this module can be processed.

The Message Body option lets the administrator store the HTML message body, the plain-text message body, or both.

Impersonation Tab

If the Impersonation and Core Settings tabs are not completely configured with the correct information, the hosted system cannot be archived correctly.

Enter the Impersonation user credentials.

Hosted Services Tab

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab

Table 2-26 Using the Hosted Services Tab

| Panels and Sub-panels | Information and/or Action |
|---|---|
| Hosted Services | See Hosted Services Panel. |
| Address Book Discovery | See Address Book Discovery Sub-panel. |
| Mailbox Archive Authentication | See Mailbox Archiving Authentication Sub-panel. |
| Credentials | See Credentials Sub-panel. |

Hosted Services Panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel

Table 2-27 Using the Hosted Services Panel

| Option, Field, or Sub-panel | Information and/or Action |
|---|---|
| I am using a Hosted Exchange system option | Enable this option if you use hosted Exchange services. If you select this option, Retain ignores the Exchange Forest, User Forests, and Delegates tabs. |
| Office 365 option | Select this if you use Office 365. When you select this option, the Hosted Services tab panel expands to let you specify settings for address-book discovery, mailbox archiving authentication methods, and the credentials required for connecting with Office 365. |
| Hosted Exchange without LDAP option | Select this if you use a hosted Exchange service that doesn’t utilize LDAP directory services. Selecting this requires that you import your address book from a PowerShell-generated CSV file. See Synchronizing the Address Book Using PowerShell Scripts. |

Address Book Discovery Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Address Book Discovery Sub-panel

Table 2-28 Using the Address Book Discovery Sub-panel

| Option, Field, or Sub-panel | Information and/or Action |
|---|---|
| Import from CSV file option | Path: Specify the path where the PowerShell Sync script saves the CSV user lists, for example C:\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\cfg. This requires the procedures described in Synchronizing the Address Book Using PowerShell Scripts. |
| Populate from Office 365 option | Select this option to populate your archived address book directly from Office 365. This requires the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI). |

Mailbox Archiving Authentication Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Mailbox Archiving Authentication Sub-panel

Table 2-29 Using the Mailbox Archiving Authentication Sub-panel

| Option, Field, or Sub-panel | Information and/or Action |
|---|---|
| Use OAuth | This indicates that OAuth is the authentication method that this module uses for mailbox archiving. |

Credentials Sub-panel

Path: Retain Server Manager > Configuration > Module Configuration > Exchange > Hosted Services Tab > Hosted Services Panel > Credentials Sub-panel

Table 2-30 Using the Credentials Sub-panel

| Option, Field, or Sub-panel | Information and/or Action |
|---|---|
| Tenant ID field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI). |
| Client ID field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI). |
| Client Secret field | This information is exposed when you complete the procedures described in Synchronizing the Address Book Using Office 365 (Microsoft GraphAPI). |
| Test Connection button | Click this to verify that the credentials you have entered are valid with your Office 365 system. |

Refresh Address Book

After saving changes, return to the Retain Server's Module Configuration page, and trigger a refresh of the Address Book.

Depending on the size of the address book, it may take several minutes to return with information, but a successful configuration returns a correct address book cache date and no errors. The date should reflect the date when the address book refresh was triggered.

The Status may show “Address Book Cache Never Run” or may list commonly misconfigured or missed items if the Refresh job fails.

Once the status is configured and the Address Book has been cached, Retain can connect to and archive messages from the Exchange server. The system is ready to have workers, schedules, profiles, and jobs configured, and those options now appear on the main administrative interface.

The Address Book is refreshed whenever the button is pressed, during the nightly maintenance cycle, and before each job.

Exchange with Access to Active Directory

The Exchange module can connect to On-Premise Exchange servers.

Supported Exchange Forest Configurations

Retain supports:

  • A single-forest Active Directory system (Exchange and standard users)

  • An Exchange Resource Forest (one Exchange Forest linked to one or multiple User Forests)

Retain does NOT support multiple linked Exchange Forests. Ensure that the Exchange Settings have been configured correctly before continuing the Exchange module setup.

Exchange Prerequisites required for Retain

There are several prerequisites that need to be done in Exchange for Retain to successfully archive the mailbox databases:

  • A mailbox user with ApplicationImpersonation rights

  • Basic Authentication enabled for Autodiscover and EWS on all Client Access Servers

  • A DNS SRV record

  • Set the DNS used by the Retain server to be the same as used by Exchange.

  • Set a Rolling In-Place Hold to retain data until Retain can archive it.

  • If "Configure email forwarding for a mailbox" is in use, enable "Deliver messages to both forwarding address and mailbox". Otherwise no messages are stored in Exchange, and Retain cannot archive any messages.

Create the Retain Global Catalog User

To connect with Exchange, Retain needs a user with appropriate rights. You can use an existing user or create a new one; creating a new user for Retain archiving is recommended. If you create a new user, ensure that the account is active and that its password does not change, so that Retain can continue to access mail without changes to its settings. This user is sometimes called a ‘service account’. Retain calls this user the ‘global catalog user’.

The user created or used for Retain must be a “mailbox-enabled user” with read access to see all other users, groups, resources, and Exchange Servers in the Exchange Forest. The user is utilized by both the Retain Server and the Worker for LDAP lookups in Active Directory. The Retain user also must have Exchange impersonation rights to every mailbox user on every server in the organization to be archived. The Retain user MUST NOT be a member of any Exchange Administrator group, as Exchange denies impersonation rights for all administrator accounts.

Additional permissions need to be added to the user created for Retain. The quickest way to add these rights is through the Exchange Management Shell.

After creating the new user in Active Directory, open the Exchange Management Shell.

Grant Impersonation Permissions to the Retain user.

In Exchange 2013 and 2016 Impersonation permissions can be granted in the Exchange Admin Center under Permissions.

Under Admin Roles create a new role (e.g. Retain Impersonation Management). Add the role "ApplicationImpersonation" and add the Retain User as a member.

You can also accomplish this via PowerShell commands using the Exchange Management Shell.

The commands required differ depending on the version of Exchange Server. Exchange 2010, 2013, and 2016 require only one command per Exchange system, whereas Exchange 2007 requires the commands to be run on every Exchange server in the system to grant the required permissions. If the Exchange system contains a mix of 2007, 2010, and 2013 servers, the commands for each version must be run on one server of that type.

Exchange 2010, 2013, and 2016 commands

For Exchange 2010, 2013, and 2016 the only command necessary for impersonation permissions is:

New-ManagementRoleAssignment -Name ImpersonationAssignmentName -Role ApplicationImpersonation -User ServiceAccount

Where the ‘Name’ is a name chosen by the administrator and the ‘ServiceAccount’ is the name of the Retain user.

For Example:

New-ManagementRoleAssignment -Name impersonation-retain -Role ApplicationImpersonation -User Retain

If additional Exchange servers are added to the system after running this command to grant rights to the ‘retain’ user, the command must be run again to grant rights to the new server.

Room and Equipment Resources

To archive Room and Equipment Resources, or to restore them, the Retain user, or Service Account, must also have delegation rights. These commands must be issued manually for each Room and Equipment or resource mailbox on every relevant server. This is required for 2013 and 2016.

These commands must be issued:

In these commands, ‘Retain’ is the name of the Service Account (the Retain user), and ‘Mailbox Database’ should be replaced with the appropriate database name.

NOTE: Every time a new Room and Equipment or resource mailbox is added, the first command must be rerun.

Exchange 2013 and 2016 PowerShell commands

Get-Mailbox -ResultSize Unlimited -Database "Mailbox Database" | Add-MailboxPermission -User "Retain" -AccessRights FullAccess

Add-ADPermission -Identity "Mailbox Database" -User "Retain" -ExtendedRights Receive-As

Add-ADPermission -Identity "Mailbox Database" -User "Retain" -ExtendedRights Send-As

Basic Authentication

Retain requires Basic Authentication to be enabled on each CAS Exchange server in the system for Autodiscover and EWS.

In Exchange Admin Center, go to Servers, then go to the Virtual Directories tab.

  1. Edit Autodiscover and under Authentication enable Basic authentication if it is not enabled.

  2. Edit EWS and under Authentication enable Basic authentication if it is not enabled.

  3. Do this for each server in the list.

To check if this worked, run the following PowerShell cmdlets:

For EWS:

Get-WebServicesVirtualDirectory | ft server,basicauthentication

For Autodiscover:

Get-AutoDiscoverVirtualDirectory | ft server,basicauthentication

On Exchange systems prior to 2013, you may need to set Basic authentication manually.

Open “Server Manager” on the Exchange server.

  1. In the left pane, expand “Roles”, expand “Web Server (IIS)”, and select “Internet Information Services (IIS) Manager”.

  2. In the “Connections” pane that opens, expand your Exchange server object, expand “Sites”, expand “Default Web Site (Multiple Protocols)”, and select “EWS”.

  3. Under the “IIS” heading, open the “Authentication” icon.

  4. Select “Basic Authentication” and click “Enable” in the right pane.

    You can now close “Server Manager”.

DNS SRV Record

Microsoft has an article describing how to set up a DNS SRV record titled "A new feature is available that enables Outlook 2007 to use DNS Service Location (SRV) records to locate the Exchange Autodiscover service".

In general, you must:

  1. Open the DNS Manager.

  2. Expand Forward Lookup Zones.

  3. Locate and right-click on the external DNS zone and choose Other New Records.

  4. Click Service Location (SRV) and enter:

    Service: _autodiscover
    Protocol: _tcp
    Port Number: 443
    Host: [your mail host, e.g. mail.gwava.net, usually the AD domain forest found in AD Domains and Trusts on the MS AD server]

  5. Click OK.

The Microsoft autodiscover library in Retain expects a URL of the form https://autodiscover.[your domain]/Autodiscover/Autodiscover.xml (for example, https://autodiscover.xyzcompany.com/Autodiscover/Autodiscover.xml). To see the URL that Retain actually found, search the worker log for "Discovered endpoint:" or "AutoDiscover" at the point where the worker attempts to log in.
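As a quick aid when checking DNS, the following Python sketch derives the SRV record name and the expected Autodiscover URL for a mail domain, using the values given above. The function name `autodiscover_names` is hypothetical and not part of Retain, and the sketch only builds the names; it does not query DNS or contact Exchange.

```python
def autodiscover_names(email_address: str) -> dict:
    """Build the SRV record name and Autodiscover URL for an address's domain."""
    if "@" not in email_address:
        raise ValueError("expected an address of the form user@domain")
    domain = email_address.rsplit("@", 1)[1].strip().lower()
    if not domain:
        raise ValueError("the address has no domain part")
    return {
        # Service "_autodiscover" over protocol "_tcp", as entered in DNS Manager.
        "srv_record": f"_autodiscover._tcp.{domain}",
        # URL pattern that the autodiscover library expects.
        "url": f"https://autodiscover.{domain}/Autodiscover/Autodiscover.xml",
    }


names = autodiscover_names("jdoe@xyzcompany.com")
print(names["srv_record"])
print(names["url"])
```

Comparing the printed URL with the "Discovered endpoint:" entry in the worker log can help show whether autodiscover resolved to the host you expected.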

Server DNS Setting

Retain performs best when the server's network setting is using the same DNS as the Exchange servers.

If Retain and Exchange must use different DNS servers, create a Conditional Forwarder on the DNS server that Retain uses so that it resolves to the Exchange server.

Set Rolling In-Place Hold

To prevent data loss, it is highly recommended that a rolling In-Place or Litigation Hold be set so users are unable to remove items from disk before Retain has a chance to archive them.

In Exchange, by default, when a user deletes a message in Outlook, it is moved to the trash. When the user empties the trash, the item is moved to the mostly hidden Recoverable Items folder, where it is kept for 14 days before being removed from disk. The user can right-click the Trash to recover items, and in that dialog box can purge an item, which deletes it immediately. With a hold in place, a purged item is instead moved to a Purged folder that is not user accessible, where it is kept until the hold is lifted.

In Retain, set Profile/Miscellaneous to Include the user's recoverable items.

In the Exchange Admin Console, set up a distribution list, for example All_Mailboxes, that contains all mailboxes. It is best to create a policy that adds new users to this distribution list by default.

Place the distribution list under a 90-day hold.

In the Exchange Management Shell:

An In-Place Hold can be set up for all mailboxes for 90 days:

New-MailboxSearch "Retain90DayHold" -ItemHoldPeriod 90 -InPlaceHoldEnabled $true -SourceMailboxes All_Mailboxes

It takes time for the hold to take effect. You can determine how many mailboxes have been placed under hold with the command:

((Get-Mailbox).InPlaceHolds).Count

Configure an Exchange Module for On-Premise Exchange

The Exchange module must be configured in the Retain Server before any communication between Retain and an existing Exchange message system can occur. Open the Retain management page on the Retain Server, and select Module Configuration.

Select the ‘Configure’ option in the Exchange module. A new window or tab opens with the module configuration.

NOTE:Ensure that your Retain Server and your Exchange server use the same DNS service. The Exchange module uses these DNS settings to auto discover critical information about Exchange that is stored in Active Directory and cannot function correctly unless both systems use the same DNS server.

Core Settings

The module needs to be enabled on this page to make it active in the Retain system.

The module can be given a name.

The Send Method lets you send Exchange items to an external system using FTP or SMTP. In most cases this should be disabled so that items are archived in Retain. To select the SMTP Forwarding or FTP features, you must first add and configure them in the Module Forwarding Tab on the Server Configuration page, otherwise the drop-down list is empty.

Normally all the checkbox options on this tab are enabled. It is rare that you would ever deselect any of them. Two cases where you might are troubleshooting (as instructed by Technical Support) and retiring an old email system.

The Enable Address Book Caching function allows Retain to regularly cache the online email system's address book and synchronize it with Retain. This is critical for administration, authentication, and archiving purposes. It is recommended to cache the Address Book once every 24 hours to keep the Retain storage system up to date. By default, maintenance is set to cache the Address Book once every 24 hours.

The Enable Authentication checkbox determines whether end-user authentication is performed when the user logs into Retain. If it is deselected, the Retain system cannot authenticate the user against the email system and the user cannot log in unless another authentication method is enabled.

The Enable Jobs option determines whether configured data retrieval jobs are passed to the Worker. Even if an individual job is fully configured and enabled, no jobs configured for this module can be processed while this option is disabled.

The Message Body option lets the administrator store the HTML message body, the plain-text message body, or both.

Impersonation

If the Impersonation and Core Settings tabs are not completely configured with the correct information, the Exchange system cannot be archived correctly.

Hosted Services

This tab is not used with an On-premise Exchange system.

Exchange Forest

Retain needs to know where to access the Global Catalog Host and existing domains before any archiving can be accomplished.

Open the “Exchange Forest” tab and enter the IP address or hostname of the Global Catalog Host.

Click on the Green Plus sign to add a search base. This should be set to the highest level of the LDAP domain so the entire address book can be found. For example: DC=exchange2013,DC=qa,DC=gwava,DC=com
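To illustrate how a top-level search base relates to a domain name, the following Python sketch converts a DNS domain into the DC= form shown in the example above. The function name `domain_to_search_base` is hypothetical and not part of Retain, and the sketch assumes that the forest root follows the DNS name; confirm the actual distinguished name in Active Directory before using it.

```python
def domain_to_search_base(dns_domain: str) -> str:
    """Convert a DNS domain name into an LDAP search base of DC= components."""
    labels = [label for label in dns_domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("domain name is empty")
    return ",".join(f"DC={label}" for label in labels)


# Uses the example domain from the text above.
print(domain_to_search_base("exchange2013.qa.gwava.com"))
# → DC=exchange2013,DC=qa,DC=gwava,DC=com
```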

Retain uses Active Directory extensively when integrating with Exchange. Its uses include: populating the address book, authentication, and access to the Exchange System.

Some settings are required in Exchange; see Exchange Prerequisites required for Retain.

On the Exchange Forest tab, you configure all the Active Directory information you need for an Exchange forest. There is no need to fill out any information on the User Forest tab unless the users exist in a separate forest from the Exchange Forest.

On the Exchange Forest tab, specify whether to use SSL for the Global Catalog Security and the search base. Using SSL for both is highly recommended. The search base is the LDAP path where Retain starts searching for valid Exchange users.

The default Global Catalog Port depends on whether SSL is used: 3268 for plain text and 3269 for SSL. Adjust the port as appropriate for your system.

You must also provide the credentials of an Active Directory user. This user is special: it must have full read rights to Active Directory, be a mailbox-enabled user, and be granted various Impersonation and Delegation rights. These requirements are discussed in Exchange Prerequisites required for Retain. The username must be in UPN (user principal name) format.

This search base, in LDAP form, must be high enough in the tree to include ALL users, groups, and servers. Multiple search bases can be specified, though it often results in a less efficient interface. These are LDAP search bases which allow Retain to resolve all users, groups, and servers of interest in the forest.

After the Search Base has been added, test the connection to confirm that the information is correct and that the connection works. The test performs a simple login to confirm that the user exists, the Exchange Server is reachable, and that the credentials are accepted. The test does not confirm impersonation or delegation rights necessary for the Service Account.

If the test results in an error stating “FAILURE: User doesn't exist or is not mail enabled”, it indicates that the user’s mailbox is unavailable. A mailbox is not required for Retain to utilize the specified user. If the user Retain utilizes does not have a mailbox, this error may be ignored. However, if the user specified does have a mailbox, this may indicate connection issues.

If the test results in an error with LDAP error code 49, it is an authentication error. The important information is the value that follows the data field in the error message; it is the sub-code that identifies the specific cause:

  • 525 user not found

  • 52e invalid credentials

  • 530 not permitted to logon at this time

  • 531 not permitted to logon at this workstation

  • 532 password expired

  • 533 account disabled

  • 701 account expired

  • 773 user must reset password

  • 775 user account locked
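When reading a long error string, it can help to pull out the value after the data field and look it up in the list above. The Python sketch below does that. The function name `explain_bind_error` is hypothetical, and the sample message is only an illustration of the general shape of an Active Directory bind error; the exact text in your environment may differ.

```python
import re

# Sub-codes from the list above, keyed by the value that follows "data".
AD_BIND_SUBCODES = {
    "525": "user not found",
    "52e": "invalid credentials",
    "530": "not permitted to logon at this time",
    "531": "not permitted to logon at this workstation",
    "532": "password expired",
    "533": "account disabled",
    "701": "account expired",
    "773": "user must reset password",
    "775": "user account locked",
}

# Matches the hexadecimal value that follows the word "data".
_DATA_FIELD = re.compile(r"\bdata\s+([0-9a-fA-F]+)")


def explain_bind_error(message: str) -> str:
    """Return the meaning of the sub-code found after the data field."""
    match = _DATA_FIELD.search(message)
    if match is None:
        return "no data field found"
    code = match.group(1).lower()
    return AD_BIND_SUBCODES.get(code, f"unrecognized sub-code {code}")


# Illustrative message only; not copied from a Retain log.
sample = (
    "LDAP: error code 49 - 80090308: LdapErr: DSID-0C0903A9, "
    "comment: AcceptSecurityContext error, data 52e, v1db1"
)
print(explain_bind_error(sample))  # → invalid credentials
```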

The Exchange Forest tab is the only tab required by the Server and the Worker to archive mail from the Exchange system. The User Forest tab, however, is required for Exchange systems utilizing a resource forest, to allow the end user to log in to Retain.

If the system contains a Resource Forest, enable the checkbox on the Exchange Forest tab and save changes. If the Resource Forest checkbox is not enabled, the User Forests tab is non-functional and all settings contained on that tab are ignored. The checkbox must be unchecked in a single forest Active Directory deployment, and it must be checked in a multiple forest Active Directory deployment.

Check all information to ensure that it is correct and save changes, and then configure the User Forest if required.

User Forest

The User Forest must have an entry for each user forest attached to the system.

Select the green ‘+’ button and enter the LDAP information for the forest’s Global Catalog server: IP address or hostname, port, security (SSL is strongly recommended), and all search bases needed to include all the users. No administrative credentials are required; each end user’s own credentials are used at login.

Delegates

You can set Retain to use delegate rights with On-Premise Exchange.

Finishing On-Premise Exchange

Save all changes before closing the Exchange Module page.

Refresh Address Book

After saving changes, return to the Retain Server's Module Configuration page, and trigger a refresh of the Address Book.

Depending on the size of the address book, it may take several minutes to return with information, but a successful configuration returns a correct address book cache date and no errors. The date should reflect the date when the address book refresh was triggered.

The Status may show “Address Book Cache Never Run” or may list commonly misconfigured or missed items if the Refresh job fails.

Once the status is configured and the Address Book has been cached, Retain can connect to and archive messages from the Exchange server. The system is ready to have workers, schedules, profiles, and jobs configured, and those options now appear on the main administrative interface.

The Address Book is refreshed whenever the button is pressed, during the nightly maintenance cycle, and before each job.

Exchange Distribution Lists

You can create distribution lists in the Exchange Admin Center to manage information dissemination. Retain queries Exchange for a list of users in each distribution list. Although you can create a distribution list in Active Directory Users and Computers, such changes are not reflected in Exchange, so Retain is unaware of them. If you want to rename a distribution group, you must do so in Exchange or Retain will not see the change.

Distribution lists can be hidden in Exchange. If a distribution list is hidden, Retain cannot see the users associated with the distribution list and cannot archive the distribution list. The distribution list will be marked as (hidden) in Job | Mailboxes | Distribution Lists.

Dynamic Distribution Lists cannot be seen by Retain because they only create a user list when a message is sent. Remember to refresh the address book if you want to see the latest list changes.

Shared Mailboxes, Rooms and Equipment

Impersonation rights allow the Retain user to enter other mailboxes but those rights do not extend to shared mailboxes. To access a shared mailbox the Retain user would need delegate rights to each shared mailbox that is to be archived. These rights can be granted through the Exchange Management Shell.

If the shared mailbox is owned by “John Doe” and your Retain impersonation account is "Retain", you would issue the following command in an Exchange Management Shell (EMS):

Add-MailboxPermission -Identity "John Doe" -User Retain -AccessRights FullAccess -InheritanceType All -AutoMapping $false

Exchange Message Dredging Process Overview

How does Retain get messages from Exchange?

  1. When a job starts, the Retain Worker queries the DNS for the SCP record to the URL of the Active Directory Global Catalog Host.

  2. Then the worker queries Active Directory for the Autodiscover SCP Records and Active Directory returns the Autodiscover URLs. The URLs tell Retain where to connect to autodiscover. There are also some default autodiscover URLs that Retain uses to connect to autodiscover.

  3. Retain then uses autodiscover to connect to the Client Access Server. It is helpful to have an autodiscover SRV record on the DNS to speed up this process.

  4. Once Retain has connected to the Client Access Server (CAS), the CAS uses EWS to connect Retain to the correct Mailbox Server.

  5. Retain uses the impersonation user credentials to enter the mailbox of the user we are attempting to dredge messages from. Retain queries Exchange for messages that meet the criteria set in the job.

  6. Exchange then serves the oldest message that meets the criteria back to the Retain Worker through EWS on the CAS.

  7. The Retain Worker receives the message, opens it, and asks the Retain Server whether the message body or attachments already exist.

    1. If the Retain Server determines that the message is new, the body and attachments are stored in the archive, the header information and hash are saved in the database with links to the archive, and the contents of the message are indexed.

    2. If the message already exists, the database is updated with the header data and linked to the existing data, and the existing message body or attachment is dropped by the worker and the next message is retrieved from the email system.
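The decision in step 7 can be pictured with a small in-memory model. This is only a sketch of the idea of storing each body or attachment once and linking every message to it. The class and method names are hypothetical, SHA-256 is an assumed hash function, and Retain's actual storage, hashing, and indexing are not described in this section.

```python
import hashlib


class ArchiveSketch:
    """Toy model: content is stored once, headers always get a new record."""

    def __init__(self):
        self.blobs = {}    # content hash -> stored body or attachment bytes
        self.headers = []  # one (header, [content hashes]) entry per message

    def store_message(self, header: str, parts: list) -> int:
        """Record a message and return how many parts were newly stored."""
        hashes = []
        new_parts = 0
        for part in parts:
            digest = hashlib.sha256(part).hexdigest()
            if digest not in self.blobs:
                # New content: keep it in the archive.
                self.blobs[digest] = part
                new_parts += 1
            # Known content is dropped; only the link to it is kept.
            hashes.append(digest)
        self.headers.append((header, hashes))
        return new_parts


archive = ArchiveSketch()
attachment = b"quarterly-report"
print(archive.store_message("From: a@example.com", [b"body one", attachment]))
print(archive.store_message("From: b@example.com", [b"body two", attachment]))
print(len(archive.blobs))
```

In this model the second message stores only its own body, because the shared attachment is already present and is simply linked.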

Troubleshooting Exchange Performance

In general, we have found that acceptable throughput is in the range of 3-5 messages per second. In well-designed systems with sufficient hardware resources, we have seen throughput above 10 messages per second. There is definitely an issue if the throughput is less than 3 messages per second, and we have seen instances of less than 0.1. The first place to look is the worker log.
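To compare a finished job against these figures, divide the number of messages archived by the elapsed time. The Python sketch below does this and applies the 3 messages per second threshold from the paragraph above. The function names are hypothetical, and the message count and duration must come from your own job statistics.

```python
def messages_per_second(message_count: int, elapsed_seconds: float) -> float:
    """Average archiving rate for a job."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return message_count / elapsed_seconds


def assess_throughput(rate: float) -> str:
    """Classify a rate using the 3 messages per second threshold."""
    # Below 3 messages per second is treated as a problem worth investigating.
    if rate < 3:
        return "investigate"
    return "acceptable"


# Hypothetical job: 18,000 messages archived in one hour.
rate = messages_per_second(18000, 3600)
print(rate, assess_throughput(rate))
```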

Mailbox Delays

The goal is to measure how long Retain takes to log in to each mailbox: the time between the enterMailbox entry and the Discovered endpoint entry, which shows that Retain has entered the mailbox.

Search the log for lines containing:

enterMailbox
Discovered endpoint

Compare the timestamps of these two lines. The difference should be less than 2 seconds. If it is significantly longer, the most likely cause is that DNS is not properly serving autodiscover.

2015-09-25 12:00:07,256 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.caapi.MailboxArchivingStats: enterMailbox: JDoe@RETAIN.GWAVAUTAH
2015-09-25 12:02:14,177 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.process.ExchangeUser: Discovered endpoint: https://ad.test.sys/ews/exchange.asmx

In this example, more than two minutes pass between the two entries, which indicates an issue with how autodiscover is configured in the DNS. It may need an SCP or SRV record.
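For a large worker log, the comparison can be automated. The following Python sketch pairs each enterMailbox entry with the next Discovered endpoint entry and reports the delay in seconds. It assumes the timestamp layout of the sample lines above and assumes that both entries for a mailbox are written by the same thread (the name in square brackets); the function names are hypothetical.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_timestamp(line: str) -> datetime:
    # The first 23 characters hold the date, time, and milliseconds.
    return datetime.strptime(line[:23], TIMESTAMP_FORMAT)


def thread_name(line: str) -> str:
    # The thread name is the first value in square brackets.
    return line.split("[", 1)[1].split("]", 1)[0]


def mailbox_login_delays(lines):
    """Yield (mailbox, seconds) for each enterMailbox / Discovered endpoint pair."""
    pending = {}  # thread name -> (mailbox, time of the enterMailbox entry)
    for line in lines:
        if "enterMailbox:" in line:
            mailbox = line.rsplit("enterMailbox:", 1)[1].strip()
            pending[thread_name(line)] = (mailbox, parse_timestamp(line))
        elif "Discovered endpoint:" in line and thread_name(line) in pending:
            mailbox, started = pending.pop(thread_name(line))
            yield mailbox, (parse_timestamp(line) - started).total_seconds()


SAMPLE_LOG = [
    "2015-09-25 12:00:07,256 TRACE [RTWQuartzScheduler_Archive_Worker-1] "
    "com.gwava.caapi.MailboxArchivingStats: enterMailbox: JDoe@RETAIN.GWAVAUTAH",
    "2015-09-25 12:02:14,177 DEBUG [RTWQuartzScheduler_Archive_Worker-1] "
    "com.gwava.ews.archiveimpl.process.ExchangeUser: "
    "Discovered endpoint: https://ad.test.sys/ews/exchange.asmx",
]

for mailbox, seconds in mailbox_login_delays(SAMPLE_LOG):
    flag = "SLOW" if seconds > 2 else "ok"
    print(f"{mailbox}: {seconds:.3f} s ({flag})")
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG`; lines that contain neither marker are ignored.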

Message Delays

Also search for connection failures and retries. The retry delay increases after each failure, and the delays can add up to 4 minutes. Search the log for:

Software caused connection abort: recv failed

EWS request failed: null. Will retry after

2015-07-22 00:25:25,056 TRACE [Thread-1341102] com.gwava.ews.RetainExchangeWebserviceFactory: retry, exception :
javax.xml.ws.WebServiceException: java.net.SocketException: Software caused connection abort: recv failed
at com.sun.xml.ws.transport.http.client.HttpClientTransport.readResponseCodeAndMessage(Unknown Source)
...
at com.gwava.ews.archiveimpl.process.CursorFetchThread.run(CursorFetchThread.java:1334)
Caused by: java.net.SocketException: Software caused connection abort: recv failed
at java.net.SocketInputStream.socketRead0(Native Method)
...
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
... 27 more
2015-07-22 00:25:25,056 DEBUG [Thread-1341102] com.gwava.ews.RetainExchangeWebserviceFactory: EWS request failed: null. Will retry after 2 seconds

The request is retried a few times with longer delays until it aborts. Here the connection to the Exchange server is lost while Retain is already in a mailbox. This can indicate either a problem with a message attachment or that the web server on the Exchange or CAS server is unable to serve the item at this time. Go to the message in Outlook or OWA and see if it can be accessed.

If the message can be accessed successfully, export it as a .pst and use the PST Importer to bring it into Retain.

If the message cannot be accessed successfully then it must be deleted.
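To see how much time a job lost to retries, you can total the delays reported in the "Will retry after" entries. The Python sketch below assumes the wording of the sample log line shown earlier in this section. The function name is hypothetical, and only the first sample line comes from that log; the other two are made-up entries that illustrate increasing delays.

```python
import re

# Captures the delay from entries such as
# "EWS request failed: null. Will retry after 2 seconds".
RETRY_LINE = re.compile(r"EWS request failed: .*? Will retry after (\d+) seconds?")


def summarize_retries(lines):
    """Return (number of retries, total seconds spent waiting between them)."""
    delays = []
    for line in lines:
        match = RETRY_LINE.search(line)
        if match:
            delays.append(int(match.group(1)))
    return len(delays), sum(delays)


SAMPLE_LOG = [
    "2015-07-22 00:25:25,056 DEBUG [Thread-1341102] "
    "com.gwava.ews.RetainExchangeWebserviceFactory: "
    "EWS request failed: null. Will retry after 2 seconds",
    # Hypothetical follow-up entries, not taken from a real log.
    "2015-07-22 00:25:27,310 DEBUG [Thread-1341102] "
    "com.gwava.ews.RetainExchangeWebserviceFactory: "
    "EWS request failed: null. Will retry after 4 seconds",
    "2015-07-22 00:25:31,642 DEBUG [Thread-1341102] "
    "com.gwava.ews.RetainExchangeWebserviceFactory: "
    "EWS request failed: null. Will retry after 8 seconds",
]

count, waited = summarize_retries(SAMPLE_LOG)
print(f"{count} retries, {waited} seconds of retry delay")
```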

Exchange Health

You may also want to check the health of the Exchange server itself.

Performance Monitor

First, check the performance of the server in Performance Monitor to see whether CPU, memory, disk, or network utilization is above 80%. If any of them is consistently high, use the various server health, monitoring, and performance cmdlets to pinpoint the issue.

Queues

Next, check the queues. The mail queues are how Exchange handles mail. You can see them in Exchange Toolbox/Queue Viewer. The number of messages in the queues should be low. If a queue holds hundreds or thousands of messages that are not being cleared, that queue may have a stuck message, which would need to be cleared.

You can also use the Exchange Management Shell (EMS) to check the status of the queues.

Get-Queue

Mailboxes

Also check the mailboxes. Performance can degrade if a mailbox has too many messages (~100k). The number of messages is more important than the size of the messages. For large systems, pipe the output to a file, because the output of this command can exceed the EMS buffer.

Get-Mailbox | Get-MailboxStatistics > c:\mailboxstat.txt

If there is a specific mailbox with issues you may need to repair the mailbox.

Server Health

You can get a quick overview of an Exchange server's health by running this EMS cmdlet:

Get-ServerHealth -Identity server1 | Sort-Object AlertValue | ft Name, AlertValue

Exchange Throttling Policy and Bandwidth/Performance (2013)

Microsoft Exchange 2013 uses client throttling policies by default to track bandwidth for each Microsoft Exchange user and enforce bandwidth limits as necessary. Throttling policies should be turned off for the Retain Service Account, because they can affect the performance of Retain for Exchange when accessing mailboxes with a large number of folders and mail items.

  1. On a computer that hosts the Microsoft Exchange Management Shell, open the Microsoft Exchange Management Shell.

  2. Type these commands:

    a. New-ThrottlingPolicy [give it a policy name of your choosing]

    b. Set-ThrottlingPolicy [policy name from step "a"] -RCAMaxConcurrency Unlimited -EWSMaxConcurrency Unlimited -EWSMaxSubscriptions Unlimited -CPAMaxConcurrency Unlimited -EwsCutoffBalance Unlimited -EwsMaxBurst Unlimited -EwsRechargeRate Unlimited

    c. Set-Mailbox [Retain impersonation user account] -ThrottlingPolicy [policy name from step "a"]

  3. To check the policy run the command: Get-ThrottlingPolicy -Identity [policy name from step "a"] | Format-List

Exchange Throttling Policy and Bandwidth/Performance (2010)

If Retain reports an error related to throttling, either a throttling policy is applied or the Exchange server is busy. Microsoft Exchange 2010 uses client throttling policies by default to track bandwidth for each Microsoft Exchange user and enforce bandwidth limits as necessary. Throttling policies should be turned off for the Retain Service Account, because they can affect the performance of Retain for Exchange when accessing mailboxes with a large number of folders and mail items.

  1. On a computer that hosts the Microsoft Exchange Management Shell, open the Microsoft Exchange Management Shell. Find out the default Throttling Policy: Get-ThrottlingPolicy

  2. Type these commands:

    1. New-ThrottlingPolicy [give it a policy name of your choosing] -RCAMaxConcurrency $null -RCAPercentTimeInAD $null -RCAPercentTimeInCAS $null -RCAPercentTimeInMailboxRPC $null -EWSMaxConcurrency $null -EWSPercentTimeInAD $null -EWSPercentTimeInCAS $null -EWSPercentTimeInMailboxRPC $null -EWSMaxSubscriptions $null -EWSFastSearchTimeoutInSeconds $null -EWSFindCountLimit $null

    2. Set-Mailbox [Retain impersonation user account] -ThrottlingPolicy [policy name from step "a"]

  3. Check the Throttling Policy for the "retain" impersonation user: Get-ThrottlingPolicy -Identity [policy name from step "a"] | Format-List

Exchange Journaling Mailbox

Using Exchange Journaling Mailbox is not recommended, but there are some situations where it is a viable option.

According to a Microsoft technician, Microsoft recommends at least one journaling mailbox per mail server. Exchange can only effectively support mailboxes under 5-10 GB, and it experiences performance issues when the Inbox exceeds 2500-5000 messages. http://blogs.technet.com/b/exchange/archive/2005/03/14/395229.aspx

This means that, once you enable a journaling mailbox, you should begin archiving its contents and use the Retain option to delete the items from the mailbox once they are archived. However, if there are delays in getting those journaling mailboxes archived, you should watch the size. If a mailbox reaches 5 GB, turn it off and re-route email to another journaling mailbox until you get all of them archived and emptied out.

  1. Set up a journal mailbox for each mailbox database.

  2. Journaling jobs should have their own Profile with the Scope set to "All messages (ignore date)" and Duplicate Check set to "Try to publish all message (SLOW)" to gather all messages from the beginning of the mailbox. This profile can be used for all journaling mailbox jobs.

  3. Under Job, "Enable Journaling" and "Delete archived items from journal" must be enabled (checked) so that the journaling mailbox is cleared during the job, and choose the journaling mailbox you want archived. Create a separate job for each journaling mailbox.

Important note: As Retain archives the journal mailbox it creates a list of messages to be deleted but only sends the delete request when it exits the mailbox. If the job fails before it exits then the messages won't be deleted. Limiting the scope of the job to allow Retain to finish the job successfully ensures that the messages are deleted.

Transitioning from Journaling to Rolling In-Place Hold for Exchange Archiving

There are changes you must make in Exchange and Retain for this transition to go as smoothly as possible.

Mandatory Exchange Tasks:

  1. Enable Rolling In-Place Hold. To test that the hold is properly enabled, delete an item in Outlook or OWA, then open the recoverable items dialog and attempt to purge the item. The item should end up in the Purges folder, which the user cannot see but Retain can. Run an archive job against the mailbox and confirm that the item appears in Search Messages in Retain. In Exchange 2010, enable Single Item Recovery, which allows you to set a rolling duration for holding deleted items.

    Get-Mailbox | Set-Mailbox -SingleItemRecoveryEnabled $true -RetainDeletedItemsFor 90

  2. Disable Journal Rule in Exchange. Once the rolling in-place hold is enabled, you can disable the journal rule in Exchange. https://technet.microsoft.com/en-us/library/bb124264%28v=exchg.141%29.aspx

Mandatory Retain Tasks:

  1. Keep the existing Retain journaling job and allow it to run until the journal mailbox is empty. If you are currently unable to archive your existing journal mailboxes because they have become too large for Exchange to manage, there are PowerShell scripts for transferring mail to another mailbox.

  2. Create New Profile. The primary option to enable is Profile/Miscellaneous/"Include user's recoverable items". With this option enabled, Retain will dredge each user's Recoverable Items folder and all items and folders inside it, except the logs found in the Audits subfolder.

  3. Create New Job(s). If you have multiple Exchange databases, we recommend one job per mailbox database and one worker per job so they can run in parallel. (Retain Technical Support has a PowerShell 4.0 script to make this easier.)

Large Attachments and/or Messages Cannot Be Archived From Exchange

Symptoms you may notice when experiencing problems with default IIS limitations:

  • Retention is turned on in GroupWise and messages up to a certain date can't be deleted.

  • Errors on retrieving attachments show in the Worker log.

  • Messages appear in Retain without all of their attachments.

  • You may also have difficulty getting larger exports through the web interface (exports larger than 28.6 MB).

  • When logging is set to diagnostic for the Worker you can see errors like this:

15:15:15,668 RetainServerCommunication - Attempt to connect, but Server returned HTTP status (404): Not found
(this line is typically repeated several times over the course of 5 minutes)
15:15:15,668 RetainServerCommunication - Giving up...too many retries!
15:15:15,668 ArchiveAttachment - Send a nice healthy blob:Archive: ERROR: Fatal Error Result=AddedEMails: 0, emailID=null, parentID=null
15:15:15,691 JobUtilities - HandleArchivingException

*Note: IIS is not supported by GWAVA. These are suggested methods for allowing Retain to archive large emails through IIS. For further information, visit the Microsoft support pages: http://www.iis.net/configreference/system.webserver/security/requestfiltering/requestlimits Some other useful information can also be found on the IIS forums: http://forums.iis.net/t/1066272.aspx

This may be less of an issue in Retain installations that were created with 3.x and later. By default, the RetainWorker now communicates directly with the RetainServer on port 48080, thereby bypassing IIS. To change this for an older installation, change the connection address of the worker. See the manual (look up "Worker Configuration") for your particular installation for more information. This can still be an issue on your Exchange server when Retain tries to collect from it, if there are messages or message attachments that are larger than IIS is set to allow through. This is a setting on the Exchange side that would need to be changed. The default is 30000000 bytes.

For getting exports out of Retain you can also choose to bypass IIS and use http://(RetainIP):48080/RetainServer. IIS integration is more of a convenience to point users at Retain so that you don't have to deal with port information in a URL and other advantages that this can provide.

IIS, by default, limits the amount of data that can be imported by Retain. You can remove, or at least mitigate, this limitation by changing 4 settings. This example assumes that you'd like to archive files up to approximately 954 MB (1000000000 bytes).

  1. You'll need to increase the limit on how much data the RetainWorker and RetainServer can push/pull through IIS. You can do that using the following command*:

    1. ** %windir%\system32\inetsrv\appcmd set config "Default Web Site/RetainWorker" -section:requestFiltering -requestLimits.maxAllowedContentLength:1000000000

    2. %windir%\system32\inetsrv\appcmd set config "Default Web Site/RetainServer" -section:requestFiltering -requestLimits.maxAllowedContentLength:1000000000

    3. Current testing indicates that you'll also have to do a blanket statement: %windir%\system32\inetsrv\appcmd set config -section:requestFiltering -requestLimits.maxAllowedContentLength:1000000000

      *Note: the number at the end of the command is the size you'd like to have as the max in bytes.

      **Note: the "Default Web Site/RetainWorker" piece may vary depending on your server setup. See the picture in the next section.

  2. If you don't like the command line, you can also change it through the IIS manager.

    1. Bring up the IIS manager and highlight "Default Web Site"

    2. Double click on "Configuration Editor" as shown in the figure above.

    3. Use the "Section" area drop down box to go to "requestFiltering" as shown in the following figure.

    4. Expand the "requestLimits" section. Change the "maxAllowedContentLength" shown in the next figure to the size (in bytes) you would like to be able to pass through.

    5. Repeat for both RetainServer and RetainWorker.

  3. You may also need to change the timeouts in IIS. To do this:

    1. Open the IIS manager.

    2. Highlight "Default Web Site".

    3. Click on "Limits"

    4. Change "Connection time-out (in seconds):" to the desired time.
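The maxAllowedContentLength value is given in bytes, so it helps to convert between bytes and megabytes before choosing a limit. The following is a small sketch of that arithmetic. It assumes binary megabytes (1 MB = 1024 × 1024 bytes), which is the convention that matches the 28.6 MB figure quoted for the 30000000-byte default. The function names are made up for this example.

```python
BYTES_PER_MB = 1024 * 1024  # binary megabyte (assumption, matches the 28.6 MB figure)

def bytes_to_mb(n_bytes):
    """Convert a byte count to megabytes."""
    return n_bytes / BYTES_PER_MB

def mb_to_bytes(megabytes):
    """Convert megabytes to a whole number of bytes for maxAllowedContentLength."""
    return int(megabytes * BYTES_PER_MB)

print(round(bytes_to_mb(30_000_000), 1))     # default limit -> 28.6
print(round(bytes_to_mb(1_000_000_000), 1))  # value used in the commands above -> 953.7
print(mb_to_bytes(500))                      # bytes for a 500 MB limit -> 524288000
```

For example, to allow items up to 500 MB you would pass 524288000 as the maxAllowedContentLength value.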

Moving Users to a New Exchange Domain

If you need to move your users to a new Exchange domain without changing their email addresses (for example from user@organization.local to user@organization.org), you will need to use the moveMailboxes tool to keep the users associated with their existing archive. Otherwise, a new archive will be created for all users.

Prerequisites
  • The new on-premise Exchange system must not have been archived by Retain before.

  • The users continue to use the same email address, though the UPN may be different.

Procedure
  1. In the Retain Web Console, go to the Exchange module and select configure.

  2. Under the Impersonation tab, enter the new impersonation user credentials.

  3. Under the Exchange Forest tab, reconfigure the settings to the new Exchange system.

  4. Click the Test Connection button to confirm the connection can be made.

  5. Save your changes.

  6. Return to the Module Configuration page and Refresh the Address Book by clicking the Refresh Address Book button. Wait for the refresh to complete.

  7. Open the RetainServer log and tail the log to watch progress of the tool. On Windows a utility program like baretail is useful for this.

  8. Open a new tab and enter the URL: http://<your Retain Server Address>/RetainServer/Util/moveMailboxes.jsp. The page will be blank.

  9. In the RetainServer log when the migration is complete, you will see the message "MoveMailboxes: mailboxes moved: [amount of mailboxes]. Process Complete."

  10. Re-index all messages. In the Retain Web Console, go to Server Configuration | Index and press the Re-index All Messages button. This may take significant time in larger systems and search will be limited as the re-index is going on.

  11. Once re-indexing is complete, archiving can resume normally.

When the users log in to Retain, they will see two folders: one with the mail from the original Exchange system and the other with mail from the new system. The two have different system IDs, so they cannot be combined seamlessly.

Exchange Concurrent Connection Limits

Jobs may fail with the error: 421 4.3.2 The maximum number of concurrent connections has exceeded a limit, closing transmission channel.

This can happen when Retain reaches the maximum inbound connections per source limit when connecting to Exchange. To resolve it, increase the MaxInboundConnectionPerSource parameter. See “Understanding message rate limits and throttling” for details.

Core Settings Tab (Exchange)

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Core Settings

Impersonation Tab (Exchange)

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Impersonation

Hosted Services Tab - Office 365 Settings

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Hosted Services > Office 365 option

Hosted Services Tab - Non-LDAP Exchange Settings

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Hosted Services > Hosted Exchange without LDAP option

Exchange Forest Tab (Exchange)

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Exchange Forest

User Forests Tab (Exchange)

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > User Forests

Delegates Tab (Exchange)

Path: Retain Server Manager > Configuration > Module Configuration > Exchange-Configure > Delegates

2.8.3 Setting an Exchange Schedule

If you have not already created one or more schedules for use with your Exchange Job, go to Creating Schedules and complete the task now.

2.8.4 Specifying an Exchange Profile

The job will need an Exchange profile set up to connect to the email system properly.

This requires that an appropriate module be configured as documented in one of the following sections:

After the Exchange Module has been configured, the Exchange Profile will be available for configuration. If an Exchange Profile is not configured, jobs cannot be run against the Exchange system.

Click on “Add Profile” and provide a profile name, or select an already existing profile to access the configuration tabs. All changes made on this page must be saved by selecting the “save changes” disk icon at the top right of the page. You can move between tabs without losing new settings, but moving to another page requires either saving or abandoning the changes made.

Skype for Business

With Office 365, Retain also archives Skype for Business conversations. They are saved to the Conversation History folder of the user.

Core Settings Tab

The core settings consist of an enabled/disabled option which must be enabled for any jobs based on this profile to archive anything.

Profile Functions

The Profile Functions tell the Retain Server what to do with the mail it archives from the messaging system. If Archiving is not enabled, mail will not be archived by Retain.

Archive Mark

Some users may opt to use the Archive Mark for messages that have been archived by Retain. Because the archive mark is a custom flag that can be modified, it is not secure and should not be used for compliance. Archive Mark slightly degrades job performance. Check the check box to enable Archive Mark for the selected profile.

When the Archive Mark is active, Retain creates a custom column for mail, called “RetainArchived” which users and administrators may add to their email client to view mail which has been archived. The “RetainArchived” column indicates an archived mail item by displaying a ‘1’ in the message row, while remaining blank when the message is not archived.

Messaging System Deletion

For systems where the administrator wishes to have archived messages removed from the system automatically, the Messaging System Deletion option may be used. Messaging System Deletion will remove messages from a mailbox after they are archived, according to the time frame specified in the settings. The amount of time to keep messages is specified in days. The recommended setting depends on the archiving scheme in the system. For example, if messages are to persist in the system for 30 days, then the system deletion setting should be set to 30 and enabled. A setting of 0 will remove messages from the system as soon as they are archived. Be sure to configure the system before enabling the setting in the profile.

However, it is recommended that the messaging system do the deletion rather than Retain.
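The effect of the day setting can be pictured with a short sketch. This is only an illustration of the rule described above, not Retain's implementation. It assumes that a message's age is measured from its delivery time to the start of the job, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

def eligible_for_deletion(archived, delivered, job_start, keep_days):
    """True if an archived message is at least keep_days old at job start."""
    if not archived:
        return False  # only archived messages are candidates for deletion
    return job_start - delivered >= timedelta(days=keep_days)

job_start = datetime(2024, 3, 31)
# Archived 30 days ago with a 30-day setting: eligible.
print(eligible_for_deletion(True, datetime(2024, 3, 1), job_start, 30))
# Archived 16 days ago with a 30-day setting: kept in the mailbox.
print(eligible_for_deletion(True, datetime(2024, 3, 15), job_start, 30))
# A setting of 0 makes any archived message eligible immediately.
print(eligible_for_deletion(True, datetime(2024, 3, 31), job_start, 0))
```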

Message Settings Tab

Retain can select specific types of mail and Exchange system items to be archived. The Message Settings tab provides access to manage those settings.

The Mailbox type specifies whether to include or exclude the available types of mailboxes. Because there can be multiple profiles and jobs, it may be advantageous to archive the Users and Room / Equipment mailboxes separately as needed and appropriate for the system.

The Item Type option specifies the different types of messages found in Exchange that can be archived, and allows the exclusion of or inclusion of the different individual types.

The Item Source option allows administrators to exclude or include messages that have not yet been sent or received, or posted.

The Message Status allows messages which have or have not been read or opened, or marked private or confidential to be archived. The different options in the drop-down menu are as shown.

Scope Tab

The Scope tab dictates the date range Retain will scan in the attached archiving jobs.

Date Range to Scan

The Date Range to Scan instructs Retain to scan for, and archive, messages after, or before, a certain date. This is useful if only specific chunks or areas of mail are to be archived.

New Items: All items that have not been archived by Retain since the last time the job ran.

All Items in Mailbox: All items in the mailbox starting from 1/1/1970. Duplicates are processed but not stored if they already exist in Retain's archive.

Number of days before job start date and newer: Only items from the relative number of days from the time the job began will be archived. E.g. messages that came into the email system less than 7 days ago.

Number of days from job start date and older: Only items previous to the relative number of days from the time the job began will be archived. E.g. messages that came into the email system more than 7 days ago.

Specify custom date range: Only items between two absolute dates will be dredged.

Specify custom date range relative to job start: Only items between two relative dates will be dredged. E.g. messages that came into the email system between 7 and 5 days ago.

It is recommended to archive all New items.
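The relative options above are easier to compare when written out as date windows. The following sketch computes the window for each relative option and checks whether a message date falls inside it. It is an illustration only: the function names are hypothetical, and treating the window boundaries as inclusive is an assumption, not documented Retain behavior.

```python
from datetime import datetime, timedelta

def newer_than(job_start, days):
    """'Number of days before job start date and newer': open-ended toward now."""
    return (job_start - timedelta(days=days), None)

def older_than(job_start, days):
    """'Number of days from job start date and older': open-ended toward the past."""
    return (None, job_start - timedelta(days=days))

def relative_range(job_start, oldest_days, newest_days):
    """'Custom date range relative to job start', e.g. between 7 and 5 days ago."""
    return (job_start - timedelta(days=oldest_days),
            job_start - timedelta(days=newest_days))

def in_scope(received, window):
    """True if the message date lies inside the (start, end) window."""
    start, end = window
    return (start is None or received >= start) and (end is None or received <= end)

job_start = datetime(2024, 1, 10)
message_date = datetime(2024, 1, 4)  # six days before the job started
print(in_scope(message_date, newer_than(job_start, 7)))         # True
print(in_scope(message_date, older_than(job_start, 7)))         # False
print(in_scope(message_date, relative_range(job_start, 7, 5)))  # True
```

An absolute custom date range is the same (start, end) pair with two fixed dates instead of offsets from the job start.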

Advance Flags

Enabling "Don't Advance Timestamp" will not update the timestamp flag. Items that are dredged will still be considered new by Retain the next time the job runs.

This is useful when troubleshooting, but is generally not used for normal jobs.

NOTE:Unlike GroupWise, Exchange does not ensure any compliance when scanning end user mailboxes; users may freely delete their email. The Item store flag does not prevent mail deletion. Only setting a rolling hold on all mailboxes guarantees all items have been archived.

Miscellaneous Tab

The Miscellaneous tab provides settings that control how messages are stored and what is archived. Whether attachments and message information such as the Internet headers are archived, and how the data is stored and named (by folders, year, or year and month), determine the message store structure and affect the storage size.

Miscellaneous options also allow for the archiving of the ‘recoverable items’. To enable checking and archiving of the ‘Recoverable Items’ for compliance reasons, select the checkbox next to the option.

Advanced Tab

Generally, since storage space is inexpensive, Micro Focus recommends that you archive all message content.

However, if you need to limit what is archived, you can use the Advanced tab to do it.

Table 2-31 Configuring Exchange Profile Advanced Settings

Option, Field, or Sub-panel

Information and/or Action

Use this dialog to define the conditions Retain uses to determine what to archive.

Advanced Criteria

Each line sets a specific parameter and the lines are all added together (AND-ed). To check how Retain will interpret your settings, read through the lines in turn, inserting AND between each one.

  1. For the first field, select from among the following items:

    • Subject

    • Sender

    • Recipient

    • Size

    • Attachment Name

  2. For the second field, specify the relationship of the first field to the third field:

    • is

    • is not

    • contains

    • does not contain

  3. Type a string for the third field.

  4. Click Add to enter another statement.

  5. When the conditions are defined, click Save Changes.
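To see how AND-ed lines combine, consider the following sketch. It evaluates a list of (field, relationship, string) lines against a message and reports a match only when every line is satisfied. This is a simplified model for reasoning about your settings, not Retain's code: it assumes each field holds a single string and that comparisons ignore case, and the function name and sample values are hypothetical.

```python
# One comparison function per relationship offered in the second field.
OPERATORS = {
    "is": lambda value, text: value == text,
    "is not": lambda value, text: value != text,
    "contains": lambda value, text: text in value,
    "does not contain": lambda value, text: text not in value,
}

def matches(message, criteria):
    """Return True only if every (field, relationship, text) line is satisfied."""
    for field, relationship, text in criteria:
        value = str(message.get(field, "")).lower()  # case-insensitive (assumption)
        if not OPERATORS[relationship](value, text.lower()):
            return False  # one failing line fails the whole AND-ed set
    return True

criteria = [
    ("Subject", "contains", "invoice"),
    ("Sender", "is not", "noreply@example.com"),
]
message = {"Subject": "Invoice 1001", "Sender": "billing@example.com"}
print(matches(message, criteria))  # True: both lines are satisfied
```

Read aloud, the two sample lines become "Subject contains invoice AND Sender is not noreply@example.com".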

Folder Scope

By default, Retain archives the items from all folders.

Using this panel, you can restrict which folders are archived, by either:

  • Specifying the folders to include (the Only items from folders listed below option)

    or

  • Specifying the folders to exclude (the All folders except those listed below option)

To specify the folders to include or exclude, do the following:

  1. Specify a System Folder (mandatory).

    Example: Calendar.

  2. Optionally, specify a subfolder within that folder.

    Example: entering old would mean the folder old under Calendar.

  3. Use the / delimiter to specify multiple hierarchies under that.

    Example: old/mail would mean the subfolder mail under old under Calendar.

  4. Specify whether the option includes subfolder.

    Example: If you specify old and deselect includes subfolder, then only Calendar/old is archived.

    If you specify old and select includes subfolder, Calendar/old/mail is archived as well.
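The folder rules above can be sketched as a small path check. The code below builds the full path from the system folder and the optional subfolder entry, then tests whether a given mailbox folder is covered. It is a hedged illustration: the function name is hypothetical, and exact, case-sensitive path comparison is an assumption rather than documented Retain behavior.

```python
def folder_selected(folder_path, system_folder, subfolder="", include_subfolders=False):
    """Check whether folder_path is covered by one folder-scope entry."""
    # Join the system folder with any "/"-delimited subfolder hierarchy.
    parts = [system_folder] + [p for p in subfolder.split("/") if p]
    base = "/".join(parts)
    if folder_path == base:
        return True
    # Deeper folders count only when "includes subfolder" is selected.
    return include_subfolders and folder_path.startswith(base + "/")

print(folder_selected("Calendar/old", "Calendar", "old"))             # True
print(folder_selected("Calendar/old/mail", "Calendar", "old"))        # False
print(folder_selected("Calendar/old/mail", "Calendar", "old", True))  # True
```

Whether a covered folder is then archived or skipped depends on which of the two options (include or exclude) is selected in the panel.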

2.8.5 Setting Up an Exchange Worker

If you have not already created one or more Workers for use with your Exchange Job, go to Creating Workers and complete the tasks there.

2.8.6 Creating an Exchange Job

Use Exchange jobs for On-premise Exchange or Office 365.

A job is made up of:

Core Settings Tab

The Core Settings of a job contain configuration that must be set for the job to be saved and become active. A job must be enabled before it will run. Jobs must also have a specified schedule, profile, and worker. These are all selected from drop-down menus, which are not populated unless those items are already configured in the system.

The Data Expiration setting is an option to place a time stamp on data in the Retain database, which allows for ease of automation for the deletion manager. In addition, devices such as NetApp, Centera, and Hitachi HCAP may use this number to enforce hardware level protection of the stored item so that no one (including Retain) may delete the item before its expiration date. Job Expiration is not retroactive for mail in the database, and only applies to mail archived by the job that it is active for. In order to have messages with custom job or folder expiration dates properly expire, the deletion management date scope must be set to delete messages with an Expiration Date older than 1 day.

Exchange Job Option - Journaling

In order to achieve compliance, Exchange utilizes a Journaling mailbox. This mailbox can be set to be archived by Retain to collect all messages on the system. The Journaling mailbox can rapidly grow in size if it is not cleaned out after messages have been archived. The Journaling option for Exchange jobs allows Administrators to set whether Retain will automatically clean out messages from the Journaling mailbox which have been archived.

On larger systems where there are multiple journaling mailboxes, Retain will automatically create a mailbox for each of them in the archive. However, if desired, the journaling mailboxes may be all archived to the same specified mailbox in the archive. This is the funnel mailbox. If desired, specify the mailbox by selecting the ‘funnel mailbox’ button, search for and select the desired mailbox. Only existing mailboxes in the Retain system may be specified as a funnel mailbox.

Once a job begins the job may be monitored in Reporting and Monitoring or on the Worker Console.

Mailboxes Tab

The mailboxes tab is where the administrator specifies which entities (mail servers and/or Distribution Lists) are to be scanned. This tab is not displayed for the mobile module.

Expand the Post Office and/or Distribution List trees, and check off the items you want to be dredged.

NOTE:If you want a job to back up a single user or a selected group of users, select the Users menu and assign the desired users.

The Distribution List selection allows you to include or exclude a group of users from an archive job. If you want to use GroupWise Distribution Lists, the visibility needs to be set to “system wide”.

The users section allows you to select individual users to include in or exclude from an archive job. For example, you can select an entire Mail Server to be archived, and then expand the users section to include users in or exclude users from the job.

This can also be used to select only certain users in the system for an archive job.

To add a user to the Include or Exclude list, select the respective ‘Add user’ button and search for the user. It can be helpful to unselect the ‘only show recently cached items’ option. Add the selected users to the list in the search window, then select ‘Ok’ to add them to the include or exclude list.

Notification Tab

When a job is run, the Notification option allows the administrator to be emailed a summary and report of any errors, for each running job.

For notification to function correctly, the SMTP information for the desired SMTP server must be fully filled-out. How much information is required varies depending on the mail system used.

Status Tab

The Status tab displays the status of any currently running jobs, as well as the stats of the last completed job.

On some modules, currently running jobs may be terminated here. For the rest, this tab is informational only.

Next Step

Once a job has completed, you can confirm the items are in the archive by checking the Search Message interface. See Using Retain’s Archives in the Retain 4.9.1: User Guide.