Getting Started with Dynamic Storage Utilization (DSU)
Purpose of this Guide
The purpose of this guide is to familiarize you with a feature being introduced in PeerGFS v5.0: Dynamic Storage Utilization (DSU). It will introduce you to the feature set, UI, and workflows of DSU and step you through setting up and running a job that uses DSU.
This guide assumes general familiarity with using PeerGFS. You should be comfortable with creating and running file collaboration and/or file synchronization jobs. This guide focuses on DSU, not on the broader capabilities of PeerGFS. See the PeerGFS User Guide for your version of PeerGFS for more information about other features and settings.
If you are not familiar with creating and running file collaboration and/or file synchronization jobs, please discuss this with your Peer Software point-of-contact.
Introduction to Dynamic Storage Utilization
Starting in PeerGFS v5.0, our new Dynamic Storage Utilization (DSU) feature allows you to save storage space on edge storage devices (for example, storage devices used in branch offices) where only a small subset of files is used on a frequent basis. Files that are used less frequently are replaced with stub files on the edge storage device so that it appears to have a complete set of files. When a user accesses a stub file, DSU retrieves the full version of the file from a master storage device. The benefit of using DSU is that it allows you to efficiently utilize storage capacity on edge devices while preserving fast access performance on files that are used most often.
DSU offers flexible edge storage management with:
- The ability to assign an amount or percentage of available storage to be used on the edge storage device.
- Dynamic adjustments of the time periods used to determine whether to stub or rehydrate a file, allowing DSU to keep the assigned storage space as full as possible (best experience for the end user).
- Direct integration with our file collaboration and synchronization job types.
- Point-to-point data transfer capability between one edge and one or more masters.
- The flexibility to mix and match master and edge roles across different jobs.
- The ability to pin files or folders to always be local or always be stubbed on the edge storage device.
- Alerting, to ensure you stay ahead of potential storage capacity limits.
Important Concepts
A master participant has a complete set of hydrated files and no stub files. An edge participant contains a subset of the complete, hydrated files on a master participant, while the rest of the files are stub files that take up minimal space. Users can retrieve stubbed files directly from a master participant as needed. The goal of Dynamic Storage Utilization (DSU) is to keep as much as possible cached locally on edge participants for rapid access.
Every edge participant must have at least one master participant assigned to it. When a stub file needs to be rehydrated, DSU will retrieve the file from a master participant.
User-defined business rules (volume and utilization policies) manage the storage capacity on edge devices. DSU scans edge participants on a set basis (typically at least once daily) and uses these policies to determine whether adjustments are needed, i.e., whether to stub files to free up space or to rehydrate files. This ensures that the storage capacity is being used at optimum efficiency.
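To make the idea of a policy-driven scan more concrete, here is a minimal sketch of one way such a decision could be made. It is an illustration only and not DSU's actual algorithm: the `FileInfo` record, the `plan_stubbing` function, and the rule "stub the least recently accessed files until the hydrated data fits the budget" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical record describing one file on an edge participant."""
    path: str
    size: int           # size in bytes
    last_access: float  # last access time, seconds since the epoch
    hydrated: bool      # True if the full content is stored locally


def plan_stubbing(files, budget_bytes):
    """Return the paths to stub so hydrated data fits within budget_bytes.

    Assumed rule for illustration: the least recently accessed hydrated
    files are stubbed first, and stubbing stops once usage fits the budget.
    """
    hydrated = [f for f in files if f.hydrated]
    used = sum(f.size for f in hydrated)
    to_stub = []
    for f in sorted(hydrated, key=lambda item: item.last_access):
        if used <= budget_bytes:
            break
        to_stub.append(f.path)
        used -= f.size
    return to_stub
```

A real utilization policy also weighs file size and last-modified time, as described in the glossary below, so treat this only as a mental model of the scan.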
Glossary of DSU Terms
This glossary is provided to help you understand some of the core terminology used in conjunction with DSU.
Edge participant | A subset of the files stored on a master participant are physically stored on an edge participant; the rest of the files on an edge participant are stub files that take up minimal space but can be rehydrated as needed. |
Local file | A file that is fully available without network access to a master participant; all of its bytes are present (stored locally) on the participant. |
Master Data Service | A service that handles requests from edge participants for files on a master participant. The Master Data Service is installed on the Peer Agent server as part of the Peer Agent installation process. |
Master participant | Always has a complete set of files for the job. None of its files are stubbed; they are always stored physically on that device. |
Pinning filter | Specifies whether specific files or files in a particular directory are always stubbed or always local on the edge participant. A pinning filter is similar to a utilization policy in that it can be applied to multiple jobs. If there is a conflict between a pinning filter and a utilization policy (where, for example, you might have something set to be always stubbed), the pinning filter will take precedence. |
Rehydrated file | A file that was stubbed but has been fully reconstituted on the edge participant. |
Stub file | A file that appears to the user to be stored on the local disk and immediately available for use but is actually held either in part or entirely on a different storage medium. |
Temporary storage space | Space that is used to temporarily store the content of stub files that are being rehydrated. |
Utilization policy | Defines when a file should be stubbed versus fully hydrated across all volumes of this edge participant. Parameters are based on the size of the files to be potentially stubbed and when they were last accessed and modified. A utilization policy enables you to balance getting the best performance while keeping the cache as full as possible. |
Volume policy | Specifies how much of the available space on the volume monitored by the Agent/edge participant is assigned to local (hydrated) files. |
Requirements
- The requirements for PeerGFS (including both DSU and enterprise NAS support) can be found here.
- It is important to note that all of our supported enterprise NAS platforms can be used as master participants, but only Windows File Servers can be used as edge participants.
- We strongly recommend having at least two master participants for every job that uses DSU. If one of the master participants goes down, another master participant can take over and serve data to the edges.
Create a Job that Uses DSU
The process for creating a job remains essentially the same; the primary change occurs when creating participants, where you must designate a participant as a master or edge participant. See the following sections for instructions on how to create a job that utilizes DSU.
Create and name the job
- Open Peer Management Center.
- From the File menu, select New Job.
- Click File Collaboration or File Synchronization, and then click Create.
- Enter a name for the job and click OK.
Create the job participants
You are now ready to create the job participants. A participant can be designated as either a master or edge participant.
To effectively deploy DSU, we recommend that the job has at least four participants. At least two of those should be master participants, so that if one master participant goes offline, the other master participant can continue to serve as the source of files for the edge participant.
Create a master participant
- Click the Add button.
- Select an Agent, and then click Next.
- Select a storage platform, and then click Next.
A master participant can be any type of storage device.
- If you selected Windows File Server, click Next on the Storage Information page. If you selected any other type of storage platform, select existing credentials or enter new credentials.
- Click Next.
- On the Path page, enter the path to the watch set.
- Select Seeding Target if you want this participant to be a seeding target, and then click OK.
You must have at least one master participant that is not a seeding target.
- Click Next.
- Select the Enable Dynamic Storage Utilization checkbox.
The Master role is selected by default.
- Click Next.
The Master Data Service page appears. The Master Data Service handles requests for rehydrating stubbed files. If the Agent you selected is already being used as a master participant in another job utilizing DSU, then the existing parameters for the Master Data Service will be displayed. You can edit the values by clicking Edit Master Data Service. Any modifications you make will be applied to every other job that uses this Agent as a master participant.
- (Optional) Enter a value in the Agent Alias field.
A value is required only if the name of the Agent server is different from the Agent name (for example, if the master participant's Windows server is named "Server1.Lab.local" but the edge participant can't resolve "Server1.Lab.local" to an IP address because the edge is not on the domain).
The value for Agent Alias can be the hostname, FQDN, or IP address of the server on which the Agent is installed. If a value is entered, it will be used by the edge service on edge participants to connect with this Master Data Service. If no value is entered, the name of the Agent server will be used.
- (Optional) If necessary, modify the port number of the server that the Agent is installed on.
- Click Finish.
The Participants page appears; the participant has been added to the Participants table.
- Repeat the steps in this section to add additional master participants.
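Whether an Agent Alias is needed depends on whether the edge server can resolve the master's name. The sketch below is one way to check that from the edge server before configuring the job. It uses only Python's standard `socket` module; the `can_resolve` helper and the injectable `resolver` parameter are assumptions for this example and are not part of PeerGFS.

```python
import socket


def can_resolve(name, resolver=socket.getaddrinfo):
    """Return True if `name` resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo and is injectable only so
    the check can be exercised without real DNS.
    """
    try:
        resolver(name, None)
        return True
    except socket.gaierror:
        return False
```

For example, running `can_resolve("Server1.Lab.local")` on the edge server and getting `False` would suggest entering an IP address or another resolvable name in the Agent Alias field.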
Create an edge participant
- Click the Add button.
- Select an Agent, and then click Next.
- Select Windows File Server as the storage platform, and then click Next.
For an edge participant, the storage platform must be Windows File Server.
- Click Next on the Storage Information page.
- Click Next.
- On the Path page, enter the path to the watch set.
- Select Seeding Target if you want this participant to be a seeding target, and then click OK.
- Select the Enable Dynamic Storage Utilization checkbox, and under Dynamic Storage Utilization Role, select Edge.
- Click Next.
Create a volume policy
- Enter the root drive of the path specified on the Path page (e.g., C:) in the Temporary Storage Path field.
A subfolder named .PeerTempPath will be created under the location that you specify and DSU will store temporary file blocks in that folder.
- Under Cache Size, we recommend keeping the default percentage of 75%.
The percentage represents the maximum amount of disk space you want to allocate to DSU for fully hydrated files on the volume specified by the path on the Path page.
- For Cache Utilization, we recommend keeping the default value of 80%.
- For Cache Scan Schedule, accept the default value of a scan every day at 10 p.m., or set your preferred nightly schedule.
- Click Next.
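As a rough worked example of how the two default percentages combine, the sketch below assumes that Cache Size is a percentage of the volume's total capacity and that Cache Utilization is a percentage of that cache. This reading is an interpretation of the settings described in this guide, and the `cache_budget` function is hypothetical.

```python
def cache_budget(volume_bytes, cache_size_pct=75, cache_utilization_pct=80):
    """Return (cache_bytes, utilization_bytes) for a volume.

    Assumption: Cache Size is a percentage of the volume capacity, and
    Cache Utilization is a percentage of the resulting cache.
    """
    cache_bytes = volume_bytes * cache_size_pct // 100
    utilization_bytes = cache_bytes * cache_utilization_pct // 100
    return cache_bytes, utilization_bytes


# A 1 TB (10**12 bytes) volume with the default settings.
cache, utilization = cache_budget(10**12)
print(cache, utilization)  # 750000000000 600000000000
```

Under these assumptions, a 1 TB volume would reserve up to 750 GB for hydrated files, with 600 GB as the utilization level.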
Create a utilization policy
- Enter a name for the policy.
- Leave the remaining options at their default values. If you want to create a pinning filter, see the appropriate PeerGFS User Guide for instructions.
- Click Finish.
The Participants page appears; the participant has been added to the Participants table.
- Repeat the steps in the Create an edge participant section to add additional edge participants.
Assign master participants to edge participants
For each edge participant, assign at least one master participant to it and set the failover order. DSU will use the master participants when reading or rehydrating a stub file on the edge participant.
- Select an edge participant in the Master Configuration table, and then click Assign.
- In the Select Master Participants dialog, set the failover order by selecting a master participant and then using the Move Up and Move Down buttons to order the participants.
- Click OK.
- Repeat for each edge participant.
- Click Next.
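The effect of a failover order can be pictured with the following sketch. It is a conceptual illustration, not DSU's implementation: `fetch_with_failover` and the `fetch` callback are hypothetical names, and the only behavior assumed is that masters are tried in the configured order until one responds.

```python
def fetch_with_failover(masters, fetch):
    """Try each master in failover order and return (master, data).

    `masters` is the ordered list of master names and `fetch` is a
    caller-supplied function that raises OSError (for example,
    ConnectionError) when a master cannot be reached.
    """
    errors = []
    for master in masters:
        try:
            return master, fetch(master)
        except OSError as exc:
            errors.append((master, exc))
    raise ConnectionError(f"all masters failed: {errors}")
```

With two masters assigned, a request only falls through to the second master when the first one fails, which is why the order chosen in the Select Master Participants dialog matters.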
Create DSU email alerts
After creating a DSU-enabled job, we recommend creating DSU email alerts that notify you of potential or actual problems so that you can stay ahead of storage capacity limits. For example, cache utilization specifies the maximum percentage of the cache to be allocated to fully hydrated files. When the maximum percentage is reached, you will receive an alert.
To create a DSU email alert:
- Select Preferences from the Window menu.
- Expand Collab, Sync, and Replication in the navigation tree, and then select Dynamic Storage Utilization.
- Select Email Alerts in the navigation tree, and then click Create.
- Enter a name for the alert.
- Create DSU Email Alert.
- Select the caching scan events to be alerted.
- Select the volume event types to be alerted.
- Select Master/Edge Services Health Monitoring if you want to be alerted if either the Peer Master Data Service or the Peer Edge Service goes down.
- Enter alert recipients, and then click Add to List.
- Click OK to close the dialog.
- The new email alert can now be applied to DSU-enabled jobs.
- Click Apply and Close or Apply.
- After creating the alert, you will need to edit the job to apply the alert to the job. Specifically, you will need to edit the volume policy for a participant.
Edit the job to apply DSU email alerts
If you have created DSU email alerts, you can now apply those alerts to a job utilizing DSU.
- In the Jobs view, right-click the job for which you want to apply alerts and select Edit Job.
The Edit Job wizard opens.
- Select a participant in the Participants table.
- Click the Edit button.
The Edit Participant page appears.
- Click Next.
The Dynamic Storage Utilization page appears.
- Click Next.
The Volume Policy page appears.
- On the Volume Policy page, click the Edit Volume Policy link.
The Volume Policies page appears.
- In the Volume Policies table, select the participant that you selected in Step 2, and then click the Edit button.
- Click OK in the In Use Configuration dialog that appears.
The Edit Volume Policy page appears.
- Click Next.
The Email Alerts page appears. The table on this page lists the email alerts already applied.
- Click the Select button.
The Select Email Alert page appears.
- Select an alert from the Email Alert dropdown, and then click OK.
The selected alert appears in the table.
- Add additional alerts if desired.
- Click Finish.
- Click OK in the Data Change dialog.
- Click Apply and Close in the Volume Policies page.
- Click Finish in the Volume Policy page.
- Click OK to close the Edit Job wizard.
Get Feedback on Your DSU Jobs
- Determine the State of a File on an Edge Participant
- View DSU Job Statistics
- Examine Application-Specific Interaction with DSU
- Test Stubbing and Rehydration
Determine the State of a File on an Edge Participant
You can determine whether a file is a stub on the edge participant in one of two ways:
Method 1
Look for the offline X in the bottom left corner of the file icon (as seen on the left-hand file in the example below).
Method 2
Display the Attributes column in Windows Explorer and look for the L and O attributes.
If you open that file on a client system that is using SMB to access the edge participant, it will be automatically rehydrated with file data pulled from one of the master participants. The offline X should disappear, as should the L and O attributes.
L and O attributes
No L and O attributes
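The same check can be scripted. In Windows Explorer's Attributes column, L denotes the reparse point attribute and O denotes the offline attribute, and Python's standard `stat` module exposes both as constants. The sketch below is a minimal example under the assumption that a stub carries both attributes, as described above; the helper names are hypothetical, and `st_file_attributes` is only available when the script runs on Windows. Whether reading attributes this way could trigger rehydration is not covered by this guide, so try it on test data first.

```python
import os
import stat

# Both bits must be set for a file to show "L" and "O" in Explorer.
STUB_MASK = stat.FILE_ATTRIBUTE_REPARSE_POINT | stat.FILE_ATTRIBUTE_OFFLINE


def is_stub_attributes(attrs):
    """Return True if the attribute bits include both L and O."""
    return (attrs & STUB_MASK) == STUB_MASK


def is_stub(path):
    """Check a file on an edge participant (Windows only).

    os.lstat is used so that the reparse point itself is examined
    rather than being traversed.
    """
    st = os.lstat(path)
    attrs = getattr(st, "st_file_attributes", None)
    if attrs is None:
        raise OSError("st_file_attributes is only available on Windows")
    return is_stub_attributes(attrs)
```

After the file has been opened and rehydrated, `is_stub` would be expected to return `False` for the same path, matching the disappearance of the L and O attributes in Explorer.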
View DSU Job Statistics
You can get feedback about how DSU is performing in your environment. To view DSU job statistics, you must start DSU-enabled jobs. When a DSU job starts:
- Connectivity to all Agents and other file storage devices is checked.
- The real-time monitoring engine is initialized.
- A background scan is kicked off to ensure all file servers are in sync with one another.
- If an edge participant is new to a job, the initial data will be replicated to that participant as stub files.
- Once the initial background scan is complete, a cache optimization scan will be run on each edge participant.
- In the Jobs view, right-click one or more of your DSU jobs, and then select Start.
- Double-click on a specific job in the Jobs view to see its runtime information and statistics.
- Double-click on the File Collaboration or File Synchronization job type in the Jobs view, and then click the Dynamic Storage Utilization tab to see details about how DSU is performing in your environment. Note that it may take some time for statistics to be displayed after starting a job for the first time.
Examine Application-Specific Interaction with DSU
We recommend that you test DSU's interaction with your other applications. For example, some applications may not tolerate having their files stubbed, or may not tolerate the delay between the moment the application tries to open a stub file and the moment DSU has retrieved the full file data from a master (an application may expect to open a 1 GB local file within a few seconds).
To discover any potential compatibility issues between DSU and your line of business applications, we suggest that you:
- Make a list of the key applications that you and your end users use in production.
- For each available application, perform a series of normal operations (create, open, modify, etc.) against new and legacy files:
- All operations performed against new files should be replicated to and fully available on all participants.
- Opens of older files may take longer on the edge participants as the Edge service receives file data from the Master Data service over the network.
All testing should be performed against both master and edge participants from a client system that is not hosting a Peer Agent. Client systems can be running client versions of Windows or the PMC server can be used as a client system.
Test Stubbing and Rehydration
Purpose
We recommend that in addition to testing application compatibility with DSU, you also test stubbing and rehydration to ensure that your network can handle the additional traffic required to rehydrate files from a master.
Data to Test
Testing of the stubbing and rehydration functionality is best done with older data (1 year or older). That said, if the amount of local data on an edge participant does not fill its assigned cache size, DSU will dynamically rehydrate older files to ensure a better end user experience.
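To help pick suitable test data, the following sketch lists files whose last access and last modification are both older than a given number of days. The `files_older_than` helper is hypothetical and uses only the Python standard library. It is best run against a test folder or a master participant's copy of the data, since last-access timestamps may not be updated on every system and this guide does not describe how scanning stub files on an edge participant interacts with DSU.

```python
import os
import time


def files_older_than(root, days=365, now=None):
    """Return sorted paths under `root` not accessed or modified in `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # Use the more recent of the two timestamps as "last used".
            if max(st.st_atime, st.st_mtime) < cutoff:
                old.append(path)
    return sorted(old)
```

For example, `files_older_than(r"D:\TestData")` (a placeholder path) returns the candidates that are at least one year old.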
Test Process
By default, DSU will perform a nightly cache optimization scan of all files on each edge participant to determine if they should be stubbed or rehydrated. To force the cache scan to run, right-click on a volume in the Dynamic Storage Utilization tab of the Collab, Sync, and Repl Summary view and click Run Cache Scan.