One of the most vexing problems facing organizations today lies in the digital landfill known as “shared drives” (aka network drives or file shares). As an information governance consultancy, IMERGE develops information classification schemes, but applying those schemes to even a few gigabytes of shared drive content can be time-consuming and challenging. Organizing the content that has actual value first requires eliminating the content that has none: redundant, outdated and trivial (ROT) content.
The primary information available to evaluate content is the metadata associated with the folders and objects on the file share: folder names, file names, date created, date modified, date accessed and so on. These are essentially the file attributes shown in File Explorer. Additional metadata that may be useful in establishing the value of content is stored in file properties, as shown to the right. Of course, opening and reading the content or properties of individual files is far too time-consuming to be realistic. One approach to extracting this data is to use a simple shareware tool such as Directory Lister Pro; once extracted, the data can be moved to Excel, but manipulating thousands of rows and getting meaningful results can be daunting.
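For readers who would rather script the extraction than use a packaged tool, the inventory step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a description of how Directory Lister Pro works; the function names, column names and output layout are my own assumptions. It captures only file system attributes, not the extended file properties mentioned above.

```python
import csv
import os
from datetime import datetime, timezone
from pathlib import Path

# Column names are illustrative assumptions, not a standard layout.
FIELDS = ["path", "name", "extension", "size_bytes",
          "modified", "accessed", "created_or_changed"]


def iso(timestamp):
    """Convert a POSIX timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def inventory(root):
    """Yield one row of file system metadata per file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it
            yield {
                "path": str(path),
                "name": name,
                "extension": path.suffix.lower(),
                "size_bytes": st.st_size,
                "modified": iso(st.st_mtime),
                "accessed": iso(st.st_atime),
                # On Windows st_ctime has historically reported creation
                # time; on Unix it is the last metadata change.
                "created_or_changed": iso(st.st_ctime),
            }


def write_inventory(root, out_csv):
    """Write the inventory to a CSV file and return the number of rows."""
    count = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        for row in inventory(root):
            writer.writerow(row)
            count += 1
    return count
```

Calling `write_inventory` with the root of a share and an output file name produces a file that opens directly in Excel. Write the output somewhere outside the share being scanned so the inventory does not list itself.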
Evaluating content is further complicated by metadata that is not accurate. I had to restore my files from a cloud backup vendor on May 28, 2008. Why do I remember this date seven years later? Because this restore reset all of the modified dates. Then on September 21, I restored files from OneDrive to a new computer and, as shown below, the created and last accessed dates now show the date of the latest restore. Metadata changes from backup or other system designs are commonplace, albeit less so in recent years.
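One way to spot this kind of reset before trusting a date field is to check whether a single calendar day accounts for an implausibly large share of the files. The sketch below is a rough heuristic of my own, not an established method; the function name and the 50 percent threshold are assumptions that would need tuning for a real share.

```python
from collections import Counter
from datetime import date


def suspicious_dates(dates, min_share=0.5):
    """Return {day: share} for days holding at least min_share of all files.

    dates is an iterable of datetime.date values, one per file, taken from
    whichever date field is being checked (modified, created or accessed).
    """
    counts = Counter(dates)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        day: n / total
        for day, n in counts.items()
        if n / total >= min_share
    }


# Illustrative data only: eight of ten files share one modified date.
modified = [date(2008, 5, 28)] * 8 + [date(2007, 1, 1), date(2006, 3, 2)]
flagged = suspicious_dates(modified)
```

A flagged day does not prove a restore took place, but it is a signal that age-based rules built on that field deserve a second look.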
While metadata integrity is one vexing challenge, the most common issue originates from the organic development of shared drive file structures over the years. Shared drives, by and large, are not planned; rather, they are created by individuals for their own purposes and without an enterprise view. As a result, there is content with varying retention rules mixed together within a folder structure, making it impossible to simply delete a folder and all of its content.
In a perfect world, our shared drive structures would look like the one on the left: consistent naming conventions, date clarity and unambiguous content. However, the vast majority of shared drives look like the structure on the right: mixed content types, ambiguous or useless file names (e.g., letter.docx), confusion from duplication or near-duplication (e.g., the HVAC agreement versions with nearly identical created dates) and personal content (e.g., wedding video and photo). Of course, it wouldn’t be unusual for a shared drive structure to contain hundreds of folders and tens of thousands of files.
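Exact duplicates are the easiest of these problems to find mechanically, because two files with identical content produce the same hash no matter what they are named or where they sit. The following is a minimal sketch using Python's standard library; the function names are my own. It finds byte-for-byte copies only, so near-duplicates such as the HVAC agreement versions would still need human review or a different technique.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of files under root whose content is identical."""
    # Pass 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Grouping by size first keeps the expensive hashing step off the many files that cannot possibly match anything else.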
When initiating a content analysis effort, the basic decision tree looks like this:
The keys to success are carefully defined business rules and consideration of the risks and costs associated with failing to expunge expired content or expunging the wrong content. The rules must then be applied in the order that remediates the most content with each rule, while respecting governance requirements such as the retention schedule and legal holds. Finally, instituting a review process for likely ROT and a method for classifying and organizing the remaining valuable content is a must.
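To make the idea of ordered business rules concrete, here is a small sketch in Python. Everything in it is hypothetical: the rule names, the file extensions, the seven-year staleness threshold and the dispositions are placeholders I chose for illustration, not recommended policy. The governance checks (legal hold and active retention) sit first so that no later rule can override them, and nothing is deleted outright; likely ROT is only routed to review.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass(frozen=True)
class FileRecord:
    path: str
    extension: str
    modified: date
    on_legal_hold: bool = False
    retain_until: Optional[date] = None


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[FileRecord, date], bool]
    disposition: str  # "keep" or "review_as_rot"


# Hypothetical rules, evaluated top to bottom; the first match wins.
RULES = [
    Rule("legal_hold", lambda f, today: f.on_legal_hold, "keep"),
    Rule("retention_active",
         lambda f, today: f.retain_until is not None
         and f.retain_until >= today,
         "keep"),
    Rule("personal_media",
         lambda f, today: f.extension in {".mp4", ".mov", ".jpg"},
         "review_as_rot"),
    Rule("stale",
         lambda f, today: (today - f.modified).days > 7 * 365,
         "review_as_rot"),
]


def classify(record, today, rules=RULES):
    """Return (rule name, disposition) for the first rule that matches."""
    for rule in rules:
        if rule.matches(record, today):
            return rule.name, rule.disposition
    return "no_rule", "classify_manually"
```

Running every inventory row through `classify` and counting the results per rule gives a quick estimate of how much content each rule remediates, which in turn helps decide the order of the non-governance rules.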
Jim Just is a partner with IMERGE Consulting, Inc., with over 20 years of experience in business process redesign, document management technologies, business process management and records and information management. For more information, visit www.imergeconsult.com.