Cleaning Up Content Cobwebs

Every Knowledge Management (KM) effort should include cleaning up existing content, whether the goal is migrating to a new system, implementing a taxonomy or search tool, or just clearing the clutter of old, irrelevant content cobwebs. Regardless of your KM drivers, the following content analysis strategy and checklist can help you take the first steps toward curated, user-friendly content that is both findable and discoverable.

As with any initiative involving business content, we first need to understand the depth, breadth, and volume of the content that exists, as well as some key information about the content itself. Starting with a high-level inventory, identify the content repositories (document management systems, file shares, Google Drive, SharePoint document libraries, content management systems, etc.) and capture information about each piece of content using the key fields in the table below. If the content in scope for your current effort is fewer than 1,000 items, you can do so manually, using a spreadsheet with a column for each of these fields:

[Table: Content Inventory]

However, if you are faced with the haunting task of inventorying millions of content items, consider using tools to automate your process. Helpful tools include duplicate finders, disk space analyzers, and content analysis tools (such as Duplicate File Detective, FolderSizes, and TreeSize).
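
For a file share, automation can start small. The following is a minimal Python sketch, not a substitute for the tools named above: it walks one folder tree and writes a CSV inventory. The column names, the `inventory` and `write_inventory` functions, and any paths are illustrative assumptions; swap in the key fields from your own inventory template.

```python
import csv
import os
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column names; replace with the fields in your own inventory.
FIELDS = ["path", "name", "extension", "size_bytes", "last_modified", "last_accessed"]


def _iso(timestamp):
    """Convert a POSIX timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def inventory(root):
    """Yield one dict per file found anywhere under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            try:
                info = path.stat()
            except OSError:
                continue  # skip files that vanish or cannot be read
            yield {
                "path": str(path),
                "name": path.name,
                "extension": path.suffix.lower(),
                "size_bytes": info.st_size,
                "last_modified": _iso(info.st_mtime),
                "last_accessed": _iso(info.st_atime),
            }


def write_inventory(root, csv_path):
    """Write the inventory of root to csv_path and return the row count."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for row in inventory(root):
            writer.writerow(row)
            count += 1
    return count
```

Calling `write_inventory` with a folder path and an output file name produces a file you can open in any spreadsheet tool. Many file systems update access times lazily or not at all, so treat `last_accessed` as a hint rather than a fact.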

If your content is complex or scattered across multiple libraries, sites, and systems, the prospect of inventorying content items individually is enough to make anyone give up or lose momentum. Check whether your CMS can generate a list of all items (documents, pages, etc.). Certain CMS solutions can also provide most of the information gathered in a typical content inventory.

Whether you plan to remove content manually or to create rules that exclude content matching certain characteristics from the scope of migration or retention, this is where ROT analysis can help you shine.

In addition to NERDy, the acronym coined by Zach Wahl, the knowledge and information management community has often spoken about content “ROT” and the importance of cleaning content. So what is ROT, and how can we use yet another acronym to facilitate the clean-up and overall improvement of our business content?

ROT stands for:

[Figure: three blocks, one for each letter of ROT: Redundant, Outdated or Obsolete, and Trivial]

To assess for content ROT, we can ask the following questions:

  • Is this item Redundant?
    • Are there duplicates of the same item?
    • Can this information be found elsewhere?
  • Is this item Outdated or Obsolete?
    • When was it last modified?
    • When was it last accessed?
    • Is the information within the item still relevant?
    • Is the information within the item accurate?
    • Is there a newer version?
    • Is the item subject to a retention schedule?
      • Has it met its retention period?
    • Is the item subject to a workflow or review and update process?
  • Is this item Trivial?
    • Is this a business-related item?
      • Or is it personal?
    • Is this a system-generated file?
    • Does it have a file extension other than .pdf, .doc, .xls, .ppt, etc.?
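
Some of these questions can be pre-screened automatically before a person reviews the results. The sketch below is one possible approach in Python, under stated assumptions: identical content hashes stand in for "redundant", a last-modified date older than a threshold stands in for "outdated", and an unexpected file extension stands in for "trivial". The three-year threshold, the extension list, and the `flag_rot` and `content_hash` names are all hypothetical. Relevance, accuracy, and retention questions still need human judgment.

```python
import hashlib
from collections import Counter
from datetime import datetime, timedelta

# Assumed values; tune them to your own retention schedule and file types.
STALE_AFTER = timedelta(days=3 * 365)
BUSINESS_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def content_hash(data: bytes) -> str:
    """Return a SHA-256 digest used to spot byte-identical duplicates."""
    return hashlib.sha256(data).hexdigest()


def flag_rot(items, now):
    """Return {path: set of flags} for a list of inventory rows.

    Each item is a dict with the keys 'path', 'hash', 'last_modified'
    (a datetime), and 'extension'. The flags are review candidates,
    not final decisions.
    """
    hash_counts = Counter(item["hash"] for item in items)
    flags = {}
    for item in items:
        found = set()
        if hash_counts[item["hash"]] > 1:
            found.add("redundant")  # same bytes exist elsewhere in the inventory
        if now - item["last_modified"] > STALE_AFTER:
            found.add("outdated")  # untouched for longer than the threshold
        if item["extension"].lower() not in BUSINESS_EXTENSIONS:
            found.add("trivial")  # not a recognized business file type
        flags[item["path"]] = found
    return flags
```

An item with an empty set of flags is not automatically clean; it only means none of these three simple checks fired.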

Then, consider making the following recommendations from the results of your high-level inventory:

  • Maintain as-is: Content item is up-to-date, relevant, and meets branding and editorial guidelines. No recommended action.
  • Archive: Content item is no longer relevant. Recommendation is to archive or delete
    content item. This could mean leaving content that hasn’t been accessed in decades behind, or a more in-depth archival process.
  • Update: Content item is relevant but inaccurate or doesn’t contain the most recent
    information. Recommendation is to update content item with most up-to-date information
    and/or ensure that the content item meets branding and editorial standards.
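
The three recommendations above amount to a small decision rule. As a rough sketch, it could be written in Python as follows; the `recommend` function and its three yes/no inputs are hypothetical names for judgments a reviewer would supply for each item.

```python
def recommend(is_relevant: bool, is_current_and_accurate: bool, meets_guidelines: bool) -> str:
    """Map a reviewer's answers for one content item to a recommended action."""
    if not is_relevant:
        # No longer relevant: archive or delete.
        return "Archive"
    if is_current_and_accurate and meets_guidelines:
        # Up-to-date, relevant, and on-brand: no action needed.
        return "Maintain as-is"
    # Relevant, but inaccurate, stale, or off-brand: bring it up to standard.
    return "Update"
```

Checking relevance first reflects the list above: an irrelevant item is archived regardless of how accurate or well-branded it is.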

To track these recommended actions, you may want to add a few columns to your inventory spreadsheet that record the clean-up decision you are recommending for each item.

Now that we’ve identified which pieces of content should be maintained as-is, updated, or archived, it’s time to design content types, establish content governance, and create archival and/or deletion processes.

Do you need help strategically cleaning your content to prepare for a migration, a content strategy and governance plan design, or making your content more easily findable? Let us help.

Rebecca Wyatt is a skilled trainer, content strategist, and project manager who is focused on empowering teams and maximizing learning. Rebecca is a self-described "learning addict" who is at her best collaborating with and inspiring teams to greater success.