Documents that dazzle, content that clicks, and semantics that snap: the pièce de résistance to your content strategy

When you put a LEGO model together, there’s a certain feeling you get when you click two pieces together. A visceral satisfaction with the way each brick fits into its counterpart with just the right amount of clutch power, forming one small piece of a larger whole. That feeling comes from the studs, the nubbins on the top of each brick that lock into the hollows of connecting pieces. If you took the studs away, you might technically still be able to build the model, but it would require a lot of super glue, and you’d surely give up long before you finished.

Document semantics play a similar role when it comes to your content in AEM. Sure, you can build a website without them, and it can still technically work, but the overall process is harder, and the end result is fragile. Unfortunately, I’ve all too often worked with customers and partners doing just that, ignoring the document entirely when creating Universal Editor content models.

Today, I want to convince you that document semantics are important even if no one is ever going to see the document view of a page, and show you how easy it can be to do.

What are Document Semantics?

Before I go too far, I should define what document semantics are. For the purposes of this post, think of document semantics as the meaning carried by the document itself: the extent to which the raw, undecorated content can be easily read and understood, and conveys its meaning to the reader.

This should be true at both the page level and the block level. At the page level, this means things like keeping headings in a logical order and using section breaks to separate content. At the block level, it means grouping content logically within the block’s “cells,” avoiding excessive row and column spans, and so on.
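One way to make “logical order” concrete is to check that no heading skips a level. The sketch below assumes the page headings have already been reduced to their numeric levels, and headingsInLogicalOrder is a hypothetical helper name, not something AEM provides.

```javascript
// Returns true when no heading jumps more than one level deeper than the
// heading before it (for example, an h2 followed directly by an h4).
function headingsInLogicalOrder(levels) {
  return levels.every(
    (level, index) => index === 0 || level - levels[index - 1] <= 1,
  );
}

console.log(headingsInLogicalOrder([1, 2, 3, 2, 3])); // prints true
console.log(headingsInLogicalOrder([1, 2, 4])); // prints false
```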

Hint: you can get a document view of any AEM page by using the Sidekick. Just right-click and choose “View Document Source.” This is a great way to see what your page looks like as a document and decide for yourself whether it can be easily read and understood.

Screenshot of using the AEM Sidekick to view document source

What’s in it for You?

When authors manage their content in documents (using SharePoint, Google Drive, or DA), the need for document semantics becomes obvious. You want to create an authoring experience that is simple, intuitive, and easy to use. By making sure the documents where authors manage content convey meaning and are easy to understand, you are doing just that.

On the other hand, when authors work primarily in Universal Editor (or any other third-party content source), it can be easy to dismiss, or at least not prioritize, the investment of time it takes to create semantic models for your blocks. But there are a few clear benefits to doing so that make the investment pay off.

Agent Friendly

As new AI agents, voice assistants, and other related technologies emerge, semantic content will be more easily interpretable and usable by these systems.

Portability and Re-Use

Edge Delivery Services as a platform is content-source agnostic, with native support for many content sources OOTB (Universal Editor, DA, SharePoint, Google Drive), and can easily support other sources with BYOM. Your project should strive to be content-source agnostic as well, so that it is portable across all the content sources AEM supports.

By creating semantic content models, you give authors a great, easy-to-use experience with any content source they choose. This lets you support additional repoless sites from the same codebase, even if those sites aren’t authored in the same way. Similarly, you can add a content source overlay to author certain section(s) of your site in a different way.

Cleaner and Simpler Code

When your content structure follows logical document semantics, the underlying code becomes more straightforward to maintain and debug. This is because the starting DOM structures of your blocks are more concise, making the relationships between content elements more predictable and the transformation logic you need to write and maintain more direct and efficient.

Content Syndication/API Consumption

The undecorated HTML rendered by AEM can be thought of as the contract between AEM and other parts of the system. For the same reasons that semantic structures make your block code cleaner, they also make it easier to consume the content from other systems. This means you can syndicate the content to mobile apps or expose it as an API for use on any number of devices/platforms.

A Real World Example

Some of the concepts above may seem fuzzy. So let me try to make them concrete with a real-world example. To demonstrate, I’m going to use the hero block I encountered when working with a customer. To protect the innocent, I’ve re-implemented everything on my own and made some minor changes, but the concepts and structures are very much real.

Screenshot of the doc view before fixing the document semantics

The document view of our hero block, before and after

The block in question is relatively complex, supporting image and video backgrounds, various style and layout options, up to 2 CTAs, and more. You can see the original version here and to me, it has a few obvious problems:

Cleaning it Up

By using type inference, field collapse, element grouping, and a few other relatively simple changes, we can devise a content model that fixes all of these problems.
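Of these techniques, field collapse is perhaps the least self-explanatory. The idea is that several related fields, told apart by a suffix on the property name, are merged into one semantic element. The sketch below mimics that idea in plain JavaScript. The suffix list, the field names, and the collapseFields helper are assumptions made for illustration and are not AEM’s implementation.

```javascript
// Suffixes assumed for this sketch; check the AEM documentation for the
// conventions your project actually relies on.
const SUFFIXES = ['Alt', 'Text', 'Title', 'Type'];

// Merges fields such as "cta" and "ctaText" into one entry keyed by the
// shared base name. A suffix only counts if the base field exists too.
function collapseFields(fields) {
  const collapsed = {};
  for (const [name, value] of Object.entries(fields)) {
    const suffix = SUFFIXES.find(
      (s) => name.endsWith(s) && name.slice(0, -s.length) in fields,
    );
    const base = suffix ? name.slice(0, -suffix.length) : name;
    const key = suffix ? suffix.toLowerCase() : 'value';
    collapsed[base] = { ...collapsed[base], [key]: value };
  }
  return collapsed;
}

// Hypothetical authored values: four fields go in, two elements come out.
const collapsedHero = collapseFields({
  image: '/media/hero.png',
  imageAlt: 'A tower of bricks',
  cta: '/products',
  ctaText: 'Shop now',
});
console.log(Object.keys(collapsedHero).length); // prints 2
```

Four separate fields become two entries, one for the image and one for the link, which is the shape you would want to see in the document view.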

Style Options as Classes

The first thing we are going to do is move all the style options out of the block’s content and render them as classes at the block level. All we need to do is change the names of these fields to start with classes_. This doesn’t change the authoring experience at all; it just fixes how the values get mapped to the output HTML, and therefore how we deal with them in our code.
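As a rough illustration of that renaming, here is a minimal JavaScript sketch. The field names (classes_theme, classes_alignment) and the blockClassList helper are hypothetical, and the function only mimics the mapping described above; it is not AEM’s actual rendering code.

```javascript
// Hypothetical authored values for a hero block. Only the property names
// matter here: every style option starts with the "classes_" prefix.
const heroFields = {
  classes_theme: 'dark',
  classes_alignment: 'centered',
  title: 'Welcome',
};

// Mimics the idea that fields prefixed with "classes_" end up as classes
// on the block element instead of as cells in the block's content.
function blockClassList(blockName, fields) {
  const styleClasses = Object.entries(fields)
    .filter(([name, value]) => name.startsWith('classes_') && value)
    .map(([, value]) => value);
  return [blockName, ...styleClasses];
}

console.log(blockClassList('hero', heroFields).join(' '));
// prints "hero dark centered"
```

With the options expressed as classes, block code can style or branch on them directly rather than reading option values out of content cells.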

Grouping Elements

When I look at this block, I see effectively 3 types of content we can group together.

So let’s group those together by prefixing all the property names with groupName_. Just by doing this, we’ve gone from 20+ block content cells to 4.
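To see why a prefix reduces the number of cells, consider the small JavaScript sketch below. The property names and the groupByPrefix helper are hypothetical and much smaller than the real hero model; the function only imitates the grouping convention by bucketing names on the text before the first underscore.

```javascript
// Hypothetical property names for a hero block model. The part before the
// first underscore is treated as the group name.
const heroProperties = [
  'background_image',
  'background_video',
  'content_eyebrow',
  'content_title',
  'content_description',
  'cta_primary',
  'cta_secondary',
];

// Buckets property names by their group prefix. Ungrouped properties
// (no underscore) keep a bucket, and therefore a cell, of their own.
function groupByPrefix(propertyNames) {
  const groups = new Map();
  for (const name of propertyNames) {
    const separator = name.indexOf('_');
    const group = separator === -1 ? name : name.slice(0, separator);
    if (!groups.has(group)) groups.set(group, []);
    groups.get(group).push(name);
  }
  return groups;
}

const heroCells = groupByPrefix(heroProperties);
console.log(heroCells.size); // prints 3
```

Seven properties land in three buckets here; the same effect, at a larger scale, is what shrinks the real block’s cell count.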

Fixing the CTAs

Lastly, I want to fix the CTAs. To do this, we are going to do a few things:

You can see the final version here and the PR with all the changes. The benefits to me are just as obvious as the problems the original had. A simpler and more intuitive block is easier to reason about, both for an author and for a developer writing the code to decorate it. And it really wasn’t that complicated to do; mostly, it comes down to renaming a few properties to conform to documented conventions.

Apply this to Your Project Today

When it comes to defining content structures that are intuitive, it is definitely more art than science. Even with the final hero block shown above, reasonable minds could disagree about how to best group the elements, etc. But here are some good practices I try to follow that should lead you in the right direction. These principles, when properly applied, deliver that same visceral satisfaction you get from LEGO bricks clicking together. Follow these, and you’ll build a semantic masterpiece instead of a fragile tower of digital superglue: