Thursday, 11 December 2025

Building Headless AEM Apps Using Content Fragments: A Complete Guide

Adobe Experience Manager (AEM) is no longer just a traditional CMS for page-based websites; it has evolved into a powerful headless CMS.

This article explains how to build headless AEM applications using Content Fragments, with a step-by-step development flow, architectural guidance, and best practices.


 What Are Content Fragments in AEM?

Content Fragments are structured, channel-agnostic content units stored in AEM’s DAM.
They are ideal for headless scenarios because:

  • They follow schema-defined models

  • They deliver raw structured content (JSON)

  • They are reusable across apps, channels, and devices

  • They decouple content from presentation

Example use cases:

  • Mobile app content (banners, FAQs, promotions)

  • Product details and catalogs

  • Multi-language content distribution

  • Chatbot or voice assistant responses

  • API-driven websites or micro frontends


 How Content Fragment Models Work

A Content Fragment Model (CFM) defines the structure of a fragment.
Typical field types include:

  • Text (plain or rich)

  • Multi-line text

  • Number

  • Boolean

  • Enumeration

  • Date/time

  • JSON objects

  • References to other fragments (nested content)

  • Content references (DAM, images, etc.)

A well-designed CFM acts like a lightweight backend schema.

Example CFM (Blog Post):

  • Title

  • Summary

  • Body (rich text)

  • Author

  • Tags

  • Publish Date

  • Related Articles (fragment reference)
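
As a rough sketch, a fragment built from this model might reach a client as an object like the one below. The property names (title, summary, body, and so on) and the validateBlogPost helper are assumptions made for illustration; the real names depend on how the model's fields are configured in AEM.

```javascript
// Hypothetical field list mirroring the example Blog Post model above.
// Real property names come from the model definition in AEM.
const BLOG_POST_FIELDS = {
  title: "string",
  summary: "string",
  body: "string",           // rich text, assumed to arrive as an HTML string
  author: "string",
  tags: "array",
  publishDate: "string",    // assumed to be an ISO 8601 date string
  relatedArticles: "array"  // references to other fragments
};

// Returns a list of problems found in a fragment object; an empty list means valid.
function validateBlogPost(fragment) {
  const problems = [];
  for (const [name, expected] of Object.entries(BLOG_POST_FIELDS)) {
    const value = fragment[name];
    const actual = Array.isArray(value) ? "array" : typeof value;
    if (actual !== expected) {
      problems.push(`${name}: expected ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// Invented sample content, only used to exercise the validator.
const samplePost = {
  title: "Going headless with AEM",
  summary: "A short teaser.",
  body: "<p>Full article text.</p>",
  author: "Jane Doe",
  tags: ["aem", "headless"],
  publishDate: "2025-12-11",
  relatedArticles: []
};

console.log(validateBlogPost(samplePost));
```

Treating the model as a schema on the client side like this makes it easier to catch a renamed or removed field before it breaks the UI.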


 Why Use Content Fragments for Headless Apps?

1. Pure Headless JSON Output

AEM provides out-of-the-box JSON endpoints via the GraphQL API and the direct Content Fragment JSON endpoint, both described in Step 3 below.

2. Reusable Structured Content

CFs can be reused across apps, channels, and devices.

3. Built-in Governance

4. Cloud-Ready Scaling

In AEM as a Cloud Service, headless delivery automatically scales globally.


 Building a Headless AEM App Using Content Fragments (Step-by-Step)

Step 1: Create a Content Fragment Model

  1. Go to Tools → Assets → Content Fragment Models

  2. Create a new model (e.g., “Product Details”)

  3. Add fields:

    • Product Name

    • Price

    • Description

    • Image

    • Specifications (JSON)

  4. Save and publish the model

Step 2: Create Content Fragments

  1. Go to Assets → Files

  2. Create a new fragment using your model

  3. Enter structured content

  4. Add references or nested fragments if needed

  5. Publish the fragments

Step 3: Expose Content via JSON (Two Options)


Option A: Using AEM GraphQL APIs (Recommended)

GraphQL is the preferred headless API for complex queries.

Example query:

{
  productList {
    items {
      name
      price
      description
      image {
        ... on ImageRef {
          _path
        }
      }
    }
  }
}

Endpoint example:
/content/graphql/global/endpoint.json
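
A minimal sketch of how a client might send a query to this endpoint. It assumes the endpoint accepts a standard GraphQL-over-HTTP POST with a JSON body containing a query field. The host https://publish-domain is a placeholder, and buildGraphQLRequest and runQuery are hypothetical helper names, not part of any AEM SDK.

```javascript
// A trimmed version of the example query (image reference left out for brevity).
const QUERY = `{
  productList {
    items {
      name
      price
      description
    }
  }
}`;

// Builds the URL and fetch options for a GraphQL POST request.
// Kept separate from the network call so it can be inspected or tested offline.
function buildGraphQLRequest(host, query) {
  return {
    url: `${host}/content/graphql/global/endpoint.json`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query })
    }
  };
}

// Sends the query and returns the parsed JSON response.
async function runQuery(host, query) {
  const { url, options } = buildGraphQLRequest(host, query);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`GraphQL request failed with status ${res.status}`);
  }
  return res.json();
}
```

Calling runQuery("https://publish-domain", QUERY) would then resolve to the response body, assuming the host is reachable and the query matches a model that exists in your instance.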

Benefits: more efficient than plain JSON endpoints, especially for filtering and multi-level queries.


Option B: Direct CF JSON API

Default JSON endpoint:

https://publish-domain/content/dam/path/to/cf.model.json

Best for:

  • Simple apps

  • Static integrations

  • Low-logic use cases


 Step 4: Consume Content in Your App

Use CF JSON in any frontend:

Example React fetch:

// Fetch the fragment JSON from the publish tier and log it
fetch("https://publish-domain/content/dam/app/products/product1.model.json")
  .then(res => res.json())
  .then(data => console.log(data))
  .catch(err => console.error(err));

Or use GraphQL:

import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

// Apollo client pointed at the AEM GraphQL endpoint
const client = new ApolloClient({
  uri: '/content/graphql/global/endpoint.json',
  cache: new InMemoryCache()
});

client.query({
  query: gql`
    {
      productList {
        items {
          name
          price
          description
        }
      }
    }
  `
}).then(result => console.log(result));
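
Whichever client you use, the app usually needs to turn the raw response into something the UI can render. The sketch below assumes the response has the usual GraphQL shape ({ data: { productList: { items: [...] } } }) and that price may arrive as a string; extractProducts is a hypothetical helper name.

```javascript
// Converts a GraphQL result into a flat list of products for the UI.
// Returns an empty list when any level of the response is missing.
function extractProducts(result) {
  const items = result?.data?.productList?.items ?? [];
  return items.map((item) => ({
    name: item.name,
    price: Number(item.price),            // normalise "19.5" and 19.5 alike
    description: item.description ?? ""   // tolerate empty optional fields
  }));
}

// Invented sample response, shaped like the productList query result.
const sampleResult = {
  data: {
    productList: {
      items: [
        { name: "Desk Lamp", price: "19.5", description: null },
        { name: "Office Chair", price: 120, description: "Ergonomic" }
      ]
    }
  }
};

console.log(extractProducts(sampleResult));
```

Keeping this mapping in one place means a change to the fragment model only touches one function in the consuming app.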


 Step 5: Deploy & Integrate at Scale

On AEM as a Cloud Service:

  • CF APIs are globally cached

  • Publish tier auto-scales

  • CDN caching + edge delivery optimizes load

  • Pipelines ensure safe deployments


 Best Practices for Headless AEM Using CFs

1. Keep CF Models small & reusable

Avoid bloated schemas; break content into smaller fragments.

2. Use fragment references for relationships

Good for product families, related posts, FAQs, etc.

3. Prefer GraphQL over JSON endpoints

More efficient, especially for filtering and multi-level queries.

4. Enable caching with immutable paths

CF JSON paths work great with CDN caching.

5. Use AEM workflows for approvals

Ensure consistent publishing across channels.

6. Don’t mix presentation logic into CFs

UI should be controlled by the consuming app.


 Conclusion

Content Fragments make AEM a powerful headless CMS.
By combining robust content modeling, GraphQL APIs, and cloud scalability, AEM enables developers to build fast, flexible, and enterprise-grade omnichannel applications.

Whether you’re building a mobile app, a React SPA, or a global multi-channel platform, Content Fragments deliver clean structured content that fits seamlessly into modern headless architectures.

Monday, 8 December 2025

Decoding AEM's JCR: A Developer's Guide to Node Types and Content Structure

If Adobe Experience Manager (AEM) is the heart of your digital experience, then

Nodes form a tree-like, hierarchical structure within the repository, much like a computer's file system, serving as the building blocks for all content. Each node is defined by a specific type, which dictates the permitted properties (name-value pairs holding the actual data) and allowed child nodes.

Here is a comprehensive overview of the node structure that powers AEM.


The Three Pillars of AEM Node Types

AEM’s robust structure stems from its underlying technologies. Because AEM leverages both the Apache Sling framework for request processing and the Java Content Repository (JCR) specification for data storage, it utilizes node types originating from three distinct categories:

  1. JCR Node Types: The foundational, core types defined by the JCR specification (e.g., nt: and mix:).
  2. Sling Node Types: Types introduced by the Apache Sling framework, often related to resource resolution and folder structures (e.g., sling:).
  3. AEM Custom Node Types: Specific types created by Adobe to manage high-level CMS concepts like pages, components, workflows, and assets (e.g., cq: and dam:).

To inspect the definitive, current list of all associated properties and definitions, developers can always use CRXDE to browse the AEM repository at /jcr:system/jcr:nodeTypes.


1. JCR Node Types: The Foundation

JCR node types provide the basic blueprint for storing data, enforcing rules, and creating hierarchy.

Primary Node Types (The Structure)

Every node must have one declared primary node type.

  • nt:base: This is the abstract base primary node type from which all others inherit. It is crucial because it ensures every node exposes two properties related to its type: jcr:primaryType (the node’s type name) and jcr:mixinTypes (a list of applied mixin types).
  • nt:unstructured: This is a highly flexible type used for storing unstructured content. It allows for any number of child nodes or properties with arbitrary names, and it supports client-orderable child nodes.
  • nt:hierarchyNode: This abstract node type serves as the supertype for structural elements like nt:file and nt:folder.
  • nt:file / nt:folder: These types are used to represent standard file system concepts. An nt:file node requires a child node, typically named jcr:content, which often uses the nt:resource type to hold the actual file content.
  • nt:resource: Used to represent file content itself, notably defining the mandatory binary property jcr:data.
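
To make the nt:file / nt:resource relationship concrete, here is a sketch of such a node written as a plain JavaScript object, roughly the way it might look when read as JSON. The property names are the JCR names mentioned above; the file contents and the checkFileNode helper are invented for illustration.

```javascript
// An nt:file node with its mandatory jcr:content child.
const fileNode = {
  "jcr:primaryType": "nt:file",
  "jcr:content": {
    "jcr:primaryType": "nt:resource",
    "jcr:mimeType": "text/plain",
    "jcr:data": "aGVsbG8="  // binary payload, shown base64-encoded for this sketch
  }
};

// Checks the two structural rules described above and returns any violations.
function checkFileNode(node) {
  const errors = [];
  if (node["jcr:primaryType"] !== "nt:file") {
    errors.push("not an nt:file node");
  }
  const content = node["jcr:content"];
  if (!content) {
    errors.push("missing mandatory child node jcr:content");
  } else if (
    content["jcr:primaryType"] === "nt:resource" &&
    !("jcr:data" in content)
  ) {
    errors.push("nt:resource is missing mandatory property jcr:data");
  }
  return errors;
}

console.log(checkFileNode(fileNode));
```

In a real repository these rules are enforced by the JCR implementation itself; the check here only mirrors them so the structure is easier to see.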

Mixin Node Types (The Features)

Mixin node types are added to specific node instances to incorporate additional characteristics, often related to repository features or metadata. A node can have zero or more mixin types.

  • mix:title: Adds standardized metadata properties, namely jcr:title and jcr:description.
  • mix:created: Used to add creation tracking properties, such as jcr:created (the date) and jcr:createdBy (the user). These properties are often auto-created and protected by the repository.
  • mix:lastModified: Provides modification tracking via jcr:lastModified and jcr:lastModifiedBy.
  • mix:referenceable: Makes a node capable of being referenced, adding the mandatory, protected, auto-created jcr:uuid property which exposes the node's identifier.
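
The mixins listed above can be read as a lookup from mixin name to the properties it contributes. The sketch below does exactly that for a node represented as a plain object; propertiesFromMixins is a hypothetical helper and the sample node is invented.

```javascript
// Properties contributed by each mixin, as described in the list above.
const MIXIN_PROPERTIES = {
  "mix:title": ["jcr:title", "jcr:description"],
  "mix:created": ["jcr:created", "jcr:createdBy"],
  "mix:lastModified": ["jcr:lastModified", "jcr:lastModifiedBy"],
  "mix:referenceable": ["jcr:uuid"]
};

// Lists the property names a node gains from its applied mixins.
// Mixins not covered by the table above are ignored.
function propertiesFromMixins(node) {
  const mixins = node["jcr:mixinTypes"] ?? [];
  return mixins.flatMap((mixin) => MIXIN_PROPERTIES[mixin] ?? []);
}

const sampleNode = {
  "jcr:primaryType": "nt:unstructured",
  "jcr:mixinTypes": ["mix:title", "mix:referenceable"]
};

console.log(propertiesFromMixins(sampleNode));
```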

2. Sling Node Types

Apache Sling defines node types generally focused on resource and folder handling. Examples of Sling node types defined primarily in the JCR resource bundle include sling:Resource, sling:Folder, and sling:VanityPath. Notably, the cq:PageContent node type inherits from sling:Resource.

3. AEM Custom Node Types

These types, typically prefixed with cq: (a holdover from Communiqué, the product's former name) or dam: (for digital asset management), define the functionality that AEM developers interact with every day.

Web Content Management (WCM) Structures

AEM organizes content using specific WCM node types:

| Node Type | Description | Key Properties / Subnodes |
|---|---|---|
| cq:Page | Defines the overall AEM page structure. | Mandatory child node jcr:content (which holds the primary content). |
| cq:PageContent | Defines the content node underneath cq:Page. This node holds WCM-specific properties. | jcr:title, cq:template (path to the template), navTitle (used in navigation), hideInNav, onTime, and offTime. |
| cq:Template | Defines an AEM template. | jcr:content (default content for new pages), allowedParents, allowedChildren, and ranking. |
| cq:Component | Defines an AEM component. | jcr:title, componentGroup, dialog (primary dialog node), design_dialog (design dialog node). |
| cq:EditConfig | Defines the configuration for the edit bar of a component. | cq:dialogMode, cq:layout, cq:actions, and subnode cq:inplaceEditing. |
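
As an illustration of how cq:Page and cq:PageContent fit together, the sketch below models a page as a plain object and derives a navigation label from it. The template path, the sample values, and the navigationLabel helper are hypothetical; only the property names come from the table above.

```javascript
// A cq:Page with its mandatory jcr:content child of type cq:PageContent.
const page = {
  "jcr:primaryType": "cq:Page",
  "jcr:content": {
    "jcr:primaryType": "cq:PageContent",
    "jcr:title": "Products",
    "cq:template": "/conf/mysite/settings/wcm/templates/content-page", // hypothetical path
    "navTitle": "Shop",
    "hideInNav": false
  }
};

// Returns the label to show in navigation, or null if the page is hidden
// or has no content node. navTitle wins over jcr:title when both are set.
function navigationLabel(pageNode) {
  const content = pageNode["jcr:content"];
  // hideInNav may be stored as a boolean or as the string "true"
  if (!content || String(content.hideInNav) === "true") {
    return null;
  }
  return content.navTitle || content["jcr:title"] || null;
}

console.log(navigationLabel(page));
```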

Digital Asset Management (DAM)

The DAM structure uses specialized nodes for managing assets:

  • dam:Asset: Defines a DAM asset. It is a hierarchy node that requires a mandatory child node named jcr:content of type dam:AssetContent.
  • dam:AssetContent: Defines the content structure of an asset, containing subnodes for metadata and renditions.

Operational and Workflow Nodes

AEM uses node types to track system state and processes:

  • cq:ReplicationStatus: A critical mixin that exposes replication status information, including properties like cq:lastReplicated, cq:lastReplicatedBy, and cq:lastReplicationAction (which can be 'Activate' or 'Deactivate').
  • cq:Workflow: Represents an active instance of a workflow.
  • cq:WorkflowModel: Represents the definition of a workflow.
  • cq:Tag: Defines a single tag, which can also contain other tags, thus creating a taxonomy.
  • cq:Taggable: An abstract base mixin used for content that can be tagged.
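
A short sketch of how the cq:ReplicationStatus properties might be interpreted when they are read from a page's content node as JSON. The replicationState helper and its three return labels are assumptions for illustration; the comparison is case-insensitive so it works whichever casing the stored value uses.

```javascript
// Derives a simple publish state from the replication properties.
// Any recorded action other than an activation is treated as "unpublished".
function replicationState(pageContent) {
  const action = pageContent["cq:lastReplicationAction"];
  if (!action) {
    return "never-replicated";
  }
  return action.toLowerCase() === "activate" ? "published" : "unpublished";
}

// Invented sample content node.
const liveContent = {
  "jcr:primaryType": "cq:PageContent",
  "cq:lastReplicated": "2025-12-01T09:30:00.000Z",
  "cq:lastReplicatedBy": "admin",
  "cq:lastReplicationAction": "Activate"
};

console.log(replicationState(liveContent));
```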

By mastering these fundamental building blocks—from the JCR nt:base that ensures node identity, to the cq:PageContent that holds your page's metadata, and the cq:ReplicationStatus that tracks publishing state—you gain the ability to efficiently organize and retrieve all the data in your AEM instance.


In essence, understanding AEM nodes is like knowing the alphabet before writing a novel. Each node type is a specific letter or punctuation mark, and only by using the correct ones in the right hierarchy can you construct a coherent and functioning piece of content architecture.

The OSGi Framework: Unlocking Dynamic Modularity in Java and AEM

The development of large-scale Java applications often struggles with complexity and rigidity. Historically, updating a single feature meant stopping and redeploying the entire application. The OSGi Framework (Open Services Gateway initiative) was created to solve this problem by introducing true modularity and dynamism.

OSGi is a modular, dynamic Java framework designed for building Java applications as a set of loosely-coupled modules. This dynamic, modular architecture serves as the foundation for how Adobe Experience Manager (AEM) organizes its backend modules and components.

# 1. The Core Unit: Bundles

In the OSGi architecture, the basic modular unit is the bundle. A bundle is essentially a standard Java Archive (JAR) file, but it is combined with a manifest file (MANIFEST.MF) enriched with OSGi-specific headers.

This manifest is vital because it defines important metadata, including the bundle's unique symbolic name and version. Crucially, it dictates how the code interacts with the outside world by specifying exported packages (what code is exposed) and imported packages (what dependencies are required). By strictly managing these package definitions, the OSGi framework enforces the modular, loosely-coupled design philosophy.
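
To show what those headers look like, here is a sketch that reads a small, invented manifest. The header names (Bundle-SymbolicName, Bundle-Version, Export-Package, Import-Package) are the standard OSGi ones; the package names and versions are hypothetical, and parseManifest is a deliberately simplified reader that ignores the line-continuation rules of real manifest files.

```javascript
// A made-up manifest for a hypothetical bundle.
const MANIFEST = [
  "Bundle-ManifestVersion: 2",
  "Bundle-SymbolicName: com.example.core",
  "Bundle-Version: 1.0.0",
  'Export-Package: com.example.core.api;version="1.0.0"',
  'Import-Package: org.osgi.framework;version="[1.8,2)",org.slf4j'
].join("\n");

// Splits each "Name: value" line into a header map.
// Simplification: wrapped (continued) lines are not handled.
function parseManifest(text) {
  const headers = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf(": ");
    if (idx > 0) {
      headers[line.slice(0, idx)] = line.slice(idx + 2);
    }
  }
  return headers;
}

const manifestHeaders = parseManifest(MANIFEST);
console.log(manifestHeaders["Bundle-SymbolicName"], manifestHeaders["Bundle-Version"]);
```

Anything not listed under Export-Package stays private to the bundle, which is what keeps modules loosely coupled.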

# 2. The Dynamic Advantage: Runtime Management


The key differentiator of OSGi is its dynamic runtime capability, which governs the bundle lifecycle.

Unlike monolithic applications, bundles go through a lifecycle managed by OSGi: they can be installed, resolved (meaning dependencies are satisfied), and started (activated). The immense flexibility stems from the ability to manage these states dynamically. Bundles can be stopped, updated, or removed at runtime, without requiring a full application restart. This means that bugfixes or new features can replace only the relevant bundles, dramatically streamlining maintenance and upgrades.
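
The lifecycle can be pictured as a small state machine. The sketch below is a conceptual model only, not the OSGi API: it leaves out the transient STARTING and STOPPING states, and it requires an explicit resolve step, whereas a real framework attempts to resolve a bundle automatically when it is started.

```javascript
// Allowed transitions per state in this simplified model.
const TRANSITIONS = {
  INSTALLED: { resolve: "RESOLVED", uninstall: "UNINSTALLED" },
  RESOLVED: { start: "ACTIVE", update: "INSTALLED", uninstall: "UNINSTALLED" },
  ACTIVE: { stop: "RESOLVED" },
  UNINSTALLED: {}
};

// Applies one action to a state, or throws if the action is not allowed.
function applyAction(state, action) {
  const next = TRANSITIONS[state]?.[action];
  if (!next) {
    throw new Error(`Cannot ${action} a bundle in state ${state}`);
  }
  return next;
}

// Runs a sequence of actions, starting from a freshly installed bundle.
function runLifecycle(actions, initial = "INSTALLED") {
  return actions.reduce(applyAction, initial);
}

// Deploy a bundle, then hot-swap it: stop, update, resolve, start again.
console.log(runLifecycle(["resolve", "start", "stop", "update", "resolve", "start"]));
```

The hot-swap sequence at the end is the part that matters for AEM: only the affected bundle cycles through these states while the rest of the system keeps running.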

# 3. OSGi as the Foundation of AEM

In the context of AEM, this dynamic architecture is fundamental. Instead of acting as a monolithic Java web-app, AEM and its extensions are built as many small, modular bundles. This structure enables easier maintenance, upgrades, reuse, and the crucial dynamic runtime behavior that AEM developers rely on.

Many functional pieces within the AEM context, such as custom services, utilities, and third-party integrations, are packaged specifically as OSGi bundles.

# 4. Communication and Customization


With so many independent modules running, the OSGi Framework requires structured ways for them to communicate and be customized:

## Services and Components

To facilitate communication between bundles, OSGi provides a service registry mechanism.

1.  Services: These are Java interfaces or classes that provide specific functionality. Bundles that implement a feature can register a service implementation with the service registry.
2.  Components: A component is the logical module (the class within a bundle) that either implements services or depends on other services. Other bundles or components can then dynamically discover and consume that service via the registry. Components manage their own lifecycle within the OSGi runtime.
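
The register / discover / consume pattern can be sketched with a toy registry. This is a conceptual model written in JavaScript for consistency with the other examples; it is not the OSGi service registry API, and the ServiceRegistry class and GreetingService name are invented.

```javascript
// A toy registry: providers register implementations under a service name,
// consumers look them up by that name at the moment they need them.
class ServiceRegistry {
  constructor() {
    this.services = new Map();
  }

  // Registers an implementation and returns a function that unregisters it.
  register(name, implementation) {
    this.services.set(name, implementation);
    return () => {
      if (this.services.get(name) === implementation) {
        this.services.delete(name);
      }
    };
  }

  // Returns the current implementation, or null if none is registered.
  get(name) {
    return this.services.get(name) ?? null;
  }
}

const registry = new ServiceRegistry();

// Provider side: one "bundle" publishes a service.
const unregister = registry.register("GreetingService", {
  greet: (who) => `Hello, ${who}`
});

// Consumer side: another "bundle" discovers and uses it.
const greeter = registry.get("GreetingService");
console.log(greeter.greet("AEM"));

// When the provider stops, the service disappears from the registry.
unregister();
console.log(registry.get("GreetingService"));
```

Because consumers look the service up by name rather than holding a hard reference to the provider, the provider can be stopped or replaced at runtime.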

## Configuration

For true flexibility, components must be customizable without changing the underlying code. OSGi supports the external configuration of components to manage configurable parameters, such as external API URLs, feature toggles, or environment-specific settings.

In AEM, these configurations are defined using JSON-based .cfg.json files. By placing these files in run-mode specific folders, developers can ensure different settings are applied for environments like DEV, QA, or PROD.
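
The run-mode folder idea can be sketched as a lookup: each folder name carries the run modes it applies to, and the most specific folder whose run modes are all active wins. The folder names follow the config.<runmode> pattern, but the configuration keys (apiUrl, featureEnabled), their values, and the resolveConfig helper are hypothetical, and the selection rule is a simplification of what AEM actually does.

```javascript
// Hypothetical contents of the same .cfg.json file placed in different folders.
const configFolders = {
  "config": { apiUrl: "https://api.example.com", featureEnabled: false },
  "config.dev": { apiUrl: "https://dev-api.example.com", featureEnabled: true },
  "config.author.prod": { apiUrl: "https://api.example.com", featureEnabled: true }
};

// Picks the configuration from the most specific folder whose run modes
// are all active. A plain "config" folder has no run modes and always matches.
function resolveConfig(folders, activeRunModes) {
  let best = null;
  let bestScore = -1;
  for (const [folder, config] of Object.entries(folders)) {
    const modes = folder.split(".").slice(1);
    const applies = modes.every((mode) => activeRunModes.includes(mode));
    if (applies && modes.length > bestScore) {
      best = config;
      bestScore = modes.length;
    }
  }
  return best;
}

console.log(resolveConfig(configFolders, ["author", "dev"]));
```

On a publish instance in production, none of the run-mode folders above would match, so the values from the plain config folder would apply.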

# Summary

The OSGi Framework successfully integrates these concepts to achieve enterprise-level modularity:

*   Bundles package the code.
*   Components inside bundles implement functionality and provide or consume Services via the registry.
*   Configurations allow customizing behavior dynamically per environment.

Sunday, 30 November 2025

Fixing Author-to-Author Migration Mistakes in AEM as a Cloud Service: How to Recover Publish State Correctly

Migrating from AEM On-Premise to AEM as a Cloud Service (AEMaaCS) involves a precise content strategy. The industry-standard and Adobe-recommended practice is:

  • Author → Author migration for authored content

  • Publish → Publish migration for live site content


This ensures that the AEM Cloud Publish environment accurately reflects real production, while Author carries the drafts, workflows, approvals, and creation history.

However, mistakes do happen—and one of the most common (and most damaging) is migrating everything from Author → Author and then accidentally publishing all content to Cloud Publish.

This article explains why this is a problem, what risks it introduces, and—most importantly—how to fix it safely and correctly.


Understanding the Mistake: Author-to-Author Migration Followed by “Publish All”

In the mistaken scenario:

  1. Your team migrated all content from On-Prem Author → AEM Cloud Author

  2. Then executed a bulk publish, sometimes unintentionally

  3. As a result:

    • Draft-only content is now publicly visible

    • Deactivated pages from on-prem are active again

    • Historical publish status is lost

    • AEMaaCS Publish does not reflect the real production environment

    • Audit records and approval workflows have been bypassed

This leads to a polluted publish repository, breaking the very purpose of maintaining a trusted live environment.


Why This Is Serious

Incorrect publish data affects:

1. SEO & Public Visibility

Internal pages, test pages, and drafts may get indexed by Google.

2. Compliance & Governance

Pages that were intentionally un-published are suddenly active.

3. Authoring Confidence

Teams can no longer trust what is “live.”

4. Replication Metadata

Original activation dates, users, and states are lost forever.

5. Future Deployments

Content freeze and UAT validation become unreliable.

Fixing this requires a structured, safe approach.


How to Fix Wrong Author-to-Author Publishing During Migration

Below are the best and safest approaches, starting with the recommended one.


Option 1: Re-Do Publish Migration Properly (Recommended Solution)

This is the cleanest and most accurate method—ideal if you are still in pre-production stages.

Steps:

  1. Reset AEM Cloud Publish Environment
    Request Adobe/Cloud Manager to wipe and reinitialize Publish.

  2. Re-export content from On-Prem Publish

    Use a cloned instance and the Content Transfer Tool.

  3. Import into AEM Cloud Publish (Publish → Publish)
    This restores the true, correct, production content.

  4. Re-run Author-only delta migration
    If content changed on the on-prem Author, sync only those changes.

Result:
AEM Cloud Publish now accurately matches your pre-migration live site.


Option 2: Use Replication Metadata to Undo Incorrect Publishing

If restarting migration is not feasible, you can restore state using metadata.

Steps:

  1. Export /var/replication metadata from On-Prem Publish

  2. Compare it with AEM Cloud’s metadata

  3. Identify:

    • Pages that should NOT be published

    • Pages that were previously deactivated

    • Pages with mismatched activation timestamps

  4. Using a script or ACS Bulk Replicator:

    • Deactivate incorrect pages

    • Re-publish only valid published pages

This is effective, but not perfect—you may miss some edge cases.
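
The comparison in steps 2 to 4 boils down to a set difference between what was live on-prem and what is live in the cloud. The sketch below assumes you have already extracted both as lists of page paths; how you obtain those lists is outside its scope, and planCleanup and the sample paths are hypothetical.

```javascript
// Compares the pages that were live on-prem with the pages now live in the cloud
// and returns the actions needed to bring Cloud Publish back in line.
function planCleanup(onPremPublishedPaths, cloudPublishedPaths) {
  const shouldBeLive = new Set(onPremPublishedPaths);
  const isLive = new Set(cloudPublishedPaths);
  return {
    // live in the cloud but never live (or deactivated) on-prem
    deactivate: cloudPublishedPaths.filter((path) => !shouldBeLive.has(path)),
    // live on-prem but missing from Cloud Publish
    publish: onPremPublishedPaths.filter((path) => !isLive.has(path)),
    // already correct, leave untouched
    keep: cloudPublishedPaths.filter((path) => shouldBeLive.has(path))
  };
}

// Invented sample data.
const onPrem = ["/content/site/en", "/content/site/en/products"];
const cloud = [
  "/content/site/en",
  "/content/site/en/products",
  "/content/site/en/draft-campaign",
  "/content/site/en/old-landing"
];

console.log(planCleanup(onPrem, cloud));
```

The deactivate list is what you would feed into a bulk deactivation step; reviewing it with the business before running anything is a sensible safeguard, since timestamp mismatches and other edge cases are not covered here.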


Option 3: Partial Cleanup for Limited Content Areas

Use this if only some sections of the site were impacted.

Steps:

  1. Identify URLs incorrectly published

  2. Get confirmation from business/UAT

  3. Use Bulk Unpublish scripts to deactivate

  4. Re-publish only approved content

Good for small or medium-sized sites, but not for large enterprise DAM/content.


What You Should NOT Do

  • Do not manually check pages one-by-one — impossible for large sites

  • Do not trust authors to identify wrong-published pages — risky & incomplete

  • Do not keep the polluted publish environment — it breaks future releases

  • Do not adjust publish directly without metadata analysis — creates inconsistencies


Recommended Final Strategy

If you need a clean and correct fix:

➡️ Reset Publish + Re-import Publish-to-Publish Migration

Then:

➡️ Run a delta Author migration

This is how most enterprise projects recover from this mistake and move forward safely.


Conclusion

Migrating to AEM as a Cloud Service requires strict alignment between Author and Publish flows.
When everything is migrated Author → Author and mistakenly published, it disrupts production integrity, governance, SEO, and live accuracy.

Fortunately, with the methods described:

  • Full republish rebuild

  • Metadata-driven cleanup

  • Partial environment correction

You can restore your AEM Cloud Publish environment back to the correct state.

No matter where you are in your migration cycle, there is a safe recovery path.

Faster AEM Cloud Builds: How Module Caching Changes the Game

 AEM as a Cloud Service introduces module caching to significantly reduce build and deployment times across your CI/CD pipelines. This article explains how module caching works, why it improves build performance, and what developers need to configure to take full advantage of it. Learn how cached dependencies, optimized build steps, and smarter pipeline reuse can cut down your build duration and accelerate release cycles—making your AEM Cloud deployments faster, more efficient, and developer-friendly.

A new build model compiles only changed modules (rather than the entire repo) using module-level caching to shorten build times. It applies to code-quality, full-stack, and stage-only pipelines.

[Figure: Edit Non-Production Pipeline dialog box showing the two Build Strategy options, Full Build and Smart Build]


How to achieve this? 

In the Add/Edit Pipeline dialog box, under the Source Code tab, a new Build Strategy section lets you choose one of the following build options:

  • Full Build : Builds all modules in the repository on every run.
  • Smart Build : Builds only modules that changed since the last commit, which shortens overall build time.
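
To illustrate the idea behind building only changed modules, the sketch below maps a list of changed file paths to the modules that contain them. It is not Cloud Manager's actual algorithm, which is not described here, and it ignores dependencies between modules; the module names and the modulesToBuild helper are hypothetical.

```javascript
// Returns the modules that contain at least one changed file.
// A file belongs to a module if its path starts with "<module>/".
function modulesToBuild(modules, changedFiles) {
  return modules.filter((moduleName) =>
    changedFiles.some((file) => file.startsWith(moduleName + "/"))
  );
}

// Invented repository layout and change set.
const repoModules = ["core", "ui.apps", "ui.content", "ui.frontend"];
const changed = [
  "core/src/main/java/com/example/Service.java",
  "ui.apps/src/main/content/jcr_root/apps/example/.content.xml"
];

console.log(modulesToBuild(repoModules, changed));
```

With a Full Build all four modules would be rebuilt; in this example only two contain changes, which is where the time saving comes from.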

You control which pipelines use Smart Build. During the beta, this option appears only for Code Quality and Dev Deployment pipelines.

If you are interested in the beta, email Adobe at beta_quickbuild_cmpipelines@adobe.com with your Adobe Org ID and Program ID.