Archiving Sitecore XDB Interaction Data

 

Quarterly Cleanup & Archiving of Sitecore XDB Interactions


Requirement

Delete all old interactions (above 1,000) for contacts that have more than 1,000 interactions. These interactions should be archived to a separate database before deletion.
Note: Contacts must not be deleted.
Frequency: This operation should be performed quarterly.


Challenges

Sitecore XConnect does not provide a direct API to delete only the interactions of a contact. Available APIs delete the entire contact along with its interactions, which is not acceptable for this use case.


Implementation Approach

Due to the limitations mentioned above, we will use a SQL Server Job to run a stored procedure quarterly. This procedure will:

  • Archive interactions over the 1,000 mark for each contact to a separate database.

  • Then delete those interactions and their facets from the main xdb_collection tables.

⚠️ Important: Update the database and table names according to your environment.


Step-by-Step Implementation


Step 1: Schedule SQL Job

Create a SQL Server Job to run the stored procedure quarterly.
Note: Steps to create a SQL job are widely available online and not covered here.


Step 2: Create Archive Database

Create a database on the same server as your shard databases.
Suggested Name: XDB_Interaction_Archived


Step 3: Create Archive Tables

In the XDB_Interaction_Archived database, create the following four tables:

Table NameDescription
Interactions_Archive_Shard0Stores interaction data archived from shard0
InteractionFacets_Archive_Shard0Stores interaction facet data from shard0
Interactions_Archive_Shard1Stores interaction data archived from shard1
InteractionFacets_Archive_Shard1Stores interaction facet data from shard1

Note: Click the link above to download the code.


Step 4: Create Stored Procedures

Two stored procedures are required, one for each shard:


a. Archive_Shard0_XDB_Interaction

This runs against:

  • [xdb_collection].[interactions]

  • [xdb_collection].[interactionfacets]

Functionality:

  • Finds contacts with more than 1,000 interactions.

  • Keeps the most recent 1,000 and archives the rest.

  • Archives interaction and facet data to XDB_Interaction_Archived.

  • Deletes archived records from the source tables.

Tables used:

  • Archive: Interactions_Archive_Shard0, InteractionFacets_Archive_Shard0

  • Source: [sc10dev_Xdb.Collection.Shard0][xdb_collection].[interactions], [sc10dev_Xdb.Collection.Shard0][xdb_collection].[interactionfacets]


b. Archive_Shard1_XDB_Interaction

Same logic as above, but operates on:

  • shard1 tables

  • Archives to Interactions_Archive_Shard1 and InteractionFacets_Archive_Shard1

Stored procedure scripts for both shards are similar. Only table names differ.


Detailed Explanation: Archive_Shard0_XDB_Interaction

Here’s a breakdown of how the stored procedure works:

  1. Identify all contacts with > 1,000 interactions.

  2. Use ROW_NUMBER() to sort interactions per contact by lastmodified DESC.

  3. Skip the most recent 1,000 interactions.

  4. Archive the remaining interactions and their facets to the archive database.

  5. Delete archived data from the original shard tables.

  6. Process interactions in batches of 1,000 for better performance and error handling.


Key SQL Logic (Simplified)

-- Step 1: Create a temporary table with row numbers per contact SELECT interactionid, ROW_NUMBER() OVER ( PARTITION BY contactid ORDER BY lastmodified DESC ) AS RowNum INTO #temp FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactions] WHERE contactid IN ( SELECT contactid FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactions] GROUP BY contactid HAVING COUNT(contactid) > 1000 ) ORDER BY 2; -- Step 2: Remove the first 1000 records per contact DELETE FROM #temp WHERE RowNum <= 1000; -- Step 3: Reassign row numbers for batching WITH newrownum AS ( SELECT interactionid, ROW_NUMBER() OVER (ORDER BY interactionid ASC) AS NewRowNum FROM #temp ) UPDATE t SET t.RowNum = n.NewRowNum FROM #temp t JOIN newrownum n ON t.interactionid = n.interactionid; -- Step 4: Declare batch processing variables DECLARE @BatchSize INT = 1000; -- Set batch size DECLARE @ProcessedCount INT = 0; -- Initialize processed count DECLARE @TotalCount INT; -- Get total number of records to process SELECT @TotalCount = COUNT(*) FROM #temp; -- Step 5: Process in batches WHILE EXISTS ( SELECT 1 FROM #temp WHERE RowNum > @ProcessedCount ) BEGIN PRINT 'Processing batch: ' + CAST(@ProcessedCount / @BatchSize + 1 AS VARCHAR(10)) + ' of ' + CAST((@TotalCount / @BatchSize) + 1 AS VARCHAR(10)) + ' (Rows ' + CAST(@ProcessedCount + 1 AS VARCHAR(10)) + ' to ' + CAST(@ProcessedCount + @BatchSize AS VARCHAR(10)) + ')'; BEGIN TRY BEGIN TRANSACTION; -- Archive InteractionFacets INSERT INTO [XDB_Interaction_Archived].[dbo].[InteractionFacets_Archive_Shard0] SELECT * FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactionfacets] WHERE interactionid IN ( SELECT interactionid FROM #temp WHERE RowNum > @ProcessedCount AND RowNum <= (@ProcessedCount + @BatchSize) ); DELETE FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactionfacets] WHERE interactionid IN ( SELECT interactionid FROM #temp WHERE RowNum > @ProcessedCount AND RowNum <= (@ProcessedCount + @BatchSize) ); -- Archive Interactions INSERT INTO [XDB_Interaction_Archived].[dbo].[Interactions_Archive_Shard0] SELECT * FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactions] WHERE interactionid IN ( SELECT interactionid FROM #temp WHERE RowNum > @ProcessedCount AND RowNum <= (@ProcessedCount + @BatchSize) ); DELETE FROM [sc10dev_Xdb.Collection.Shard0].[xdb_collection].[interactions] WHERE interactionid IN ( SELECT interactionid FROM #temp WHERE RowNum > @ProcessedCount AND RowNum <= (@ProcessedCount + @BatchSize) ); COMMIT TRANSACTION; SET @ProcessedCount = @ProcessedCount + @BatchSize; END TRY BEGIN CATCH ROLLBACK TRANSACTION; PRINT 'Error occurred while processing batch. Rolling back transaction.'; PRINT 'Error Message: ' + ERROR_MESSAGE(); BREAK; END CATCH END -- Step 6: Clean up temporary table DROP TABLE #temp;

Summary

This approach efficiently handles the interaction cleanup in Sitecore XDB:

  • Keeps recent 1,000 interactions per contact.

  • Archives older data securely.

  • Avoids deleting any contact.

  • Runs quarterly via scheduled SQL Job.

Comments

Popular posts from this blog

Solrcloud With Zookeeper -Single server setup

Render Sitecore Experience Forms Using Sitecore XP 10.4 with a Headless Approach (Next.js + JSS SDK)

Next.js with XM Cloud EDGE and GraphQL