How It Works
The SharePoint connector is a self-contained integration that runs entirely inside your Azure subscription. ROOTKey publishes a Terraform module that you deploy once per site. The module wires up an Azure Function App that:
- Resolves your SharePoint site URL to a Graph site ID.
- Enumerates every document library (drive) on the site.
- Registers a Microsoft Graph webhook subscription per drive.
- On each notification, runs a delta query against the affected drive and streams every new/updated file to the ROOTKey API authenticated with your Connector API Key.
A SharePoint site can contain several document libraries (e.g., Documents, Contracts, Marketing Assets). A single deployment auto-discovers them all and creates one Graph subscription per library. New libraries added later are picked up automatically on the next 12-hour reconciliation cycle.
What This Module Creates in Your Azure Subscription
Full transparency on what lands in your subscription when you run terraform apply. Everything is namespaced by name_suffix plus a deterministic hash of the site URL, so multiple deployments don’t collide.
| Resource | Purpose | Cost impact |
|---|---|---|
| azurerm_linux_function_app | The connector itself (Node.js 22, Consumption plan, HTTPS-only, TLS 1.2 min, HTTP/2 enabled, CORS closed). | Pay-per-execution; free tier covers 1M executions + 400 K GB-s/month, perpetual. |
| azurerm_service_plan (Y1 Consumption) | Hosting plan for the Function App. | No fixed cost; you pay only for executions above the free tier. |
| azurerm_storage_account | Function backing, per-drive delta cursors, lock blobs, and DLQ. | ~$0.05–1/month for typical loads. |
| azurerm_storage_container (connector-state) | Holds subscriptions.json (the drive→subscription registry), delta-{driveId}.txt per drive, and lock blobs (delta-sync-{driveId}.lock, subscriptions-reconciliation.lock). | Included in storage cost. |
| azurerm_storage_queue (rootkey-dlq) | Dead-letter queue for per-file failures, auto-replayed by a queue-triggered function. | Free in normal operation. |
| azurerm_key_vault (Standard, RBAC) + 3 secrets | Stores the Graph client secret, ROOTKey API key, and the random webhook clientState. Secrets are never stored as plain Function App settings. | ~$0.01/month. Purge protection is enabled by default. |
| azurerm_user_assigned_identity | The Function App’s identity. Granted least-privilege RBAC: Key Vault Secrets User, Storage Blob Data Contributor, Storage Queue Data Contributor. | None. |
| azurerm_log_analytics_workspace + azurerm_application_insights | Telemetry and logs with configurable retention (default 30 days). | $0 within the App Insights free tier (5 GB/month). |
| Role assignments | RBAC entries linking the managed identity to the Key Vault, blob, and queue scopes. | None. |
Infrastructure Impact Summary
Does this modify my SharePoint site or my Microsoft 365 tenant?
Does this modify my Resource Group or existing Azure resources?
What happens when a document library is added to the site after deploy?
On the next reconciliation run (every 12 hours, or on the next cold start because the timer is configured with runOnStartup: true), the connector enumerates the site’s libraries, detects the new one, and creates a Graph subscription for it. No human action required.
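
Detecting a new library comes down to comparing the site’s current libraries with the stored registry. A minimal sketch, assuming a hypothetical registry shape (drive ID mapped to subscription ID; the actual subscriptions.json schema is not documented here) and hypothetical type and function names:

```typescript
// Hypothetical shapes for illustration; not the connector's actual types.
interface Drive {
  id: string;
  name: string;
}
type SubscriptionRegistry = Record<string, string>; // driveId -> Graph subscriptionId

// Returns the drives present on the site that have no subscription yet.
function drivesNeedingSubscription(
  siteDrives: Drive[],
  registry: SubscriptionRegistry,
): Drive[] {
  return siteDrives.filter(
    (drive) => !Object.prototype.hasOwnProperty.call(registry, drive.id),
  );
}

const registry: SubscriptionRegistry = { "drive-a": "sub-1" };
const siteDrives: Drive[] = [
  { id: "drive-a", name: "Documents" },
  { id: "drive-b", name: "Contracts" }, // library added after deploy
];
const missing = drivesNeedingSubscription(siteDrives, registry);
// `missing` holds only drive-b, the library that still needs a subscription.
```

In a real run, each returned drive would then get a Graph subscription and a new registry entry.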
What happens when a document library is removed from the site?
Does the file content leave my Azure subscription?
File content is sent only to the ROOTKey API, over HTTPS (the module validates that API URLs use https:// at plan time). Files are streamed directly from Graph to the ROOTKey API; the Function App never writes them to local storage or to any other Azure service.
What happens to my secrets?
The Graph client secret, the ROOTKey API key, and the webhook clientState are all stored in Azure Key Vault in your own subscription, encrypted at rest with the Microsoft-managed key for Key Vault. The Function App resolves them at boot using its managed identity and Key Vault references (@Microsoft.KeyVault(SecretUri=…)). They are not stored as plain Function App settings.
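
To make the Key Vault reference syntax concrete, here is a small sketch of how such an app-setting value is composed. The vault name, secret name, and helper name are hypothetical; the module generates its own values.

```typescript
// Builds the app-setting value that asks the Functions host to resolve a secret from Key Vault.
function keyVaultReference(secretUri: string): string {
  if (!secretUri.startsWith("https://")) {
    throw new Error("Key Vault secret URIs use https://");
  }
  return `@Microsoft.KeyVault(SecretUri=${secretUri})`;
}

// Hypothetical vault and secret names, for illustration only.
const graphSecretSetting = keyVaultReference(
  "https://example-kv.vault.azure.net/secrets/graph-client-secret/",
);
```

The setting holds only a pointer to the secret; the secret value itself stays in Key Vault.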
What happens if the ROOTKey API is unreachable?
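Each upload is retried in-function first; files that still fail go to the rootkey-dlq queue, which is replayed automatically. Messages that exhaust their replay attempts are moved to the rootkey-dlq-poison queue for human inspection. You can configure an Azure Monitor alert on the DLQ or poison queue length.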
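
The routing between retry, poison queue, and permanent failure can be sketched as a small decision function. This is an illustrative model based on the retry table in the reliability section (429, 5xx, network, and timeout errors are transient; oversize and other 4xx are permanent; maxDequeueCount is 5). All type and function names are hypothetical, and in practice the Functions runtime itself moves messages to the poison queue.

```typescript
// Hypothetical failure model; not the connector's actual error types.
type Failure =
  | { kind: "http"; status: number }
  | { kind: "network" }
  | { kind: "timeout" }
  | { kind: "oversize" };

type ReplayOutcome = "ack-and-log" | "retry-later" | "move-to-poison";

const MAX_DEQUEUE_COUNT = 5; // value quoted in the retry table

// Oversize files and 4xx responses other than 429 will not succeed on retry.
function isPermanent(failure: Failure): boolean {
  if (failure.kind === "oversize") return true;
  if (failure.kind === "http") {
    return failure.status >= 400 && failure.status < 500 && failure.status !== 429;
  }
  return false;
}

// dequeueCount is the 1-based number of the replay attempt that just failed.
function replayOutcome(failure: Failure, dequeueCount: number): ReplayOutcome {
  if (isPermanent(failure)) return "ack-and-log"; // bypasses the poison queue
  return dequeueCount >= MAX_DEQUEUE_COUNT ? "move-to-poison" : "retry-later";
}

const afterServerError = replayOutcome({ kind: "http", status: 503 }, 2); // "retry-later"
const afterFifthFailure = replayOutcome({ kind: "timeout" }, 5); // "move-to-poison"
const afterOversize = replayOutcome({ kind: "oversize" }, 1); // "ack-and-log"
```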
What happens if a Microsoft Graph webhook is missed?
A timer function runs every 12 hours (and on cold start, via runOnStartup) that reconciles all subscriptions for the site and runs a safety-net delta sync per drive. So even if a notification is dropped, the missed changes are picked up: at worst within the next 12 hours, or immediately on the next deploy/restart.
What happens to other webhook callers?
The webhook checks every notification against a random clientState value generated at apply time and stored in Key Vault. Any POST to the webhook URL without the matching clientState is rejected with 401. Notifications also carry a subscriptionId that the function maps to a known drive; unknown subscription IDs are logged and ignored.
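
The two checks above can be sketched as a single decision function. This is a simplified illustration rather than the connector’s actual handler: the type and function names are hypothetical, and a production implementation would likely compare clientState in constant time.

```typescript
// Subset of a Graph change notification; only the fields used here.
interface GraphNotification {
  subscriptionId: string;
  clientState?: string;
}

type Decision =
  | { action: "reject"; status: 401 }
  | { action: "ignore"; reason: "unknown-subscription" }
  | { action: "sync"; driveId: string };

// knownSubscriptions maps subscriptionId -> driveId (hypothetical registry view).
function decideNotification(
  notification: GraphNotification,
  expectedClientState: string,
  knownSubscriptions: Map<string, string>,
): Decision {
  if (notification.clientState !== expectedClientState) {
    return { action: "reject", status: 401 };
  }
  const driveId = knownSubscriptions.get(notification.subscriptionId);
  if (driveId === undefined) {
    return { action: "ignore", reason: "unknown-subscription" };
  }
  return { action: "sync", driveId };
}

const knownSubscriptions = new Map<string, string>([["sub-1", "drive-a"]]);
const accepted = decideNotification(
  { subscriptionId: "sub-1", clientState: "s3cret" },
  "s3cret",
  knownSubscriptions,
);
const forged = decideNotification({ subscriptionId: "sub-1" }, "s3cret", knownSubscriptions);
```

Here `accepted` asks for a sync of drive-a, while `forged` is rejected because it carries no clientState.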
Can I roll this back?
terraform destroy removes every resource the module created (Function App, storage account, Key Vault, App Insights, identity, role assignments). The App Registration, Resource Group, and SharePoint site are not deleted. Note: if you kept purge protection enabled on the Key Vault (the default), the vault and its secrets will remain in soft-delete state for 7 days after destroy before they can be fully purged.
Prerequisites
Before starting, ensure you have:
- An Azure subscription and a pre-existing Resource Group to host the connector.
- Permissions to register applications in Microsoft Entra ID (Application Administrator or Global Administrator) and to grant admin consent.
- Permissions to apply Terraform with Contributor and User Access Administrator (or equivalent) on the chosen Resource Group.
- The SharePoint site URL of the site you want to monitor (e.g. https://contoso.sharepoint.com/sites/legal). The connector self-discovers every document library on the site, so you do not need to provide library IDs.
- Terraform v1.3 or later installed locally (or in a CI/CD pipeline that runs terraform apply).
- Node.js v22 or later on the machine running Terraform, because the Function App source is compiled at apply time.
- Azure CLI authenticated (az login) or service principal credentials in the environment.
Required Microsoft Graph and Azure Permissions
Microsoft Graph (Application permissions)
The App Registration you create needs two Microsoft Graph application permissions with admin consent:
| Permission | Type | Why |
|---|---|---|
| Sites.Read.All | Application | Resolve the site URL to a site ID and enumerate its document libraries. |
| Files.Read.All | Application | Read drive items and file content for each library. |
Azure RBAC (granted by the module to its own managed identity)
For full transparency: the module attaches these role assignments to a brand-new user-assigned managed identity it creates. None of these grant access to anything outside the resources the module itself provisions:
| Role | Scope |
|---|---|
| Key Vault Secrets User | The Key Vault created by the module. |
| Storage Blob Data Contributor | The Storage Account created by the module. |
| Storage Queue Data Contributor | The Storage Account created by the module. |
The identity that runs Terraform needs Contributor (to create the resources) and User Access Administrator (to attach those role assignments) on the Resource Group.
Configuration Fields
| Field | Required | Default | Description |
|---|---|---|---|
| Connector Name | Yes | — | A human-readable name to identify this connector in the dashboard. |
| Destination Vault | Yes | — | The ROOTKey vault where anchored files will be stored. |
| Tenant ID | Yes | — | Microsoft Entra ID tenant ID (a UUID). |
| Client ID | Yes | — | Application (client) ID of the App Registration. |
| Client Secret | Yes | — | A client secret generated for the App Registration. Stored in Key Vault by the module. |
| Site URL | Yes | — | Full URL of the SharePoint site to monitor. The hostname must end with .sharepoint.com. |
| Azure Region | Yes | — | Azure region where the connector resources will be deployed (e.g., westeurope). |
| Resource Group Name | Yes | — | Pre-existing Azure Resource Group. |
| Name Suffix | Yes | — | 3–12 lowercase alphanumeric chars used to namespace the resources (e.g., acme or prod). |
| Max file size (bytes) | No | 524288000 (500 MiB) | Files larger than this are skipped and sent to the DLQ. |
| Log retention (days) | No | 30 | Application Insights / Log Analytics retention (30–730). |
| Key Vault purge protection | No | true | Keep purge protection enabled for production. Set to false only during short pilots; once enabled it cannot be disabled and the Key Vault cannot be fully purged for 7 days after destroy. |
| Tags | No | {} | Extra tags applied to every module-managed resource. |
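
The constraints in the table can be checked before anything is deployed. The sketch below is a hypothetical pre-flight validator, not part of the module: the field names, the regular expressions, and the https:// requirement are assumptions derived from the descriptions and examples on this page.

```typescript
// Hypothetical config shape mirroring a few rows of the table above.
interface ConnectorConfig {
  siteUrl: string;
  nameSuffix: string;
  logRetentionDays?: number; // default 30
  maxFileSizeBytes?: number; // default 524288000 (500 MiB)
}

// Assumption: the site URL is https and its hostname ends with .sharepoint.com.
const SITE_URL_PATTERN = /^https:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*\.sharepoint\.com(\/\S*)?$/i;
const NAME_SUFFIX_PATTERN = /^[a-z0-9]{3,12}$/;

// Returns a list of human-readable problems; an empty list means the config looks valid.
function validateConfig(config: ConnectorConfig): string[] {
  const errors: string[] = [];
  const retention = config.logRetentionDays ?? 30;
  const maxSize = config.maxFileSizeBytes ?? 500 * 1024 * 1024;

  if (!SITE_URL_PATTERN.test(config.siteUrl)) {
    errors.push("siteUrl: hostname must end with .sharepoint.com");
  }
  if (!NAME_SUFFIX_PATTERN.test(config.nameSuffix)) {
    errors.push("nameSuffix: expected 3-12 lowercase alphanumeric characters");
  }
  if (!Number.isInteger(retention) || retention < 30 || retention > 730) {
    errors.push("logRetentionDays: expected an integer between 30 and 730");
  }
  if (!Number.isInteger(maxSize) || maxSize <= 0) {
    errors.push("maxFileSizeBytes: expected a positive integer");
  }
  return errors;
}

const validErrors = validateConfig({
  siteUrl: "https://contoso.sharepoint.com/sites/legal",
  nameSuffix: "acme",
});
```

For the example config, `validErrors` is empty. A look-alike host such as contoso.sharepoint.com.evil.example would be rejected because the hostname does not end with .sharepoint.com.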
Setup
The setup has a natural ordering: the dashboard requires the Tenant/Client/Site URL to create the connector, and the Function App requires the Connector API Key to call the ROOTKey API. The dashboard resolves this by generating a ready-to-run Terraform block with all values pre-filled.
Register an application in Microsoft Entra ID
- Name: something descriptive, e.g., ROOTKey SharePoint Connector.
- Supported account types: Accounts in this organizational directory only.
- Redirect URI: leave blank.
Grant Microsoft Graph permissions
- Sites.Read.All
- Files.Read.All
Create a client secret
- Set an expiry appropriate for your rotation policy (e.g., 12 or 24 months).
- Click Add and immediately copy the Value — it is shown only once.
Pre-create the Resource Group
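Create the Resource Group (or choose an existing one) and make sure the identity that will run Terraform has Contributor and User Access Administrator on that Resource Group.
Confirm the SharePoint site URL
- https://contoso.sharepoint.com/sites/legal
- https://contoso.sharepoint.com/sites/marketing/
- https://contoso.sharepoint.com (root site)
The hostname must end with .sharepoint.com.
Create the connector in the dashboard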
Copy the Connector API Key and the Terraform block
- The Connector API Key.
- A ready-to-run Terraform block, pre-filled with your values.
Deploy the Terraform module
Save the generated Terraform block in a .tf file in an empty directory, then run terraform init followed by terraform apply.
Validate the connector
Because the reconciliation timer has runOnStartup: true, the Function App reconciles subscriptions within seconds of deploy. You can confirm by inspecting the connector-state blob container: it should contain subscriptions.json (with one entry per document library on the site) and, after the first sync, one delta-{driveId}.txt per drive.
Then upload a test file to any document library on the site. Within seconds it should appear in the destination vault and the connector status in the dashboard should be ACTIVE.
Reliability and observability
The connector is built for at-least-once delivery to ROOTKey with explicit handling of every failure mode.
Retry behaviour
| Layer | Retries | When |
|---|---|---|
| In-function retry | 3 attempts per file (initial + 2 retries) with exponential backoff (1s → 2s, capped 30s) and jitter | On 429, 5xx, network or timeout errors from the ROOTKey API. |
| DLQ replay | The queue-triggered dlqReplay function automatically reprocesses every DLQ message, with another 3-attempt in-function budget per replay | Until either success or the queue’s maxDequeueCount (5) is exhausted. |
| Poison queue | If all DLQ replays fail with transient errors, the message lands in rootkey-dlq-poison for human inspection | Persistent or systemic failure. |
| Permanent failure short-circuit | When DLQ replay hits a PermanentError (oversize, 4xx), the message is acked immediately and a structured marker is logged | The poison queue is bypassed deliberately so it stays reserved for “we don’t know why this keeps failing”. |
| Graph-side retry | Microsoft Graph retries the webhook for up to ~4 hours with exponential backoff | The Function App returns 5xx (only when the delta query itself fails). |
| Safety-net delta sync | The reconciliation timer also runs a delta query per drive — catches up on any missed webhooks | Every 12h and on every cold start. |
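
As an illustration of the in-function retry row, the sketch below computes the delay schedule for 3 attempts with a 1 s base, doubling, and a 30 s cap. The jitter formula (up to 20% added on top) and all names are assumptions; the table only states that jitter is applied.

```typescript
const MAX_ATTEMPTS = 3; // initial attempt + 2 retries
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const JITTER_FRACTION = 0.2; // assumption: the exact jitter strategy is not documented

// 429 and 5xx responses are retried; other statuses are not.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before the given retry (1-based): 1 s, 2 s, ... capped at 30 s, plus jitter.
function backoffDelayMs(retry: number, random: () => number = Math.random): number {
  const exponential = Math.min(BASE_DELAY_MS * Math.pow(2, retry - 1), MAX_DELAY_MS);
  const jitter = exponential * JITTER_FRACTION * random();
  return Math.min(Math.round(exponential + jitter), MAX_DELAY_MS);
}

// Delays between the attempts of a single file upload.
function retrySchedule(random: () => number = Math.random): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < MAX_ATTEMPTS; retry++) {
    delays.push(backoffDelayMs(retry, random));
  }
  return delays;
}

const scheduleWithoutJitter = retrySchedule(() => 0); // [1000, 2000]
```

Passing the random source as a parameter keeps the schedule deterministic in tests while defaulting to Math.random in normal use.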
Idempotency
Every upload to the ROOTKey API carries three headers extracted from the Graph drive item:
| Header | Source |
|---|---|
| x-rootkey-source-drive-id | The drive that emitted the notification. |
| x-rootkey-source-item-id | Graph driveItem.id. |
| x-rootkey-source-etag | Graph driveItem.eTag (quotes stripped). |
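
A short sketch of how these headers could be assembled from a drive item. The function name, the item shape, and the sample values are hypothetical; only the header names and the quote stripping come from the table above.

```typescript
// Only the driveItem fields needed for the headers.
interface DriveItemRef {
  id: string;
  eTag: string;
}

function idempotencyHeaders(driveId: string, item: DriveItemRef): Record<string, string> {
  return {
    "x-rootkey-source-drive-id": driveId,
    "x-rootkey-source-item-id": item.id,
    // Graph returns eTags wrapped in double quotes; strip the surrounding quotes.
    "x-rootkey-source-etag": item.eTag.replace(/^"+|"+$/g, ""),
  };
}

// Hypothetical IDs and eTag value, for illustration only.
const uploadHeaders = idempotencyHeaders("drive-a", { id: "item-42", eTag: '"{ABC123},2"' });
// uploadHeaders["x-rootkey-source-etag"] is {ABC123},2
```

Because the eTag changes with each revision of the item, the three values together identify one specific version of one file.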
Concurrent invocations
The connector serializes work at two levels:
- Per-drive sync lease (delta-sync-{driveId}.lock): only one delta sync per drive runs at a time. Independent drives sync in parallel. Concurrent invocations for the same drive return 202 Accepted immediately.
- Global reconciliation lease (subscriptions-reconciliation.lock): only one timer instance reconciles subscriptions at a time. Without this, two concurrent timer runs would race on subscriptions.json and create duplicate Graph subscriptions per drive (Graph allows duplicates per resource; each duplicate generates an extra notification per change).
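
The per-drive lease behaviour can be modelled with an in-memory stand-in for the lock blobs. This is a sketch of the try-lock pattern only: the class and function names are hypothetical, and the real connector coordinates across instances through blobs in the connector-state container rather than process memory.

```typescript
// In-memory stand-in for the lock blobs; a real lease must work across instances.
class LeaseTable {
  private held = new Set<string>();

  tryAcquire(lockName: string): boolean {
    if (this.held.has(lockName)) return false;
    this.held.add(lockName);
    return true;
  }

  release(lockName: string): void {
    this.held.delete(lockName);
  }
}

type SyncResult = { started: true } | { started: false; status: 202 };

// Runs doSync only if no other sync holds the lease for this drive.
function syncDrive(leases: LeaseTable, driveId: string, doSync: () => void): SyncResult {
  const lockName = `delta-sync-${driveId}.lock`;
  if (!leases.tryAcquire(lockName)) {
    return { started: false, status: 202 }; // another sync is already running
  }
  try {
    doSync();
  } finally {
    leases.release(lockName);
  }
  return { started: true };
}

const leases = new LeaseTable();
const results: string[] = [];
syncDrive(leases, "drive-a", () => {
  // While drive-a is syncing, a second drive-a notification is skipped; drive-b proceeds.
  const sameDrive = syncDrive(leases, "drive-a", () => {});
  const otherDrive = syncDrive(leases, "drive-b", () => {});
  results.push(`same:${sameDrive.started}`, `other:${otherDrive.started}`);
});
```

Releasing in a finally block means a failed sync does not leave the drive locked.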
What to monitor
| Signal | What it means | How to alert |
|---|---|---|
| rootkey-dlq queue length > 0 for more than ~10 min | DLQ replay is failing repeatedly with transient errors. | Azure Monitor metric alert on the queue length. |
| rootkey-dlq-poison queue receives a message | A file failed all DLQ replay attempts for a transient-looking reason; human attention is needed. | Azure Monitor metric alert on ApproximateMessageCount. |
| App Insights trace contains rootkey.event.dlq_replay_terminal_failure | A file hit a permanent error during DLQ replay (oversize, 4xx). The poison queue is bypassed deliberately. | App Insights alert on the message marker. |
| App Insights trace contains rootkey.metric.sync_lease_contention (steady-state) | Concurrent notifications for the same drive serialize; small numbers are healthy. Sustained high rate means a drive is being hammered. | App Insights chart on bin(timestamp, 5m). |
| App Insights trace contains rootkey.metric.reconciliation_lease_contention | Two timer instances raced on subscription reconciliation; the loser skipped. Expected at most once per cycle. | App Insights alert if it appears more than ~3×/day. |
| Function App Http5xx > 0 | The webhook is failing (Graph will retry). | Azure Monitor metric alert. |
| renewSubscription hasn’t run in > 13h | Timer is unhealthy or the Function App is stopped. | App Insights availability or platform health metric. |
| ROOTKey dashboard connector status ERROR | API rejected an upload (invalid key, vault deleted, quota). | Email/Slack via your dashboard notification settings. |
Security considerations
The module ships with a defensive default posture; a few choices have intentional trade-offs that are worth understanding upfront:
- Secrets in Key Vault, not Function App settings. Graph client secret, ROOTKey API key, and webhook clientState are all stored in Key Vault with the Function App’s managed identity granted Key Vault Secrets User (read-only) RBAC.
- Key Vault purge protection is enabled by default. Set enable_key_vault_purge_protection = false only for short pilots; once enabled it CANNOT be disabled and the vault cannot be fully purged for 7 days after terraform destroy.
- CORS is closed. The webhook is server-to-server (Graph); browser access is explicitly disallowed.
- HTTPS-only, TLS 1.2 minimum, FTPS disabled, HTTP/2 enabled on the Function App.
- shared_access_key_enabled = true on the Storage Account is a known constraint of the Azure Functions Consumption plan: the runtime requires the legacy AzureWebJobsStorage connection string to bootstrap. The connector’s own state operations use RBAC via the managed identity, not the keys. To remove the keys entirely, you need to move to a Premium / Flex Consumption / App Service plan that supports identity-based connections; that path is on the ROOTKey roadmap.
Filtering Rules
To anchor only specific files (e.g., only PDFs, or exclude temporary files), configure Filtering Rules on the connector after creation. Rules apply on the ROOTKey side: files filtered out are not stored in the vault.
Troubleshooting
Files are uploaded to SharePoint but nothing reaches ROOTKey
Connector status is ERROR in the dashboard
- Client Secret has expired. Generate a new secret in the App Registration, update the graph_client_secret Terraform variable, and run terraform apply. The new secret is written to Key Vault and the Function App picks it up on the next cold start (or restart the Function App to force it).
- Admin consent was revoked for one of the Graph permissions. Re-grant consent in the App Registration.
- The destination ROOTKey vault was deactivated or deleted. Reactivate it or change the connector’s vault binding.
A new document library on the site isn't being monitored
New libraries are picked up on the next 12-hour reconciliation cycle. To force it sooner, restart the Function App (e.g., az functionapp restart); runOnStartup: true on the timer triggers reconciliation on the next cold start.
If the library still doesn’t appear in subscriptions.json after a restart, check the App Insights traces for the renewSubscription invocation. The most likely cause is a permission issue with that specific library.
A large file fails to upload
The Function App’s memory ceiling and max_file_size_bytes together cap the maximum file size. The default is 500 MiB on a Consumption plan instance.
To support larger files, raise max_file_size_bytes in the Terraform module input and consider moving to a Premium or App Service plan with a higher memory ceiling. Open an issue on the connector repository if you need help.
How do I rotate the Connector API Key?
- In the dashboard, delete the connector and create a new one (the App Registration and Site URL can be reused).
- Update the rootkey_api_key Terraform variable with the new key.
- Run terraform apply. The module writes the new key into Key Vault, and the Function App picks it up on the next cold start.
How do I rotate the Graph client secret?
- In Microsoft Entra ID → App registrations → your app → Certificates & secrets, create a new client secret. Copy its value immediately.
- Update the graph_client_secret Terraform variable.
- Run terraform apply. The new value goes into Key Vault.
- Restart the Function App (e.g., az functionapp restart) to force it to pick up the new secret immediately, instead of waiting for the cached OAuth token to expire (up to 1 h).
- After confirming the connector is healthy, delete the old secret in the App Registration.
Can I monitor multiple SharePoint sites?
What is the rootkey-dlq-poison queue and what do I do if I see messages there?
- The drive item was deleted or moved by the user before retries could complete (safe — the message can be discarded).
- The ROOTKey vault is unreachable due to a misconfiguration or a key/vault rotation gone wrong.
- A persistent Graph permission issue affecting one library.
After fixing the underlying cause, you can replay the messages by moving them back to rootkey-dlq (Azure Storage Explorer makes this easy).
Why does my Terraform plan want to recreate the Key Vault on destroy?
With enable_key_vault_purge_protection = true (the default), Azure prevents the Key Vault from being fully deleted until the soft-delete retention window (7 days) elapses. If you destroy and re-apply within that window, Terraform may attempt to recover the soft-deleted Key Vault automatically (the module’s provider config enables recover_soft_deleted_key_vaults). For short-lived pilots that need to recycle freely, set enable_key_vault_purge_protection = false.
Source code
The Terraform module and Function App source live in the public rootkey-ai/rootkey-connectors repository under the sharepoint/ directory. The code is licensed under the Apache License 2.0, so you are free to fork it, audit it, or pin to a specific commit if your change-management process requires it.

