# Cloudflare R2 Connector

> Automatically forward files from a Cloudflare R2 bucket to ROOTKey using a Terraform-deployed Cloudflare Worker in your Cloudflare account, with native queue-based retries, dead-letter queue, and secret management.

## How It Works

The R2 connector is a self-contained integration that runs **entirely inside your Cloudflare account**. ROOTKey publishes a [Terraform module](https://github.com/rootkey-ai/rootkey-connectors/tree/main/r2) that you deploy once per bucket. The module wires a Cloudflare Worker to a Cloudflare Queue that receives R2 event notifications whenever a new object is created.

When a new file is uploaded to the monitored bucket (and matches your [filtering rules](/pages/connectors/rules), if configured), the Worker reads the object directly from R2 via an in-cluster binding (zero egress) and streams it to the ROOTKey API authenticated with your **Connector API Key**. ROOTKey stores the file and anchors it on-chain, enabling both integrity verification and **full file recovery** in the event of corruption, ransomware, or accidental deletion.

```
R2 Bucket          Cloudflare Queue        Cloudflare Worker          ROOTKey API
─────────          ────────────────        ─────────────────          ───────────
  Object Created  →  Auto-published    →   Reads via R2 binding  →    POST /api-v1/connectors/files/
  event              (batched up to 25)    Streams to ROOTKey         (Connector API Key
                                           Acks on success            from Workers Secret,
                                           Retries on transient       + idempotency headers)
                                           failure (5x) → DLQ
```
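
The Worker step in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual source: `ObjectBody` and `UploadInit` are simplified stand-ins for the Workers runtime types, `buildUploadRequest` is a hypothetical helper, and the `authorization: Bearer` header is an assumption because the real authentication header is not specified on this page.

```typescript
// Simplified stand-in for the object returned by an R2 binding's get() call.
interface ObjectBody {
  body: unknown; // a ReadableStream in the Workers runtime
  size: number;
  etag: string;
}

interface UploadInit {
  method: string;
  headers: Record<string, string>;
  body: unknown;
}

// Build the request that streams one R2 object to the ROOTKey API.
function buildUploadRequest(
  apiUrl: string,
  apiKey: string,
  object: ObjectBody,
): { url: string; init: UploadInit } {
  // Tolerate a trailing slash on the configured base URL.
  const base = apiUrl.replace(/\/+$/, "");
  return {
    url: `${base}/api-v1/connectors/files/`,
    init: {
      method: "POST",
      // ASSUMPTION: header name and scheme are illustrative only.
      headers: { authorization: `Bearer ${apiKey}` },
      // The stream is handed over as-is, so the file is never buffered whole.
      body: object.body,
    },
  };
}
```

In a Worker, the result would be passed to `fetch(url, init)` after the object has been read through the bucket binding, with the key taken from `env.ROOTKEY_API_KEY`.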

<Note>
  ROOTKey's cyber resilience guarantee includes full recovery — not just detection. For that reason the connector uploads the **full file content**, not only a hash. Anchoring a hash alone cannot restore a corrupted, encrypted, or deleted file.
</Note>

<Note>
  **Egress to ROOTKey is free** when this connector runs on Cloudflare. The R2 → Worker read is in-cluster (no internet egress), and Cloudflare does not bill egress on the Worker's outbound `fetch` requests. Compared to the equivalent S3 connector on AWS, this removes the per-GB egress charge, which matters most for large-volume customers.
</Note>

***

## What This Module Creates in Your Cloudflare Account

Full transparency on what lands in your account when you `terraform apply`:

| Resource                                  | Purpose                                                                                                                                        | Cost impact                                                                            |
| ----------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| `cloudflare_workers_script`               | The connector itself (TypeScript bundled to \~6 KB ESM, deployed as a Workers module).                                                         | Free tier covers 100,000 invocations/day; paid plan is \$5/month for 10M + \$0.30/M after. |
| `cloudflare_workers_secret`               | Holds the ROOTKey Connector API Key. Encrypted at rest; **never visible in plaintext** through the dashboard or API after creation.            | None (free with Workers).                                                              |
| `cloudflare_queue` (×2)                   | Main events queue + dead-letter queue. R2 publishes object-created events to the main queue; the Worker consumes from it.                      | \~\$0.40 per million operations (R2 event + Worker ack = \~2 ops).                     |
| `cloudflare_queue_consumer`               | Binds the Worker as the consumer of the main queue with `max_retries = 5` and the DLQ as the failure destination.                              | None.                                                                                  |
| `cloudflare_r2_bucket_event_notification` | Routes `PutObject`, `CompleteMultipartUpload`, and `CopyObject` events from your bucket into the queue. Supports server-side prefix filtering. | None.                                                                                  |

The module **does not** create the R2 bucket itself — you create that. It also does not modify any existing R2 event notifications on the bucket; the new notification config is additive (multiple notifications can coexist on the same bucket).

For a typical bucket with a few thousand uploads per month, the **total recurring cost added to your Cloudflare bill is well under one dollar per month**, dominated by Queue operations.
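
The arithmetic behind that estimate can be checked with a short sketch. It uses only the figures from the table above (about \$0.40 per million Queue operations and roughly 2 operations per upload) and deliberately ignores retries, Workers charges, and any free allowances, so treat the result as an order-of-magnitude figure. The function name is hypothetical.

```typescript
// Figures taken from the resource table above.
const PRICE_PER_MILLION_OPS_USD = 0.4;
const OPS_PER_UPLOAD = 2; // approximation: R2 event + Worker ack

// Estimate the monthly Cloudflare Queues charge for a given upload volume.
function estimateMonthlyQueueCostUsd(
  uploadsPerMonth: number,
  opsPerUpload: number = OPS_PER_UPLOAD,
): number {
  if (!Number.isFinite(uploadsPerMonth) || uploadsPerMonth < 0) {
    throw new RangeError("uploadsPerMonth must be a non-negative number");
  }
  const operations = uploadsPerMonth * opsPerUpload;
  return (operations / 1e6) * PRICE_PER_MILLION_OPS_USD;
}

// 5,000 uploads -> 10,000 operations
console.log(estimateMonthlyQueueCostUsd(5000).toFixed(4)); // prints "0.0040"
```

Even at one million uploads per month, the same formula gives about \$0.80 in Queue operations.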

***

## Infrastructure Impact Summary

<AccordionGroup>
  <Accordion title="Does this modify my R2 bucket?" icon="bucket">
    No bucket-level data or settings are mutated. The module adds an **event notification** binding on the bucket — additive, doesn't replace anything. Any existing event notifications you have on the bucket are untouched.
  </Accordion>

  <Accordion title="Does this affect my other Cloudflare Workers or Queues?" icon="user-shield">
    No. The module creates one Worker, two Queues, and one event-notification binding, all uniquely named with a `name_suffix` you provide plus a hash of the bucket name. Pre-existing Workers, Queues, R2 buckets, and Pages projects in the same account are not touched.
  </Accordion>

  <Accordion title="Does the file content leave my Cloudflare account?" icon="cloud-arrow-up">
    Yes. The whole purpose of the connector is to forward file content to ROOTKey so it can be anchored and recovered. Transport is HTTPS-only (the module rejects non-`https://` API URLs at plan time). Files are streamed from R2 to the ROOTKey API; the Worker never persists them anywhere else. Cloudflare bills \$0 egress for this path.
  </Accordion>

  <Accordion title="What happens to my API key?" icon="key">
    The Connector API Key you paste into Terraform is stored as a Cloudflare Workers Secret **in your own account**, encrypted at rest. After creation it cannot be read back through the dashboard or API — only rotated via `terraform apply` with a new value. The Worker reads it from `env.ROOTKEY_API_KEY` at runtime.
  </Accordion>

  <Accordion title="What happens if the ROOTKey API is unreachable?" icon="circle-exclamation">
    The Worker calls `message.retry()` and Cloudflare Queues automatically retries the message with exponential backoff up to **5 times**. Once retries are exhausted, the message is moved to the dead-letter queue for human inspection. Permanent failures (oversize files, `4xx` responses from ROOTKey other than `429`) deliberately bypass the DLQ: the Worker acks them immediately and emits the stable log marker `rootkey.event.dlq_terminal_failure` for alerting.
  </Accordion>

  <Accordion title="Can I roll this back?" icon="rotate-left">
    Yes. Running `terraform destroy` removes every resource the module created (Worker, secret, both queues, queue consumer, event notification). The R2 bucket itself is **not** deleted — it is your resource, not the module's.
  </Accordion>
</AccordionGroup>

***

## Prerequisites

Before starting, ensure you have:

* A **Cloudflare account** with R2 enabled.
* A pre-existing **R2 bucket** to monitor. The module does not create the bucket.
* A **Cloudflare API token** with permissions to manage Workers, Queues, R2 event notifications, and Workers Secrets in the target account. Generate one at [dash.cloudflare.com → My Profile → API Tokens](https://dash.cloudflare.com/profile/api-tokens).
* Your **Cloudflare Account ID** (32-char hex, visible in the dashboard sidebar).
* [Terraform](https://developer.hashicorp.com/terraform/install) v1.3 or later.
* [Node.js](https://nodejs.org) 22+ on the machine running Terraform (used to compile the Worker at `terraform apply` time).

No additional Cloudflare-side configuration is needed. Unlike the AWS/Azure connectors, the Worker's scoping is implicit in the Terraform-declared bindings — there is no separate IAM role or App Registration to create.

***

## API Token Permissions

The Cloudflare API token used to run `terraform apply` needs the following permissions on the target account:

| Permission                                         | Why                                                        |
| -------------------------------------------------- | ---------------------------------------------------------- |
| `Workers Scripts: Edit`                            | Create, update, and delete the Worker.                     |
| `Workers Secrets: Edit`                            | Set the Connector API Key on the Worker.                   |
| `Queues: Edit`                                     | Create the main queue, the DLQ, and the consumer binding.  |
| `R2 Bucket: Read` + `R2 Event Notifications: Edit` | Read bucket metadata and configure the event notification. |

For least-privilege deployments, scope the token to the specific account (not "All Accounts").

***

## Configuration Fields

| Field                     | Required | Default                  | Description                                                                                                            |
| ------------------------- | -------- | ------------------------ | ---------------------------------------------------------------------------------------------------------------------- |
| **Connector Name**        | Yes      | —                        | A human-readable name to identify this connector in the dashboard.                                                     |
| **Destination Vault**     | Yes      | —                        | The ROOTKey vault where anchored files will be stored.                                                                 |
| **Cloudflare Account ID** | Yes      | —                        | 32-character hex Account ID.                                                                                           |
| **Bucket Name**           | Yes      | —                        | Exact name of the R2 bucket to monitor.                                                                                |
| **Name Suffix**           | Yes      | —                        | 3–12 lowercase alphanumeric chars used to namespace the resources (e.g. `acme` or `prod`).                             |
| **Prefix**                | No       | `""`                     | R2 key prefix filter. Empty means the entire bucket. Filtering happens server-side at the R2 event-notification layer. |
| **ROOTKey API URL**       | No       | `https://api.rootkey.ai` | ROOTKey API base URL. Must use `https://`.                                                                             |
| **Max file size (bytes)** | No       | `524288000` (500 MiB)    | Objects larger than this are skipped with a structured log marker.                                                     |
| **Tags**                  | No       | `{}`                     | Tags applied to the Worker script (queue/event-notification tagging is not currently supported by Cloudflare).         |
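
The constraints in the table can be expressed as a small pre-flight check. This is a hedged sketch rather than the module's own validation: the `ConnectorConfig` shape, the camelCase field names, and the error messages are all hypothetical, and the accepted character sets follow the table's wording as literally as possible.

```typescript
// Hypothetical mirror of the configuration fields above.
interface ConnectorConfig {
  cloudflareAccountId: string;
  bucketName: string;
  nameSuffix: string;
  prefix?: string;
  rootkeyApiUrl?: string;
  maxFileSizeBytes?: number;
}

// Return a list of problems; an empty list means the config looks valid.
function validateConfig(config: ConnectorConfig): string[] {
  const errors: string[] = [];
  if (!/^[0-9a-f]{32}$/i.test(config.cloudflareAccountId)) {
    errors.push("Cloudflare Account ID must be 32 hex characters");
  }
  if (config.bucketName.length === 0) {
    errors.push("Bucket Name must not be empty");
  }
  if (!/^[a-z0-9]{3,12}$/.test(config.nameSuffix)) {
    errors.push("Name Suffix must be 3-12 lowercase alphanumeric characters");
  }
  // Defaults below are the ones documented in the table.
  const apiUrl = config.rootkeyApiUrl ?? "https://api.rootkey.ai";
  if (!apiUrl.startsWith("https://")) {
    errors.push("ROOTKey API URL must use https://");
  }
  const maxSize = config.maxFileSizeBytes ?? 524288000;
  if (!Number.isInteger(maxSize) || maxSize <= 0) {
    errors.push("Max file size must be a positive integer number of bytes");
  }
  return errors;
}
```

Returning every problem at once, instead of throwing on the first, makes it easier to fix a generated block in a single pass.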

***

## Setup

The setup has a natural ordering dependency: the dashboard requires the Cloudflare Account ID and bucket name to create the connector, and the Worker requires the **Connector API Key** to call the ROOTKey API. The dashboard resolves this by generating a ready-to-run Terraform block with all values pre-filled.

<Steps>
  <Step title="Create the R2 bucket (if you don't have one)">
    In the Cloudflare dashboard → R2 → Create bucket, or via Wrangler:

    ```bash
    wrangler r2 bucket create my-company-documents
    ```

    Note the bucket name.
  </Step>

  <Step title="Generate a Cloudflare API token">
    Go to [dash.cloudflare.com → My Profile → API Tokens → Create Token](https://dash.cloudflare.com/profile/api-tokens) and create a token with the permissions listed in [API Token Permissions](#api-token-permissions) above.

    Copy the token — you'll export it before running Terraform.
  </Step>

  <Step title="Find your Cloudflare Account ID">
    In the Cloudflare dashboard, the **Account ID** is visible in the right sidebar of any account-level page (32-char hex string).
  </Step>

  <Step title="Create the connector in the dashboard">
    Go to [app.rootkey.ai](https://app.rootkey.ai) → **Connectors** → **New Connector** → select **Cloudflare R2**.

    Fill in all required fields (see [Configuration Fields](#configuration-fields) above). Save the connector.
  </Step>

  <Step title="Copy the Connector API Key and the Terraform block">
    At the end of the wizard, the dashboard displays:

    1. The **Connector API Key**.
    2. A **ready-to-run Terraform block**, pre-filled with your values.

    <Warning>
      The Connector API Key is shown **only once** and is already embedded in the Terraform block. Copy both now and store them securely before closing this screen. The key cannot be retrieved again.
    </Warning>

    The generated block looks like:

    ```hcl
    module "rootkey_r2_connector" {
      source = "github.com/rootkey-ai/rootkey-connectors//r2"

      cloudflare_account_id = "00112233445566778899aabbccddeeff"
      bucket_name           = "my-company-documents"
      name_suffix           = "acme"

      rootkey_api_key = "rk_conn_xxxxxxxxxxxxxxxxxxxx"

      # Optional
      prefix              = "documents/"
      max_file_size_bytes = 524288000
      tags = {
        cost-center = "security"
      }
    }
    ```
  </Step>

  <Step title="Deploy the Terraform module">
    Save the block into a `.tf` file in an empty directory, export your Cloudflare API token, and apply:

    ```bash
    export CLOUDFLARE_API_TOKEN="cf_..."
    terraform init
    terraform apply
    ```

    The module bundles the Worker source, uploads it to Cloudflare, creates both queues, wires the consumer binding, and configures the R2 event notification.
  </Step>

  <Step title="Validate the connector">
    Upload a test object to the R2 bucket:

    ```bash
    wrangler r2 object put my-company-documents/test.txt --file ./test.txt
    ```

    Within a few seconds the object should appear anchored in the destination vault, and the connector status in the dashboard should be **ACTIVE**.

    To tail Worker logs in real time:

    ```bash
    wrangler tail $(terraform output -raw worker_name)
    ```
  </Step>
</Steps>

***

## Reliability and Observability

The connector leans on Cloudflare Queues' native retry + DLQ semantics — there's no in-Worker retry layer to debug.

### Retry behaviour

| Layer                           | Retries                                                                        | When                                                                                                                                     |
| ------------------------------- | ------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Queue consumer                  | Up to 5 retries with exponential backoff (native Cloudflare Queues behaviour)  | On `message.retry()`, triggered by 429/5xx responses from ROOTKey or network errors.                                                     |
| Dead-letter queue               | Final destination after retry exhaustion                                       | Operators inspect via Cloudflare API or dashboard.                                                                                       |
| Permanent failure short-circuit | `PermanentError` is caught and acked immediately, bypassing the DLQ            | `4xx` other than `429` from ROOTKey, oversize files, or objects that no longer exist in R2. Surfaced via `rootkey.event.dlq_terminal_failure` log marker. |
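
The decision logic in the table can be sketched as a pure function plus a small settle step. This is an illustrative reading of the table, not the Worker's real code: `Outcome`, `classifyStatus`, and `settle` are hypothetical names, and `QueueMessage` models only the `ack()` and `retry()` methods of a Cloudflare Queues message. Network errors, which never produce a status code, would be mapped to `"retry"` by the caller.

```typescript
type Outcome = "ack" | "retry" | "permanent";

// Only the two message methods this sketch needs.
interface QueueMessage {
  ack(): void;
  retry(): void;
}

const TERMINAL_MARKER = "rootkey.event.dlq_terminal_failure";

// Map a ROOTKey API status code to what should happen to the queue message.
function classifyStatus(status: number): Outcome {
  if (status >= 200 && status < 300) return "ack";
  if (status === 429 || status >= 500) return "retry"; // transient
  return "permanent"; // everything else, in practice non-429 4xx
}

// Apply the outcome: permanent failures are logged and acked so that
// they never consume the retry budget or land in the DLQ.
function settle(
  message: QueueMessage,
  outcome: Outcome,
  log: (line: string) => void,
): void {
  switch (outcome) {
    case "ack":
      message.ack();
      break;
    case "retry":
      message.retry();
      break;
    case "permanent":
      log(TERMINAL_MARKER);
      message.ack();
      break;
  }
}
```

Oversize files and objects that no longer exist in R2 would be settled as `"permanent"` before any upload is attempted.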

### Idempotency

Every upload to the ROOTKey API carries three headers extracted from the R2 object:

| Header                    | Source                                                                       |
| ------------------------- | ---------------------------------------------------------------------------- |
| `x-rootkey-source-bucket` | R2 event `bucket`                                                            |
| `x-rootkey-source-key`    | R2 event `object.key`                                                        |
| `x-rootkey-source-etag`   | Live R2 object ETag (read at processing time — preferred over event payload) |

The ROOTKey API uses these to deduplicate redelivered events.
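
A minimal sketch of how these headers might be assembled is shown below. The header names come from the table; the `R2ObjectCreatedEvent` shape is a trimmed-down assumption about the event payload, the function name is hypothetical, and falling back to the event's ETag when no live ETag is available is an assumption inferred from the word "preferred". Encoding of keys that contain characters not permitted in HTTP headers is out of scope here.

```typescript
// Trimmed-down assumption of the R2 event payload: only the fields used here.
interface R2ObjectCreatedEvent {
  bucket: string;
  object: { key: string; eTag?: string };
}

// Build the three idempotency headers for one upload.
function idempotencyHeaders(
  event: R2ObjectCreatedEvent,
  liveEtag?: string,
): Record<string, string> {
  return {
    "x-rootkey-source-bucket": event.bucket,
    "x-rootkey-source-key": event.object.key,
    // Prefer the ETag read at processing time; fall back to the event's.
    "x-rootkey-source-etag": liveEtag ?? event.object.eTag ?? "",
  };
}
```

Because the same bucket, key, and ETag always yield the same header values, a redelivered event for an unchanged object is recognisable on the receiving side.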

### What to monitor

| Signal                                                   | What it means                                                   | How to alert                                          |
| -------------------------------------------------------- | --------------------------------------------------------------- | ----------------------------------------------------- |
| DLQ queue depth > 0 for more than \~10 min               | Transient failures exhausted the retry budget.                  | Cloudflare dashboard → Queues → depth metric.         |
| Worker logs contain `rootkey.event.dlq_terminal_failure` | A message hit a permanent error and was acked without retrying. | Workers Logpush + alert on the marker.                |
| Worker invocation error rate > 0                         | Recurring runtime errors.                                       | Workers Analytics → error count.                      |
| ROOTKey dashboard connector status `ERROR`               | API rejected uploads (invalid key, vault deleted, quota).       | Email/Slack via your dashboard notification settings. |
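
As a rough illustration of alerting on the log marker, the sketch below counts marker occurrences in a batch of log lines. It assumes the lines are already available as plain strings (for example, exported by Logpush or captured from `wrangler tail`); the function names and the threshold parameter are hypothetical, and a real pipeline would normally do this matching in the log platform itself.

```typescript
const MARKER = "rootkey.event.dlq_terminal_failure";

// Count log lines that contain the terminal-failure marker.
function countTerminalFailures(logLines: string[]): number {
  return logLines.filter((line) => line.includes(MARKER)).length;
}

// Decide whether a batch of lines should raise an alert.
function shouldAlert(logLines: string[], threshold: number = 1): boolean {
  return countTerminalFailures(logLines) >= threshold;
}
```

A plain substring match works whether the marker appears in free text or inside a JSON-encoded log record.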

To peek at the DLQ contents:

```bash
curl -X POST \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/queues/${DLQ_ID}/messages/pull" \
  -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"batch_size": 10, "visibility_timeout_ms": 30000}'
```

***

## Security Considerations

The module ships with a defensive default posture out of the box:

* **API key in a Workers Secret, not in code.** Encrypted at rest, not readable after creation. To rotate it, run `terraform apply` with a new value.
* **HTTPS-only.** The module rejects non-`https://` API URLs at plan time.
* **In-cluster R2 read.** The Worker reads R2 objects via a binding — no HTTP egress, no public network exposure for the data path.
* **Bucket-scoped binding.** The Worker's R2 binding is scoped to one bucket. If the Worker is ever compromised, the blast radius is one bucket — not the customer's whole R2 footprint.
* **No additional IAM model to manage.** Unlike the AWS/Azure connectors, there is no role or App Registration to audit; the Worker can only use the bindings declared in Terraform.

***

## Filtering Rules

To anchor only specific files (e.g., only PDFs, or exclude temporary files), configure [Filtering Rules](/pages/connectors/rules) on the connector after creation. Rules apply on the ROOTKey side — files filtered out are not stored in the vault.

You can also restrict at the R2 side by setting the **Prefix** field, which translates into the event-notification server-side filter: events for objects whose key does not start with the prefix never invoke the Worker.
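
The prefix semantics amount to a plain "starts with" test on the object key. The sketch below only models that behaviour for reasoning about which keys would be forwarded; the actual filtering is performed by R2, and `matchesPrefix` is a hypothetical name.

```typescript
// Model of the server-side prefix filter: an empty prefix matches every key.
function matchesPrefix(key: string, prefix: string): boolean {
  return key.startsWith(prefix);
}

console.log(matchesPrefix("documents/a.pdf", "documents/")); // true
console.log(matchesPrefix("tmp/a.pdf", "documents/")); // false
console.log(matchesPrefix("anything.bin", "")); // true
console.log(matchesPrefix("documents/a.pdf", "/documents/")); // false
```

The last case shows a common slip: a prefix written with a leading slash does not match keys that are stored without one.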

***

## Troubleshooting

<AccordionGroup>
  <Accordion title="Objects are uploaded to R2 but nothing reaches ROOTKey" icon="magnifying-glass">
    Check in this order:

    1. **Event notification is configured.** Cloudflare dashboard → R2 → your bucket → Settings → Event notifications must show the `rootkey-r2-events-*` queue subscribed.
    2. **DLQ depth.** Cloudflare dashboard → Queues → your DLQ. Non-zero means transient errors exhausted retries.
    3. **Worker logs.** `wrangler tail $(terraform output -raw worker_name)` — look for terminal-failure markers.
    4. **Filter prefix.** If you set `prefix`, events for keys not matching the prefix never reach the Worker.
  </Accordion>

  <Accordion title="Connector status is ERROR in the dashboard" icon="triangle-exclamation">
    Common causes:

    * The Workers Secret holding the Connector API Key was modified or deleted. Obtain a new key by deleting and recreating the connector, then run `terraform apply` with the new value.
    * The destination vault was deactivated or deleted. Reactivate it or change the connector's vault binding.

    The dashboard error panel shows the underlying message from the ROOTKey API.
  </Accordion>

  <Accordion title="A large file fails to upload" icon="file-zipper">
    The Workers paid-plan practical ceiling for streaming uploads is around **500 MiB**, which is the module default. Above this you may hit CPU/memory limits on the Worker.

    Options:

    * Stay under 500 MiB (recommended).
    * Move to Cloudflare Workers Unbound or the Standard pricing tier with more headroom and adjust `max_file_size_bytes` accordingly.
  </Accordion>

  <Accordion title="How do I rotate the Connector API Key?" icon="arrows-spin">
    1. In the ROOTKey dashboard, delete the connector and create a new one (the Cloudflare resources can be reused).
    2. Update the `rootkey_api_key` Terraform variable with the new key.
    3. Run `terraform apply` — the module overwrites the Workers Secret with the new value, and the Worker picks it up on the next invocation.
  </Accordion>

  <Accordion title="Can I monitor multiple R2 buckets?" icon="layer-group">
    Yes. Deploy the module once per bucket. Each instance is fully isolated, with its own Worker, queues, secret, and event notification binding. Namespacing is automatic via `name_suffix` plus a hash of the bucket name, so all instances can share the same Cloudflare account.
  </Accordion>

  <Accordion title="What happens if a message is in the DLQ?" icon="skull">
    The DLQ holds messages that the Worker could not process after all 5 retries. Pull a message from the DLQ via the Cloudflare API (see [Reliability and observability](#reliability-and-observability)), inspect it, and decide:

    * If the cause was transient (e.g., ROOTKey API was down): re-publish the message to the main queue.
    * If the cause was a permanent issue (key rotated incorrectly, vault deleted): resolve the root cause first, then replay.

    Permanent failures (oversize files, `4xx` responses other than `429`) should never reach the DLQ. They are short-circuited and visible only as `rootkey.event.dlq_terminal_failure` markers in Worker logs.
  </Accordion>
</AccordionGroup>

***

## Source Code

The Terraform module and Worker source live in the public [rootkey-ai/rootkey-connectors](https://github.com/rootkey-ai/rootkey-connectors) repository under the [`r2/`](https://github.com/rootkey-ai/rootkey-connectors/tree/main/r2) directory. The code is licensed under the Apache License 2.0 — you are free to fork it, audit it, or pin to a specific commit if your change-management process requires it.

***

→ Back to [Connectors Overview](/pages/connectors/overview)
