# API Quickstart

> Parse your first document with the Extend API and get back clean, agent-ready markdown.

This guide walks you through your first Extend API call. You'll parse a document with the [Parse endpoint](/parsing/overview) and get back clean, LLM-ready markdown and structured blocks you can feed into RAG, downstream extraction, or any other step in your pipeline.

***

## What we're going to parse

We'll parse a bank statement PDF with headers, an account summary, and a transactions table.

<img src="https://files.buildwithfern.com/extendconfig.docs.buildwithfern.com/6054ef645b2cc17010f5472ca513d6b4c02ad1c15ba0abc4dbe982408f517322/assets/images/quickstart/bank_statement_page_1.png" alt="Bank statement page 1" decoding="async" />

For this guide we've hosted the bank statement [here](https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf).

**What you'll get back:**

* Clean markdown for each page, ready to drop into an LLM prompt
* Layout-aware blocks (text, tables, figures, key-value pairs) with their positions on the page
* Page-level chunks you can index or post-process

For structured field extraction (pulling specific values like account numbers into typed JSON), see the [Extract endpoint](/extraction/overview) after this guide.

***

## Get your API key

Create a key on the [Developers page](https://dashboard.extend.ai/developers) and store it in an environment variable:

```bash
export EXTEND_API_KEY="your_api_key_here"
```
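
If you want the examples to fail fast when the variable is missing, a small check like the one below can help. This is a minimal sketch: `require_api_key` is a hypothetical helper, not part of the Extend SDK, and it only verifies that the variable is set and non-empty.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return EXTEND_API_KEY, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of the Extend SDK.
    """
    source = os.environ if env is None else env
    key = source.get("EXTEND_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "EXTEND_API_KEY is not set. Export it before running the examples."
        )
    return key


# Demonstrated with an explicit mapping so the example does not depend on
# your real environment. Call require_api_key() with no arguments to read
# from os.environ instead.
print(require_api_key({"EXTEND_API_KEY": "your_api_key_here"}))
```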

***

## Install the SDK

**Python:**

```bash
pip install extend-ai
```

**TypeScript:**

```bash
npm install extend-ai
```

**Java:**

```groovy
// build.gradle
implementation 'ai.extend:extend-java-sdk:1.13.0'
```

**Go:**

```bash
go get github.com/extend-hq/extend-go-sdk
```

**cURL:**

No installation needed. Call the API directly with `curl` or any HTTP client.

Prefer raw HTTP? Use the cURL tab below and skip this step. For Maven and other install options, see the [SDKs page](/sdks).

***

## Parse the document

Now let's parse the sample document step by step. Pick your language: each tab walks through initializing the client, parsing the hosted document, and reading the result. Prefer to use your own file? Each "Parse the document" step has a tip showing how to [upload](/api-reference/endpoints/file/upload-file) it and pass the returned `id` instead.

**Python:**

Create a client. It automatically reads your API key from the `EXTEND_API_KEY` environment variable you set earlier.

```python
from extend_ai import Extend

client = Extend()
```

Pass the hosted file `url` to `parse`. This synchronous call sends the document through the pipeline (OCR, layout detection, table extraction, chunking) and returns a fully populated `ParseRun`.

```python
response = client.parse(
    file={"url": "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf"}
)

print(response.status)  # PROCESSED
```

Want to parse your own document? [Upload it](/api-reference/endpoints/file/upload-file) first, then pass the returned file `id` instead of a `url`:

```python
with open("bank_statement.pdf", "rb") as f:
    uploaded = client.files.upload(file=f)

response = client.parse(file={"id": uploaded.id})
```

`output.chunks` holds the parsed document. Each chunk has a `content` string of clean markdown and a `blocks` array of typed, layout-aware elements.

```python
for chunk in response.output.chunks:
    print(chunk.content)  # clean markdown for the chunk
    for block in chunk.blocks:
        print(block.type, block.content)  # text, table, figure, ...
```

**Complete code:**

```python
from extend_ai import Extend

client = Extend()

response = client.parse(
    file={"url": "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf"}
)
print(response.status)

for chunk in response.output.chunks:
    print(chunk.content)
```
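
Before reading `output`, you may want to confirm the run finished. The sketch below is one way to do that. It assumes `status` compares equal to the string `"PROCESSED"` and that any other value means the output should not be read; this guide only shows `PROCESSED`, so treat both points as assumptions. `chunks_or_raise` is a hypothetical helper, and the stand-in object only mimics the attribute names used above.

```python
from types import SimpleNamespace


def chunks_or_raise(run):
    """Return run.output.chunks, or raise if the run is not PROCESSED.

    Hypothetical helper; assumes status compares equal to a plain string.
    """
    if run.status != "PROCESSED":
        raise RuntimeError(f"Parse run ended with status {run.status!r}")
    return run.output.chunks


# Stand-in with the same attribute names as the response used above,
# so the example runs without a network call.
fake_run = SimpleNamespace(
    status="PROCESSED",
    output=SimpleNamespace(chunks=[SimpleNamespace(content="# Page 1", blocks=[])]),
)

for chunk in chunks_or_raise(fake_run):
    print(chunk.content)
```

With a real response, you would call `chunks_or_raise(response)` in place of iterating `response.output.chunks` directly.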

**TypeScript:**

These examples use `await`, so run them inside an `async` function or a module with top-level await enabled.

Create a client. It automatically reads your API key from the `EXTEND_API_KEY` environment variable you set earlier.

```typescript
import { ExtendClient } from "extend-ai";

const client = new ExtendClient();
```

Pass the hosted file `url` to `parse`. This synchronous call sends the document through the pipeline and returns a fully populated `ParseRun`.

```typescript
const response = await client.parse({
  file: { url: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf" },
});

console.log(response.status); // PROCESSED
```

Want to parse your own document? [Upload it](/api-reference/endpoints/file/upload-file) first, then pass the returned file `id` instead of a `url`:

```typescript
import { createReadStream } from "fs";

const uploaded = await client.files.upload(createReadStream("bank_statement.pdf"), {});

const response = await client.parse({ file: { id: uploaded.id } });
```

`output.chunks` holds the parsed document. Each chunk has a `content` string of clean markdown and a `blocks` array of typed, layout-aware elements.

```typescript
for (const chunk of response.output.chunks) {
  console.log(chunk.content); // clean markdown for the chunk
  for (const block of chunk.blocks) {
    console.log(block.type, block.content); // text, table, figure, ...
  }
}
```

**Complete code:**

```typescript
import { ExtendClient } from "extend-ai";

const client = new ExtendClient();

const response = await client.parse({
  file: { url: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf" },
});
console.log(response.status);

response.output.chunks.forEach((chunk) => console.log(chunk.content));
```

**Java:**

Create a client. It automatically reads your API key from the `EXTEND_API_KEY` environment variable you set earlier.

```java
import ai.extend.ExtendClient;

ExtendClient client = ExtendClient.builder().build();
```

Pass the hosted file `url` to `parse` using the `FileFromUrl` variant of the file union.

```java
import ai.extend.requests.ParseRequest;
import ai.extend.types.FileFromUrl;
import ai.extend.types.ParseRequestFile;
import ai.extend.types.ParseRun;

ParseRun response = client.parse(ParseRequest.builder()
    .file(ParseRequestFile.of(FileFromUrl.builder()
        .url("https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf")
        .build()))
    .build());

System.out.println(response.getStatus()); // PROCESSED
```

Want to parse your own document? [Upload it](/api-reference/endpoints/file/upload-file) first, then pass the returned `id` via `FileFromId`:

```java
import ai.extend.requests.FilesUploadRequest;
import ai.extend.types.File;
import ai.extend.types.FileFromId;

File uploaded = client.files().upload(
    new java.io.File("bank_statement.pdf"),
    FilesUploadRequest.builder().build());

ParseRun response = client.parse(ParseRequest.builder()
    .file(ParseRequestFile.of(FileFromId.builder().id(uploaded.getId()).build()))
    .build());
```

`getOutput().get().getChunks()` holds the parsed document. Each chunk has a `content` string and a `blocks` list of typed, layout-aware elements.

```java
for (var chunk : response.getOutput().get().getChunks()) {
    System.out.println(chunk.getContent()); // clean markdown for the chunk
    for (var block : chunk.getBlocks()) {
        System.out.println(block.getType() + " " + block.getContent()); // text, table, figure, ...
    }
}
```

**Complete code:**

```java
import ai.extend.ExtendClient;
import ai.extend.requests.ParseRequest;
import ai.extend.types.FileFromUrl;
import ai.extend.types.ParseRequestFile;
import ai.extend.types.ParseRun;

ExtendClient client = ExtendClient.builder().build();

ParseRun response = client.parse(ParseRequest.builder()
    .file(ParseRequestFile.of(FileFromUrl.builder()
        .url("https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf")
        .build()))
    .build());

System.out.println(response.getStatus());
for (var chunk : response.getOutput().get().getChunks()) {
    System.out.println(chunk.getContent());
}
```

**Go:**

Create a client. It automatically reads your API key from the `EXTEND_API_KEY` environment variable you set earlier.

```go
package main

import (
	"context"
	"fmt"
	"log"

	extend "github.com/extend-hq/extend-go-sdk"
	client "github.com/extend-hq/extend-go-sdk/client"
)

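func main() {
	c := client.NewClient()
	// The Parse calls in the following steps go here, inside main().
	// Until they do, the Go compiler reports c and the extra imports as unused.
}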
```

Pass the hosted file `URL` to `Parse` using the `FileFromURL` variant of the file union.

```go
response, err := c.Parse(context.TODO(), &extend.ParseRequest{
	File: &extend.ParseRequestFile{
		FileFromURL: &extend.FileFromURL{
			URL: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
		},
	},
})
if err != nil {
	log.Fatal(err)
}

fmt.Println(response.Status) // PROCESSED
```

Want to parse your own document? Upload it first, then pass the returned `ID` via `FileFromID`:

```go
// Requires "os" in the import block.
f, err := os.Open("bank_statement.pdf")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

uploaded, err := c.Files.Upload(context.TODO(), f, &extend.FilesUploadRequest{})
if err != nil {
	log.Fatal(err)
}

response, err := c.Parse(context.TODO(), &extend.ParseRequest{
	File: &extend.ParseRequestFile{
		FileFromID: &extend.FileFromID{ID: uploaded.ID},
	},
})
if err != nil {
	log.Fatal(err)
}
```

`Output.Chunks` holds the parsed document. Each chunk has a `Content` string and a `Blocks` slice of typed, layout-aware elements.

```go
for _, chunk := range response.Output.Chunks {
	fmt.Println(chunk.Content) // clean markdown for the chunk
	for _, block := range chunk.Blocks {
		fmt.Println(block.Type, block.Content) // text, table, figure, ...
	}
}
```

**Complete code:**

```go
package main

import (
	"context"
	"fmt"
	"log"

	extend "github.com/extend-hq/extend-go-sdk"
	client "github.com/extend-hq/extend-go-sdk/client"
)

func main() {
	c := client.NewClient()

	response, err := c.Parse(context.TODO(), &extend.ParseRequest{
		File: &extend.ParseRequestFile{
			FileFromURL: &extend.FileFromURL{
				URL: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.Status)
	for _, chunk := range response.Output.Chunks {
		fmt.Println(chunk.Content)
	}
}
```

**cURL:**

Pass the hosted file `url` in the `file` object:

```bash
curl -X POST https://api.extend.ai/parse \
  -H "Authorization: Bearer $EXTEND_API_KEY" \
  -H "x-extend-api-version: 2026-02-09" \
  -H "Content-Type: application/json" \
  -d '{ "file": { "url": "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf" } }'
```

Want to parse your own document? Upload it first with a multipart request to get a file `id`, then pass that `id` instead of a `url`:

```bash
# Upload the file to get an id
curl -X POST https://api.extend.ai/files/upload \
  -H "Authorization: Bearer $EXTEND_API_KEY" \
  -H "x-extend-api-version: 2026-02-09" \
  -F "file=@bank_statement.pdf"

# Then parse using the returned id
curl -X POST https://api.extend.ai/parse \
  -H "Authorization: Bearer $EXTEND_API_KEY" \
  -H "x-extend-api-version: 2026-02-09" \
  -H "Content-Type: application/json" \
  -d '{ "file": { "id": "file_xK9mLPqRtN3vS8wF5hB2cQ" } }'
```
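
The same raw HTTP call can be made from any HTTP client. As an illustration, here is a sketch that builds the equivalent request with Python's standard library instead of the SDK. The endpoint, headers, and body are copied from the cURL example above; `build_parse_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_VERSION = "2026-02-09"  # copied from the cURL example above


def build_parse_request(api_key: str, file_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL call."""
    body = json.dumps({"file": {"url": file_url}}).encode("utf-8")
    return urllib.request.Request(
        "https://api.extend.ai/parse",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "x-extend-api-version": API_VERSION,
            "Content-Type": "application/json",
        },
    )


request = build_parse_request(
    "your_api_key_here",
    "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
)
print(request.get_method(), request.full_url)

# To actually send it (this makes a network call):
# with urllib.request.urlopen(request) as resp:
#     parse_run = json.load(resp)
```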

***

## Understanding the response

The parsed document content lives in `output.chunks`. Each chunk has a `content` string (clean markdown for that page or section) and a `blocks` array of structured elements with layout coordinates.

```json
{
  "object": "parse_run",
  "id": "pr_3f1j6I1gsw5k96xFiCnkM",
  "file": {
    "object": "file",
    "id": "file_GzKUy0VDhHscv7tweODYb",
    "name": "bank_statement.pdf"
  },
  "status": "PROCESSED",
  "output": {
    "chunks": [
      {
        "id": "chunk_qncr8Txe-wYvmFjipXgMD",
        "type": "page",
        "content": "CHASE JPMorgan Chase Bank, N.A. P O Box 659754...",
        "metadata": { "pageRange": { "start": 1, "end": 1 } },
        "blocks": [
          {
            "object": "block",
            "id": "block_WNoJ0WbMj4pRW9MpMpUox",
            "type": "text",
            "content": "CHASE JPMorgan Chase Bank, N.A. P O Box 659754 San Antonio, TX 78265 - 9754",
            "details": {},
            "metadata": { "page": { "number": 1, "width": 612, "height": 792 } },
            "polygon": [
              { "x": 56.873, "y": 35.374 },
              { "x": 162.173, "y": 35.215 },
              { "x": 162.245, "y": 81.158 },
              { "x": 56.938, "y": 81.317 }
            ],
            "boundingBox": { "left": 56.873, "top": 35.215, "right": 162.245, "bottom": 81.317 }
          }
        ]
      }
    ]
  },
  "metrics": { "pageCount": 7, "processingTimeMs": 8293 },
  "usage": { "credits": 14 }
}
```

**Key fields:**

| Field                     | What it contains                                                                |
| ------------------------- | ------------------------------------------------------------------------------- |
| `id`                      | Unique identifier for the parse run. Use it to fetch results later.             |
| `status`                  | Processing state for the run (e.g., `PROCESSED`).                               |
| `output.chunks`           | Parsed content units (page, section, or document, based on config).             |
| `output.chunks[].content` | Clean markdown for the chunk, ready for an LLM.                                 |
| `output.chunks[].blocks`  | Individual elements (text, tables, figures) with their types and positions.     |
| `blocks[].type`           | What kind of element this is: `text`, `table`, `figure`, `key_value`, and more. |
| `blocks[].boundingBox`    | Coordinates showing where the element appears on the page.                      |
| `usage.credits`           | Credits consumed by this run.                                                   |
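
To make `boundingBox` concrete, here is a small sketch that rescales a block's box to fractions of its page, which can be handy for drawing overlays at any zoom level. It assumes the box coordinates and `metadata.page.width`/`height` share the same units, which the sample response suggests but this guide does not state. `normalized_box` is a hypothetical helper that works on the JSON as a plain `dict`.

```python
from typing import Dict


def normalized_box(block: dict) -> Dict[str, float]:
    """Scale a block's boundingBox to fractions of its page size.

    Assumes the box and the page dimensions use the same units.
    """
    page = block["metadata"]["page"]
    box = block["boundingBox"]
    return {
        "left": box["left"] / page["width"],
        "top": box["top"] / page["height"],
        "right": box["right"] / page["width"],
        "bottom": box["bottom"] / page["height"],
    }


# Block trimmed from the sample response above.
sample_block = {
    "type": "text",
    "metadata": {"page": {"number": 1, "width": 612, "height": 792}},
    "boundingBox": {"left": 56.873, "top": 35.215, "right": 162.245, "bottom": 81.317},
}

for side, value in normalized_box(sample_block).items():
    print(f"{side}: {value:.3f}")
```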

Pass `content` straight into an LLM, or walk `blocks` when you need tables and spatial structure. For the complete shape, see [Response Format](/parsing/response-format).
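
Both access patterns can be sketched against the raw JSON. The helpers below are hypothetical (not SDK functions) and operate on the response as a plain `dict`, using only the field names shown in the sample. The `table` block in the stand-in data is a placeholder invented for illustration.

```python
from typing import Iterator


def document_markdown(parse_run: dict) -> str:
    """Join every chunk's markdown into one string, in response order."""
    return "\n\n".join(chunk["content"] for chunk in parse_run["output"]["chunks"])


def blocks_of_type(parse_run: dict, block_type: str) -> Iterator[dict]:
    """Yield blocks whose type matches, for example "table" or "text"."""
    for chunk in parse_run["output"]["chunks"]:
        for block in chunk["blocks"]:
            if block["type"] == block_type:
                yield block


# Trimmed stand-in for the response shown above.
parse_run = {
    "status": "PROCESSED",
    "output": {
        "chunks": [
            {
                "type": "page",
                "content": "CHASE JPMorgan Chase Bank, N.A. P O Box 659754...",
                "blocks": [
                    {
                        "type": "text",
                        "content": "CHASE JPMorgan Chase Bank, N.A. P O Box 659754 San Antonio, TX 78265 - 9754",
                    },
                    # Placeholder table block, invented for illustration.
                    {"type": "table", "content": "| Date | Description | Amount |"},
                ],
            }
        ]
    },
}

print(document_markdown(parse_run))
tables = list(blocks_of_type(parse_run, "table"))
print(len(tables), "table block(s)")
```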

***

## Configuring the parser

The default settings work well for most documents, but you can shape the output by passing a `config` object. The most common options are the chunking strategy and per-block options:

```python
response = client.parse(
    file={
        "url": "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
    },
    config={
        # Group content into logical sections instead of one chunk per page
        "chunkingStrategy": {
            "type": "section",
            "options": {"maxCharacters": 2000},
        },
        # Analyze and summarize charts, diagrams, and images
        "blockOptions": {"figures": {"enabled": True}},
    },
)
```

```typescript
const response = await client.parse({
  file: {
    url: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
  },
  config: {
    // Group content into logical sections instead of one chunk per page
    chunkingStrategy: { type: "section", options: { maxCharacters: 2000 } },
    // Analyze and summarize charts, diagrams, and images
    blockOptions: { figures: { enabled: true } },
  },
});
```

```java
ParseRun response = client.parse(ParseRequest.builder()
    .file(ParseRequestFile.of(FileFromUrl.builder()
        .url("https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf")
        .build()))
    .config(ParseConfig.builder()
        // Group content into logical sections instead of one chunk per page
        .chunkingStrategy(ParseConfigChunkingStrategy.builder()
            .type(ParseConfigChunkingStrategyType.SECTION)
            .options(ParseConfigChunkingStrategyOptions.builder()
                .maxCharacters(2000)
                .build())
            .build())
        // Analyze and summarize charts, diagrams, and images
        .blockOptions(ParseConfigBlockOptions.builder()
            .figures(ParseConfigBlockOptionsFigures.builder()
                .enabled(true)
                .build())
            .build())
        .build())
    .build());
```

```go
response, err := c.Parse(context.TODO(), &extend.ParseRequest{
	File: &extend.ParseRequestFile{
		FileFromURL: &extend.FileFromURL{
			URL: "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf",
		},
	},
	Config: &extend.ParseConfig{
		// Group content into logical sections instead of one chunk per page
		ChunkingStrategy: &extend.ParseConfigChunkingStrategy{
			Type: extend.ParseConfigChunkingStrategyTypeSection.Ptr(),
			Options: &extend.ParseConfigChunkingStrategyOptions{MaxCharacters: extend.Int(2000)},
		},
		// Analyze and summarize charts, diagrams, and images
		BlockOptions: &extend.ParseConfigBlockOptions{
			Figures: &extend.ParseConfigBlockOptionsFigures{Enabled: extend.Bool(true)},
		},
	},
})
```

```bash
curl -X POST https://api.extend.ai/parse \
  -H "Authorization: Bearer $EXTEND_API_KEY" \
  -H "x-extend-api-version: 2026-02-09" \
  -H "Content-Type: application/json" \
  -d '{
    "file": {
      "url": "https://extend-public-files.s3.us-east-2.amazonaws.com/bank_statement_example.pdf"
    },
    "config": {
      "chunkingStrategy": { "type": "section", "options": { "maxCharacters": 2000 } },
      "blockOptions": { "figures": { "enabled": true } }
    }
  }'
```

**What these options do:**

* **`chunkingStrategy.type`**: `"page"` (default), `"section"` (logical, heading-aware chunks for RAG), or `"document"` (one chunk for the whole file).
* **`blockOptions`**: Fine-grained control over how figures, tables, and other block types are detected and formatted.

For the full list of options, see [Configuration Options](/parsing/configuration). You can also configure the parser visually and export the config from [Extend Studio](https://dashboard.extend.ai/studio).
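
If you build configs in code, a thin wrapper can catch typos in the chunking type before a request is sent. The following is a sketch under stated assumptions: `build_parse_config` is a hypothetical helper, it covers only the options shown on this page, and the set of valid types is taken from the list above rather than from the full configuration reference.

```python
from typing import Optional

# Chunking types listed in this guide.
CHUNKING_TYPES = {"page", "section", "document"}


def build_parse_config(
    chunking_type: str = "page",
    max_characters: Optional[int] = None,
    figures_enabled: Optional[bool] = None,
) -> dict:
    """Build a config dict in the shape used by the examples above."""
    if chunking_type not in CHUNKING_TYPES:
        raise ValueError(
            f"Unknown chunking type {chunking_type!r}; expected one of {sorted(CHUNKING_TYPES)}"
        )
    strategy: dict = {"type": chunking_type}
    if max_characters is not None:
        strategy["options"] = {"maxCharacters": max_characters}
    config: dict = {"chunkingStrategy": strategy}
    if figures_enabled is not None:
        config["blockOptions"] = {"figures": {"enabled": figures_enabled}}
    return config


config = build_parse_config("section", max_characters=2000, figures_enabled=True)
print(config)
```

The resulting `config` has the same shape as the one passed to `client.parse(..., config=...)` in the Python example above.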

***

## Next steps

You've parsed a document with the API. Here's where to go next:

* Request and response schemas for every endpoint.
* Define a schema and pull exact fields into JSON with the [Extract endpoint](/extraction/overview).
* Sort documents into categories you describe in plain language.
* Chain parse, extract, and classify into an end-to-end pipeline.