Response Format

The Parse API returns document content in a structured format that provides both high-level formatted content and detailed block-level information. Start with the response structure, then decide whether you need formatted content or block-level detail.

Response structure

The parse run response contains the parsed content in parseRun.output.chunks. Each chunk contains two key properties:

- content: A fully formatted representation of the entire chunk in the target format (e.g., markdown). This is ready to use as-is if you need the complete formatted content of a page.
- blocks: An array of individual content blocks that make up the chunk, each with its own formatting, position information, and metadata.
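
To make the shape concrete, here is a minimal sketch of a response with a single chunk, plus a small helper that summarizes it. Only the property names used elsewhere on this page (output.chunks, content, blocks, type, metadata.page.number, boundingBox) come from the documentation. The sample values, the layout of boundingBox, and the helper name summarizeChunks are illustrative assumptions, not a guaranteed schema.

```javascript
// Hypothetical sample response: the values and the boundingBox layout are assumptions.
const sampleParseRun = {
  output: {
    chunks: [
      {
        content: '# Quarterly Report\n\nRevenue grew in Q3.',
        blocks: [
          {
            type: 'heading',
            content: 'Quarterly Report',
            metadata: { page: { number: 1 } },
            boundingBox: { left: 0.1, top: 0.05, width: 0.8, height: 0.04 }
          },
          {
            type: 'text',
            content: 'Revenue grew in Q3.',
            metadata: { page: { number: 1 } },
            boundingBox: { left: 0.1, top: 0.12, width: 0.8, height: 0.03 }
          }
        ]
      }
    ]
  }
};

// Summarize each chunk: length of its formatted content and a count of blocks per type.
function summarizeChunks(parseRun) {
  return parseRun.output.chunks.map(chunk => {
    const blockCounts = {};
    for (const block of chunk.blocks) {
      blockCounts[block.type] = (blockCounts[block.type] || 0) + 1;
    }
    return { contentLength: chunk.content.length, blockCounts };
  });
}

console.log(summarizeChunks(sampleParseRun));
```

A summary like this is a quick way to check what block types a document produced before deciding whether content or blocks suits your use case.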

Choose content vs. blocks

Use chunk.content when:
- You need the complete, properly formatted content of a page, with blocks already placed in logical order (e.g., markdown sections grouped and elements arranged according to their position on the page)
- You want to display or process the document content as a whole (and can just combine all chunk.content values)
- You’re integrating with systems that expect formatted text (e.g., markdown processors)

Use chunk.blocks when:
- You need to work with specific elements of the document (e.g., only tables or figures)
- You need spatial information about where content appears on the page, perhaps to build citation systems
- You’re building a UI that shows or highlights specific document elements
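
For the whole-document case, combining the chunk.content values can be sketched as below. The helper name joinChunkContent and the blank-line separator are assumptions made for illustration; choose the separator your target format expects.

```javascript
// Join every chunk's formatted content into a single document string.
// The default blank-line separator is an assumption, not something the API prescribes.
function joinChunkContent(parseRun, separator = '\n\n') {
  return parseRun.output.chunks
    .map(chunk => chunk.content)
    // Skip chunks with missing or empty content so no stray separators appear.
    .filter(content => typeof content === 'string' && content.length > 0)
    .join(separator);
}

// Hypothetical two-chunk response used only to demonstrate the call.
const exampleRun = {
  output: {
    chunks: [
      { content: '# Page one', blocks: [] },
      { content: 'Page two text', blocks: [] }
    ]
  }
};

console.log(joinChunkContent(exampleRun));
```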

Examples

Extract specific content types

// Extract all tables from a document
function extractTables(parseRun) {
  const tables = [];

  parseRun.output.chunks.forEach(chunk => {
    chunk.blocks.forEach(block => {
      if (block.type === 'table') {
        tables.push({
          content: block.content,
          pageNumber: block.metadata.page.number,
          position: block.boundingBox
        });
      }
    });
  });

  return tables;
}

// Extract all figures with their images
function extractFigures(parseRun) {
  const figures = [];

  parseRun.output.chunks.forEach(chunk => {
    chunk.blocks.forEach(block => {
      if (block.type === 'figure' && block.details.imageUrl) {
        figures.push({
          caption: block.content,
          imageUrl: block.details.imageUrl,
          figureType: block.details.figureType,
          pageNumber: block.metadata.page.number
        });
      }
    });
  });

  return figures;
}

Reconstruct content with custom formatting

// Extract headings and their content to create a table of contents
function createTableOfContents(parseRun) {
  const toc = [];

  parseRun.output.chunks.forEach(chunk => {
    chunk.blocks.forEach(block => {
      if (block.type === 'heading' || block.type === 'section_heading') {
        toc.push({
          title: block.content,
          pageNumber: block.metadata.page.number
        });
      }
    });
  });

  return toc;
}
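
The examples above all walk chunks and blocks the same way and differ only in which block types they keep. One possible way to factor that out is a small filter helper. The name collectBlocks and the sample data are hypothetical; the block types are the ones already used on this page.

```javascript
// Collect every block whose type is in `types`, across all chunks, in document order.
function collectBlocks(parseRun, types) {
  const wanted = new Set(types);
  return parseRun.output.chunks.flatMap(chunk =>
    // Tolerate chunks without a blocks array (an assumption about possible responses).
    (chunk.blocks || []).filter(block => wanted.has(block.type))
  );
}

// Hypothetical response used only to demonstrate the helper.
const sampleRun = {
  output: {
    chunks: [
      {
        blocks: [
          { type: 'heading', content: 'Intro', metadata: { page: { number: 1 } } },
          { type: 'table', content: '| a | b |', metadata: { page: { number: 1 } } }
        ]
      },
      {
        blocks: [
          { type: 'section_heading', content: 'Methods', metadata: { page: { number: 2 } } }
        ]
      }
    ]
  }
};

// Rebuild the table of contents on top of the shared helper.
const tocEntries = collectBlocks(sampleRun, ['heading', 'section_heading']).map(block => ({
  title: block.content,
  pageNumber: block.metadata.page.number
}));

console.log(tocEntries);
```

Keeping the traversal in one place means each extractor only has to describe the block types it wants and the fields it maps them to.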

Spatial information

Each block contains spatial information in the form of a polygon (precise outline) and a simplified boundingBox. Use this when you need position-aware output:

- Highlight specific content in a document viewer
- Create visual overlays on top of the original document
- Understand the reading order and layout of the document

// Create highlight coordinates for a document viewer
function createHighlights(parseRun, searchTerm) {
  const highlights = [];

  parseRun.output.chunks.forEach(chunk => {
    chunk.blocks.forEach(block => {
      if (block.type === 'text' && block.content.includes(searchTerm)) {
        highlights.push({
          pageNumber: block.metadata.page.number,
          boundingBox: block.boundingBox
        });
      }
    });
  });

  return highlights;
}
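
To illustrate how a polygon relates to a simplified bounding box, the sketch below computes the smallest axis-aligned rectangle that encloses a polygon. This page does not specify how either value is encoded, so the formats here are assumptions: the polygon is taken to be an array of { x, y } points, and the result uses hypothetical left, top, width, and height fields. Check the actual response before relying on these shapes.

```javascript
// Compute the smallest axis-aligned box that encloses a polygon.
// Assumes `polygon` is a non-empty array of { x, y } points (hypothetical shape).
function polygonToBoundingBox(polygon) {
  if (!Array.isArray(polygon) || polygon.length === 0) {
    throw new Error('polygon must be a non-empty array of points');
  }
  const xs = polygon.map(point => point.x);
  const ys = polygon.map(point => point.y);
  const left = Math.min(...xs);
  const top = Math.min(...ys);
  return {
    left,
    top,
    width: Math.max(...xs) - left,
    height: Math.max(...ys) - top
  };
}

// A slightly skewed quadrilateral, e.g. a text line on a scanned page.
const skewedQuad = [
  { x: 10, y: 20 },
  { x: 110, y: 25 },
  { x: 108, y: 65 },
  { x: 8, y: 60 }
];

console.log(polygonToBoundingBox(skewedQuad));
// { left: 8, top: 20, width: 102, height: 45 }
```

The box is simpler to render as a rectangular highlight, while the polygon follows rotated or skewed content more closely.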

Using both together gives you the convenience of formatted text from chunk.content and the precision of block-level access from chunk.blocks within the same document processing workflow.