The Block Object

Detailed information about the Block object structure and types returned by the /parse API endpoint.

Overview

A Block represents a distinct content element within a document, such as a paragraph of text, a heading, a table, or a figure. Blocks are the fundamental units that make up chunks in parsed documents.

Block Object Structure

object
string

The type of object. Always “block”.

id
string

A unique identifier for the block, deterministically generated as a hash of the block content.

type
string

The type of block. Possible values include:

  • text: Regular text content
  • heading: Section or document headings
  • section_heading: Subsection headings
  • table: Tabular data with rows and columns
  • figure: Images, charts, or diagrams
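As a rough illustration of working with the type field, the following sketch groups parsed blocks by type. It assumes blocks have already been decoded from JSON into Python dictionaries; the helper name group_blocks_by_type and the "unknown" bucket are hypothetical and not part of the API. Because the list above is described as "possible values include", the sketch tolerates values it does not recognize.

```python
from collections import defaultdict

# Block types listed in this reference. Other values may exist,
# so anything unrecognized is collected under "unknown" (an assumption).
KNOWN_BLOCK_TYPES = {"text", "heading", "section_heading", "table", "figure"}


def group_blocks_by_type(blocks):
    """Group block dicts by their "type" field."""
    groups = defaultdict(list)
    for block in blocks:
        block_type = block.get("type")
        key = block_type if block_type in KNOWN_BLOCK_TYPES else "unknown"
        groups[key].append(block)
    return dict(groups)
```

A caller could then pull out, for example, groups.get("table", []) to process only tables.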
content
string

The textual content of the block, formatted according to the target format specified in the parse request.

details
object

Additional details specific to the block type. The structure varies depending on the block type.

oneOf
object
type
string, defaults to figure_details

Indicates this is a figure details object.

imageUrl
string

URL of the clipped (segmented) figure image. Set only when the figureImageClippingEnabled option is true, which is the default.

figureType
string

The refined type of the figure. Set only when figure classification and summarization is enabled. Possible values:

  • image: A photographic image
  • chart: A data chart or graph
  • diagram: A schematic or diagram
  • logo: A company or brand logo
  • other: Any other type of figure

type
string, defaults to table_details

Indicates this is a table details object.

rowCount
number

The number of rows in the table.

columnCount
number

The number of columns in the table.
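Since details is one of several shapes, a consumer typically branches on its type field. The sketch below shows one way to do that, assuming details has been decoded into a Python dictionary. The function name describe_details and the output strings are illustrative only; the field names (type, figureType, imageUrl, rowCount, columnCount) come from this reference.

```python
def describe_details(details):
    """Return a short human-readable summary of a block's details object."""
    kind = details.get("type")
    if kind == "figure_details":
        # figureType and imageUrl are optional, so fall back gracefully.
        figure_type = details.get("figureType", "unclassified")
        image_url = details.get("imageUrl")
        suffix = f" ({image_url})" if image_url else ""
        return f"figure [{figure_type}]{suffix}"
    if kind == "table_details":
        return f"table {details.get('rowCount')}x{details.get('columnCount')}"
    # Blocks such as plain text may carry an empty details object.
    return "no type-specific details"
```

For instance, describe_details({"type": "table_details", "rowCount": 3, "columnCount": 4}) returns "table 3x4".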

metadata
object

Metadata about the block.

pageNumber
number

The page number where the block appears in the document.

polygon
array

An array of points defining the polygon that bounds the block on the page. Each point is an object with x and y coordinates.

boundingBox
object

A simplified rectangular bounding box for the block, derived from the polygon.

top
number

The y-coordinate of the top edge of the bounding box.

left
number

The x-coordinate of the left edge of the bounding box.

width
number

The width of the bounding box.

height
number

The height of the bounding box.
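The relationship between polygon and boundingBox can be sketched as follows. This is not the service's actual implementation; it assumes the bounding box is the axis-aligned rectangle enclosing all polygon points and that y increases downward, so the top edge is the smallest y value. The function name bounding_box_from_polygon is hypothetical.

```python
def bounding_box_from_polygon(polygon):
    """Compute an axis-aligned bounding box from a list of {"x", "y"} points."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [point["x"] for point in polygon]
    ys = [point["y"] for point in polygon]
    left, top = min(xs), min(ys)  # assumes the origin is at the top-left of the page
    return {
        "top": top,
        "left": left,
        "width": max(xs) - left,
        "height": max(ys) - top,
    }


polygon = [
    {"x": 100, "y": 50},
    {"x": 500, "y": 50},
    {"x": 500, "y": 80},
    {"x": 100, "y": 80},
]
print(bounding_box_from_polygon(polygon))
# {'top': 50, 'left': 100, 'width': 400, 'height': 30}
```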

Block Type Examples

{
  "object": "block",
  "id": "block_1a2b3c4d5e",
  "type": "text",
  "content": "This is a paragraph of text content that appears in the document.",
  "details": {},
  "metadata": {
    "pageNumber": 1
  },
  "polygon": [
    {"x": 100, "y": 50},
    {"x": 500, "y": 50},
    {"x": 500, "y": 80},
    {"x": 100, "y": 80}
  ],
  "boundingBox": {
    "top": 50,
    "left": 100,
    "width": 400,
    "height": 30
  }
}
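Blocks shaped like the example above can be put into an approximate reading order using pageNumber and the bounding box. The sketch below is one possible approach, not something the API provides: the function name reading_order is hypothetical, and the simple top-then-left sort is an assumption that will not handle multi-column layouts. The example shows boundingBox at the top level of the block, so the helper looks there first and falls back to metadata.boundingBox in case a response nests it there.

```python
def _bounding_box(block):
    """Find the bounding box at the top level or, as a fallback, under metadata."""
    return (
        block.get("boundingBox")
        or block.get("metadata", {}).get("boundingBox")
        or {}
    )


def reading_order(blocks):
    """Sort blocks by page number, then top edge, then left edge."""
    def sort_key(block):
        box = _bounding_box(block)
        page = block.get("metadata", {}).get("pageNumber", 0)
        return (page, box.get("top", 0), box.get("left", 0))

    return sorted(blocks, key=sort_key)
```

Missing fields default to 0 so that incomplete blocks sort first rather than raising an error.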