Sample cURL to submit a file directly:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/your/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
Sample cURL to submit a file URL:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file_url=https://example.com/path/to/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
Sample cURL to poll job status:
curl -X GET "https://api.talonic.ai/data-extractor/process/YOUR_JOB_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"
Submit a file or a file URL along with a JSON schema for processing.
No parameters
Request body
One of the following two objects.

#0 object (file upload)

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |

#1 object (file URL)

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See ProcessRequestFile for supported file formats) |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |
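As an illustration of the request body above, the following Python sketch assembles the text form fields for the file URL variant. The helper name `build_process_fields` is hypothetical and not part of the API; it only mirrors the documented constraints (stringified schema, string-valued `fast_extraction`, the three `validation` values, and the 1000-character limit on `description`). Sending the fields as a multipart form is left to whichever HTTP client you use.

```python
import json

ALLOWED_VALIDATION = {"lax", "strict", "none"}
MAX_DESCRIPTION_CHARS = 1000


def build_process_fields(json_schema, file_url=None, fast_extraction=False,
                         validation="lax", description=None):
    """Return text form fields for a process request (hypothetical helper)."""
    if validation not in ALLOWED_VALIDATION:
        raise ValueError(f"validation must be one of {sorted(ALLOWED_VALIDATION)}")
    if description is not None and len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError("description must be at most 1000 characters")
    fields = {
        # The API expects the schema as a string, so serialize the dict.
        "json_schema": json.dumps(json_schema),
        # fast_extraction is documented as a string enum, not a JSON boolean.
        "fast_extraction": "true" if fast_extraction else "false",
        "validation": validation,
    }
    if file_url is not None:
        fields["file_url"] = file_url
    if description is not None:
        fields["description"] = description
    return fields


fields = build_process_fields(
    {"type": "object"},
    file_url="https://example.com/path/to/file.pdf",
)
```

For a direct upload, the same fields would accompany a binary `file` part instead of `file_url`.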
Response
Code | Description | Links |
---|---|---|
202 | Processing request accepted and queued. Returns a ProcessResponse object. | No links |
400 | Bad Request. Invalid input parameters. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. Returns an ErrorResponse object. | No links |
415 | Unsupported Media Type. The server does not support the provided media type. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-09T06:41:31.848Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

ProcessResponse object

Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Unique correlation ID for the request. |
job_id | string (uuid) | Unique job ID for polling status. |
status | string | Initial status of the request. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
message | string | Informational message about the request. |
filename | string | Original name of the submitted or linked file, including extension. |

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
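The response codes above can be handled with a small dispatcher. This is a minimal sketch, assuming the HTTP client has already parsed the JSON body into a dict; `ApiError` and `job_id_from_submit_response` are hypothetical names, and the 60-second retry hint simply encodes the "wait a minute" advice given for 429.

```python
class ApiError(Exception):
    """Raised for any non-202 submit response (hypothetical helper type)."""

    def __init__(self, status_code, detail):
        super().__init__(f"HTTP {status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail
        # The docs say to wait a minute after a 429; other codes get no hint.
        self.retry_after_seconds = 60 if status_code == 429 else None


def job_id_from_submit_response(status_code, body):
    """Return the job_id from a 202 response, or raise ApiError otherwise."""
    if status_code == 202:
        return body["job_id"]
    # Error bodies follow the ErrorResponse shape: {"detail": "..."}.
    raise ApiError(status_code, body.get("detail", ""))


job_id = job_id_from_submit_response(
    202,
    {"job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "queued"},
)
```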
Retrieve the status and result of a processing job using its ID.
Name | Description |
---|---|
job_id* string($uuid) (path) | Unique identifier of the processing job. |
include-schema string (query) | Include JSON schema in response body. Available values: true, false |
include-markdown string (query) | Include markdown extracted from source data in response body. Available values: true, false |
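To show how the path and query parameters combine, here is a short Python sketch that builds the status URL. The base URL is taken from the cURL samples; the function name `build_status_url` is an assumption for illustration, and the sketch only adds a query parameter when it is set to true.

```python
from urllib.parse import quote, urlencode

# Base URL as it appears in the cURL samples.
BASE_URL = "https://api.talonic.ai/data-extractor/process"


def build_status_url(job_id, include_schema=False, include_markdown=False):
    """Build the polling URL for a job (hypothetical helper)."""
    query = {}
    if include_schema:
        query["include-schema"] = "true"
    if include_markdown:
        query["include-markdown"] = "true"
    # Percent-encode the job ID so it is safe as a single path segment.
    url = f"{BASE_URL}/{quote(job_id, safe='')}"
    return f"{url}?{urlencode(query)}" if query else url


url = build_status_url("3fa85f64-5717-4562-b3fc-2c963f66afa6", include_markdown=True)
```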
Response
Code | Description | Links |
---|---|---|
200 | Processing status retrieved successfully. Returns a ProcessStatusResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
404 | Not Found. No job found with the provided ID. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 200 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-10T10:08:56.445Z",
  "estimated_time_seconds": 0,
  "finish_time": "2025-09-10T10:08:56.445Z",
  "message": "string",
  "filename": "string",
  "result": {},
  "json_schema": {},
  "markdown": "string",
  "validation_result": {
    "concerns": [
      {
        "path": "string",
        "text": "string",
        "level": "error",
        "code": "missing_value"
      }
    ],
    "summary": "string"
  }
}

ProcessStatusResponse object

Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Correlation ID of the request. |
job_id | string (uuid) | Job ID for polling. |
status | string | Current status of the processing job. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
finish_time | string (date-time) or null | ISO 8601 timestamp when processing finished, or null if not finished. |
message | string | Status message, if any. |
filename | string | Original name of the submitted or linked file, including extension. |
result | object or null | Processing result following the provided JSON schema, or null if not finished. |
json_schema | object | JSON schema used to create the result JSON. Only present if status is finished and include-schema is true. |
markdown | string | Markdown representation of the source data. Only present if status is finished and include-markdown is true. |
validation_result | object | Validation result of the extracted JSON. Only present if status is finished and validate is true. Contains a concerns array and a summary (see ProcessValidationResult). |

Example error response body (401, 404, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
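Because jobs are processed asynchronously, a client typically polls this endpoint until the status leaves "queued" or "processing". The sketch below shows one way to do that. `fetch_status` is a hypothetical callable that performs the GET request and returns the parsed body; the poll interval and timeout defaults are arbitrary assumptions, not values recommended by the API.

```python
import time

# Statuses from the documented enum after which no further change is expected.
TERMINAL_STATUSES = {"failed", "success", "cancelled"}


def wait_for_job(fetch_status, poll_interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, then return the body."""
    deadline = clock() + timeout
    while True:
        body = fetch_status()
        if body["status"] in TERMINAL_STATUSES:
            return body
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_interval)


# Usage with canned responses standing in for real HTTP calls.
canned = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "success", "result": {}},
])
final = wait_for_job(lambda: next(canned), poll_interval=0.0)
```

Injecting `sleep` and `clock` keeps the loop easy to exercise without real delays.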
Cancel a running job by job_id.
Name | Description |
---|---|
job_id* string($uuid) (path) | Unique identifier of the processing job to cancel. |
Response
Code | Description | Links |
---|---|---|
202 | Cancellation request accepted. | No links |
400 | Bad Request. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
404 | Not Found. No job found with the provided ID. Returns an ErrorResponse object. | No links |
409 | Conflict. Job is already finished or cancelled. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example error response body (400, 401, 404, 409, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
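A client usually wants to treat 409 differently from other failures, since it means the job had already finished or been cancelled. The following sketch maps the documented status codes to outcomes; the outcome labels and the function name are illustrative assumptions only.

```python
# Outcome labels are hypothetical; the codes come from the table above.
CANCEL_OUTCOMES = {
    202: "cancellation_accepted",
    404: "job_not_found",
    409: "already_finished_or_cancelled",
    429: "retry_later",
}


def interpret_cancel_response(status_code):
    """Translate a cancel response code into a coarse outcome label."""
    return CANCEL_OUTCOMES.get(status_code, "error")


outcome = interpret_cancel_response(409)
```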
Submit a file or a file URL to extract markdown only. No conversion or validation is performed.
No parameters
Request body
One of the following two objects.

#0 object (file upload)

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |

#1 object (file URL)

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats) |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |
Response
Code | Description | Links |
---|---|---|
202 | Extraction request accepted and queued. Returns a ProcessResponse object. | No links |
400 | Bad Request. Invalid input parameters. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. Returns an ErrorResponse object. | No links |
415 | Unsupported Media Type. The server does not support the provided media type. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-11T03:48:18.939Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

ProcessResponse object

Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Unique correlation ID for the request. |
job_id | string (uuid) | Unique job ID for polling status. |
status | string | Initial status of the request. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
message | string | Informational message about the request. |
filename | string | Original name of the submitted or linked file, including extension. |

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
Submit a file or a file URL to generate a recommended JSON schema only. No conversion or validation is performed.
No parameters
Request body
One of the following two objects.

#0 object (file upload)

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |

#1 object (file URL)

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats) |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |
Response
Code | Description | Links |
---|---|---|
202 | Recommendation request accepted and queued. Returns a ProcessResponse object. | No links |
400 | Bad Request. Invalid input parameters. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. Returns an ErrorResponse object. | No links |
415 | Unsupported Media Type. The server does not support the provided media type. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-13T03:41:56.257Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

ProcessResponse object

Field | Type |
---|---|
correlation_id | string (uuid) |
job_id | string (uuid) |
status | string |
start_time | string (date-time) |
estimated_time_seconds | integer |
message | string |
filename | string |

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
Check the availability and API version of the service.
Parameters
No parameters
Response
Code | Description | Links |
---|---|---|
200 | Server is able to respond to the request. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 200 response body:
{
  "status": "OK",
  "version": "1.1.5"
}

Field | Type | Description |
---|---|---|
status | string | Current health of the server. Enum: "OK", "Unstable". |
version | string | Current API version. Example: "1.1.5". |

Example error response body (429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
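A client might gate uploads on this health check. The sketch below reads the parsed 200 body; both function names are hypothetical, and splitting the version into integers assumes the dotted numeric form seen in the "1.1.5" example, which the source does not guarantee.

```python
def is_service_healthy(body):
    """True only when the reported status is exactly "OK"."""
    return body.get("status") == "OK"


def version_tuple(body):
    """Split a dotted version such as "1.1.5" into integers for comparison.

    Assumes every component is numeric; other formats raise ValueError.
    """
    return tuple(int(part) for part in body["version"].split("."))


health = {"status": "OK", "version": "1.1.5"}
ready = is_service_healthy(health) and version_tuple(health) >= (1, 1, 0)
```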
Provide feedback for a finished job in order to help us improve the quality of the results.
Name | Description |
---|---|
job_id* string($uuid) (path) | Unique identifier of the processing job. |
Request body
ProcessFeedback object

Field | Type | Description |
---|---|---|
rating | integer [1, 5] | Overall rating of the result on a 5-star-like scale, 1 being the lowest. |
data_complete | boolean | Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points. |
schema_correct | boolean | Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points. |
data_correct | boolean | Indicates whether or not the data returned by the API seems or is correct, meaning that it contains no wrong information. |
share_data | boolean | Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy. Default: false. |
additional_feedback | string, ≤ 1000 characters | Text containing any additional feedback we should know. |
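The constraints on the feedback body can be checked client-side before sending. This is a sketch only: `build_feedback` is a hypothetical helper, and treating `rating` as the sole mandatory field is an assumption, since the source does not state which fields are required.

```python
MAX_FEEDBACK_CHARS = 1000


def build_feedback(rating, data_complete=None, schema_correct=None,
                   data_correct=None, share_data=False,
                   additional_feedback=None):
    """Validate and assemble a ProcessFeedback payload (hypothetical helper)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rating, bool) or not isinstance(rating, int):
        raise ValueError("rating must be an integer")
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    if additional_feedback is not None and len(additional_feedback) > MAX_FEEDBACK_CHARS:
        raise ValueError("additional_feedback must be at most 1000 characters")
    payload = {"rating": rating, "share_data": share_data}
    optional = {
        "data_complete": data_complete,
        "schema_correct": schema_correct,
        "data_correct": data_correct,
        "additional_feedback": additional_feedback,
    }
    # Leave out anything the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


feedback = build_feedback(4, data_complete=True, data_correct=True)
```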
Response
Code | Description | Links |
---|---|---|
204 | Feedback accepted. | No links |
400 | Bad Request. Invalid input parameters. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
422 | Unprocessable Content. Processing is likely not yet finished. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example error response body (400, 401, 422, 429, 500):
{
  "detail": "string"
}

ErrorResponse object

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
ExtractRequestFile

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |
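Since unsupported uploads are rejected with 415, it can help to check the extension locally first. The sketch below maps the documented extensions to their media types. The helper name `media_type_for` is an assumption, and whether the server accepts other spellings (for example `.jpeg`) is not stated in the source, so only the listed extensions are included.

```python
import os

# Extension to media type, taken from the supported formats list.
SUPPORTED_MEDIA_TYPES = {
    ".pdf": "application/pdf",
    ".csv": "text/csv",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ods": "application/vnd.oasis.opendocument.spreadsheet",
    ".odt": "application/vnd.oasis.opendocument.text",
    ".numbers": "application/vnd.apple.numbers",
    ".pages": "application/vnd.apple.pages",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".txt": "text/plain",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
    ".oga": "audio/ogg",
}


def media_type_for(path):
    """Return the media type for a supported file, or raise ValueError."""
    extension = os.path.splitext(path)[1].lower()
    try:
        return SUPPORTED_MEDIA_TYPES[extension]
    except KeyError:
        raise ValueError(f"unsupported file extension: {extension!r}") from None


pdf_type = media_type_for("/path/to/your/file.pdf")
```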
ExtractRequestFileURL

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats) |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

RecommendRequestFile

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |

RecommendRequestFileURL

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See RecommendRequestFile for supported file formats) |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

ProcessRequestFile

Field | Type | Description |
---|---|---|
file | string (binary) | Any of the supported formats listed below. |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

Supported formats for file:

Media type | File format |
---|---|
application/pdf | .pdf file (Adobe Acrobat) |
text/csv | .csv file (Comma-Separated Values) |
application/msword | .doc file (Microsoft Word) |
application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
application/vnd.ms-excel | .xls file (Microsoft Excel) |
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
application/vnd.apple.numbers | .numbers file (Apple Numbers) |
application/vnd.apple.pages | .pages file (Apple Pages) |
image/jpeg | .jpg file (JPEG Image) |
image/png | .png file (PNG Image) |
text/plain | .txt file (Plaintext) |
audio/mpeg | .mp3 file (MP3 Audio) |
audio/wav | .wav file (Waveform Audio) |
audio/ogg | .ogg/.oga file (Ogg Audio) |

ProcessRequestFileURL

Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL to the file to be processed. (See ProcessRequestFile for supported file formats) |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string, ≤ 1000 characters | Optional description of or context for the provided file. |

ProcessResponse

Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Unique correlation ID for the request. |
job_id | string (uuid) | Unique job ID for polling status. |
status | string | Initial status of the request. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
message | string | Informational message about the request. |
filename | string | Original name of the submitted or linked file, including extension. |
ProcessValidationResult

Field | Type | Description |
---|---|---|
concerns | array of objects | List of concerns with their JSON paths. |
concerns[].path | string | JSON path of the field related to the concern. |
concerns[].text | string | Human-readable description of the concern. |
concerns[].level | string | Severity level of the concern. Enum: "error", "warning", "info". |
concerns[].code | string | Code of the concern. Enum: "missing_value", "null_value", "additional_value", "format_inconsistent", "numeric_mismatch", "floating_precision_diff", "semantic_conflict", "array_length_mismatch", "type_mismatch", "out_of_range", "duplicate_value", "incomplete_object", "extra_fields", "order_difference", "array_reordered", "array_deduplicated", "numeric_precision_normalized", "date_format_normalized", "optional_field_merged", "llm_value_selected". |
summary | string | Executive summary of the validation results. |
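Under the 'lax' policy the job succeeds even when concerns are collected, so a client may want to inspect them itself. The following sketch summarizes a `validation_result` object by severity; the function names are hypothetical and the sample concerns are made-up placeholders that only follow the documented shape.

```python
from collections import Counter


def count_concerns_by_level(validation_result):
    """Return a plain dict mapping severity level to number of concerns."""
    concerns = validation_result.get("concerns", [])
    return dict(Counter(concern["level"] for concern in concerns))


def error_paths(validation_result):
    """List the JSON paths of all concerns with level "error"."""
    return [
        concern["path"]
        for concern in validation_result.get("concerns", [])
        if concern["level"] == "error"
    ]


# Placeholder data in the documented shape, not real API output.
sample = {
    "concerns": [
        {"path": "$.total", "text": "Value missing", "level": "error", "code": "missing_value"},
        {"path": "$.date", "text": "Date normalized", "level": "info", "code": "date_format_normalized"},
    ],
    "summary": "string",
}
counts = count_concerns_by_level(sample)
```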
ProcessStatusResponse

Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Correlation ID of the request. |
job_id | string (uuid) | Job ID for polling. |
status | string | Current status of the processing job. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
finish_time | string (date-time) or null | ISO 8601 timestamp when processing finished, or null if not finished. |
message | string | Status message, if any. |
filename | string | Original name of the submitted or linked file, including extension. |
result | object or null | Processing result following the provided JSON schema, or null if not finished. |
json_schema | object | JSON schema used to create the result JSON. Only present if status is finished and include-schema is true. |
markdown | string | Markdown representation of the source data. Only present if status is finished and include-markdown is true. |
validation_result | object | Validation result of the extracted JSON. Only present if status is finished and validate is true. |
validation_result.concerns | array of objects | List of concerns with their JSON paths. |
validation_result.concerns[].path | string | JSON path of the field related to the concern. |
validation_result.concerns[].text | string | Human-readable description of the concern. |
validation_result.concerns[].level | string | Severity level of the concern. Enum: "error", "warning", "info". |
validation_result.concerns[].code | string | Code of the concern. Enum: "missing_value", "null_value", "additional_value", "format_inconsistent", "numeric_mismatch", "floating_precision_diff", "semantic_conflict", "array_length_mismatch", "type_mismatch", "out_of_range", "duplicate_value", "incomplete_object", "extra_fields", "order_difference", "array_reordered", "array_deduplicated", "numeric_precision_normalized", "date_format_normalized", "optional_field_merged", "llm_value_selected". |
validation_result.summary | string | Executive summary of the validation results. |

ProcessFeedback

Field | Type | Description |
---|---|---|
rating | integer [1, 5] | Overall rating of the result on a 5-star-like scale, 1 being the lowest. |
data_complete | boolean | Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points. |
schema_correct | boolean | Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points. |
data_correct | boolean | Indicates whether or not the data returned by the API seems or is correct, meaning that it contains no wrong information. |
share_data | boolean | Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy. Default: false. |
additional_feedback | string, ≤ 1000 characters | Text containing any additional feedback we should know. |

ErrorResponse

Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |