
Platform Guides

This API allows you to submit a file (or file URL) along with a JSON schema that describes the structure of the data you want to extract. Once submitted, the request is queued for processing, and you can later poll for the result.

How It Works

1. Requirements and Optionals

  • A file in a supported media type (see the supported formats listed in the request body reference below).
  • A Talonic API key. Contact Talonic for details.
  • Optional: A valid JSON schema. See JSON-Schema.org for instructions.
  • Optional: A description of the data contained in the file; increases accuracy.

2. Submit a Request

  • Use the /process endpoint to submit a full job (extract + optional recommend + convert + optional validate). You can either upload a file or provide a URL to one, along with the JSON schema describing the expected results.
  • Alternatively, use /extract to only extract markdown from the source without conversion, or /recommend to only generate a recommended JSON schema for the source.

Sample cURL to submit a file directly:

curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/your/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"

Sample cURL to submit a file URL:

curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file_url=https://example.com/path/to/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
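The file URL submission above can also be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names encode_multipart and build_process_request are hypothetical, and it covers only the text-field (file_url) variant. Uploading a file directly would additionally need a file part with a filename and content type.

```python
import json
import uuid
import urllib.request

BASE_URL = "https://api.talonic.ai/data-extractor"


def encode_multipart(fields: dict) -> tuple:
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_process_request(api_key, file_url, schema, description=None):
    """Build (but do not send) a PUT /process request for a file URL."""
    # The API expects the schema as a stringified JSON document.
    fields = {"file_url": file_url, "json_schema": json.dumps(schema)}
    if description is not None:
        fields["description"] = description
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        f"{BASE_URL}/process",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
```

Passing the returned request to urllib.request.urlopen would submit it; the job_id for polling is then read from the JSON response body.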

3. Poll for Status

  • To check the status and get the result of your processing job, use the /process/{job_id} endpoint with the job_id returned in the submission response.

Sample cURL to poll job status:

curl -X GET "https://api.talonic.ai/data-extractor/process/YOUR_JOB_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

  • The response (ProcessStatusResponse) shows the current status of the processing job.
  • If the status is "success", the response also includes the data extracted according to your JSON schema.
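A polling loop for this step might look like the sketch below. It assumes that "success", "failed", and "cancelled" are the terminal statuses (taken from the status enum in the reference); the function name poll_until_done, the interval, and the attempt limit are illustrative choices, not values prescribed by the API. The HTTP call is injected as fetch_status so the loop stays independent of any particular client.

```python
import time
from typing import Callable

# Statuses after which the job will not change any more (assumed from the enum).
TERMINAL_STATUSES = {"success", "failed", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job reaches a terminal status."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job did not finish after {max_attempts} polls")
```

In practice fetch_status would issue the GET request shown above and return the decoded JSON body.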

Notes

  • Replace YOUR_API_KEY with your actual API key.
  • Replace placeholders like /path/to/your/file.pdf and YOUR_JOB_ID with your actual file path and job identifier.
  • Use the json_schema field to clearly define what data you expect to be extracted from the file.
  • The description can be used to provide additional context and information about the file to the system that may be necessary for proper extraction and/or mapping.
  • If a file_url is submitted, ensure that it is publicly accessible. Any errors in file validation will result in a "failed" processing status.

As the API is currently in testing, all endpoints and schemas are subject to change.

Servers
Computed URL: https://api.talonic.ai/data-extractor

Processing

PUT
/process
Submit a Processing Request

Submit a file or a file URL along with a JSON schema for processing.

No parameters

Request body


{ "file": "", "json_schema": { "$schema": "http://json-schema.org/draft-07/schema#", "title": "Invoice", "description": "ACME Invoice", "type": "object", "properties": { "invoiceId": { "type": "string", "description": "A unique identifier for the invoice.", "pattern": "^[A-Z]{2,3}-\\d{6}$", "examples": [ "INV-000001", "AB-123456" ] }, "date": { "type": "string", "description": "The date when the invoice was issued, in YYYY-MM-DD format.", "pattern": "^\\d{4}-\\d{2}-\\d{2}$", "examples": [ "2025-01-01", "2024-12-31" ] }, "dueDate": { "type": "string", "description": "The payment due date for the invoice, in YYYY-MM-DD format.", "pattern": "^\\d{4}-\\d{2}-\\d{2}$", "examples": [ "2025-01-15", "2024-12-31" ] }, "billTo": { "type": "object", "description": "Details of the entity being billed.", "properties": { "name": { "type": "string", "description": "Name of the customer or client.", "examples": [ "Acme Corporation", "John Doe" ] }, "address": { "type": "string", "description": "Billing address of the customer or client.", "examples": [ "123 Main St, Anytown, USA", "456 Elm St, Othertown, USA" ] }, "email": { "type": "string", "description": "Email address of the customer or client.", "format": "email", "examples": [ "contact@acme.com", "johndoe@example.com" ] } }, "required": [ "name", "address", "email" ] }, "items": { "type": "array", "description": "List of items or services included in the invoice.", "items": { "type": "object", "properties": { "description": { "type": "string", "description": "Description of the item or service.", "examples": [ "Web design services", "Consulting hours" ] }, "quantity": { "type": "integer", "description": "Quantity of the item or hours of service.", "minimum": 1, "examples": [ 10, 5 ] }, "unitPrice": { "type": "number", "description": "Price per single unit or hour, in the specified currency.", "minimum": 0, "pattern": "^[0-9]+(\\.[0-9]{2})$", "examples": [ 150, 75.5 ] }, "total": { "type": "number", "description": "Total price for the item (quantity * unitPrice).", "minimum": 0, "pattern": "^[0-9]+(\\.[0-9]{2})$", "examples": [ 1500, 377.5 ] } }, "required": [ "description", "quantity", "unitPrice", "total" ] }, "minItems": 1 }, "subtotal": { "type": "number", "description": "Sum of all item totals before taxes and discounts.", "minimum": 0, "pattern": "^[0-9]+(\\.[0-9]{2})$", "examples": [ 1877.5 ] }, "tax": { "type": "number", "description": "Tax amount applied to the subtotal.", "minimum": 0, "pattern": "^[0-9]+(\\.[0-9]{2})$", "examples": [ 150 ] }, "total": { "type": "number", "description": "Total amount due, including taxes and any additional charges.", "minimum": 0, "pattern": "^[0-9]+(\\.[0-9]{2})$", "examples": [ 2027.5 ] }, "currency": { "type": "string", "description": "ISO 4217 currency code.", "pattern": "^[A-Z]{3}$", "examples": [ "USD", "EUR" ] }, "terms": { "type": "string", "description": "Payment terms and conditions.", "examples": [ "Payment is due within 15 days.", "Net 30 days." ] } }, "required": [ "invoiceId", "date", "dueDate", "billTo", "items", "subtotal", "tax", "total", "currency" ] }, "fast_extraction": false, "description": "Generic invoice document for shop orders, all values are in USD if not otherwise stated." }
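To see how the string patterns in the example schema behave, the sketch below checks three of them against the schema's own example values using Python's re module. The PATTERNS table and the matches helper are illustrative names only. Note that in standard JSON Schema the pattern keyword applies to string values; on the number fields of the example (unitPrice, total, and so on) a standard validator would not apply it, and how the service itself treats those patterns is not stated here.

```python
import re

# Patterns copied from the string fields of the example invoice schema.
PATTERNS = {
    "invoiceId": r"^[A-Z]{2,3}-\d{6}$",
    "date": r"^\d{4}-\d{2}-\d{2}$",
    "currency": r"^[A-Z]{3}$",
}


def matches(field: str, value: str) -> bool:
    """Return True if value satisfies the pattern declared for field."""
    return re.search(PATTERNS[field], value) is not None


if __name__ == "__main__":
    for field, value in [
        ("invoiceId", "INV-000001"),
        ("invoiceId", "A-123456"),
        ("date", "2025-01-01"),
        ("currency", "usd"),
    ]:
        print(field, value, matches(field, value))
```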
(object | object)
    One of (object | object)
        #0 object
            file (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                Any of (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                    #0 string binary media type: application/pdf
                    .pdf file (Adobe Acrobat)

                    #1 string binary media type: text/csv
                    .csv file (Comma-Separated Values)

                    #2 string binary media type: application/msword
                    .doc file (Microsoft Word)

                    #3 string binary media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
                    .docx file (Microsoft Word)

                    #4 string binary media type: application/vnd.ms-excel
                    .xls file (Microsoft Excel)

                    #5 string binary media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
                    .xlsx file (Microsoft Excel)

                    #6 string binary media type: application/vnd.oasis.opendocument.spreadsheet
                    .ods file (Open Document Sheet)

                    #7 string binary media type: application/vnd.oasis.opendocument.text
                    .odt file (Open Document Text)

                    #8 string binary media type: application/vnd.apple.numbers
                    .numbers file (Apple Numbers)

                    #9 string binary media type: application/vnd.apple.pages
                    .pages file (Apple Pages)

                    #10 string binary media type: image/jpeg
                    .jpg file (JPEG Image)

                    #11 string binary media type: image/png
                    .png file (PNG Image)

                    #12 string binary media type: text/plain
                    .txt file (Plaintext)

                    #13 string binary media type: audio/mpeg
                    .mp3 file (MP3 Audio)

                    #14 string binary media type: audio/wav
                    .wav file (Waveform Audio)

                    #15 string binary media type: audio/ogg
                    .ogg/.oga file (Ogg Audio)

            json_schema string media type: application/json
            Stringified JSON schema describing the desired result.

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false    
            validation string
            Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.

                Enum array
                    #0="lax"
                    #1="strict"
                    #2="none"
                Default="lax"
            description string ≤ 1000 characters
            Optional description of or context for the provided file.

        #1 object
            file_url string uri
            Publicly accessible URL of the file to be processed. (See ProcessRequestFile for supported file formats.)

            json_schema string media type: application/json
            Stringified JSON schema describing the desired result.

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false
            validation string
            Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.

                Enum array
                    #0="lax"
                    #1="strict"
                    #2="none"
                Default="lax"
            description string ≤ 1000 characters
            Optional description of or context for the provided file.
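The optional form fields described above are all sent as strings, so a client needs to serialize and check them before submission. The following is a minimal sketch of that step; build_form_fields is a hypothetical helper, and the client-side checks simply mirror the documented enum values and the 1000-character limit rather than any validation the server is known to perform.

```python
import json
from typing import Optional

VALIDATION_POLICIES = ("lax", "strict", "none")
MAX_DESCRIPTION_LENGTH = 1000


def build_form_fields(
    schema: dict,
    fast_extraction: bool = False,
    validation: str = "lax",
    description: Optional[str] = None,
) -> dict:
    """Return the text form fields for a /process request as strings."""
    if validation not in VALIDATION_POLICIES:
        raise ValueError(f"validation must be one of {VALIDATION_POLICIES}")
    if description is not None and len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description must be at most 1000 characters")
    fields = {
        # The schema is sent as stringified JSON.
        "json_schema": json.dumps(schema),
        # fast_extraction is a string enum ("true" or "false"), not a boolean.
        "fast_extraction": "true" if fast_extraction else "false",
        "validation": validation,
    }
    if description is not None:
        fields["description"] = description
    return fields
```

Either a file part or a file_url field would be added to this dictionary before sending, depending on which request variant is used.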

Response

Code Description Links
202

Processing request accepted and queued.

Media type

{ "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "queued", "start_time": "2025-09-09T06:41:31.848Z", "estimated_time_seconds": 0, "message": "string", "filename": "string" }
ProcessResponse object
    correlation_id string uuid
    Unique correlation ID for the request.

    job_id string uuid
    Unique job ID for polling status.

    status string
    Initial status of the request.

        Enum array
            #0="queued"
            #1="processing"
            #2="failed"
            #3="success"
            #4="cancelled"

    start_time string date-time
    ISO 8601 timestamp when processing started.

    estimated_time_seconds integer
    Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

    message string
    Informational message about the request.

    filename string
    Original name of the submitted or linked file, including extension
400

Bad Request. Invalid input parameters.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
413

Payload Too Large. Submitted payload is larger than the maximum allowable size.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
415

Unsupported Media Type. The server does not support the provided media type.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
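A client can use the response codes above to decide whether resubmitting makes sense. The sketch below is one possible policy, not behavior mandated by the API: the 60-second wait for 429 follows the "wait a minute" hint in its description, while retrying 5xx responses with capped exponential backoff is an assumption of this example. The function name retry_delay is hypothetical.

```python
from typing import Optional


def retry_delay(status_code: int, attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry is pointless.

    attempt is the zero-based number of retries already performed.
    """
    if status_code == 429:
        # "Too Many Requests. Wait a minute and try again."
        return 60.0
    if status_code >= 500:
        # Assumed policy: exponential backoff capped at 30 seconds.
        return float(min(2 ** attempt, 30))
    # 400, 401, 413, 415: the request itself must be changed first.
    return None
```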
GET
/process/{job_id}
Get Processing Status

Retrieve the status and result of a processing job using its ID.

Name Description

job_id* string($uuid)(path)

Unique identifier of the processing job.

include-schema string(query)

Include JSON schema in response body.

Available values : true, false

include-markdown string(query)

Include markdown extracted from source data in response body.

Available values : true, false
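The path and query parameters above combine into a single status URL. As a small sketch (status_url is a hypothetical helper name), the URL can be assembled with urllib.parse so that the job ID is escaped and the optional flags are only added when requested:

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.talonic.ai/data-extractor"


def status_url(job_id: str, include_schema: bool = False,
               include_markdown: bool = False) -> str:
    """Build the GET /process/{job_id} URL with optional query flags."""
    params = {}
    if include_schema:
        params["include-schema"] = "true"
    if include_markdown:
        params["include-markdown"] = "true"
    url = f"{BASE_URL}/process/{quote(job_id, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url
```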

Response

Code Description Links
200

Processing status retrieved successfully.

Media type

{ "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "queued", "start_time": "2025-09-10T10:08:56.445Z", "estimated_time_seconds": 0, "finish_time": "2025-09-10T10:08:56.445Z", "message": "string", "filename": "string", "result": {}, "json_schema": {}, "markdown": "string", "validation_result": { "concerns": [ { "path": "string", "text": "string", "level": "error", "code": "missing_value" } ], "summary": "string" } }
ProcessStatusResponse object
    correlation_id string uuid
    Correlation ID of the request.

    job_id string uuid
    Job ID for polling.

    status string
    Current status of the processing job.

        Enum array
            #0="queued"
            #1="processing"
            #2="failed"
            #3="success"
            #4="cancelled"

    start_time string date-time
    ISO 8601 timestamp when processing started.

    estimated_time_seconds integer
    Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

    finish_time string | null date-time
    ISO 8601 timestamp when processing finished, or null if not finished.

    message string
    Status message, if any.

    filename string
    Original name of the submitted or linked file, including extension

    result object | null
    Processing result following the provided JSON schema, or null if not finished.

    json_schema object
    JSON schema used to create the result JSON. Only present if status is "success" and include-schema is true.

    markdown string
    Markdown representation of the source data. Only present if status is "success" and include-markdown is true.

    validation_result object
    Validation result of the extracted JSON. Only present if status is "success" and validation is not 'none'.

        concerns array
        List of concerns with their JSON paths.

            Items object
                path string
                JSON path of the field related to the concern

                text string
                Human-readable description of the concern

                level string
                Severity level of the concern

                    Enum array
                        #0="error"
                        #1="warning"
                        #2="info"

                code string
                Code of the concern

                    Enum array
                        #0="missing_value"
                        #1="null_value"
                        #2="additional_value"
                        #3="format_inconsistent"
                        #4="numeric_mismatch"
                        #5="floating_precision_diff"
                        #6="semantic_conflict"
                        #7="array_length_mismatch"
                        #8="type_mismatch"
                        #9="out_of_range"
                        #10="duplicate_value"
                        #11="incomplete_object"
                        #12="extra_fields"
                        #13="order_difference"

        summary string
        Executive summary of the validation results.
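Once a job has finished, the validation_result object can be summarized on the client side. The sketch below counts concerns per severity level and lists the paths of those at level "error". The helper names and the sample paths are illustrative; the exact path syntax returned by the service is not specified here.

```python
from collections import Counter


def count_by_level(validation_result: dict) -> Counter:
    """Count concerns per severity level (error, warning, info)."""
    return Counter(c["level"] for c in validation_result.get("concerns", []))


def error_paths(validation_result: dict) -> list:
    """Return the paths of all concerns whose level is "error"."""
    return [
        c["path"]
        for c in validation_result.get("concerns", [])
        if c["level"] == "error"
    ]


if __name__ == "__main__":
    # Hypothetical sample shaped like the documented validation_result object.
    sample = {
        "concerns": [
            {"path": "billTo.email", "text": "Value missing",
             "level": "error", "code": "missing_value"},
            {"path": "total", "text": "Rounding differs",
             "level": "warning", "code": "floating_precision_diff"},
        ],
        "summary": "One error and one warning.",
    }
    print(count_by_level(sample), error_paths(sample))
```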
                                        
                                                                        
                                
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
404

Not Found. No job found with the provided ID.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
DELETE
/process/{job_id}
Cancel Processing Job

Cancel a running job by job_id.

Name Description

job_id* string($uuid)(path)

Unique identifier of the processing job to cancel.

Response

Code Description Links
202

Cancellation request accepted.

400

Bad Request.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
404

Not Found. No job found with the provided ID.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
409

Conflict. Job is already finished or cancelled.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
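A cancellation call can be sketched with the standard library as follows. The helper names build_cancel_request and describe_cancel_outcome are hypothetical, and the outcome messages paraphrase the response descriptions listed above; the request is only constructed here, not sent.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.talonic.ai/data-extractor"


def build_cancel_request(api_key: str, job_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /process/{job_id} request."""
    return urllib.request.Request(
        f"{BASE_URL}/process/{quote(job_id, safe='')}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def describe_cancel_outcome(status_code: int) -> str:
    """Translate a cancellation response code into a short message."""
    outcomes = {
        202: "cancellation accepted",
        404: "no job found with this ID",
        409: "job is already finished or cancelled",
    }
    return outcomes.get(status_code, f"unexpected response code {status_code}")
```

Treating 409 as a non-error is often reasonable in client code, since the job is no longer running in either case.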
PUT
/extract
Submit an Extraction Request

Submit a file or a file URL to extract markdown only. No conversion or validation is performed.

No parameters

No parameters

Request body

{ "file": "string", "fast_extraction": false, "description": "string" }
(object | object)
    One of (object | object)
        #0 object
            file (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                Any of (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                    #0 string binary media type: application/pdf
                    .pdf file (Adobe Acrobat)

                    #1 string binary media type: text/csv
                    .csv file (Comma-Separated Values)

                    #2 string binary media type: application/msword
                    .doc file (Microsoft Word)

                    #3 string binary media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
                    .docx file (Microsoft Word)

                    #4 string binary media type: application/vnd.ms-excel
                    .xls file (Microsoft Excel)

                    #5 string binary media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
                    .xlsx file (Microsoft Excel)

                    #6 string binary media type: application/vnd.oasis.opendocument.spreadsheet
                    .ods file (Open Document Sheet)

                    #7 string binary media type: application/vnd.oasis.opendocument.text
                    .odt file (Open Document Text)

                    #8 string binary media type: application/vnd.apple.numbers
                    .numbers file (Apple Numbers)

                    #9 string binary media type: application/vnd.apple.pages
                    .pages file (Apple Pages)

                    #10 string binary media type: image/jpeg
                    .jpg file (JPEG Image)

                    #11 string binary media type: image/png
                    .png file (PNG Image)

                    #12 string binary media type: text/plain
                    .txt file (Plaintext)

                    #13 string binary media type: audio/mpeg
                    .mp3 file (MP3 Audio)

                    #14 string binary media type: audio/wav
                    .wav file (Waveform Audio)

                    #15 string binary media type: audio/ogg
                    .ogg/.oga file (Ogg Audio)

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false 

            description string ≤ 1000 characters
            Optional description of or context for the provided file.

        #1 object
            file_url string uri
            Publicly accessible URL of the file to be processed. (See ExtractRequestFile for supported file formats.)

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false

            description string ≤ 1000 characters
            Optional description of or context for the provided file.

Response

Code Description Links
202

Extraction request accepted and queued.

Media type

{ "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "queued", "start_time": "2025-09-11T03:48:18.939Z", "estimated_time_seconds": 0, "message": "string", "filename": "string" }
ProcessResponse object
    correlation_id string uuid
    Unique correlation ID for the request.

    job_id string uuid
    Unique job ID for polling status.

    status string
    Initial status of the request.

        Enum array
            #0="queued"
            #1="processing"
            #2="failed"
            #3="success"
            #4="cancelled"

    start_time string date-time
    ISO 8601 timestamp when processing started.

    estimated_time_seconds integer
    Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

    message string
    Informational message about the request.

    filename string
    Original name of the submitted or linked file, including extension
400

Bad Request. Invalid input parameters.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
413

Payload Too Large. Submitted payload is larger than the maximum allowable size.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
415

Unsupported Media Type. The server does not support the provided media type.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
PUT
/recommend
Submit a Schema Recommendation Request

Submit a file or a file URL to generate a recommended JSON schema only. No conversion or validation is performed.

No parameters

No parameters

Request body

{ "file": "string", "fast_extraction": false, "description": "string" }
(object | object)
    One of (object | object)
        #0 object
            file (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                Any of (string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string)
                    #0 string binary media type: application/pdf
                    .pdf file (Adobe Acrobat)

                    #1 string binary media type: text/csv
                    .csv file (Comma-Separated Values)

                    #2 string binary media type: application/msword
                    .doc file (Microsoft Word)

                    #3 string binary media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
                    .docx file (Microsoft Word)

                    #4 string binary media type: application/vnd.ms-excel
                    .xls file (Microsoft Excel)

                    #5 string binary media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
                    .xlsx file (Microsoft Excel)

                    #6 string binary media type: application/vnd.oasis.opendocument.spreadsheet
                    .ods file (Open Document Sheet)

                    #7 string binary media type: application/vnd.oasis.opendocument.text
                    .odt file (Open Document Text)

                    #8 string binary media type: application/vnd.apple.numbers
                    .numbers file (Apple Numbers)

                    #9 string binary media type: application/vnd.apple.pages
                    .pages file (Apple Pages)

                    #10 string binary media type: image/jpeg
                    .jpg file (JPEG Image)

                    #11 string binary media type: image/png
                    .png file (PNG Image)

                    #12 string binary media type: text/plain
                    .txt file (Plaintext)

                    #13 string binary media type: audio/mpeg
                    .mp3 file (MP3 Audio)

                    #14 string binary media type: audio/wav
                    .wav file (Waveform Audio)

                    #15 string binary media type: audio/ogg
                    .ogg/.oga file (Ogg Audio)

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false 

            description string ≤ 1000 characters
            Optional description of or context for the provided file.

        #1 object
            file_url string uri
            Publicly accessible URL of the file to be processed. (See ExtractRequestFile for supported file formats.)

            fast_extraction string
            Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

                Enum array
                    #0=true
                    #1=false
                Default=false

            description string ≤ 1000 characters
            Optional description of or context for the provided file.

Response

Code Description Links
202

Recommendation request accepted and queued.

Media type

{ "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "queued", "start_time": "2025-09-13T03:41:56.257Z", "estimated_time_seconds": 0, "message": "string", "filename": "string" }
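ProcessResponse object
    correlation_id string uuid
    Unique correlation ID for the request.

    job_id string uuid
    Unique job ID for polling status.

    status string
    Initial status of the request.

        Enum array
            #0="queued"
            #1="processing"
            #2="failed"
            #3="success"
            #4="cancelled"

    start_time string date-time
    ISO 8601 timestamp when processing started.

    estimated_time_seconds integer
    Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

    message string
    Informational message about the request.

    filename string
    Original name of the submitted or linked file, including extension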
400

Bad Request. Invalid input parameters.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
413

Payload Too Large. Submitted payload is larger than the maximum allowable size.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
415

Unsupported Media Type. The server does not support the provided media type.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.

Other Endpoints

GET
/status
Check the availability and API version of the service

Check the availability and API version of the service.

Parameters

No parameters

Response

Code Description Links
200

Server is able to respond to the request.

Media type

{ "status": "OK", "version": "1.1.5" }
object
    status string
    Current health of the server.

        Enum array
            #0="OK"
            #1="Unstable"

    version string
    Current API version.

        Examples array
            #0="1.1.5"
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
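A client might check this endpoint before submitting work. The sketch below interprets a decoded /status response body; is_service_ok and parse_version are hypothetical helpers, and treating the version as dot-separated integers is an assumption based only on the "1.1.5" example.

```python
def is_service_ok(payload: dict) -> bool:
    """Return True only when the reported health is "OK" (not "Unstable")."""
    return payload.get("status") == "OK"


def parse_version(version: str) -> tuple:
    """Split a dotted version string such as "1.1.5" into integers."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    body = {"status": "OK", "version": "1.1.5"}
    print(is_service_ok(body), parse_version(body["version"]))
```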
POST
/process/{job_id}/feedback
Submit or update feedback for a completed job

Provide feedback for a finished job in order to help us improve the quality of the results.

Name Description

job_id* string($uuid)(path)

Unique identifier of the processing job.

Request body

{ "rating": 1, "data_complete": true, "schema_correct": true, "data_correct": true, "share_data": false, "additional_feedback": "string" }
ProcessFeedback object
    rating integer[1, 5]
    Overall rating of the result on a 5-star-like scale, 1 being the lowest.

    data_complete boolean
    Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points.

    schema_correct boolean
    Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points.

    data_correct boolean
    Indicates whether or not the data returned by the API is or appears to be correct, meaning that it contains no wrong information.

    share_data boolean
    Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy.

        Default=false

    additional_feedback string ≤ 1000 characters
    Text containing any additional feedback we should know.
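The ProcessFeedback body can be assembled and checked on the client before it is posted. This is a minimal sketch: build_feedback is a hypothetical helper, and which fields the server treats as required is not stated above, so the example simply sends all of them except an omitted additional_feedback.

```python
import json
from typing import Optional

MAX_FEEDBACK_LENGTH = 1000


def build_feedback(
    rating: int,
    data_complete: bool,
    schema_correct: bool,
    data_correct: bool,
    share_data: bool = False,
    additional_feedback: Optional[str] = None,
) -> str:
    """Return a JSON body for POST /process/{job_id}/feedback."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    if additional_feedback is not None and len(additional_feedback) > MAX_FEEDBACK_LENGTH:
        raise ValueError("additional_feedback must be at most 1000 characters")
    payload = {
        "rating": rating,
        "data_complete": data_complete,
        "schema_correct": schema_correct,
        "data_correct": data_correct,
        # Mirrors the documented default of false.
        "share_data": share_data,
    }
    if additional_feedback is not None:
        payload["additional_feedback"] = additional_feedback
    return json.dumps(payload)
```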

Response

Code Description Links
204

Feedback accepted.

400

Bad Request. Invalid input parameters.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
401

Unauthorized. Missing or invalid API key.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
422

Unprocessable Content. Processing is likely not yet finished.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
429

Too Many Requests. Wait a minute and try again.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
500

Server error.

Media type

{ "detail": "string" }
ErrorResponse object
    detail string
    Error message detailing what went wrong.
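
The response codes listed above suggest different client reactions. The sketch below maps each documented code to a coarse next step; the function name and the returned labels are hypothetical and only meant to illustrate one way of branching on the codes.

```python
def feedback_outcome(status_code):
    """Map a feedback response code to a coarse next step (labels are illustrative)."""
    if status_code == 204:
        return "accepted"             # feedback stored, no body returned
    if status_code == 422:
        return "retry_when_finished"  # job is likely still processing
    if status_code == 429:
        return "retry_after_wait"     # rate limited: wait a minute first
    if status_code in (400, 401):
        return "fix_request"          # bad input or missing/invalid API key
    if 500 <= status_code < 600:
        return "server_error"
    return "unexpected"
```

For every non-204 outcome, the detail field of the ErrorResponse body carries the human-readable reason.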

Schemas

ExtractRequestFile

file string (binary)
    Any of the following media types:
        #0 string (binary), media type: application/pdf
        .pdf file (Adobe Acrobat)

        #1 string (binary), media type: text/csv
        .csv file (Comma-Separated Values)

        #2 string (binary), media type: application/msword
        .doc file (Microsoft Word)

        #3 string (binary), media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
        .docx file (Microsoft Word)

        #4 string (binary), media type: application/vnd.ms-excel
        .xls file (Microsoft Excel)

        #5 string (binary), media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
        .xlsx file (Microsoft Excel)

        #6 string (binary), media type: application/vnd.oasis.opendocument.spreadsheet
        .ods file (Open Document Sheet)

        #7 string (binary), media type: application/vnd.oasis.opendocument.text
        .odt file (Open Document Text)

        #8 string (binary), media type: application/vnd.apple.numbers
        .numbers file (Apple Numbers)

        #9 string (binary), media type: application/vnd.apple.pages
        .pages file (Apple Pages)

        #10 string (binary), media type: image/jpeg
        .jpg file (JPEG Image)

        #11 string (binary), media type: image/png
        .png file (PNG Image)

        #12 string (binary), media type: text/plain
        .txt file (Plaintext)

        #13 string (binary), media type: audio/mpeg
        .mp3 file (MP3 Audio)

        #14 string (binary), media type: audio/wav
        .wav file (Waveform Audio)

        #15 string (binary), media type: audio/ogg
        .ogg/.oga file (Ogg Audio)

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
description string, ≤ 1000 characters
Optional description of or context for the provided file.
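
Because the file field only accepts the media types listed above, it can help to check a file name locally before uploading. The sketch below hard-codes the extension-to-media-type pairs from that list; the names SUPPORTED_MEDIA_TYPES and media_type_for are hypothetical, and the server may well apply its own detection in addition.

```python
from pathlib import PurePath

# Extension -> media type, transcribed from the supported formats listed above.
SUPPORTED_MEDIA_TYPES = {
    ".pdf": "application/pdf",
    ".csv": "text/csv",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ods": "application/vnd.oasis.opendocument.spreadsheet",
    ".odt": "application/vnd.oasis.opendocument.text",
    ".numbers": "application/vnd.apple.numbers",
    ".pages": "application/vnd.apple.pages",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".txt": "text/plain",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
    ".oga": "audio/ogg",
}


def media_type_for(filename):
    """Return the documented media type for a file name, or None if unsupported."""
    return SUPPORTED_MEDIA_TYPES.get(PurePath(filename).suffix.lower())
```

A None result means the extension is not in the documented list, so the upload would probably be rejected.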

ExtractRequestFileURL

file_url string uri
Publicly accessible URL of the file to be processed (see ExtractRequestFile for supported file formats).

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
description string, ≤ 1000 characters
Optional description of or context for the provided file.

RecommendRequestFile

file string (binary)
    Any of the following media types:
        #0 string (binary), media type: application/pdf
        .pdf file (Adobe Acrobat)

        #1 string (binary), media type: text/csv
        .csv file (Comma-Separated Values)

        #2 string (binary), media type: application/msword
        .doc file (Microsoft Word)

        #3 string (binary), media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
        .docx file (Microsoft Word)

        #4 string (binary), media type: application/vnd.ms-excel
        .xls file (Microsoft Excel)

        #5 string (binary), media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
        .xlsx file (Microsoft Excel)

        #6 string (binary), media type: application/vnd.oasis.opendocument.spreadsheet
        .ods file (Open Document Sheet)

        #7 string (binary), media type: application/vnd.oasis.opendocument.text
        .odt file (Open Document Text)

        #8 string (binary), media type: application/vnd.apple.numbers
        .numbers file (Apple Numbers)

        #9 string (binary), media type: application/vnd.apple.pages
        .pages file (Apple Pages)

        #10 string (binary), media type: image/jpeg
        .jpg file (JPEG Image)

        #11 string (binary), media type: image/png
        .png file (PNG Image)

        #12 string (binary), media type: text/plain
        .txt file (Plaintext)

        #13 string (binary), media type: audio/mpeg
        .mp3 file (MP3 Audio)

        #14 string (binary), media type: audio/wav
        .wav file (Waveform Audio)

        #15 string (binary), media type: audio/ogg
        .ogg/.oga file (Ogg Audio)

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
description string, ≤ 1000 characters
Optional description of or context for the provided file.

RecommendRequestFileURL

file_url string uri
Publicly accessible URL of the file to be processed (see RecommendRequestFile for supported file formats).

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
description string, ≤ 1000 characters
Optional description of or context for the provided file.

ProcessRequestFile

file string (binary)
    Any of the following media types:
        #0 string (binary), media type: application/pdf
        .pdf file (Adobe Acrobat)

        #1 string (binary), media type: text/csv
        .csv file (Comma-Separated Values)

        #2 string (binary), media type: application/msword
        .doc file (Microsoft Word)

        #3 string (binary), media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
        .docx file (Microsoft Word)

        #4 string (binary), media type: application/vnd.ms-excel
        .xls file (Microsoft Excel)

        #5 string (binary), media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
        .xlsx file (Microsoft Excel)

        #6 string (binary), media type: application/vnd.oasis.opendocument.spreadsheet
        .ods file (Open Document Sheet)

        #7 string (binary), media type: application/vnd.oasis.opendocument.text
        .odt file (Open Document Text)

        #8 string (binary), media type: application/vnd.apple.numbers
        .numbers file (Apple Numbers)

        #9 string (binary), media type: application/vnd.apple.pages
        .pages file (Apple Pages)

        #10 string (binary), media type: image/jpeg
        .jpg file (JPEG Image)

        #11 string (binary), media type: image/png
        .png file (PNG Image)

        #12 string (binary), media type: text/plain
        .txt file (Plaintext)

        #13 string (binary), media type: audio/mpeg
        .mp3 file (MP3 Audio)

        #14 string (binary), media type: audio/wav
        .wav file (Waveform Audio)

        #15 string (binary), media type: audio/ogg
        .ogg/.oga file (Ogg Audio)

json_schema string, media type: application/json
Stringified JSON schema describing the desired result.

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
validation string
Validation policy to apply: 'lax' collects concerns but does not fail the job; 'strict' fails the job if max_errors is reached or the result is invalid overall; 'none' disables validation.

    Enum array
        #0="lax"
        #1="strict"
        #2="none"
        Default="lax"
description string, ≤ 1000 characters
Optional description of or context for the provided file.

ProcessRequestFileURL

file_url string uri
Publicly accessible URL of the file to be processed (see ProcessRequestFile for supported file formats).

json_schema string, media type: application/json
Stringified JSON schema describing the desired result.

fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.

    Enum array
        #0=true
        #1=false
        Default=false
validation string
Validation policy to apply: 'lax' collects concerns but does not fail the job; 'strict' fails the job if max_errors is reached or the result is invalid overall; 'none' disables validation.

    Enum array
        #0="lax"
        #1="strict"
        #2="none"
        Default="lax"
description string, ≤ 1000 characters
Optional description of or context for the provided file.
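
The fields of ProcessRequestFileURL are sent as form fields, so the schema has to be stringified and the boolean-like fast_extraction flag becomes a string. The following sketch assembles such a field dictionary under those assumptions; process_url_fields and VALIDATION_POLICIES are hypothetical names, and the resulting dict would still have to be encoded as multipart form data by an HTTP client.

```python
import json

# Allowed values of the validation field, as listed above.
VALIDATION_POLICIES = ("lax", "strict", "none")


def process_url_fields(file_url, json_schema=None, *, fast_extraction=False,
                       validation="lax", description=None):
    """Assemble form fields for a /process request that references a file URL."""
    if validation not in VALIDATION_POLICIES:
        raise ValueError(f"validation must be one of {VALIDATION_POLICIES}")
    if description is not None and len(description) > 1000:
        raise ValueError("description is limited to 1000 characters")
    fields = {
        "file_url": file_url,
        # The schema lists fast_extraction as a string enum of true/false.
        "fast_extraction": "true" if fast_extraction else "false",
        "validation": validation,
    }
    if json_schema is not None:
        # json_schema must be sent as a stringified JSON document.
        fields["json_schema"] = json.dumps(json_schema)
    if description is not None:
        fields["description"] = description
    return fields
```

Leaving json_schema unset mirrors the guide's note that the schema is optional.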

ProcessResponse

correlation_id string uuid
Unique correlation ID for the request.

job_id string uuid
Unique job ID for polling status.

status string
Initial status of the request.

    Enum array
        #0="queued"
        #1="processing"
        #2="failed"
        #3="success"
        #4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.

estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

message string
Informational message about the request.

filename string
Original name of the submitted or linked file, including extension.
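
The estimated_time_seconds field can inform how long a client waits before its first poll. The sketch below derives a delay from a ProcessResponse-shaped dict; the function name, the default, and the clamping bounds are arbitrary assumptions, and treating failed, success, and cancelled as terminal is inferred from the note that the estimate is only present while queued or processing.

```python
# Statuses assumed to be terminal (no further polling needed).
TERMINAL_STATUSES = {"failed", "success", "cancelled"}


def next_poll_delay(response, default=5.0, minimum=1.0, maximum=60.0):
    """Return seconds to wait before polling, or None if the job is done."""
    if response.get("status") in TERMINAL_STATUSES:
        return None
    estimate = response.get("estimated_time_seconds")
    if estimate is None:
        return default
    # Clamp the server's estimate so the client neither spins nor stalls.
    return max(minimum, min(float(estimate), maximum))
```

A caller would typically sleep for the returned number of seconds and then request /process/{job_id}.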

ProcessValidationResult

concerns array object
List of concerns with their JSON paths.

    Items object
        path string
        JSON path of the field related to the concern

        text string
        Human-readable description of the concern

        level string
        Severity level of the concern

            Enum array
                #0="error"
                #1="warning"
                #2="info"
        code string
        Code of the concern

            Enum array
                #0="missing_value"
                #1="null_value"
                #2="additional_value"
                #3="format_inconsistent"
                #4="numeric_mismatch"
                #5="floating_precision_diff"
                #6="semantic_conflict"
                #7="array_length_mismatch"
                #8="type_mismatch"
                #9="out_of_range"
                #10="duplicate_value"
                #11="incomplete_object"
                #12="extra_fields"
                #13="order_difference"
                #14="array_reordered"
                #15="array_deduplicated"
                #16="numeric_precision_normalized"
                #17="date_format_normalized"
                #18="optional_field_merged"
                #19="llm_value_selected"
summary string
Executive summary of the validation results.
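
To show how a ProcessValidationResult might be consumed, the sketch below counts concerns per severity level and collects the paths of error-level concerns. The function name and the output shape are hypothetical.

```python
from collections import Counter


def summarize_concerns(validation_result):
    """Count concerns per level and list the paths of error-level concerns."""
    concerns = validation_result.get("concerns") or []
    counts = Counter(c.get("level", "unknown") for c in concerns)
    error_paths = [c.get("path") for c in concerns if c.get("level") == "error"]
    return {"counts": dict(counts), "error_paths": error_paths}
```

Under the 'lax' policy such a summary could drive a manual review step, since concerns are collected without failing the job.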

ProcessStatusResponse

correlation_id string uuid
Correlation ID of the request.

job_id string uuid
Job ID for polling.

status string
Current status of the processing job.

    Enum array
        #0="queued"
        #1="processing"
        #2="failed"
        #3="success"
        #4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.

estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.

finish_time string | null date-time
ISO 8601 timestamp when processing finished, or null if not finished.

message string
Status message, if any.

filename string
Original name of the submitted or linked file, including extension.

result object | null
Processing result following the provided JSON schema, or null if not finished.

json_schema object
JSON schema used to create the result JSON. Only present if status is finished and include-schema is true.

markdown string
Markdown representation of the source data. Only present if status is finished and include-markdown is true.

validation_result object
Validation result of the extracted JSON. Only present if status is finished and validate is true.

    concerns array object
    List of concerns with their JSON paths.

        Items object
            path string
            JSON path of the field related to the concern

            text string
            Human-readable description of the concern

            level string
            Severity level of the concern

                Enum array
                    #0="error"
                    #1="warning"
                    #2="info"
            code string
            Code of the concern

                Enum array
                    #0="missing_value"
                    #1="null_value"
                    #2="additional_value"
                    #3="format_inconsistent"
                    #4="numeric_mismatch"
                    #5="floating_precision_diff"
                    #6="semantic_conflict"
                    #7="array_length_mismatch"
                    #8="type_mismatch"
                    #9="out_of_range"
                    #10="duplicate_value"
                    #11="incomplete_object"
                    #12="extra_fields"
                    #13="order_difference"
                    #14="array_reordered"
                    #15="array_deduplicated"
                    #16="numeric_precision_normalized"
                    #17="date_format_normalized"
                    #18="optional_field_merged"
                    #19="llm_value_selected"
    summary string
    Executive summary of the validation results.
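
A polling client usually only cares about three outcomes of a ProcessStatusResponse: still pending, finished with a result, or ended without one. The following sketch encodes that reading; JobFailed and result_or_none are hypothetical names, and raising on failed or cancelled is a design choice rather than documented behavior.

```python
class JobFailed(Exception):
    """Raised when a job ended without producing a result."""


def result_or_none(status_response):
    """Return the result on success, None while pending, raise otherwise."""
    status = status_response.get("status")
    if status == "success":
        return status_response.get("result")
    if status in ("failed", "cancelled"):
        # Surface the server's status message, if any, to the caller.
        raise JobFailed(f"job {status}: {status_response.get('message', '')}")
    return None  # queued or processing: poll again later
```

Raising an exception keeps the pending case (None) distinct from a job that will never deliver a result.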

ProcessFeedback

rating integer[1, 5]
Overall rating of the result on a five-star scale, 1 being the lowest.

data_complete boolean
Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points.

schema_correct boolean
Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points.

data_correct boolean
Indicates whether or not the data returned by the API is or appears to be correct, meaning that it contains no wrong information.

share_data boolean
Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy.

    Default=false
additional_feedback string, ≤ 1000 characters
Text containing any additional feedback we should know.

ErrorResponse

detail string
Error message detailing what went wrong.