How are my documents processed?
This page explains how PDF documents you upload are processed by the “Smart Document Analyzer” service. The service automates invoice analysis and extracts specific information such as the invoice number, date, or total amount.
1. Upload of documents
You can upload PDF documents directly via the web interface. The transmission is encrypted using HTTPS (TLS).
The uploaded files are used exclusively for performing the automated analysis.
2. Automated processing
After upload, the content of the PDF files is automatically processed to extract text information from the document.
This process is fully automated and does not involve manual review by staff.
3. Text extraction (OCR)
If a document does not contain directly readable text (e.g. scanned invoices), automated text recognition (OCR – Optical Character Recognition) may be applied.
An external OCR service may be used for this purpose:
OCR.space
This service is used solely to convert document content into text.
The OCR service may process data temporarily as required for text recognition.
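The decision described above (use the text layer if there is one, otherwise fall back to OCR) can be sketched as follows. This is only an illustration and not the service's actual code: the function name `needs_ocr` and the threshold of 20 characters per page are assumptions, and the per-page text is assumed to come from whatever PDF text extraction step runs first.

```python
def needs_ocr(page_texts: list[str], min_chars_per_page: int = 20) -> bool:
    """Return True if the document should be sent to text recognition.

    page_texts holds the directly readable text of each page.
    min_chars_per_page is a hypothetical threshold, not a documented value.
    """
    if not page_texts:
        # No readable text at all: treat the file as a scanned document.
        return True
    # A single page without a usable text layer is enough to require OCR.
    return any(len(text.strip()) < min_chars_per_page for text in page_texts)
```

A scanned invoice typically yields empty or near-empty page texts, so it would be routed to OCR, while a digitally generated PDF would skip that step.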
4. AI-based analysis
The extracted text is then automatically analyzed to identify structured information from the document.
Typical extracted data may include:
- Invoice number
- Invoice date
- Total amount
- Company name
- VAT ID
- Address
- Additional selected fields
A language model (AI-based system) may be used for this analysis.
The following provider is currently used:
Mistral AI
Parts of the extracted text content may be transmitted to this service to enable structured analysis of document data.
The transmitted data is not used for training purposes.
The provider may temporarily store data where technically necessary (e.g. for debugging or system stability).
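To illustrate what "structured information" can look like, the sketch below turns a JSON reply into a fixed set of fields. It is an assumption-based example, not the service's implementation: the class `InvoiceFields`, the function `parse_model_reply`, the field names, and the idea that the language model answers with a JSON object are all hypothetical. No call to the provider is shown.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceFields:
    # Hypothetical field names mirroring the typical data listed above.
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None
    total_amount: Optional[str] = None
    company_name: Optional[str] = None
    vat_id: Optional[str] = None
    address: Optional[str] = None


def parse_model_reply(reply: str) -> InvoiceFields:
    """Parse a JSON object; unknown keys are ignored, missing ones stay None."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        # An unparseable reply yields an empty result instead of an error.
        return InvoiceFields()
    if not isinstance(data, dict):
        return InvoiceFields()
    known = InvoiceFields.__dataclass_fields__
    cleaned = {k: str(v) for k, v in data.items() if k in known and v is not None}
    return InvoiceFields(**cleaned)
```

Keeping missing fields as `None` rather than guessing a value makes gaps visible when the results are reviewed.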
5. Temporary storage
Uploaded documents and extracted data may be stored temporarily for the purpose of performing the analysis.
This storage is strictly limited to the following purposes:
- Performing document analysis
- Providing analysis results
- Generating exportable result files (e.g. Excel)
Uploaded documents and extracted data are automatically deleted after a maximum of 24 hours.
6. Provision of results
The extracted data is provided in structured form.
Users may download the results, for example as an Excel file (.xlsx).
The downloaded file is stored exclusively on the user's device.
7. User responsibility
Users are responsible for uploading only documents whose processing is legally permitted.
In particular, users must not upload documents containing sensitive personal data, such as:
- Health data
- Biometric data
- Information on criminal convictions (Art. 10 GDPR)
- Other special categories of personal data under Art. 9 GDPR
By activating the checkbox during the upload process, you confirm that the uploaded documents do not contain such sensitive data.
8. Accuracy disclaimer
The automated analysis is based on OCR and AI technologies.
Despite careful implementation, it cannot be guaranteed that all extracted data is complete, accurate, or error-free.
The results are provided for assistance purposes only and should be reviewed by the user before further use (e.g. accounting or tax purposes).
9. Further information
Further information on the processing of personal data can be found in our Privacy Policy.