Web Service API

This page documents the Spring Boot service entry point that exposes document-processing, OCR, redaction, watermarking, and AI-assisted classification endpoints.

Scope

The service module is a direct HTTP API. It is separate from the desktop client and web client, and it focuses on file transformation plus text and classification utilities.

Primary Entry Point

The application is defined in WebServiceApplication.

That class is both:

the Spring Boot application entry point
the REST controller for the service endpoints

The main() method loads the iText license key, sets the server port to 8081, and starts the application.

Exposed Endpoints

The current controller surface includes:

POST /redact for PDF redaction using a multipart file and a redaction payload
POST /search for locating matching text ranges in a PDF
POST /watermark/add for applying a watermark to a PDF
POST /watermark/remove for removing a watermark
POST /watermark/get for listing detected watermarks
POST /ocr/get for returning OCR output as JSON or plain data depending on the engine
POST /ocr/add for adding an OCR layer to uploaded files
POST /ocr/text for returning OCR text output
POST /ai/classify/etmf for eTMF content-type prediction
POST /ai/document/details for combined document AI output
GET /demo for the demo landing page
POST /demo/ai/classify for demo classification text output
POST /demo/ner/stanford for Stanford NER output
POST /demo/ner/opennlp for OpenNLP-style entity extraction output
POST /demo/ocr/add for demo OCR PDF generation
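
The upload-style endpoints above can be exercised from any HTTP client. The following is a minimal client-side sketch using only the JDK, not code from the service. The part name `file`, the sample file name, and the choice of /ocr/text are assumptions for illustration; the actual form field names and any additional parameters an endpoint expects are not documented on this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical client-side helper; not part of the service itself.
final class MultipartUploadSketch {

    // Builds a multipart/form-data body containing a single file part.
    static byte[] buildBody(String boundary, String partName, String fileName,
                            String contentType, byte[] content) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + partName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + contentType + "\r\n\r\n";
        String tail = "\r\n--" + boundary + "--\r\n";
        byte[] h = head.getBytes(StandardCharsets.UTF_8);
        byte[] t = tail.getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[h.length + content.length + t.length];
        System.arraycopy(h, 0, body, 0, h.length);
        System.arraycopy(content, 0, body, h.length, content.length);
        System.arraycopy(t, 0, body, h.length + content.length, t.length);
        return body;
    }

    // Creates (but does not send) a POST request for one of the listed endpoints.
    static HttpRequest buildRequest(String baseUrl, String path, String boundary, byte[] body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + path))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) {
        String boundary = "sketch-" + UUID.randomUUID();
        // "file" is an assumed part name; check the controller for the real one.
        byte[] body = buildBody(boundary, "file", "scan.pdf", "application/pdf",
                "%PDF-1.7".getBytes(StandardCharsets.UTF_8));
        HttpRequest request = buildRequest("http://localhost:8081", "/ocr/text", boundary, body);
        System.out.println(request.method() + " " + request.uri() + " (" + body.length + " bytes)");
    }
}
```

To actually send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray())` from a host that can reach the service.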

Supporting Services

The controller delegates work to the service and utility layer, including:

OcrUtils for OCR across the supported engine modes
VisionApiUtils for Google Vision OCR and OCR-layer generation
ItextUtils for PDF cleanup, redaction, and watermark handling
ClassificationUtils for PDF text extraction and classification helpers
VertexAiUtils for eTMF classification and document AI output
TextUtils for named-entity extraction
DocumentAiDetails as the combined AI response object

OCR Behavior

OcrUtils supports three engine modes:

documentAi for document AI processing
ocr for Google Vision OCR
tesseract for local hOCR-oriented processing

VisionApiUtils bridges Google Vision OCR and OCR-layer generation over PDF and image inputs.
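
One way to picture the three engine modes is as a small lookup from the mode string to an engine constant. The sketch below is illustrative only: the enum name `OcrEngineMode`, its constants, and the case-sensitive matching are assumptions, and the real OcrUtils may represent and resolve modes differently.

```java
import java.util.Optional;

// Hypothetical enum; the mode strings come from this page, everything else is assumed.
enum OcrEngineMode {
    DOCUMENT_AI("documentAi"),
    VISION_OCR("ocr"),
    TESSERACT("tesseract");

    private final String key;

    OcrEngineMode(String key) {
        this.key = key;
    }

    String key() {
        return key;
    }

    // Resolves a mode string to an engine; unknown or null values yield an empty Optional.
    static Optional<OcrEngineMode> fromKey(String value) {
        for (OcrEngineMode mode : values()) {
            if (mode.key.equals(value)) {
                return Optional.of(mode);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fromKey("tesseract").isPresent());
        System.out.println(fromKey("unknown").isPresent());
    }
}
```

Returning an empty Optional rather than throwing lets a caller decide whether an unrecognized mode should be rejected or mapped to a default.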

Implementation Notes

Most endpoints operate on multipart file uploads and return generated files or structured JSON directly.
The service writes temporary files during processing rather than streaming transformations in place.
The AI endpoints combine OCR text, entity extraction, and content-type predictions to support downstream document triage.
The demo endpoints are intentionally separate from the main API surface and serve as a reference for how the helpers behave.
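
The temporary-file approach noted above can be sketched as follows. This is a generic pattern under stated assumptions, not the service's code: the class name, the `FileTransform` interface, the file prefixes, and the cleanup in `finally` are all hypothetical, and this page does not say whether the service deletes its temporary files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating upload -> temp file -> transform -> response bytes.
final class TempFileProcessingSketch {

    // Stand-in for a PDF operation such as redaction or watermarking.
    @FunctionalInterface
    interface FileTransform {
        void apply(Path input, Path output) throws IOException;
    }

    static byte[] process(InputStream upload, FileTransform transform) throws IOException {
        Path input = Files.createTempFile("upload-", ".pdf");
        Path output = Files.createTempFile("result-", ".pdf");
        try {
            // createTempFile already created the file, so the copy must replace it.
            Files.copy(upload, input, StandardCopyOption.REPLACE_EXISTING);
            transform.apply(input, output);
            return Files.readAllBytes(output);
        } finally {
            // Assumed cleanup step so temporary files do not accumulate.
            Files.deleteIfExists(input);
            Files.deleteIfExists(output);
        }
    }
}
```

A caller would pass the upload's input stream and a transform that reads the input path and writes the output path.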

Authentication and Deployment Context

Invocation model

WebServiceApplication is an internal utility microservice. It is not an end-user-facing API and does not implement an authentication layer. It is expected to be co-deployed with the main SureClinical web application and isolated behind the application server's network boundary.

| Property | Value |
| --- | --- |
| Default port | `8081` (set in `main()` via `server.port`) |
| Auth mechanism | None; network-level access control only |
| Expected callers | The SureClinical web application server (internal service-to-service calls) |
| Public exposure | Should not be directly accessible from the internet |

Security constraints

Most endpoints accept multipart file uploads, and none validates a session token, API key, or OAuth credential.
The iText license key is loaded on startup from a classpath or file-system resource — it must be present for PDF operations to succeed.
Temporary files are written during processing. The working directory must be writable and on a local, trusted volume.
Demo endpoints (/demo/*) are included in the same application context. These expose representative request/response behavior and should be disabled or network-restricted in production deployments.
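
Two of the constraints above, a readable license key and a writable working directory, lend themselves to a fail-fast check before startup. The sketch below is a hypothetical preflight helper, not something the service is known to include. It covers only a file-system license key (a classpath resource would need a different check), and the class name and messages are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical preflight check; returns a list of problems, empty when all is well.
final class StartupPreflightSketch {

    static List<String> check(Path licenseFile, Path workDir) {
        List<String> problems = new ArrayList<>();
        if (!Files.isReadable(licenseFile)) {
            problems.add("license key not readable: " + licenseFile);
        }
        if (!Files.isDirectory(workDir) || !Files.isWritable(workDir)) {
            problems.add("working directory not writable: " + workDir);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Both paths are placeholders supplied by the caller.
        Path license = Path.of(args.length > 0 ? args[0] : "license.json");
        Path workDir = Path.of(System.getProperty("java.io.tmpdir"));
        check(license, workDir).forEach(System.err::println);
    }
}
```

Collecting all problems before reporting lets an operator fix every misconfiguration in one pass rather than one restart at a time.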

Hardening checklist

Restrict port 8081 to localhost or a private network interface in production.
Ensure the directory used for temporary file writes is not web-accessible.
Remove or firewall the /demo/* routes in hardened deployments.
Audit multipart upload size limits. WebServiceApplication does not document a spring.servlet.multipart.max-file-size setting, so the Spring Boot defaults (1MB per file, 10MB per request) apply unless overridden.
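
Some of the checklist items map onto standard Spring Boot properties. The sketch below collects them in a `java.util.Properties` object, which could then be supplied as application defaults (for example through `SpringApplication.setDefaultProperties`) or mirrored in an external configuration file. The class name and the 25MB limits are illustrative assumptions, not values taken from the service. Restricting the /demo/* routes is not covered here, since it needs a firewall rule or a code change.

```java
import java.util.Properties;

// Hypothetical holder for hardening-related defaults; values are examples only.
final class HardeningDefaultsSketch {

    static Properties hardenedDefaults() {
        Properties props = new Properties();
        // Port documented for this service.
        props.setProperty("server.port", "8081");
        // Bind to the loopback interface so the port is not reachable externally.
        props.setProperty("server.address", "127.0.0.1");
        // Explicit upload limits instead of relying on framework defaults (example values).
        props.setProperty("spring.servlet.multipart.max-file-size", "25MB");
        props.setProperty("spring.servlet.multipart.max-request-size", "25MB");
        return props;
    }

    public static void main(String[] args) {
        hardenedDefaults().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If the service must be reachable from another host on a private network, `server.address` would be set to that private interface's address instead of the loopback address.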
