Excel and Template Importers
This page covers the workbook-driven import path used for SureDrive and related content models. The main parser is SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelArchiveParser.java, with version-specific wrappers such as SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelEtmf33Parser.java and SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelQms10Parser.java.
Purpose
The Excel importers translate workbook sheets into an ArchiveModel by reading categories, content types, folders, data property definitions, organizations, people, roles, and discrepancy types.
Scope
This page focuses on workbook parsing and template adapters. It does not cover OCR or XML parsing.
Entry Points
- SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelArchiveParser.java
- SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelEntityReader.java
- SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ExcelSheet.java
- SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/excel/ISFContentModelArchiveParser.java
- SC/suredms-xls-parser/src/main/java/com/sureclinical/suredms/parser/ExcelParser.java
Primary Components
- ExcelArchiveParser reads the workbook, detects sheets, and dispatches each sheet to helper methods.
- ExcelEntityReader converts rows into parser models and validates folder numbering, parent-child structure, metadata values, and sheet-specific rules.
- ExcelSheet abstracts Apache POI sheet access and normalizes cell reading, column counting, and header lookup.
- ISFContentModelArchiveParser is the simplified parser variant for ISF content model workbooks.
- ExcelParser in suredms-xls-parser is the desktop-side workbook parser that produces ArchiveCtx objects and drives the older desktop import pipeline.
- Version-specific wrappers such as ExcelEtmf33Parser and ExcelQms10Parser load bundled template workbooks and then hand them to ExcelArchiveParser.
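The wrapper arrangement described in the last item can be pictured with a small sketch. This is only an illustration under stated assumptions, not the project's code: ModelSketch and TemplateWrapperSketch are hypothetical names, the bundled workbook is stood in for by a plain string of sheet names, and the name, version, and date values are placeholders.

```java
import java.time.LocalDate;
import java.util.function.Function;

/** Hypothetical stand-in for the parsed model; not the real ArchiveModel. */
final class ModelSketch {
    String contentModelName;
    String contentModelVersion;
    LocalDate contentModelDate;
    int sheetCount;
}

/** Hypothetical wrapper: delegates parsing, then stamps template metadata. */
final class TemplateWrapperSketch {
    private final Function<String, ModelSketch> genericParser;
    private final String name;
    private final String version;
    private final LocalDate date;

    TemplateWrapperSketch(Function<String, ModelSketch> genericParser,
                          String name, String version, LocalDate date) {
        this.genericParser = genericParser;
        this.name = name;
        this.version = version;
        this.date = date;
    }

    /** Hands the bundled workbook to the generic parser, then sets metadata. */
    ModelSketch parse(String bundledWorkbook) {
        ModelSketch model = genericParser.apply(bundledWorkbook);
        model.contentModelName = name;
        model.contentModelVersion = version;
        model.contentModelDate = date;
        return model;
    }

    /** Demo with a toy "parser" that only counts comma-separated sheet names. */
    static ModelSketch demo() {
        Function<String, ModelSketch> toyParser = workbook -> {
            ModelSketch m = new ModelSketch();
            m.sheetCount = workbook.split(",").length;
            return m;
        };
        TemplateWrapperSketch wrapper = new TemplateWrapperSketch(
                toyParser, "Example Template", "1.0", LocalDate.of(2020, 1, 1));
        return wrapper.parse("categories,folders,roles");
    }
}
```

Keeping the metadata in the wrapper means the generic parser stays unaware of template versions, which is one plausible reason for the split.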
Data Flow
- A workbook is opened with Apache POI.
- The parser checks for known sheet names such as properties, annotations, categories, folders, organizations, persons, roles, users, and discrepancy types.
- ExcelEntityReader maps rows into parser models and attaches annotations or validation errors.
- Parsed models are assembled into an ArchiveModel.
- Template-specific parsers set the content model version, name, and date after parsing.
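The flow above can be sketched without Apache POI by modeling a workbook as a map from sheet name to rows of cell strings. Everything here is an assumption made for illustration: AssembledModelSketch and ImportFlowSketch are hypothetical names, unknown sheets are simply skipped, and the only row rule is that the first cell must be non-blank. The real parser's rules are more extensive.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the assembled model; not the real ArchiveModel. */
final class AssembledModelSketch {
    final Map<String, List<String>> entitiesBySheet = new LinkedHashMap<>();
    final List<String> errors = new ArrayList<>();
}

final class ImportFlowSketch {
    // Sheet names taken from the list above; the real labels may differ.
    static final List<String> KNOWN_SHEETS = List.of(
            "properties", "annotations", "categories", "folders",
            "organizations", "persons", "roles", "users", "discrepancy types");

    /** Workbook modeled as sheet name -> rows -> cells (first row is the header). */
    static AssembledModelSketch parse(Map<String, List<List<String>>> workbook) {
        AssembledModelSketch model = new AssembledModelSketch();
        for (Map.Entry<String, List<List<String>>> sheet : workbook.entrySet()) {
            String key = sheet.getKey().trim().toLowerCase(Locale.ROOT);
            if (!KNOWN_SHEETS.contains(key)) {
                continue; // assumption: unknown sheets are skipped
            }
            List<String> entities = new ArrayList<>();
            List<List<String>> rows = sheet.getValue();
            for (int i = 1; i < rows.size(); i++) { // index 0 is the header row
                List<String> row = rows.get(i);
                String name = row.isEmpty() ? "" : row.get(0).trim();
                if (name.isEmpty()) {
                    // record a validation error instead of aborting the import
                    model.errors.add(key + " row " + (i + 1) + ": missing name");
                } else {
                    entities.add(name);
                }
            }
            model.entitiesBySheet.put(key, entities);
        }
        return model;
    }

    static AssembledModelSketch demo() {
        Map<String, List<List<String>>> workbook = new LinkedHashMap<>();
        workbook.put("Categories",
                List.of(List.of("Name"), List.of("Trial Management"), List.of(" ")));
        workbook.put("Notes", List.of(List.of("free text")));
        workbook.put("Roles", List.of(List.of("Name"), List.of("Monitor")));
        return parse(workbook);
    }
}
```

In this sketch validation errors are collected on the model rather than thrown, which mirrors the idea of attaching errors to parsed rows so one bad row does not hide the rest.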
Key Behaviors
- The parser supports both legacy and current sheet labels, which lets older workbook templates continue to work.
- If FIX_ERRORS_IF_POSSIBLE is enabled, the parser relaxes some input formatting issues and fills sensible defaults.
- Folder and content-type ids are validated to prevent duplicate ids and invalid parent relationships.
- The desktop ExcelParser follows a parallel but separate path and builds ArchiveCtx for client-side processing.
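Two of these behaviors, label aliasing and id validation, lend themselves to a short sketch. The code below is a guess at the general shape only: KeyBehaviorSketch, FolderRow, and the legacy label pairs are hypothetical, and the "fix" applied when the flag is on (trimming whitespace and dropping a trailing dot) is just one example of what relaxing a formatting issue could mean. It also assumes parent folders are listed before their children.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeyBehaviorSketch {
    // Hypothetical legacy -> current label pairs; the real sets are not shown here.
    static final Map<String, String> LEGACY_LABELS = Map.of(
            "people", "persons",
            "metadata", "properties");

    /** Maps a sheet label, legacy or current, to one canonical name. */
    static String canonicalSheetName(String label) {
        String key = label.trim().toLowerCase(Locale.ROOT);
        return LEGACY_LABELS.getOrDefault(key, key);
    }

    /** A folder row: its id and its parent's id ("" for a root folder). */
    static final class FolderRow {
        final String id;
        final String parentId;
        FolderRow(String id, String parentId) {
            this.id = id;
            this.parentId = parentId;
        }
    }

    /** Assumed relaxation: trim whitespace and drop one trailing dot. */
    private static String normalize(String raw) {
        String t = raw.trim();
        return t.endsWith(".") ? t.substring(0, t.length() - 1) : t;
    }

    /** Returns one message per duplicate id or invalid parent reference. */
    static List<String> validateFolders(List<FolderRow> rows, boolean fixErrorsIfPossible) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (FolderRow row : rows) {
            String id = fixErrorsIfPossible ? normalize(row.id) : row.id;
            String parent = fixErrorsIfPossible ? normalize(row.parentId) : row.parentId;
            // a parent must already be known and must not be the row itself
            if (!parent.isEmpty() && (parent.equals(id) || !seen.contains(parent))) {
                errors.add("unknown parent " + parent + " for " + id);
            }
            if (!seen.add(id)) {
                errors.add("duplicate id: " + id);
            }
        }
        return errors;
    }
}
```

For example, rows with ids " 01. " and "01.01" (parent "01") produce one parent error in strict mode and none when the flag is on, because the first id is normalized to "01" before the check.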
Dependencies and Integrations
- Apache POI provides workbook access.
- Parser models in SC/suredms-parser/src/main/java/com/sureclinical/suredms/parser/model are the intermediate structures.
- Shared entity enums and helper utilities supply numbering, metadata, and validation behavior.
- ServiceProvider is used by the wider parser system to select parser implementations.
Edge Cases and Constraints
- Unsupported Office XML files raise a specific parsing error that asks the user to convert the workbook.
- Empty rows terminate workbook scans.
- ExcelSheet treats formula, numeric, boolean, and blank cells differently so values stay consistent across sheets.
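The last two constraints can be illustrated with a POI-free sketch in which a cell is a plain Java object: a String, a Double, a Boolean, null for a blank cell, or a hypothetical Formula wrapper holding a cached result. CellReadingSketch and its rules (whole numbers rendered without a decimal part, scanning stops at the first fully blank row) are assumptions chosen for illustration, not the behavior of ExcelSheet itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CellReadingSketch {
    /** Hypothetical formula cell carrying its cached result. */
    static final class Formula {
        final Object cachedValue;
        Formula(Object cachedValue) {
            this.cachedValue = cachedValue;
        }
    }

    /** Normalizes one cell to text so every sheet reader sees the same form. */
    static String text(Object cell) {
        if (cell == null) {
            return ""; // blank cell
        }
        if (cell instanceof Formula) {
            return text(((Formula) cell).cachedValue); // use the cached result
        }
        if (cell instanceof Double) {
            double d = (Double) cell;
            // whole numbers print without ".0", so a numeric id 12 reads "12"
            boolean whole = d == Math.rint(d) && !Double.isInfinite(d);
            return whole ? String.valueOf((long) d) : String.valueOf(d);
        }
        if (cell instanceof Boolean) {
            return ((Boolean) cell) ? "true" : "false";
        }
        return cell.toString().trim();
    }

    /** Reads rows top to bottom and stops at the first row with no content. */
    static List<List<String>> readRows(List<List<Object>> rows) {
        List<List<String>> out = new ArrayList<>();
        for (List<Object> row : rows) {
            List<String> cells = new ArrayList<>();
            boolean allBlank = true;
            for (Object cell : row) {
                String value = text(cell);
                allBlank &= value.isEmpty();
                cells.add(value);
            }
            if (allBlank) {
                break; // an empty row ends the scan; later rows are ignored
            }
            out.add(cells);
        }
        return out;
    }

    private static List<Object> row(Object... cells) {
        return Arrays.asList(cells);
    }

    static List<List<String>> demo() {
        return readRows(Arrays.asList(
                row("Id", "Active"),
                row(12.0, Boolean.TRUE),
                row(new Formula(1.5), null),
                row(null, "  "),
                row("ignored", "x")));
    }
}
```

One consequence worth noting: because the scan stops at the first empty row, any data placed below a blank separator row in a workbook would not be imported under this rule.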