Data Ingest

OAIS-aligned 6-step batch ingestion pipeline with AI processing.

ahgIngestPlugin

Table              Rows
ingest_session        9
ingest_file          16
ingest_mapping        0
ingest_row           56
ingest_validation    26
ingest_job            4

[Figure: Entity Relationship Diagram]

ingest_session
35 columns 9 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  user_id                      int            NOT NULL  -                  -                  -
-   title                        varchar(500)   NULL      -                  -                  -
-   sector                       varchar(44)    NOT NULL  archive            -                  archive, museum, library, gallery, dam
-   standard                     varchar(40)    NOT NULL  isadg              -                  isadg, dc, spectrum, cco, rad, dacs
-   repository_id                int            NULL      -                  -                  -
-   parent_id                    int            NULL      -                  -                  -
-   parent_placement             varchar(46)    NULL      top_level          -                  existing, new, top_level, csv_hierarchy
-   new_parent_title             varchar(500)   NULL      -                  -                  -
-   new_parent_level             varchar(100)   NULL      -                  -                  -
-   output_create_records        tinyint(1)     NULL      1                  -                  -
-   output_generate_sip          tinyint(1)     NULL      0                  -                  -
-   output_generate_aip          tinyint(1)     NULL      0                  -                  -
-   output_generate_dip          tinyint(1)     NULL      0                  -                  -
-   output_sip_path              varchar(1000)  NULL      -                  -                  -
-   output_aip_path              varchar(1000)  NULL      -                  -                  -
-   output_dip_path              varchar(1000)  NULL      -                  -                  -
-   derivative_thumbnails        tinyint(1)     NULL      1                  -                  -
-   derivative_reference         tinyint(1)     NULL      1                  -                  -
-   derivative_normalize_format  varchar(50)    NULL      -                  -                  -
-   security_classification_id   int            NULL      -                  -                  -
-   process_ner                  tinyint(1)     NULL      0                  -                  -
-   process_ocr                  tinyint(1)     NULL      0                  -                  -
-   process_virus_scan           tinyint(1)     NULL      1                  -                  -
-   process_summarize            tinyint(1)     NULL      0                  -                  -
-   process_spellcheck           tinyint(1)     NULL      0                  -                  -
-   process_translate            tinyint(1)     NULL      0                  -                  -
-   process_translate_lang       varchar(10)    NULL      -                  -                  -
-   process_format_id            tinyint(1)     NULL      0                  -                  -
-   process_face_detect          tinyint(1)     NULL      0                  -                  -
-   entity_type                  varchar(30)    NOT NULL  description        -                  description, accession
FK  status                       varchar(81)    NULL      configure          -                  configure, upload, map, validate, preview, commit, completed, failed, cancelled
-   config                       json           NULL      -                  -                  -
-   created_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED  -
-   updated_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED on update CURRENT_TIMESTAMP
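
The status comment on ingest_session lists six working steps followed by three end states. A minimal sketch of how a client might walk those steps is shown here. The strictly linear order is an assumption read off the comment text, and the names WIZARD_STEPS, TERMINAL_STATES and next_status are hypothetical, not part of the plugin.

```python
# Step order is inferred from the `status` column comment on ingest_session;
# the plugin itself may allow other transitions (assumption).
WIZARD_STEPS = ["configure", "upload", "map", "validate", "preview", "commit"]
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def next_status(current: str) -> str:
    """Return the status that follows `current` on the happy path."""
    if current in TERMINAL_STATES:
        raise ValueError(f"{current!r} is terminal; there is no next step")
    index = WIZARD_STEPS.index(current)  # raises ValueError for unknown values
    if index == len(WIZARD_STEPS) - 1:
        return "completed"
    return WIZARD_STEPS[index + 1]
```

For example, next_status("map") returns "validate", and calling it on "failed" raises ValueError.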
ingest_file
13 columns 16 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  session_id                   int            NOT NULL  -                  -                  -
-   file_type                    varchar(31)    NOT NULL  -                  -                  csv, zip, ead, directory
-   original_name                varchar(500)   NULL      -                  -                  -
-   stored_path                  varchar(1000)  NOT NULL  -                  -                  -
-   file_size                    bigint         NULL      0                  -                  -
-   mime_type                    varchar(100)   NULL      -                  -                  -
-   row_count                    int            NULL      -                  -                  -
-   delimiter                    varchar(5)     NULL      -                  -                  -
-   encoding                     varchar(50)    NULL      -                  -                  -
-   headers                      json           NULL      -                  -                  -
-   extracted_path               varchar(1000)  NULL      -                  -                  -
-   created_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED  -
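
The delimiter, encoding, headers and row_count columns suggest that uploaded CSV files are inspected before mapping. The sketch below shows one way such values could be derived with Python's standard csv module. It is an illustration only: the function name inspect_csv is hypothetical, the encoding is supplied by the caller rather than detected, and counting row_count as data rows excluding the header is an assumption.

```python
import csv
import io


def inspect_csv(raw: bytes, encoding: str = "utf-8") -> dict:
    """Derive delimiter, headers and row count from raw CSV bytes."""
    text = raw.decode(encoding)
    # Let the stdlib sniffer choose among a few common delimiters.
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t|")
    rows = list(csv.reader(io.StringIO(text, newline=""), dialect))
    headers = rows[0] if rows else []
    return {
        "delimiter": dialect.delimiter,
        "encoding": encoding,
        "headers": headers,
        # Assumption: the header line is not counted as a data row.
        "row_count": max(len(rows) - 1, 0),
    }
```

csv.Sniffer raises csv.Error when it cannot settle on a delimiter, so a caller would normally fall back to a configured default in that case.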
ingest_mapping
8 columns 0 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  session_id                   int            NOT NULL  -                  -                  -
-   source_column                varchar(255)   NOT NULL  -                  -                  -
-   target_field                 varchar(255)   NULL      -                  -                  -
-   is_ignored                   tinyint(1)     NULL      0                  -                  -
-   default_value                varchar(500)   NULL      -                  -                  -
-   transform                    varchar(100)   NULL      -                  -                  -
-   sort_order                   int            NULL      0                  -                  -
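
Each ingest_mapping row pairs a source column with a target field and may carry an ignore flag, a default value and a transform name. The following sketch shows how such rows could be applied to one source record. The Mapping dataclass, the apply_mappings function and the transform names "trim" and "uppercase" are hypothetical; the schema does not say which transforms the plugin supports or how empty values are treated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Mapping:
    """Mirrors the ingest_mapping columns (id and session_id omitted)."""
    source_column: str
    target_field: Optional[str] = None
    is_ignored: bool = False
    default_value: Optional[str] = None
    transform: Optional[str] = None
    sort_order: int = 0


# Hypothetical transform registry; the real transform names are not documented here.
TRANSFORMS: Dict[str, Callable[[str], str]] = {
    "trim": str.strip,
    "uppercase": str.upper,
}


def apply_mappings(row: Dict[str, str], mappings: List[Mapping]) -> Dict[str, str]:
    """Build a target-field record from one source row."""
    result: Dict[str, str] = {}
    for mapping in sorted(mappings, key=lambda m: m.sort_order):
        if mapping.is_ignored or not mapping.target_field:
            continue
        # Assumption: a missing or empty source value falls back to default_value.
        value = row.get(mapping.source_column) or mapping.default_value
        if value is None:
            continue
        if mapping.transform in TRANSFORMS:
            value = TRANSFORMS[mapping.transform](value)
        result[mapping.target_field] = value
    return result
```

Unknown transform names are passed through unchanged in this sketch; a stricter implementation might reject them during validation instead.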
ingest_row
19 columns 56 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  session_id                   int            NOT NULL  -                  -                  -
-   row_number                   int            NOT NULL  -                  -                  -
FK  legacy_id                    varchar(255)   NULL      -                  -                  -
-   parent_id_ref                varchar(255)   NULL      -                  -                  -
-   level_of_description         varchar(100)   NULL      -                  -                  -
-   title                        varchar(1000)  NULL      -                  -                  -
-   data                         json           NOT NULL  -                  -                  -
-   enriched_data                json           NULL      -                  -                  -
-   digital_object_path          varchar(1000)  NULL      -                  -                  -
-   digital_object_matched       tinyint(1)     NULL      0                  -                  -
-   metadata_extracted           json           NULL      -                  -                  -
-   checksum_sha256              varchar(64)    NULL      -                  -                  -
FK  is_valid                     tinyint(1)     NULL      1                  -                  -
-   is_excluded                  tinyint(1)     NULL      0                  -                  -
-   created_atom_id              int            NULL      -                  -                  -
-   created_do_id                int            NULL      -                  -                  -
-   created_accession_id         int            NULL      -                  -                  -
-   created_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED  -
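
The checksum_sha256 column is varchar(64), which matches the length of a hex-encoded SHA-256 digest. A minimal sketch of computing such a value for the file at digital_object_path follows. The function name sha256_of_file and the chunk size are illustrative choices, not taken from the plugin.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for large digital objects.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is always 64 hexadecimal characters, so it fits the column without truncation.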
ingest_validation
8 columns 26 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  session_id                   int            NOT NULL  -                  -                  -
FK  row_number                   int            NOT NULL  -                  -                  -
-   severity                     varchar(28)    NULL      error              -                  error, warning, info
-   field_name                   varchar(255)   NULL      -                  -                  -
-   message                      text           NOT NULL  -                  -                  -
-   is_excluded                  tinyint(1)     NULL      0                  -                  -
-   created_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED  -
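
Validation messages are stored per session and row number with one of three severities. The sketch below tallies messages by severity and lists the rows that carry at least one error. Treating "error" as the only blocking severity is an assumption, and summarize_validation is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_validation(messages: Iterable[dict]) -> Tuple[Dict[str, int], List[int]]:
    """Count messages per severity and collect row numbers that have errors."""
    counts: Counter = Counter()
    rows_with_errors = set()
    for message in messages:
        severity = message["severity"]
        counts[severity] += 1
        # Assumption: only "error" marks a row as failing validation.
        if severity == "error":
            rows_with_errors.add(message["row_number"])
    return dict(counts), sorted(rows_with_errors)
```

Each input dict is expected to carry at least the row_number and severity keys, mirroring the ingest_validation columns.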
ingest_job
16 columns 4 rows
Key Column                       Type           Nullable  Default            Extra              Comment
PK  id                           int            NOT NULL  -                  auto_increment     -
FK  session_id                   int            NOT NULL  -                  -                  -
FK  status                       varchar(51)    NULL      queued             -                  queued, running, completed, failed, cancelled
-   total_rows                   int            NULL      0                  -                  -
-   processed_rows               int            NULL      0                  -                  -
-   created_records              int            NULL      0                  -                  -
-   created_dos                  int            NULL      0                  -                  -
-   sip_package_id               int            NULL      -                  -                  -
-   aip_package_id               int            NULL      -                  -                  -
-   dip_package_id               int            NULL      -                  -                  -
-   error_count                  int            NULL      0                  -                  -
-   error_log                    json           NULL      -                  -                  -
-   manifest_path                varchar(1000)  NULL      -                  -                  -
-   started_at                   datetime       NULL      -                  -                  -
-   completed_at                 datetime       NULL      -                  -                  -
-   created_at                   datetime       NULL      CURRENT_TIMESTAMP  DEFAULT_GENERATED  -
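
The total_rows, processed_rows, started_at and completed_at columns are enough to report job progress and duration. A small sketch follows, with hypothetical helper names job_progress and job_duration_seconds. How the plugin itself reports progress is not stated in this schema.

```python
from datetime import datetime
from typing import Optional


def job_progress(total_rows: int, processed_rows: int) -> float:
    """Percentage of rows processed, clamped to the range 0 to 100."""
    if total_rows <= 0:
        return 0.0  # Avoid division by zero for empty or unstarted jobs.
    return round(min(processed_rows, total_rows) / total_rows * 100, 1)


def job_duration_seconds(
    started_at: Optional[datetime], completed_at: Optional[datetime]
) -> Optional[float]:
    """Elapsed seconds, or None while either timestamp is still NULL."""
    if started_at is None or completed_at is None:
        return None
    return (completed_at - started_at).total_seconds()
```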

Legend
PK: Primary Key
FK: Foreign Key / Index
UQ: Unique Constraint
Table structures are read live from the database. Row counts reflect current data.