Migration Guide

Complete guide for migrating ArcGIS geodatabases to PostgreSQL/PostGIS using Arc2Post

Arc2Post Migration Guide

Overview

Arc2Post migrates ArcGIS geodatabases (File Geodatabase .gdb or GeoPackage .gpkg) to PostgreSQL/PostGIS. This guide outlines the architectural changes during migration, data type transformations, and expected output states.

Feature Compatibility Matrix

Supported Components

ArcGIS Component PostgreSQL Result Implementation Details
Feature Classes Tables with geometry columns Mapped to a single table per feature class.
Attribute Fields Table columns Field types strictly converted to PostgreSQL native equivalents.
Geometry Data PostGIS geometry values Preserved with original coordinates and SRID.
Row Data PostgreSQL records All rows and attribute values migrated without truncation.
Domains ENUM types or Lookup Tables Configurable via GUI settings.
Indexes PostgreSQL indexes Spatial indexes (GiST) generated automatically.
Relationships Foreign Key constraints Simple relationships mapped to native SQL constraints.

Unsupported Components

ArcGIS Component Architectural Reason Recommended Alternative
Raster Data Stored as BLOB in GDB Store file paths as text attributes.
Attachments Binary data Extracted locally; paths stored in database.
Topology Rules Proprietary ArcGIS logic Must be explicitly recreated via PostGIS functions.
Annotation/Labels Application rendering logic Store label values as standard text attributes.
Geometric Networks Proprietary ArcGIS logic Utilize pgRouting or PostGIS topology.
ArcGIS Metadata Proprietary XML format Store as JSONB document if preservation is required.

Data Type Mappings

Field Type Conversions

ArcGIS Type PostgreSQL Type Notes
Text/String TEXT Unlimited length representation; no 255 character truncation limit.
Short Integer INTEGER 4-byte signed integer.
Long Integer BIGINT 8-byte signed integer.
Float DOUBLE PRECISION 8-byte floating point.
Double DOUBLE PRECISION 8-byte floating point.
Date TIMESTAMPTZ Timestamp mapped with timezone support.
GUID UUID Universally unique identifier.
Blob BYTEA or File Path Binary data extraction or direct reference.
Geometry GEOMETRY Generic PostGIS geometry type.

Geometry Type Handling

Important: Arc2Post enforces the generic GEOMETRY type for all spatial columns.

-- Implemented by Arc2Post:
geom GEOMETRY  -- Validates any geometry type

-- Avoided by Arc2Post (too restrictive for GDBs):
geom GEOMETRY(LINESTRING, 4326)  

Rationale: ArcGIS feature classes frequently contain mixed geometry types within a single layer (e.g., a “Roads” layer containing both LineString and MultiLineString entities). The generic GEOMETRY type handles these multi-type distributions without triggering insert errors.

SRID Handling:

  • The original SRID from the source data is explicitly preserved in every geometry value.
  • Spatial indexes function correctly with mixed SRID data.
  • Fallback SRID: 4326 (WGS84) is applied if the source SRID is undetectable.

Database Schema Generation

Table Name Quoting

PostgreSQL enforces strict identifier rules. Arc2Post automatically quotes table and column names when they:

  1. Start with a number

    -- Source: 3d_fault
    -- Result: "3d_fault"
  2. Contain special characters

    -- Source: my-table / table name
    -- Result: "my-table" / "table name"
  3. Conflict with SQL reserved keywords

    -- Source: select
    -- Result: "select"

Function and Trigger Names

PostgreSQL function names also follow identifier rules. Arc2Post strictly quotes trigger function names to ensure valid DDL execution:

-- Generated functions:
CREATE OR REPLACE FUNCTION "3d_fault_history_trigger"() ...
CREATE OR REPLACE FUNCTION "3d_fault_updated_at_trigger"() ...

-- Trigger assignments:
DROP TRIGGER IF EXISTS "3d_fault_temporal_trigger" ON "3d_fault";

String Truncation Prevention

Unlike legacy migration tools that cast strings to VARCHAR(255), Arc2Post strictly implements TEXT types.

-- Arc2Post implementation:
description TEXT  -- Unlimited length

-- Legacy tool implementation:
description VARCHAR(255)  -- Risk of data loss

Why TEXT instead of VARCHAR?
PostgreSQL handles TEXT and VARCHAR with identical internal performance profiles. Utilizing TEXT prevents arbitrary character limits, ensuring massive fields (like multi-paragraph geological survey notes) migrate without truncation errors or silent data loss.


Default Column Injections

Every migrated table is injected with the following standard columns:

id SERIAL PRIMARY KEY,                    -- Auto-incrementing primary key
geom GEOMETRY,                            -- Spatial data container
created_at TIMESTAMPTZ DEFAULT NOW(),     -- Creation timestamp
updated_at TIMESTAMPTZ DEFAULT NOW(),     -- Modification timestamp
valid_from TIMESTAMPTZ DEFAULT NOW(),     -- Temporal versioning (optional)
valid_to TIMESTAMPTZ DEFAULT 'infinity'   -- Temporal versioning (optional)

Automated Object Creation

  1. Spatial Indexes

    CREATE INDEX idx_tablename_geom ON tablename USING GIST (geom);
  2. Temporal History Tables (if enabled)

    • {table}_history - Stores historical versions of rows.
    • Associated trigger functions for automatic history tracking.
  3. Updated At Triggers

    • Triggers are generated to automatically refresh the updated_at timestamp on row modification.
  4. Schema Comments

    COMMENT ON TABLE tablename IS 'Feature Class migrated from ArcGIS via Arc2Post';

Domain Architecture

Option A: ENUM Implementation (Default)

Coded Value Domains are translated to PostgreSQL ENUM types.

-- GDB Domain: MaterialType ["Concrete", "Steel", "Wood"]
-- Resulting DDL:
CREATE TYPE materialtype AS ENUM ('Concrete', 'Steel', 'Wood');
ALTER TABLE structures ALTER COLUMN material TYPE materialtype;

Advantage: Enforces strict data integrity, enables fast querying, and reduces storage overhead.
Limitation: Requires explicit ALTER TYPE statements to add new values post-migration.

Option B: Lookup Table Implementation

Domains are translated into discrete lookup tables mapped via Foreign Keys.

CREATE TABLE domain_materialtype (
    code VARCHAR PRIMARY KEY,
    description TEXT,
    metadata JSONB
);

ALTER TABLE structures ADD CONSTRAINT fk_material 
    FOREIGN KEY (material) REFERENCES domain_materialtype(code);

Advantage: Allows for highly dynamic schemas where domain values frequently change or require extended metadata.


Temporal Versioning Pipeline

When enabled, Arc2Post deploys temporal tables that strictly audit all mutations.

Captured Events

  • UPDATE Operations: Pre-mutation state is archived to the history table.
  • DELETE Operations: Deleted record is archived to the history table.
  • Timestamping: valid_from and valid_to bounds define the active lifecycle of the record.

History Table Architecture

-- Primary Object: roads
CREATE TABLE roads (
    id SERIAL PRIMARY KEY,
    geom GEOMETRY,
    name TEXT,
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    valid_to TIMESTAMPTZ DEFAULT 'infinity'
);

-- Historical Archive: roads_history
CREATE TABLE roads_history (LIKE roads INCLUDING ALL);

Querying Historical States

-- Retrieve active dataset (Primary Table):
SELECT * FROM roads;

-- Retrieve dataset state at a specific historical timestamp:
SELECT * FROM roads 
WHERE valid_from <= '2024-01-01' AND valid_to > '2024-01-01'
UNION ALL
SELECT * FROM roads_history 
WHERE valid_from <= '2024-01-01' AND valid_to > '2024-01-01';

Schema Translation Patterns

Pattern 1: Mixed Geometry Types

Condition: A layer contains both LineString and MultiLineString features.
Implementation: Arc2Post assigns the generic GEOMETRY type. The spatial GiST index is applied universally, preventing insertion failures.

Pattern 2: Extensive Text Descriptions

Condition: An attribute field contains string data exceeding 255 characters.
Implementation: Arc2Post explicitly maps to the PostgreSQL TEXT datatype, preventing silent truncation.

Pattern 3: Invalid Identifiers

Condition: A source table name begins with a numerical digit (e.g., 3d_fault).
Implementation: Arc2Post wraps all invalid identifiers in double-quotes ("") during DDL generation. Associated functions ("3d_fault_history_trigger") are similarly escaped.


Performance Baseline

Execution Benchmarks

Performance scales predictably on standard enterprise hardware:

Dataset Volume Estimated Duration Architectural Notes
Small (<10k rows) < 1 minute I/O bound.
Medium (100k rows) 2-5 minutes Includes spatial index generation.
Large (1M+ rows) 15-30 minutes Highly dependent on geometry vertex complexity.

Configuration Tuning

  1. Worker Threads: Default is 4. Scale up to 32 based on available CPU cores.
  2. Batch Size: Default is 5,000. Increase for optimized bulk COPY operations.
  3. Temp Tablespace: Ensure sufficient disk allocation for PostgreSQL temp files during spatial index generation.

Post-Migration Validation Protocol

Following a successful run, database administrators should verify:

  • [ ] Row counts match between source GDB and target PostGIS instance.
  • [ ] Geometry types are validated via ST_GeometryType(geom).
  • [ ] Spatial GiST indexes are present and utilized by the query planner.
  • [ ] Foreign Key constraints map to the correct lookup tables.
  • [ ] updated_at triggers execute correctly on UPDATE.

Validation Queries

-- Row count parity check:
SELECT COUNT(*) FROM target_table;

-- Detect potential text truncation:
SELECT * FROM target_table WHERE LENGTH(description) > 250;

-- Analyze geometry type distribution:
SELECT DISTINCT ST_GeometryType(geom) FROM target_table;

-- Verify spatial index utilization:
EXPLAIN ANALYZE SELECT * FROM target_table 
WHERE ST_Intersects(geom, ST_MakeEnvelope(...));

Troubleshooting Guide

Issue: “Geometry type does not match column type”

Root Cause: The source geodatabase contains mixed geometry types.
Resolution: Ensure you are running the latest version of Arc2Post. The engine now applies the generic GEOMETRY type to explicitly prevent this conflict.

Issue: “Value too long for type character varying(255)”

Root Cause: A source text attribute exceeds standard VARCHAR limitations.
Resolution: Arc2Post explicitly uses the TEXT type to prevent this. Ensure your software is updated to v1.0+.

Issue: “Trailing junk after numeric literal”

Root Cause: An ArcGIS table name begins with a digit, causing function generation to fail under strict PostgreSQL identifier rules.
Resolution: Arc2Post automatically applies double-quotes "" to all identifiers to safely process leading digits. Update your client to the latest build.

Issue: Suboptimal Migration Performance

Root Causes:

  • Exceptionally large geometries (e.g., nationwide MultiPolygons).
  • Constrained PostgreSQL memory configurations.
  • High network latency to remote DB servers.

Resolutions:

  • Increase work_mem and maintenance_work_mem in your postgresql.conf.
  • Execute the migration to a local localhost instance before replicating to the cloud.
  • Reduce the Arc2Post worker count if you are CPU-bottlenecked.

License Requirements

DDL Export

  • Requires active license to export DDL scripts.
  • License must be activated before the export module is enabled.

Migration Execution

  • Requires active license to execute the pipeline.
  • License state is validated at the start of the migration sequence.
  • Licenses are node-locked to the host machine.

Support & Diagnostics

Telemetry Logs

Migration execution logs are written locally:

  • Windows: %TEMP%\arc2post_debug.log
  • Linux/macOS: /tmp/arc2post_debug.log

Engineering Support

For complex schema issues, contact support with the following diagnostic payload:

  1. Error traceback.
  2. Last 100 lines of arc2post_debug.log.
  3. Source data format (.gdb or .gpkg).
  4. Target PostgreSQL version string.

Version History

Latest Updates (v1.0+)

  • Fixed: Resolved geometry type mismatch exceptions when handling mixed LineString and MultiLineString distributions.
  • Fixed: Eliminated VARCHAR(255) string truncation by strictly enforcing the TEXT datatype.
  • Fixed: Implemented strict identifier quoting for table, function, and trigger names beginning with numerical digits.
  • Added: DDL Export functionality is now gatekept behind active license validation.
  • Improved: Enhanced verbose error logging during the extraction phase.

Document Build: 1.1
Target Architecture: PostgreSQL 13+ / PostGIS 3+