Verifying MS SQL to PostgreSQL Migration

The most important phase of database migration is validating results at the end of the process. This whitepaper explains how to verify the primary database entries:
- Schemas
- Data
- Indexes
- Views
Schemas
SQL Server allows to explore schemas in two ways.
- In T-SQL console client use the query EXEC sp_columns @table_name=’table_name’
- In Management Studio click on database name in the left pane, expand ‘Tables’ item, right-click on the appropriate name and select ‘Design’ item
PostgreSQL displays table structure through the query:\d table_name
MS SQL schema is considered as properly converted when each column has the same type, size and default value in the resulting PostgreSQL table. This is the list of safe type mapping:
SQL Server | PostgreSQL |
BIGINT | BIGINT |
BINARY(n) | BYTEA |
CHAR(n), CHARACTER(n) | CHAR(n), CHARACTER(n) |
DATE | DATE |
DATETIME | TIMESTAMP(3) |
DATETIME2(p) | TIMESTAMP(p) |
DATETIMEOFFSET(p) | TIMESTAMP(p) WITH TIME ZONE |
DECIMAL(p,s), DEC(p,s) | DECIMAL(p,s), DEC(p,s) |
DOUBLE PRECISION | DOUBLE PRECISION |
FLOAT(p) | DOUBLE PRECISION |
INT, INTEGER | INT, INTEGER |
MONEY | MONEY |
NCHAR(n) | CHAR(n) |
NTEXT | TEXT |
NUMERIC(p,s) | NUMERIC(p,s) |
NVARCHAR(n) | VARCHAR(n) |
NVARCHAR(max) | TEXT |
REAL | REAL |
ROWVERSION | ROWVERSION |
SMALLDATETIME | TIMESTAMP(0) |
SMALLINT | SMALLINT |
TEXT | TEXT |
TIME(p) | TIME(p) |
TIMESTAMP | BYTEA |
TINYINT | SMALLINT |
UNIQUEIDENTIFIER | CHAR(16) |
VARBINARY(n), VARBINARY(max) | BYTEA |
VARCHAR(n) | VARCHAR(n) |
VARCHAR(max) | TEXT |
XML | XML |
Data
To verify conversion of data it is required to do visual comparison of random fragment(s) in MS SQL and Postgres tables. Microsoft SQL provides two options to extract data fragment:
- In T-SQL console client use the query SELECT TOP number_of_records * FROM table_name
or
- Click on the appropriate table name in the left pane of Microsoft Management Studio and select ‘Select Top 1000 Rows’ menu item
PostgreSQL extracts fragment of data through the query:
SELECT * FROM table_name LIMIT number_of_records
It is also reasonable to check that source and destination tables have the same number of records. Both MS SQL and PostgreSQL display number of rows as follows:
SELECT COUNT(*) FROM table_name
Indexes
Microsoft SQL provides two options to explore indexes:
- In T-SQL console client run the query
SELECT o.name AS Tbl_Name,
i.name AS Idx_Name,
i.type_desc AS Idx_Type
FROM sys.indexesi
INNER JOIN sys.objects o ON i.object_id = o.object_id
WHERE i.name IS NOT NULL AND o.type = ‘U’
ORDER BY o.name, i.type
- In Management Studio, open ‘Design’ view of the table (see ‘Schemas’ section above) and click ‘Manage Indexes and Keys’ icon on the toolbar (it is marked red on the screenshot)
PostgreSQL describes indexes at the bottom of table definition generated by ‘\d table_name’ statement.
Views
In order to check that the view is converted properly,it is required to compare SQL code of this view in MS SQL and PostgreSQL with respect to differences between SQL dialects of these two database management systems.
MS SQL allows explore views list using the query:
select TABLE_NAME, VIEW_DEFINITION from INFORMATION_SCHEMA.VIEWS where TABLE_NAME not in (‘sysalternates’,’sysconstraints’,’syssegments’)
PostgreSQL uses this query for the same purpose:
select viewname, definition from pg_catalog.pg_views where schema name NOT IN (‘pg_catalog’, ‘information_schema’)
Find more details on database migration from MS SQL to PostgreSQL at https://www.convert-in.com/mss2pgs.htm