Loading Data Into PostGIS – Text Files
I’ve fallen out of the blogosphere for quite awhile. Time to catch up – lately we’ve been experimenting with deploying our own instance of Open Geoportal (see previous post) and square one is getting a functioning data repository up and running. I’ve been tinkering with PostgreSQL / PostGIS and am documenting tests in a wiki we’ve created. The wiki is private, so I thought I’d re-post some of the tests here.
I’m working with a developer who’s installed and configured the database on a server, and I’m accessing it remotely using pgAdmin. I’ve started loading data and running queries and so far I’m relieved that most of what I’m seeing seems familiar, having worked with SQLite / Spatialite for a few years now. I’m using the excellent book PostGIS in Action as my guide. Here’s my take on loading a text file with coordinates into the db and building geometry for it. Loading shapefiles, spatial SQL experiments, and connecting to Geoserver will follow.
“POSTGIS=”2.1.1 r12113″ GEOS=”3.4.2-CAPI-1.8.2 r3921″ PROJ=”Rel. 4.8.0, 6 March 2012″ GDAL=”GDAL 1.9.2, released 2012/10/08″ LIBXML=”2.7.6″ LIBJSON=”UNKNOWN” TOPOLOGY RASTER”
Create Schema for Holding Related Objects
CREATE SCHEMA usa_general;
Query returned successfully with no result in 297 ms.
Import Delimited Text File
Test file is the US Populated Places gazetteer file from the USGS Geographic Names Information Service (GNIS). It is a pipe-delimited text file in encoded in UTF-8 with longitude and latitude coordinates in both DMS and DEC format. http://geonames.usgs.gov/domestic/download_data.htm
Create Table for Holding Data
CREATE TABLE usa_general.pop_places
feature_id int NOT NULL PRIMARY KEY,
Query returned successfully with no result in 390 ms.
Copy Data From File
Must be run from the PSQL Console plugin in order to load data from a local client machine. If data was stored on the server, you could use the PGAdmin GUI and use COPY in an SQL statement instead of \copy. When running the PSQL Console on MS Windows the default character encoding is WIN1252. In this example, the data contains characters unsupported by this encoding; the file is encoded in UTF-8 and the console must be set to match. COPY is not a standard SQL command but is native to PostgreSQL.
— The optional HEADER specifies that the data file has a header column that should be skipped when importing
DELIMITER ‘|’ CSV HEADER
Basic Query to Test that Load was Success
WHERE state_alpha=’NY’ and county_name=’New York’
ORDER BY feature_name
Create Geometry from Coordinates and Make Spatial Index
4269 is the EPSG code for NAD 83, the basic geographic coordinate system used by US government agencies.
“usa_general.pop_places.geom SRID:4269 TYPE:POINT DIMS:2 ”
SET geom = ST_GeomFromText(‘POINT(‘|| prim_long_dec || ‘ ‘|| prim_lat_dec || ‘)’,4269);
Query returned successfully: 200359 rows affected, 18268 ms execution time.
CREATE INDEX idx_pop_places_geom
ON usa_general.pop_places USING gist(geom);
Query returned successfully with no result in 8439 ms.
Basic SQL Query to Test Geometry
ST_AsEWKT transforms the output of geometry from binary code into the OGC Well Known Text format (WKT).
SELECT feature_name, ST_AsEWKT(geom) AS geometry
WHERE state_alpha=’NY’ AND county_name=’New York’
ORDER BY feature_name
“Amalgamated Dwellings”;”SRID=4269;POINT(-73.9828 40.715)”
“Amsterdam Houses”;”SRID=4269;POINT(-73.9881 40.7731)”
“Battery Park City”;”SRID=4269;POINT(-74.0163 40.7115)”…
Basic Spatial SQL Query to Test Geometry
Selects the northernmost point in the table.
SELECT feature_name, state_alpha, ST_AsEWKT(geom) AS geometry
WHERE ST_Xmax(geom) IN(
Drop some columns that have no data.
ALTER TABLE usa_general.pop_places
DROP COLUMN source_lat_dms,
DROP COLUMN source_long_dms,
DROP COLUMN source_lat_dec,
DROP COLUMN source_long_dec;
Query returned successfully with no result in 180 ms.