Wednesday, August 21, 2013

PostGIS 2.x: Getting Raster Data out of the Database

Note: For an updated version of this post with better code samples, see the article on the author's weblog.

PostGIS 2.x (latest release, 2.1) enables users to do fairly sophisticated raster processing directly in a database. For many applications, these data can stay in the database; it's the insight into spatial phenomena that comes out. Sometimes, however, you need to get file data (e.g. a GeoTIFF) out of PostGIS. It isn't immediately obvious how to do this efficiently, despite the number of helpful functions that serialize a raster field to Well-Known Binary (WKB) or other "flat" formats.

Background

In particular, I recently needed to create a web service that delivers PostGIS raster outputs as file data. The queries that we needed to support were well suited for PostGIS and sometimes one query would consume another (one or more) as subquer(ies). These and other considerations led me to decide to implement the service layer in Python using either GeoDjango or GeoAlchemy. More on that later. Suffice to say, a robust and stable solution for exporting and attaching file data from PostGIS to an HTTP response was needed. I found at least six (6) different ways of doing this; there may be more:
  • Export an ASCII grid ("AAIGrid")
  • Connect to the database using a desktop client (e.g. QGIS) [1]
  • Use a procedural language (like PLPGSQL or PLPython) [2]
  • Use the COPY declaration to get a hex dump out, then convert to binary
  • Fill a 2D NumPy array with a byte array and serialize it to a binary file using GDAL or psycopg2 [3, 4]
  • Use a raster output function to get a byte array, which can be written to a binary field
It's nice to have options. But what's the most appropriate? If that's a difficult question to answer, what's the easiest option? I'll explore some of them in detail. For better code sample display, refer to this gist.

Export an ASCII Grid

This works great! Because an ASCII grid file (or "ESRI Grid" file, with the *.asc or *.grd extension, typically) is just plain text, you can directly export it from the database. The GDAL driver name is "AAIGrid" which should be the second argument to ST_AsGDALRaster(). Be sure to remove the column header from your export (see image below). However, what you get is a file that has no projection information that you may need to convert to another format. This can present problems for your workflow, especially if you're trying to automate the production of raster files, say, through a web API.

Want an ASCII grid (or "ESRI Grid")? No problem! Just don't export the column names.

Connect Using the QGIS Desktop Client

There is a plug-in for QGIS that promises to allow you to load raster data from PostGIS directly into a QGIS workspace. I used the Plugins Manager ("Plugins" > "Fetch Python Plugins...") in QGIS to get this plug-in package. The first time I selected the "Load PostGIS Raster to QGIS" plug-in and tried to install it, I found that I couldn't write to the plug-ins directory (this with a relatively fresh installation of QGIS). After creating and setting myself as the owner of the python/plugins directory, I was able to install the plug-in without any further trouble. Connecting to the database and viewing the available relations was also no trouble at all. One minor irritation is that you need to enter your password every time the plug-in interfaces with the database, which can be very often, at every time the list of available relations needs to be updated.
You'll be doing this a lot.
There are a few options available to you in displaying raster data from the database: "Read table's vector representation," "Read one table as a raster," "Read one row as a raster," or "Read the dataset as a raster." It's not clear what the second and last choices are, but "Reading the table as a raster" did not work for me where my table has one raster field and a couple of non-raster, non-geometry/geography fields; QGIS hung for a few seconds then said it "Could not load PG..." Reading one row worked, however, you have to select the row by its primary key (or row number in a random selection, not sure which it is returning). This may not be the easiest way for you to find a particular raster image in a table.

Using the COPY Declaration in SQL

My colleague suggested this method, demonstrated in Python, which requires the pygresql module to be installed; easy enough with pip:
pip install psycopg2 pygresql
The basic idea is to use the COPY declaration in SQL to export the raster to a hexadecimal file, then to convert that file to a binary file using xxd:
import os, stat, pg
# See: http://www.pygresql.org/install.html
# pip install psycopg2, pygresql

# Designate path to output file
outfile = '/home/myself/temp.tiff'

# Name of PostgreSQL table to export
pg_table = 'geowepp_soil'

# PostgreSQL connection parameters
pg_server = 'my_server'
pg_database = 'my_database'
pg_user = 'myself'

# Desginate a file to receive the hex data; make sure it exists with the right permissions
pg_hex = '/home/myself/temp.hex'
os.mknod(pg_hex, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP)

conn = pg.connect(pg_database, pg_server, 5432, None, None, pg_user)
sql = "COPY (SELECT encode(ST_AsTIFF(ST_Union(" + pg_table + ".rast)), 'hex') FROM " + pg_table + ") TO '" + pg_hex + "'"

# You can check it with: print sql
conn.query(sql)

cmd = 'xxd -p -r ' + pg_hex + ' > ' + outfile
os.system(cmd)
This needs to be done on the file system of the database server, which is where PostgreSQL will write.

Using an Output Function and Serializing from a Byte Array

Despite the seeming complexity of this option (then again, compare it to the above), I think it is the most flexible approach. I'll provide two examples here, with code: using GeoDjango to execute a raw query and using GeoAlchemy2's object-relational model to execute the query. Finally, I'll show an example of writing the output to a file or to a Django HttpResponse() instance.

Using GeoDjango


First, some setup. We'll define a RasterQuery class to help with handling the details. While a new class isn't exactly an idiomatic example, I'm hoping it will succinctly illustrate the considerations involved in using performing raw SQL queries with Django
class RasterQuery:
    '''
    Assumes some global FORMATS dictionary describes the valid file formats, their file extensions and MIME type strings.
    '''
    def __init__(self, qs, params=None, file_format='geotiff'):
        assert file_format in FORMATS.keys(), 'Not a valid file format'

        self.cursor = connection.cursor()
        self.params = params
        self.query_string = qs
        self.file_format = file_format
        self.file_extension = FORMATS[file_format]['file_extension']

    def execute(self, params=None):
        '''Execute the stored query string with the given parameters'''
        self.params = params

        if self.params is not None:
            self.cursor.execute(self.query_string, params)

        else:
            self.cursor.execute(self.query_string)

    def fetch_all(self):
        '''Return all results in a List; a List of buffers is returned'''
        return [
            row[0] for row in self.cursor.fetchall()
        ]

    def write_all(self, path, name=None):
        '''For each raster in the query, writes it to a file on the given path'''
        name = name or 'raster_query'

        i = 0
        results = self.fetch_all()
        for each in results:
            name = name + str(i + 1) + self.file_extension
            with open(os.path.join(path, name), 'wb') as stream:
                stream.write(results[i])

            i += 1
With the RasterQuery class available to us, we can more cleanly execute our raw SQL queries and serialize the response to a file attachment in a Django view:
def clip_one_raster_by_another(request):

    # Our raw SQL query, with parameter strings
    query_string = '''
    SELECT ST_AsGDALRaster(ST_Clip(landcover.rast,
        ST_Buffer(ST_Envelope(burnedarea.rast), %s)), %s) AS rast 
      FROM landcover, burnedarea 
     WHERE ST_Intersects(landcover.rast, burnedarea.rast)
       AND burnedarea.rid = %s'''

    # Create a RasterQuery instance; apply the parameters
    query = RasterQuery(query_string)
    query.execute([1000, 'GTiff', 2])

    filename = 'blah.tiff'

    # Outputs:
    # [(<read-only buffer for 0x2613470, size 110173, offset 0 at 0x26a05b0>),
    #  (<read-only buffer for 0x26134b0, size 142794, offset 0 at 0x26a01f0>)]

    # Return only the first item
    response = HttpResponse(query.fetch_all()[0], content_type=FORMATS[_format]['mime'])
    response['Content-Disposition'] = 'attachment; filename="%s"' % filename
    return response
Seem simple enough? To write to a file instead, see the write_all() method of the RasterQuery class. The query.fetch_all()[0] at the end is contrived. I'll show a better way of getting to a nested buffer in the next example.

Using GeoAlchemy2


GeoAlchemy2's object-relational model (ORM) allows tables to be represented as classes, just like in Django.
class LandCover(DeclarativeBase):
    __tablename__ = 'landcover'
    rid = Column(Integer, primary_key=True)
    rast = Column(ga2.types.Raster)
    filename = Column(String(255))

class BurnedArea(DeclarativeBase):
    __tablename__ = 'burnedarea'
    rid = Column(Integer, primary_key=True)
    rast = Column(ga2.types.Raster)
    filename = Column(String(255))
    burndate = Column(Date)
    burnname = Column(String(255))
Assuming that SESSION and ENGINE global variables are available, the gist of this approach can be seen in this example:
def clip_fccs_by_mtbs_id2(request):

    query = SESSION.query(LandCover.rast\
        .ST_AsGDALRaster(LandCover.rast\
        .ST_Clip(LandCover.rast, BurnedArea.rast\
        .ST_Envelope()\
        .ST_Buffer(1000)), 'GTiff').label('rast'))\
        .filter(LandCover.rast.ST_Intersects(BurnedArea.rast), BurnedArea.rid==2)

    filename = 'blah.tiff'

    # Outputs:
    # [(<read-only buffer for 0x2613470, size 110173, offset 0 at 0x26a05b0>),
    #  (<read-only buffer for 0x26134b0, size 142794, offset 0 at 0x26a01f0>)]

    result = query.all()
    while type(result) != buffer:
        result = result[0] # Unwrap until a buffer is found

    # Consequently, it returns only the first item
    response = HttpResponse(result, content_type=FORMATS[_format]['mime'])
    response['Content-Disposition'] = 'attachment; filename="%s"' % filename 
Here we see a better way of getting at a nested buffer. If we wanted all of the rasters that were returned (all of the buffers), we could call ST_Union on our final raster selection before passing it to ST_AsGDALRaster.

After considering all my (apparent) options, I found this last technique, using the PostGIS raster output function(s) and writing the byte array to a file-attachment in an HTTP response, to be best suited for my application. I'd be interested in hearing about other techniques not described here.

35 comments:

  1. Each of these Changes is communicated to the database. The change is first made on the "read/write" database and then replicated to the remote database copy. create mysql dashboard

    ReplyDelete
  2. The key factors and criteria that organizations consider important towards managing, measuring and controlling production processes and costs can be logged into a database for analysis. DB Designer

    ReplyDelete
  3. imo
    imo downloading
    imo download
    imo app
    imo apk
    The instant messaging app imo has crossed the threshold of a half-billion Play Store installs, and there's a pretty good chance you've never ...

    ReplyDelete
  4. What’s Happening i’m new to this, I stumbled upon this I’ve found It
    positively useful and it has helped me out loads.دانلود آهنگ جدید

    I hope to contribute & help different customers like its aided me.
    Good job.

    ReplyDelete
  5. My friend has praised your blog and I want to read your blog. There is a lot of good information on your blog, I loved reading it and I think people will get a lot of help from this article. though I have written this kind of blog, I think you will get a lot of help from this too. This is a good inspiration for your article. coca-cola job application

    ReplyDelete
  6. Hi everyone! it’s my first visit of this web site; this blog includes amazing and infact good stuff in support of visitors. Thanks for the useful information on database. Click easy info blog

    ReplyDelete
  7. This is a very useful and important information, it was so useful to me and other readers, thank you for always feeding the readers with interesting and useful information, your blog is indeed doing very well, kudos, meanwhile you can checkout this infoguide

    ReplyDelete
  8. This is a very useful and important information, it was so useful to me and other readers, thank you for always feeding the readers with interesting and useful information, your blog is indeed doing very well, kudos, meanwhile you can checkout this njfp sign up

    ReplyDelete
  9. Good to know about this. Also visit here: Feeta.pk house for rent in johar town Lahore . Thank you for this sharing.

    ReplyDelete
  10. Thanks for sharing this wonderful post.  Redspider web design company dubai  is about to tell you that I have perused your article, it is exceptionally enlightening and helpful for me.

    ReplyDelete
  11. The post you left was very useful to me. Thank you for your good site
    دانلود آهنگ ترکی

    ReplyDelete
  12. A very details explanation was provided on this topic. Your writing technique is impressive and enjoyable to read. They are a few options available on my display data from in the database. Now it's time to get Long distance taxi for more information.

    ReplyDelete
  13. Thanks for writing this quality informational content. I get inspired by the writing skills of the writer. Many options are available in this article. I really appreciate your effort in this article.
    Now it's time to get inbound call center for more information.

    ReplyDelete
  14. أفضل خزانات مياه فيبر جلاس ومطابقة للمواصفات العالمية ، بسعة 850 لتر وحتى 60 متر مكعب ، بأشكال مختلفة تناسب جميع الأغراض (طولية وأفقية).

    ReplyDelete
  15. Yes, I am very happy to read this one. I hope you will share more good updates. Now it's time to avail shutter repair for more information.

    ReplyDelete
  16. I continuously visit your blog and retriev you understand this subject. Bookmarked this page, will come back for more. Online Shop

    ReplyDelete
  17. Efficiently extract raster data from the database for seamless data analysis and visualization for more info visit our Tokister

    ReplyDelete
  18. Fetch Python is very good plugin. I am using this plugin. I am so satisfied from this plugin. Keep it up and share more updates with us. Now its time to avail Sign and Blind Services in Eastbourne for more information.

    ReplyDelete
  19. reckless driving virginia codeis a legal professonals those who experts in guiding clients with their legal problems in the state of virginia.Keep sharing the post!

    ReplyDelete
  20. This is a useful blog about databases and their changes. Thank you so much for sharing this informational content. Now it's time to avail Jupiter Florida Airport for more information.

    ReplyDelete
  21. Trusted Personal Injury Lawyer Virginia Beach is provide a legal advice, and represent in court for you.

    ReplyDelete
  22. Good technical post. Keep updating us more technically by sharing more technical blogs. Uncontested Divorce Lawyer Fairfax

    ReplyDelete
  23. "PostGIS 2.x: Getting Raster Data out of the Database" is a comprehensive guide that offers step-by-step instructions and clear examples for extracting raster data from a PostGIS database. It is an excellent resource for those working with geospatial data and utilizing PostGIS effectively, providing valuable insights for both beginners and experienced users. Nueva Jersey Crímenes Sexuales

    ReplyDelete
  24. "PostGIS 2.x: Getting Raster Data out of the Database" is a comprehensive guide for extracting raster data from databases. It breaks down complex technical processes into accessible steps, making it a valuable resource for GIS enthusiasts and professionals. The review focuses on PostGIS 2.x, enabling users to effectively retrieve raster information, making it a must-read for those seeking practical guidance in geospatial data management. ¿Quién Recibe el Anillo de Compromiso en un Divorcio Nueva York?

    ReplyDelete
  25. Outstanding! The Best Divorce Lawyer Fairfax! Expertise, empathy, and a personalized approach. Navigated my case with precision. Highly recommended for anyone seeking top-notch legal support in divorce proceedings.

    ReplyDelete
  26. Thanks for sharing this article to us. I gathered more useful information. abogados divorcio roanoke va

    ReplyDelete
  27. Amazing, Your blogs are really good and informative. I got a lots of useful information in your blogs. His life in Southern and Central Illinois, andlater in upstate New York was very spartan and emotionally bare: "These are small, cold desolate places and you have no friends. fairfax divorce lawyers Keeps sharing more useful and interesting blogs of articles,,..

    ReplyDelete
  28. The text suggests that the user is requesting a review or feedback on a document or resource related to extracting raster data from PostGIS, but has not provided the actual content or context. The phrase "postgis-2x-getting-raster-data-out-of" appears to be a placeholder or filename related to PostGIS, a spatial database extender for PostgreSQL. To provide a review comment, the user would need more information about the specific content, context, or issues the user would like feedback on. If the user has specific lines or excerpts from the document, they can be provided for review or feedback. If the user has questions or concerns about the content, they can provide more context or details, and the user will try to offer guidance based on the information provided. abogado lesiones personales virginia

    ReplyDelete