Advanced GIS

Assignment 4: SQL

Readings

A guest speaker will join us on March 5th, please read the following in preparation:

Resources

An SQL Reference (pdf)

This video on SQL in CartoDB might be helpful.

The power of CartoDB is a handy list of SQL functions and techniques you can use in CartoDB.

If you want to learn all there is to know about SQL, Learn SQL The Hard Way and SQL School are good places to start.

Part 1: SQL

Write at least five SQL statements for one or two tables in your CartoDB account. Base them on a variety of the queries under Common Queries in the SQL Reference. Some should be mappable and some should be unmappable (eg, selecting the COUNT of a table, since that cannot be mapped).

Submit:

Part 2: Spatial SQL, Map Kibera

We're going to work through using SQL to locate features within other features using the Map Kibera data. This looks long, but that's mostly because we're going step-by-step. Follow along:

  1. Download education and boundaries data if you don't already have them.
  2. Upload the education and boundaries data to CartoDB. Rename the boundaries table to kiberaboundaries.
  3. Let's try selecting boundaries by name:
    1. Open the kiberaboundaries table.
    2. In the SQL tab enter the following:
      SELECT * 
      FROM kiberaboundaries
      WHERE name = 'Makina'
    3. Two rows should be shown. Click the Map View button to see just those features.
    4. Try another name from the table and see what the result is.
  4. Now let's look at the education table:
    1. Open the education table.
    2. In the SQL tab delete the old query and enter the following:
      SELECT *
      FROM education
      WHERE type = 'primary'
    3. Only the primary schools should show be displayed. Look at the Map View. This is not a required step, just making sure that the table is loaded correctly and SQL works.
  5. We're going to join the education table with the kiberaboundaries table. While looking at the education table, delete the old SQL query and enter the following SQL:
    SELECT education.*
    FROM education, kiberaboundaries
    WHERE ST_within(education.the_geom, kiberaboundaries.the_geom)

    Let's look at this statement line by line:

    1. SELECT education.*
      Selects all the columns from education.
    2. FROM education, kiberaboundaries
      We want to have education for its columns, but we also want kiberaboundaries to be available for the next line.
    3. WHERE ST_within(education.the_geom, kiberaboundaries.the_geom)
      Finally, we're only selecting the education rows where the_geom (the point for the row) is within one of the kiberaboundaries rows' the_geom.

  6. Since we're basically saying "select all the features in Kibera" and we know all the features are in Kibera, we should still see all the features. Look at Map View and confirm.
  7. To pick only education features that are within particular boundaries, now all we have to do is use conditions to pick those boundaries. Make your SQL query look like this:
    SELECT education.*
    FROM education, kiberaboundaries
    WHERE ST_within(education.the_geom, kiberaboundaries.the_geom)
    AND kiberaboundaries.name = 'Makina'

    You should be able to add the last line (AND kiberaboundaries.name = 'Makina') to the SQL you already have.

    The last line specifies that we only want to look at kiberaboundaries with the name attribute set to 'Makina'. So when the database is looking for education features to include, it only considers those that are within kiberaboundaries with this name.

  8. Now that we have this query, we can change the name and see different education features appear on the map. In your SQL, replace Makina with Ayany. Confirm that only those features are selected by looking at the Map View.
  9. Finally, we can filter the education filters more by adding conditions to our query. Switch the name back to Makina and add the following line:
    AND education.type = 'primary'

    Now we're only selecting education features in Makina that are primary schools.

That's it! You're using spatial SQL queries. Submit a screenshot with the result and your final code.

Part 3: Spatial SQL, Your Data

Follow the steps in Part 2 to select features within a boundary using two datasets other than the ones used in Part 2. The datasets will generally be (1) smaller features such as points and (2) larger polygons that represent boundaries of some area.

If you're having a hard time finding data to use, consider using the Inside AirBnB data we worked with last week for the points and some boundaries such as city council districts for the boundaries.

Submit a map using the SQL.

Submitting your work

You may optionally choose to include a short paragraph response or summary of your work.