Tuesday, July 7, 2015

GIS Programming - Module 7 - Exploring and Manipulating Spatial Data

This assignment covered two chapters (6 and 7) in the textbook and took two weeks of work.  So far it is the most challenging module of this course, but potentially the most useful as well.  We worked with lists, dictionaries, and search, insert and update cursors.

A dictionary is a collection of pairs of objects.  The first of each pair is called the key, and it is associated with a value.  In our example, the keys were county seat cities, and the values were their populations.  Each key is unique and maps to exactly one value, although the same value can appear under more than one key.  A dictionary can be used to look up the value for an object, such as the population for a city.
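
Here is a minimal sketch of that kind of lookup. The city names and population numbers are made-up placeholders, not the assignment's data.

```python
# Placeholder data: the names and numbers are illustrative only.
populations = {"City A": 1000, "City B": 2500}

# Look up the value stored under a key.
city_a_pop = populations["City A"]

# Keys are unique: assigning to an existing key replaces its value.
populations["City A"] = 1200

# Different keys may share the same value.
populations["City C"] = 2500

# get() returns a default instead of raising KeyError for a missing key.
missing = populations.get("City D", None)

print(populations)
```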

A cursor is a data access object.  A search cursor retrieves rows from a dataset, returning only the fields you ask for, and the rows can be limited with an SQL where clause.  Our script used a search cursor to retrieve the county seat cities in New Mexico, with names and populations, add them to a dictionary, then print out the dictionary.  An insert cursor adds new rows to a table or feature class.  An update cursor can change the data in existing rows, or delete rows.

In our assignment, we used a search cursor to find the names of those cities in New Mexico that are county seats, along with their populations.  Then, using a for loop, we updated a new, empty dictionary, adding a key and value pair with the name and population of each city.
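
The loop can be sketched without ArcGIS by letting a list of tuples stand in for the cursor. This is only an approximation under stated assumptions: the rows below are hypothetical placeholders, and in the real script the rows come from arcpy.da.SearchCursor, with the SQL where clause doing the filtering that the if statement does here.

```python
# Hypothetical stand-in for the rows a search cursor would return.
# Each tuple follows the field order (NAME, FEATURE, POP_2000);
# the values are placeholders, not real data.
rows = [
    ("City A", "County Seat", 1000),
    ("City B", "Village", 300),
    ("City C", "County Seat", 2500),
]

county_seats = {}  # start with an empty dictionary

for row in rows:
    # The real cursor filters with an SQL where clause; here an
    # if statement plays that role.
    if row[1] == "County Seat":
        county_seats.update({row[0]: row[2]})

print(county_seats)
```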

The following figures show the results from my script.  Because the script is fairly long and involved, it prints a statement after each step is completed, and finally prints the resulting dictionary of county seats and their populations.

(Because these results are long, they're shown in three figures, although in actuality the whole thing was produced in a single run of the script, in the Interactive Window.)

Figure 1. Part 1: Results for Creation of Geodatabase, List, Initiation of Copy to Geodatabase

Figure 2. Part 2: Results of Copy (continued)

Figure 3. Part 3: Results for Populate Dictionary with Search Cursor for loop.  Print Dictionary.

Process Summary Details.

Question: Which step did you have the most difficulty with? 
Describe:
1) the problem you were having, and 
2) the solution or correct steps to fix it.

I had some trouble with Step 7, populating the dictionary with the search cursor.  My main trouble was understanding the logic of the search cursor, and exactly what it’s supposed to do.  Once I read the text, the Help topics and some of the Discussion post answers, I figured out its purpose and was able to put the logic together in the form of pseudocode.  I then understood the idea of populating the dictionary with the rows, and began to figure out what exactly the row variable represents.
My first version looked like this:
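# Return only the rows whose FEATURE field is 'County Seat'
cursor = arcpy.da.SearchCursor(fc, ["NAME", "FEATURE", "POP_2000"], '"FEATURE" = \'County Seat\'')
for row in cursor:
    # row[0] = NAME, row[1] = FEATURE, row[2] = POP_2000
    print row[0], row[1], row[2]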
I wrote this before I had gotten to the step where I created the dictionary, and the result was simply a list of the name, feature type and population of each county seat city.
Then I created the empty dictionary, but had some trouble figuring out where to put the update statement (from the assignment instructions) in relation to the search cursor statement.  For several tries, the script ran okay, but the dictionary didn’t populate.  It remained empty.  Unfortunately, I didn’t save this incorrect version of the script and can’t remember exactly what it looked like.  I do remember that I initially put the dictionary update statement outside the for loop, and put print statements for the rows inside.  This ran okay, but because the update statement wasn’t being run for each row, the dictionary wasn’t being filled.
After looking at the ArcGIS Help page, the text, and the discussion posts, it finally began to dawn on me just what the search cursor is supposed to do, and I realized that the update statement needed to be inside the for loop over the cursor.  At the same time, I also started paying attention to the fact that [0], [1], [2] and so on represent index positions.  I reviewed that concept a little, and saw that I needed the correct index for each key and value that I wanted to add to the dictionary from the search cursor rows.  Another thing I realized is that row is just a variable, one that holds the current row on each pass through the loop.  After studying the syntax some more, I took an experimental stab at putting the update statement inside the for loop, and luckily it worked.  The trickiest part was getting the part inside the parentheses right, because it has to combine the indexing syntax of the row variable with the syntax of a dictionary.
I have to emphasize here that part of my understanding came AFTER this, when I saw that it worked.  
At this point, I still had the print county_seats statement inside the for loop as well, which resulted in the dictionary being printed repeatedly, one pair longer each time a new key and value pair was added.  I’ve had enough practice by now, however, that I knew right away that the print statement needed to be un-indented, so I did that.
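
The effect of the indentation can be sketched with a few placeholder rows (hypothetical names and numbers, and a plain list standing in for the cursor). The sizes_seen list is only there to record how big the dictionary was each time the indented print ran.

```python
rows = [("City A", 1000), ("City B", 2500), ("City C", 400)]  # placeholder rows

# Version 1: print is indented, so it runs once per row.
county_seats = {}
sizes_seen = []
for row in rows:
    county_seats.update({row[0]: row[1]})
    print(county_seats)  # prints a growing dictionary on each pass
    sizes_seen.append(len(county_seats))

# Version 2: print is un-indented, so it runs once after the loop ends.
county_seats = {}
for row in rows:
    county_seats.update({row[0]: row[1]})
print(county_seats)
```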
My final script that works properly is shown here:

for row in cursor:
    county_seats.update({row[0]: row[2]})  
print county_seats

row[0], the key, is the first index position: NAME.  row[2], the value, is the third index position: POP_2000.  The key and value pair is written as a small dictionary inside the curly brackets, with the key and value separated by a colon ( : ).
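
As a small sketch of how the index positions line up with the field list (the row is a hypothetical placeholder, and looking the positions up by name is my own illustration rather than part of the assignment), the same pair can also be added with plain item assignment:

```python
fields = ["NAME", "FEATURE", "POP_2000"]  # field order given to the cursor
row = ("City A", "County Seat", 1000)     # placeholder row, not real data

# Look up the positions by field name rather than hard-coding 0 and 2.
name_idx = fields.index("NAME")
pop_idx = fields.index("POP_2000")

county_seats = {}
county_seats.update({row[name_idx]: row[pop_idx]})

# Item assignment gives the same result as update() with a one-pair dictionary.
same_result = {}
same_result[row[name_idx]] = row[pop_idx]

print(county_seats)
```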
It would be very useful to save scripts that run okay, but don’t produce the correct results.  I plan to do this in the future, and include commentary about what the script fails to do, as well as what it does do.  


I initially had a section at the beginning of the script that checked whether the geodatabase existed, and deleted it if it did.  I did this because I thought that was what caused the error message saying the .gdb already exists.  Then I realized it’s okay if the geodatabase already exists, and that it will be overwritten, but only if ArcMap is closed and there’s no lock on the geodatabase.  When I figured this out, I got rid of the Exists and Delete part of the script.
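
The "check, then delete" pattern I removed can be sketched with plain folders, since a file geodatabase is stored on disk as a folder. This is only a stand-in under stated assumptions: in the real script the check and the delete would go through arcpy (arcpy.Exists and arcpy.Delete_management), and the workspace and geodatabase name below are hypothetical.

```python
import os
import shutil
import tempfile

# Hypothetical workspace and geodatabase name, created in a temp folder.
workspace = tempfile.mkdtemp()
gdb_path = os.path.join(workspace, "example.gdb")
os.mkdir(gdb_path)  # pretend an earlier run left this behind

# Check first, delete only if it is there, then create it fresh.
existed_before = os.path.exists(gdb_path)
if existed_before:
    shutil.rmtree(gdb_path)
os.mkdir(gdb_path)

exists_now = os.path.exists(gdb_path)
print("Existed before: %s, exists now: %s" % (existed_before, exists_now))

shutil.rmtree(workspace)  # clean up the temp folder
```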


