library(sf)
cejst.st <- st_read("/opt/data/data/assignment01/cejst_nw.shp")
cejst.sf <- read_sf("/opt/data/data/assignment01/cejst_nw.shp")
Assignment 1 Solutions: Introductory material
How does geographic analysis fit into your goals for your research? Given our discussion of the aims and limitations of geographic analysis, are there particular issues that you would like to know more about or guard against?
My research is based almost entirely on geographic analysis, from estimating ecosystem services to studying agricultural change through time in different Idaho locations. One issue I face is that I integrate ecosystem services data collected and estimated at multiple scales into one analysis, so I have to be careful not to fall into any pitfalls or fallacies when I combine multiple sources of data. In other cases, I have county-level data and have to acknowledge the measurement bias that comes with data collected along those political boundaries.
What are the primary components that describe spatial data?
I would say that the primary components are the coordinate reference system (because it helps us understand where we actually are on Earth), the extent of the data (because that helps me know what scale we’re working with and the size of the computational problem), the resolution (same reason as extent), the geometry, and spatial support. I don’t think about this last one often enough, but it really is the key to honest interpretation of the spatial data that you have.
What is the coordinate reference system and why is it important?
The CRS consists of the information necessary to locate points in two- or three-dimensional space. Coordinates are only meaningful in the context of a CRS: (2,2) could describe any number of places in the world, so we need to know the origin and the datum to actually know where that is. The CRS becomes particularly important when we need to align datasets that were not originally collected in the same CRS or when we need to transfer locations from the globe to a flat surface (e.g., a map or a screen).
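To make the point about coordinates needing a CRS concrete, here is a minimal sketch. The point location and the two EPSG codes (4326 for WGS 84 longitude/latitude, 32611 for UTM zone 11N) are my own illustrative choices, not part of the assignment data.

```{r}
library(sf)

# A bare coordinate pair: the numbers alone do not identify a place.
xy <- st_point(c(-116.2, 43.6))

# Attach a CRS so the pair is read as longitude/latitude on WGS 84 (EPSG:4326).
pt_lonlat <- st_sfc(xy, crs = 4326)

# Re-express the same location in a projected CRS (UTM zone 11N, EPSG:32611),
# where the coordinates are in metres rather than degrees.
pt_utm <- st_transform(pt_lonlat, 32611)

# Same place on Earth, very different numbers.
st_coordinates(pt_lonlat)
st_coordinates(pt_utm)

# Transforming back recovers the original longitude/latitude.
pt_back <- st_transform(pt_utm, 4326)
st_coordinates(pt_back)
```

The same logic applies when aligning two datasets: one of them is transformed with `st_transform` so that both share a CRS before any overlay or distance calculation.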
Find two maps of the same area in different projections. How does the projection affect your perception of the data being displayed?
Here’s a fun article on projections that shows what I’m talking about!
Read in the cejst_nw.shp file in the assignment01 folder. How many attributes describe each object? How many unique geometries are there? What is the coordinate reference system?
I can read in the data using `st_read` (stored here as `cejst.st`) or `read_sf` (stored here as `cejst.sf`). You can inspect the differences between the resulting object classes by calling `class`:
```{r}
#| message: false
class(cejst.sf)
class(cejst.st)
```
[1] "sf" "tbl_df" "tbl" "data.frame"
[1] "sf" "data.frame"
You’ll notice that using `st_read` assigns the object to an `sf` and `data.frame` class, meaning that functions defined for those two classes will work. Alternatively, `read_sf` assigns the object to `sf`, `tbl_df`, `tbl`, and `data.frame` classes, meaning that a much broader set of functions can be run on the `cejst.sf` object.
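If you want to try the class difference without access to the assignment files, the sketch below uses the North Carolina shapefile that ships with `sf` as a stand-in (my substitution, not the assignment data). It also assumes the `tibble` package is installed, since `read_sf` returns a tibble.

```{r}
library(sf)

# Stand-in data: the North Carolina shapefile bundled with the sf package.
nc_path <- system.file("shape/nc.shp", package = "sf")

nc_df  <- st_read(nc_path, quiet = TRUE)  # plain sf data.frame
nc_tbl <- read_sf(nc_path)                # sf tibble

# Both are sf objects; only the read_sf() result is also a tibble.
inherits(nc_df, "tbl_df")
inherits(nc_tbl, "tbl_df")

# st_read() can return a tibble too when asked via as_tibble = TRUE.
nc_tbl2 <- st_read(nc_path, quiet = TRUE, as_tibble = TRUE)
identical(class(nc_tbl), class(nc_tbl2))
```

In practice the choice mostly affects printing and how the object behaves in tidyverse-style workflows; the geometries themselves are the same either way.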
Because the data are in wide format, we can assume that there is only one observation for each location: `sf` requires a geometry entry for every observation, even if that geometry is empty. Probably the easiest way to get the number of observations, and hence the number of geometries, is:
```{r}
nrow(cejst.sf)
```
[1] 2590
Similarly, if we wanted to know how many attributes are collected for each observation we could use `ncol`:
```{r}
ncol(cejst.sf)
```
[1] 124
Note that these are really only approximate counts. `ncol` also counts the geometry column itself, and there are usually a lot of extra ID-style columns in spatial data, so the number of columns with useful information is less than the total number of columns. We won’t worry about that for now.
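For a slightly more careful count, and to answer the coordinate reference system part of the question, something like the following sketch could be used. It runs on the North Carolina shapefile bundled with `sf` as a stand-in (an assumption on my part so the code is self-contained); swapping in `cejst.sf` should work the same way, but I make no claim here about what it would return for that file.

```{r}
library(sf)

# Stand-in data: the North Carolina shapefile bundled with the sf package.
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

n_features   <- nrow(nc)                    # one row per observation
n_columns    <- ncol(nc)                    # includes the geometry column
n_attributes <- ncol(st_drop_geometry(nc))  # attribute columns only

# Which geometry types are present, and are any geometries empty?
geom_types <- unique(st_geometry_type(nc))
n_empty    <- sum(st_is_empty(st_geometry(nc)))

# Number of distinct shapes, compared via their well-known text representation.
n_distinct_shapes <- length(unique(st_as_text(st_geometry(nc))))

# The coordinate reference system attached to the data.
st_crs(nc)
```

Comparing `n_distinct_shapes` with `n_features` is one way to check the assumption that each row corresponds to its own location.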