Assignment Revision 3: Revisiting your code

This is your final opportunity to reconsider your answers to the last few assignments and evaluate what you might have done differently now that you’ve had a little more practice. I’ve also asked some specific questions based on common mistakes across the assignments. You’ll still be using Quarto to complete this homework.

Instructions

After you’ve joined the assignment repository, you should have this file (named Readme.md) inside an R project named assignment-revision-3-xx, where xx is your GitHub username (or initials).
Once you’ve verified that you’ve correctly cloned the assignment repository, create a new Quarto document. Name this file ar3_xx.qmd and give it a title (like M Williamson Assignment Revision 3). Make sure that you select the html output option (Quarto can do a lot of cool things, but the html format is the least likely to cause you additional headaches). We’ll be using Quarto throughout the course, so it’s worth checking out the other tutorials in the getting started section.
Copy the questions in the assignment into your document and change the color of their text.
Save the changes and make your first commit!
Answer the questions, making sure to save and commit at least 2 more times (having at least 3 commits is part of the assignment).
Render the document (by clicking the “Render” button in RStudio) to html (you should now have at least 3 files in the repository: Readme.md, ar3_xx.qmd, and ar3_xx.html). Commit these changes and push them to the repository on GitHub. You should see the files there when you go to github.com.
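
For the document-setup steps above, here is a minimal sketch of what the top of ar3_xx.qmd might look like. It assumes you set the title and output format in the YAML header and color the question text with an inline span; the title, the color (steelblue), and the placeholder answer line are illustrative choices, not requirements.

```markdown
---
title: "M Williamson Assignment Revision 3"
format: html
---

[What can a Ripley’s K curve tell you about spatial autocorrelation?]{style="color: steelblue;"}

Your answer goes here.
```

The `format: html` line is what the html output option writes into the header, and the `[text]{style="color: ...;"}` span is one way to change text color in html output. Any other color name or approach that sets the questions apart from your answers should work just as well.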

Questions

Autocorrelation metrics:

What can a Ripley’s K curve tell you about spatial autocorrelation?
What about Moran’s I?
Look at the posted answers for assignment 8. What do the Ripley’s K curve and the Moran’s I slope shown there tell you about the spatial autocorrelation of the data?

When kriging, you need both a spatial trend surface and an experimental variogram. What is the role of each of these components in kriging?
Why are confusion matrices and ROC/AUC plots only useful for categorical dependent/outcome variables? What might you use instead for continuous dependent/outcome variables?
What is the difference between a training/testing data split and k-fold or leave-one-out cross-validation?
Annotate the changes you made to assignment 9 during class on 11/11 and push them to GitHub.
We’ve covered 3 of the 4 sections of this course so far:

Getting Started: What is spatial analysis and how do we do it in R?
Spatial Data Operations in R: Prepping geospatial data for use in R
Statistical Workflows for Spatial Data: Putting spatial data to work!

As you think back across these sections, what is one topic/workflow you feel very confident doing? What topic/workflow that you expect to use in your research are you least confident in? What strategies do you think will help you feel more confident? What support could I or another spatial data expert provide that might help?

Which statistical approach do you plan to use for the final project? What challenges do you anticipate working through during your final project?