seascapemodels

What will be the next big advance in spatial analysis using R?

The programming tools available to scientists for data analysis are becoming increasingly complex and sophisticated. This trend will continue; however, I believe the most significant new advances will come from the development of open, light and accessible packages. By 'light' I mean packages that are easy to learn, straightforward to work with, and able to run on standard desktops and laptops.

The capacity of languages like R to produce incredibly sophisticated analyses outstrips most people's ability to use those tools. A few people are pushing the boundaries (e.g. see Sean Anderson and colleagues' recent global extinction risk analysis), but most scientists who use R will continue to use only the tools they are most familiar with (while pushing the boundaries of human knowledge in other ways, of course).

Case in point: the success of Wickham's dplyr package for data wrangling in R. dplyr doesn't really do anything you can't do with R's base package, but it does make data wrangling easier, more intuitive and faster. dplyr is brilliant not only because it is accessible for relative beginners, but also because it saves coding time for advanced users.
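To make the comparison concrete, here is a minimal sketch of the same grouped summary written both ways. It assumes dplyr is installed and uses R's built-in mtcars data purely for illustration; the choice of `aggregate()` (from the stats package that ships with R) as the base equivalent is mine, and there are other base approaches.

```r
library(dplyr)

# Base R: mean fuel economy (mpg) for each cylinder count
base_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# dplyr: the same summary, written as a pipeline of verbs
dplyr_result <- mtcars %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))

# Both approaches give one row per cylinder count with the same means
print(base_result)
print(dplyr_result)
```

Neither version is long for a task this small, but the dplyr pipeline reads in the order the steps happen, which tends to matter more as the wrangling gets longer.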

More generally, other areas of science can benefit from accessible and light packages too. Take an example from coral reef science. For many decades, reef scientists have been interested in how the complexity of reefs varies across environmental gradients. One way to measure complexity is to take a lead rope of, say, 3 metres length and lay it out over the contours of the reef, then measure the rope's resulting length when viewed from above. The ratio of the two lengths is then a measure of the reef's complexity. More detailed measurements of reef complexity can be made by taking video footage and post-processing it to make 3D models of the reef, and this method can give precise measurements. But people keep going back to the old lead rope method.
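The arithmetic behind the rope method is simple enough to sketch in a few lines of R. This is only an illustration: the function name `rugosity`, the site names and the planar lengths are all hypothetical, and I have assumed the convention of dividing the rope's full length by its length viewed from above, so a flat reef scores 1 and more complex reefs score higher.

```r
# Complexity index from the rope method.
# Assumed convention: rope length divided by its planar (viewed-from-above) length.
rugosity <- function(rope_length, planar_length) {
  # The planar length must be positive and cannot exceed the rope's own length
  stopifnot(all(planar_length > 0), all(planar_length <= rope_length))
  rope_length / planar_length
}

# Hypothetical planar lengths (metres) for a 3 m rope laid at three sites
planar <- c(flat = 3.0, moderate = 2.4, complex = 2.0)

index <- rugosity(3, planar)
print(round(index, 2))
```

That a field measurement can be turned into a result this easily is a large part of the method's appeal.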

The reason people continue to use the lead rope method is that ropes are reliable and calculating complexity from your field data is easy. Compare that with the hassles of taking a 3D camera underwater, generating large amounts of video data and then having to use complex post-processing programs to get your results. The rise of cheap high-def cameras, like GoPros, has solved the first issue, but the processing step remains a major challenge. Several groups have developed programs for processing 3D video over the years, but these have typically required computer scientists to run them. The method won't be broadly used until someone develops a program that is accessible to reef scientists, light and cheap.

The uptake of R's sophisticated spatial tools faces the same challenge: many of them are accessible only to experienced programmers. This is changing, of course. For instance, the raster package comes with very accessible help files, and its developers are even expanding it to include some processing options for shapefiles. However, the documentation for the main package for processing shapes, rgeos, is quite technical. Further, rgeos operations often throw up incomprehensible errors when you perform common tasks like polygon intersections.

The potential to use R as a fully functioning GIS is only just beginning to be realised. So watch this space.

Further reading: If you are just starting out using R as a GIS, check out my introductory course.

Contact: Chris Brown