DATA SCIENCE

A data scientist is a professional who creates programming code and combines it with statistical knowledge to create insights from data.

Wikipedia. Oh yes, Wikipedia.

NBR AND DATA SCIENCE

It is not clear if “Data science” and “Data scientist” are terms that will pass the test of time, but they currently seem pertinent for NBR. Much of the work the researchers at NBR undertake isn’t just about doing statistical analyses, but about collecting, manipulating, sorting and otherwise handling data. So it seems reasonable to apply science to these parts of the process as well.

At NBR, Excel is the backbone of data management. It suits the level of knowledge across the entire team and the complexity of the data being managed. The system also includes some queries, some pivot tables, a good deal of R, some apps that generally produce csv files, more and more spatial data and the relevant file types, and no SQL. The system is pretty good, but far from perfect and far from where it could be. Where it could be might not be where it needs to be, though. The system is continually being updated, as need and time permit.

ME AND DATA SCIENCE

My work in data science is all through NBR. My role is mainly as the R and the spatial person. I work mainly in RStudio for editing, but a little in Visual Studio Code, syncing code to GitHub. The bulk of the spatial data is developed and viewed in SOLVI, with the connection to our research data completed in R. Some other analysis or visualisation of spatial data is done in QGIS, and additionally through the SAGA or GRASS libraries via an R wrapper. Spatial data is also used in Google Maps, and in Pix4DCapture for making drone flying easier.

My hope is to continue to expand the NBR data science system, particularly when it comes to the collation of research data, the use of spatial data, and the presentation of results. I have built a few interactive tools for farmers to use: see GitHub for the current list of publicly available apps/models.

And finally, beyond data science, there are a few other key programs and programming languages I have used at NBR.

  • LaTeX is used when writing – I use the open-source program TeXstudio or the online platform Overleaf.
  • R Markdown in RStudio is sometimes used when the writing involves data from R.
  • OpenFOAM is a collection of C++ libraries used for running CFD models.
  • The Arduino systems we build use C++.
  • The Raspberry Pi systems we build use Python.