I’m not sure whether “data science” and “data scientist” are terms that will stand the test of time, but for my part, I really like the concepts. Much of the work my colleagues and I do isn’t just statistical analysis; it is also collecting, manipulating, and sorting data. So it seems reasonable to apply science to these parts of the process as well.
At NBR, we use Excel a lot, along with some queries, some pivot tables, a good deal of R, some apps that generally produce CSV files, more and more spatial data and the related file types, and no SQL. Our system is pretty good, but far from perfect and far from where we want it. We work on it continually, as need and time permit. I’m mainly the R and spatial person. The bulk of the spatial data is developed and viewed in SOLVI, with some manipulation in R. Some goes through QGIS. The spatial data is also used in Google Maps, and in Pix4DCapture to make drone flying easier.
For my own work, I don’t really need to step outside this system, but I do want to be able to expand it, particularly when it comes to the spatial data and the presentation of results. I have built a few interactive tools for farmers to use: see GitHub for the current list of publicly available apps and models. I use DataCamp for learning (well, more often YouTube and Stack Overflow), and RStudio for editing code and syncing it to GitHub.
And finally, beyond data science, there are a few other key programs and programming languages we use.
- LaTeX is used when writing; I use the open source program TeXstudio.
- R Markdown in RStudio is used when the writing involves data from R.
- The new kid on the block is OpenFOAM. It’s not a language, but a program for running computational fluid dynamics (CFD) simulations. It is written in C++.
- The Arduino systems we build use C++.
- The Raspberry Pi systems we build use Python.