SQL SERVER 2016 – Getting Started with R Services and Forecasting – Notes from the Field #131

[Notes from Pinal]: R Services are very much new thing and trust me, I still have to get my hand dirty on this subject. I know the basics of it, but still I personally do not know enough that I can blog about it and teach others. This is when I reach out to Eduardo, he has this incredible ability to learn new thing and teach others in the lucid easy way. I requested Eduardo to help me out with R Services and Forecasting and he happily accepted.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 EduardoCastro In this 131st episode of the Notes from the Fields series database expert Eduardo Castro (partner at Linchpin People) shares very interesting conversation related to we can get started with R Services and Forecasting.


One of the new challenges the data professional is to deal with are the new techniques that are emerging for data analysis, those techniques are in the field of Data Science. Recently I was working in a project and a new requirement was brought to the pipeline, we were requested to apply some forecasting algorithms to the data we already had in our database, immediately data mining and Analysis Services came to my mind, but we had a restriction: the algorithm was already written using R Studio by one the company’s data scientists.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 gettingstartedwithr-800x363

Based on that restriction, we asked the project manager authorization to use the new feature in SQL Server 2016 that allows us to include R Scripts inside the database, the transition of the R code in SQL Server 2016 is possible, and in this post I will show you how to do it.

First, you must be using SQL Server 2016 and you must install the Advanced Analytics Extensions also known as R Services, this is done during the installation process, the following image shows what must be selected during setup.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-1

Optionally, you can also install the Microsoft R Server Standalone as a Shared Feature. During installation, you will be asked to install some additional libraries to support the R Services, you can accept the defaults.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-2

Once you have the R Services installed, you must check the R Script that must run inside SQL Server is working properly, the best way to do it is by using R Studio, which a free tool to create R Scripts, you must take a copy of the script and check if it runs inside R Studio, if the script has any problem it must be corrected before trying it inside SQL Server. The following image shows the successful installation of the R Services.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-3

In this blog, we will use a very simple R Script to show how can we ported it to SQL Server, the sample R Script uses a very well know data set called Air Passengers, the dataset contains information about the passenger statistics, then we will use a Forecasting function to predict the next values in the time series.

The following figure shows the sample code inside R Studio.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-4

If we run this script we will get the forecasted values for the passenger data, the output in the R Studio Console is shown in the following image.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-5

Once you have the R Services installed you must configure the SQL Server instance to allow the execution of R Scripts, the R Scripts are run as an external script so we need to activate this feature in SQL Server using the following command, and the restart SQL Server.

sp_configure 'external scripts enabled', 1;
RECONFIGURE
GO

Is very common that the R Scripts need any specific library and these libraries are not installed by default, for example the “tseries” package must be installed in order the script to run. The installation of the required packages can be done using the RGui which is installed when you choose the Analytics Libraries during the SQL Server Installation, you must run the RGui as an administrator in order to be able to install the libraries.  The RGui console is shown below:

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-6

If you need to install a package for example “tseries” you must run the following command.

install.packages("tseries", lib = "C:/Program Files/Microsoft SQL Server/MSSQL13.MSSQLSERVER/R_SERVICES/library")

The following image shows the successful installation of the package.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-7

We already had tested our R Script in R Studio so we can start porting it to SQL Server as an in-database script, basically the execution of the R Scripts is done using the EXEC sp_execute_external_script and we must specify that the script language is R.  The basic parameters needed to execute R Scripts are explained in the following table:

@language = N'R'
This parameter specifies that the scripting language is R
@script = N' '
This parameter contains the script body, what you must do is to copy the content of the script in R Studio and include it inside this parameter
@input_data_1 = N' Your T-SQL goes here'
If you need to read data from a SQL Server table, for example a valid input can be:

USE [AdventureworksDW2016]
GO
SELECT [ModelRegion]
      ,[TimeIndex]
      ,[Quantity]
      ,[Amount]
      ,[CalendarYear]
      ,[Month]
      ,[ReportingDate]
  FROM [dbo].[vTimeSeries]
GO	

The variable @input_data_1 will contain the result the of query and this variable can be used inside the R Script to manipulate the data.

@output_data_1_name = N' Output Data Frame Name'
This parameter reference a data frame generated inside the R Script and its intention is to hold the result of the forecasting algorithm.
WITH RESULT SETS (([U] float , [L] float, [Forecast] float  ))
This last part is optional and it gives you the option the specify the name and the data type of the data frame columns included in the output of the R Script

The complete ported R Script is shown below:

-- calculate forecasting with R
EXEC sp_execute_external_script
      @language = N'R', 
     @script = N'   
	                      library(tseries)
				  f <- decompose(AirPassengers);
				  fit <- arima(AirPassengers, order=c(1,0,0), 
                                   list(order=c(2,1,0), period=12));
				  fore <- predict(fit, n.ahead=24);
				  U <- fore$pred + 2*fore$se;
                            L <- fore$pred - 2*fore$se;
				  df_U <- data.frame(U,L,fore$pred);
				  '
				  ,
     @input_data_1 = N'',
     @output_data_1_name = N'df_U'
 WITH RESULT SETS (([U] float , [L] float, [Forecast] float  ))		

If we run the previous script inside the Management Studio we will get the following output.

SQL SERVER 2016 - Getting Started with R Services and Forecasting - Notes from the Field #131 131-8

The above result is the same we got inside R Studio, the main difference is that the script is now part of our database solution and is using the Microsoft R Server and SQL Server 2016 to execute the script.

Conclusions

There are new requirements for the DBA and is more often that they come from the Data Science area, in this blog we had shown how to leverage the new features in SQL Server 2016 to include R Scripts inside the database in an integrated way. We had also shown how to port current R Scripts to SQL Server 2016.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Notes from the Field, SQL Server
Previous Post
Pester the Tester: PowerShell Bugs, Beware! – Notes from the Field #129
Next Post
SQL SERVER 2016 – Find Expiry Date of Developer Edition

Related Posts

3 Comments. Leave new

  • i have a little problem, when i run the first script of R, “Hello World” i see the next error: No se puede iniciar el tiempo de ejecución para el script ‘R’. Compruebe la configuración del tiempo de ejecución ‘R’., please help me, i installed sql server 2016 enterprise full, and i followed all instructions above,
    Thanks

    Reply
  • Hello
    when I run R in my Sql Code I face the following error, does anybody know how I can solve that?

    ———————————————————————————-
    Msg 39012, Level 16, State 14, Line 0
    Unable to communicate with the runtime for ‘R’ script for request id: 0197B658-2C50-4F7E-894F-5FAFEFE78D13. Please check the requirements of ‘R’ runtime.
    STDERR message(s) from external script:
    Error: .onLoad failed in loadNamespace() for ‘RevoScaleR’, details:
    call: inDL(x, as.logical(local), as.logical(now), …)
    error: fatal error: RevoScaleR cannot be used in this R session anymore, if possible restart R session
    error code -1066598274, detailed error message might be found in: (standard output unavailable) and (standard error output unavailable)
    Execution halted
    exception while shutting down RxClientPipe: cannot write to BxlServer, child process is dead
    ——————————————————————————–

    Reply
  • I have the exact same problem and have been searching for an answer… if you find out the issue please post a reply!

    Reply

Leave a Reply