Revision as of 15:04, 22 December 2014, Deepdraft: m Emphasize the alternative (in-database analytics) name
Revision as of 23:48, 4 April 2015, Mild Bill Hiccup: →Types: in either in → in either
Line 13:

==Types==
There are three main types of in-database processing: translating a model into SQL code, loading C or C++ libraries into the database process space as a built-in user-defined function (UDF), and running out-of-process libraries, typically written in C, C++ or ~~JAVA~~Java, that are registered in the database as built-in UDFs and called in a SQL statement.

===Translating ~~Models~~models into SQL ~~Code~~code===
In this type of in-database processing, a predictive model is converted from its source language into SQL that can run in the database, usually in a stored procedure. Many analytic model-building tools can export their models in either SQL or [[PMML]] (Predictive Modeling Markup Language). Once the SQL is loaded into a stored procedure, values can be passed in through parameters and the model is executed natively in the database. Tools that can use this approach include SAS, R and KXEN.

===Loading C or C++ ~~Libraries~~libraries into the database process space===
With C or C++ UDF libraries that run in process, the functions are typically registered as built-in functions within the database server and called like any other built-in function in a SQL statement. Running in process gives the function full access to the database server’s memory, parallelism and processing management capabilities. Because of this, the functions must be well-behaved so as not to negatively impact the database or the engine. This type of UDF gives the highest performance of any method for OLAP, mathematical, statistical, univariate distribution and data mining algorithms.

===Out-of-~~Process~~process===
Out-of-~~Process~~process UDFs are typically written in C, C++ or ~~JAVA~~Java. Because they run in their own process space with their own resources, they do not pose the same risk to the database or the engine. They would not be expected to match the performance of an in-process UDF. They are still typically registered in the database engine and called through standard SQL, usually in a stored procedure. Out-of-process UDFs are a safe way to extend the capabilities of a database server and are an ideal way to add custom data mining libraries.

==Uses==
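Returning to the first type listed under Types, the translation of a model into SQL can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a hypothetical linear model, the `customers` table and its columns are invented for the example, the helper `linear_model_to_sql` is not part of any tool named in the article, and SQLite (which has no stored procedures) stands in for a full database server, so the expression is run in a plain query.

```python
import sqlite3


def linear_model_to_sql(intercept, coefficients):
    """Render a linear model as a SQL scoring expression.

    `coefficients` maps column names to weights. Both arguments are
    hypothetical stand-ins for what a model-building tool might export.
    Column names are assumed to be trusted identifiers.
    """
    terms = [repr(float(intercept))]
    terms += [f"{float(w)!r} * {col}" for col, w in coefficients.items()]
    return " + ".join(terms)


# Hypothetical model: score = 1.5 + 2.0 * age - 0.5 * visits
score_sql = linear_model_to_sql(1.5, {"age": 2.0, "visits": -0.5})

# Illustrative data; the table and its columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, visits REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 30.0, 4.0), (2, 40.0, 10.0)],
)

# The model is evaluated by the database engine, next to the data.
rows = conn.execute(
    f"SELECT id, {score_sql} AS score FROM customers ORDER BY id"
).fetchall()
print(rows)
```

In a server that supports stored procedures, the generated expression would more likely be wrapped in a procedure that takes the input values as parameters, as the text describes.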

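The UDF types described under Types share one pattern: a function is registered with the engine and then called like a built-in in a SQL statement. The sketch below shows that pattern using `create_function` from Python's standard `sqlite3` module. It is an analogy, not the C or C++ mechanism of any particular database server; the `zscore` function and the `readings` table are hypothetical.

```python
import sqlite3


def zscore(value, mean, stddev):
    """Standardise a value; return None (SQL NULL) if it cannot be computed."""
    if value is None or not stddev:
        return None
    return (value - mean) / stddev


conn = sqlite3.connect(":memory:")

# Register the Python function under the SQL name "zscore" with 3 arguments.
conn.create_function("zscore", 3, zscore)

# Illustrative data; the table is an assumption made for this example.
conn.execute("CREATE TABLE readings (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None)],
)

# Once registered, the UDF is called like any built-in SQL function.
udf_rows = conn.execute(
    "SELECT id, zscore(value, 15.0, 5.0) FROM readings ORDER BY id"
).fetchall()
print(udf_rows)
```

Because the function here runs inside the same process as the SQLite engine, a misbehaving implementation could affect the engine itself, which is the trade-off the text describes for in-process UDFs.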
In-database processing: Difference between revisions