How are data transformations represented in PMML?

MichaelZeller

PMML supports several kinds of data transformations. Below, we list the most common ones, together with examples.

Data transformations involved in the pre-processing of the input variables/fields are mainly located inside the following PMML elements: TransformationDictionary and LocalTransformations.
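
To make the placement concrete, here is a minimal sketch in Python. It assumes a heavily abbreviated PMML skeleton: the namespace, the Header, the MiningSchema, and the bodies of the derived fields are omitted, and the choice of NeuralNetwork as the model element is arbitrary. TransformationDictionary sits at the top level of the document, while LocalTransformations sits inside an individual model. The helper name derived_fields_by_container is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative skeleton (not a complete, valid PMML file).
PMML = """
<PMML version="4.4">
  <DataDictionary>
    <DataField name="color" optype="categorical" dataType="string"/>
    <DataField name="units" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="derived_color" optype="continuous" dataType="double"/>
  </TransformationDictionary>
  <NeuralNetwork>
    <LocalTransformations>
      <DerivedField name="derived_units" optype="categorical" dataType="string"/>
    </LocalTransformations>
  </NeuralNetwork>
</PMML>
"""

def derived_fields_by_container(pmml):
    """Return the derived field names found under each transformation container."""
    root = ET.fromstring(pmml)
    found = {}
    for container in ("TransformationDictionary", "LocalTransformations"):
        for node in root.iter(container):
            names = [d.get("name") for d in node.findall("DerivedField")]
            found.setdefault(container, []).extend(names)
    return found

print(derived_fields_by_container(PMML))
```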

For the formal PMML schema definition of the transformations covered here, please refer to the PMML Transformations page on the DMG website.

Value Mapping

Value mapping can be used to map discrete values to discrete values. The example below shows how to map a categorical field (color) into a numerical field (derived_color).
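
A minimal sketch of such a mapping, written as a PMML fragment embedded in a short Python script that evaluates it. It assumes the MapValues element with an InlineTable; the column names in and out are arbitrary labels chosen for illustration, the namespace is omitted, and the helper map_value is hypothetical rather than part of any PMML library.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment; "in" and "out" are arbitrary column labels.
PMML = """
<DerivedField name="derived_color" optype="continuous" dataType="double">
  <MapValues outputColumn="out">
    <FieldColumnPair field="color" column="in"/>
    <InlineTable>
      <row><in>yellow</in><out>3</out></row>
      <row><in>white</in><out>1</out></row>
      <row><in>blue</in><out>6</out></row>
      <row><in>green</in><out>4</out></row>
    </InlineTable>
  </MapValues>
</DerivedField>
"""

def map_value(pmml, record):
    """Look up the input field's value in the InlineTable and return the mapped number."""
    map_values = ET.fromstring(pmml).find("MapValues")
    pair = map_values.find("FieldColumnPair")
    in_col = pair.get("column")
    out_col = map_values.get("outputColumn")
    for row in map_values.iter("row"):
        if row.findtext(in_col) == record[pair.get("field")]:
            return float(row.findtext(out_col))
    return None  # no matching row: treated as missing in this sketch

print(map_value(PMML, {"color": "blue"}))  # 6.0
```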


Note that in this example, we are mapping yellow to 3, white to 1, blue to 6, and green to 4.

The original input field named color needs to be defined in the DataDictionary element. The derived field, which is the result of our data transformation, is called derived_color. This field can subsequently be used by the model as an input variable or as input for another data transformation.

Discretization

Discretization is used to map continuous values to discrete values. The example below shows how to discretize a continuous field (units) into a discrete field (derived_units).
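
A minimal sketch of such a discretization, again as a PMML fragment evaluated by a few lines of Python. It assumes the Discretize element with one DiscretizeBin per interval and the closure values closedOpen and closedClosed; the bins follow the intervals [1,2[, [2,3[ and [3,100]. The choice of a string data type for the derived field, and the helpers in_interval and discretize, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (namespace omitted).
PMML = """
<DerivedField name="derived_units" optype="categorical" dataType="string">
  <Discretize field="units">
    <DiscretizeBin binValue="1">
      <Interval closure="closedOpen" leftMargin="1" rightMargin="2"/>
    </DiscretizeBin>
    <DiscretizeBin binValue="2">
      <Interval closure="closedOpen" leftMargin="2" rightMargin="3"/>
    </DiscretizeBin>
    <DiscretizeBin binValue="3">
      <Interval closure="closedClosed" leftMargin="3" rightMargin="100"/>
    </DiscretizeBin>
  </Discretize>
</DerivedField>
"""

def in_interval(x, interval):
    """Check x against an Interval, honoring open or closed margins."""
    closure = interval.get("closure")
    left = float(interval.get("leftMargin"))
    right = float(interval.get("rightMargin"))
    above = x >= left if closure.startswith("closed") else x > left
    below = x <= right if closure.endswith("Closed") else x < right
    return above and below

def discretize(pmml, record):
    """Return the binValue of the first bin whose interval contains the input."""
    disc = ET.fromstring(pmml).find("Discretize")
    x = record[disc.get("field")]
    for bin_ in disc.findall("DiscretizeBin"):
        if in_interval(x, bin_.find("Interval")):
            return bin_.get("binValue")
    return None  # value outside every bin: treated as missing in this sketch

print(discretize(PMML, {"units": 2.5}))  # 2
```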


In this example, each interval is mapped to a discrete value. More specifically, Discretize will transform [1,2[ to 1, [2,3[ to 2, and [3,100] to 3.

The new field is now called derived_units and can be used as input to another transformation or to the model itself.

Normalization

As specified on the DMG website, normalization provides a basic framework for mapping input values to specific value ranges, usually the numeric range [0 .. 1].

NormContinuous

Normalization is used, for example, in neural networks. In fact, if you export your neural network model using SPSS (starting with version 16), the generated PMML code will contain this kind of transformation for the neural inputs. The R PMML package will also generate a file containing the normalization of input variables for Support Vector Machines (SVMs). The example below was extracted from the Iris_SVM.xml file available on the Zementis website.


The PMML element NormContinuous can be used to implement simple normalization functions such as the z-score transformation (X - m) / s, where m is the mean value and s is the standard deviation.
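
As a separate minimal sketch of the z-score case (with a made-up field x, mean m = 5 and standard deviation s = 2, not the values from Iris_SVM.xml), two LinearNorm points are enough to define the line: the original value 0 maps to -m/s = -2.5 and the original value m maps to 0. The helper norm_continuous is hypothetical and only handles exactly two points, extending the line beyond them.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (namespace omitted); m = 5 and s = 2 are made up.
PMML = """
<DerivedField name="derived_x" optype="continuous" dataType="double">
  <NormContinuous field="x">
    <LinearNorm orig="0" norm="-2.5"/>
    <LinearNorm orig="5" norm="0"/>
  </NormContinuous>
</DerivedField>
"""

def norm_continuous(pmml, record):
    """Evaluate the straight line through the two LinearNorm points."""
    norm = ET.fromstring(pmml).find("NormContinuous")
    x = record[norm.get("field")]
    points = [(float(p.get("orig")), float(p.get("norm")))
              for p in norm.findall("LinearNorm")]
    (a1, b1), (a2, b2) = points  # this sketch supports exactly two points
    return b1 + (x - a1) * (b2 - b1) / (a2 - a1)

print(norm_continuous(PMML, {"x": 9}))  # (9 - 5) / 2 = 2.0
```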

NormDiscrete

The NormDiscrete element is used to implement the dummyfication of categorical or ordinal fields. For example, if you have a categorical variable called Marital with the following possible values: Absent, Divorced, Married, Married-spouse-absent, Unmarried, and Widowed, you may want these to be dummyfied (i.e. translated into 0s and 1s) for use by a neural network or SVM. The example below shows the use of element NormDiscrete to accomplish just that.
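
A minimal sketch of this fan-out for three of the Marital values (an illustration, not the verbatim Audit_SVM.xml listing). It assumes one DerivedField per category, each wrapping a NormDiscrete element; names other than derived_MaritalMarried follow the same pattern by assumption, and the helper dummy_encode is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (namespace omitted, only three categories shown).
PMML = """
<LocalTransformations>
  <DerivedField name="derived_MaritalDivorced" optype="continuous" dataType="double">
    <NormDiscrete field="Marital" value="Divorced"/>
  </DerivedField>
  <DerivedField name="derived_MaritalMarried" optype="continuous" dataType="double">
    <NormDiscrete field="Marital" value="Married"/>
  </DerivedField>
  <DerivedField name="derived_MaritalWidowed" optype="continuous" dataType="double">
    <NormDiscrete field="Marital" value="Widowed"/>
  </DerivedField>
</LocalTransformations>
"""

def dummy_encode(pmml, record):
    """Return 1.0 for the derived field whose value matches the input, 0.0 otherwise."""
    encoded = {}
    for derived in ET.fromstring(pmml).findall("DerivedField"):
        nd = derived.find("NormDiscrete")
        match = record[nd.get("field")] == nd.get("value")
        encoded[derived.get("name")] = 1.0 if match else 0.0
    return encoded

print(dummy_encode(PMML, {"Marital": "Married"}))
```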


The set of NormDiscrete instances that refer to the input field Marital defines a fan-out function, which maps a single input field to a set of normalized fields. Note that if Marital is equal to Married, the field derived_MaritalMarried will be assigned a value equal to 1.0 and all other derived_MaritalX fields shown will be assigned values equal to 0.

This code was extracted from the Audit_SVM.xml file available on the Zementis website. It is automatically exported by the R PMML package for SVMs built using the R ksvm (kernlab) package.

Functions

PMML offers several built-in functions, all of which are supported by ADAPA. The list is as follows:

1. +, -, * and /
2. min, max, sum and avg
3. log10, ln, sqrt, abs, exp, pow, threshold, floor, ceil, round
4. uppercase
5. substring
6. trimBlanks
7. formatNumber
8. formatDatetime
9. dateDaysSinceYear
10. dateSecondsSinceYear
11. dateSecondsSinceMidnight

You can find several examples of the use of such functions on the DMG website.

Note that functions such as min, max, sum and avg take a variable number of parameters (derived fields or input fields) and return a single value which you would then assign to a new derived field.
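
A minimal sketch of such a call, assuming the Apply element with FieldRef children as arguments. The field names a, b and c, the derived field name derived_max, and the helper apply_function are hypothetical, and only the four aggregate functions mentioned above are covered.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (namespace omitted); a, b and c are made-up fields.
PMML = """
<DerivedField name="derived_max" optype="continuous" dataType="double">
  <Apply function="max">
    <FieldRef field="a"/>
    <FieldRef field="b"/>
    <FieldRef field="c"/>
  </Apply>
</DerivedField>
"""

# Python stand-ins for the four variable-arity functions discussed above.
FUNCTIONS = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}

def apply_function(pmml, record):
    """Collect the referenced field values and apply the named function to them."""
    apply = ET.fromstring(pmml).find("Apply")
    args = [record[ref.get("field")] for ref in apply.findall("FieldRef")]
    return FUNCTIONS[apply.get("function")](args)

print(apply_function(PMML, {"a": 2.0, "b": 7.0, "c": 4.0}))  # 7.0
```

Swapping function="max" for min, sum or avg in the fragment changes the result accordingly; the single returned value is what would be assigned to the new derived field.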
