Automation of Data Mining Using Integration Services

SQL Server Technical Article
Writer: Jeannine Takaki
Technical Reviewer: Raman Iyer
Published: July 2011
Applies to: SQL Server 2008 R2, SQL Server 2008, SQL Server 2005
Summary: This article is a walkthrough that illustrates how to build multiple related data mining models
by using the tools that are provided with Microsoft SQL Server Integration Services. In this
walkthrough, you will learn how to automatically build and process multiple data mining models
based on a single mining structure, how to create predictions from all related models, and how
to save the results to a relational database for further analysis. Finally, you will view and compare
the predictions, historical trends, and model statistics in SQL Server Reporting Services reports.
Copyright
This document is provided “as-is”. Information and views expressed in this document, including URL and
other Internet Web site references, may change without notice. You bear the risk of using it.
Some examples depicted herein are provided for illustration only and are fictitious. No real association
or connection is intended or should be inferred.
This document does not provide you with any legal rights to any intellectual property in any Microsoft
product. You may copy and use this document for your internal, reference purposes.
© 2011 Microsoft. All rights reserved.
Contents

Introduction
Automating the Creation of Data Mining Models
Solution Walkthrough
    Scope
    Overall Process
Phase 1 - Preparation
    Create the Forecasting mining structure and default time series mining model
    Extract and edit the XMLA statement
    Prepare the replacement parameters
Phase 2 - Model Creation
    Create package and variables (CreateParameterizedModels.dtsx)
    Configure the Execute SQL task (Get Model Parameters)
    Configure the Foreach Loop container (Foreach Model Definition)
Phase 3 - Process the Models
    Create package and variables (ProcessEmptyModels.dtsx)
    Add a Data Mining Query task (Execute DMX Query)
    Create Execute SQL task (List Unprocessed Models)
    Create a Foreach Loop container (Foreach Model in Variable)
    Add an Analysis Services Processing task to the Foreach Loop (Process Current Model)
    Add a Data Mining Query task after the Foreach Loop (Update Processing Status)
Phase 4 - Create Predictions for All Models
    Create package and variables (PredictionsAllModels.dtsx)
    Create Execute SQL task (Get Processed Models)
    Create Execute SQL task (Get Series Names)
    Create Foreach Loop container (Predict Foreach Model)
    Create variables for the Foreach Loop container
    Create the Data Mining Query task (Predict Amt)
    Create the second Data Mining Query task (Predict Qty)
    Create Data Flow tasks to Archive the Results
    Create Data Flow tasks (Archive Results Qty, Archive Results Amt)
    Run, Debug, and Audit Packages
Phase 5 - Analyze and Report
    Using the Data Mining Viewer
    Using Reporting Services for Data Mining Results
    Interpreting the Results
Discussion
    The Case for Ensemble Models
    Closing the Loop: Interpreting and Getting Feedback on Models
Conclusion
Resources
Acknowledgements
Code for Script Task
Introduction
This article is a walkthrough that illustrates how to use the data mining tools that are provided
with Microsoft SQL Server Integration Services. If you are an experienced data miner, you
probably already use the tools provided in Business Intelligence Development Studio or the
Data Mining Client Add-in for Microsoft Excel for building or browsing mining models. However,
Integration Services can help you automate many of these processes.
This solution also introduces the concept of ensemble models for data mining, which are sets of
multiple related models. For most data mining projects, you need to create several models,
analyze the differences, and compare outputs before you can select a best model to use
operationally. Integration Services provides a framework within which you can easily generate
and manage ensemble models.
In this series, you will learn how to:
• Configure the Integration Services components that are provided for data mining.
• Automatically build and update mining models by using Integration Services.
• Store mining model parameters and prediction results in the database engine.
• Integrate reporting requirements in the model design workflow.
Note that these are just a few of the ways that you can use Integration Services to incorporate
data mining into analytic and data handling workflows. Hopefully these examples will help you
get more mileage out of existing installations of Integration Services and SQL Server Analysis
Services.
Automating the Creation of Data Mining Models
This scenario positions you as an analyst who has been tasked with creating some projections
based on past sales data. You are unsure about how to configure the time series algorithm for
best results (ARIMA? ARTXP? What hints to provide?). Moreover, you know that the modeling
process typically involves building several models and testing different scenarios.
Rather than build variations on the model ad hoc, you decide to automatically generate multiple
related models, varying the parameters systematically for each model. This way you can easily
create many models, each using a different combination of periodicity hints and algorithm type.
After you have created and processed all the models, you will put the historical data plus the
predictions for each model into a series of reports to see which models provide good results.
This walkthrough demonstrates these features:
• Automatically building multiple mining models using parameters stored in SQL Server
• Generating bulk predictions for each model and storing them in SQL Server
• Comparing trends from the models by putting them side by side in reports
Solution Walkthrough
This section describes the complete solution that builds multiple models and creates queries
that return predictions from each model. It contains these parts:
[1] Analysis Services project: To create this project, follow the instructions in the Data Mining
tutorial on MSDN (http://msdn.microsoft.com/en-us/library/ms169846.aspx) to create a
Forecasting mining structure and default time series mining model.
[2] Integration Services project: You will create a new project, containing multiple packages:
• A package that builds multiple models, using the Analysis Services Execute DDL task
• A package that processes multiple models, using the Analysis Services Processing task
• A package that creates predictions from all models, using the Data Mining Query task
Scope
The following Integration Services tasks and components are used in this walkthrough. For
more information from SQL Server Books Online, click on the link in the Task or component
column.
Task or component                      Used for
Execute SQL task                       Gets variable values, and creates tables to store results
Analysis Services Execute DDL task     Creates individual models
Analysis Services Processing task      Populates the models with data
Foreach Loop container                 Builds and processes multiple data mining models
Script task                            Builds the required XMLA commands
Data Mining Query task                 Creates predictions from each model
Data Flow task                         Manages and merges prediction results
OLE DB source                          Gets data from temporary prediction table
OLE DB destination                     Writes predictions to permanent table
Derived Column transformation          Adds metadata about predictions
Even though the following Integration Services components are also very useful for data mining,
they are not used in this walkthrough—look for examples in a later paper:
• Data Profiling task
• Conditional Split transformation
• Percentage Sampling transformation
• Lookup and Fuzzy Lookup transformations
• Data Mining Training destination
Note: The SQL Server Reporting Services project containing the reports that compare models
is not included here, even though this project generates all the data required for the reports.
That is because the report creation process is somewhat lengthy to describe, especially if you
are not familiar with Reporting Services. Moreover, since all the prediction data is stored in the
relational database, there are other reporting clients you can use, including Microsoft
PowerPivot for Excel and Project Crescent. However, we hope to describe the process in a
separate article later on the TechNet Wiki
(http://social.technet.microsoft.com/wiki/contents/articles/default.aspx).
Overall Process
Phase 1 - Preparation: The definition of the models you want to create is stored in SQL Server
as a set of parameters, values, and model names.
Phase 2 – Model creation: Integration Services retrieves the model definitions and passes the
parameter values to a Foreach Loop that builds and then executes the XML for Analysis (XMLA)
statement for each model.
Phase 3 – Model processing: Integration Services retrieves a list of available models, and then
it processes each model by populating it with data.
Phase 4 - Prediction: Integration Services issues a prediction query to each processed model.
Each set of predictions is saved to a SQL Server table.
Phase 5 – Reporting and analysis: The prediction trends for each model are compared by
using reports (created by Reporting Services, PowerPivot, or your favorite reporting client) using
the data in the relational table.
Phase 1 - Preparation
In this phase, you set up the structure, sample data, and parameters your packages will use.
Before you build the working packages, you need to complete the following tasks:
• Create the Forecasting mining structure used by all mining models.
• Generate the sample XMLA that represents the default time series mining model, to use as a template.
• Create a table that stores replacement parameters for the new models, and then insert the parameter values.
The following section describes how to perform these tasks.
Create the Forecasting mining structure and default time series mining model
To create multiple models based on a single mining structure, you need to create the
Forecasting mining structure first. Based on that mining structure, you also need to create a
time series model that can be used as the template for generating other models. If you do not
already have a mining structure capable of supporting a time series model, you can build one by
following the steps described in the Microsoft Time Series tutorial
(http://msdn.microsoft.com/en-us/library/ms169846.aspx) in SQL Server Books Online.
Extract and edit the XMLA statement
Next, use the time series model from the previous step to extract an XMLA statement that will
serve as a template for all other models.
You can get the XMLA statement for any model or structure by using the scripting options in
SQL Server Management Studio:
1. In SQL Server Management Studio, right-click the time series model.
2. Click Script Mining Model as.
3. Save the XMLA to a text file.
4. Open the text file in Notepad or another plain-text editor.
After you generate XMLA for the default time series model by using the script option, it looks
like the following code. (The XMLA statement for models can be lengthy, so only an excerpt
is shown here.) The XMLA statement always includes the database, the mining structure,
metadata such as the model name, and the algorithm used for analysis. It can optionally
include multiple parameters.
<ParentObject>
  <DatabaseID>Forecasting Models</DatabaseID>
  <MiningStructureID>Forecasting</MiningStructureID>
</ParentObject>
<ObjectDefinition>
  <MiningModel>
    <ID>ARIMA_1-10-30</ID>
    <Name>ARIMA_1-10-30</Name>
    <Algorithm>Microsoft_Time_Series</Algorithm>
    <AlgorithmParameters>
      <AlgorithmParameter>
        <Name>FORECAST_METHOD</Name>
        <Value xsi:type="xsd:string">ARIMA</Value>
      </AlgorithmParameter>
      <AlgorithmParameter>
        <Name>PERIODICITY_HINT</Name>
        <Value xsi:type="xsd:string">{1,10,30}</Value>
      </AlgorithmParameter>
    </AlgorithmParameters>
    <Columns>
      <!-- column definitions omitted from this excerpt -->
    </Columns>
  </MiningModel>
</ObjectDefinition>
Next, make the following changes to the command text that you extracted:
• Add the parameters that you want to change, if they are not already present in the model. Default parameter values are not necessarily part of the scripted XMLA output, so if your base model does not explicitly set any parameters, you will need to add the XMLA section that contains them.
• Remove unnecessary white space and all line breaks. For this walkthrough, the XMLA is stored as a string in a variable, which cannot contain line breaks. If you leave in any line breaks, the problem is not detected at package validation; instead, at run time, the Analysis Services engine attempts to execute the XMLA and fails with an error.
To clean up the file, use your favorite text editor. White spaces such as tabs and multiple
space characters are okay but you can remove them if you like, to shorten the string
variable. There is no limit on the size of string variables, but there is a 4,000-character
limit in the expression editor.
5. If your model does not already contain the parameters FORECAST_METHOD and
PERIODICITY_HINT, use the code listed earlier, and copy the XML node that begins
with <AlgorithmParameters> and ends with </AlgorithmParameters>. Paste it into the
text file containing the XMLA command, directly below the line that defines the algorithm,
and before the section that defines the columns.
6. Edit the entire XMLA statement to remove line breaks. You can use any text editor that
you like, so long as you verify that the result is a single line of text.
Prepare the replacement parameters
To create new models, you must update the basic XMLA command that you just created by
inserting different values for the parameters. Among the parameters you must update are the
model names and the model ID. Before you do this, you may find it helpful to review the format
of the parameters you will change:
• FORECAST_METHOD – Can have the values MIXED (default), ARIMA, or ARTXP.
• PERIODICITY_HINT – Can have any combination of numbers separated by commas and enclosed by curly braces.
• MODEL_ID – Must be unique for each model you create, or an error will be generated.
• MODEL_NAME – Should match the MODEL_ID; optional, but having them match makes the process easier to understand.
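For example, for the model ARTXP_1-7-10 that is defined in the parameters table later in this section, the parameter section of the generated XMLA would end up looking like this (line breaks shown here for readability only):

<AlgorithmParameters>
  <AlgorithmParameter>
    <Name>FORECAST_METHOD</Name>
    <Value xsi:type="xsd:string">ARTXP</Value>
  </AlgorithmParameter>
  <AlgorithmParameter>
    <Name>PERIODICITY_HINT</Name>
    <Value xsi:type="xsd:string">{1,7,10}</Value>
  </AlgorithmParameter>
</AlgorithmParameters>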
Integration Services is extremely flexible, so there are many different ways to store the
parameters and insert them into the model XMLA. For example, you could:
• Store the parameters as text in a SQL Server table, and then insert them into the XMLA within a Foreach Loop, by using an ADO.NET iterator.
• Save the XMLA command as a text file and read it using a Flat File connection. Save the variables in a configuration file and apply them at run time.
• Save the XMLA command as an .xml file, and then read it into a package by using an XML source. Insert the variables into the XML by using the properties and methods of the XML task.
• Create multiple XMLA files in advance, and then read the files with a combination of a Foreach Loop and an XML source connection.
However, for this scenario, you need to be able to easily add new sets of parameters, and to
view and update the complete list of models and parameters.
Therefore, you’ll use the first method: create the parameter-value pairs as records in a SQL
Server database, and then read in the new values at run time by using a Foreach Loop
container. This way, you can easily view or update the parameters by using SQL queries.
Run the following statement to create the parameters table.
USE [DMReporting] -- substitute your database name here
GO
/****** Object: Table [dbo].[ModelParameters] Script Date: 11/09/2010 10:56:26 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ModelParameters](
    [RecordID] [int] IDENTITY(1,1) NOT NULL,
    [RecordDate] [datetime] NULL,
    [ModelID] [nvarchar](50) NULL,
    [ModelName] [nvarchar](50) NULL,
    [ForecastMethod] [nvarchar](50) NULL,
    [PeriodicityHint] [nvarchar](50) NULL
) ON [PRIMARY]
GO
The following table lists the parameters that are used to build models in this walkthrough. Insert
these values into the parameters table you created by using the script.
ModelID          ForecastMethod   PeriodicityHint
ARIMA_1-7-10     ARIMA            {1,7,10}
ARIMA_1-10-30    ARIMA            {1,10,30}
ARIMA_nohints    ARIMA            {1}
MIXED_1-7-10     MIXED            {1,7,10}
MIXED_1-10-30    MIXED            {1,10,30}
MIXED_nohints    MIXED            {1}
ARTXP_1-7-10     ARTXP            {1,7,10}
ARTXP_1-10-30    ARTXP            {1,10,30}
ARTXP_nohints    ARTXP            {1}
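For example, the first row can be loaded with a statement like the following (a sketch; repeat it for the remaining rows, keeping ModelName identical to ModelID as recommended earlier).

INSERT INTO dbo.ModelParameters
    (RecordDate, ModelID, ModelName, ForecastMethod, PeriodicityHint)
VALUES
    (GETDATE(), N'ARIMA_1-7-10', N'ARIMA_1-7-10', N'ARIMA', N'{1,7,10}')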
This scenario uses the parameters FORECAST_METHOD and PERIODICITY_HINT because
they are among the most important parameters for time series models (also because they are
string values and easy to change!).
However, the parameters that you change will be completely different for other algorithms. For
example, if you build a clustering model, you might decide to change the
CLUSTERING_METHOD parameter and build models using each of the four clustering
methods, such as K-Means. You might also try altering the MINIMUM_SUPPORT parameter, or
trying a variety of cluster seeds. For a list of the parameters provided by the different algorithms,
see the algorithm technical reference topics (http://msdn.microsoft.com/en-us/library/cc280427.aspx) in MSDN.
Important note for data miners: Altering parameter values can strongly affect the model
results. Therefore, you should have some sort of plan for analyzing the results and weeding out
badly fit models.
For example, because the time series algorithm is very sensitive to periodicity hints, it can
produce poor results if you provide the wrong hint. If you specify that the data contains weekly
cycles and it actually contains monthly cycles, the algorithm attempts to fit the data to the
suggested weekly cycle and might produce odd results. Some of the models generated by this
automation process demonstrate this behavior.
There are many ways that you can check the validity of models:
• Use descriptive statistics and metadata for the individual models to eliminate models that have characteristics of overfitting or poor fit.
• Validate data sets and models by using cross-validation or one of the other accuracy measures provided by SQL Server. For more information, see Validating Data Mining Models (http://technet.microsoft.com/en-us/library/ms174493.aspx) in SQL Server Books Online.
• Choose only parameters and values that make sense for the business problem; use business rules to guide your modeling.
This completes the preparations, and you can now build the three packages.
The instructions for each package begin with a diagram that illustrates the package workflow
and briefly describes the package components.
The diagram is followed by steps that you can follow to configure each task or destination.
Phase 2 - Model Creation
In Phase 2, you build a package that creates many mining models, using the template and
parameters you prepared in Phase 1. From among these models, you can later choose the one
that best suits your needs. This package includes the following tasks:
• The initial Execute SQL task, which gets the model names and model parameters from the relational database and then stores the results of that query in an ADO.NET rowset variable
• A Foreach Loop container, which iterates over the values stored in the ADO.NET rowset variable and then passes the new model names and model parameters, one at a time, to the Script task and the Analysis Services Execute DDL task inside the Foreach Loop
• The Script task, which loads the base XMLA statement from a variable, inserts the new parameters, and then writes out the new XMLA to another variable
• The Analysis Services Execute DDL task, which executes the updated XMLA statement contained in the variable
Create package and variables (CreateParameterizedModels.dtsx)
1. Create a new Integration Services package and name it
CreateParameterizedModels.dtsx.
2. Set up the variable that stores the model parameters. The variable name should be
User::objAllModelParameters, and the variable should have the type Object. Make sure
that the variable has package scope, because it passes values from the query results,
returned by the Get Model Parameters (Execute SQL task), to a Foreach Loop.
Configure the Execute SQL task (Get Model Parameters)
1. Add an Execute SQL task to the package you just created and name it Get Model
Parameters.
2. In the Execute SQL Task Editor, specify the database that stores the model
parameters. This walkthrough uses an OLE DB connection manager.
3. For the SQLSourceType property, specify Direct input, and then paste in the following
query.
SELECT ModelID AS MODEL_ID,
ModelID AS MODEL_NAME,
ForecastMethod AS FORECAST_METHOD,
PeriodicityHint AS PERIODICITY_HINT
FROM dbo.ModelParameters
13
4. For the ResultSet property, choose Full result set. This enables you to store a multi-row
result in a variable.
5. In the Result Set pane, for Result Name, type 0 (zero), and then assign the variable
User::objAllModelParameters.
Configure the Foreach Loop container (Foreach Model Definition)
Now that you have loaded a set of parameters into an object variable, you pass the variable to a
Foreach Loop container, which then performs an operation for each row of data:
1. Create a Foreach Loop container, and name it Foreach Model Definition.
2. Select the Foreach Loop container and then open the Variables window, to create the
following variables with the data types specified, scoped to the Foreach Loop container:
User::strBaseXMLA         String
User::strModelID          String
User::strModelName        String
User::strForecastMethod   String
User::strPeriodicityHint  String
User::strModelXMLA        String
3. Open the Foreach Loop Editor. For Collection, choose the Foreach ADO enumerator.
4. In the Foreach Loop Editor, configure the enumerator by choosing the variable
User::objAllModelParameters in the ADO object source variable drop-down list. Do not
change the default enumeration mode – you are only passing in one table, so the default
setting, Rows in the first table, is correct.
5. Click Variable Mappings, and then map each variable to the index of the corresponding
column in the parameter result set, as follows:

User::strModelID          0
User::strModelName        1
User::strForecastMethod   2
User::strPeriodicityHint  3
6. Add a Script task inside the Foreach Loop container, and name it Update Model XMLA.
7. In the Script Task Editor, specify the properties of the variables as follows:
User::strBaseXMLA         Read-only
User::strModelID          Read-only
User::strModelName        Read-only
User::strForecastMethod   Read-only
User::strPeriodicityHint  Read-only
User::strModelXMLA        Read/write
8. Click Edit Script to add the code that replaces the string for each variable value.
Note: The code for this task is included in the Appendix. The Script task performs a
simple operation: it finds the default values in the basic XMLA and inserts the new
parameter values by doing string replacement. You could also do this by using regular
expressions or XML methods, of course.
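For orientation, the replacement logic amounts to something like the following C# sketch; the appendix contains the actual code used in this solution. The sketch assumes the template model from Phase 1 (ARIMA_1-10-30) and the variable names from step 7.

public void Main()
{
    // Start from the single-line template XMLA prepared in Phase 1.
    string xmla = (string)Dts.Variables["User::strBaseXMLA"].Value;

    // Replace the template model ID and name first, so that the later
    // replacement of the FORECAST_METHOD value cannot alter them.
    // (ID and Name are identical in the template, so one pass covers both.)
    xmla = xmla.Replace("ARIMA_1-10-30",
        (string)Dts.Variables["User::strModelID"].Value);
    xmla = xmla.Replace(">ARIMA<",
        ">" + (string)Dts.Variables["User::strForecastMethod"].Value + "<");
    xmla = xmla.Replace("{1,10,30}",
        (string)Dts.Variables["User::strPeriodicityHint"].Value);

    // Write the customized XMLA to the read/write variable.
    Dts.Variables["User::strModelXMLA"].Value = xmla;
    Dts.TaskResult = (int)ScriptResults.Success;
}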
9. Add an Analysis Services Execute DDL task inside the Foreach Loop container and name
it Execute Model XMLA.
10. For the Connection property, specify the instance of Analysis Services where your
models are stored.
11. On the DDL tab, for SourceType, choose Variable, and then select the variable
User::strModelXMLA.
This completes the package that creates the models. You can now execute just this package by
right-clicking the package in Solution Explorer and then clicking Execute Package.
After the package runs, if there are no errors, you can connect to your Analysis Services
database by using SQL Server Management Studio and see the list of new models that were
created. However, you cannot browse the models or build prediction queries yet. That is
because the models are just metadata until they are processed, and they contain no data or
patterns. In the next package, you will process the models.
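If you prefer a query to browsing the object tree, you can run a quick check such as the following DMX statement against the Analysis Services database. It is a sketch using the same schema rowset that is queried in Phase 3; IS_POPULATED should be false for the new, unprocessed models.

SELECT MODEL_NAME, IS_POPULATED
FROM $system.DMSCHEMA_MINING_MODELS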
Phase 3 - Process the Models
This package gets a list of valid models from the Analysis Services server, and then it
processes the models using the Analysis Services Processing task.
Until processed, the model that you created by using XMLA is just a definition: a collection of
metadata that defines parameters and data source bindings. Processing gets the data from the
Forecasting data source, and then it generates patterns based on the algorithm you specified.
(For more information about the architecture of mining models, and about processing, see
Mining Structures (http://msdn.microsoft.com/en-us/library/ms174757.aspx) in the MSDN
Library.)
In summary, this is how the package handles processing:
• The first Data Mining Query task issues a query to get a list of valid models. That list is written to a temporary table in the relational engine.
• The next task, an Execute SQL task, retrieves the model names from that table and then puts them in an ADO.NET rowset variable.
• The Foreach Loop container takes the ADO.NET rowset variable contents as input, and then it processes each model serially, using the embedded Analysis Services Processing task.
• Finally, you update the status of the models.
Create package and variables (ProcessEmptyModels.dtsx)
1. Create a new Integration Services package and name it ProcessEmptyModels.dtsx.
2. With the package background selected, add a user variable, objModelsList. This
variable will have package scope and will be used to store the list of models that are
available on the server.
Add a Data Mining Query task (Execute DMX Query)
1. Create a new Data Mining Query task and name it Execute DMX Query.
2. In the Data Mining Query Task Editor, on the Mining Model tab, specify the Analysis
Services database that contains the time series mining models.
3. For Mining Structure, choose Forecasting.
4. Click the Query tab. Here, instead of creating a prediction query, paste in the following
text of a content query. A content query returns metadata about the model and data
already stored in the model in the form of summary statistics.
SELECT MODEL_NAME, IS_POPULATED, LAST_PROCESSED, TRAINING_SET_SIZE
FROM $system.DMSCHEMA_MINING_MODELS
Not all of these columns are needed for processing, but you can add the columns now
and update the information later.
5. On the Output tab, for Connection, select the relational database where you will store
the results. For this solution, it is <local server name>/DMReporting.
6. For Output table, type a temporary table name (in this solution, tmpProcessingStatus)
and then select the option Drop and re-create the output table.
Create Execute SQL task (List Unprocessed Models)
1. Add a new Execute SQL Task, and name it List Unprocessed Models. Connect it to the
previous task.
2. In the Execute SQL Task Editor, for Connection, use an OLE DB connection, and then
choose the server and database: for example, <local server name>/DMReporting.
3. For Result set, select Full result set.
4. For SQLSourceType, select Direct input.
5. For SQL Statement, type the following query text.
SELECT MODEL_NAME FROM tmpProcessingStatus
6. On the Result Set tab, assign the columns in the result set to variables. There is only
one column in the result set, so you assign the variable, User::objModelsList, to
ResultSet 0 (zero).
Create a Foreach Loop container (Foreach Model in Variable)
By now you should be pretty comfortable with using the combination of an Execute SQL task
and a Foreach Loop container.
1. Create a new Foreach Loop container and name it Foreach Model in Variable. Connect
it to the previous task.
2. With the Foreach Loop container selected, open the Variables window, and then add
three variables scoped to the Foreach Loop container. The latter two variables work
together: you store the processing command template in one variable, and then you use
an Integration Services expression to alter the text and save the changes to the second
variable:
strModelName1     String
strXMLAProcess1   String
strXMLAProcess2   String
3. In the Foreach Loop Editor, set the enumerator type to ForEach ADO enumerator.
4. In the Enumerator configuration pane, set ADO object source variable to
User::objModelsList.
5. Set Enumeration mode to Rows in the first table.
6. In the Variable mappings pane, assign the variable User::strModelName1 to Index 0.
This means that each row of the single-column table returned by the query will be fed
into the variable.
Add an Analysis Services Processing task to the Foreach Loop (Process
Current Model)
The editor for this task requires that you first connect to an Analysis Services database and then
choose from a list of objects that can be processed. However, because you need to automate
this task, you can’t use the interface to choose the objects to process. So how do you iterate
through a list of objects for processing?
The solution is to use an expression to alter the contents of the property,
ProcessingCommand. You use the variable, strXMLAProcess1, which you set up earlier, to
store the basic XMLA for processing a model, but you insert a placeholder that you can modify
later when you read the variable. You alter the command using an expression and write the new
XMLA out to a second variable, strXMLAProcess2.
1. Drag a new Analysis Services Processing task into the Foreach Loop container you just
created. Name it Process Current Model.
2. With the Foreach Loop selected, open the Variables window, and then select the
variable User::strXMLAProcess2.
3. In the Properties pane, select Evaluate as expression and set it to True.
4. For the value of the variable, type or build this expression.
REPLACE( @[User::strXMLAProcess1], "ModelNameHere", @[User::strModelName1] )
5. In the Analysis Services Processing Task Editor, click Expressions, and then
expand the list of expressions.
6. Select ProcessingCommand and then type the variable name as follows:
@[User::strXMLAProcess2]
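For reference, the processing template stored in strXMLAProcess1 might look like the following sketch. It assumes the database and mining structure IDs used in Phase 1 and full processing; as with the model-creation XMLA, the variable value must be a single line (line breaks are shown here for readability only).

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Forecasting Models</DatabaseID>
    <MiningStructureID>Forecasting</MiningStructureID>
    <MiningModelID>ModelNameHere</MiningModelID>
  </Object>
  <Type>ProcessFull</Type>
</Process>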
Another way to train the model would be to add a processing task within the same Foreach
Loop that you used to create the model. However, there are good reasons to build and process
the models in separate packages. For example:


Processing can be time-consuming, and it depends on connections to source data.
It is easier to debug problems when model creation and processing are in separate
packages.
Moreover, the Data Mining Query task that is provided in the Control Flow can be used to
execute many different types of queries against an Analysis Services data source. You can use
schema rowset queries within this task to get information about other Analysis Services objects,
including cubes and tabular models, or even run Data Mining Extensions (DMX) DDL
statements. (In contrast, the Data Mining Query Transformation component, available in the
Data Flow, can only be used to create predictions against an existing mining model.)
The final step in this phase is to add a task that updates the status of your mining models.
Add a Data Mining Query task after the Foreach Loop (Update Processing
Status)
This task uses the Data Mining Query task to get the updated status of the mining models, and
write that to a relational data table.
1. Right-click the Data Mining Query task you created before, because it has all the right
connections and the correct query text, and then click Copy.
2. Paste the task after the Foreach Loop container and connect it to the loop.
3. Rename the task Update Processing Status.
4. Open the Data Mining Query Task Editor, click the Output tab, and verify that the
option Drop and re-create the output table is selected.
This completes the package. You can now execute this package as before.
When you execute this package, the actual processing of each model can take a fairly long
time, depending on how many models are available. You might want to add logging to the
package to track the time used for processing each model.
Phase 4 - Create Predictions for All Models
In this package, you create prediction queries, using the Data Mining Query task, and run the
queries for each of the models that you just created and processed:
• You first use an Execute SQL task to query the list of models in the database, and save that list in a variable.
• The Foreach Loop then uses the variable to customize the prediction targets, by inserting the model name from the variable into the two Data Mining Query tasks.
• You write the prediction results to a table in the relational database, using a pair of Data Flow tasks, which also write out some useful metadata.
Create package and variables (PredictionsAllModels.dtsx)
1. Create a new Integration Services package and name it PredictionsAllModels.dtsx.
2. Create a variable with package scope as follows:
objProcessedModels Object
Create Execute SQL task (Get Processed Models)
1. Create a new Execute SQL task and name it Get Processed Models.
2. In the Execute SQL Task Editor, for Connection, use an OLE DB connection, and for
the server, type <local server name>/DMReporting.
3. For Result set, select Full result set.
4. For SQLSourceType, select Direct input.
5. For SQL Statement, type the following query text. (The purpose of adding the WHERE
condition is to ensure that you do not create a prediction against a model that has not
been processed, which would generate an error).
SELECT MODEL_NAME FROM dbo.tmpProcessingStatus
WHERE LAST_PROCESSED IS NOT null
6. On the Result Set tab, assign the columns in the result set to variables. There is only
one column in the result set, so you assign the variable User::objProcessedModels to
ResultSet 0 (zero).
Tip: When you are working with data mining models and especially when you are building
complex queries, we recommend that you build DMX queries beforehand by opening the model
directly in Business Intelligence Developer Studio and using Prediction Query Builder, or by
launching Prediction Query Builder from SQL Server Management Studio. The reason is that
when you build queries by using the data mining designers in SQL Server Management Studio
or Business Intelligence Developer Studio, Analysis Services does some validation, which
enables you to browse and select valid objects. However, the Query Builder provided in the
Data Mining Query task does not have this context and cannot validate or help with your
selections.
Create Execute SQL task (Get Series Names)
This task is not strictly necessary for prediction, but it creates data that is useful later for
reporting.
Recall that the time series data mining model is based on sales data for different product lines in
different regions, with each combination of a product line plus a region making a single series.
For example, you can predict sales for the M200 product in Europe or the M200 product in
North America. Here, the series name is extracted and stored in a table, making it easier to
group and filter the predictions later in Reporting Services:
1. Add a new Execute SQL task and name it Get Series Names.
2. Connect it to the previous task.
3. In the Execute SQL Task Editor, choose an OLE DB connection, and for Connection,
type <local server name>/DMReporting.
4. For Result set, select None.
5. For SQLSourceType, select Direct input.
6. For SQL Statement, type the following query text.
IF EXISTS
(SELECT [ModelRegion] FROM DMReporting.dbo.tmpModelRegions)
BEGIN
TRUNCATE TABLE DMReporting.dbo.tmpModelRegions
END
INSERT DMReporting.dbo.tmpModelRegions
SELECT DISTINCT [ModelRegion]
FROM AdventureWorksDW2008R2.dbo.vTimeSeries
Create Foreach Loop container (Predict Foreach Model)
In this Foreach Loop, you create two pairs of tasks: a prediction query plus data flow to
handle the results, one for Amount and one for Quantity.
You might ask, why generate the results for Amount and Quantity separately, when the
Prediction Query Builder allows you to predict both at once?
The reason is that the data mining query returns a nested rowset for each series you
predict, but the providers in Integration Services can work only with flattened rowsets. If you
predict both Amount and Quantity in one query, the rowset contains many nulls when
flattened. Rather than try to remove the nulls and sort out the results, it is easier to generate
a separate set of results and then combine them later in the Integration Services data flow.
1. Create a new Foreach Loop container and name it Predict Foreach Model. Connect it to
the task, Get Processed Models.
2. With the Foreach Loop container selected, open the Variables window and create a new
variable scoped to the task, as follows:
strModelName String
3. Return to the Foreach Loop Editor, and set the enumerator type to ForEach ADO
enumerator.
4. In the Enumerator configuration pane, set ADO object source variable to
User::objProcessedModels.
5. Set Enumeration mode to Rows in the first table.
6. In the Variables mapping pane, assign the variable User::strModelName to Index 0.
Each row of the single-column table returned by the query is fed into the variable
strModelName, which is in turn used to update the prediction query in the next set of
tasks.
Create variables for the Foreach Loop container
Much of the work in this package is done by the variable assignments. You create one set of
variables that store the text of the prediction queries, and then you insert the name of the
mining models by looking it up in another variable, strModelName. This illustrates a useful
technique in Integration Services: updating the contents of a variable by using an expression
as the variable definition.
1. With the Foreach Loop selected, create four new variables:
strQueryBaseAmount   String
strQueryBaseQty      String
strPredictAmt        String
strPredictQty        String
2. For the value of variable strQueryBaseAmount, type the following query.
SELECT FLATTENED 'ModelNameHere' as [Model Name],
[ModelNameHere].[Model Region] as [Model and Region],
(SELECT $TIME as NewTime,
Amount as NewValue,
PredictStDev([Amount]) as ValueStDev,
PredictVariance([Amount]) as ValueVariance
FROM PredictTimeSeries([ModelNameHere].[Amount],10) )
AS Predictions
FROM [ModelNameHere]
Important: The query here is formatted for readability, but the query will not work if you
copy and paste these statements into the variable as is. You must copy the statement
into a text editor first and remove all line breaks. Unfortunately the Integration Services
editors do not detect line breaks or raise any errors while you are editing the task, but
when you run the package, you will get an error. So be sure to remove the line breaks
first!
3. For the value of variable strQueryBaseQty, type the following query after removing the
line breaks.
SELECT FLATTENED 'ModelNameHere' as [Model Name],
[ModelNameHere].[Model Region] as [Model and Region],
(SELECT $TIME as NewTime,
Quantity as NewValue,
PredictStDev([Quantity]) as ValueStDev,
PredictVariance([Quantity]) as ValueVariance
FROM PredictTimeSeries([ModelNameHere].[Quantity],10) )
AS Predictions
FROM [ModelNameHere]
Notice the placeholder, ModelNameHere, in this procedure. This placeholder will be
replaced with a valid model name, which the package gets from the variable
strModelName.
The next steps explain how to create an expression that updates the query text each
time the loop is executed.
4. In the Variables window, select the variable strPredictQty, and then open or select the
Properties window to see the extended properties of the variable.
5. Locate Evaluate as Expression and set the value to True.
6. Locate Expression and type or paste in the following expression.
REPLACE( @[User::strQueryBaseQty], "ModelNameHere", @[User::strModelName] )
7. Repeat this process for the variable strPredictAmt, using the following expression.
REPLACE( @[User::strQueryBaseAmount], "ModelNameHere", @[User::strModelName] )
Create the Data Mining Query task (Predict Amt)
Now that you’ve configured the variables, most of the work is done. All you have to do is create
a pair of Data Mining Query tasks. Each task gets the updated query string out of the variable
that you just created, runs the query, and then saves the predictions to the specified output:
1. Drop a new Data Mining Query task into the Foreach Loop container. Name it Predict
Amt.
2. In the Data Mining Query Task Editor, on the Mining Model tab, for Connection,
choose the name of the Analysis Services instance that hosts your models. For
example: <servername>.ForecastingModels.
3. For Mining structure, choose Forecasting.
4. On the Output tab, for Connection, choose the instance of the database engine where
you will store the results. For example, <servername>.DMReporting.
5. For Output table, select or type the table name, tmpPredictionResults. Choose the
option Drop and re-create the output table. (Note: If this package has never been
run, you must type the name of the table, and the task will then create it. However, if you
are rerunning the package, the table already exists, so you must drop and then rebuild
it.)
6. On the Query tab, for Build query, you can paste in the base query temporarily. It will
be replaced with the contents of a variable. After you run the package once, you should
see the text of the base query.
7. With the Predict Amt task selected, open the Properties pane and locate the
Expressions property.
8. Expand the list of expressions and add the variable @[User::strPredictAmt] as the
value for the QueryString property. You can also select the value from a list by clicking
the Browse (...) button.
Create the second Data Mining Query task (Predict Qty)
Repeat the steps just described for a query that does the same thing, only with Quantity as
the predictable column:
1. Create a new Data Mining Query task named Predict Qty.
2. Repeat steps 2-6 from the previous procedure exactly as described.
3. With the Predict Qty task selected, open the Properties pane and locate the
Expressions property.
4. Expand the list of expressions and add the variable @[User::strPredictQty] as the
value for the QueryString property.
After you run the package once, the DMX statement contains blank brackets, like these.
SELECT FLATTENED '' as [Model Name],
[].[Model Region] as [Model and Region],
(SELECT $TIME as NewTime, Amount as NewValue,
PredictStDev([Amount]) as ValueStDev,
PredictVariance([Amount]) as ValueVariance
FROM PredictTimeSeries([].[Amount],10) )
AS Predictions
FROM []
These brackets will be populated by a variable that supplies the model name at run time.
To summarize all the variable activity at run time:
• The package gets a variable with a list of models.
• The loop gets the name of one model from the list.
• The loop gets a prediction query from a variable, inserts the model name, and writes out a new prediction query.
• The query task executes the updated prediction query.
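For example, after the loop reads the model name ARIMA_1-10-30 from the list, the Amount query that the task executes would read as follows (line breaks added here for readability only).

SELECT FLATTENED 'ARIMA_1-10-30' as [Model Name],
[ARIMA_1-10-30].[Model Region] as [Model and Region],
(SELECT $TIME as NewTime,
Amount as NewValue,
PredictStDev([Amount]) as ValueStDev,
PredictVariance([Amount]) as ValueVariance
FROM PredictTimeSeries([ARIMA_1-10-30].[Amount],10) )
AS Predictions
FROM [ARIMA_1-10-30]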
Note that these prediction queries all write their results to the same temporary table, which is
dropped and then rebuilt during each loop. Therefore, you need to add a Data Flow task in
between, which moves the results to the archive table and also adds some metadata.
Create Data Flow tasks to Archive the Results
Remember that you created separate predictions for the data values, Amount and Quantity, to
avoid dealing with a lot of nulls in the results. Next they are merged back together for reporting,
to make a table that looks like this.
Job ID  Time executed  Series       Time slice     Prediction type  Predicted value  StDev  Variance
012     2010-09-20     M200 Europe  January 2012   Sales Amount     283880           nnn    nnn
012     2010-09-20     M200 Europe  February 2012  Sales Amount     251225           nnn    nnn
012     2010-09-20     M200 Europe  January 2012   Sales Quantity   507              nnn    nnn
012     2010-09-20     M200 Europe  February 2012  Sales Quantity   758              nnn    nnn
You can also add any extra metadata that might be useful later, such as the date the predictions
were generated, a job ID, and so forth.
Let’s take another look at the DMX query statements used to generate the predictions.
SELECT $TIME as NewTime, Amount as NewValue, PredictStDev([Amount]) as
ValueStDev, PredictVariance([Amount]) as ValueVariance FROM PredictTimeSeries
Ordinarily, the column names generated in a prediction query are based on the predictable
column name, so the names would be something like PredictAmount and PredictQuantity.
However, you can use a column alias in the output (here, it is NewValue) to make it easier to
combine predicted values.
Again, because Integration Services is so flexible, there are lots of ways you might accomplish
this task:
• Store results in memory and merge them before writing to the archive table.
• Store the results in different columns, one for each prediction type.
• Write the results to temporary tables and merge them later.
• Use the Integration Services raw file format to quickly write out and then read the interim results.
However, in this scenario, you want to verify the prediction data that is generated by each
query. So you use the following approach:
• Write predictions to a temporary table.
• Use an OLE DB source component to get the predictions that were written to the temporary table.
• Use a Derived Column transformation to clean up the data and add some simple metadata.
• Save the results to the archive table that is used for reporting on all models.
The graphic illustrates the overall task flow within each Data Flow task.
Create Data Flow tasks (Archive Results Qty, Archive Results Amt)
1. Within the loop Predict Foreach Model, create two Data Flow tasks and name them
Archive Results Qty and Archive Results Amt.
2. Connect each Data Flow task to its related Data Mining Query task, in the order shown
in the earlier Control Flow diagram for Package 3.
Note: You must have these tasks in a sequence, because they use the same temporary
table and archive table. If Integration Services executes the tasks in parallel, the
processes could create conflicts when attempting to access the same table.
3. In each Data Flow task, add the following three components:
• An OLE DB source that reads from tmpPredictionResults
• A Derived Column transformation as defined in the following table
• An OLE DB destination that writes to the table ArchivedPredictions
4. Create expressions in each Derived Column transformation, to generate the data for the
new columns as follows.
Task name            Derived column name  Data type  Value
Archive Results Qty  PredictionDate       datetime   GETDATE()
Archive Results Qty  PredictedValue       string     Quantity
Archive Results Amt  PredictionDate       datetime   GETDATE()
Archive Results Amt  PredictedValue       string     Amount
Tip: Isolating the data flows for each prediction type has another advantage: it is much, much
easier to modify the package later. For example, you might decide that there is no good reason
for creating a separate prediction for quantity. Instead of editing your query or the output, you
can just disable that part of the package and it will still run without modification – you just won’t
have predictions for Quantity.
Run, Debug, and Audit Packages
That’s it – the packages contain all the tools you need to dynamically create and update multiple
related data mining models.
The packages for this scenario have been designed so that you can run them individually. We
recommend that you run each package on its own at least once, to get a feel for what each
package produces. Later you can add logging to the packages to track errors, or create a parent
task to connect them by adding an Execute Package task.
Phase 5 - Analyze and Report
Now that you have a set of predictions for multiple models, you are probably anxious to see the
trends, and to analyze differences.
Using the Data Mining Viewer
The quickest way to view individual models is by using the Data Mining viewers. The Microsoft
Time Series Viewer (http://technet.microsoft.com/en-us/library/ms175331.aspx) is particularly
handy because it combines the historical data with the predictions for each series, and it
displays error bars for the predictions.
However, some users are not comfortable with using Business Intelligence Development Studio.
Even if you use the Data Mining Add-ins for Microsoft Office, which provide a Microsoft Visio
viewer and a time series browser, the amount of detail in the time series viewer can be
overwhelming.
In contrast, analysts typically want even more detail, including statistics embedded in the model
content, together with the metadata you captured about the source and the model parameters.
It’s impossible to please everyone!
Fortunately Reporting Services lets you pick the data you want, add extra data sets and linked
reports, filter, and group, so you can create reports that meet the needs of each set of users.
Using Reporting Services for Data Mining Results
Our requirements for the basic report were as follows:
• All related models should be in one chart, for quick comparison.
• To simplify comparisons, we can present the predictions for Amount and Quantity separately.
• The results should be separable by model and region.
• We need to compare predictions for the same periods, for multiple models.
• Rather than working with multiple sources of data, we would like all data in a relational store.
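A dataset that meets these requirements can come straight from the archive table. The following query is a minimal sketch; it assumes the flattened column names produced by the prediction queries (Model Name, Model and Region, NewTime, NewValue, ValueStDev, ValueVariance) plus the PredictedValue and PredictionDate columns added in the data flow.

SELECT [Model Name], [Model and Region], NewTime, NewValue,
       ValueStDev, ValueVariance, PredictionDate
FROM dbo.ArchivedPredictions
WHERE PredictedValue = 'Amount' -- one report per prediction type
ORDER BY [Model Name], [Model and Region], NewTime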
Additional requirements might include:
• A chart showing historical values along with predictions.
• Statistics derived from comparison of prediction values.
• Metadata about each model in a linked report.
As the analyst, you might want even more detail:
• First and last dates used for training each model
• A list of the algorithm parameters and pattern formulas
• Descriptive statistics that summarize the variability and range of source data in each of the series, or across series
However, for the purposes of this walkthrough, there is already plenty of detail for comparing
models. You can always add this data later and then present it in linked reports.
The following graphic shows the Reporting Services report that compares the prediction results
for each model:
Notice that you can configure a report to show all kinds of information in ToolTips—in this
example, as you pause the mouse over a prediction, you see the standard deviation and
variance for the predictions.
The next graphic shows a series of charts that have been copied into a matrix. By using a
matrix, you can create a set of filtered charts. This series of graphs shows predictions for
Amount for all models.
(Matrix of charts: one chart per combination of product line M200, R750, or T1000 and region Europe, North America, or Pacific.)
Interpreting the Results
If you are familiar with time series modeling, a couple of trends begin to jump out at you just
from scanning these charts:
• There are some extreme series in ARIMA models – possibly the periodicity hint is to blame.
• Predictions in ARTXP models cut off at a certain point in many series. This is expected, because ARTXP detects the instability of a model and does not make predictions if they are not reliable.
• You would expect the MIXED models to generally perform better, because they combine the best characteristics of ARTXP and ARIMA. Indeed, they seem more reliable, though you would want to verify that.
The following trend lines are interesting, and they illustrate some problems you might see with
models. The results might indicate that the data is bad, there is inadequate data, or the data is
too variable to fit.
(Charts: R750 Europe (Amount), R750 Europe (Quantity), R250 North America (Amount), R250 North America (Quantity).)
When you see wildly varying trends from models on the same data, you should of course
re-examine the model parameters, but you might also use cross-prediction or aggregate your
data differently, to avoid being influenced too strongly by a single data series:

• With cross-prediction, you can build a reliable model from aggregated data or a series with solid data, and then make predictions based on that model for all series. ARTXP models and mixed models support cross-prediction.
• If you do not have enough data to meaningfully analyze each region or product line separately, you might get better results by aggregating by product or region or both, and creating predictions from the aggregate model.
Discussion
Data mining can be a labor-intensive process. From data acquisition and preparation to
modeling, testing, and exploration of the results, much effort is needed to ensure that the data
supports the intended analysis and that the output of the model is meaningful.
Some parts of the model-building process will always require human intervention –
understanding the results, for example, requires careful review by an expert who can assess
whether the numbers make sense.
However, by automating some part of the data mining process, Integration Services can not
only speed the process, but also potentially improve the results. For example, if you don’t know
which mixture of algorithms produces the best results, or what the possible time cycles are in
your data, you can use automation to experiment.
Moreover, there are benefits beyond simple time saving.
The Case for Ensemble Models
Automation supports the creation of ensemble models. Ensemble models (roughly speaking)
are models that are built on the same data, but use different methods of analysis.
Typically the results of multiple models are compared and/or combined, to yield results that are superior to those of any single model. Assessing multiple models for the same prediction task is now considered a best practice, even with the best data. Some reasons that are often cited for using ensemble models include the following (a query sketch for comparing results across models appears after the list):
• Avoiding overfitting. When a model is too closely patterned on a specific training set, you get great predictions on a test data set that matches the training data, and a lot of variability when you try the model out on real-world data.
• Variable selection. Each algorithm type processes data differently and delivers different insights. Rather than compare predictions, as we did here, you might use one algorithm to identify the most important variables or to prescreen for correlations that can mask other, more interesting patterns.
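Because this walkthrough saves every model's predictions to a relational table, such comparisons are easy to script. The following T-SQL is a sketch only; dbo.AllModelPredictions and its columns are assumptions that stand in for the results table created earlier in the walkthrough.

-- Sketch only: the table and column names are assumptions.
SELECT
    ModelRegion,
    PredictTime,
    MIN(PredictedAmount)   AS LowEstimate,       -- most pessimistic model
    MAX(PredictedAmount)   AS HighEstimate,      -- most optimistic model
    AVG(PredictedAmount)   AS MeanAcrossModels,  -- a naive averaging "ensemble"
    STDEV(PredictedAmount) AS ModelDisagreement  -- large values flag unstable series
FROM dbo.AllModelPredictions
GROUP BY ModelRegion, PredictTime
ORDER BY ModelRegion, PredictTime;

Where ModelDisagreement is small, the models corroborate one another; where it is large, the series deserves closer scrutiny.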
There has been much research in recent years on the best methods for combining the
estimates from ensemble models — merging, bagging, voting, averaging, weighting by posterior
evidence, gating, and so forth. A discussion of ensemble models is beyond the scope of this
paper, and you will note that we did not attempt to combine prediction results in this paper; we
only presented them for comparison. However, we encourage you to read the linked resources
to learn more about these techniques.
Closing the Loop: Interpreting and Getting Feedback on Models
Now that you have summarized the results in an easy-to-read report, what's next? Typically, you just think of more questions to answer!
• What about internal promotions or known events? Have we eliminated known correlations?
• Local and cultural events can significantly affect sales of particular products. Rather than expect the same seasonality in multiple regions, should we separate the regions for modeling?
• Should we choose a time series algorithm that can account for the effects of random or cyclical external events, or do we want to smooth the data to find overall trends?
• Can we compare these projection graphs with a projection done by the traditional business method of year-to-date comparisons?
• How do these predictions compare to the percentage increases targeted by the business?
Fortunately, because you have created an extensible framework for incorporating data mining into analysis by using Integration Services, it will be relatively easy to collect more data, update the models, and refine your presentation.
Conclusion
This paper introduced a framework for automation of data mining, with results saved to a
relational data store, to encourage a systematic approach to predictive analytics.
This walkthrough showed that it is relatively easy to set up Integration Services packages that
create data mining models and generate predictions from them. A framework like the one
demonstrated here could be extended to support further parameterization, encourage the use of
ensemble models, and incorporate data mining in other analytic workflows.
Resources
[1] Jamie MacLennan: Walkthrough of SQL Server 2005 Integration Services for data mining
http://www.sqlserverdatamining.com/ssdm/Default.aspx?tabid=96&Id=338
[2] Microsoft Research: Ensemble models
http://academic.research.microsoft.com/Paper/588724.aspx
[3] Reporting Services tutorials
http://msdn.microsoft.com/en-us/library/bb522859.aspx
[4] Michael Ohler: Assessing forecast accuracy
http://www.isixsigma.com/index.php?option=com_k2&view=item&id=1550:assessing-forecast-accuracy-be-prepared-rain-or-shine&Itemid=1&tmpl=component&print=1
[5] Statistical methods for assessing mining models
http://ms-olap.blogspot.com/2010/12/do-you-trust-your-data-mining-results.html
[6] John Maindonald: Data Mining from a Statistical Perspective
http://maths.anu.edu.au/~johnm/dm/dmpaper.html
Acknowledgements
I am indebted to my coworkers for their assistance and encouragement. Carla Sabotta
(technical writer, Integration Services) provided invaluable feedback on the steps in each of the
SSIS packages, ensuring that I didn’t leave out anything. Ranjeeta Nanda of the Integration
Services test team kindly reviewed the code in the Script task. Mary Lingel (technical writer,
Reporting Services) took my complex data source and developed a set of reports that made it
look simple.
Code for Script Task
The following code can be added to the Script task to change values in the XML model
definition. This very simple sample was written using VB.NET, but the Script task supports C#
as well.
A number of message boxes have been added to verify that the task was processing the XML
as expected. You would eventually comment these out. You would also want to add string
length checking and other validation to prevent DMX injection.
Public Sub Main()
    'Get the base XMLA definition and create a working copy for the output.
    'MessageBox requires Imports System.Windows.Forms at the top of ScriptMain.
    Dim strXMLABaseDef As String = Dts.Variables("strBaseXMLA").Value.ToString()
    Dim strXMLANewDef As String = strXMLABaseDef

    'Create local variables and fill them with the values from the SQL query.
    Dim txtModelID As String = Dts.Variables("strModelID").Value.ToString()
    Dim txtModelName As String = Dts.Variables("strModelName").Value.ToString()
    Dim txtForecastMethod As String = Dts.Variables("strForecastMethod").Value.ToString()
    Dim txtPeriodicityHint As String = Dts.Variables("strPeriodicityHint").Value.ToString()

    'First, update the base XMLA with the new model ID and model name:
    '  <ID>ForecastingDefault</ID>
    '  <Name>ForecastingDefault</Name>
    Dim txtNewID As String = "<ID>" & txtModelID & "</ID>"
    Dim txtNewName As String = "<Name>" & txtModelName & "</Name>"
    strXMLANewDef = strXMLANewDef.Replace("<ID>ForecastingDefault</ID>", txtNewID)
    strXMLANewDef = strXMLANewDef.Replace("<Name>ForecastingDefault</Name>", txtNewName)

    'Display the model names - for troubleshooting only.
    MessageBox.Show(strXMLANewDef, "Verify new model ID and name")

    'Create temporary variables for the replacement operations.
    Dim strParameterName As String = ""
    Dim strParameterValue As String = ""

    'Update the value for FORECAST_METHOD. Because all possible values
    '(MIXED, ARIMA, ARTXP) have exactly five characters, a simple replace works.
    strParameterName = "FORECAST_METHOD"
    strParameterValue = "MIXED" 'default value
    If strXMLABaseDef.Contains(strParameterValue) Then
        'Replace the default value MIXED with the value from the SQL Server query.
        strXMLANewDef = strXMLANewDef.Replace(strParameterValue, txtForecastMethod)
        'Display the forecast method - for troubleshooting only.
        MessageBox.Show(strXMLANewDef, "Check Forecast Method", MessageBoxButtons.OK)
    Else
        MessageBox.Show("The XMLA definition does not include the parameter " & _
            strParameterName & ".", "Problem with base XMLA", MessageBoxButtons.OK)
    End If

    'Look for a PERIODICITY_HINT value.
    strParameterName = "PERIODICITY_HINT"
    strParameterValue = "{1}" 'default value
    If strXMLABaseDef.Contains(strParameterName) Then
        'Replace the default value {1} with the value from the variable.
        strXMLANewDef = strXMLANewDef.Replace(strParameterValue, txtPeriodicityHint)
        MessageBox.Show(strXMLANewDef, "Check Periodicity Hint", MessageBoxButtons.OK)
    Else
        MessageBox.Show("The XMLA definition does not include the parameter " & _
            strParameterName & ".", "Problem with base XMLA", MessageBoxButtons.OK)
    End If

    'Save the completed definition to the package variable.
    Dts.Variables("strModelXMLA").Value = strXMLANewDef
    Dts.TaskResult = ScriptResults.Success
End Sub
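As a starting point for the validation mentioned earlier, checks like the following could be added near the top of Main, before the replacement operations. This is a sketch only; the list of allowed values reflects the documented FORECAST_METHOD settings, and the length cap and format check for the periodicity hint are arbitrary assumptions.

'Sketch only: validate inputs before using them in replacements.
Dim allowedMethods() As String = {"MIXED", "ARIMA", "ARTXP"}
If Array.IndexOf(allowedMethods, txtForecastMethod.ToUpper()) = -1 Then
    MessageBox.Show("Unexpected FORECAST_METHOD value: " & txtForecastMethod, _
        "Validation failed", MessageBoxButtons.OK)
    Dts.TaskResult = ScriptResults.Failure
    Return
End If
'Reject periodicity hints that are too long or not of the form {n, ...};
'the cap of 20 characters is an arbitrary assumption.
If txtPeriodicityHint.Length > 20 _
    OrElse Not txtPeriodicityHint.StartsWith("{") _
    OrElse Not txtPeriodicityHint.EndsWith("}") Then
    MessageBox.Show("Unexpected PERIODICITY_HINT value: " & txtPeriodicityHint, _
        "Validation failed", MessageBoxButtons.OK)
    Dts.TaskResult = ScriptResults.Failure
    Return
End If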
For more information:
http://www.microsoft.com/sqlserver/: SQL Server Web site
http://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter
http://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter
Did this paper help you? Please give us your feedback. On a scale of 1 (poor) to 5 (excellent), how would you rate this paper, and why? For example:
• Are you rating it high because it has good examples, excellent screen shots, clear writing, or another reason?
• Are you rating it low because of poor examples, fuzzy screen shots, or unclear writing?
This feedback will help us improve the quality of white papers we release.
Send feedback.