Blending Data Mining with Design of Experiments (Two-day seminar)
Classical Design of Experiments was introduced nearly 100 years ago and evolved slowly in the decades that followed. Thanks to powerful software, much has recently changed in this arena. Computer-generated designs allow us to conduct highly efficient experiments. Factors with mixed levels, including more than two or three levels, can be readily accommodated, as can infeasible experimental regions. Multiple responses can be traded off with ease. We no longer need to assume that the model of interest has a simple linear or quadratic form, because software now allows us to fit more complex models. Interactions can be visualized more effectively in three dimensions. New strategies have also emerged regarding run order and the number of replications.
Organizations are now able to learn amazing things about their customers and processes using a host of relatively new techniques referred to as data mining. Instantly determining whether a client is a good credit risk, selecting what content to display on a web page, spotting fraud, optimizing complex biochemical formulations, and predicting which customers are likely to leave in the next three months are just some of the things companies have accomplished with data mining techniques. With the advent of inexpensive computing capability and powerful software, organizations can now collect massive amounts of operational and customer data. Collecting data is not a problem, but collecting the right data and then extracting latent information about relationships is a huge challenge. Data mining techniques have the potential to readily uncover latent variable relationships in complex historical datasets.
Exciting areas of application have emerged from the combination of these seemingly disparate families of tools. One involves using data mining tools to screen the vital few key variables from a massive number of dataset variables, and to identify intriguing ranges for those key variables. This information can then be fed into a designed experiment for modeling, so as to approximate the underlying relationships between the key input variables and the key responses. Simulation and optimization strategies can then be applied to the resulting models.
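The screen-then-design workflow described above can be illustrated with a minimal sketch. It rests on several assumptions that are not from the seminar material: the dataset is synthetic, absolute Pearson correlation stands in for a real screening tool such as CART, and the function names (`screen_variables`, `two_level_design`) and the 10th/90th percentile level choice are hypothetical.

```python
import itertools
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def screen_variables(data, response, top_k):
    """Rank candidate inputs by |correlation| with the response; keep the top_k.

    Assumption: a simple linear screen. It will miss purely nonlinear or
    interaction-only effects that a tree-based screen might catch.
    """
    scores = {
        name: abs(pearson(col, data[response]))
        for name, col in data.items()
        if name != response
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def two_level_design(data, factors, lo_q=0.1, hi_q=0.9):
    """Build a two-level full factorial using historical quantiles as levels."""
    levels = []
    for f in factors:
        col = sorted(data[f])
        n = len(col)
        levels.append((col[int(lo_q * (n - 1))], col[int(hi_q * (n - 1))]))
    return [dict(zip(factors, combo)) for combo in itertools.product(*levels)]


# Synthetic historical data: 8 candidate inputs, only x2 and x5 drive y.
rng = random.Random(0)
n = 500
data = {f"x{i}": [rng.uniform(0, 10) for _ in range(n)] for i in range(8)}
data["y"] = [3 * a - 2 * b + rng.gauss(0, 1) for a, b in zip(data["x2"], data["x5"])]

key = screen_variables(data, "y", top_k=2)   # the "vital few" inputs
design = two_level_design(data, key)         # 2^2 = 4 runs
rng.shuffle(design)                          # randomize run order

for run in design:
    print({name: round(value, 2) for name, value in run.items()})
```

In practice the screening step would use whichever data mining tool suits the dataset, and the design would usually come from DOE software rather than a hand-built factorial. The sketch only shows how the output of one stage (key variables and their ranges) becomes the input of the next.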
This seminar provides a review of experimental design techniques and data mining tools, then tackles the problem of blending the approaches for successful deployment. Numerous exercises and datasets will be evaluated with several popular software packages.
The following is an outline of the topics covered in this two-day course:
Agenda
Goals and expectations
Benefit of structured experimentation
Overview of basic orthogonal array experiments and analysis
Competing experimental objectives
What are the steps?
Fundamentals of mathematical modeling using software
Graphical analysis and statistical analysis
Co-optimization of simultaneous responses
What is data mining?
What are the steps?
Using variable screening tools
CART
Association rules
Graphical techniques
MLR
Neural Networks
Blending the two approaches for screening and modeling of key responses
Numerous case studies
Fees:
$1600 includes the text Engineering Today's Designed Experiments, a student version of DOE Wisdom Software, and a participant guide.
Lunch and coffee breaks are provided each day.
Register early: class sizes are limited for optimal interaction and instruction.