Datasets
Overview
A Dataset object is a specification of a dataset's columns. When the specification is simulated, it produces an actual dataset of N rows by M columns. These datasets can be used to build models within or outside the platform, and to create business reports.
A dataset usually consists of a set of independent inputs (data elements, features, or dataset transforms), a dependent variable (a data element or a feature), and/or any other algorithm-specific data components.
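As a rough illustration of the idea that a dataset is a column specification rather than the data itself, the sketch below models a specification as a small Python class. The `DatasetSpec` class, its field names, and the `columns` method are hypothetical and are not the platform's API; the column names are borrowed from the case study on this page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical stand-in for a dataset specification (not the platform's API)."""

    name: str
    inputs: Tuple[str, ...]            # independent inputs (data elements, features, transforms)
    dependent: Optional[str] = None    # dependent variable, if the algorithm needs one
    extras: Tuple[str, ...] = ()       # any other algorithm-specific components

    def columns(self) -> List[str]:
        """Return the M column names the simulated dataset would contain."""
        cols = list(self.inputs)
        if self.dependent is not None:
            cols.append(self.dependent)
        cols.extend(self.extras)
        return cols


spec = DatasetSpec(
    name="default_modeling",  # assumed name, for illustration only
    inputs=("Fico Range Low", "Loan Amount"),
    dependent="Default Within 18 Months",
)
print(spec.columns())
```

The specification holds only column names; the rows appear when it is simulated against actual data.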
How is this used?
Datasets can be used in the Experiment module to estimate models within the platform. They can also be exported and used outside the platform to build a model, which can then be registered back into the platform.
Creating a Dataset
A dataset can be created by opening the Dataset tab under Model Studio and clicking Create.
Example
Case study: The user has registered Fico Range Low and Loan Amount as data elements, created Default Within 18 Months as a feature, and registered Default Model Algorithm for training default models. The user now wants to create a modeling dataset for the Default Model Algorithm. Models trained on this dataset will predict Default Within 18 Months, using Fico Range Low and Loan Amount as predictors.
- The user opens the Dataset tab under Model Studio and registers the dataset information by clicking Create.
- The user simulates the dataset to generate the actual modeling file.
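To make the simulation step more concrete, here is a minimal sketch of what turning the case-study specification into rows might look like outside the platform. Everything in it is an assumption for illustration: the `simulate` function, the `Months To Default` source field, the rule used to derive the feature, and the sample values are all made up and do not describe how the platform computes anything.

```python
# Hypothetical source records; the values are invented purely for illustration.
records = [
    {"Fico Range Low": 680, "Loan Amount": 12000, "Months To Default": None},
    {"Fico Range Low": 640, "Loan Amount": 25000, "Months To Default": 9},
    {"Fico Range Low": 700, "Loan Amount": 8000, "Months To Default": 30},
]

PREDICTORS = ["Fico Range Low", "Loan Amount"]
TARGET = "Default Within 18 Months"


def default_within_18_months(record):
    """Assumed feature logic: 1 if the loan defaulted within 18 months, else 0."""
    months = record["Months To Default"]
    return int(months is not None and months <= 18)


def simulate(source_records):
    """Build one output row per source record: the predictors plus the target."""
    rows = []
    for record in source_records:
        row = {name: record[name] for name in PREDICTORS}
        row[TARGET] = default_within_18_months(record)
        rows.append(row)
    return rows


modeling_rows = simulate(records)
for row in modeling_rows:
    print(row)
```

The result has one row per source record (N) and one column per predictor plus the target (M), which matches the N-by-M shape described in the overview.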



