Corridor package

Introduction To Corridor Package and Corridor Package Entities

Corridor is a Python library that gives users access to all the Registered Objects on the Corridor Platforms. The following Registered Objects are accessible through Corridor:

DataTable
DataElement
Feature
Model
ModelTransform
Policy
GlobalFunction

The Corridor package contains classes and functions that facilitate access to information on the platform. DataTable, DataElement, Feature, Model, ModelTransform, Policy, and GlobalFunction are classes with predefined attributes (e.g. name, alias, permissible_purpose, platform_entity, description, version) and methods (e.g. get_simulation(), get_approval_workflow()). Each class represents a specific type of registered object; for instance, the DataElement class is used to instantiate a DataElement object. Once it is instantiated, users can access the data and metadata of that DataElement in notebooks, and the same applies to the other object types. The attributes and methods available depend on what is meaningful for each class. For instance, get_python_function() is only applicable to GlobalFunction.
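Because methods vary by class, notebook code that handles several object types may want to check for a method before calling it. The following is a minimal sketch of that idea, not part of the Corridor package: `call_if_supported` is a hypothetical helper name, and the two objects are plain stand-ins (with an invented return value) so the example runs without platform access.

```python
from types import SimpleNamespace


def call_if_supported(obj, method_name, default=None):
    """Call obj.<method_name>() if it exists and is callable, else return default."""
    method = getattr(obj, method_name, None)
    if callable(method):
        return method()
    return default


# Stand-ins for registered objects (NOT real Corridor classes).
# The string returned by the fake method is invented for illustration only.
fake_global_function = SimpleNamespace(
    name="example_function",
    get_python_function=lambda: "def example_function(x): return x",
)
fake_model = SimpleNamespace(name="example_model")

# The method exists on the first stand-in, so it is called.
supported = call_if_supported(fake_global_function, "get_python_function")

# The method is absent on the second stand-in, so the default is returned.
unsupported = call_if_supported(
    fake_model, "get_python_function", default="not applicable"
)
```

With real objects, the same helper could be passed an instance of any of the classes listed above in place of the stand-ins.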

In addition to the classes, there are a couple of functions that can be used to access data in a notebook:
create_data: Creates a dataset from a given list of aliases or objects of DataElement, Feature, Model, etc.
read_data: Reads data from the provided location and returns a PySpark DataFrame containing the data at that location.

For each of the Registered Objects, we can access a set of metadata, and we can also recreate the Registered Objects on new datasets. This notebook illustration is divided into 2 sections:

How to access basic metadata for registered objects: Model Example - Using Corridor Package Classes
How to create data using registered objects - Using create_data()

Illustration: How to load a model registered on the platform and access its basic details

In [1]:

# Import Model from Corridor
from corridor import Model

In [2]:

# use the MODEL NAME to access the model
Model_example = Model('PD Model Strict')

In [3]:

print(f'name: {Model_example.name}')
print(f'output_alias: {Model_example.output_alias}')
print(f'inputs: {[x.alias for x in Model_example.inputs]}')
print(f'type: {Model_example.type}')
print(f'description: {Model_example.description}')
print(f'platform_entity: {Model_example.platform_entity}')
print(f'permissible_purpose: {Model_example.permissible_purpose}')
print(f'group: {Model_example.group}')
print(f'current_status: {Model_example.current_status}')

name: PD Model Strict
output_alias: pd_model_ver1
inputs: ['debt_capacity', 'fico_range_high']
type: Binary Classification
description: Ver1 of the PD Model based on FICO, Age of Credit Profile and Debt Capacity
platform_entity: Application
permissible_purpose: ['Underwriting']
group: Probability of Default
current_status: Draft
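When several attributes are needed at once, collecting them into a dictionary can be tidier than a series of print calls. The sketch below assumes only that the attributes are readable with ordinary attribute access, as in the cell above; `metadata_as_dict` is a hypothetical helper, and the stand-in object simply re-enters the values printed above so the example runs without the platform.

```python
from types import SimpleNamespace

# Attribute names taken from the print statements in the cell above.
FIELDS = (
    "name",
    "output_alias",
    "type",
    "platform_entity",
    "permissible_purpose",
    "group",
    "current_status",
)


def metadata_as_dict(obj, fields=FIELDS):
    """Return {field: value} for each field, using None where a field is absent."""
    return {field: getattr(obj, field, None) for field in fields}


# Stand-in for Model_example (NOT a real Corridor object), filled in by hand
# with the values shown in the output above.
stand_in = SimpleNamespace(
    name="PD Model Strict",
    output_alias="pd_model_ver1",
    type="Binary Classification",
    platform_entity="Application",
    permissible_purpose=["Underwriting"],
    group="Probability of Default",
    current_status="Draft",
)

summary = metadata_as_dict(stand_in)
```

In a notebook with platform access, `metadata_as_dict(Model_example)` would be the analogous call.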

Illustration: Creating data using registered objects

In [4]:

# Import Corridor Package Objects
from corridor import create_data

In [5]:

# Create a dataset using the aliases of registered DataElements
df = create_data('requested_loan_amount', 'annual_income')
# Show the first five rows as a pandas DataFrame
df.limit(5).toPandas()

Out[5]:

	requested_loan_amount	annual_income
0	13000.0	42000.0
1	8800.0	27165.0
2	10000.0	42000.0
3	20000.0	30000.0
4	10875.0	38000.0
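Once the sample is in pandas, a quick sanity check can confirm that every requested alias came back as a column and that no values are missing. The sketch below re-enters the five rows shown in Out[5] by hand so it runs without Spark or the platform; the variable names are illustrative only.

```python
import pandas as pd

# The five rows shown in Out[5], typed in manually for illustration.
sample = pd.DataFrame(
    {
        "requested_loan_amount": [13000.0, 8800.0, 10000.0, 20000.0, 10875.0],
        "annual_income": [42000.0, 27165.0, 42000.0, 30000.0, 38000.0],
    }
)

requested_aliases = ["requested_loan_amount", "annual_income"]

# Aliases that were requested but did not come back as columns.
missing_columns = [a for a in requested_aliases if a not in sample.columns]

# Number of null values per requested column.
null_counts = sample[requested_aliases].isna().sum().to_dict()
```

In a live notebook, `sample` would instead be the result of `df.limit(5).toPandas()`.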