Running Simulations¶

Simulations can be run for registered objects with new data. For DataElement and Feature, this implies generating DE and feature values for new dataset. For models, simulation generates scores for input data. In policy simulation, offers are generated for new data based on registered policy.

Running DataElement and Feature simulation with new data
Running Model simulation with new data
Policy simulation and analysis of generated offers

Running DataElement and Feature simulation with new data¶

1. An illustration of Simulation on Simple DataElement/Feature

In [1]:

Copied!





# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records 
application_table = application_table.limit(1000)

# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records 
application_table = application_table.limit(1000)

# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()

Out[1]:

	corridor_application_id	open_acc_6m	acc_open_past_24mths	addr_state	zip_code	annual_inc	corridor_application_date	application_type	simulated_age	...	total_bal_ex_mort	total_bc_limit	tot_coll_amt	tot_cur_bal	tot_hi_cred_lim	total_bal_il	total_il_high_credit_limit	total_rev_hi_lim	all_util	__index_level_0__
0	20000018440	0.0	2.0	AZ	852xx	75000.0	2015-11-24 07:00:00	Individual	41	...	100088.0	32000.0	73.0	444844.0	476098.0	67536.0	91314.0	36400.0	89.0	2122946
1	225770004638	2.0	8.0	OH	452xx	125000.0	2018-07-18 07:00:00	Individual	36	...	28524.0	20300.0	1216.0	166257.0	199832.0	13993.0	21625.0	26600.0	59.0	1810773
2	20000077834	NaN	3.0	MA	018xx	113536.0	2015-10-23 07:00:00	Individual	45	...	275761.0	36600.0	0.0	496816.0	559526.0	NaN	253426.0	51600.0	NaN	2182340
3	20000267750	NaN	7.0	MD	216xx	140000.0	2015-05-22 07:00:00	Individual	37	...	53771.0	100700.0	0.0	323232.0	439972.0	NaN	8831.0	136300.0	NaN	2372256
4	225769878100	1.0	8.0	NC	282xx	130000.0	2018-04-25 07:00:00	Individual	47	...	65131.0	56900.0	0.0	168988.0	247844.0	17643.0	26658.0	104300.0	50.0	1684235

5 rows × 89 columns

In [2]:

Copied!

# Import Corridor Package Objects
from corridor import create_data

# Run simulation of annual_income: registered DataElement and # debt_capacity: registered Feature
df = create_data('annual_income', 'debt_capacity',
                 data={'application': application_table})

# take a look at the attributes createdby runningthe line below:
df.limit(10).toPandas()
# Import Corridor Package Objects
from corridor import create_data

# Run simulation of annual_income: registered DataElement and # debt_capacity: registered Feature
df = create_data('annual_income', 'debt_capacity',
                 data={'application': application_table})

# take a look at the attributes createdby runningthe line below:
df.limit(10).toPandas()

Out[2]:

	annual_income	debt_capacity
0	75000.0	0.195333
1	125000.0	0.088000
2	113536.0	0.325888
3	140000.0	0.142857
4	130000.0	0.307692
5	68000.0	0.029412
6	52500.0	0.123810
7	97000.0	0.115464
8	40000.0	0.425000
9	130000.0	0.161538

2. An illustration of Simulation on Aggregated DataElement/Feature

In this example, we will create loan_default_in_18m using data stored outside the platform
default_within_18m is created to aggregate loan_performance_table to loan level. We can run create_data and input both the "performance table" (=performance) and "loan_table" (=loan)

In [3]:

Copied!





# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the loan Level Dataset
loan_table_1000 = spark.read.parquet('s3a://corridor.dev/master/sampleLoanData.parquet').limit(1000)

# subset loan performance table to keep only the records for the loan_id in loan_table_1000
loan_perf_table = spark.read.parquet('s3a://corridor.dev/master/samplePerfData.parquet')
# # loan_table_1000['corridor_loan_id'] == loan_perf_table['corridor_loan_id']
loan_perf_table_1000 = loan_table_1000.join(loan_perf_table, on='corridor_loan_id')

# Import Corridor Package Objects
from corridor import create_data

df = create_data('loan_default_in_18m',
                 data={'loan': loan_table_1000,'performance': loan_perf_table_1000})
df.show(2)
# try whether create_data() can use pandas df
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the loan Level Dataset
loan_table_1000 = spark.read.parquet('s3a://corridor.dev/master/sampleLoanData.parquet').limit(1000)

# subset loan performance table to keep only the records for the loan_id in loan_table_1000
loan_perf_table = spark.read.parquet('s3a://corridor.dev/master/samplePerfData.parquet')
# # loan_table_1000['corridor_loan_id'] == loan_perf_table['corridor_loan_id']
loan_perf_table_1000 = loan_table_1000.join(loan_perf_table, on='corridor_loan_id')

# Import Corridor Package Objects
from corridor import create_data

df = create_data('loan_default_in_18m',
                 data={'loan': loan_table_1000,'performance': loan_perf_table_1000})
df.show(2)
# try whether create_data() can use pandas df

+-------------------+
|loan_default_in_18m|
+-------------------+
|                0.0|
|                0.0|
+-------------------+
only showing top 2 rows

Running Model simulation with new data¶

In [3]:

Copied!





# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records 
application_table = application_table.limit(1000)

# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()

# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records 
application_table = application_table.limit(1000)

# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()

Out[3]:

	corridor_application_id	open_acc_6m	acc_open_past_24mths	addr_state	zip_code	annual_inc	corridor_application_date	application_type	simulated_age	...	total_bal_ex_mort	total_bc_limit	tot_coll_amt	tot_cur_bal	tot_hi_cred_lim	total_bal_il	total_il_high_credit_limit	total_rev_hi_lim	all_util	__index_level_0__
0	20000018440	0.0	2.0	AZ	852xx	75000.0	2015-11-24 07:00:00	Individual	41	...	100088.0	32000.0	73.0	444844.0	476098.0	67536.0	91314.0	36400.0	89.0	2122946
1	225770004638	2.0	8.0	OH	452xx	125000.0	2018-07-18 07:00:00	Individual	36	...	28524.0	20300.0	1216.0	166257.0	199832.0	13993.0	21625.0	26600.0	59.0	1810773
2	20000077834	NaN	3.0	MA	018xx	113536.0	2015-10-23 07:00:00	Individual	45	...	275761.0	36600.0	0.0	496816.0	559526.0	NaN	253426.0	51600.0	NaN	2182340
3	20000267750	NaN	7.0	MD	216xx	140000.0	2015-05-22 07:00:00	Individual	37	...	53771.0	100700.0	0.0	323232.0	439972.0	NaN	8831.0	136300.0	NaN	2372256
4	225769878100	1.0	8.0	NC	282xx	130000.0	2018-04-25 07:00:00	Individual	47	...	65131.0	56900.0	0.0	168988.0	247844.0	17643.0	26658.0	104300.0	50.0	1684235

5 rows × 89 columns

In [4]:

Copied!





# Import Corridor Package Objects
from corridor import create_data

# Run simulation for PD Model Strict: pd_model_ver1 is output feature alias
df = create_data('pd_model_ver1',
                 data={'application': application_table})
df.limit(10).toPandas()
# Import Corridor Package Objects
from corridor import create_data

# Run simulation for PD Model Strict: pd_model_ver1 is output feature alias
df = create_data('pd_model_ver1',
                 data={'application': application_table})
df.limit(10).toPandas()

Out[4]:

	pd_model_ver1
0	0.25
1	0.10
2	0.15
3	0.10
4	0.10
5	0.10
6	0.10
7	0.10
8	0.25
9	0.25

Policy simulation and analysis of generated offers¶

A Policy registered in Corridor can be used to generate offers using the corridor library commands in a 3 step process:

Initiate a Policy object
Read application data and run single loan review
Analyse generated offers (as required)
- All Available Offers
- Offers To Be Displayed
- Rules Output By Offer
- Other Adhoc Analysis

we will illustrate this process by taking an example policy that is already registered on the platform

Initiate a Policy object¶

In [5]:

Copied!





# Import necessary packages
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
from corridor import Policy
# Import necessary packages
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
from corridor import Policy

In [6]:

Copied!

# Initiate Policy Object
example_policy = Policy('UW Policy with PD Model and Framework')
# Initiate Policy Object
example_policy = Policy('UW Policy with PD Model and Framework')

Read application data and run single loan review¶

In [7]:

Copied!

# Import application data - For this illustration, application data is stored in s3 storage as a parquet file
df = spark.read.parquet('s3a://corridor.dev/master/kishan/notebook_examples/policy_inputs.parquet')
df.toPandas().head(2)
# Import application data - For this illustration, application data is stored in s3 storage as a parquet file
df = spark.read.parquet('s3a://corridor.dev/master/kishan/notebook_examples/policy_inputs.parquet')
df.toPandas().head(2)

Out[7]:

	fico_range_high	annual_inc	corridor_application_date	corridor_requested_loan_amount	earliest_cr_line	corridor_application_id
0	750.0	100000.0	2018-01-01	15000.0	2016-01-01	1.0
1	760.0	120000.0	2018-02-01	12500.0	2016-02-01	2.0

In [8]:

Copied!





# To execute 'example_policy' on application data, we use 'run' method defined in 'Policy' class of Corridor Package
'''
Input to 'run': application data

Output from 'run' - Two spark dataframes: 
    1. 'entity_data': Summary of application data and policy output 
    2. 'offer_data' : Details on every offer evaluated and PASS/FAIL result of each strategy 
'''
policy_entity_data, policy_offer_data = example_policy.run(data={'application': df})
# To execute 'example_policy' on application data, we use 'run' method defined in 'Policy' class of Corridor Package
'''
Input to 'run': application data

Output from 'run' - Two spark dataframes: 
    1. 'entity_data': Summary of application data and policy output 
    2. 'offer_data' : Details on every offer evaluated and PASS/FAIL result of each strategy 
'''
policy_entity_data, policy_offer_data = example_policy.run(data={'application': df})

In [9]:

Copied!

# Coverting to Pandas dataframe for assessment
policy_entity_data = policy_entity_data.toPandas()
policy_offer_data = policy_offer_data.toPandas()
# Coverting to Pandas dataframe for assessment
policy_entity_data = policy_entity_data.toPandas()
policy_offer_data = policy_offer_data.toPandas()

Information availiable in 'policy_entity_data':

1. Application Inputs: 'fico_range_high', 'requested_loan_amount', 'debt_capacity', 'annual_income', 
                        'age_of_credit_file', 'application_date', 'earliest_credit_line', 'application_id'
2. Policy PASS/FAIL flag: 'policy_overall'
3. Strategy PASS/FAIL flag: 'min___eligibility__requirement', 'loan__approval__strategy',
                            'set__customer__value',

Column names mentioned above are same as in policy definition except 'policy_overall' (which is same across policies)

In [10]:

Copied!

policy_entity_data.head(2)
policy_entity_data.head(2)

Out[10]:

	fico_range_high	requested_loan_amount	debt_capacity	annual_income	age_of_credit_file	application_date	earliest_credit_line	policy_overall	min___eligibility__requirement	loan__approval__strategy	set__customer__value	display	application_id
0	750.0	15000.0	0.150000	100000.0	24.0	2018-01-01	2016-01-01	PASS	PASS	PASS	PASS	PASS	1.0
1	760.0	12500.0	0.104167	120000.0	24.0	2018-02-01	2016-02-01	PASS	PASS	PASS	PASS	PASS	2.0

Information availiable in 'policy_offer_data':

1. Offer parameters: 'potential_loan_amount', 'potential_int_rate', 'potential_term'
2. IDs: 'application_id', 'offer_id'
3. Strategy details: 'strategy_name', 'strategy_type', 'strategy_output'
4. Segment_details (if applicable): 'segment_name', 'segment_output'
5. Rule details: 'rule_name','rule_output'
6. Action based rule details (only applied to action rules else NaN/None): 
                    'action_feature_alias', 'action_feature_type', 'action_value_float', 'action_value_str'

In [11]:

Copied!

policy_offer_data.head(2)
policy_offer_data.head(2)

Out[11]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
0	20000.0	7.99	36.0	1.0	68719476740	Min. Eligibility Requirement	Default	PASS	All	PASS	Min FICO & Max Debt Capacity	PASS	None	None	NaN	None
1	20000.0	7.99	36.0	1.0	68719476740	Loan Approval Strategy	Default	FAIL	680 < FICO < 780	PASS	Loan Amount	FAIL	None	None	NaN	None

Analyse generated offers (as required)¶

In [12]:

Copied!

# Number of applications assessed in this run and their application ids
print('Number of applications reviewed:', len(policy_entity_data))
print("Application ids are: " + ", ".join(policy_entity_data['application_id'].astype(str))
# Number of applications assessed in this run and their application ids
print('Number of applications reviewed:', len(policy_entity_data))
print("Application ids are: " + ", ".join(policy_entity_data['application_id'].astype(str))

Number of applications reviewed: 2
Application ids are: 1.0, 2.0

Assessment of a particular application: ID = 1

In [13]:

Copied!

# Subsetting offer_data for 'application_id = 1'
offer_data_app_1 = policy_offer_data[policy_offer_data['application_id'] == 1]
len(offer_data_app_1)
# Subsetting offer_data for 'application_id = 1'
offer_data_app_1 = policy_offer_data[policy_offer_data['application_id'] == 1]
len(offer_data_app_1)

Out[13]:

In [14]:

Copied!

# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743].reset_index(drop=True)
# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743].reset_index(drop=True)

Out[14]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
0	15000.0	10.99	36.0	1.0	68719476743	Min. Eligibility Requirement	Default	PASS	All	PASS	Min FICO & Max Debt Capacity	PASS	None	None	NaN	None
1	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Amount	PASS	None	None	NaN	None
2	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	PD Threshold	PASS	None	None	NaN	None
3	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Pricing	PASS	None	None	NaN	None
4	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Amount	None	None	None	NaN	None
5	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Pricing	None	None	None	NaN	None
6	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	PD Threshold	None	None	None	NaN	None
7	15000.0	10.99	36.0	1.0	68719476743	Set Customer Value	Action	PASS	All	PASS	Customer Value	PASS	customer_value	float	442293.478322	None
8	15000.0	10.99	36.0	1.0	68719476743	Display	Default	PASS	None	None	Is Displayed	PASS	None	None	NaN	None

Each row in offer_data_app_1 is single rule.There are total 9 rules under strategy in this policy: 8 unique rules under strategy (segment and rules layout shown below) and 1 display rule. So for each offer, there are always 9 rows, and in total there are 9*10 = 90 rows in offer_data_app_1.

strategy_name This captures current strategy. For instance 'strategy_name' = 'Min. Eligibility Requirement' which is the first strategy being evaluated.
strategy_type could be any of 'Default' (implying 'Rule Based'), 'Optimization Startegy' or 'Action' depending on stratgey steup on platform.
strategy_output could take value PASS (if the current strategy passes), FAIL (if it fails) or None if it is action strategy.
segment_name is 'All' if there is no segmentation or the applicable segment. For instance: 'segment_name' for 'Loan Approval Strategy' is either "680 < FICO < 780" or "FICO >= 780".
segment_output is PASS/FAIL depending on whether segment definition matches the current offer being assessed. It is always 'PASS' if there are no segments. Example: in current application, fico = 750, hence 'segment_ouput' == PASS when 'segment_name' == "680 < FICO < 780".
rule_name captures the current rule - each row in this table is unique for a (rule,segment) combination for a given offer. Note that there are 3 rows for each 'segment_name' in "Loan Approval Strategy"- this is becasue there are 3 rules being evaluated for each segment - 'Loan Amount', 'Loan Pricing' and 'PD Threshold'.
rule_output is PASS (if the the current offer meets the current rule), FAIL(if it doesn't meet rule criteria) or None (if the rule is not evaluateed - this could be the case for action rule or when segment is not applicable).
action_feature_alias shows the alias of action feature.
action_feature_type shows the type (float/integer) in case of action rule, else None
action_value_float shows the assigned numerical value of 'action_feature' if applicable, else NaN
action_value_str shows the assigned string value of 'action_feature' if applicable, else NaN

Note:

Last four elements in above list are applicable for an action strategy. In example_policy, "Set Customer Value" is an action strategy with no segmentation. customer_value == 425074.652897 for this offer and it is of float type - the calculation method is defined under "Valuation Function" for invoked framework in this policy.
The last row in above table is output of Display logic on this particular offer. Hence, strategy_name = "Display" and 'rule_name' in such case could be Is Displayed/Is Not Displayed depending on whether the offer passes the display logic. Other fields are now applicable. This row will be absent if 'Offer Display' is not defined.

In [15]:

Copied!





# Segments and rules layout in example_policy:
for strategy in example_policy.strategies:
    print(f'strategy: {strategy.name}')
    if strategy.segments:
        for seg in strategy.segments:
            print(f' segment: {seg.name}')
            for rule in seg.rules:
                print(f'  rule: {rule.name}')
    else:
        for rule in strategy.rules:
            print(f' rule: {rule.name}')
# Segments and rules layout in example_policy:
for strategy in example_policy.strategies:
    print(f'strategy: {strategy.name}')
    if strategy.segments:
        for seg in strategy.segments:
            print(f' segment: {seg.name}')
            for rule in seg.rules:
                print(f'  rule: {rule.name}')
    else:
        for rule in strategy.rules:
            print(f' rule: {rule.name}')

strategy: Min. Eligibility Requirement
 rule: Min FICO & Max Debt Capacity
strategy: Loan Approval Strategy
 segment: 680 < FICO < 780
  rule: Loan Amount
  rule: PD Threshold
  rule: Loan Pricing
 segment: FICO >= 780
  rule:  Loan Amount
  rule:  Loan Pricing
  rule:  PD Threshold
strategy: Set Customer Value
 rule: Customer Value

Output of Single Loan Review can be divided into 3 sections:

All Available Offers: This shows all PASS offers - based on PASS criteria defined in Offer Strategy
Offers To Be Displayed: This shows offer combinations to be displayed to end user/customer after filtering all PASS offers through offer display logic
Rules Output By Offer: This is a detail section. It shows segment MATCH (if segment is present for strategy) and PASS/FAIL output of each rule in every strategy for each offer.

All Available Offers¶

In [16]:

Copied!





# Number of offers tested for a particular application (id =1) based on offer configuration setup
print('Total offers evaluated:', len(offer_data_app_1['offer_id'].unique()))

offer_ids = []  # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
    strategy_name = example_policy.strategies[i].name
    passed_offers = offer_data_app_1[(offer_data_app_1['strategy_name']==example_policy.strategies[i].name) & (offer_data_app_1['strategy_output']=="PASS")]
    unique_offers = passed_offers['offer_id'].nunique()
    offer_ids.extend(set(passed_offers.offer_id))
    print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')

# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])    
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")

# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id']
print("*"*20 + " All Available Offers " + "*"*20)
print(offer_data_app_1[offer_data_app_1.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True))
# Number of offers tested for a particular application (id =1) based on offer configuration setup
print('Total offers evaluated:', len(offer_data_app_1['offer_id'].unique()))

offer_ids = []  # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
    strategy_name = example_policy.strategies[i].name
    passed_offers = offer_data_app_1[(offer_data_app_1['strategy_name']==example_policy.strategies[i].name) & (offer_data_app_1['strategy_output']=="PASS")]
    unique_offers = passed_offers['offer_id'].nunique()
    offer_ids.extend(set(passed_offers.offer_id))
    print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')

# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])    
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")

# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id']
print("*"*20 + " All Available Offers " + "*"*20)
print(offer_data_app_1[offer_data_app_1.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True))

Total offers evaluated: 10
Number of Offers passing strategy - Min. Eligibility Requirement: 10
Number of Offers passing strategy - Loan Approval Strategy: 3
Number of Offers passing strategy - Set Customer Value: 10
Total offers passed: 3

******************** All Available Offers ********************
   potential_loan_amount  potential_int_rate  potential_term     offer_id
0                10000.0               10.99            36.0  68719476741
1                12500.0               10.99            36.0  68719476742
2                15000.0               10.99            36.0  68719476743

Offer To be Displayed¶

In [17]:

Copied!





# Offer displayed based on 'display_logic' defined in Offer Creation  
#                    -- This is visible in "Offer To be Displayed" tab on platform
print("*"*20 + " Offer To be Displayed " + "*"*20)
print(offer_data_app_1[(offer_data_app_1['rule_name'] == 'Is Displayed') & 
                       (offer_data_app_1['rule_output'] == 'PASS')][cols].reset_index(drop=True))
# Offer displayed based on 'display_logic' defined in Offer Creation  
#                    -- This is visible in "Offer To be Displayed" tab on platform
print("*"*20 + " Offer To be Displayed " + "*"*20)
print(offer_data_app_1[(offer_data_app_1['rule_name'] == 'Is Displayed') & 
                       (offer_data_app_1['rule_output'] == 'PASS')][cols].reset_index(drop=True))

******************** Offer To be Displayed ********************
   potential_loan_amount  potential_int_rate  potential_term     offer_id
0                15000.0               10.99            36.0  68719476743

Rules Output by Offer¶

We can further look into PASS/FAIL details of a particular offer, say offer_id = 68719476743. This offer is defined as:

potential_loan_amount = 20000
potential_int_rate = 7.99
potential_term = 36

Above 3 parameters along with application_details are assessed together.

In [18]:

Copied!

# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743]
# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743]

Out[18]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
81	15000.0	10.99	36.0	1.0	68719476743	Min. Eligibility Requirement	Default	PASS	All	PASS	Min FICO & Max Debt Capacity	PASS	None	None	NaN	None
82	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Amount	PASS	None	None	NaN	None
83	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	PD Threshold	PASS	None	None	NaN	None
84	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Pricing	PASS	None	None	NaN	None
85	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Amount	None	None	None	NaN	None
86	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Pricing	None	None	None	NaN	None
87	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	PD Threshold	None	None	None	NaN	None
88	15000.0	10.99	36.0	1.0	68719476743	Set Customer Value	Action	PASS	All	PASS	Customer Value	PASS	customer_value	float	442293.478322	None
89	15000.0	10.99	36.0	1.0	68719476743	Display	Default	PASS	None	None	Is Displayed	PASS	None	None	NaN	None

In [19]:

Copied!

# Strategy: Min. Eligibility Requirement
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Min. Eligibility Requirement")]
# Strategy: Min. Eligibility Requirement
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Min. Eligibility Requirement")]

Out[19]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
81	15000.0	10.99	36.0	1.0	68719476743	Min. Eligibility Requirement	Default	PASS	All	PASS	Min FICO & Max Debt Capacity	PASS	None	None	NaN	None

First row (index = 27) captures result of Min. Eligibility Requirement which doesn't have any segmentation. This strategy has just one rule 'Min FICO & Max Debt Capacity' which passes based on rule definition. Hence the strategy_ouptut is also a PASS. Since this is a rule based strategy (indicated by strategy_type = 'default'), action rule related columns are not applicable.

In [20]:

Copied!

# Strategy: Loan Approval Strategy
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Loan Approval Strategy")]
# Strategy: Loan Approval Strategy
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Loan Approval Strategy")]

Out[20]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
82	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Amount	PASS	None	None	NaN	None
83	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	PD Threshold	PASS	None	None	NaN	None
84	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	680 < FICO < 780	PASS	Loan Pricing	PASS	None	None	NaN	None
85	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Amount	None	None	None	NaN	None
86	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	Loan Pricing	None	None	None	NaN	None
87	15000.0	10.99	36.0	1.0	68719476743	Loan Approval Strategy	Default	PASS	FICO >= 780	FAIL	PD Threshold	None	None	None	NaN	None

The next 6 rows (index = 28 - 33) captures result of Loan Approval Strategy which has 2 segments: 680 < FICO < 780 and FICO >= 780.
For this appliation, fico = 750, hence segment_output = PASS when 'segment_name' = '680 < FICO < 780', else FAIL. In each segment, 3 rules are tested.
'Loan Amount' is FAIL
'PD Threshold' is PASS
'Loan Pricing' is FAIL
Hence the strategy_output = FAIL.

Last 3 rows are not applicable because the segment therein is 'FAIL'. Hence rule_output = None.
Since this is a rule based strategy (indicated by strategy_type = 'default'), action rule related columns are not applicable.

In [21]:

Copied!

# Strategy: Set Customer Value
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Set Customer Value")]
# Strategy: Set Customer Value
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Set Customer Value")]

Out[21]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
88	15000.0	10.99	36.0	1.0	68719476743	Set Customer Value	Action	PASS	All	PASS	Customer Value	PASS	customer_value	float	442293.478322	None

The second last row (index = 34) captures result of Set Customer Value which doesn't have any segmentation. This strategy has just one rule 'Customer Value'. The strategy_ouptut is a PASS as it is the case for any action strategy (indicated by strategy_type = 'Action'). customer_value is the alias of the fieature to be assigned which is of type float. COmputed action value = 425074.652897in this case. Since, the assignemnt is of type numerical, action_value_str = None.

In [22]:

Copied!

# Display Logic
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Display")]
# Display Logic
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) & 
                 (offer_data_app_1['strategy_name'] == "Display")]

Out[22]:

	potential_loan_amount	potential_int_rate	potential_term	application_id	offer_id	strategy_name	strategy_type	strategy_output	segment_name	segment_output	rule_name	rule_output	action_feature_alias	action_feature_type	action_value_float	action_value_str
89	15000.0	10.99	36.0	1.0	68719476743	Display	Default	PASS	None	None	Is Displayed	PASS	None	None	NaN	None

The last row (index = 35) captures result of Display for this offer. If the offer passes the display logic as defined in offer creation, 'rule_name' = "Is Displayed" else "Is Not Displayed" All these details are captured in Rules Output by Offer section in Single Loan Reivew on platform

Other Adhoc Analysis¶

In this section, we will do an adhoc analysis on passed offers. In particular, we will look at offered amount, interest and term with respect to application grade

In [23]:

Copied!





# Subsetting passed offers from all offers data
offer_ids = []  # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
    strategy_name = example_policy.strategies[i].name
    passed_offers = policy_offer_data[(policy_offer_data['strategy_name']==example_policy.strategies[i].name) & (policy_offer_data['strategy_output']=="PASS")]
    unique_offers = passed_offers['offer_id'].nunique()
    offer_ids.extend(set(passed_offers.offer_id))
    print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')

# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])    
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")

# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id','application_id']
passed_offers = policy_offer_data[policy_offer_data.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True)
passed_offers
# Subsetting passed offers from all offers data
offer_ids = []  # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
    strategy_name = example_policy.strategies[i].name
    passed_offers = policy_offer_data[(policy_offer_data['strategy_name']==example_policy.strategies[i].name) & (policy_offer_data['strategy_output']=="PASS")]
    unique_offers = passed_offers['offer_id'].nunique()
    offer_ids.extend(set(passed_offers.offer_id))
    print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')

# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])    
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")

# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id','application_id']
passed_offers = policy_offer_data[policy_offer_data.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True)
passed_offers

Number of Offers passing strategy - Min. Eligibility Requirement: 20
Number of Offers passing strategy - Loan Approval Strategy: 7
Number of Offers passing strategy - Set Customer Value: 20
Total offers passed: 7

Out[23]:

	potential_loan_amount	potential_int_rate	potential_term	offer_id	application_id
0	10000.0	10.99	36.0	68719476741	1.0
1	12500.0	10.99	36.0	68719476742	1.0
2	15000.0	10.99	36.0	68719476743	1.0
3	15000.0	10.99	36.0	25769803784	2.0
4	7500.0	10.99	36.0	25769803781	2.0
5	10000.0	10.99	36.0	25769803782	2.0
6	12500.0	10.99	36.0	25769803783	2.0

In [24]:

Copied!

# Adding application information to to offers data
passed_offers_with_app_data = passed_offers.merge(policy_entity_data, on="application_id")
passed_offers_with_app_data
# Adding application information to to offers data
passed_offers_with_app_data = passed_offers.merge(policy_entity_data, on="application_id")
passed_offers_with_app_data

Out[24]:

	potential_loan_amount	potential_int_rate	potential_term	offer_id	application_id	fico_range_high	requested_loan_amount	debt_capacity	annual_income	age_of_credit_file	application_date	earliest_credit_line	policy_overall	min___eligibility__requirement	loan__approval__strategy	set__customer__value	display
0	10000.0	10.99	36.0	68719476741	1.0	750.0	15000.0	0.150000	100000.0	24.0	2018-01-01	2016-01-01	PASS	PASS	PASS	PASS	PASS
1	12500.0	10.99	36.0	68719476742	1.0	750.0	15000.0	0.150000	100000.0	24.0	2018-01-01	2016-01-01	PASS	PASS	PASS	PASS	PASS
2	15000.0	10.99	36.0	68719476743	1.0	750.0	15000.0	0.150000	100000.0	24.0	2018-01-01	2016-01-01	PASS	PASS	PASS	PASS	PASS
3	15000.0	10.99	36.0	25769803784	2.0	760.0	12500.0	0.104167	120000.0	24.0	2018-02-01	2016-02-01	PASS	PASS	PASS	PASS	PASS
4	7500.0	10.99	36.0	25769803781	2.0	760.0	12500.0	0.104167	120000.0	24.0	2018-02-01	2016-02-01	PASS	PASS	PASS	PASS	PASS
5	10000.0	10.99	36.0	25769803782	2.0	760.0	12500.0	0.104167	120000.0	24.0	2018-02-01	2016-02-01	PASS	PASS	PASS	PASS	PASS
6	12500.0	10.99	36.0	25769803783	2.0	760.0	12500.0	0.104167	120000.0	24.0	2018-02-01	2016-02-01	PASS	PASS	PASS	PASS	PASS

In [25]:

Copied!

# Import necessary packages
import pandas as pd

# FICO vs amount   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_loan_amount'].median())
# Import necessary packages
import pandas as pd

# FICO vs amount   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_loan_amount'].median())

Out[25]:

	fico_range_high	potential_loan_amount
application_id
1.0	750.0	12500.0
2.0	760.0	11250.0

In [26]:

Copied!

# FICO vs interest   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_int_rate'].median())
# FICO vs interest   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_int_rate'].median())

Out[26]:

	fico_range_high	potential_int_rate
application_id
1.0	750.0	10.99
2.0	760.0	10.99

In [27]:

Copied!

# FICO vs term   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_term'].median())
# FICO vs term   
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_term'].median())

Out[27]:

	fico_range_high	potential_term
application_id
1.0	750.0	36.0
2.0	760.0	36.0