Running Simulations¶
Simulations can be run for registered objects with new data. For DataElement and Feature, this implies generating DE and feature values for new dataset. For models, simulation generates scores for input data. In policy simulation, offers are generated for new data based on registered policy.
- Running DataElement and Feature simulation with new data
- Running Model simulation with new data
- Policy simulation and analysis of generated offers
Running DataElement and Feature simulation with new data¶
1. An illustration of Simulation on Simple DataElement/Feature
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records
application_table = application_table.limit(1000)
# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()
| corridor_application_id | acc_now_delinq | open_acc_6m | acc_open_past_24mths | addr_state | zip_code | annual_inc | corridor_application_date | application_type | simulated_age | ... | total_bal_ex_mort | total_bc_limit | tot_coll_amt | tot_cur_bal | tot_hi_cred_lim | total_bal_il | total_il_high_credit_limit | total_rev_hi_lim | all_util | __index_level_0__ | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20000018440 | 0 | 0.0 | 2.0 | AZ | 852xx | 75000.0 | 2015-11-24 07:00:00 | Individual | 41 | ... | 100088.0 | 32000.0 | 73.0 | 444844.0 | 476098.0 | 67536.0 | 91314.0 | 36400.0 | 89.0 | 2122946 |
| 1 | 225770004638 | 0 | 2.0 | 8.0 | OH | 452xx | 125000.0 | 2018-07-18 07:00:00 | Individual | 36 | ... | 28524.0 | 20300.0 | 1216.0 | 166257.0 | 199832.0 | 13993.0 | 21625.0 | 26600.0 | 59.0 | 1810773 |
| 2 | 20000077834 | 0 | NaN | 3.0 | MA | 018xx | 113536.0 | 2015-10-23 07:00:00 | Individual | 45 | ... | 275761.0 | 36600.0 | 0.0 | 496816.0 | 559526.0 | NaN | 253426.0 | 51600.0 | NaN | 2182340 |
| 3 | 20000267750 | 0 | NaN | 7.0 | MD | 216xx | 140000.0 | 2015-05-22 07:00:00 | Individual | 37 | ... | 53771.0 | 100700.0 | 0.0 | 323232.0 | 439972.0 | NaN | 8831.0 | 136300.0 | NaN | 2372256 |
| 4 | 225769878100 | 0 | 1.0 | 8.0 | NC | 282xx | 130000.0 | 2018-04-25 07:00:00 | Individual | 47 | ... | 65131.0 | 56900.0 | 0.0 | 168988.0 | 247844.0 | 17643.0 | 26658.0 | 104300.0 | 50.0 | 1684235 |
5 rows × 89 columns
# Import Corridor Package Objects
from corridor import create_data
# Run simulation of annual_income: registered DataElement and # debt_capacity: registered Feature
df = create_data('annual_income', 'debt_capacity',
data={'application': application_table})
# take a look at the attributes createdby runningthe line below:
df.limit(10).toPandas()
| annual_income | debt_capacity | |
|---|---|---|
| 0 | 75000.0 | 0.195333 |
| 1 | 125000.0 | 0.088000 |
| 2 | 113536.0 | 0.325888 |
| 3 | 140000.0 | 0.142857 |
| 4 | 130000.0 | 0.307692 |
| 5 | 68000.0 | 0.029412 |
| 6 | 52500.0 | 0.123810 |
| 7 | 97000.0 | 0.115464 |
| 8 | 40000.0 | 0.425000 |
| 9 | 130000.0 | 0.161538 |
2. An illustration of Simulation on Aggregated DataElement/Feature
In this example, we will create loan_default_in_18m using data stored outside the platform
default_within_18m is created to aggregate loan_performance_table to loan level.
We can run create_data and input both the "performance table" (=performance) and "loan_table" (=loan)
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
# read in the loan Level Dataset
loan_table_1000 = spark.read.parquet('s3a://corridor.dev/master/sampleLoanData.parquet').limit(1000)
# subset loan performance table to keep only the records for the loan_id in loan_table_1000
loan_perf_table = spark.read.parquet('s3a://corridor.dev/master/samplePerfData.parquet')
# # loan_table_1000['corridor_loan_id'] == loan_perf_table['corridor_loan_id']
loan_perf_table_1000 = loan_table_1000.join(loan_perf_table, on='corridor_loan_id')
# Import Corridor Package Objects
from corridor import create_data
df = create_data('loan_default_in_18m',
data={'loan': loan_table_1000,'performance': loan_perf_table_1000})
df.show(2)
# try whether create_data() can use pandas df
+-------------------+ |loan_default_in_18m| +-------------------+ | 0.0| | 0.0| +-------------------+ only showing top 2 rows
Running Model simulation with new data¶
# import spark library
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
# read in the Application Level Dataset
application_table = spark.read.parquet('s3a://corridor.dev/master/sampleAppData.parquet')
# get the first 1000 records
application_table = application_table.limit(1000)
# take a look at the first 5 rows of the application_table dataframe by runningthe line below:
application_table.limit(5).toPandas()
| corridor_application_id | acc_now_delinq | open_acc_6m | acc_open_past_24mths | addr_state | zip_code | annual_inc | corridor_application_date | application_type | simulated_age | ... | total_bal_ex_mort | total_bc_limit | tot_coll_amt | tot_cur_bal | tot_hi_cred_lim | total_bal_il | total_il_high_credit_limit | total_rev_hi_lim | all_util | __index_level_0__ | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20000018440 | 0 | 0.0 | 2.0 | AZ | 852xx | 75000.0 | 2015-11-24 07:00:00 | Individual | 41 | ... | 100088.0 | 32000.0 | 73.0 | 444844.0 | 476098.0 | 67536.0 | 91314.0 | 36400.0 | 89.0 | 2122946 |
| 1 | 225770004638 | 0 | 2.0 | 8.0 | OH | 452xx | 125000.0 | 2018-07-18 07:00:00 | Individual | 36 | ... | 28524.0 | 20300.0 | 1216.0 | 166257.0 | 199832.0 | 13993.0 | 21625.0 | 26600.0 | 59.0 | 1810773 |
| 2 | 20000077834 | 0 | NaN | 3.0 | MA | 018xx | 113536.0 | 2015-10-23 07:00:00 | Individual | 45 | ... | 275761.0 | 36600.0 | 0.0 | 496816.0 | 559526.0 | NaN | 253426.0 | 51600.0 | NaN | 2182340 |
| 3 | 20000267750 | 0 | NaN | 7.0 | MD | 216xx | 140000.0 | 2015-05-22 07:00:00 | Individual | 37 | ... | 53771.0 | 100700.0 | 0.0 | 323232.0 | 439972.0 | NaN | 8831.0 | 136300.0 | NaN | 2372256 |
| 4 | 225769878100 | 0 | 1.0 | 8.0 | NC | 282xx | 130000.0 | 2018-04-25 07:00:00 | Individual | 47 | ... | 65131.0 | 56900.0 | 0.0 | 168988.0 | 247844.0 | 17643.0 | 26658.0 | 104300.0 | 50.0 | 1684235 |
5 rows × 89 columns
# Import Corridor Package Objects
from corridor import create_data
# Run simulation for PD Model Strict: pd_model_ver1 is output feature alias
df = create_data('pd_model_ver1',
data={'application': application_table})
df.limit(10).toPandas()
| pd_model_ver1 | |
|---|---|
| 0 | 0.25 |
| 1 | 0.10 |
| 2 | 0.15 |
| 3 | 0.10 |
| 4 | 0.10 |
| 5 | 0.10 |
| 6 | 0.10 |
| 7 | 0.10 |
| 8 | 0.25 |
| 9 | 0.25 |
Policy simulation and analysis of generated offers¶
A Policy registered in Corridor can be used to generate offers using the corridor library commands in a 3 step process:
- Initiate a Policy object
- Read application data and run single loan review
- Analyse generated offers (as required)
- All Available Offers
- Offers To Be Displayed
- Rules Output By Offer
- Other Adhoc Analysis
we will illustrate this process by taking an example policy that is already registered on the platform
Initiate a Policy object¶
# Import necessary packages
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
from corridor import Policy
# Initiate Policy Object
example_policy = Policy('UW Policy with PD Model and Framework')
Read application data and run single loan review¶
# Import application data - For this illustration, application data is stored in s3 storage as a parquet file
df = spark.read.parquet('s3a://corridor.dev/master/kishan/notebook_examples/policy_inputs.parquet')
df.toPandas().head(2)
| fico_range_high | annual_inc | corridor_application_date | corridor_requested_loan_amount | earliest_cr_line | corridor_application_id | |
|---|---|---|---|---|---|---|
| 0 | 750.0 | 100000.0 | 2018-01-01 | 15000.0 | 2016-01-01 | 1.0 |
| 1 | 760.0 | 120000.0 | 2018-02-01 | 12500.0 | 2016-02-01 | 2.0 |
# To execute 'example_policy' on application data, we use 'run' method defined in 'Policy' class of Corridor Package
'''
Input to 'run': application data
Output from 'run' - Two spark dataframes:
1. 'entity_data': Summary of application data and policy output
2. 'offer_data' : Details on every offer evaluated and PASS/FAIL result of each strategy
'''
policy_entity_data, policy_offer_data = example_policy.run(data={'application': df})
# Coverting to Pandas dataframe for assessment
policy_entity_data = policy_entity_data.toPandas()
policy_offer_data = policy_offer_data.toPandas()
Information availiable in 'policy_entity_data':
1. Application Inputs: 'fico_range_high', 'requested_loan_amount', 'debt_capacity', 'annual_income',
'age_of_credit_file', 'application_date', 'earliest_credit_line', 'application_id'
2. Policy PASS/FAIL flag: 'policy_overall'
3. Strategy PASS/FAIL flag: 'min___eligibility__requirement', 'loan__approval__strategy',
'set__customer__value',
Column names mentioned above are same as in policy definition except 'policy_overall' (which is same across policies)
policy_entity_data.head(2)
| fico_range_high | requested_loan_amount | debt_capacity | annual_income | age_of_credit_file | application_date | earliest_credit_line | policy_overall | min___eligibility__requirement | loan__approval__strategy | set__customer__value | display | application_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 750.0 | 15000.0 | 0.150000 | 100000.0 | 24.0 | 2018-01-01 | 2016-01-01 | PASS | PASS | PASS | PASS | PASS | 1.0 |
| 1 | 760.0 | 12500.0 | 0.104167 | 120000.0 | 24.0 | 2018-02-01 | 2016-02-01 | PASS | PASS | PASS | PASS | PASS | 2.0 |
Information availiable in 'policy_offer_data':
1. Offer parameters: 'potential_loan_amount', 'potential_int_rate', 'potential_term'
2. IDs: 'application_id', 'offer_id'
3. Strategy details: 'strategy_name', 'strategy_type', 'strategy_output'
4. Segment_details (if applicable): 'segment_name', 'segment_output'
5. Rule details: 'rule_name','rule_output'
6. Action based rule details (only applied to action rules else NaN/None):
'action_feature_alias', 'action_feature_type', 'action_value_float', 'action_value_str'
policy_offer_data.head(2)
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20000.0 | 7.99 | 36.0 | 1.0 | 68719476740 | Min. Eligibility Requirement | Default | PASS | All | PASS | Min FICO & Max Debt Capacity | PASS | None | None | NaN | None |
| 1 | 20000.0 | 7.99 | 36.0 | 1.0 | 68719476740 | Loan Approval Strategy | Default | FAIL | 680 < FICO < 780 | PASS | Loan Amount | FAIL | None | None | NaN | None |
Analyse generated offers (as required)¶
# Number of applications assessed in this run and their application ids
print('Number of applications reviewed:', len(policy_entity_data))
print("Application ids are: " + ", ".join(policy_entity_data['application_id'].astype(str))
Number of applications reviewed: 2 Application ids are: 1.0, 2.0
Assessment of a particular application: ID = 1
# Subsetting offer_data for 'application_id = 1'
offer_data_app_1 = policy_offer_data[policy_offer_data['application_id'] == 1]
len(offer_data_app_1)
90
# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743].reset_index(drop=True)
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Min. Eligibility Requirement | Default | PASS | All | PASS | Min FICO & Max Debt Capacity | PASS | None | None | NaN | None |
| 1 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Amount | PASS | None | None | NaN | None |
| 2 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | PD Threshold | PASS | None | None | NaN | None |
| 3 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Pricing | PASS | None | None | NaN | None |
| 4 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Amount | None | None | None | NaN | None |
| 5 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Pricing | None | None | None | NaN | None |
| 6 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | PD Threshold | None | None | None | NaN | None |
| 7 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Set Customer Value | Action | PASS | All | PASS | Customer Value | PASS | customer_value | float | 442293.478322 | None |
| 8 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Display | Default | PASS | None | None | Is Displayed | PASS | None | None | NaN | None |
Each row in offer_data_app_1 is single rule.There are total 9 rules under strategy in this policy: 8 unique rules under strategy (segment and rules layout shown below) and 1 display rule. So for each offer, there are always 9 rows, and in total there are 9*10 = 90 rows in offer_data_app_1.
strategy_nameThis captures current strategy. For instance 'strategy_name' = 'Min. Eligibility Requirement' which is the first strategy being evaluated.strategy_typecould be any of 'Default' (implying 'Rule Based'), 'Optimization Startegy' or 'Action' depending on stratgey steup on platform.strategy_outputcould take value PASS (if the current strategy passes), FAIL (if it fails) or None if it is action strategy.segment_nameis 'All' if there is no segmentation or the applicable segment. For instance: 'segment_name' for 'Loan Approval Strategy' is either "680 < FICO < 780" or "FICO >= 780".segment_outputis PASS/FAIL depending on whether segment definition matches the current offer being assessed. It is always 'PASS' if there are no segments. Example: in current application, fico = 750, hence 'segment_ouput' == PASS when 'segment_name' == "680 < FICO < 780".rule_namecaptures the current rule - each row in this table is unique for a (rule,segment) combination for a given offer. Note that there are 3 rows for each 'segment_name' in "Loan Approval Strategy"- this is becasue there are 3 rules being evaluated for each segment - 'Loan Amount', 'Loan Pricing' and 'PD Threshold'.rule_outputis PASS (if the the current offer meets the current rule), FAIL(if it doesn't meet rule criteria) or None (if the rule is not evaluateed - this could be the case for action rule or when segment is not applicable).action_feature_aliasshows the alias of action feature.action_feature_typeshows the type (float/integer) in case of action rule, else Noneaction_value_floatshows the assigned numerical value of 'action_feature' if applicable, else NaNaction_value_strshows the assigned string value of 'action_feature' if applicable, else NaN
Note:
- Last four elements in above list are applicable for an action strategy. In example_policy, "Set Customer Value" is an action strategy with no segmentation.
customer_value == 425074.652897for this offer and it is of float type - the calculation method is defined under "Valuation Function" for invoked framework in this policy. - The last row in above table is output of
Display logicon this particular offer. Hence, strategy_name = "Display" and 'rule_name' in such case could be Is Displayed/Is Not Displayed depending on whether the offer passes the display logic. Other fields are now applicable. This row will be absent if 'Offer Display' is not defined.
# Segments and rules layout in example_policy:
for strategy in example_policy.strategies:
print(f'strategy: {strategy.name}')
if strategy.segments:
for seg in strategy.segments:
print(f' segment: {seg.name}')
for rule in seg.rules:
print(f' rule: {rule.name}')
else:
for rule in strategy.rules:
print(f' rule: {rule.name}')
strategy: Min. Eligibility Requirement rule: Min FICO & Max Debt Capacity strategy: Loan Approval Strategy segment: 680 < FICO < 780 rule: Loan Amount rule: PD Threshold rule: Loan Pricing segment: FICO >= 780 rule: Loan Amount rule: Loan Pricing rule: PD Threshold strategy: Set Customer Value rule: Customer Value
Output of Single Loan Review can be divided into 3 sections:
- All Available Offers: This shows all PASS offers - based on PASS criteria defined in Offer Strategy
- Offers To Be Displayed: This shows offer combinations to be displayed to end user/customer after filtering all PASS
offers through
offer displaylogic - Rules Output By Offer: This is a detail section. It shows segment MATCH (if segment is present for strategy) and PASS/FAIL output of each rule in every strategy for each offer.
All Available Offers¶
# Number of offers tested for a particular application (id =1) based on offer configuration setup
print('Total offers evaluated:', len(offer_data_app_1['offer_id'].unique()))
offer_ids = [] # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
strategy_name = example_policy.strategies[i].name
passed_offers = offer_data_app_1[(offer_data_app_1['strategy_name']==example_policy.strategies[i].name) & (offer_data_app_1['strategy_output']=="PASS")]
unique_offers = passed_offers['offer_id'].nunique()
offer_ids.extend(set(passed_offers.offer_id))
print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')
# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")
# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id']
print("*"*20 + " All Available Offers " + "*"*20)
print(offer_data_app_1[offer_data_app_1.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True))
Total offers evaluated: 10 Number of Offers passing strategy - Min. Eligibility Requirement: 10 Number of Offers passing strategy - Loan Approval Strategy: 3 Number of Offers passing strategy - Set Customer Value: 10 Total offers passed: 3 ******************** All Available Offers ******************** potential_loan_amount potential_int_rate potential_term offer_id 0 10000.0 10.99 36.0 68719476741 1 12500.0 10.99 36.0 68719476742 2 15000.0 10.99 36.0 68719476743
Offer To be Displayed¶
# Offer displayed based on 'display_logic' defined in Offer Creation
# -- This is visible in "Offer To be Displayed" tab on platform
print("*"*20 + " Offer To be Displayed " + "*"*20)
print(offer_data_app_1[(offer_data_app_1['rule_name'] == 'Is Displayed') &
(offer_data_app_1['rule_output'] == 'PASS')][cols].reset_index(drop=True))
******************** Offer To be Displayed ******************** potential_loan_amount potential_int_rate potential_term offer_id 0 15000.0 10.99 36.0 68719476743
Rules Output by Offer¶
We can further look into PASS/FAIL details of a particular offer, say offer_id = 68719476743. This offer is defined as:
- potential_loan_amount = 20000
- potential_int_rate = 7.99
- potential_term = 36
Above 3 parameters along with application_details are assessed together.
# Contents of the 'offer_data_app_1' for a particular 'offer_id'
offer_data_app_1[offer_data_app_1['offer_id']==68719476743]
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 81 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Min. Eligibility Requirement | Default | PASS | All | PASS | Min FICO & Max Debt Capacity | PASS | None | None | NaN | None |
| 82 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Amount | PASS | None | None | NaN | None |
| 83 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | PD Threshold | PASS | None | None | NaN | None |
| 84 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Pricing | PASS | None | None | NaN | None |
| 85 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Amount | None | None | None | NaN | None |
| 86 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Pricing | None | None | None | NaN | None |
| 87 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | PD Threshold | None | None | None | NaN | None |
| 88 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Set Customer Value | Action | PASS | All | PASS | Customer Value | PASS | customer_value | float | 442293.478322 | None |
| 89 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Display | Default | PASS | None | None | Is Displayed | PASS | None | None | NaN | None |
# Strategy: Min. Eligibility Requirement
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) &
(offer_data_app_1['strategy_name'] == "Min. Eligibility Requirement")]
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 81 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Min. Eligibility Requirement | Default | PASS | All | PASS | Min FICO & Max Debt Capacity | PASS | None | None | NaN | None |
First row (index = 27) captures result of Min. Eligibility Requirement which doesn't have any segmentation. This strategy has just one rule 'Min FICO & Max Debt Capacity' which passes based on rule definition. Hence the strategy_ouptut is also a PASS. Since this is a rule based strategy (indicated by strategy_type = 'default'), action rule related columns are not applicable.
# Strategy: Loan Approval Strategy
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) &
(offer_data_app_1['strategy_name'] == "Loan Approval Strategy")]
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 82 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Amount | PASS | None | None | NaN | None |
| 83 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | PD Threshold | PASS | None | None | NaN | None |
| 84 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | 680 < FICO < 780 | PASS | Loan Pricing | PASS | None | None | NaN | None |
| 85 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Amount | None | None | None | NaN | None |
| 86 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | Loan Pricing | None | None | None | NaN | None |
| 87 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Loan Approval Strategy | Default | PASS | FICO >= 780 | FAIL | PD Threshold | None | None | None | NaN | None |
The next 6 rows (index = 28 - 33) captures result of Loan Approval Strategy which has 2 segments: 680 < FICO < 780 and FICO >= 780.
For this appliation, fico = 750, hence segment_output = PASS when 'segment_name' = '680 < FICO < 780', else FAIL.
In each segment, 3 rules are tested.
'Loan Amount' is FAIL
'PD Threshold' is PASS
'Loan Pricing' is FAIL
Hence the strategy_output = FAIL.
Last 3 rows are not applicable because the segment therein is 'FAIL'. Hence rule_output = None.
Since this is a rule based strategy (indicated by strategy_type = 'default'), action rule related columns are not applicable.
# Strategy: Set Customer Value
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) &
(offer_data_app_1['strategy_name'] == "Set Customer Value")]
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 88 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Set Customer Value | Action | PASS | All | PASS | Customer Value | PASS | customer_value | float | 442293.478322 | None |
The second last row (index = 34) captures result of Set Customer Value which doesn't have any segmentation. This strategy has just one rule 'Customer Value'. The strategy_ouptut is a PASS as it is the case for any action strategy (indicated by strategy_type = 'Action'). customer_value is the alias of the fieature to be assigned which is of type float. COmputed action value = 425074.652897in this case. Since, the assignemnt is of type numerical, action_value_str = None.
# Display Logic
offer_data_app_1[(offer_data_app_1['offer_id']==68719476743) &
(offer_data_app_1['strategy_name'] == "Display")]
| potential_loan_amount | potential_int_rate | potential_term | application_id | offer_id | strategy_name | strategy_type | strategy_output | segment_name | segment_output | rule_name | rule_output | action_feature_alias | action_feature_type | action_value_float | action_value_str | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 89 | 15000.0 | 10.99 | 36.0 | 1.0 | 68719476743 | Display | Default | PASS | None | None | Is Displayed | PASS | None | None | NaN | None |
The last row (index = 35) captures result of Display for this offer. If the offer passes the display logic as defined in offer creation, 'rule_name' = "Is Displayed" else "Is Not Displayed"
All these details are captured in Rules Output by Offer section in Single Loan Reivew on platform
Other Adhoc Analysis¶
In this section, we will do an adhoc analysis on passed offers. In particular, we will look at offered amount, interest and term with respect to application grade
# Subsetting passed offers from all offers data
offer_ids = [] # List to store offers_ids for which each strategy PASS
for i in range(0,len(example_policy.strategies)):
strategy_name = example_policy.strategies[i].name
passed_offers = policy_offer_data[(policy_offer_data['strategy_name']==example_policy.strategies[i].name) & (policy_offer_data['strategy_output']=="PASS")]
unique_offers = passed_offers['offer_id'].nunique()
offer_ids.extend(set(passed_offers.offer_id))
print(f'Number of Offers passing strategy - {strategy_name}: {unique_offers}')
# Passed offers
offer_ids_passed = set([x for x in offer_ids if offer_ids.count(x) == len(example_policy.strategies)])
print(f'Total offers passed: {len(offer_ids_passed)}' + "\n")
# Print passed offers -- This is visible in "All Available Offers" tab on platform
cols = ['potential_loan_amount', 'potential_int_rate', 'potential_term', 'offer_id','application_id']
passed_offers = policy_offer_data[policy_offer_data.offer_id.isin(offer_ids_passed)][cols].drop_duplicates().reset_index(drop=True)
passed_offers
Number of Offers passing strategy - Min. Eligibility Requirement: 20 Number of Offers passing strategy - Loan Approval Strategy: 7 Number of Offers passing strategy - Set Customer Value: 20 Total offers passed: 7
| potential_loan_amount | potential_int_rate | potential_term | offer_id | application_id | |
|---|---|---|---|---|---|
| 0 | 10000.0 | 10.99 | 36.0 | 68719476741 | 1.0 |
| 1 | 12500.0 | 10.99 | 36.0 | 68719476742 | 1.0 |
| 2 | 15000.0 | 10.99 | 36.0 | 68719476743 | 1.0 |
| 3 | 15000.0 | 10.99 | 36.0 | 25769803784 | 2.0 |
| 4 | 7500.0 | 10.99 | 36.0 | 25769803781 | 2.0 |
| 5 | 10000.0 | 10.99 | 36.0 | 25769803782 | 2.0 |
| 6 | 12500.0 | 10.99 | 36.0 | 25769803783 | 2.0 |
# Adding application information to to offers data
passed_offers_with_app_data = passed_offers.merge(policy_entity_data, on="application_id")
passed_offers_with_app_data
| potential_loan_amount | potential_int_rate | potential_term | offer_id | application_id | fico_range_high | requested_loan_amount | debt_capacity | annual_income | age_of_credit_file | application_date | earliest_credit_line | policy_overall | min___eligibility__requirement | loan__approval__strategy | set__customer__value | display | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10000.0 | 10.99 | 36.0 | 68719476741 | 1.0 | 750.0 | 15000.0 | 0.150000 | 100000.0 | 24.0 | 2018-01-01 | 2016-01-01 | PASS | PASS | PASS | PASS | PASS |
| 1 | 12500.0 | 10.99 | 36.0 | 68719476742 | 1.0 | 750.0 | 15000.0 | 0.150000 | 100000.0 | 24.0 | 2018-01-01 | 2016-01-01 | PASS | PASS | PASS | PASS | PASS |
| 2 | 15000.0 | 10.99 | 36.0 | 68719476743 | 1.0 | 750.0 | 15000.0 | 0.150000 | 100000.0 | 24.0 | 2018-01-01 | 2016-01-01 | PASS | PASS | PASS | PASS | PASS |
| 3 | 15000.0 | 10.99 | 36.0 | 25769803784 | 2.0 | 760.0 | 12500.0 | 0.104167 | 120000.0 | 24.0 | 2018-02-01 | 2016-02-01 | PASS | PASS | PASS | PASS | PASS |
| 4 | 7500.0 | 10.99 | 36.0 | 25769803781 | 2.0 | 760.0 | 12500.0 | 0.104167 | 120000.0 | 24.0 | 2018-02-01 | 2016-02-01 | PASS | PASS | PASS | PASS | PASS |
| 5 | 10000.0 | 10.99 | 36.0 | 25769803782 | 2.0 | 760.0 | 12500.0 | 0.104167 | 120000.0 | 24.0 | 2018-02-01 | 2016-02-01 | PASS | PASS | PASS | PASS | PASS |
| 6 | 12500.0 | 10.99 | 36.0 | 25769803783 | 2.0 | 760.0 | 12500.0 | 0.104167 | 120000.0 | 24.0 | 2018-02-01 | 2016-02-01 | PASS | PASS | PASS | PASS | PASS |
# Import necessary packages
import pandas as pd
# FICO vs amount
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_loan_amount'].median())
| fico_range_high | potential_loan_amount | |
|---|---|---|
| application_id | ||
| 1.0 | 750.0 | 12500.0 |
| 2.0 | 760.0 | 11250.0 |
# FICO vs interest
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_int_rate'].median())
| fico_range_high | potential_int_rate | |
|---|---|---|
| application_id | ||
| 1.0 | 750.0 | 10.99 |
| 2.0 | 760.0 | 10.99 |
# FICO vs term
pd.DataFrame(passed_offers_with_app_data.groupby(['application_id'])['fico_range_high', 'potential_term'].median())
| fico_range_high | potential_term | |
|---|---|---|
| application_id | ||
| 1.0 | 750.0 | 36.0 |
| 2.0 | 760.0 | 36.0 |