# bonify
###### tags: `interview`
"Our users often present a small number of data points, from which we should extract some meaning. Often the most valuable patterns we are concerned with are repeated payments or credits. Below are 12 months of fictional transactional data. We want to better understand what are repeated payments and any patterns in those transactions.
Your task is to write python code to extract any patterns that you think are important with particular relevance to prediction. We are looking for creativity in understanding user data and elegance of your solution.
Keep in mind that at bonify we want to be useful to our users, so think about what features might be useful for the user to understand their data better.
Please answer all questions; either answer A (a) or A (b) should be answered with Python code. Don’t spend more than 4 hours on this task please.
## A. Write code to read in the JSON file, extract the transaction data and perform an analysis on those transactions:
### a. Write a generic algorithm to recognise income and spending patterns of the user. Hint: Use tools and techniques you think are appropriate in best understanding user behaviour. How would you discover repeated payments?
### code
https://drive.google.com/open?id=1e7ei-_NbTdHB11PAk2ax8gJlIQiFibTr
### tools for visualization
https://microsoft.github.io/SandDance/app/
### outcome pattern
1. account and booking type
Plotting account on the x-axis and coloring by booking type shows that,
apart from the unknown account,
account and booking type are very highly correlated:
we can almost always identify the booking type from the account,
and the other way around.
Since every `unknown account` transaction maps to `booking type ATM`, we can be almost sure that any payment from an unknown account
is an ATM withdrawal.

2. The data shows a clear monthly pattern: the user has steady income and spending and does not appear to spend impulsively.


3. Most transactions happen at the end of the month. A transaction between the 1st and the 20th of the month is almost always an ATM withdrawal. This also means that any other payment between the 1st and the 20th could trigger a fraud alert.

4. how I did the feature engineering
check `1.feature engineering.ipynb`
5. how to discover repeated transactions
check `2.repeat payment.ipynb`
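
The patterns listed above can be checked with a few lines of plain Python. The following is a minimal sketch, not the code from the notebooks: the field names `account`, `booking_type`, `date` and `amount`, the 26 to 35 day gap window, the 10% amount tolerance and the sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import median


def booking_type_by_account(transactions):
    """Pattern 1: most common booking type per account and its share."""
    counts = defaultdict(Counter)
    for tx in transactions:
        counts[tx["account"]][tx["booking_type"]] += 1
    summary = {}
    for account, type_counts in counts.items():
        top_type, top_count = type_counts.most_common(1)[0]
        summary[account] = (top_type, top_count / sum(type_counts.values()))
    return summary


def is_unusual(tx, cutoff_day=20):
    """Pattern 3: a non-ATM transaction on or before cutoff_day is unusual."""
    return tx["date"].day <= cutoff_day and tx["booking_type"] != "ATM"


def find_monthly_repeats(transactions, min_count=3, tolerance=0.1):
    """Accounts booked roughly once a month with a roughly constant amount."""
    by_account = defaultdict(list)
    for tx in transactions:
        by_account[tx["account"]].append(tx)
    repeats = {}
    for account, txs in by_account.items():
        if len(txs) < min_count:
            continue
        txs.sort(key=lambda tx: tx["date"])
        # Days between consecutive bookings for the same account.
        gaps = [(b["date"] - a["date"]).days for a, b in zip(txs, txs[1:])]
        typical = median(tx["amount"] for tx in txs)
        monthly = all(26 <= gap <= 35 for gap in gaps)  # assumed window
        stable = all(
            abs(tx["amount"] - typical) <= tolerance * abs(typical) for tx in txs
        )
        if monthly and stable:
            repeats[account] = typical
    return repeats


# Made-up sample records, not taken from the provided data set.
sample = [
    {"account": "landlord", "booking_type": "standing_order",
     "date": date(2019, 1, 28), "amount": -800},
    {"account": "landlord", "booking_type": "standing_order",
     "date": date(2019, 2, 28), "amount": -800},
    {"account": "landlord", "booking_type": "standing_order",
     "date": date(2019, 3, 28), "amount": -800},
    {"account": "unknown", "booking_type": "ATM",
     "date": date(2019, 1, 5), "amount": -50},
    {"account": "unknown", "booking_type": "ATM",
     "date": date(2019, 1, 12), "amount": -100},
    {"account": "unknown", "booking_type": "ATM",
     "date": date(2019, 2, 20), "amount": -60},
]

by_account = booking_type_by_account(sample)
repeats = find_monthly_repeats(sample)
flagged = [tx for tx in sample if is_unusual(tx)]
```

Grouping by account alone relies on pattern 1 (one account, one booking type). If that did not hold for another user, grouping by account and booking type together would likely be the safer key.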
### b. How would you predict the next six months of transactions. Create a prediction method or function that can predict transactions based on any rules you discovered earlier. Be prepared to explain your reason.
check `3.model.ipynb`
#### remove outliers
1. repeated payments
2. high income
#### models
model A: input (date, type), predict amount
model B: input (date, amount), predict type
#### use case
When we receive a transaction, we send it to models A and B to see whether it fits this user's historical trend (i.e. behavior pattern). We also check with the repeat transaction service to see whether it is a repeated payment.
note: the model was not finished within the 4 hours, but I can send it over later when it is done.
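
Until the real model exists, the check described in the use case can be illustrated with a simple baseline. This is only a stand-in for model A, not the model in `3.model.ipynb`: it ignores the date, predicts the median historical amount per booking type, and accepts a new transaction if its amount is within an assumed 25% of that value. The field names and sample records are hypothetical.

```python
from collections import defaultdict
from statistics import median


def fit_amount_baseline(history):
    """Stand-in for model A: typical (median) amount per booking type."""
    amounts = defaultdict(list)
    for tx in history:
        amounts[tx["booking_type"]].append(tx["amount"])
    return {booking_type: median(values) for booking_type, values in amounts.items()}


def fits_history(tx, baseline, tolerance=0.25):
    """True if the amount is within tolerance of the typical amount for its type."""
    expected = baseline.get(tx["booking_type"])
    if expected is None:
        return False  # booking type never seen for this user
    return abs(tx["amount"] - expected) <= tolerance * abs(expected)


# Made-up history, not taken from the provided data set.
history = [
    {"booking_type": "rent", "amount": -800},
    {"booking_type": "rent", "amount": -800},
    {"booking_type": "rent", "amount": -810},
    {"booking_type": "ATM", "amount": -50},
    {"booking_type": "ATM", "amount": -100},
    {"booking_type": "ATM", "amount": -60},
]

baseline = fit_amount_baseline(history)
ok = fits_history({"booking_type": "rent", "amount": -805}, baseline)
suspicious = fits_history({"booking_type": "ATM", "amount": -500}, baseline)
```

A transaction that fails this check is not necessarily fraud; it only means the amount is unlike what this user has booked under that type before.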
## B. If more users were provided, explain how you would use more user data from other users to give more insight into individual data patterns.
Gather the booking types of all other users to find out which group this user belongs to:
- family or single
- with or without a pet
- with or without a car
- renting a house or paying off a loan
- uses cash or not

Compare transaction dates to find out whether this user:
- makes impulsive or rational transactions
- spends money on weekdays or on weekends
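
As one example of a date-based feature that could be compared across users, the share of transactions booked on a weekend can be computed per user. This is a minimal sketch; the `date` field name and the sample records are assumptions.

```python
from datetime import date


def weekend_share(transactions):
    """Fraction of transactions booked on a Saturday or Sunday."""
    if not transactions:
        return 0.0
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    weekend = sum(1 for tx in transactions if tx["date"].weekday() >= 5)
    return weekend / len(transactions)


# Made-up records: 2019-01-05 is a Saturday, 2019-01-06 a Sunday.
sample = [
    {"date": date(2019, 1, 5)},
    {"date": date(2019, 1, 6)},
    {"date": date(2019, 1, 7)},
    {"date": date(2019, 1, 8)},
]
share = weekend_share(sample)
```

With one such number per user, this user's value can be placed against the distribution over all users to say whether they are more of a weekday or a weekend spender.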
## C. What pitfalls do you see in your analysis? Or the techniques used?"
A new user has no historical data, so the model has nothing to learn from. For better predictions, we will need to build a model that predicts which user pattern a new user follows.