# 订票系统
考察重点:数据库设计,**防止超卖**,峰值高并发
## What's the functional requirement?
根据地点,电影名搜索电影,选座位订票付款,排队购票
Search for movies by location and title, select seats, make a reservation, and proceed with payment, and queue for ticket purchase
Queue for ticket purchase.
## non-functional requirement?
高扩展行,支持大量电影院及电影,高一致性(可以牺牲读的一致性,但是写入必须高一致性),一个位置只能卖一次,高峰值流量
High scalability, supporting a large number of cinemas and movies, high consistency (read consistency can be sacrificed, but write operations must have high consistency), one seat can only be sold once, handling high peak traffic
high consistency, high scalability.
设计注意事项
1。为简单起见,假设我们的服务不需要任何用户身份验证。
2、系统不处理部分客票订单。要么用户得到了他们想要的所有门票,要么什么也得不到。
3、公平是制度的强制性要求。
4、为了防止系统滥用,我们可以限制用户一次预订10个以上的座位。
5、我们可以假设,在人们期待已久的热门电影上映时,流量会急剧上升,座位很快就会挤满。该系统应具有可扩展性和高可用性,以跟上流量激增的步伐。
Design considerations:
For simplicity, let's assume our service doesn't require any user authentication.
The system doesn't handle partial ticket orders. Users either get all the tickets they want or none at all.
Fairness is a mandatory requirement of the system.
To prevent abuse of the system, we can limit users from booking more than 10 seats at once.
We can assume that when highly anticipated movies are released, traffic will increase significantly, and seats will quickly fill up. The system should be scalable and highly available to cope with the surge in traffic."
## Calcualtion
每天访问1M次(1 million daily visits)(read操作)
每天售出100 K(售出时写入数据库 - write)
Sell 100,000 tickets daily (writing to the database during sales)
考虑抢票情况:100倍于平时流量
Peak QPS Calculation:
read: 1M * 100/(3600 * 24) = 1200 - AP
write: 100k * 100 / (3600 * 24) = 120 - CP
五年存储需要多大,假设100B一张票 (May I ask how much storage space is needed for one ticket)
Ticket DB - 100K * 100B * 365 * 5 = 18GB
Transaction DB - 100K * 100B * 365 * 5 = 18GB
## service
1. Search service:查找电影,确定电影院,放映厅
根据zipcode(GEOhash),电影名字返回电影场次信息,
2. Ticket Transaction service:选具体座位,出票,付款(不同付款方式)高一致性,出票付款同时完成才算成功
展示电影院座位分布-> 保留座位-〉给用户15min付款 -> 如果完成,确认用户购票成功。如果没有完成,重新开放座位
3. Payment service:(External dependency)付款系统
e.g. 验证信用卡信息是对的(和外部有API的交互)
4. waitlist service:等票系统。e.g. 30个座位,100个人想看,剩余的人进入waiting list中。前面的人退票or付款失败,其他人可以买票
提供付费方式,加入waitlist(很想要这张票时才会提供credit card然后去排队等票) -> 如果前一个用户未能付款,系统自动买票 -> 如果购买失败,买票机会留给下一位 -> 如果购买成功 通知用户
5. user service
## DB design
Searching service:
1. Cinema: Cinima_Id, zipcode(GeoHash), address, desc
2. Cinema Hall: Hall_id, cinema_id, desc
3. Movies: Movie_id, name, desc, Duration
4. Events: Id, movie_id, hall_id, startTime, status, desc(哪个电影在哪个地方放映,几点开始)
- status:
- 正在上映(Now Showing): 这些电影当前正在影院上映,观众可以购买票并观看。
- 即将上映(Coming Soon): 这是指即将在不久的将来上映的电影,通常有宣传和预告片。
- 预售中(Pre-Sale): 这表示电影票已经开始预售,观众可以提前购买票,通常在电影上映前几天或周末。
- 已下映(Off Screen): 这些电影已经从影院下映,观众不能再购买票观看。
- 特别放映(Special Screening): 这包括一些特别场次,如经典电影重映、主题电影展映等。
- 加料场次(Encore Screening): 某些电影可能会在原本上映周期结束后重新上映。
- 全天上映(All Day): 一些影院可能在某一天播放特定电影,称为全天上映。
- 限量场次(Limited Screening): 这些是在特定时间或地点上映的电影,通常不在所有影院播放。
Transaction service:
1. tickets(Strong consistency): ticketId, event_id, row, col, price, status, modifytime
2. transactions(因为有第三方dependency,当transaction完成后才更新ticket出票): Trans_id, userId, ticket_id, status, type, associate_trans_id(e.g. 出现了refund, associate上一次交易成功的id), paymentId(外部支付id,e.g. zelle or paypal)
- status:
- pending: Payment pending on confirmation 15min
- Completed: 付款已成功完成,
- Failed: 付款尝试失败,
- Cancelled: 付款被取消,
- Refunded: 付款金额已退,
- Authorized: 付款已被授权,但尚未完成,
- Processing: 付款正在处理中,
- on Hold: 付款暂时被挂起,等待进一步处理。
- 
User service:
1. Users: userId, name, email,password
## Scalable
查电影如何按位置做sharding, 加入geoHash做为sharding key。
所以Events talbe(Id, movie_id, hall_id, starttime, status, desc(哪个电影在哪个地方放映,几点开始))中需要再加入GeoHash,根据GeoHash(把二维坐标变成一维存储) sharding来找到某个具体的event。通过zipcode可以快速找到场次

GeoHash前面重合的字母越多,离得越近。(sharding_hash5)表示所处位置的精度有多少。找电影院的话25km以内 -> hash4 or hash5

## API design
1. SearchTicket: GET /v1/searchTicket?zipCode=10014&movieId=123
```java
@GetMapping
public PostResponse getAllPosts(
@RequestParam(value = "pageNo", defaultValue = AppConstants.DEFAULT_PAGE_NUMBER, required = false) int pageNo,
@RequestParam(value = "pageSize", defaultValue = AppConstants.DEFAULT_PAGE_SIZE, required = false) int pageSize,
@RequestParam(value = "sortBy", defaultValue = AppConstants.DEFAULT_SORT_BY, required = false) String sortBy,
@RequestParam(value = "sortDir", defaultValue = AppConstants.DEFAULT_SORT_DIR, required = false) String sortDir
) {
return postService.getAllPost(pageNo, pageSize, sortBy, sortDir);
}
```

2. Transaction: POST /v1/Transaction

## 保证Strong Consistency - Coordinator Service
保证操作要么全部成功,要么全部失败
## Concurrency
How to handle concurrency to ensure that no two users can book the same seat. We can **use transactions in an SQL database to avoid any conflicts**. For example, if we are using SQL Server, we can leverage **transaction isolation levels to lock rows and then update them**
Serializable is the highest isolation level, ensuring immunity to dirty reads, non-repeatable reads, and phantom reads. It's important to note that when we read rows in a transaction, we obtain a write lock, preventing others from updating them.
Once the above database transaction is successful, we can begin tracking reservations in the ActiveReservationService
## 如何保证不超卖-使用Lock
在选定座位后进行预留并开始倒计时,避免其他用户购买该座位
避免用户输入付款信息后,发现座位已经sold out
只对写加锁,读不需要。如何在分布式下实现?
1. master-slave,master负责写加锁,slave不用加锁

## 服务器在首款完成后,更新票状态前down机怎么处理
Distributed Transaction:需要有一个Transaction Manager,保证收款和更新ticket status同时发生,如果只发生其一,则需要重试失败操作或者rollback到initial status
缺点:Blocking,一旦有任何节点长期down机,Transaction就会停滞

这里的participant是:ticket table和payment db,transaction
prepare阶段,每个participant都有一个undo log
## Cache
Query cache:相同的zipcode和movieid
## Challenge
如何保证15min 其他人可以看到票available
Ticket Status - Available/NotAvailable/pending **lazy update**
where status = available or(status = pending and current_timestamp - modified_time > 15 min), 其他人选了这个座位后update status
Cron job
当一个人买票后状态马上变成pending, 放入kafka中,设置延迟,15min后available
## 等票时如何检查付款超时并通知下一位
一旦确认付款失败,则将此交易状态设置为fail,同时读取waiting list仍在等票的最早用户,自动购票,一直到成功为止
## Disaster recover
1. 服务器:stateless service, Kafka as distributed log, transaction coordinate
2. 存储:Master-slave replica(DB,cache)
## ORR(Operational Readiness Review)运维
1. Payment Success Rate(如果一直回滚,需要dive deep一下原因)
2. Service Availability(Availablility如果从100降到90 -> 机器挂掉,需要oncall有所响应,看是什么原因导致的)
3. DB Write/Read load(看突然读的特别多,读的特别少的情况。读的特别少 -> 有可能系统出了问题)
4. Cache Hit Rate(缓存有效利用率)