ar851060 - HackMD

Zapier vs. Make.com vs. n8n
I recently needed a no-code workflow automation platform to handle the backend of a website I built, so I tested these three platforms to see which one suits my situation best. First, though, I want to explain what a no-code workflow automation platform is.

Most programs are, in effect, automated workflows: they spare people a lot of tedious work. However, not everyone knows how to write a program, and that is the gap no-code workflow automation platforms fill. The basic logic of these platforms is: "if triggered, then do something."
ar851060 changed a month ago
Winsorization
What is Winsorization

Winsorization is a method for handling outliers. Instead of dropping them, it caps them at a threshold. The process is:

1. Set the outlier thresholds (e.g. values below the 5th percentile and above the 95th percentile).
2. Replace each outlier with the corresponding threshold value.

That's it: a very simple way to deal with outliers.
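The two steps above can be sketched in a few lines of plain Python. This is only a minimal illustration: the helper names `percentile` and `winsorize`, the linear-interpolation percentile rule, and the sample data are assumptions made for the example, not something taken from the article.

```python
def percentile(sorted_values, q):
    """Percentile (q in [0, 100]) of an already sorted list, using linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def winsorize(values, lower_q=5, upper_q=95):
    """Cap every value at the lower_q-th and upper_q-th percentiles."""
    ordered = sorted(values)
    low = percentile(ordered, lower_q)    # step 1: set the thresholds
    high = percentile(ordered, upper_q)
    # step 2: replace anything outside the thresholds with the threshold itself
    return [min(max(v, low), high) for v in values]


# Hypothetical data with one obvious outlier (1000).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
clipped = winsorize(data)
print(clipped)
```

Note that the number of observations stays the same: the extreme values are pulled in to the thresholds, while everything between the thresholds is left untouched.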
ar851060 changed a month ago
Partial Out Everything - from FWL theory to CUPED, IV/2SLS, and Double ML
In recent research, I found that several methods in AB testing and causal inference rely on the same theory: the Frisch-Waugh-Lovell theorem. Before digging into this topic, however, I want to briefly explain the relation between linear regression and the t-test.

T-test vs. Linear Regression

They are the same if the variable in the linear regression is a dummy ({0,1}). In a t-test, we calculate the t-statistic $T$ after the experiment using the treatment $X$ and the outcome $Y$. However, if we use linear regression, we can find that $$
ar851060 changed 2 months ago
Introduction to Switchback Testing
Why Switchback Testing

There are many situations in which a classical AB test is hard to set up. More specifically, it is hard to randomize users into two groups for the experiment. For example, in a social network or messaging app, if Alice and Bob are in different groups but are connected, they may notice that they see different versions; this is called the network effect. As another example, on a two-sided platform such as Uber or Airbnb, if we want to test different prices, the users in one group might start to take up more resources and crowd out the people in the other group.

In these cases we cannot use classical AB testing, because such effects make the experiment untrustworthy. Instead of randomizing over users, we can randomize over time slots. This method is called switchback testing.
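Randomizing over time slots can be sketched as follows. This is a minimal illustration under assumed settings: the 30-minute slot length, the start date, the seed, and the function names `assign_time_slots` and `variant_at` are all hypothetical choices for the example.

```python
import random
from datetime import datetime, timedelta


def assign_time_slots(start, num_slots, slot_length, seed=None):
    """Randomly assign each consecutive time slot to 'control' or 'treatment'."""
    rng = random.Random(seed)
    schedule = []
    for i in range(num_slots):
        slot_start = start + i * slot_length
        variant = rng.choice(["control", "treatment"])
        schedule.append((slot_start, slot_start + slot_length, variant))
    return schedule


def variant_at(schedule, moment):
    """Return the variant every user sees at the given moment, or None outside the test."""
    for slot_start, slot_end, variant in schedule:
        if slot_start <= moment < slot_end:
            return variant
    return None


# Hypothetical setup: one day split into 48 slots of 30 minutes each.
schedule = assign_time_slots(
    datetime(2025, 5, 21), num_slots=48, slot_length=timedelta(minutes=30), seed=7
)
print(variant_at(schedule, datetime(2025, 5, 21, 0, 10)))
```

Because everyone active in the same slot sees the same version, connected users such as Alice and Bob never see different versions at the same time.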
ar851060 changed 2 months ago
Multiple Comparison with Some Solutions
What is multiple comparison

Actually, I think the comic from xkcd explains the multiple comparison problem briefly and precisely (comic: "significant", from xkcd).

In one sentence

The more times we run the same experiment, the more likely we are to get some extreme results.
ar851060 changed 2 months ago
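A small calculation makes the one-sentence summary concrete. The sketch below assumes the tests are independent and uses 20 tests at a 5% significance level purely as an illustrative setting. The Bonferroni correction is shown as one common remedy; it is not necessarily the solution the article goes on to discuss.

```python
def family_wise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_alpha(alpha, num_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


alpha = 0.05
num_tests = 20  # illustrative number of repeated experiments

uncorrected = family_wise_error_rate(alpha, num_tests)
corrected = family_wise_error_rate(bonferroni_alpha(alpha, num_tests), num_tests)

print(f"uncorrected: {uncorrected:.3f}")
print(f"Bonferroni-corrected: {corrected:.3f}")
```

With these numbers, the chance of at least one false positive is roughly 64% without any correction, and it drops back below 5% once each test uses the stricter per-test level.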
How to use AWS dynamoDB outside AWS Environment
Previously, I saved my JSON-like data in the Firebase Realtime Database. While I was developing the feature, everything was fine. However, once I deployed it and opened it to the public, I found that I had run out of the Firebase free tier. (Figure: The free tier of Firebase Realtime Database) At the same time, I found that the free tier of AWS DynamoDB is much larger than that of the Firebase Realtime Database, so I decided to move my data from Firebase to DynamoDB. (Figure: The free tier of AWS DynamoDB) You can see the difference: 25 GB of storage vs. 1 GB. That's why I wanted to move the data.

Using DynamoDB outside AWS is not that simple

I thought my old article, Build an AI Line Chatbot using AWS Bedrock, already had the solution for working with DynamoDB. However, I was wrong. Inside the AWS environment, you just need to do two things:
ar851060 changed 9 months ago
Find Out the Interest of Users - Fake Door Testing
Introduction to Fake Door Testing

Strictly speaking, fake door testing is not an AB testing method. However, it can uncover information for which there is no historical data. For example, if we want to know how many people are interested in an upcoming feature before releasing it, and no existing feature resembles the new one, then we use fake door testing rather than AB testing.

How to Do the Fake Door Testing

As the name suggests, a "fake door" means we need a proxy for measuring users' interest. For example, we can add a button for the new feature that shows "under construction" when users click it. Although a click does not directly prove that the user is really interested in the new feature, it still tells us something. The hardest part of fake door testing is finding the proxy. The perfect proxy is low cost, is directly related to the new feature, and does not degrade the user experience during the testing period.

Example

When I finished my project "handpalm AI", I put a button into the chatbot that shows "Sorry! This new feature is under construction now. Please wait for our news".
ar851060 changed 10 months ago
Introduction to Sequential Testing
Advantage of Sequential Testing

Sequential testing is one of the methods that can reduce the number of samples an AB test needs. It also lets us peek at the results while the test is still running. So there are two main advantages:

- Reducing the number of samples and speeding up the experiment
- Peeking at the results during the test

Suitable Scenarios for Sequential Testing

However, after trying sequential testing once, I found that two scenarios are suitable for it:
ar851060 changed 10 months ago
Thompson Sampling on Searching Engine
Overview

In the last article, Embedding Searching Articles and Corresponding Thompson Sampling, I explained how to build an embedding vector search engine with an OpenAI embedding model. However, OpenAI offers three embedding models: text-embedding-3-small, text-embedding-3-large, and ada v2. Although I can rule out text-embedding-3-large because of its poor performance on Mandarin, I cannot tell the difference between ada v2 and text-embedding-3-small in a short time. Therefore, I decided to run a Bayesian AB test, more precisely Thompson Sampling, to decide which embedding model is more suitable for this project.

Thompson Sampling

Since we only record clicks here, we use the Beta-Bernoulli model for Thompson Sampling. The model looks like:

$$ p \sim \text{Beta}(\alpha+x,\beta+(1-x)), $$
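The Beta-Bernoulli update above can be sketched with Python's standard library. This is a simulation only: the class name `BetaBernoulliArm`, the Beta(1, 1) starting prior, the seed, and especially the click rates in `true_click_rates` are made-up assumptions for illustration, not results from the project.

```python
import random


class BetaBernoulliArm:
    """One candidate (here, one embedding model) with a Beta posterior over its click rate."""

    def __init__(self, name, alpha=1.0, beta=1.0):
        self.name = name
        self.alpha = alpha  # prior pseudo-count of clicks
        self.beta = beta    # prior pseudo-count of non-clicks

    def sample(self, rng):
        # Draw a plausible click rate p from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, clicked):
        # Posterior update: Beta(alpha + x, beta + (1 - x)) with x = 1 for a click.
        x = 1 if clicked else 0
        self.alpha += x
        self.beta += 1 - x


def choose_arm(arms, rng):
    """Thompson Sampling: pick the arm whose sampled click rate is highest."""
    return max(arms, key=lambda arm: arm.sample(rng))


rng = random.Random(0)
arms = [BetaBernoulliArm("ada v2"), BetaBernoulliArm("text-embedding-3-small")]

# Made-up click rates, used only to simulate user behaviour in this sketch.
true_click_rates = {"ada v2": 0.10, "text-embedding-3-small": 0.30}

for _ in range(2000):
    arm = choose_arm(arms, rng)
    clicked = rng.random() < true_click_rates[arm.name]
    arm.update(clicked)

# Subtract the two prior pseudo-counts to get the number of times each arm was shown.
pulls = {arm.name: arm.alpha + arm.beta - 2 for arm in arms}
print(pulls)
```

As the posteriors sharpen, the arm with the higher simulated click rate is chosen more and more often, which is how Thompson Sampling shifts traffic toward the better option during the experiment.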
ar851060 changed a year ago
Embedding Searching Articles and Corresponding Thompson Sampling
Overview

Since the search function in Instagram is terrible, one of my friends wanted a feature that searches his articles, so that his customers can find the articles most related to their questions. I built this search feature on embedding search. I also wanted to set up a Thompson Sampling experiment in it to find the most suitable embedding model from OpenAI.

![Example for Searching Article](https://hackmd.io/_uploads/By93iF6IR.jpg =270x850)

The overall structure is shown below. There are three parts in this project: Search, Record, and Update. This article focuses on the Search and Update functions; if you just want to build your own embedding search engine, reading this article is enough. However, if you want to know how I built an experiment on this search engine, or you are interested in Record, please read my next article (in progress...).

graph LR
ar851060 changed a year ago
AA Testing and AB Testing on Reducing Error Rate
I once helped my friend build a Line chatbot, and I added a feature to it that tells customers their waiting number, so they know how long they need to wait for the service. When customers want their waiting number, they just need to provide ==Either name OR Line ID==. The following pictures show an example. (Figure: Searching Number Feature) However, I found some situations in which customers fail to find their numbers: some users provide ==Both name and Line ID==, such as typing ChengChingLin ar851060 (Figure: Case 1 Example)
ar851060 changed a year ago
How XGBoost deal with missing values
If you are familiar with machine learning methods, you must have heard of the strong weapon in ML: XGBoost. It is a powerful weapon with lots of advantages and optimizations. In this article, I only talk about one advantage of XGBoost: it can deal with missing values naturally. But how?

Sparsity-Aware Split Finding

In the original paper, the authors came up with a brilliant idea called Sparsity-Aware Split Finding. The algorithm is shown below.

Algorithm

Let me explain how it works without using lots of mathematical symbols. When it needs to split, it splits the data into two groups: data with missing values and data without missing values.
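The idea can be sketched for a single feature in plain Python. This is a simplified illustration and not XGBoost's actual implementation: candidate thresholds come only from the rows without missing values, and the rows with missing values are tried on the left and on the right of every threshold so that the better side becomes the default direction. The function names, the simplified gain (which leaves out the constant factor and the complexity penalty from the paper), and the toy data are all assumptions for the example.

```python
def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0):
    """Simplified split gain built from gradient (g) and hessian (h) sums."""
    def score(g, h):
        return g * g / (h + reg_lambda)
    return (score(g_left, h_left) + score(g_right, h_right)
            - score(g_left + g_right, h_left + h_right))


def best_split_with_default(values, grads, hess, reg_lambda=1.0):
    """Return (gain, threshold, default_direction) for one feature; None marks a missing value."""
    # Group 1: rows without missing values, sorted by feature value.
    present = sorted((v, g, h) for v, g, h in zip(values, grads, hess) if v is not None)
    # Group 2: rows with missing values, summarised by their gradient statistics.
    g_missing = sum(g for v, g in zip(values, grads) if v is None)
    h_missing = sum(h for v, h in zip(values, hess) if v is None)
    g_total, h_total = sum(grads), sum(hess)

    best = (0.0, None, None)
    g_left = h_left = 0.0
    for i in range(len(present) - 1):
        v, g, h = present[i]
        g_left += g
        h_left += h
        next_v = present[i + 1][0]
        if next_v == v:
            continue  # no threshold fits between identical values
        threshold = (v + next_v) / 2
        # Option A: rows with missing values go to the right child.
        gain_right = split_gain(g_left, h_left,
                                g_total - g_left, h_total - h_left, reg_lambda)
        # Option B: rows with missing values go to the left child.
        gain_left = split_gain(g_left + g_missing, h_left + h_missing,
                               g_total - g_left - g_missing,
                               h_total - h_left - h_missing, reg_lambda)
        for gain, direction in ((gain_left, "left"), (gain_right, "right")):
            if gain > best[0]:
                best = (gain, threshold, direction)
    return best


# Toy data: squared-error loss with an initial prediction of 0, so grad = -y and hess = 1.
values = [1.0, 2.0, None, 8.0, 9.0, None]
targets = [1, 1, 5, 5, 5, 5]
grads = [-y for y in targets]
hess = [1.0] * len(targets)

gain, threshold, default_direction = best_split_with_default(values, grads, hess)
print(gain, threshold, default_direction)
```

In this toy data the rows with missing values have targets similar to the rows with large feature values, so the sketch sends missing values to the right child by default.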
ar851060 changed a year ago
Build an AI Line Chatbot using AWS Bedrock
In the last article, I explained how to build a simple chatbot with AWS Lambda. However, that chatbot could only echo what you said; it did not chat at all. This time, I connect that chatbot to AWS Bedrock and choose a suitable LLM so that it can actually chat with us.

AWS Structure

Steps

1. Get access to the models from AWS Bedrock, and choose a suitable model
2. Create a NoSQL database in DynamoDB
3. Set up the permissions in AWS IAM
4. Install boto layers
5. Connect Bedrock and DynamoDB to AWS Lambda
ar851060 changed a year ago
How to make a Line chatbot with AWS Lambda
Since I participated in a hackathon held by AWS just one week ago, I want to practice using AWS services. This side project is just practice: making a very simple Line chatbot (an echo chatbot) using AWS Lambda. I will write another article to explain how to create an AI Line chatbot using AWS Bedrock and AWS DynamoDB. Let's get started.

The structure of the Line chatbot

The steps are as follows:

1. Create a Business Line account and create a Line official account
2. Open the webhook feature in Line
3. Copy the Channel access token and Channel secret from the Line official account; we will use them later.
4. Create an AWS account
ar851060 changed a year ago