Gabriel

@neohuang1999

Joined on Jun 4, 2024

Machine Learning !!!

  • Machine Learning: Table of Contents (⚙️ in preparation, 🆕 updated). Topics: Machine Learning, Feature Engineering, Logistic Regression, Gradient Descent, Optimizers, Naive Bayes, Support Vector Machines
  • Improve Model Performance: raw data usually contains noise and incomplete or irrelevant information; feature engineering extracts the important signal and reduces the noise, improving model accuracy and performance. Address Data Quality Issues: datasets may contain missing values, outliers, or inconsistent data; feature engineering cleans and prepares the data so it is fit for model training. Dimensionality Reduction: high-dimensional datasets easily lead to overfitting, because the model may learn the noise in the data; selecting the most important features or applying dimensionality reduction lowers model complexity and prevents overfitting. Increase Model Interpretability: raw features can be unintuitive or hard to explain; creating new features or transforming existing ones makes the model's output more interpretable and easier to analyze. Managing Categorical Data / Model Requirements: many machine learning algorithms (such as linear regression and support vector machines) can only handle numeric data, so categorical data must be converted to numeric form
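A minimal sketch of two of the steps above (mean imputation for missing values, and one-hot encoding of a categorical column) in plain Python; the column names `balance` and `loan` are illustrative, not from any real dataset.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def one_hot(categories):
    """Encode a categorical column as 0/1 indicator columns, for models
    (e.g. linear regression, SVMs) that require numeric input."""
    levels = sorted(set(categories))
    return [[1 if c == lvl else 0 for lvl in levels] for c in categories]

balance = [100.0, None, 300.0]   # numeric column with a missing value
loan = ["yes", "no", "yes"]      # categorical column

print(impute_mean(balance))  # [100.0, 200.0, 300.0]
print(one_hot(loan))         # [[0, 1], [1, 0], [0, 1]] with levels ["no", "yes"]
```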
  • logistic regression: refers to a classifier that classifies an observation into one of two classes. multinomial logistic regression: is used when classifying into more than two classes. Linearly separable: a problem is linearly separable if there exists a set of weights 𝑤 under which the model makes no classification errors (or only very few, within some acceptable threshold) on all the training data; this means a straight line (or a hyperplane, in higher-dimensional spaces) can completely separate the data points of the different classes. Non-linearly separable: no such line or hyperplane exists.
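The two-class case above can be sketched as follows: the model scores an observation x with weights w and a bias b, and the sigmoid maps the score to a probability of the positive class. The weight values here are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Binary logistic regression inference: threshold P(y=1|x) at 0.5."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0

w, b = [2.0, -1.0], 0.5
print(predict(w, b, [1.0, 1.0]))   # score 1.5, sigmoid > 0.5 -> class 1
print(predict(w, b, [-1.0, 2.0]))  # score -3.5, sigmoid < 0.5 -> class 0
```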
  • Supervised learning: there are input data x (age, balance, loan) and output data y (labels, targets); after training, the model tries to produce predictions ŷ from x that are as close to y as possible (the distance between the two is called the loss). Regression refers to continuous values: stock price, house price. Classification refers to discrete values: whether an email message is spam or not, whether an image is of a dog. Unsupervised learning (no labels): groups the data into clusters based on the similarity (or distance measure) between samples
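The "distance between ŷ and y" above can be made concrete with two common loss measures, sketched here on toy numbers: mean squared error for regression, and the 0/1 error rate for classification.

```python
def mse(y_true, y_pred):
    """Mean squared error: average squared distance between y and ŷ."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def error_rate(y_true, y_pred):
    """Fraction of misclassified samples (0/1 loss averaged over the data)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse([3.0, 5.0], [2.0, 5.0]))             # 0.5
print(error_rate([1, 0, 1, 1], [1, 1, 1, 0]))  # 0.5 (2 of 4 wrong)
```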
  • 1. Document management (blueprints, 2D, 3D). Blueprints: the aircraft's structural drawings (fuselage, wings, cockpit). 2D design documents: detailed part designs, e.g. bolts and the rivet points of sheet-metal panels. 3D models: CAD models of the complete aircraft, which help engineers understand how the parts are assembled and simulate structural performance (such as stress tests). 1.1 Specifications: cover document-management rules such as naming conventions, version-control strategy, and access-permission management; support various file formats (e.g. CAD output formats: DXF, DWG, STEP); track the document lifecycle (creation, review, sign-off, release). 2. Engineering changes (EC)
  • [figure: SGD update rule] The stochasticity comes from using only one sample (or mini-batch) to compute the gradient, making it suitable for large datasets. Advantages: fast computation, especially for large-scale data. Disadvantages: slower convergence, prone to getting stuck in local minima, and fluctuations in update directions. SGD with Momentum [figure: momentum update rule]
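The two update rules above, sketched on the toy objective f(x) = x² (gradient 2x); the learning rate and momentum coefficient are illustrative choices, not prescribed values.

```python
def sgd_step(x, grad, lr):
    """Plain SGD: step against the (stochastic) gradient."""
    return x - lr * grad(x)

def momentum_step(x, v, grad, lr, beta=0.9):
    """SGD with momentum: accumulate a velocity term to smooth the updates."""
    v = beta * v + grad(x)
    return x - lr * v, v

grad = lambda x: 2.0 * x   # derivative of f(x) = x^2

x = 5.0
for _ in range(50):
    x = sgd_step(x, grad, lr=0.1)
print(x)   # plain SGD: close to the minimum at 0

x, v = 5.0, 0.0
for _ in range(50):
    x, v = momentum_step(x, v, grad, lr=0.1)
print(x)   # momentum: oscillates while its envelope decays toward 0
```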
  • Definition: Linear problems typically involve first-order equations or quadratic equations. In these problems, the objective function is either linear (e.g., y = ax + b) or a convex quadratic (e.g., y = ax² + bx + c). Characteristics: First-order functions: the graph is a straight line, with no minima or maxima. Quadratic functions: the graph is a parabola, with a single global minimum (if it opens upwards) or a global maximum (if it opens downwards). Global Minimum/Maximum: the only extremum of a quadratic function is its global minimum or maximum. Nonlinear Problems. Definition: Nonlinear problems involve polynomial equations of degree higher than two (e.g., cubic, quartic), or other more complex functions (e.g., exponential, logarithmic).
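The single global minimum of the convex quadratic above can be computed in closed form: the derivative 2ax + b vanishes at x = -b / (2a). A short check with toy coefficients:

```python
def quadratic_minimum(a, b, c):
    """Return (x, y) at the global minimum of y = a*x^2 + b*x + c, a > 0."""
    assert a > 0, "parabola must open upwards for a global minimum"
    x = -b / (2 * a)           # where the derivative 2ax + b = 0
    return x, a * x**2 + b * x + c

x_min, y_min = quadratic_minimum(a=1.0, b=-4.0, c=1.0)
print(x_min, y_min)   # x = 2.0, y = 4 - 8 + 1 = -3.0
```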
  • Our goal is to let you understand some basic concepts that are useful before starting a complete course. Speech, text, and image recognition; autonomous vehicles; and intelligent bots (just to name a few) are common applications normally based on deep learning models, and they have outperformed any previous classical approach. Artificial neural networks: an artificial neural network (ANN), or simply neural network (NN), is a directed structure that connects an input layer with an output one. Normally, all operations are differentiable and the overall vector function can be easily written as: Y = f(x)
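A minimal sketch of the differentiable vector function Y = f(x): one hidden layer with a tanh activation feeding a linear output layer. The weights and biases below are arbitrary illustrative numbers.

```python
import math

def forward(x, W1, b1, W2, b2):
    """Forward pass of a tiny 2-layer network: tanh hidden layer, linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]                       # hidden activations
    return [sum(w * hi for w, hi in zip(row, h)) + b      # linear output layer
            for row, b in zip(W2, b2)]

W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]   # 2 inputs -> 2 hidden units
W2, b2 = [[1.0, 1.0]], [0.1]                     # 2 hidden units -> 1 output

print(forward([2.0, 1.0], W1, b1, W2, b2))
```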
  • Hierarchical clustering is based on the general concept of finding a hierarchy of partial clusters, built using either a bottom-up or a top-down approach. Agglomerative clustering: the process starts from the bottom (each initial cluster is made up of a single element) and proceeds by merging the clusters until a stop criterion is reached. In general, the target is a sufficiently small number of clusters at the end of the process. Divisive clustering: in this case, the initial state is a single cluster containing all samples, and the process proceeds by splitting the intermediate clusters until the desired number of clusters is reached. Scikit-learn implements only agglomerative clustering. However, this is not a real limitation, because the complexity of divisive clustering is higher and the performance of agglomerative clustering is quite similar to that achieved by the divisive approach. The standard algorithm for hierarchical agglomerative clustering has a time complexity of O(n³) and requires O(n²) memory, which makes it too slow even for medium-sized datasets. Divisive clustering with an exhaustive search is O(2ⁿ)
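The bottom-up (agglomerative) process above can be sketched in a few lines on 1-D points: start with singleton clusters and repeatedly merge the closest pair (single linkage here, one of several linkage choices) until the desired number of clusters remains. This naive version has the cubic-time behavior mentioned above.

```python
def agglomerate(points, n_clusters):
    """Naive single-linkage agglomerative clustering of 1-D points."""
    clusters = [[p] for p in points]          # each point starts alone
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance = closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)        # merge the closest pair
    return [sorted(c) for c in clusters]

print(agglomerate([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3))
# [[1.0, 1.2], [5.0, 5.1], [9.0]]
```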
  • Clustering is an unsupervised learning technique (used when you have unlabeled data). It is the task of grouping a set of objects so that objects in the same cluster are more similar to each other than to objects in other clusters. Similarity is a measure that reflects the strength of the relationship between two data objects. Clustering can be categorized as: hard clustering techniques, where each element must belong to a single cluster; soft clustering (or fuzzy clustering), which is based on a membership score that defines how much each element is "compatible" with each cluster
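The hard vs. soft distinction above, sketched with distances to two fixed cluster centers: hard assignment picks the nearest center, while the soft score here uses a simple inverse-distance heuristic (one of many possible membership functions, chosen only for illustration) to produce weights that sum to 1.

```python
def hard_assign(x, centers):
    """Hard clustering: the element belongs to exactly one (nearest) cluster."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))

def soft_scores(x, centers):
    """Soft clustering: membership weights from inverse distances (heuristic)."""
    inv = [1.0 / (abs(x - c) + 1e-9) for c in centers]
    total = sum(inv)
    return [w / total for w in inv]

centers = [0.0, 10.0]
print(hard_assign(3.0, centers))   # 0: the point belongs to the first cluster
print(soft_scores(3.0, centers))   # ~[0.7, 0.3]: mostly, but not fully, cluster 0
```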
  • Strong learners: models trained on single instances by iterating an algorithm in order to minimize a target loss function; this approach is based on so-called strong learners, methods that are optimized to solve a specific problem by looking for the best possible solution. Weak learners: this approach is based on a set of weak learners that can be trained in parallel or sequentially and used as an ensemble, based on a majority vote or the averaging of results. An ensemble is just a collection of models which come together (e.g., the mean of all predictions) to give a final prediction. One important point is that our choice of base models should be coherent with the way we aggregate these models.
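The two aggregation schemes named above (majority vote for classification, averaging for regression), sketched over the outputs of some hypothetical base models:

```python
from collections import Counter

def majority_vote(predictions):
    """Final label = the most common label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def mean_prediction(predictions):
    """Final value = the average of the base models' numeric predictions."""
    return sum(predictions) / len(predictions)

print(majority_vote(["spam", "spam", "ham"]))  # "spam"
print(mean_prediction([2.0, 4.0, 6.0]))        # 4.0
```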
  • Suppose the tumor data are as follows: Total: 100. Total positive: 9 (9%). Total negative: 91 (91%). On an imbalanced dataset, relying on accuracy alone to evaluate model performance is not precise enough. Bias toward the majority class: in an imbalanced dataset, the majority class has far more samples than the minority class. Because the model sees many more majority-class samples during training, it is more likely to predict the majority class, so even a model that predicts the majority class most of the time achieves high accuracy.
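The tumor numbers above make the point directly: with 91 negatives out of 100, a degenerate model that always predicts "negative" already scores 91% accuracy while catching zero positive cases.

```python
y_true = [1] * 9 + [0] * 91   # 9 positive, 91 negative tumors
y_pred = [0] * 100            # degenerate "always negative" model

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / 9

print(accuracy)  # 0.91 -- looks good, but is misleading
print(recall)    # 0.0  -- the model misses every positive case
```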
  • "Find the line that best separates the training data."(SVM 的目標是找到一條能夠正確分類訓練的分類線) "Among all such lines, pick the one that maximizes the margin, which is the distance to the points closest to it."(選擇與分類線距離最遠的點作為support vectors) "The closest points that determine this line are known as support vectors, and the distance between these points and the line is known as the margin."(2倍support vectors與hyperline之間的距離稱為Margin). image 假設𝑤=[2,3],𝑥1=[1,5],𝑥2=[4,1],並且𝑘=1:
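Working through the numbers above: the decision value for a point x is the dot product w·x, and the margin width between the two boundary lines w·x = +k and w·x = -k is 2k / ||w||. (The note does not show its own calculation, so this is just the standard formula applied to those values.)

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w, k = [2.0, 3.0], 1.0
x1, x2 = [1.0, 5.0], [4.0, 1.0]

print(dot(w, x1))                    # 2*1 + 3*5 = 17.0, positive side
print(dot(w, x2))                    # 2*4 + 3*1 = 11.0, also positive side
print(2 * k / math.sqrt(dot(w, w)))  # margin width 2/sqrt(13) ~ 0.5547
```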
  • Naive Bayes is a classification algorithm: it uses Bayes' theorem to compute the probability that each data point belongs to a particular class, for example classifying each email as "spam" or "not spam". The Naive Bayes workflow looks like this (fictional scenario): Prior probability: P(Spam) = 0.10 (probability an email is spam); P(Not Spam) = 0.90 (probability an email is legitimate). We will analyze three keywords: "urgent", "make money", and "hot stock". Below are the conditional probabilities of these keywords in spam and legitimate mail. For spam emails (Spam):
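A minimal sketch of the scoring step described above. The priors (0.10 / 0.90) come from the note; the per-keyword likelihoods below are hypothetical stand-ins, since the note's actual conditional-probability table is not shown here.

```python
priors = {"spam": 0.10, "ham": 0.90}

# P(keyword | class): hypothetical values, NOT from the original note
likelihood = {
    "spam": {"urgent": 0.6, "make money": 0.5, "hot stock": 0.4},
    "ham":  {"urgent": 0.1, "make money": 0.05, "hot stock": 0.02},
}

def posterior(keywords):
    """Multiply prior by per-keyword likelihoods (the 'naive' independence
    assumption), then normalize the class scores into probabilities."""
    scores = {}
    for c in priors:
        s = priors[c]
        for kw in keywords:
            s *= likelihood[c][kw]
        scores[c] = s
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

print(posterior(["urgent", "make money"]))  # both keywords -> mostly spam
```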