# text result

## tf-idf 4000 words

### Category: openness

Confusion matrix: `[[0 4 0] [0 5 0] [0 6 0]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.00 | 0.00 | 0.00 | 4 |
| 1 | 0.33 | 1.00 | 0.50 | 5 |
| 2 | 0.00 | 0.00 | 0.00 | 6 |
| accuracy | | | 0.33 | 15 |
| macro avg | 0.11 | 0.33 | 0.17 | 15 |
| weighted avg | 0.11 | 0.33 | 0.17 | 15 |

**Sample accuracy:** 0.3333333333333333

**average k-fold accuracy:** 0.3004945054945055

### Category: outdoor

Confusion matrix: `[[1 5] [4 5]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.20 | 0.17 | 0.18 | 6 |
| 1 | 0.50 | 0.56 | 0.53 | 9 |
| accuracy | | | 0.40 | 15 |
| macro avg | 0.35 | 0.36 | 0.35 | 15 |
| weighted avg | 0.38 | 0.40 | 0.39 | 15 |

**Sample accuracy:** 0.4

**average k-fold accuracy:** 0.5061904761904763

## Chi-square 500

### Category: openness

Confusion matrix: `[[0 3 1] [0 4 1] [1 5 0]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.00 | 0.00 | 0.00 | 4 |
| 1 | 0.33 | 0.80 | 0.47 | 5 |
| 2 | 0.00 | 0.00 | 0.00 | 6 |
| accuracy | | | 0.27 | 15 |
| macro avg | 0.11 | 0.27 | 0.16 | 15 |
| weighted avg | 0.11 | 0.27 | 0.16 | 15 |

**Sample accuracy:** 0.26666666666666666

**average k-fold accuracy:** 0.3478296703296703

### Category: outdoor

Confusion matrix: `[[4 2] [0 9]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 1.00 | 0.67 | 0.80 | 6 |
| 1 | 0.82 | 1.00 | 0.90 | 9 |
| accuracy | | | 0.87 | 15 |
| macro avg | 0.91 | 0.83 | 0.85 | 15 |
| weighted avg | 0.89 | 0.87 | 0.86 | 15 |

**Sample accuracy:** 0.8666666666666667

**average k-fold accuracy:** 0.7642857142857142

## Chi-square 50

### Category: openness

Confusion matrix: `[[0 2 2] [0 3 2] [1 2 3]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.00 | 0.00 | 0.00 | 4 |
| 1 | 0.43 | 0.60 | 0.50 | 5 |
| 2 | 0.43 | 0.50 | 0.46 | 6 |
| accuracy | | | 0.40 | 15 |
| macro avg | 0.29 | 0.37 | 0.32 | 15 |
| weighted avg | 0.31 | 0.40 | 0.35 | 15 |

**Sample accuracy:** 0.4

**average k-fold accuracy:** 0.39645604395604395

### Category: outdoor

Confusion matrix: `[[4 2] [1 8]]`

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.80 | 0.67 | 0.73 | 6 |
| 1 | 0.80 | 0.89 | 0.84 | 9 |
| accuracy | | | 0.80 | 15 |
| macro avg | 0.80 | 0.78 | 0.78 | 15 |
| weighted avg | 0.80 | 0.80 | 0.80 | 15 |

**Sample accuracy:** 0.8

**average k-fold accuracy:** 0.7104761904761905

# Summary:

tf-idf 4000:

1. openness
   * Sample accuracy: 0.3333333333333333
   * average k-fold accuracy: 0.3004945054945055
2. outdoor
   * Sample accuracy: 0.4
   * average k-fold accuracy: 0.5061904761904763

Chi-square 500:

1. openness
   * Sample accuracy: 0.26666666666666666
   * average k-fold accuracy: 0.3478296703296703
2. outdoor
   * Sample accuracy: 0.8666666666666667
   * average k-fold accuracy: 0.7642857142857142

Chi-square 50:

1. openness
   * Sample accuracy: 0.4
   * average k-fold accuracy: 0.39645604395604395
2. outdoor
   * Sample accuracy: 0.8
   * average k-fold accuracy: 0.7104761904761905

count Chi-square 500:

1. openness
   * Sample accuracy: 0.3
   * average k-fold accuracy: 0.3166666666666666
2. outdoor
   * Sample accuracy: 0.5
   * average k-fold accuracy: 0.576310715609937

count Chi-square 50:

1. openness
   * Sample accuracy: 0.4
   * average k-fold accuracy: 0.33689655172413796
2. outdoor
   * Sample accuracy: 0.43333333333333335
   * average k-fold accuracy: 0.5252836484983314

# Model difference (tf-idf chi 500)

## Personality

* Openness:
  * random forest = 0.3466666666666667
  * svm = 0.3704597701149425
  * xgboost = 0.33999999999999997
* conscientiousness:
  * random forest = 0.2949430584247047
  * svm = 0.3704597701149425
  * xgboost = 0.2691686530006886
* extraversion:
  * random forest = 0.3579569892473119
  * svm = 0.3704597701149425
  * xgboost = 0.369715821812596
* agreeableness:
  * random forest = 0.298963133640553
  * svm = 0.3704597701149425
  * xgboost = 0.30595238095238086
* neuroticism:
  * random forest = 0.40956989247311826
  * svm = 0.3704597701149425
  * xgboost = 0.42496159754224266
* OutDoor:
  * random forest = 0.711505376344086
  * svm = 0.5251761216166111
  * xgboost = 0.5456136447905079

# picture result

### Category: openness

1. xgboost: 0.39452838827838826
2. random tree: 0.4666666666666667
3. svm: 0.3719322344322344

### Category: conscientiousness

1. xgboost: 0.3487271062271063
2. random tree: 0.3246199633699634
3. svm: 0.26666666666666666

### Category: extraversion

1. xgboost: 0.35042582417582413
2. random tree: 0.36567765567765564
3. svm: 0.39228021978021976

### Category: agreeableness

1. xgboost: 0.41208333333333336
2. random tree: 0.35720238095238094
3. svm: 0.3710714285714286

### Category: neuroticism

1. xgboost: 0.3020604395604395
2. random tree: 0.33744505494505495
3. svm: 0.39228021978021976

Duplicate the smaller class and give it more weight:

* SMOTE
* class weight
* `scale_pos_weight`
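The closing note lists ways to handle class imbalance: duplicating the smaller class, class weights, and `scale_pos_weight`. Below is a minimal standard-library sketch of what each one computes. The helper names and the toy labels (6 of class 0, 9 of class 1) are assumptions for illustration, not the code behind the results above. Plain duplication is shown instead of SMOTE, which synthesizes new minority points by interpolation (for example `SMOTE` in the imbalanced-learn package).

```python
import random
from collections import Counter


def oversample_minority(X, y, seed=0):
    """Randomly duplicate rows of the smaller classes until each matches the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def balanced_class_weights(y):
    """n_samples / (n_classes * class_count), the formula behind
    scikit-learn's class_weight='balanced'."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


def scale_pos_weight(y, positive=1):
    """negatives / positives, the usual starting value for
    XGBoost's scale_pos_weight on binary labels."""
    pos = sum(1 for lab in y if lab == positive)
    return (len(y) - pos) / pos


# Hypothetical toy data: 6 samples of class 0, 9 samples of class 1.
y = [0] * 6 + [1] * 9
X = [[i] for i in range(len(y))]

X_bal, y_bal = oversample_minority(X, y)
weights = balanced_class_weights(y)
spw = scale_pos_weight(y)

print(Counter(y_bal))  # both classes now have 9 samples
print(weights)         # larger weight for the smaller class
print(spw)             # ratio to try as scale_pos_weight
```

Oversampling should be applied to the training split only, inside each fold, so that duplicated rows never leak into the data used to measure k-fold accuracy.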