R Variable Types

LHB阿好伯, Jan 30, 2019 06:48 PM


tags: `R`


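Checking a variable's type with `class()`

R's built-in class(x) tells you what type of data x holds.

The first six rows below are R's basic data types; the remaining rows are common data structures.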

| Data Type | Example | Notes |
|---|---|---|
| Logical | TRUE, FALSE | Result of a logical test |
| Numeric (floating point) | 1, 2.3, 4567890 | Most numbers are stored as double-precision floating point, which is how the computer represents them |
| Integer | 1L, 2L, 34L | Append an uppercase L to a whole number to store it as an integer instead of a floating-point value |
| Complex | 3 + 2i | |
| Character (string) | 'a', "TRUE", "12.3" | Strings are enclosed in a pair of double or single quotes |
| Raw | "Hello" = 48 65 6c 6c 6f | Bytes shown in hexadecimal |
| data.frame | | A data structure whose columns can hold different types of data; laid out like a matrix, with one variable per column |
| factor | | |
| list | | |
| matrix | | |
| function | | |
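
As a quick check of the table above, here is a minimal sketch that calls class() on one value of each kind. The names `examples` and `classes` are made up for illustration. lapply() is used rather than sapply() because, in R 4.0 and later, a matrix reports two classes ("matrix" and "array").

```r
# One value per row of the table; `examples` is an illustrative name.
examples <- list(
  logical    = TRUE,
  numeric    = 2.3,
  integer    = 34L,
  complex    = 3 + 2i,
  character  = "12.3",
  raw        = charToRaw("Hello"),
  data.frame = data.frame(x = 1:3),
  factor     = factor(c("a", "b")),
  list       = list(1, "a"),
  matrix     = matrix(1:4, nrow = 2),
  `function` = mean
)

# class() of each element, returned as a named list
classes <- lapply(examples, class)
print(classes)

# A plain number is a double unless it carries the L suffix
class(1)   # "numeric"
class(1L)  # "integer"

# charToRaw() shows the bytes of a string in hexadecimal
charToRaw("Hello")  # 48 65 6c 6c 6f
```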

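Limits on numeric values in R

R's built-in numeric arithmetic has limits.

The built-in variable .Machine holds information about how numbers are represented.

These values can differ from one computer to another (although on most machines they are the same).

The ones most relevant to everyday users are listed below.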

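.Machine$double.xmax and .Machine$double.xmin are, respectively, the largest floating-point number and the smallest positive normalized floating-point number that R can represent:
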
.Machine$double.xmax
# [1] 1.797693e+308
.Machine$double.xmin
# [1] 2.225074e-308

.Machine$integer.max is the largest integer value R can represent:

.Machine$integer.max
# [1] 2147483647

This value, 2147483647, equals 2³¹ − 1.
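
To see the limit in practice, the sketch below checks that identity and steps one past the maximum. It relies only on base R behaviour: integer arithmetic that overflows returns NA with a warning, while the same sum done in floating point succeeds. The name `overflowed` is illustrative.

```r
# The largest integer equals 2^31 - 1
.Machine$integer.max == 2^31 - 1          # TRUE

# Adding 1L overflows: R returns NA and emits a warning
overflowed <- suppressWarnings(.Machine$integer.max + 1L)
is.na(overflowed)                         # TRUE

# Converting to floating point first avoids the overflow
as.numeric(.Machine$integer.max) + 1      # 2147483648

# Floating-point values overflow to Inf beyond double.xmax
.Machine$double.xmax * 2                  # Inf
```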

If you need higher-precision arithmetic, the Rmpfr package can help.

For very large numbers, there is the Brobdingnag package.

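Floating-point pitfalls

Floating-point numbers are worth reading up on: the issue comes from how computers store numbers in binary, where many decimal fractions cannot be represented exactly.

As a result, some calculations give results that differ slightly from what we expect.

Most programming languages have this problem,

and all of them have workarounds.

In R, you can either use the Rmpfr package or change how you test the result.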

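For example, compute ((0.81 * 0.1) + (0.09 * (-0.9))) in R.

In theory the answer is 0, but R returns 1.387779e-17.

One fix is to round the result to two decimal places.

Another is to test whether -0.01 < x < 0.01,

that is, whether the result falls within an error range you are willing to accept.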
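
The two workarounds can be sketched as follows, together with base R's all.equal(), which compares numbers using a small default tolerance. The variable `tol` and its value of 0.01 are illustrative; pick a tolerance that suits your data.

```r
x <- (0.81 * 0.1) + (0.09 * (-0.9))

# Exact comparison is unreliable for floating-point results
x == 0

# Option 1: round to two decimal places before comparing
round(x, 2) == 0          # TRUE

# Option 2: accept anything inside an explicit tolerance
tol <- 0.01               # illustrative tolerance
abs(x) < tol              # TRUE

# Option 3: all.equal() compares with a small built-in tolerance
isTRUE(all.equal(x, 0))   # TRUE
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE().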

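Data Frames

A data frame is a variable type for storing data that looks like an Excel table. It is similar to a matrix,

but each column of a data frame can hold a different type of data.

A column can even be a list.

The well-known iris dataset, for example, is a data frame.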
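
A short sketch of both points, using the iris data that ships with R. The names `col_classes` and `nested.df` are made up for illustration, and assigning a list with `$<-` is just one way to create a list column.

```r
# iris is available in every R session (datasets package)
class(iris)   # "data.frame"
dim(iris)     # 150 rows, 5 columns

# Each column has its own type: four numeric columns and one factor
col_classes <- sapply(iris, class)
print(col_classes)

# A column can also be a list, with one list element per row
nested.df <- data.frame(id = 1:2)
nested.df$items <- list(1:3, c("a", "b"))
class(nested.df$items)   # "list"
```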


Creating Data Frames

We can create a data frame with the data.frame function:

test.df <- data.frame(
  x = letters[1:10],    # the first ten lowercase letters
  y = rnorm(10),        # ten random draws from a normal distribution
  z = runif(10) > 0.5   # ten uniform random numbers, tested against 0.5
)
test.df
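
Because rnorm() and runif() draw random numbers, the table changes on every run. A minimal sketch, assuming a fixed seed is acceptable (the seed value 123 is arbitrary), that makes the result repeatable and inspects what was built:

```r
set.seed(123)  # arbitrary seed, only so the output is repeatable

test.df <- data.frame(
  x = letters[1:10],
  y = rnorm(10),
  z = runif(10) > 0.5
)

str(test.df)             # compact overview of the columns and their types
sapply(test.df, class)   # x is character in R >= 4.0, factor in older versions
nrow(test.df)            # 10
test.df[test.df$z, ]     # keep only the rows where z is TRUE
```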


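Simple data processing

Removing duplicates

Excel can remove duplicate values, and so can R: all you need is unique().
First, draw a random sample of 30 uppercase letters:

(name <- sample(LETTERS, 30, replace = TRUE))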

[1] "O" "H" "J" "O" "T" "R" "J" "W" "C" "M" "P" "J" "A" "G" "B" "Y" "A" "A" "T" "V" "W" "M" "V"
[24] "S" "E" "B" "H" "Q" "O" "F"


unique(name)  # remove duplicate values
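
If you also want to know which values were repeated, base R's duplicated() and table() complement unique(). A small sketch, with an arbitrary seed so the sample is repeatable:

```r
set.seed(1)  # arbitrary seed for a repeatable sample
name <- sample(LETTERS, 30, replace = TRUE)

duplicated(name)          # TRUE for every repeat of an earlier value
name[!duplicated(name)]   # same result as unique(name)
table(name)               # how many times each letter was drawn
length(unique(name))      # number of distinct letters
```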

Comparing sets of data

R makes simple set operations such as intersection and union quick to do.




name1 <- LETTERS[1:15]
name2 <- LETTERS[6:20]
(intersect(name1, name2))   # intersection
(unique(c(name1, name2)))   # union

[1] "F" "G" "H" "I" "J" "K" "L" "M" "N" "O"
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T"
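
Base R also provides union() and setdiff() for the remaining set operations, plus %in% for membership tests. A brief sketch using the same two vectors:

```r
name1 <- LETTERS[1:15]
name2 <- LETTERS[6:20]

union(name1, name2)     # same result as unique(c(name1, name2))
setdiff(name1, name2)   # in name1 only: "A" "B" "C" "D" "E"
setdiff(name2, name1)   # in name2 only: "P" "Q" "R" "S" "T"
name1 %in% name2        # element-wise test: is each letter of name1 in name2?
```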

