object-oriented objects

---
title: 'object-oriented objects'
disqus: hackmd
---

Object-oriented (OO) objects: S3
===

## Table of Contents

[TOC]

---

## Base

Base objects are the most fundamental kind of object and carry almost no restrictions, so we start with a few functions for inspecting them:

* `is.object()`
* `sloop::otype()`

```gherkin=1
data(mtcars)  # built-in dataset; mtcars is not a package, so require() would fail

# A base object:
is.object(1:10)
#> [1] FALSE
sloop::otype(1:10)
#> [1] "base"

# An OO object
is.object(mtcars)
#> [1] TRUE
sloop::otype(mtcars)
#> [1] "S3"
```

### The main difference between base objects and OO objects: OO objects have a "class" attribute

```gherkin=1
attr(1:10, "class")
#> NULL
attr(mtcars, "class")
#> [1] "data.frame"

x <- matrix(1:4, nrow = 2)
class(x)
#> [1] "matrix" "array"
sloop::s3_class(x)
#> [1] "matrix"  "integer" "numeric"
```

### type

Every object, base or OO, has a base type, which `typeof()` reports. R has 25 base types (C-level types), including:

* NULL (NILSXP)
* logical (LGLSXP)
* integer (INTSXP)
* double (REALSXP)
* complex (CPLXSXP)
* character (STRSXP)
* list (VECSXP)
* raw (RAWSXP)
* closure (regular R functions, CLOSXP)
* special (internal functions, SPECIALSXP)
* builtin (primitive functions, BUILTINSXP)
* environment (ENVSXP)
* S4 type (S4SXP)
* symbol (aka name, SYMSXP)
* language (usually called calls, LANGSXP)
* pairlist (used for function arguments, LISTSXP)
* expression (EXPRSXP): a special case that is only returned by `parse()` and `expression()`, and is rarely needed in everyday code
* externalptr (EXTPTRSXP)
* weakref (WEAKREFSXP)
* bytecode (BCODESXP)
* promise (PROMSXP)
* ... (DOTSXP)
* any (ANYSXP)

Related functions:

* `mode()`
* `storage.mode()`

```gherkin=1
typeof(1:10)
#> [1] "integer"
typeof(mtcars)
#> [1] "list"
typeof(mean)
#> [1] "closure"
typeof(`[`)
#> [1] "special"
typeof(sum)
#> [1] "builtin"
typeof(globalenv())
#> [1] "environment"

mle_obj <- stats4::mle(function(x = 1) (x - 2) ^ 2)
typeof(mle_obj)
#> [1] "S4"

typeof(quote(a))
#> [1] "symbol"
typeof(quote(a + 1))
#> [1] "language"
typeof(formals(mean))
#> [1] "pairlist"
```

### numeric is a little different

```gherkin=1
sloop::s3_class(1)
#> [1] "double"  "numeric"
sloop::s3_class(1L)
#> [1] "integer" "numeric"

is.numeric(factor("x"))
#> [1] FALSE

# Odd, isn't it? It is clearly a factor, yet typeof() reports an integer.
typeof(factor("x"))
#> [1] "integer"

# Looking at the underlying structure shows why typeof() says integer:
unclass(factor("x"))
#> [1] 1
#> attr(,"levels")
#> [1] "x"
```

---

## S3: the simplest and most common OO system, used in base, stats, and CRAN packages

S3 is informal and very flexible, but a class still has to be declared explicitly, which offers some protection against mistakes.

```gherkin=1
sequence <- "ABCDEFGHIKLMNPQRSTVWXYZ"
x <- list()
x$sequence <- sequence

# Declare a structure and give the class a name
protein_sequence_constructor <- function(sequence) {
  structure(list(sequence = sequence), class = "protein")
}

# The constructor turns suitable input into an S3 object of that class
protein.seq <- protein_sequence_constructor(x$sequence)
protein.seq
#> $sequence
#> [1] "ABCDEFGHIKLMNPQRSTVWXYZ"
#>
#> attr(,"class")
#> [1] "protein"

# Generic functions in S3: define the generic, then a method for the class
count_basepair <- function(x) UseMethod("count_basepair")
count_basepair.protein <- function(x) {
  nchar(x$sequence)
}

# Objects of any other class cannot use the function
count_basepair(protein.seq)
#> [1] 23
count_basepair(x$sequence)
#> Error in UseMethod("count_basepair") :
#>   no applicable method for 'count_basepair' applied to an object of class "character"
```

### Overview

At its simplest, an S3 object is a base type with a class attribute, plus any other attributes the class needs.
Take factor as an example:

* its base type is integer
* its class attribute is "factor"
* it has a levels attribute that stores all the levels

```gherkin=1
library(sloop)

f <- factor(c("a", "b", "c"))

typeof(f)      # show the base type
#> [1] "integer"

attributes(f)  # show all attributes
#> $levels
#> [1] "a" "b" "c"
#>
#> $class
#> [1] "factor"

unclass(f)     # show everything except the class attribute, which unclass() strips
#> [1] 1 2 3
#> attr(,"levels")
#> [1] "a" "b" "c"
```

#### Passing an S3 object to a generic function

An S3 object behaves differently from its underlying base type whenever it is passed to a generic function. `sloop::ftype()` tells you whether a function is a generic:

```gherkin=1
ftype(print)
#> [1] "S3"      "generic"
ftype(str)
#> [1] "S3"      "generic"
ftype(unclass)
#> [1] "primitive"

print(f)
#> [1] a b c
#> Levels: a b c

# stripping class reverts to integer behaviour
print(unclass(f))
#> [1] 1 2 3
#> attr(,"levels")
#> [1] "a" "b" "c"
```

#### str() is generic

Some S3 classes hide their internal details. One example is POSIXlt, which represents date-time data: `str()` shows only a top-level summary, and the underlying data stay hidden until you `unclass()` the object.

```gherkin=1
time <- strptime(c("2017-01-01", "2020-05-04 03:21"), "%Y-%m-%d")
time
#> [1] "2017-01-01 CST" "2020-05-04 CST"
str(time)
#>  POSIXlt[1:2], format: "2017-01-01" "2020-05-04"

str(unclass(time))
#> List of 9
#>  $ sec  : num [1:2] 0 0
#>  $ min  : int [1:2] 0 0
#>  $ hour : int [1:2] 0 0
#>  $ mday : int [1:2] 1 4
#>  $ mon  : int [1:2] 0 4
#>  $ year : int [1:2] 117 120
#>  $ wday : int [1:2] 0 1
#>  $ yday : int [1:2] 0 124
#>  $ isdst: int [1:2] 0 0
#>  - attr(*, "tzone")= chr "UTC"
```

#### Methods: the implementation for a specific class

`sloop::s3_dispatch()` shows the process of method dispatch. For example, printing a factor actually calls the `print.factor` method (a method, not a generic):

```gherkin=1
s3_dispatch(print(f))
#> => print.factor
#>  * print.default
```

Normally you cannot see the source code of an S3 method just by typing its name, because methods usually live inside a package namespace rather than in the global environment. Use `sloop::s3_get_method()` instead:

```gherkin=1
weighted.mean.Date
#> Error in eval(expr, envir, enclos): object 'weighted.mean.Date' not found

s3_get_method(weighted.mean.Date)
#> function (x, w, ...)
#> .Date(weighted.mean(unclass(x), w, ...))
#> <bytecode: 0x7f9682f700b8>
#> <environment: namespace:stats>
```

### Creating a new S3 class

There are two ways to create an instance, and the class of an existing object can be changed at any time (S3 really is that free):

```gherkin=1
# Create and assign class in one step
x <- structure(list(), class = "my_class")

# Create, then set class
x <- list()
class(x) <- "my_class"

class(x)
#> [1] "my_class"
inherits(x, "my_class")
#> [1] TRUE
inherits(x, "your_class")
#> [1] FALSE
```

That freedom also makes it easy to shoot yourself in the foot:

```gherkin=1
# Create a linear model
mod <- lm(log(mpg) ~ log(disp), data = mtcars)
class(mod)
#> [1] "lm"
print(mod)
#>
#> Call:
#> lm(formula = log(mpg) ~ log(disp), data = mtcars)
#>
#> Coefficients:
#> (Intercept)    log(disp)
#>       5.381       -0.459

# Turn it into a date (?!)
class(mod) <- "Date"

# Unsurprisingly this doesn't work very well
print(mod)
#> Error in as.POSIXlt.Date(x): 'list' object cannot be coerced to type 'double'
```

---

So it is best to adopt good habits when building S3 classes and to provide three functions:

1. constructors
2. validator
3. helper

#### constructors

Because S3 is so free, it does not guarantee that every object of a class has the same structure (as seen above, someone can assign "Date" to an lm object). A constructor enforces a consistent structure. It should:

1. Be called `new_myclass()`.
2. Have one argument for the base object, and one for each attribute.
3. Check the type of the base object and the types of each attribute.

For example:
The constructor below requires the base object to be a double and sets the class to "Date":

```gherkin=1
new_Date <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "Date")
}

new_Date(c(-1, 0, 1))
#> [1] "1969-12-31" "1970-01-01" "1970-01-02"
```

A slightly more complex example takes two arguments:

* the first must be a double
* the second must be one of five permitted strings
* the class is "difftime"

```gherkin=1
new_difftime <- function(x = double(), units = "secs") {
  stopifnot(is.double(x))
  units <- match.arg(units, c("secs", "mins", "hours", "days", "weeks"))

  structure(x,
    class = "difftime",
    units = units
  )
}

new_difftime(c(1, 10, 3600), "secs")
#> Time differences in secs
#> [1]    1   10 3600
new_difftime(52, "weeks")
#> Time difference of 52 weeks
```

#### validator

The more complex the class, the more checking it needs. For a factor:

* x must be an integer
* levels must be a character vector

Checking types alone is not enough, though. If the values and levels do not fit together, the object is still malformed and produces an error later:

```gherkin=1
new_factor <- function(x = integer(), levels = character()) {
  stopifnot(is.integer(x))
  stopifnot(is.character(levels))

  structure(
    x,
    levels = levels,
    class = "factor"
  )
}

new_factor(1:5, c("a", "b", "c", "d", "e"))
#> [1] a b c d e
#> Levels: a b c d e
new_factor(1:5, "a")
#> Error in as.character.factor(x): malformed factor
new_factor(0:1, "a")
#> Error in as.character.factor(x): malformed factor
```

Writing every check inside the constructor takes up a lot of space and buries the main logic. It is better to move the checks into a separate validator function, which also makes them easier to reuse:

```gherkin=1
validate_factor <- function(x) {
  values <- unclass(x)
  levels <- attr(x, "levels")

  if (!all(!is.na(values) & values > 0)) {
    stop(
      "All `x` values must be non-missing and greater than zero",
      call. = FALSE
    )
  }

  if (length(levels) < max(values)) {
    stop(
      "There must be at least as many `levels` as possible values in `x`",
      call. = FALSE
    )
  }

  x
}

validate_factor(new_factor(1:5, "a"))
#> Error: There must be at least as many `levels` as possible values in `x`
validate_factor(new_factor(0:1, "a"))
#> Error: All `x` values must be non-missing and greater than zero
```

#### helper

Tools should accommodate how people actually work. With `new_difftime()` above, passing integers fails the very first check because they are not doubles. One approach is to write a helper that coerces the input to the right type and supplies defaults:

```gherkin=1
new_difftime(1:10)
#> Error in new_difftime(1:10): is.double(x) is not TRUE

difftime <- function(x = double(), units = "secs") {
  x <- as.double(x)
  new_difftime(x, units = units)
}

difftime(1:10)
#> Time differences in secs
#>  [1]  1  2  3  4  5  6  7  8  9 10
```

A second approach anticipates how users are likely to call the function and preprocesses the input, for example by matching the input values against the levels:

```gherkin=1
factor <- function(x = character(), levels = unique(x)) {
  ind <- match(x, levels)
  validate_factor(new_factor(ind, levels))
}

factor(c("a", "a", "b"))
#> [1] a a b
#> Levels: a b
```

A third approach also wraps an existing function, but supplies defaults up front so the call still works when the user omits some arguments:

```gherkin=1
POSIXct <- function(year = integer(),
                    month = integer(),
                    day = integer(),
                    hour = 0L,
                    minute = 0L,
                    sec = 0,
                    tzone = "") {
  ISOdatetime(year, month, day, hour, minute, sec, tz = tzone)
}

POSIXct(2020, 1, 1, tzone = "America/New_York")
#> [1] "2020-01-01 EST"
POSIXct(2020, 1, 12)
#> [1] "2020-01-12 CST"

ISOdatetime(2020, 1, 12)
#> Error in ISOdatetime(2020, 1, 12) :
#>   argument "min" is missing, with no default
```

### How S3 generics and methods work

What is `UseMethod()` actually for? It performs method dispatch: it finds and calls the method that matches the class of its argument. `s3_dispatch()` shows which function is called for a given call. For example:

```gherkin=1
> mean
function (x, ...)
UseMethod("mean")
<bytecode: 0x55644e1445f0>
<environment: namespace:base>
# mean() is a generic: its body only calls UseMethod("mean"), which dispatches on the class of x

# methods() lists the methods defined for mean; for ordinary numeric input the one that runs is mean.default
> methods(mean)
 [1] mean,ANY-method          mean.Date                mean.default             mean.difftime            mean.IDate*
 [6] mean.integer64*          mean.ITime*              mean,Matrix-method       mean.POSIXct             mean.POSIXlt
[11] mean.quosure*            mean,sparseMatrix-method mean,sparseVector-method mean.vctrs_vctr*         mean.yearmon*
[16] mean.yearqtr*            mean.zoo*

# Another example: use s3_dispatch() to see which method a call ends up using
# => indicates the method that is called
# * indicates a method that is defined, but not called
x <- Sys.Date()
s3_dispatch(print(x))
#> => print.Date
#>  * print.default
```

sloop provides several more functions that are handy for taking S3 classes apart and inspecting them:

```gherkin=1
s3_methods_generic("mean")
#> # A tibble: 7 x 4
#>   generic class      visible source
#>   <chr>   <chr>      <lgl>   <chr>
#> 1 mean    Date       TRUE    base
#> 2 mean    default    TRUE    base
#> 3 mean    difftime   TRUE    base
#> 4 mean    POSIXct    TRUE    base
#> 5 mean    POSIXlt    TRUE    base
#> 6 mean    quosure    FALSE   registered S3method
#> 7 mean    vctrs_vctr FALSE   registered S3method

s3_methods_class("ordered")
#> # A tibble: 4 x 4
#>   generic       class   visible source
#>   <chr>         <chr>   <lgl>   <chr>
#> 1 as.data.frame ordered TRUE    base
#> 2 Ops           ordered TRUE    base
#> 3 relevel       ordered FALSE   registered S3method
#> 4 Summary       ordered TRUE    base
```

#### Creating methods

There are two wrinkles to be aware of when you create a new method:

First, you should only ever write a method if you own the generic or the class. R will allow you to define a method even if you don’t, but it is exceedingly bad manners. Instead, work with the author of either the generic or the class to add the method in their code.

Second, a method must have the same arguments as its generic. This is enforced in packages by R CMD check, but it’s good practice even if you’re not creating a package. There is one exception to this rule: if the generic has `...`, the method can contain a superset of the arguments.
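To make the `...` exception concrete, here is a minimal sketch. The generic `area()`, the class `circle`, and the `digits` argument are hypothetical names invented for this illustration; they do not come from any package.

```r
# Hypothetical generic: it has `...`, so methods may add their own arguments.
area <- function(shape, ...) UseMethod("area")

# Hypothetical method: `digits` is an extra argument the generic does not have.
area.circle <- function(shape, ..., digits = 2) {
  round(pi * shape$r^2, digits)
}

circle <- structure(list(r = 1), class = "circle")

area(circle)              # uses the default digits = 2
area(circle, digits = 4)  # the extra argument reaches the method
area(circle, digts = 4)   # misspelled: absorbed by `...`, so digits stays 2
```

The last call runs without any error or warning, which is why a typo in an argument name can go unnoticed when a method accepts `...`.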
This allows methods to take arbitrary additional arguments. The downside of using `...`, however, is that any misspelled arguments will be silently swallowed, as mentioned in Section 6.6 of Advanced R.

### The four main styles of S3 objects:

* vector
* record
* data frame
* scalar

Almost all of the earlier examples are vector style.

Record style objects use a list of equal-length vectors to represent individual components of the object. The best example of this is POSIXlt, which underneath the hood is a list of 11 date-time components like year, month, and day (older R versions may show 9, as in the output below). Record style classes override `length()` and subsetting methods to conceal this implementation detail.

```gherkin=1
# Date-time example
x <- as.POSIXlt(ISOdatetime(2020, 1, 1, 0, 0, 1:3))
x
#> [1] "2020-01-01 00:00:01 UTC" "2020-01-01 00:00:02 UTC"
#> [3] "2020-01-01 00:00:03 UTC"

length(x)
#> [1] 3
length(unclass(x))
#> [1] 9

x[[1]]           # the first date time
#> [1] "2020-01-01 00:00:01 UTC"
unclass(x)[[1]]  # the first component, the number of seconds
#> [1] 1 2 3
```

Data frames are similar to record style objects in that both use lists of equal length vectors. However, data frames are conceptually two dimensional, and the individual components are readily exposed to the user. The number of observations is the number of rows, not the length:

```gherkin=1
x <- data.frame(x = 1:100, y = 1:100)
length(x)
#> [1] 2
nrow(x)
#> [1] 100
```

Scalar objects typically use a list to represent a single thing. For example, an lm object is a list of length 12 but it represents one model.

```gherkin=1
mod <- lm(mpg ~ wt, data = mtcars)
length(mod)
#> [1] 12
```

### How inheritance works in S3

S3 classes can share behaviour through a mechanism called inheritance. Inheritance is powered by three ideas:

* The class can be a character vector.
For example, the ordered and POSIXct classes have two components in their class:

```gherkin=1
class(ordered("x"))
#> [1] "ordered" "factor"
class(Sys.time())
#> [1] "POSIXct" "POSIXt"
```

* If a method is not found for the class in the first element of the vector, R looks for a method for the second class (and so on):

```gherkin=1
s3_dispatch(print(ordered("x")))
#>    print.ordered
#> => print.factor
#>  * print.default
s3_dispatch(print(Sys.time()))
#> => print.POSIXct
#>    print.POSIXt
#>  * print.default
```

* A method can delegate work by calling `NextMethod()`. We’ll come back to that very shortly; for now, note that `s3_dispatch()` reports delegation with `->`.

```gherkin=1
s3_dispatch(ordered("x")[1])
#>    [.ordered
#> => [.factor
#>    [.default
#> -> [ (internal)
s3_dispatch(Sys.time()[1])
#> => [.POSIXct
#>    [.POSIXt
#>    [.default
#> -> [ (internal)
```

Before we continue we need a bit of vocabulary to describe the relationship between the classes that appear together in a class vector. We’ll say that ordered is a subclass of factor because it always appears before it in the class vector, and, conversely, we’ll say factor is a superclass of ordered.

S3 imposes no restrictions on the relationship between sub- and superclasses but your life will be easier if you impose some. I recommend that you adhere to two simple principles when creating a subclass:

* The base type of the subclass should be the same as the superclass.
* The attributes of the subclass should be a superset of the attributes of the superclass.

POSIXt does not adhere to these principles because POSIXct has type double, and POSIXlt has type list. This means that POSIXt is not a superclass, and illustrates that it’s quite possible to use the S3 inheritance system to implement other styles of code sharing (here POSIXt plays a role more like an interface), but you’ll need to figure out safe conventions yourself.
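The fallback from subclass to superclass can also be tried with a pair of toy classes. This is only a sketch: the classes `animal` and `dog`, their constructors, and the generic `describe()` are hypothetical names made up for this illustration. Both classes are built on a list, and `dog` carries every field that `animal` has, in the spirit of the two principles above.

```r
# Hypothetical superclass: a list with a single field.
new_animal <- function(name) {
  stopifnot(is.character(name))
  structure(list(name = name), class = "animal")
}

# Hypothetical subclass: same base type (list), a superset of the fields,
# and a class vector that lists the subclass first.
new_dog <- function(name, breed) {
  stopifnot(is.character(name), is.character(breed))
  structure(list(name = name, breed = breed), class = c("dog", "animal"))
}

describe <- function(x, ...) UseMethod("describe")
describe.animal <- function(x, ...) paste0(x$name, " is an animal")
describe.default <- function(x, ...) "not an animal"

d <- new_dog("Rex", "collie")
class(d)
inherits(d, "animal")

# There is no describe.dog(), so dispatch falls through to describe.animal()
describe(d)

# An object with neither class ends up in describe.default()
describe(1)
```

Running `sloop::s3_dispatch(describe(d))` on these definitions should list `describe.dog` as a candidate that is not defined before the `describe.animal` method that is actually called.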
---

#### NextMethod()

`NextMethod()` is the hardest part of inheritance to understand, so we’ll start with a concrete example for the most common use case: `[`. We’ll start by creating a simple toy class: a secret class that hides its output when printed:

```gherkin=1
new_secret <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "secret")
}

print.secret <- function(x, ...) {
  print(strrep("x", nchar(x)))
  invisible(x)
}

x <- new_secret(c(15, 1, 456))
x
#> [1] "xx"  "x"   "xxx"
```

This works, but the default `[` method doesn’t preserve the class:

```gherkin=1
s3_dispatch(x[1])
#>    [.secret
#>    [.default
#> => [ (internal)
x[1]
#> [1] 15
```

To fix this, we need to provide a `[.secret` method. How could we implement this method? The naive approach won’t work because we’ll get stuck in an infinite loop:

```gherkin=1
`[.secret` <- function(x, i) {
  new_secret(x[i])
}
```

Instead, we need some way to call the underlying `[` code, i.e. the implementation that would get called if we didn’t have a `[.secret` method. One approach would be to `unclass()` the object:

```gherkin=1
`[.secret` <- function(x, i) {
  x <- unclass(x)
  new_secret(x[i])
}
x[1]
#> [1] "xx"
```

This works, but is inefficient because it creates a copy of x. A better approach is to use `NextMethod()`, which concisely solves the problem of delegating to the method that would have been called if `[.secret` didn’t exist:

```gherkin=1
`[.secret` <- function(x, i) {
  new_secret(NextMethod())
}
x[1]
#> [1] "xx"
```

We can see what’s going on with `sloop::s3_dispatch()`:

```gherkin=1
s3_dispatch(x[1])
#> => [.secret
#>    [.default
#> -> [ (internal)
```

The `=>` indicates that `[.secret` is called, but that `NextMethod()` delegates work to the underlying internal `[` method, as shown by the `->`.

As with `UseMethod()`, the precise semantics of `NextMethod()` are complex.
In particular, it tracks the list of potential next methods with a special variable, which means that modifying the object that’s being dispatched upon will have no impact on which method gets called next.

---

#### subclassing

When you create a class, you need to decide if you want to allow subclasses, because it requires some changes to the constructor and careful thought in your methods.

To allow subclasses, the parent constructor needs to have `...` and `class` arguments:

```gherkin=1
new_secret <- function(x, ..., class = character()) {
  stopifnot(is.double(x))

  structure(
    x,
    ...,
    class = c(class, "secret")
  )
}
```

Then the subclass constructor can just call to the parent class constructor with additional arguments as needed. For example, imagine we want to create a supersecret class which also hides the number of characters:

```gherkin=1
new_supersecret <- function(x) {
  new_secret(x, class = "supersecret")
}

print.supersecret <- function(x, ...) {
  print(rep("xxxxx", length(x)))
  invisible(x)
}

x2 <- new_supersecret(c(15, 1, 456))
x2
#> [1] "xxxxx" "xxxxx" "xxxxx"
```

To allow inheritance, you also need to think carefully about your methods, as you can no longer use the constructor. If you do, the method will always return the same class, regardless of the input. This forces whoever makes a subclass to do a lot of extra work.

Concretely, this means we need to revise the `[.secret` method. Currently it always returns a `secret()`, even when given a supersecret:

```gherkin=1
`[.secret` <- function(x, ...) {
  new_secret(NextMethod())
}

x2[1:3]
#> [1] "xx"  "x"   "xxx"
```

We want to make sure that `[.secret` returns the same class as x even if it’s a subclass. As far as I can tell, there is no way to solve this problem using base R alone. Instead, you’ll need to use the vctrs package, which provides a solution in the form of the `vctrs::vec_restore()` generic. This generic takes two inputs: an object which has lost subclass information, and a template object to use for restoration.

Typically `vec_restore()` methods are quite simple: you just call the constructor with appropriate arguments:

```gherkin=1
vec_restore.secret <- function(x, to, ...) new_secret(x)
vec_restore.supersecret <- function(x, to, ...) new_supersecret(x)
```

If your class has attributes, you’ll need to pass them from `to` into the constructor.

Now we can use `vec_restore()` in the `[.secret` method:

```gherkin=1
`[.secret` <- function(x, ...) {
  vctrs::vec_restore(NextMethod(), x)
}
x2[1:3]
#> [1] "xxxxx" "xxxxx" "xxxxx"
```

### Dispatch detail

#### S3 and base types

What happens when you call an S3 generic with a base object, i.e. an object with no class? You might think it would dispatch on what `class()` returns:

```gherkin=1
class(matrix(1:5))
#> [1] "matrix" "array"
```

But unfortunately dispatch actually occurs on the implicit class, which has three components:

* The string “array” or “matrix” if the object has dimensions
* The result of `typeof()` with a few minor tweaks
* The string “numeric” if object is “integer” or “double”

There is no base function that will compute the implicit class, but you can use `sloop::s3_class()`:

```gherkin=1
s3_class(matrix(1:5))
#> [1] "matrix"  "integer" "numeric"
```

This is used by `s3_dispatch()`:

```gherkin=1
s3_dispatch(print(matrix(1:5)))
#>    print.matrix
#>    print.integer
#>    print.numeric
#> => print.default
```

This means that the `class()` of an object does not uniquely determine its dispatch:

```gherkin=1
x1 <- 1:5
class(x1)
#> [1] "integer"
s3_dispatch(mean(x1))
#>    mean.integer
#>    mean.numeric
#> => mean.default

x2 <- structure(x1, class = "integer")
class(x2)
#> [1] "integer"
s3_dispatch(mean(x2))
#>    mean.integer
#> => mean.default
```

#### Internal generics

Some base functions, like `[`, `sum()`, and `cbind()`, are called internal generics because they don’t call `UseMethod()` but instead call the C functions `DispatchGroup()` or `DispatchOrEval()`. `s3_dispatch()` shows internal generics by including the name of the generic followed by (internal):

```gherkin=1
s3_dispatch(Sys.time()[1])
#> => [.POSIXct
#>    [.POSIXt
#>    [.default
#> -> [ (internal)
```

For performance reasons, internal generics do not dispatch to methods unless the class attribute has been set, which means that internal generics do not use the implicit class. Again, if you’re ever confused about method dispatch, you can rely on `s3_dispatch()`.

---

#### Group generics

Group generics are the most complicated part of S3 method dispatch because they involve both `NextMethod()` and internal generics. Like internal generics, they only exist in base R, and you cannot define your own group generic.
There are four group generics:

* Math: `abs()`, `sign()`, `sqrt()`, `floor()`, `cos()`, `sin()`, `log()`, and more (see `?Math` for the complete list).
* Ops: `+`, `-`, `*`, `/`, `^`, `%%`, `%/%`, `&`, `|`, `!`, `==`, `!=`, `<`, `<=`, `>=`, and `>`.
* Summary: `all()`, `any()`, `sum()`, `prod()`, `min()`, `max()`, and `range()`.
* Complex: `Arg()`, `Conj()`, `Im()`, `Mod()`, `Re()`.

Defining a single group generic for your class overrides the default behaviour for all of the members of the group. Methods for group generics are looked for only if the methods for the specific generic do not exist:

```gherkin=1
s3_dispatch(sum(Sys.time()))
#>    sum.POSIXct
#>    sum.POSIXt
#>    sum.default
#> => Summary.POSIXct
#>    Summary.POSIXt
#>    Summary.default
#> -> sum (internal)
```

Most group generics involve a call to `NextMethod()`. For example, take `difftime()` objects. If you look at the method dispatch for `abs()`, you’ll see there’s a Math group generic defined.

```gherkin=1
y <- as.difftime(10, units = "mins")
s3_dispatch(abs(y))
#>    abs.difftime
#>    abs.default
#> => Math.difftime
#>    Math.default
#> -> abs (internal)
```

`Math.difftime` basically looks like this:

```gherkin=1
Math.difftime <- function(x, ...) {
  new_difftime(NextMethod(), units = attr(x, "units"))
}
```

It dispatches to the next method, here the internal default, to perform the actual computation, then restores the class and attributes.

Inside a group generic function a special variable `.Generic` provides the actual generic function called. This can be useful when producing error messages, and can sometimes be useful if you need to manually re-call the generic with different arguments.

---

#### Double dispatch

Generics in the Ops group, which includes the two-argument arithmetic and Boolean operators like `-` and `&`, implement a special type of method dispatch. They dispatch on the type of both of the arguments, which is called double dispatch. This is necessary to preserve the commutative property of many operators, i.e. `a + b` should equal `b + a`.
Take the following simple example:

```gherkin=1
date <- as.Date("2017-01-01")
integer <- 1L

date + integer
#> [1] "2017-01-02"
integer + date
#> [1] "2017-01-02"
```

If `+` dispatched only on the first argument, it would return different values for the two cases. To overcome this problem, generics in the Ops group use a slightly different strategy from usual. Rather than doing a single method dispatch, they do two, one for each input. There are three possible outcomes of this lookup:

* The methods are the same, so it doesn’t matter which method is used.
* The methods are different, and R falls back to the internal method with a warning.
* One method is internal, in which case R calls the other method.

This approach is error prone so if you want to implement robust double dispatch for algebraic operators, I recommend using the vctrs package. See `?vctrs::vec_arith` for details.

###### tags: `R` `S3`
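The group-generic and double-dispatch rules described above can be tried on a toy class. This is a minimal sketch, not how any base class is implemented: the class `meters`, its constructor `new_meters()`, and the decision to support only binary operators are assumptions made for this illustration.

```r
# Hypothetical vector-style class built on a double.
new_meters <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "meters")
}

# One Ops method covers every operator in the group.
# .Generic holds the name of the operator that was called, e.g. "+" or "<".
Ops.meters <- function(e1, e2) {
  if (missing(e2)) {
    stop("unary operators are not supported in this sketch", call. = FALSE)
  }
  # Compute on the underlying doubles with the base operator.
  value <- get(.Generic)(unclass(e1), unclass(e2))
  # Arithmetic results keep the class; comparisons return plain logicals.
  if (.Generic %in% c("+", "-", "*", "/")) new_meters(value) else value
}

a <- new_meters(c(1, 2))
b <- new_meters(c(10, 20))

class(a + b)  # both arguments find the same method, so it is used
a < b         # same method, but the result is a plain logical vector
class(1 + a)  # only the second argument has a method, so R calls that one
```

For anything beyond a toy, `vctrs::vec_arith()` remains the more robust route, as recommended above.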