Excel課小筆記
===
[Toc]
---
## W0 syllabus
### Course grades
1. **Homework assignments 40%**
> Numerical answers are provided, grading will be focused on methods/functions used.
2. **Discussion 10%**
> Each student will be required to participate at least twice in the discussion section on NTU COOL course website;
> It may be asking any course content-related question, or answering any course content-related questions;
> Need to submit discussion assignments #1 and #2 to receive
credit.
3. **Midterm exam 25%**
4. **Final exam 25%**
> Paper exam, closed book;
> Basic concepts, usage, and exceptions of EXCEL functions.
## W1 Basic Usage
- home tab
**format painter**: copy format of cells
**wrap text**: use for extra-long content
**conditional formatting**: great tool !
- formulas tab
**trace reference**: show formula relationship between cells
- data tab
**sort**
**text to columns**: split text into multiple columns
**remove duplicates**
- view tab
**freeze panes**: lock some rows or columns, and enable split view
- **selecting cells**
hold down **Ctrl** - select all chosen cells, rows, and columns
select cell A, press **Shift**, select cell B - select all cells between cell A and cell B
- **file format**
standard excel file **.xlsx** (**.xls** for old version)
comma separated values **.csv**
tab separated values **.txt** (or **.tsv**)
text file **.txt**
- within excel, we could open **.csv** or **.txt**; otherwise, within file explorer, we could also select "open with"
## W2 Cell References
- four main types of values in excel:
**ISTEXT(), ISNUMBER(), ISLOGICAL(), ISERROR()**
- **convert** between them

- input value can be keyed in directly or with equal sign
| ISTEXT() | ISNUMBER() |
| -------- | ---------- |
| ="123" | 123 |
| 2\*3 | =2\*3 |
- **find and replace** only searches for input values but not return values
- Usually, cell reference range will be auto-adjusted through **copy and paste**; however, we can place **\$** to avoid this. furthermore, we can choose to fix either rows or columns, such as "**=\$C\$35**", "**=\$C4+D\$3**"
- **LEFT()**: return the specified number of characters from the start of text or number
- cell reference targeting cells of other worksheets, such as "**=Sheet1!B2**", "**=SUM(Sheet1!B2:B5)**", "**='halo word'!A1**", "**=AVERAGE(Sheet1:Sheet2!A19:A21)**"
- hard-coding, such as "**=SUM(B2:B5)**"; to increase flexibility, we could apply the formula as the term of "**=SUM(INDIRECT(C7&":"&C8))**"
- **ADDRESS()**: return cell reference in text format
- **INDIRECT()**: return cell reference by text
- must know the difference between "**=INDIRECT("D4")**" and "**=INDIRECT(D4)**"
- place **&** to join text strings into a single text string
- **COLUMN()**: return the column number of the cell reference
- **INDEX()**: return value of specified relative cell in specified cell range
## W3 Conditional Logical Functions
- **IF()**: check for condition, then return specified value
- the logistic of true and false

- note that,
>`""` **ISTEXT()**, not **ISBLANK()**
> "**=IF(1,\,)**" will return 0 (**ISNUMBER()**)
> "**=IF("TRUE", "T", "F")**" will return T, but "TRUE" `not` **ISLOGICAL()**
> **TRUE** > **FALSE** > all text > all numbers
- **AND()**: return true if all the literals are true
- **OR()**: return true if at least one literal is true
- outer layer of **IF()** will be processed first, for example,

- **<>** is the operator to indicate `not equal to`, such as "**=3<>2**"
- **NOT()**: return the opposite logical value
- **IFERROR()**: return specified value if error happens
- • Possible error messages:
> #N/A → Search related functions could not find intended target
> #VALUE → Wrong type of input used for a formula or function, such as 5 + abc
> #REF! → Could not find targeted cell reference, usually occurs after deleting cells or data
> #DIV/0! → Can not divide by 0
> #NUM! → Invalid value in a formula or function, such as log0
> #NAME? → Misused or misspelled keywords in a formula or function
> #NULL! → Cell reference or cell reference range refers to a null collection of cells
- **IFNA()**: return specified value if #N/A happens
- **IFS()**: return the specified value of the first literal that is true
- **SWITCH()**: compare the target value with the list of the matched value, then replace it
- **XOR()**: return true if there is add number of literals are true
## W4 Statistical Functions I
- **COUNT()**: return the number of cells that contain `number`, such as "**=COUNT(A1:A6)**", "**=COUNT(5,2,6,9)**"

- **COUNTA()**: return the number of cells that are **not** empty
- **SUM()**: return the sum of the specified cells that are `number`

- **PRODUCT()**: return the product of the specified cells that are `number`
- **AVERAGE()**: return the mean of the specified cells that are `number`
- **AVERAGEA()**: similar to **AVERAGE()**, but `text` and `FALSE` are treated as `0`
- **STDEVA()** series:
- **STDEVP()**: return the **population** standard deviation of the specified cells that are `number`
- **STDEV()**: return the **sample** standard deviation of the specified cells that are `number`
- **STDEVA()**, **STDEVPA()**

- **MEDIAN()**: return the median of the specified cells
- **MODE()**: return the most common number that first appears of the specified cells
- **MODE.MULT()**: return all the multiple modes, note that array function is needed
- **MAX()**, **MIN()**, **MAXA**, **MINA**
- **LARGE()**: return the k-th largest number within the certain range
- **AVERAGE(LARGE(range, {1,2, ... ,k}))**: return the avarage of the top k numbers
- **SMALL()**, **AVERAGE(LARGE(range, {1,2, ... ,k}))**
- **ROUND()**: rounds the input value to the k-th digit after the decimal point, 「四捨五入」
- **ROUNDUP()**「一到九都入」
- **ROUNDDOWN()**/**TRUNC()**「一到九都捨」
- `IMPORTANT` When processing negative numbers, treats them first as positive numbers, performs rounding, then add the negative sign back
- **INT()**: rounds down the input value to the nearest integer
- **ABS()**
- **POWER()** can also directly input in the term of `=a^k`
- **SQRT()**
- **EXP()** means `e^k`, cannot be directlt inputted
- **LOG()**, **LOG10()**, **LN()**
- **RAND()**: returns the random value in the interval `[0,1)`
- **RANDBETWEEN(x,y)**: returns the random value in the interval `[x,y]`, where x, y is treated into integer
- Priority order for processing multiple operators:
> If in the same priority order, then **from left to right**; otherwise,
> 1) Reference operators “:” “ ” “,” (colon, space, comma)
• Union operator : SUM(A1:A3), SUM(A1, A2, A3)
• Intersection operator : SUM(D1:D4 D2:D5), equivalent to
SUM(D2:D4)
> 2) Negative sign “-”
> 3) Percentage “%”
> 4) Exponent “^”
> 5) Multiplication/Division “*”, “/”
> 6) Addition/Subtraction “+”, “-”
> 7) Join “&”
> 8) Comparison “=“, “>”, “<“, “>=“, “<=“, “<>”
## W6 Statistical Functions II
- **COUNTIF()**: counts the number of cells that meet the criteria, such as "**=COUNTIF(A1:A4, "<"&B1)**", "**=COUNTIF(G8:G19, SUM(4))**"
-countif will detect and convert the format of values in both data range and criteria, and values of different format are incomparable
- **wildcard**:
- `*` represents one or more text character
- `?` represents a single text character
- `~*`, `?` represents the corresponding character but not wildcard
- only applicable to conditional criteria
- "**=IF("GOLD">"GAS", "y", "n")**" acts like strcmp()
- **COUNTIFS()**: counts the number of cells that meet all the criteria
- the return value of countif is determined by comparison of combination of cells among cell reference ranges to all criteria

- **AND()**, **OR()** can be achieved in countif
- **SUMPRODUCT()**
- **SUMIF()**: return the sum of specified range that meets the criteria
- **SUMIFS()**, **AVERAGEIF()**, **AVERAGEIFS()**, **MAXIFS()**, **MINIFS()**
- **SEARCH()**: return the position of the text within the specified cell reference
## W5 Lookup and References
- **VLOOKUP()**: search for the matched lookup value
- matching method:
- `FALSE`: Exact match, returns the first matched one
- `TRUE`: Approximate match, returns the largest one that is not greater than the lookup value, pre-sorting data in ascending order is a prerequisite
- return `#N/A` if no match found
- `{=func}` indicates executed as an array function, such as "**{=SUM(VLOOKUP(H1, A1:F7, {3,4}, FALSE))**}" aquivalent to "**=SUM(VLOOKUP(H1, A1:F7, 3, FALSE), VLOOKUP(H1, A1:F7, 4, FALSE))**"
- **HLOOKUP()** similar but is in horizontal version
- **LOOKUP()** simplified version with only approximate match
- don't need the same dimensions between provided lookup range and return range
- automatically extend return range if needed
- use lookup range as return range if not provided
- **XLOOKUP()** pro version with more features, such as return value if no matches, more match methods, more search methods
- **MATCH()**: returns the row/col number of the lookup value that is matched
- **MATCH+INDEX** can do a X-Y search
- **CHOOSE()**: returns the k-th provided value/ cell reference/ cell reference range
## W7 Date and Time
- customise date format:
> yyyy, yy, m, mm, mmm, mmmm, mmmm, d, dd, ddd, dddd, aaa, aaaa
- **DATEVALUE()**: return the date `integer` part of the serial number for the provided date in text format since `1900/1/1`
- it will return `#VALUE!` if date DNE
- **WEEKDAY()**: return the day of week for the provided date
- `1900` is mistaken as leap year by Excel
- will return wrong value prior to `1900/3/1`
- **YEAR()**, **MONTH()**, **DAY()**
- **DATE()**: return the full date, and will automatically rollover and rollback if needed
- customise time format:
> h, hh, m, mm, s, ss, AM/PM
- **TIMEVALUE()**: return the time `decimal` part of the serial number
- **HOUR()**, **MINUTE()**, **SECOND()**
- **TIME()**: return the full time, and will also automatically rollover and rollback if needed
- **VALUE()**: return the full serial number including `date` and `time` part

- **TEXT()**: return the input value in specified format
- `ADD` and `SUB` of date and time is normally resolved in serial numbers, must use **TEXT()** to perform; On the order hand, we can extract into **YEAR()**, **MONTH()**, and so on to perform to avoid bypass the serial numbers, such as "**=TIME(HOUR()+x, MINUTE()+y, SECOND()+z)**"
- **EDATE()**: return the date after the number of month
- **EOMONTH()**: return the last date of month for the provided date after the number of month
- **NOW()**: return the current date and time
- **TODAY()**: return the current date
- **WEEKNUM()**: return the number of week since the first day of the year
- **DATEDIF()**: return the number of years/months/days between the two dates
- **NETWORKDAYS()**: return the number of workdays between the two dates
- use **NETWORKDAYS.INTL()** to customise the weekends
- **WORKDAY()**: return the date after the number of workdays
## W8 Figure Plots
- charts in excel:
- pie chart
- bar chart, histogram
- line graph
- scatter plot (散點圖)
- useful operations:
- directly merge charts by copy and paste
- Switch Row/Column
- combo with secondary axis
- sparkline (show within single cell)
- mini-bar graph
- **REPT()**: repeats provided text for a given number of times
- conditional formatting
- **CHAR(10)**: newline character
- Box Plot
<img src="https://hackmd.io/_uploads/Hy9285IZR.png" style="width: 50%; height: 50%">
- **PERCENTILE.INC()**: return the k-th percentile of the specified data for $0\leq k\leq1$
- **PERCENTILE.EXC()** for $0\lt k\lt1$
- **QUARTILE.INC()**: return the specified quartile of the specified data for $0\leq k\leq4\ where\ k \in \mathbb{Z}$
- **QUARTILE.EXC()** for $0\lt k\lt4\ where\ k \in \mathbb{Z}$
- ==HOW?== modify cell value, stacked column, switch row/col, no fill, error bars
- Gantt Chart
<img src="https://hackmd.io/_uploads/S1M-2qLZR.png" style="width: 60%; height: 60%; border: 1px solid gray;">
- ==HOW?== stacked bar, axis reversed, no fill, set bounds and units
- Step Chart
<img src="https://hackmd.io/_uploads/SyM8p9LbR.png" style="width: 60%; height: 60%; border: 1px solid gray;">
- ==HOW?== line graph, extend data into $(x_n, y_n)\ and\ (x_{n+1}, y_n)$
## W9 Dynamic Table
- could automatically adjust data range and formulas
- ==HOW?== select the date range you want, then select "insert" and "table"
- by **directly type in the formula** or **manually select the data range**
- even could auto-complete the formula for entire row/col
- Pivot Table
- ==HOW?== select "insert" and "pivot table"
- ==OP?== drag field, display hierarchy order, change display setting, display tabular format, filter, merge labels
- Pivot Chart
- ==HOW?== select "PivotTableAnalyse" and "PivotChart"
- could insert slicer to filter data
- Timeline
- ==HOW?== select "PivotTableAnalyse" and "InsertTimeline"
- this is a filter based on time
- ==OP?==
- link slicer of multiple pivot tables from same data
- add field directly in pivot table
- *prerequisities->* incorrect if there are multiple data entries for a single name/unit/type and other than basic addition/subtraction calculations are involved
- split pivot table into different worksheets
- ==HOW?== select "PivotTableAnalyse", "Options", "ShowReportFilterPages", and "SelectFilterField"
- double click to obtain raw data
## W10 Array Formula
- an array in excel means `a list of values represented as {...}`
- values separated by `,` in horizontal array, `;` in vertical array
- ==HOW?== select m x n cell range, enter "**={1,2,3,4}**", press **CTRL+SHIFT+ENTER**
- eg. "**{=A3:A6+B3:B6}**", "**{=A1:A2+B2:B3}**"
- potential errors :(
<img src="https://hackmd.io/_uploads/HyVjvzKGR.png" style="width: 80%; height: 80%; border: 1px solid gray;">
- picture to comprehend Array Formula
<img src="https://hackmd.io/_uploads/S1WAdzKM0.png" style="width: 80%; height: 80%; border: 1px solid gray;">
<img src="https://hackmd.io/_uploads/rJ0tuGKGR.png" style="width: 80%; height: 80%; border: 1px solid gray;">
- "**{=B6:B9*C6:C9}**" + "**=SUM(D6:D9)**" = "**{SUM(B6:B9*C6:C9)}**"
- "**{=IF(A8:A14="gg", B8:B14, "")}**" + "**{=LARGE(F2:F5, {1;4})}**" + "**=AVERAGE(G8:H8)**" = "**{=AVERAGE(LARGE(IF(A8:A14="gg", B8:B14, ""), {1;4}))}**"
- **Copy & Paste** also works for Array Formula, no need to **CTRL+SHIFT+ENTER** when copying
- Pairwise Table (there are several approaches)
- **AND & OR** not works with Array Formula, we could use **multiplication & addition** with combination of **TRUE & FALSE** instead
- `logical` will convert into `number`
## W11 Manipulating Text
- **LEFT(), RIGHT(), MID()**: return the specified number of characters
- **LEN()**: return the length for the given string
- `space, newline, decimal point` will be counted also
- adjustment allows up to 15 digits/ decimal places, and then performs **LEN()**
- Date: date format will be converted into serial number then **LEN()**; text format will directly **LEN()**
- "**0.33**", "**0.3333333333333330000**" will have **LEN()** of 17 regardless of how it displays
- error messages will not be processed
- **TRIM()**: remove leading and trailing `space` (not `newline`) and compress `spaces` between non-space chars into single one
- **CODE()**: return the ASCII code for the given char, while **CHAR()** is the reversed version
- **FIND()**: return the position of pattern within the given string
- to obtain multiple result, we could use Array Formula, or "**C4+1**"
- **SEARCH()** similar but case sensitive
- **CONCATENATE()**, **CONCAT()**: combines all texts and returns it
- **TEXTJOIN()**: combines all texts with delimiter and returns it
- **UPPER()**, **LOWER()**, **PROPER()**
- **REPLACE()**, **SUBSTITUTE()**
- **EXAXT()**: performs text comparison
- **TEXT()**: return the specified text format
- ==OP?== decimals, integers, thousand separator, thousand unit, fractions, percentage, scientific notation, date and time
- `0` vs `#` vs `?`
- `#` only displays significant digits, `0` displays 0 for non-significant digits, `?` displays space instead
- extract n-th word in text, by **Text to Columns** or Array Formula
## W12 Conditional Formatting
- Formats not changeable:
- font, size, superscript, subscript, width of border
- Properties can be used in rules:
- value, text, date, blanks, errors
- `blank` is not defined as how **ISBLANK()** does, while its def is "does not exist any non-space char"
- perform **applying order**, which is not the same as setting order
- ==OP?== stop if true, format painter
- format will automatically change when opening the file next time
- Rule Type:
<img src="https://hackmd.io/_uploads/SkGbN_gER.png" style="width: 80%; height: 80%; border: 1px solid gray;"><img src="https://hackmd.io/_uploads/ryxSV_eNA.png" style="width: 80%; height: 80%; border: 1px solid gray;">
## W13 Statistical Tests
- haha
## W14 Examples and Applications
- haha