RSelenium
The aim is to scrape all the tables at
Data Analysis and Extraction - RSelenium tutorial
dir.R.packages <- "C:/Program Files/R/R-4.0.3/library"
#install.packages("tidyquant", lib = dir.R.packages)
#install.packages("RSelenium", lib = dir.R.packages)
library(RSelenium,lib.loc = dir.R.packages)
Download ChromeDriver. Its version must match the installed Chrome browser: a driver built for a newer Chrome release will not work with an older browser. The version that worked here is ChromeDriver 89.0.4389.23.
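One way to catch a mismatch before starting the server is to compare the major version numbers of the browser and the driver. The sketch below assumes that matching major versions is sufficient, and the helper name `same_major_version` is hypothetical. The version strings are the ones that appear in this note (Chrome 89.0.4389.90 is taken from the session info in the error log further down).

```r
# Hypothetical helper: TRUE when two dotted version strings share a major version
same_major_version <- function(browser, driver) {
  major <- function(v) as.integer(strsplit(v, ".", fixed = TRUE)[[1]][1])
  major(browser) == major(driver)
}

chrome.version <- "89.0.4389.90"  # installed Chrome browser
driver.version <- "89.0.4389.23"  # downloaded ChromeDriver
same_major_version(chrome.version, driver.version)
```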
Download Selenium Server
Launch cmd.exe and change directory to D:/My Software, where selenium-server-standalone-3.141.59.jar is located
D:
cd D:\My Software
# Execute the following command
java -Dwebdriver.chrome.driver="C:\drivers\chromedriver_win32\chromedriver.exe" -jar selenium-server-standalone-3.141.59.jar
Open http://localhost:4444/ in a new Google Chrome tab.
Go to http://localhost:4444/wd/hub/static/resource/hub.html, click Create a new session, and select chrome as the browser.
con <- RSelenium::remoteDriver(remoteServerAddr="localhost"
,port=4444
,browserName="chrome")
# Open the connection
con$open()
# Send a URL to the new session
con$navigate("https://quickfs.net/company/CKF:AU")
#-------------------------------------------------------------
# Dropdown= Overview
# Select "overview" from the dropdown list and then get the content of all tables
#-------------------------------------------------------------
# htmlParse() and readHTMLTable() come from the XML package
library(XML, lib.loc = dir.R.packages)
tables <- htmlParse(con$getPageSource()[[1]]) # class(tables)
readHTMLTable(tables)
# Extracting tables
library(rvest, lib.loc = dir.R.packages)
x <- con$getPageSource()[[1]] %>%
read_html() %>%
html_table()
table.1 <- x[[1]] # class(table.1) "data.frame"
table.2 <- x[[2]] # class(table.2) "data.frame"
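To see what `html_table()` returns without a live browser session, the same pipeline can be run on an inline HTML string. This is a minimal sketch: the HTML and the `Metric`/`Value` column names are made up for illustration and are not taken from the quickfs.net page.

```r
library(rvest)

# Made-up page source standing in for con$getPageSource()[[1]]
page.source <- "<html><body>
<table>
<tr><th>Metric</th><th>Value</th></tr>
<tr><td>P/E</td><td>12.5</td></tr>
<tr><td>P/B</td><td>1.8</td></tr>
</table>
</body></html>"

# html_table() returns a list with one element per <table> in the document
demo.tables <- page.source %>%
  read_html() %>%
  html_table()

length(demo.tables)      # number of tables found
names(demo.tables[[1]])  # column names taken from the <th> cells
```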
# Extract sub tables
names(table.1)
str(table.1)
# Reshape the table Valuation Ratios
library(tidyr, lib.loc = dir.R.packages)
table.1.1 <- table.1[c(2:9),c(1:2)] %>%
dplyr::rename(name=`Key Statistics`
,value=`Key Statistics.1`) %>%
tidyr::pivot_wider(names_from = name, values_from=value)
# Clean column names
colnames(table.1.1) <- sub(x=colnames(table.1.1), pattern = "/", replacement = ".")
# Reshape the table 10-Yr Median Returns
table.1.2 <- table.1[c(2:4),c(3:4)] %>%
dplyr::rename(name=`Key Statistics`
,value=`Key Statistics.1`) %>%
tidyr::pivot_wider(names_from = name, values_from=value)
# Reshape the table 10-Year CAGR
table.1.3 <- table.1[c(6:9),c(3:4)] %>%
dplyr::rename(name=`Key Statistics`
,value=`Key Statistics.1`) %>%
tidyr::pivot_wider(names_from = name, values_from=value)
# Reshape the table 10-Yr Median Margins
table.1.4 <- table.1[c(2:5),c(5:6)] %>%
dplyr::rename(name=`Key Statistics`
,value=`Key Statistics.1`) %>%
tidyr::pivot_wider(names_from = name, values_from=value)
# Reshape the table Capital Structure
table.1.5 <- table.1[c(7:9),c(5:6)] %>%
dplyr::rename(name=`Key Statistics`
,value=`Key Statistics.1`) %>%
tidyr::pivot_wider(names_from = name, values_from=value)
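The five reshaping steps above differ only in which rows and columns they cut out of `table.1`, so they could be folded into one function. The sketch below is one way to do that; `reshape_block` is a hypothetical helper, it renames the two columns by position instead of by the `Key Statistics` header text, and the toy table is made up so the example runs without a browser session.

```r
library(tidyr)

# Hypothetical helper: cut a two-column block out of a table and spread it wide
reshape_block <- function(df, rows, cols) {
  block <- df[rows, cols]
  names(block) <- c("name", "value")  # rename by position, not by header text
  tidyr::pivot_wider(block, names_from = name, values_from = value)
}

# Made-up stand-in for table.1: a header row followed by label/value pairs
toy.table <- data.frame(
  label  = c("Valuation Ratios", "P/E", "P/B"),
  amount = c("", "12.5", "1.8"),
  stringsAsFactors = FALSE
)

toy.wide <- reshape_block(toy.table, rows = 2:3, cols = 1:2)
names(toy.wide)  # one column per label
```

With the real data, `table.1.1` would correspond to `reshape_block(table.1, 2:9, 1:2)`, assuming the block boundaries used above.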
#----------------------------------------
# Reshape the table with 10 year overview
#----------------------------------------
str(table.2)
table.2$X1 <- table.2$X1 %>%
gsub(x=., pattern = " ", replacement = ".") %>%
gsub(x=., pattern = "%", replacement = "percent")
base.table.2 <- data.frame()
iterators <- colnames(table.2)[2:11]
years <- table.2[1,c(2:11)] # dim(years) 1 10
item.names <- table.2[c(2:14),1]
for(i in 1:ncol(years)){
# Get column name by position
year <- years[1,i] # "2011"
name <- colnames(years)[i]
# Reshape a single year of data to long format
.year.long <- data.frame( name=table.2[c(2:14), 1]
,value=table.2[c(2:14),i+1]
,stringsAsFactors = FALSE)
.year.wide <- .year.long %>%
tidyr::pivot_wider(names_from = name, values_from=value) %>%
# Add year
dplyr::mutate(year=year) %>%
dplyr::select(year,everything())
# Vertically add the current year of data to the base data set
base.table.2 <- dplyr::bind_rows(base.table.2, .year.wide)
}
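The loop builds one wide row per year. A loop-free alternative, sketched here with base R's `t()` in place of the per-year `pivot_wider()` calls, is to transpose the block of values directly. The toy table is made up; it only mimics the layout of `table.2` (item names in the first column, years in the first row).

```r
# Made-up stand-in for table.2: first row holds years, first column holds item names
toy.table <- data.frame(
  X1 = c("", "Revenue", "EPS"),
  X2 = c("2011", "100", "1.2"),
  X3 = c("2012", "110", "1.4"),
  stringsAsFactors = FALSE
)

years  <- unname(unlist(toy.table[1, -1]))  # "2011" "2012"
items  <- toy.table[-1, 1]                  # "Revenue" "EPS"
values <- t(toy.table[-1, -1])              # character matrix: one row per year

# Put the year first, then one column per item
toy.wide <- data.frame(year = years, values,
                       stringsAsFactors = FALSE, check.names = FALSE)
colnames(toy.wide) <- c("year", items)
rownames(toy.wide) <- NULL
toy.wide
```

All values stay as character strings here, as they do in the loop version; converting them to numbers would be a separate step.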
Inspect the web elements
<div _ngcontent-c1="" class="col-xs-offset-3 col-xs-2"><select-fs-dropdown _ngcontent-c1="" _nghost-c4="">
<div _ngcontent-c4="" class="btn-group open" dropdown="">
<button _ngcontent-c4="" class="selectDropdown dropdown-toggle" dropdowntoggle="" type="button" aria-haspopup="true" aria-expanded="true">
<div _ngcontent-c4="" class="dropdownLabel">Overview</div>
</button>
<!----><ul _ngcontent-c4="" class="dropdown-menu" id="select-fs-dropdown" role="menu">
<!----><li _ngcontent-c4="">
<a _ngcontent-c4="" id="ovr">Overview</a>
</li><li _ngcontent-c4="">
<a _ngcontent-c4="" id="is">Income Statement</a>
</li><li _ngcontent-c4="">
<a _ngcontent-c4="" id="bs">Balance Sheet</a>
</li><li _ngcontent-c4="">
<a _ngcontent-c4="" id="cf">Cash Flow Statement</a>
</li><li _ngcontent-c4="">
<a _ngcontent-c4="" id="ratios">Key Ratios</a>
</li>
</ul>
</div>
</select-fs-dropdown></div>
id.ovr <- con$findElement(using = 'id', value = "ovr")
Selenium message:no such element: Unable to locate element: {"method":"css selector","selector":"#ovr"}
(Session info: chrome=89.0.4389.90)
For documentation on this error, please visit: https://www.seleniumhq.org/exceptions/no_such_element.html
Build info: version: '3.141.59', revision: 'e82be7d358', time: '2018-11-14T08:25:53'
System info: host: 'CHANG-PC', ip: '192.168.0.167', os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.8.0_231'
Driver info: driver.version: unknown
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
Further Details: run errorDetails method
R - Rselenium - navigate drop down menu / list / box using = 'id'
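One possible cause, not confirmed here, is that the `<li>` items only exist in the DOM while the dropdown is open (the inspected markup shows `class="btn-group open"`), so `#ovr` cannot be found until the toggle button has been clicked and the menu has rendered. A common workaround is to click the toggle and then poll for the element. The sketch below shows only the polling part as a generic helper: `retry_until` is a hypothetical name, the demo uses a stand-in function instead of a live browser, and the commented lines at the end are an untested guess at how it might be wired to `con`.

```r
# Hypothetical helper: call action() until it stops raising an error or time runs out
retry_until <- function(action, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    result <- tryCatch(action(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (Sys.time() >= deadline) stop(result)  # give up and re-raise the last error
    Sys.sleep(interval)
  }
}

# Stand-in for con$findElement(): fails twice, then succeeds
attempts <- 0
flaky.lookup <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("NoSuchElement")
  "element found"
}
lookup.result <- retry_until(flaky.lookup, timeout = 5, interval = 0.1)

# Untested sketch against the live session (assumes the toggle matches this selector):
# toggle <- con$findElement(using = "css selector", value = "button.selectDropdown")
# toggle$clickElement()
# id.ovr <- retry_until(function() con$findElement(using = "id", value = "ovr"))
# id.ovr$clickElement()
```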