# Server Log Selection
In the previous flow, the server logs were downloaded from the servers through jump hosts certified by the client, and the logs were then put onto the SFTP server for the supplier to download.
# Requirement
Add a web resource on one of the servers to download the logs directly, without going through the jump hosts or putting the logs onto the SFTP server.
# Scenarios
In a typical use of the logs, a consumer
1. performs a transaction,
2. downloads the log, and
3. searches for keywords within a time interval.
# Problems
The log may be so large that
1. it takes a long time to download, and
2. the relevant part is only a small slice of the chaotic content of the log.
# Flow
1. First, we cache previously downloaded logs. A download is only triggered when the content is not found in the cache or a forced update is requested. As a result, the download time decreases.
2. Second, we only return the time interval in which the consumer performed the transaction, padded by a few minutes or seconds before and after.
# Techniques
## Cache Log File
### Read-Write Pattern
- If a request is reading a file, no one can write to the file, but other reads are allowed.
- If a request is writing a file, no other request can read or write the file.
Fetch
```
while (read_count < 0) Download_Target.wait(); // a download is in progress
read_count++                                   // register as a reader
// read
read_count--                                   // unregister as a reader
if (read_count == 0) Download_Target.notify(); // wake a waiting download
```
Download
```
while (read_count > 0) Download_Target.wait(); // wait until the readers finish
if (read_count < 0) return;                    // another download is in progress
read_count--                                   // mark the download in progress (read_count == -1)
// download
read_count++                                   // back to 0
Download_Target.notify()                       // wake waiting requests
```
### Flow
```
Request
   |
   v
 Fetch ---not found---> Download ---> Fetch
   |                                    |
   | found                              |
   v                                    |
 return <-------------------------------+
```
### Definitions
In the description below, we only consider requests fetching the same file, because if all requests fetch different log files there is no race condition between them.
*step N*
- step 1: `return` (Download only), `.wait()`, or proceed to step 2
- step 2: read or download
- step 3: `.notifyAll()`
### Situations
**Fetch step 1**, while another request is in
- Fetch step 2
  - due to read_count > 0, the Fetch does read_count++ and proceeds to step 2 (Fetch step 1)
- Fetch step 3
  - the Fetch does read_count--, and if (read_count == 0) `.notify()` (Fetch step 3)
- Download step 2
  - due to read_count < 0, the Fetch calls `.wait()` (Fetch step 1)
- Download step 3
  - the Download sets read_count = 0 and calls `.notify()` (Download step 3)

**Download step 1**, while another request is in
- Download step 2
  - due to read_count < 0, the Download `return`s (Download step 1)
- Download step 3
  - the Download sets read_count = 0 and calls `.notify()` (Download step 3)
- Fetch step 2
  - due to read_count > 0, the Download calls `.wait()` (Download step 1)
- Fetch step 3
  - the Fetch does read_count--, and if (read_count == 0) `.notify()` (Fetch step 3)

**No one in step 2 & 3**
- Fetch step 1
  - read_count == 0, so read_count++ and proceed to step 2
- Download step 1
  - read_count == 0, so read_count-- and proceed to step 2
### Synchronization
`read_count` is updated by many requests at a time, so we have to `synchronize` access to it to prevent race conditions.
To decide whether a request has to `.wait()`, the `read_count` critical section has to be nested inside the `DownloadTarget` critical section.
The `.wait()` and `.notify()` calls, however, have to be outside the `read_count` critical section, because `.wait()` only releases the monitor it is called on, not the enclosing monitors.
```java=
// Thread A
synchronized (DownloadTarget) {
    // Thread A holds the lock of DownloadTarget
    synchronized (read_count) {
        // Thread A holds the locks of DownloadTarget and read_count
        DownloadTarget.wait(); // Thread A releases DownloadTarget only
        // Thread A still holds the lock of read_count while waiting
    }
}
// Thread B
synchronized (DownloadTarget) {
    // Thread B holds the lock of DownloadTarget
    synchronized (read_count) { // Thread B blocks: Thread A still holds the lock of read_count -> deadlock
        DownloadTarget.notify();
    }
}
```
```java=
// Thread A
synchronized (DownloadTarget) {
    // Thread A holds the lock of DownloadTarget
    synchronized (read_count) {
        // Thread A holds the locks of DownloadTarget and read_count
    }
    // Thread A holds the lock of DownloadTarget
    DownloadTarget.wait(); // Thread A releases DownloadTarget -> Thread A holds no locks
    // Thread A reacquires the lock of DownloadTarget when woken up
}
// Thread B
synchronized (DownloadTarget) {
    // Thread B holds the lock of DownloadTarget
    synchronized (read_count) {
        // Thread B holds the locks of DownloadTarget and read_count
    }
    // Thread B holds the lock of DownloadTarget
    DownloadTarget.notify(); // Thread B keeps DownloadTarget until the block exits
}
```
### Fetch
```java=
synchronized (Cache) {
    if not fetched yet
        initialize the record in the Cache
        return false
}
while (true) {
    synchronized (Download_Target) {
        synchronized (read_count) {
            if read_count >= 0          // no download in progress
                read_count++            // register as a reader
                break
        }
        Download_Target.wait()          // a download is in progress
    }
}
// Fetch content ...
if not found
    return false
else
    return true
// Finally
synchronized (Download_Target) {
    synchronized (read_count) {
        read_count--                    // unregister as a reader
        if read_count == 0
            Download_Target.notifyAll() // wake waiting downloads
    }
}
```
### Download
```java=
retry: while (true) {
    synchronized (Download_Target) {
        synchronized (read_count) {
            switch (read_count) {
                case 0:                  // the file is free
                    read_count--         // mark the download in progress (-1)
                    break retry
                case -1:                 // another download is already in progress
                    return false
                default:                 // readers are active
                    break
            }
        }
        Download_Target.wait()           // wait until the readers finish
    }
}
// download file
// update Cache
return true
// Finally
synchronized (Download_Target) {
    synchronized (read_count) {
        read_count = 0                   // clear the download mark
        Download_Target.notifyAll()      // wake waiting fetches and downloads
    }
}
```
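For reference, the coordination above can be condensed into a small, runnable Java sketch. This is an illustrative simplification rather than the actual service code: it folds the `read_count` lock into the `Download_Target` monitor so every counter update happens while holding that monitor, and the class and method names are assumptions.
```java=
// Illustrative sketch of the read_count coordination; not the actual service code.
public class DownloadCoordinator {

    private final Object downloadTarget = new Object();
    private int readCount = 0; // > 0: active readers, -1: a download in progress

    // Called before reading the cached file; blocks while a download runs.
    public void beginFetch() throws InterruptedException {
        synchronized (downloadTarget) {
            while (readCount < 0) {
                downloadTarget.wait(); // a download is in progress
            }
            readCount++; // register as a reader
        }
    }

    // Called after reading the cached file.
    public void endFetch() {
        synchronized (downloadTarget) {
            readCount--; // unregister as a reader
            if (readCount == 0) {
                downloadTarget.notifyAll(); // wake waiting downloads
            }
        }
    }

    // Returns false if another download is already in progress.
    public boolean beginDownload() throws InterruptedException {
        synchronized (downloadTarget) {
            while (readCount > 0) {
                downloadTarget.wait(); // wait until the readers finish
            }
            if (readCount < 0) {
                return false; // another download owns the file
            }
            readCount = -1; // mark the download in progress
            return true;
        }
    }

    // Called after the file is downloaded and the Cache is updated.
    public void endDownload() {
        synchronized (downloadTarget) {
            readCount = 0; // clear the download mark
            downloadTarget.notifyAll(); // wake waiting fetches and downloads
        }
    }
}
```
A fetch request would call `beginFetch()`/`endFetch()` around the cached read, and fall back to `beginDownload()`/`endDownload()` when the content is missing, matching the flow diagram above.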
## Log Entry
### Identify a Log Entry
In the NTABS server, all logging is done by the log4j library. With the configuration set up in the NTABS server, each log entry contains some meta information about the log message, such as the datetime, logging level, thread, message class, and source information. This meta information is formatted in a fixed layout, so a regex can match the start of a log entry.
The log format is as follows, where `%d` is the datetime, `%p` is the logging level, `%t` is the thread, `%c` is the message class, and `%m` is the log message.
```log4j
%d|%-5p|%-50.50t|%-30.30c{1}| %m%n
```
A sample log entry:
```log
2022-04-13 17:00:00,000|DEBUG|WebContainer:31|MnmServletFilter| [MNMLOG] CATEGORY...
```
With a fixed format, we can use a regex to capture the start of a log message by matching the datetime and logging-level slots at the start of the message.
```regex
|------------------datetime----------------| |-level-|
\d{4}(?>-\d{2}){2} \d{2}(?>:\d{2}){2}\,\d{3}\|[ A-Z]{5}\|.*
```
> If you are interested in how this regex works, visit [Regex 101](https://regex101.com/r/1tVFqw/1) to see an explanation of the statement.
Since all log entries start with the same meta information format, the regex can match both the start of a log message and the start of the one after it. This gives the location of a log message in the log file.
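The `START_OF_LOG` pattern referenced below is presumably the compiled form of this regex. A plausible Java declaration (the exact field declaration in `LogFileAccessor` is an assumption):
```java=
import java.util.regex.Pattern;

// Presumed compiled form of the regex above; the exact declaration is an assumption.
static final Pattern START_OF_LOG = Pattern.compile(
        "\\d{4}(?>-\\d{2}){2} \\d{2}(?>:\\d{2}){2},\\d{3}\\|[ A-Z]{5}\\|.*");
```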
```java=
private long findNextLog(long pointer) {
try {
this.logFile.seek(pointer);
long index = pointer;
// fetch next line and until reaching the start of next log message
while (!LogFileAccessor.START_OF_LOG.matcher(this.logFile.readLine()).matches()) {
index = this.logFile.getFilePointer();
}
// set pointer at the start of the next log message
this.logFile.seek(index);
return index;
} catch (IOException | NullPointerException e) {
return -1;
}
}
```
### Time-Index Relation
For a very large log file (>500MiB), searching for a specific logging time sequentially (linear search) takes on the order of 10 minutes per search, which is not really usable in a real-life application. Since all log entries in the log file are ordered by time, we can treat the log file as a sorted array of log entries. Thus, binary search can be used to greatly improve the search performance: a search with the binary search algorithm takes about 3 to 5 seconds on an 800MiB log file.
```java=
// non-recursive binary search
while (start < end) {
pointer = this.findNextLog((end + start) / 2);
if (pointer == end) {
// reach end of search,
// pointer should point to the start of the targeted log entry
break;
}
String log = Optional.ofNullable(this.getNextLog(pointer)).orElse("1970-01-01 00:00:00");
long time = LogFileAccessor.toUnixTimestamp(log.substring(0, TIME_LENGTH));
if (time < target) {
start = pointer;
}
else {
end = pointer;
}
}
```
> Fetch the starting index of a log entry using the `LogFileAccessor.findNextLog(long)` method, and read the first 19 characters of the log as the datetime of the log entry.
> This implementation of binary search in the server log service uses a non-recursive method, because on a large log file a recursive search sometimes causes a `StackOverflowError`.
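The `toUnixTimestamp` helper is not shown in this section; a minimal sketch of what it plausibly does, assuming the `%d` layout above and server-local timestamps (both assumptions):
```java=
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of toUnixTimestamp; the real helper is not shown here.
private static final DateTimeFormatter LOG_TIME =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

static long toUnixTimestamp(String text) {
    // parse the first 19 characters of a log entry, e.g. "2022-04-13 17:00:00"
    return LocalDateTime.parse(text, LOG_TIME)
            .atZone(ZoneId.systemDefault()) // assumes server-local log timestamps
            .toEpochSecond();
}
```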
The time-index mappings fetched during earlier searches are cached to improve the performance of later searches: a cached mapping can constrain the search range, or even answer a query directly when an exact hit occurs.
```java=
private long fetchPointer(long target) throws IOException {
// check bound
if (target <= this.head) {
return 0;
}
if (target > this.tail) {
return this.logFile.length();
}
// try cached
int index = Collections.binarySearch(this.cache, target);
if (index >= 0) {
return this.cachePointer.get(index);
}
// find pointer
index = -(index + 1);
long pointer = this.tryPointer(index, target);
// maintain cache
this.cache.add(index, target);
this.cachePointer.add(index, pointer);
return pointer;
}
```
### Slice the Log
For slicing the log file into smaller time range a start time and end time are given in the request. We can locate the start and end of the time range in the log file. Then by using the start and end locations to limit the read of a `FileInputStream` can achieve the read of only a specific part of a file.
There is an implmentation of such limited input stream in the `Apache Commons IO` library(2.0+), [BoundedInputStream](https://commons.apache.org/proper/commons-io/javadocs/api-2.4/org/apache/commons/io/input/BoundedInputStream.html). However, the `Apache Commons IO` library in the Gaia project is an older version of 1.4, so we need to implement our own version of a limited input stream. So there is the `LimitedInputStream`, it is an `InputStream` compliance control layer on another input stream to provide starting position and read limit control on an input stream of any kinds. The `LimitedInputStream` uses a read counter to check and control the range that is allowed to read.
```java=
@Override
public int read(byte[] bytes, int offset, int length) throws IOException {
    this.init(); // move to the starting position before the first read
    if (this.current < this.end) {
        length = this.readable(length); // clamp to the remaining readable range
        int read = this.inputStream.read(bytes, offset, length);
        this.current += read != -1 ? read : 0; // advance the read counter
        return read;
    }
    return -1; // the end of the allowed range is reached
}
```
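The `init()` and `readable(int)` helpers used above are not shown; based on the described behaviour they plausibly look like the following (the field names `start`, `end`, `current`, and `initialized` are assumptions):
```java=
// Hypothetical helpers for the read(...) excerpt above; field names are assumptions.
private void init() throws IOException {
    if (!this.initialized) {
        // skip forward to the starting position before the first read
        long remaining = this.start;
        while (remaining > 0) {
            long skipped = this.inputStream.skip(remaining);
            if (skipped <= 0) {
                break; // the underlying stream cannot skip any further
            }
            remaining -= skipped;
        }
        this.current = this.start;
        this.initialized = true;
    }
}

// Clamp a requested read length to the bytes left before the end position.
private int readable(int length) {
    return (int) Math.min(length, this.end - this.current);
}
```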
With the Unix-time-to-byte-location conversion and the `LimitedInputStream`, we can get a specific part of the log file from a start time and an end time. The following code snippet shows how the `LogFileAccessor` finds the targeted range and opens a `LimitedInputStream` for reading.
```java=
public InputStream getStream(long startTime, long endTime) throws IOException {
long startIndex = this.fetchPointer(startTime);
long endIndex = this.fetchPointer(endTime + 1);
return new LimitedInputStream(new FileInputStream(this.file), startIndex, endIndex);
}
```
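As a usage sketch (the `accessor` variable and the `outputStream` destination are hypothetical), a request for a window around a transaction time can stream the slice directly to the client:
```java=
// Hypothetical usage: stream one minute of log around a transaction time.
long txTime = LogFileAccessor.toUnixTimestamp("2022-04-13 17:00:00");
InputStream in = accessor.getStream(txTime - 30, txTime + 30);
try {
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
        outputStream.write(buffer, 0, read); // e.g. the HTTP response body
    }
} finally {
    in.close();
}
```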
###### tags: `System Programming`