Product Line Event Generation in Anvil

# Product Line Event Generation in Anvil

![](https://i.imgur.com/P20wYO3.png)

<small>Jack Crowley <br> September 2019</small>

## Overview

This working paper introduces Forge's production of "Product Line" events. The intended audience is incoming Forge technical staff, though the document may also serve to describe the technical event production process to others. Background knowledge of natural language processing, linguistics, semantics, and general artificial intelligence is useful but not required.

A product line event is defined by the business use of a semantic event. For example, consider the semantic event "Fraud." It has participants, occurs at a certain time, and may have other attributes associated with it, such as a dollar value. Depending on its participants, this fraud event may be cast simultaneously as an Operational Risk (OpRisk) event and a Criminal event.

When a semantic event is cast to a specific product line event, that product line may impose additional restrictions on the event and may require additional data elements. Continuing the example, casting the semantic event "Fraud" to an OpRisk event imposes the restriction that one of the actors must be a banking institution. It also requires additional information that is not part of the semantic event, such as "Impact type" and "Business line".

#### A Brief Diversion to Discuss Forge's Concept of 'Event'

Events represent system dynamics; they are the actions that effect change. In the Anvil platform, events represent the activities conducted by or influencing companies, people, or other entity types. Rarely do we as individuals witness an event, but events are often observed or detected by someone and reported in the news or on social media. In this sense, Forge acts as a sensor: we continually monitor information sources to detect the events being observed and reported. From this perspective events are history.
Once an event is seen and reported, it has already happened. So what use are events?

##### Time Series Analysis

When events are treated as time series elements, interesting causal models can be developed.

##### Events as Evidence

As elements in a Bayesian belief network, the observation of an event can be used as evidence of the state of a system or to infer other events. For example, from observing a wedding it can be inferred that the participants are now in the state "married," and from observing an arrest it can be inferred (with a certain likelihood) that there was a prior "law breaking" event.

##### Events as Narratives

A narrative is a series of events connected via a theme; together they tell the story. For example:

* event 1: Person-A patented a new device
* event 2: Person-A founded Company-X
* event 3: Company-X raised $
* event 4: Company-X had an IPO
* event 5: Company-Y challenged Company-X's patent in a lawsuit
* etc.

Narratives include a set of participants and different types of events involving those participants over time. As a visual analogy, think of the event timelines of different actors being interwoven over the duration of the narrative. Actors can be involved in multiple narratives at the same time. For example, there is a set of common actors and events associated with my "work narrative" and another associated with what I label my "home narrative."

## NLP Pipeline

Forge collects and processes public data from tens of thousands of sources in real time. The processing simultaneously informs our entity intelligence environment. It extracts and resolves themes, relationships, and entities, and performs a number of other functions such as topic modeling and calculating entity-level sentiment. All of this data is accessible to our customers (<< see snowflake reference >>).

One of the AI models that is applied identifies event "prototypes" in the form of thematic roles.
A thematic role expresses the position a noun phrase plays with respect to an action. A partial list of thematic roles includes actor, affected entity, benefactor, source, recipient, and experiencer.

In Forge's Anvil environment, thematic roles do not represent an event. Subsequent processes reason over the ontological types and resolved identities of the thematic roles and incorporate other semantic and topical information identified in the document. They first produce semantic events, then map these semantic events to specific business and industry use cases (product line events), and ultimately to event groups and event narratives. An outline of these processes follows.

## Generating Semantic Events from Thematic Role Data

The unreasonable effectiveness of data

![](https://i.imgur.com/4pPP3Sd.png)

## Event Groups and Narratives

![](https://i.imgur.com/Yex8a7V.png)

## Generating Product Line Events from Semantic Events

![](https://i.imgur.com/o8uUgIy.png)

## Event Database Schema

The database schema used in the event production process is shown below.

![](https://i.imgur.com/GQwHWCi.png)

The schema is color coded as follows:

* Yellow tables represent the Snowflake schema tables that are referenced by the event tables.
* Brown tables represent thematic role information extracted during the runtime processing of a document. This information is intended to be populated by the program that stores processed documents in the target database.
* Green tables contain reference data. This reference data includes event type information, possible lexical units for the different event types, the mapping of events to product lines, and the association of product lines with meta information (e.g. Basel category).
* Blue tables are populated by the complex event processor discussed below and represent the mapping of thematic roles to specific event types.
* Red tables store the history of events as they are updated, either automatically or through human processes such as QC.

## Glossary

#### Thematic Role

#### Semantic Event

#### Lexical Unit

#### Event Group

#### Narrative
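
As a rough illustration of the narrative concept described under "Events as Narratives" (a series of events connected via a theme, with participants whose timelines interweave), the following is a minimal sketch only. The class names, field names, and dates are hypothetical placeholders chosen for this example; they are not taken from Anvil's schema or code.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Event:
    """One observed event. Field names are hypothetical, not Anvil's schema."""
    event_type: str
    participants: frozenset
    occurred_on: date


@dataclass
class Narrative:
    """A theme plus the events that belong to it."""
    theme: str
    events: list = field(default_factory=list)

    def add(self, event: Event) -> None:
        self.events.append(event)

    def timeline(self) -> list:
        # Ordering the events by date gives the story in sequence.
        return sorted(self.events, key=lambda e: e.occurred_on)

    def participants(self) -> frozenset:
        # Union of every actor that appears anywhere in the narrative.
        actors = frozenset()
        for event in self.events:
            actors |= event.participants
        return actors


if __name__ == "__main__":
    # Dates are invented purely to give the events an order.
    story = Narrative(theme="Company-X patent story")
    story.add(Event("IPO", frozenset({"Company-X"}), date(2003, 1, 1)))
    story.add(Event("Patent", frozenset({"Person-A"}), date(2000, 1, 1)))
    story.add(Event("Founding", frozenset({"Person-A", "Company-X"}), date(2001, 1, 1)))
    for event in story.timeline():
        print(event.occurred_on, event.event_type, sorted(event.participants))
```

Keeping participants as a set on each event means the same actor can appear in several narratives at once, which matches the "work narrative" and "home narrative" example above.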