---
title: "Seminar - Artificial Intelligence (Text Summarization)"
author: "Arie Levy"
date: "2024"
---
# **Seminar - Artificial Intelligence (Text Summarization)**
### **Arie Levy**
---
## **Table of Contents**
1. [Introduction](#introduction)
2. [Literature Review](#literature-review)
- [Extractive Summarization](#extractive-summarization)
- [Abstractive Summarization](#abstractive-summarization)
3. [Discussion and Conclusion](#discussion-and-conclusion)
4. [Bibliography](#bibliography)
---
## Introduction:
Text is one of the traditional ways in which people communicate. With the increasing availability of text data in electronic form, handling and analyzing text with computers has gained popularity. Handling text data with machine learning methods has brought interesting challenges to the field, which have been further expanded by the integration of natural language features. As these methods became able to address more complex problems related to text data, expectations grew and called for more sophisticated methods, especially combinations of methods from different research fields including information retrieval, machine learning, statistical data analysis, and data mining.
Automatic text summarization is becoming an integral part of many systems, pushing the boundaries of research toward what can be called never-ending learning from text, aimed at mimicking human learning. This article introduces the development of text summarization research, pointing out interesting research problems that have been, and are still being, addressed, sample tasks that have been tackled, and some challenges along the way.
It is not always easy to understand what a given text conveys; sometimes we "drown" in the sheer number of words or in a load of irrelevant information. Automatic text summarization can often contribute to understanding the text, gaining knowledge from data provided in textual form, and grasping the basic facts conveyed through the text.
As electronic media becomes widely used, the amount of text in electronic form is increasing rapidly and still growing. While these texts are primarily aimed at human readers, it is not uncommon to use computer programs to manipulate them. Text manipulation by computer programs has a wide range of uses, including text extraction, text editing, text storage and indexing for search and retrieval, document ranking, document classification, extracting information and knowledge, question answering, and more.
In this article, we present the development of AI research on the subject of text summarization. The article points out some interesting research problems that have been, and are still being, addressed, details some sample tasks that have been tackled, and notes some challenges along the way.
The development of text summarization has advanced significantly from the 1950s to the present day, with each stage characterized by new technological methods and more advanced techniques. The development of the field can be broken down into the following stages:
1. The Initial Phase (1950–1990)
The initial work in the field of text summarization began in the mid-20th century. In the 1950s, researchers relied mainly on simple statistical methods, such as word-frequency calculations.
These methods tried to identify words that were considered central to the text. For example:
- The TF-IDF algorithm, which finds the most important sentences based on the occurrence of these central words. This is in effect a simple extractive summary, focused on identifying important keywords and sentences without understanding the meaning of the text.
2. Early Modern Phase (1990–2015)
In the 1990s and early 2000s, a significant change occurred when natural language processing (NLP) techniques, rule-based methods, and machine learning began to be used. Researchers started to focus on syntactic analysis of text and on building more advanced tools that can identify relationships between words, sentences, and paragraphs. This kind of summarization works through syntactic analysis and feature-based machine learning. For example:
- The TextRank algorithm, which is based on graph representations and is used for extractive summarization, and rule-based systems that began to capture the semantic connections between words.
3. Current Phase (2015–present)
In 2015, another significant change took place with the introduction of deep neural networks and Transformer-based models.
These technologies harnessed the power of machine learning and artificial intelligence to create abstractive summaries. This kind of summarization produces original, fluent summaries based on a broad understanding of the text, in which the system not only selects sentences from the text but also reformulates the main ideas. For example: BERT, GPT, T5, and CW.
4. The Future Stage and Current Challenges
The next stage involves the advanced deployment of Transformer-based models, combining domain knowledge with deep learning, and developing additional capabilities such as multilingual text summarization. However, many challenges remain, such as creating more accurate summaries, preventing misleading summaries, and improving the models' semantic understanding.
---
### In summary:
Technological advances in artificial intelligence have led to the development of advanced methods for summarizing texts, which allow us to process and understand large amounts of information quickly and precisely. The transition from traditional extractive methods, which focus on selecting the most important sentences from the original text, to more complex abstractive methods, which create a new, in-depth summary, illustrates the continuous improvement of natural language processing tools. The use of advanced models such as BERT and CW makes the process more intuitive and fluent, advancing our ability to produce high-quality text summaries suitable for diverse needs.
## Literature Review:
Text summarization using artificial intelligence is divided into two main types: extractive summarization and abstractive summarization.
The two types differ in their approach and in the purpose of the summary, and each has different benefits and uses depending on the case.
## Extractive Summarization:
In extractive summarization, the system selects existing sentences or paragraphs from the original text and rearranges them to create a shorter summary. The goal is to select the most important parts of the original text without altering or paraphrasing the content.
The algorithm ranks the sentences in the original text according to their importance, using various techniques such as computing word or sentence weights, or using models that rely on importance labels.
Key issues in automatic text summarization include how to identify the most important content in the text and how to synthesize the selected material into a draft summary text.
In general, there are two different approaches to text summarization: the selection-based approach and the understanding-based approach. Text summarization systems can choose to use shallow text features at the sentence or discourse level to locate the important sentences that will make up an "extract" summary. Such extraction methods often treat the "most important" content as the "most common" or "best-placed" content, thus avoiding any effort at deep text understanding. They are easy to implement and generally applicable to different textual genres, but it is usually very difficult to achieve performance that exceeds the level reached so far.
### Methods and Tools:
#### 1) Extraction Summary Methods:
Most practical text summarization systems today are based on extraction. A summary is created by extracting sentences and then arranging the extracted sentences sequentially, without rewriting them.
Different extraction methods use different text attributes to represent the content of the text. These attributes may include: thematic features based on term-frequency statistics; positional features such as position in the text or placement in a particular paragraph or passage; background features such as terms from titles and headings in the text; cue words and phrases such as in-text concluding cues ("in summary", "our investigation") and bonus terms such as "significant" or "impossible". Such features can be analyzed individually or combined selectively into a function used to identify important words and meaningful sentences in the text. Sentence scores are calculated based on the assessed importance of words. Sentences that are concentrations of high-scoring (meaningful) words are often the target sentences to be extracted.
The earliest studies of text summarization began in the late 1950s with the pioneering work of Hans Peter Luhn, who introduced a statistical method for extracting sentences that calculated a significance score for each sentence based on word-frequency counts.
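To make this idea concrete, here is a minimal sketch of frequency-based sentence scoring in the spirit of Luhn's method (not his original implementation); the stop-word list, the number of "significant" words, and the summary length are illustrative choices:

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"}

def frequency_based_summary(text, num_sentences=3, top_k_words=10):
    """Extractive summary: score sentences by their density of frequent (significant) words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    # Treat the most frequent non-stop words as "significant" words.
    significant = {w for w, _ in Counter(words).most_common(top_k_words)}

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Density of significant words in the sentence.
        return sum(t in significant for t in tokens) / len(tokens) if tokens else 0.0

    top = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Keep the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)
```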
Based on our research, we applied MEAD's sentence extraction techniques to summarizing countries' economic reports (IMF staff reports) and ICU nursing narratives. The results showed that extraction-based text summarization methods work very well for professionally written reports such as economic reports, whereas nursing narratives are more like "work notes" than well-polished reports: each nurse uses her own writing and phrasing style, abbreviations (standard or arbitrary), slang, and common spelling mistakes. As a result, the extraction method did not provide good and accurate results here.
#### 2) Location-based and lead methods:
Location-based methods weigh words and sentences differently in different parts of the document. Sentences under headings and sentences near the beginning or end of a document or paragraph often receive more weight than those in the middle; sometimes such sentences are simply selected automatically. In a lead system, for example, sentences are added to the summary based solely on their position in the source articles. Sometimes the position of a sentence in the text is used to adjust the standard sentence score.
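A lead baseline of the kind described here can be sketched in a few lines; the cut-off of three sentences is an arbitrary illustrative choice:

```python
import re

def lead_summary(text, num_sentences=3):
    """Position-based baseline: simply take the first few sentences of the document."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:num_sentences])
```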
#### 3) Random method:
A summary is composed simply of randomly selected sentences from a document, based on a random value between 0 and 1 assigned to each sentence. Random methods are often used as baseline summaries.
#### 4) Query method:
Given a query (for example, a set of individual words, phrases, or short passages), a query-based method calculates the similarity between the query and the sentences in the documents. The sentences with the highest similarity values are selected for the summary.
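A minimal sketch of such a query-based selector, assuming TF-IDF vectors and cosine similarity as the similarity measure (scikit-learn is used here purely for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def query_based_summary(sentences, query, num_sentences=3):
    """Select the sentences most similar to the query under TF-IDF cosine similarity."""
    vectorizer = TfidfVectorizer()
    sentence_vectors = vectorizer.fit_transform(sentences)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, sentence_vectors).ravel()
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:num_sentences]
    # Preserve the original order of the selected sentences.
    return [sentences[i] for i in sorted(top)]
```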
#### 5) Machine Learning-Based Method:
A corpus-based approach: when both the original document collection and the corresponding model summaries are available (especially extraction summaries), empirical rules for extracting text snippets from documents can be learned using text classification methods. The summarization problem then becomes a two-class classification problem.
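A sketch of this two-class formulation, assuming a labeled corpus in which each sentence is marked as belonging to the summary (1) or not (0); the TF-IDF features and logistic regression classifier are illustrative choices, not a prescription:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_sentence_classifier(sentences, labels):
    """Learn to predict whether a sentence belongs in the summary (1) or not (0)."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(sentences, labels)
    return model

def classify_and_extract(model, sentences, num_sentences=3):
    """Rank sentences by the classifier's probability of the 'in summary' class."""
    probs = model.predict_proba(sentences)[:, 1]
    top = sorted(range(len(sentences)), key=lambda i: probs[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```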
---
## Abstractive Summarization:
In abstractive summarization, the system not only selects sentences from the original text but also creates new sentences based on an understanding of the content. This is a more complex method that requires advanced natural language processing in order to create new text that summarizes the content in a fluent and concise manner.
The algorithm uses advanced models that learn the connections between words and sentences in order to create a new summary. The model is based on deep neural networks that predict words according to context, allowing it to create original content that represents the main ideas.
While selection-based methods are better at quickly identifying snippets of text that carry "important" content, understanding-based methods are better at synthesizing the selected information.
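As an illustration of how such models are used in practice, the following sketch calls a pre-trained sequence-to-sequence model (BART, which appears in the bibliography) through the Hugging Face `transformers` library; the model name and length limits are illustrative choices:

```python
from transformers import pipeline

# Load a pre-trained abstractive summarization model (BART fine-tuned for summarization).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = "..."  # the source document, as plain text

# The model generates new sentences rather than copying sentences from the input.
result = summarizer(article, max_length=130, min_length=30, do_sample=False)
print(result[0]["summary_text"])
```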
### Methods and Tools:
#### 1) Understanding-Based Summarization:
A summary is sometimes the goal and sometimes a byproduct of a reading-and-comprehension process. Understanding-based summarization actually involves three things: (i) understanding a text; (ii) figuring out what is important; (iii) rewriting a number of the important messages to create a coherent text, i.e., producing text. The key to implementing understanding-based summarization computationally is the ability to:
(i) correctly interpret the syntax and semantics of words and sentences (i.e., link linguistic forms to meaning and map natural language expressions to a formal semantic representation);
(ii) derive the most important information through proper operations and reasoning over the formally described content;
(iii) map the new formal representation of the content back to natural language expressions.
None of these steps is a trivial task. In fact, each step involves a handful of very challenging topics. Research in computational linguistics deals with the first and third tasks and has proposed and developed impressive solutions. The theory of computing with words is a relevant approach for dealing with the second task.
Cognitive models of reading comprehension and summarization:
The theoretical basis for understanding-based text summarization systems is found in cognitive models of reading comprehension. Among the various theories, the micro- and macrostructure model proposed by Kintsch and van Dijk is perhaps the most influential.
#### 2) Computing with Words (CW):
Among the many different ideas about how meaning and knowledge may be represented in the human brain and in machines, this model is most easily grasped through its formal representation: its input is a list of propositions that represent the meaning of a piece of text, and its output is the semantic structure of the text at both the micro and macro levels, represented in the form of a coherence graph. Such a structure is believed to make it possible to reduce the full meaning of the text to its essence. Microstructure refers to the semantic structure of sentences: it reflects the individual propositions and the close, local relationships between them. Macrostructure presents the same facts as all the microstructures together, but describes them from a global perspective. A coherence graph contains a set of ordered, connected propositions.
The ordering is determined in particular by the referential relations between the propositions, in the form of argument overlap.
The micro- and macrostructures are related by a set of semantic mapping rules called "macro rules", such as the deletion rule, the generalization rule, and the construction rule, which are applied in "macro operations" that derive macrostructures from microstructures. Macro operations are governed by a "schema", a formal representation of the reader's purpose, which helps determine the relevance and importance of propositions, and thus which parts of the text constitute its essence. The controlling schema can be determined by the text genre or derived from a query description.
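A minimal sketch of the argument-overlap idea behind the coherence graph, assuming propositions are given as (predicate, arguments) tuples; this toy representation is only an illustration, not the original Kintsch-van Dijk formalism:

```python
def coherence_graph(propositions):
    """Connect propositions that share at least one argument (argument overlap).

    Each proposition is a (predicate, arguments) tuple, e.g. ("reads", ("reader", "text")).
    Returns an adjacency list mapping proposition indices to overlapping propositions.
    """
    graph = {i: set() for i in range(len(propositions))}
    for i, (_, args_i) in enumerate(propositions):
        for j, (_, args_j) in enumerate(propositions):
            if i < j and set(args_i) & set(args_j):
                graph[i].add(j)
                graph[j].add(i)
    return graph

props = [
    ("reads", ("reader", "text")),
    ("summarizes", ("reader", "text")),
    ("contains", ("text", "propositions")),
]
print(coherence_graph(props))  # propositions sharing an argument are linked
```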
##### 2.1) Computing with words using fuzzy logic:
The theory of fuzzy logic based on "computing with words" (CW) offers mathematical tools for the formal representation of, and reasoning about, perceptual information, which is conveyed in natural language text through imprecise terms, concepts, classes, and chains of reasoning. It thus provides relevant methods for understanding-based summarization systems.
Until recently, the application of fuzzy logic in natural language comprehension was only sparsely discussed and scattered across the soft computing and computational linguistics literature, although the theoretical basis was laid in several articles by Prof. Lotfi Zadeh decades ago. The term "computing with words" (CW) was coined in the mid-1990s and marks a relatively new emphasis in the development of fuzzy theory, intended to meet the need for better methods of representing and reasoning with perceptual information.
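To illustrate the kind of representation fuzzy logic provides for imprecise linguistic terms, here is a minimal sketch of a membership function for a term such as "frequent"; the breakpoints are arbitrary illustrative values:

```python
def membership_frequent(count, low=2.0, high=10.0):
    """Fuzzy membership of a word-frequency value in the linguistic term 'frequent'.

    Returns 0.0 below `low`, 1.0 above `high`, and a linear ramp in between.
    """
    if count <= low:
        return 0.0
    if count >= high:
        return 1.0
    return (count - low) / (high - low)

for c in (1, 5, 12):
    print(c, round(membership_frequent(c), 2))
```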
##### 2.2) Options for Implementation in Text Summarization:
The Kintsch-van Dijk model recognizes that a macrostructure can capture the most vital information conveyed by a sequence of propositions and thus represent the essence of a text passage. A summary can therefore be created by deriving the macrostructure from the microstructure: deleting details, deleting irrelevant propositions, generalizing multiple propositions, and constructing new propositions.
This process occurs recursively on sequences of micropropositions as long as the constraints on the rules are met. This is similar, in a way, to the process of constraint propagation in CW. Both micro- and macropropositions express both hard facts and soft perceptions that can be formalized as generalized constraints.
Constraint propagation derives conclusions from the facts presented by the collection of propositions. This process is driven by the propositions of a query or topic description. The process of deriving macropropositions from micropropositions can be realized as constraint propagation over semantic frames and networks, which already play a significant role in language processing. A natural language text contains rich expressions that can generally be regarded as propositions, which are themselves suited to logical representation and operations.
---
## Discussion and Conclusion:
### Pros and cons:
It is a well-known fact that there is a sharp contrast and mismatch between the formality and precision of classical logic and the flexibility and variability of natural languages. Natural language text brims with perceptual information that is imprecise and inherently vague. Despite their widespread application in NLP systems, classical logics such as FOPC (First-Order Predicate Calculus) have significant limitations in expressing uncertain or imprecise information and knowledge. In addition, extraction techniques appear to work reasonably well when the text is written by professional writers; in contrast, the results of applying simple extraction techniques to nursing narratives are quite disappointing.
Fuzzy logic, on the other hand, provides the means to make qualitative values more precise by introducing the possibility of representing and operating on different quantifiers, which helps maintain close connections to natural language.
Thus, in principle, the imprecision and ambiguity of terms, concepts, and meaning in natural language text can largely be addressed quantitatively using the computing-with-words approach.
### In conclusion:
As detailed in this article, there are two main types of summarization (each with its own models):
- Extractive Summarization:
• Extraction summary methods: These methods are based on extracting sentences from the original text and presenting them as a new sequence without changing the wording. The method relies on features such as term frequency, position in a sentence or document, relevant headings, and cue words.
• Location-based and lead methods: These methods weigh words and sentences according to their position in the document. For example, sentences in headings or at the beginning and end of paragraphs are considered more meaningful. In many cases, the method simply selects sentences according to their position, without any deeper evaluation.
• Random method: A simple method in which sentences are chosen at random based on a random value between 0 and 1. This method is sometimes used as a baseline summary that represents the original text in a completely random manner.
• Query method: The method compares the sentences in a document to a given query (e.g., a set of words). The sentences that most closely resemble the query are chosen for the summary. This is useful when there is a specific topic that the summary needs to emphasize.
• Machine-learning-based method: This approach is based on a corpus containing a collection of documents and their corresponding summaries. The method learns the relationship between the original content and the summary and uses text classification techniques to identify important sentences for inclusion in the summary.
- Abstractive Summarization:
• The Kintsch-van Dijk model provides a useful framework for understanding the mental processes that take place in reading comprehension. However, one notable limitation of the model is that the structure of discourse meaning is formulated solely on the basis of referential coherence relations, which are not necessarily equivalent to meaning relations.
• CW method: Summarizing text with CW presupposes an accurate natural language processing function that can transform free text into natural language propositions and then into a form of generalized constraints. This assumes that the constrained variables and constraining relations can be reliably defined, which poses a serious obstacle.
With regard to textual information processing, the theory of CW offers a very different approach from traditional approaches in computational linguistics. It appears to offer a better framework, and a more appropriate methodology, for representing and reasoning about meaning expressed in natural language than classical logic does. The proposed reasoning mechanism of constraint propagation can help derive new constraint statements from existing sets. A promising application would be generating natural language answers to natural language queries. Text summarization can benefit from such an approach when the required summary can be created as an aggregated answer to a collection of queries about a particular topic or object. CW is therefore one of many tools that can come in handy in text summarization. However, there are many challenging issues in implementing a CW framework: the large number of propositions contained in a text can easily make a computing-with-words framework impractical.
Text summarization requires a combination of shallow and deep text analysis methods. The various solutions for summarizing text range from sophisticated models that are computationally difficult to implement, to cruder and less precise but computationally efficient techniques. Developments in natural language processing and computational linguistics have yielded fairly good analysis techniques and an abundance of lexical and ontological resources, and the field is still on the rise and continues to develop.
---
## Bibliography:
1. Mladenić, Dunja; Grobelnik, Marko (March 2013)
"Automatic Text Analysis by Artificial Intelligence"
https://www.proquest.com/docview/1353021094?sourcetype=Scholarly%20Journals
2. Mike Lewis*, Yinhan Liu*, Naman Goyal*, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer (October 2019)
"BART: Denoising Sequence-to-Sequence Pre-training for Natural
Language Generation, Translation, and Comprehension"
https://paperswithcode.com/paper/bart-denoising-sequence-to-sequence-pre
3. Shuhua Liu (October 2009)
"EXPERIENCES WITH AND REFLECTIONS ON TEXT SUMMARIZATION TOOLS"
4. McKeown K., R. Passonneau, D. Elson, A. Nenkova and J. Hirschberg, (2005)
“Do Summaries Help? A Task-Based Evaluation of Multi-Document Summarization”
5. Hovy E., Chin-Yew Lin, L. Zhou and J. Fukumoto,(2005)
“Automated Summarization Evaluation with Basic Elements”
6. Salton, G., Wong, A., and Yang, C. S.,(1975)
“A vector space model for automatic indexing" .
7. Papineni K., Roukos S., Ward T., and W. Zhu,(2002)
“BLEU: a Method for Automatic Evaluation of Machine Translation”.
8. Lin, C-Y.,(2004)
“ROUGE: A package for automatic evaluation of summaries”.
9. Nenkova, A., & Passonneau R. (2004).
“Evaluating content selection in summarization: the pyramid method”.
10. Nenkova A., R. Passonneau, and K. McKeown,(2007)
“The Pyramid Method: Incorporating human content selection variation in summarization evaluation"
11. Zadeh L A., (1996)
“Fuzzy Logic = Computing with Words”.
12. Zadeh L A., (1999)
“From Computing with Numbers to Computing with Words -- From Manipulation of Measurements to Manipulation of Perceptions".
13. Kintsch W. and van Dijk T A.(1978)
"Toward a model of text comprehension and production, Psychol".
14. Paice C. D.,(1990)
“Constructing literature abstracts by computer: techniques and prospects”.
15. Salton G., J Allen, C Buckley, A Singhal, (1994)
“Automatic analysis, theme generation, and summarization of machine-readable texts".
16. Hovy E. and C. Lin,(1999)
“Automated Text Summarization in SUMMARIST”.
17. Mani I. and M. T. Maybury (1999)
“Automatic Summarizing : Factors and Directions”.
18. Luhn H. P.(1999)
“The Automatic Creation of Literature Abstracts”.
19. Climenson, W. D., Hardwick, N. H., Jacobson, S.N. (1961)
"Automatic Syntax Analysis in Machine Indexing and Abstracting".
20. Carbonell J. G. and J. Goldstein, (1998)
“The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries”.
21. Marcu D., (1999)
“The automatic construction of large- scale corpora for summarization research".
22. Marcu D. and L. Gerber (2001)
“An Inquiry into the Nature of Multidocument Abstracts, Extracts, and Their Evaluation".
23. McKeown K. and D. R. Radev, (1999)
“Generating Summaries of Multiple News Articles”.
24. Radev D., Allison T., Blair-Goldensohn S., Blitzer J., Celebi A., Drabek E., Lam W., Liu D., Qi H., Saggion H., Teufel S., Topper M. and A. Winkel, (2003)
The MEAD Multidocument Summarizer, MEAD Documentation.
25. Radev D., Jing H., Stys M. and D. Tam, (2004)
“Centroid-based Summarization of Multiple Documents".
26. Lacson R., R. Barzilay and W. Long (2006)
"Automatic analysis of Medical Dialogue in the Home Hemodialysis Domain: Structure Induction and Summarization".
27. Galley M., (2006)
“Automatic summarization of conversational multi-party speech”.
28. McKeown K., L. Shrestha and O. Rambow, (2007)
“Using question-answer pairs in extractive summarization of email conversations”.
29. Lindroos J., (2006)
“Automated Text Summarization using MEAD: Experience with the IMF Staff Reports”.
30. Liu, S. and J. Lindroos, (2006)
“Towards Fast Digestion of IMF Staff Reports with Automated Text Summarization Systems".
31. Liu, S., Hiisa M. and F. Sundell, (2007)
“Automatic summarization of intensive care nursing narratives”.