# Generative AI
#### Basic requirements
| Item                  | Version                              |
|:----------------------|:-------------------------------------|
| Host operating system | Linux Ubuntu 18.04, 20.04, or 22.04  |
| Kernel                | 4.15.0-142-lowlatency                |
| GPU                   | GeForce RTX 3060                     |
# Taxonomy of generative AI algorithms

# Sentence VAE Implementation
A re-implementation of the Sentence VAE paper, Generating Sentences from a Continuous Space. The paper models sentences as latent-space representations and shows that these latent vectors can be decoded deterministically into well-formed sentences. This also allows interpolating between two latent vectors to produce coherent intermediate sentences.
The inspiration for this implementation is Tim Baumgärtner's repository; the data-processing code and some helper functions have been taken from there. The main purpose of this project was to enhance my understanding of VAEs and to implement this paper for learning purposes.
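The interpolation idea can be sketched in a few lines: take two latent vectors and decode points along the straight line between them. A minimal illustration (not code from the repository; the vector size is arbitrary):

```python
import numpy as np

def interpolate(z1, z2, steps=5):
    """Return `steps` points on the straight line from z1 to z2.

    Decoding each point with the trained decoder would produce a
    sequence of sentences that morphs from one meaning to the other.
    """
    return [z1 + (z2 - z1) * t for t in np.linspace(0.0, 1.0, steps)]

z_a = np.zeros(16)   # latent vector of sentence A
z_b = np.ones(16)    # latent vector of sentence B
points = interpolate(z_a, z_b, steps=5)
print(len(points), points[2][0])  # the middle point is halfway between the two
```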
## Training
```shell=
## https://github.com/shreyansh26/Sentence-VAE
$ git clone https://github.com/shreyansh26/Sentence-VAE
## Use the download_data.sh file to get the Penn Treebank data.
## Training is as simple as executing
$ python3 train.py
## To replicate the paper's results, I used the following config
$ python train.py --learning_rate 0.001 --num_layers 2 --word_dropout_rate 0.2 --annealing_till 2500
## Other configs can be seen from the train.py file.
```
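The `--annealing_till 2500` flag suggests the KL term's weight is annealed over the first training steps to avoid posterior collapse. A minimal sketch of such a schedule (the linear shape is an assumption for illustration, not taken from the repository):

```python
def kl_weight(step, annealing_till=2500):
    """Linearly anneal the KL weight from 0 to 1 over the first
    `annealing_till` steps, then hold it at 1 (assumed schedule)."""
    return min(1.0, step / annealing_till)

# The VAE objective would then be:
#   loss = reconstruction_loss + kl_weight(step) * kl_divergence
print(kl_weight(0), kl_weight(1250), kl_weight(5000))  # 0.0 0.5 1.0
```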
## Inference
```shell=
## To use the checkpoint provided with this repository, run
## python3 inference.py <path_to_checkpoint> -nl 2
$ python3 inference.py -c "bin/2023-09-29 19:30:55/epoch_9.pt" -nl 2
## The configs have to be added based on the training configs of the model checkpoint being used.
```

# finetuned-gpt2-convai (based on Transformer)
This repo is about fine-tuning the GPT-2 model provided by the Transformers library. I fine-tuned the model on the ConvAI dataset.
To learn about this code, you can follow this tutorial:
https://www.youtube.com/watch?v=elUCn_TFdQc&t=1189s&ab_channel=ProgrammingHut
```shell=
$ git clone https://github.com/Pawandeep-prog/finetuned-gpt2-convai.git
$ python3 ChatData.py
$ python3 main.py
```

# Text-Generation-with-LSTM
Character-level text generation, following Andrej Karpathy's The Unreasonable Effectiveness of Recurrent Neural Networks. Given a sequence of characters from the data ("Shakespear"), the model is trained to predict the next character in the sequence ("e"). Longer sequences of text can be generated by calling the model repeatedly. Developed using Keras. Inspired by the following notebook: https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb#scrollTo=BwpJ5IffzRG6
```shell=
$ git clone https://github.com/OMEGAMAX10/Text-Generation-with-LSTM.git
$ python3 "LSTM Text Generation.py"  ## quotes needed because the filename contains spaces
```
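The (input sequence, next character) setup described above can be sketched in plain Python; the text and window length here are illustrative, not taken from the repository:

```python
# Build (input sequence, next character) training pairs for a
# character-level language model.
text = "Shakespeare"
seq_len = 4

# Map each distinct character to an integer id.
chars = sorted(set(text))
char2idx = {c: i for i, c in enumerate(chars)}

pairs = []
for i in range(len(text) - seq_len):
    window = text[i:i + seq_len]    # e.g. "Shak"
    target = text[i + seq_len]      # e.g. "e"
    pairs.append(([char2idx[c] for c in window], char2idx[target]))

print(len(pairs))  # 7 training pairs for an 11-character string
```

The model is trained on these pairs; at generation time the predicted character is appended to the window and the model is called again, which is how longer sequences are produced.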
# TextGAN-PyTorch
https://github.com/williamSYSU/TextGAN-PyTorch
# KGPT (based on Transformer)
A custom GPT based on Zero To Hero that uses tiktoken, intended to support Transformer-model education and to reverse-engineer GPT models from scratch.
```shell=
$ git clone https://github.com/mytechnotalent/kgpt.git
$ pip3 install tiktoken
$ python3 kgpt.py
```
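The core building block such from-scratch GPTs implement is causal scaled dot-product self-attention. A minimal NumPy sketch of the standard formulation (not code from kgpt; shapes are illustrative):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """One attention head with a causal mask: position t may only
    attend to positions <= t."""
    T, _ = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (T, T) attention logits
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)   # strictly-future positions
    scores[mask] = -np.inf                             # forbid looking ahead
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax per row
    return weights @ v

rng = np.random.default_rng(0)
T, d_model, d_head = 5, 8, 4
x = rng.normal(size=(T, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```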

# FalconGPT (based on Transformer)
Simple GPT app that uses the falcon-7b-instruct model with a Flask front-end.
```shell=
$ git clone https://github.com/mytechnotalent/falcongpt.git
$ pip3 install langchain
$ python3 falcongpt.py
```


# Generative Semantic Communication: Diffusion Models Beyond Bit Recovery
### 📃 Abstract
Semantic communication is expected to be one of the cores of next-generation AI-based communications. One of the possibilities offered by semantic communication is the capability to regenerate, at the destination side, images or videos semantically equivalent to the transmitted ones, without necessarily recovering the transmitted sequence of bits. The current solutions still lack the ability to build complex scenes from the received partial information. Clearly, there is an unmet need to balance the effectiveness of generation methods and the complexity of the transmitted information, possibly taking into account the goal of communication. In this paper, we aim to bridge this gap by proposing a novel generative diffusion-guided framework for semantic communication that leverages the strong abilities of diffusion models in synthesizing multimedia content while preserving semantic features. We reduce bandwidth usage by sending highly-compressed semantic information only. Then, the diffusion model learns to synthesize semantic-consistent scenes through spatially-adaptive normalizations from such denoised semantic information. We prove, through an in-depth assessment of multiple scenarios, that our method outperforms existing solutions in generating high-quality images with preserved semantic information even in cases where the received content is significantly degraded. More specifically, our results show that objects, locations, and depths are still recognizable even in the presence of extremely noisy conditions of the communication channel.
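The "spatially-adaptive normalizations" the abstract refers to are SPADE-style layers: features are normalized per channel, then rescaled and shifted per pixel by parameters predicted from the semantic map. A minimal NumPy sketch of the idea (illustrative only, not the GESCO code; in the real model `gamma` and `beta` come from a small network over the semantic map):

```python
import numpy as np

def spatially_adaptive_norm(x, gamma, beta, eps=1e-5):
    """Normalize features x of shape (C, H, W) per channel, then
    modulate with per-pixel scale (gamma) and shift (beta)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / np.sqrt(var + eps)
    return gamma * x_norm + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))       # feature maps
gamma = rng.normal(size=(4, 8, 8))   # would be predicted from the semantic map
beta = rng.normal(size=(4, 8, 8))
out = spatially_adaptive_norm(x, gamma, beta)
print(out.shape)  # (4, 8, 8)
```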
### 🎯 The GESCO framework

### 📈 Main Results

### 📋 How to use GESCO
Train GESCO
Install the packages listed in requirements.txt and, separately, run `conda install pytorch==1.12.1 torchvision==0.13.1 -c pytorch`.
```shell=
## Run the following command:
python3 image_train.py --data_dir ./data --dataset_mode cityscapes --lr 1e-4 --batch_size 4 --attention_resolutions 32,16,8 --diffusion_steps 1000 --image_size 256 --learn_sigma True \
--noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True --use_checkpoint True --num_classes 35 \
--class_cond True --no_instance False
## For Cityscapes: --dataset_mode cityscapes, --image_size 256, --num_classes 35, --class_cond True, --no_instance False.
## For COCO: --dataset_mode coco, --image_size 256, --num_classes 183, --class_cond True, --no_instance False.
## For ADE20K: --dataset_mode ade20k, --image_size 256, --num_classes 151, --class_cond True, --no_instance True.
```
Sample from GESCO
Train your own model or download our pretrained weights here.
```shell=
## Run the following command:
python3 image_sample.py --data_dir "./data" --dataset_mode cityscapes --attention_resolutions 32,16,8 --diffusion_steps 1000 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True --num_classes 35 --class_cond True --no_instance False --batch_size 1 --num_samples 100 --model_path ./your_checkpoint_path.pt --results_path ./your_results_path --s 2 --one_hot_label True --snr your_snr_value --pool None --unet_model unet
## Use the same dataset-specific hyperparameters as in training, plus --s, which is 2 for Cityscapes and 2.5 for COCO and ADE20K.
## Our code is based on guided-diffusion and on SDM.
```
```shell=
## Copy the pretrained Cityscapes weights to the target machine:
scp Cityscapes_ema_0.9999_190000.pt eric@10.33.7.30:/home/eric/generative_ai/GESCO/weight
```
https://github.com/ispamm/GESCO
# Custom data (GPT2 based on Transformer)
### Custom_data_training
```python=
import re
import os
import docx
from PyPDF2 import PdfReader

# Functions to read different file types
def read_pdf(file_path):
    with open(file_path, "rb") as file:
        pdf_reader = PdfReader(file)
        text = ""
        for page_num in range(len(pdf_reader.pages)):
            text += pdf_reader.pages[page_num].extract_text()
    return text

def read_word(file_path):
    doc = docx.Document(file_path)
    text = ""
    for paragraph in doc.paragraphs:
        text += paragraph.text + "\n"
    return text

def read_txt(file_path):
    with open(file_path, "r") as file:
        text = file.read()
    return text

def read_documents_from_directory(directory):
    combined_text = ""
    for filename in os.listdir(directory):
        file_path = os.path.join(directory, filename)
        if filename.endswith(".pdf"):
            combined_text += read_pdf(file_path)
        elif filename.endswith(".docx"):
            combined_text += read_word(file_path)
        elif filename.endswith(".txt"):
            combined_text += read_txt(file_path)
    return combined_text

# Read documents from the directory
train_directory = '/home/eric/generative_ai/custom_data/'
text_data = read_documents_from_directory(train_directory)
text_data = re.sub(r'\n+', '\n', text_data).strip()  # Remove excess newline characters

# Save the combined training data as a text file
with open("/home/eric/generative_ai/custom_data/dci.txt", "w") as f:
    f.write(text_data)

from transformers import TextDataset, DataCollatorForLanguageModeling
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from transformers import Trainer, TrainingArguments

def load_dataset(file_path, tokenizer, block_size=128):
    dataset = TextDataset(
        tokenizer=tokenizer,
        file_path=file_path,
        block_size=block_size,
    )
    return dataset

def load_data_collator(tokenizer, mlm=False):
    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer,
        mlm=mlm,
    )
    return data_collator

def train(train_file_path, model_name,
          output_dir,
          overwrite_output_dir,
          per_device_train_batch_size,
          num_train_epochs,
          save_steps):
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    train_dataset = load_dataset(train_file_path, tokenizer)
    data_collator = load_data_collator(tokenizer)
    tokenizer.save_pretrained(output_dir)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.save_pretrained(output_dir)
    training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=overwrite_output_dir,
        per_device_train_batch_size=per_device_train_batch_size,
        num_train_epochs=num_train_epochs,
        save_steps=save_steps,
    )
    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
    )
    trainer.train()
    trainer.save_model()

train_file_path = "/home/eric/generative_ai/custom_data/dci.txt"
model_name = 'gpt2'
output_dir = '/home/eric/generative_ai/custom_data/'
overwrite_output_dir = False
per_device_train_batch_size = 8
num_train_epochs = 50.0
save_steps = 50000

# Train
train(
    train_file_path=train_file_path,
    model_name=model_name,
    output_dir=output_dir,
    overwrite_output_dir=overwrite_output_dir,
    per_device_train_batch_size=per_device_train_batch_size,
    num_train_epochs=num_train_epochs,
    save_steps=save_steps
)
```
```shell=
$ TF_CPP_MIN_LOG_LEVEL=2 python3 custom_data_inference.py
```

### Custom_data_inference
```python=
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def load_model(model_path):
    model = GPT2LMHeadModel.from_pretrained(model_path)
    return model

def load_tokenizer(tokenizer_path):
    tokenizer = GPT2Tokenizer.from_pretrained(tokenizer_path)
    return tokenizer

def generate_text(model_path, sequence, max_length):
    model = load_model(model_path)
    tokenizer = load_tokenizer(model_path)
    ids = tokenizer.encode(f'{sequence}', return_tensors='pt')
    final_outputs = model.generate(
        ids,
        do_sample=True,
        max_length=max_length,
        pad_token_id=model.config.eos_token_id,
        top_k=50,
        top_p=0.95,
    )
    print(tokenizer.decode(final_outputs[0], skip_special_tokens=True))

model1_path = "/home/eric/generative_ai/custom_data/"
sequence1 = "[Q] this message will send downlink information."
max_len = 80
generate_text(model1_path, sequence1, max_len)

# model2_path = "/home/eric/generative_ai/custom_data"
# sequence2 = "[Q] There's the New Data Indicator (NDI)?"
# max_len = 50
# generate_text(model2_path, sequence2, max_len)
```
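The `top_k` and `top_p` arguments control how the next token is sampled. A NumPy sketch of nucleus (top-p) filtering, which keeps only the smallest set of tokens whose cumulative probability reaches `top_p` (illustrative, not the Transformers implementation):

```python
import numpy as np

def top_p_filter(probs, top_p=0.95):
    """Zero out tokens outside the nucleus and renormalize."""
    order = np.argsort(probs)[::-1]                   # most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1   # tokens to keep
    filtered = np.zeros_like(probs)
    keep = order[:cutoff]
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(top_p_filter(probs, top_p=0.8))
```

Sampling from the filtered distribution (rather than always taking the most likely token) is what makes the generated text varied from run to run.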

https://www.youtube.com/watch?v=nsdCRVuprDY&ab_channel=DigitalSreeni
# A recurrent (LSTM) neural network in C
https://github.com/Ricardicus/recurrent-neural-net
```shell=
## Do you have a mac or a Linux machine? In that case it is super easy to download and run.
## Mac or Linux (UNIX types)
## Open a terminal window and type:
$ git clone https://github.com/Ricardicus/recurrent-neural-net/
$ cd recurrent-neural-net
## This will compile the program. You need the compiler 'gcc', which is also available for download just like 'git'.
$ make
## If the build complains, remove some flags in the 'makefile'; I use 'msse3' on my Mac, but it does not work on my Raspberry Pi, for example.
## Then run the program:
$ ./net data/dci.txt -lr 0.03
## where data/dci.txt is the file with the training data; the program will start training on it, and you can watch the progress over time.
```
# C++ compile torch
INSTALLING C++ DISTRIBUTIONS OF PYTORCH
We provide binary distributions of all headers, libraries and CMake configuration files required to depend on PyTorch. We call this distribution LibTorch, and you can download ZIP archives containing the latest LibTorch distribution on our website. Below is a small example of writing a minimal application that depends on LibTorch and uses the torch::Tensor class which comes with the PyTorch C++ API.
```shell=
$ wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
$ unzip libtorch-shared-with-deps-latest.zip
example-app/
  build/
  CMakeLists.txt
  example-app.cpp
$ vim CMakeLists.txt
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
project(example-app)
find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 17)
# The following code block is suggested to be used on Windows.
# According to https://github.com/pytorch/pytorch/issues/25457,
# the DLLs need to be copied to avoid memory errors.
if (MSVC)
  file(GLOB TORCH_DLLS "${TORCH_INSTALL_PREFIX}/lib/*.dll")
  add_custom_command(TARGET example-app
                     POST_BUILD
                     COMMAND ${CMAKE_COMMAND} -E copy_if_different
                     ${TORCH_DLLS}
                     $<TARGET_FILE_DIR:example-app>)
endif (MSVC)
$ vim example-app.cpp
#include <torch/torch.h>
#include <iostream>
int main() {
  torch::Tensor tensor = torch::rand({2, 3});
  std::cout << tensor << std::endl;
}
## Run from inside example-app/build (note the trailing "..", which points at the source directory):
$ cmake -DCMAKE_PREFIX_PATH=`python -c 'import torch;print(torch.utils.cmake_prefix_path)'` ..
$ cmake --build . --config Release
$ ./example-app
0.4105 0.4872 0.2121
0.2660 0.0995 0.3632
[ CPUFloatType{2,3} ]
```
# MultiPDF Chat App
The MultiPDF Chat App is a Python application that allows you to chat with multiple PDF documents. You can ask questions about the PDFs using natural language, and the application will provide relevant responses based on the content of the documents. This app utilizes a language model to generate accurate answers to your queries. Please note that the app will only respond to questions related to the loaded PDFs.
```shell=
$ git clone https://github.com/alejandro-ao/ask-multiple-pdfs.git
$ streamlit run app.py
$ export 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512'
```
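Under the hood, such apps split the extracted PDF text into overlapping chunks, embed them, and retrieve the most relevant chunks for each question. A minimal sketch of the chunking step (the sizes are illustrative, not the app's actual settings):

```python
def split_into_chunks(text, chunk_size=1000, overlap=200):
    """Split text into overlapping chunks so that a sentence cut at a
    boundary still appears whole in the neighboring chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 2500
chunks = split_into_chunks(doc, chunk_size=1000, overlap=200)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk is then embedded into a vector store, and only the chunks most similar to the question are passed to the language model, which is why the app answers only questions related to the loaded PDFs.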

# AI-Text-Generation
```shell=
$ git clone https://github.com/marwanmusa/AI-Text-Generation
$ pip3 install -r requirements.txt
$ python3 GPTNeo.py # for GPTNeo
$ python3 GPT2.py # for GPT2
```
# Stable Diffusion Implementation - Text to Image
```shell=
$ git clone https://github.com/marwanmusa/Text-to-Image-with-StableDiffusion
$ pip3 install -r requirements.txt
$ python3 app.py
## Generated image samples:
## Duck Skiing
```
# Screen
```shell=
$ screen -S eric          ## start a session named "eric"
## Press Ctrl-a, then d, to detach from the session
$ screen -list            ## list running sessions
$ screen -x 4192709.eric  ## reattach to the session
```