# AI CREDIT GROUP
# 1) What is the difference between an RNN and a CNN?
Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) serve different purposes: RNNs are good with series of data (one thing happens after another) and are used a lot in problems that can be framed as “what will happen next given…”, while CNNs are especially good at visual cognitive problems like image classification and object detection.
Both CNNs and RNNs are neural network architectures that perform well on different types of data. **RNNs are able to process temporal information** that comes in sequences, such as a sentence. CNNs are designed for **spatial information**, where nearby values are correlated. More details about the differences between CNNs and RNNs are described below.
**RNNs (recurrent neural networks)** are built around a single recurrent cell: it is fed data, produces an output, feeds that output back into itself, and repeats this at every step.
- Additionally, RNNs are networks designed to interpret temporal or sequential information. RNNs use other data points in a sequence to make better predictions: they take in input and reuse the activations of earlier (or, in bidirectional variants, later) nodes in the sequence to influence the output.
- Breakthroughs like the LSTM (long short-term memory) make the network good at remembering things that have happened in the past and at finding patterns across time, so that its next predictions make sense.
**CNNs (convolutional neural networks)** essentially have three parts: convolution layers, pooling layers, and fully connected layers. A CNN usually takes a 2D matrix (sometimes more dimensions) and outputs a result.
- Convolution starts at the top left, takes a small window with a certain width and height, and performs an operation on it. The operation is usually an element-wise multiplication with a small weight matrix (the kernel) followed by a sum, where the kernel's values are learned via gradient descent to get the best final results. The window then moves according to a stride parameter and repeats, all the way across the image, producing a new image.
- Pooling is similar in that it breaks the image down using small windows; however, the operation it runs on each small window is usually an average, max, or min that collapses the window into a single pixel.
- After a set number of convolutions and pooling steps, the final output is passed through a fully connected layer, a conventional feed-forward neural network, to produce a result.
- You can think of the pooling and convolution layers as a form of image pre-processing similar to the hand-designed filters of traditional computer vision, except that the parameters, such as the kernel in each convolution layer, are learned by gradient descent.
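The convolution and pooling mechanics above can be sketched in a few lines of NumPy (a minimal illustration, not a full CNN; the example image and kernel here are invented):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide a small window over the image; at each position, multiply
    element-wise by the kernel and sum the result into one output pixel."""
    kh, kw = kernel.shape
    h = (image.shape[0] - kh) // stride + 1
    w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = image[i * stride:i * stride + kh,
                           j * stride:j * stride + kw]
            out[i, j] = np.sum(window * kernel)  # element-wise multiply, then sum
    return out

def max_pool(image, size=2):
    """Collapse each size-by-size window down to its maximum value."""
    h, w = image.shape[0] // size, image.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = image[i * size:(i + 1) * size,
                              j * size:(j + 1) * size].max()
    return out

# A 6x6 "image" whose right half is bright, and a 3x3 vertical-edge kernel.
img = np.zeros((6, 6)); img[:, 3:] = 1.0
kernel = np.array([[1., 0., -1.]] * 3)
feature_map = conv2d(img, kernel)   # 4x4 response map; strong where the edge is
pooled = max_pool(feature_map)      # 2x2 summary after pooling
```

In a real CNN the kernel values are not hand-picked as here but learned by gradient descent, and many kernels run in parallel per layer.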
Sources:
- [What is the Difference Between CNN and RNN?](https://lionbridge.ai/articles/difference-between-cnn-and-rnn/)
# 2) When we say an RNN is linear, does that include the LSTM? I think the LSTM is not linear, so how can the RNN still be linear?
I do not think the LSTM is linear, although an RNN can be described as an [NN with a linear (chain-like) architecture](https://i.stack.imgur.com/eTs0t.gif).
- The reason is that standard RNNs (recurrent neural networks) suffer from vanishing and exploding gradient problems (while learning their parameters), which is a problem of that linear chain structure.
- LSTMs (long short-term memory networks) deal with these problems by introducing gates, such as the input and forget gates; the forget gate is the non-linear mechanism that gives the LSTM better control over the gradient flow and enables better preservation of “long-range dependencies”. The long-range dependency problem of a plain RNN is addressed by the LSTM's richer repeating module.
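A minimal single-step sketch of the gates described above, in NumPy (the weight shapes and random initialisation are illustrative assumptions, following the standard formulation in the Colah post linked below):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps the concatenated [h_prev, x] to the
    four gate pre-activations; b is the corresponding bias vector."""
    z = W @ np.concatenate([h_prev, x]) + b
    H = h_prev.size
    f = sigmoid(z[0:H])            # forget gate: what to erase from the cell state
    i = sigmoid(z[H:2 * H])        # input gate: what new information to write
    c_tilde = np.tanh(z[2 * H:3 * H])  # candidate cell values
    o = sigmoid(z[3 * H:4 * H])    # output gate: what to expose as hidden state
    c = f * c_prev + i * c_tilde   # gated, mostly-additive cell update
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
H, X = 4, 3                            # hidden size, input size (arbitrary)
W = rng.standard_normal((4 * H, H + X)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):                     # run a short random input sequence
    h, c = lstm_step(rng.standard_normal(X), h, c, W, b)
```

The mostly-additive cell update `c = f * c_prev + i * c_tilde` is what lets gradients flow across many time steps without vanishing as quickly as in a plain RNN.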
Resources link:
- [Understanding LSTM Networks](http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
# 3) How to design the feedback of a CNN?
Feedback systems for CNNs are still an open research area, with many beautiful explorations in recent years. Based on my research, I can list some papers and resources in this area:
- [Incorporating Feedback in Convolutional Neural Networks by Institute of Neural Information Processing, Ulm University published at ccneuro-2019](https://ccneuro.org/2019/proceedings/0000395.pdf).
Short summary: by comparing feedforward and feedback networks using a multi-digit classification task, quantifying performance as well as robustness against image noise, they show that the advantage of feedback networks that **add the feedback to the feedforward signal** is largely due to the increased receptive field size of their neurons. In addition, they show that networks that **use modulating or subtractive feedback** (inspired by theories of feedback processing in cortex) outperform additive architectures and have increased robustness against noise.
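My reading of the three feedback variants compared in that paper can be illustrated with a toy computation (the exact formulations are in the paper; the vectors here are invented placeholders):

```python
import numpy as np

# Hypothetical feedforward activation of one layer and a feedback signal
# arriving from a higher layer.
ff = np.array([0.2, 0.8, 0.5])   # feedforward drive
fb = np.array([0.1, 0.4, 0.9])   # feedback from above

additive = ff + fb               # feedback added to the feedforward signal
modulating = ff * (1.0 + fb)     # feedback multiplicatively gates the response
subtractive = ff - fb            # feedback suppresses the feedforward signal
```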
- [Feedback-prop: Convolutional Neural Network Inference under Partial Evidence - University of Virginia published at CVPR-2018](https://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Feedback-Prop_Convolutional_Neural_CVPR_2018_paper.pdf).
Short summary: They propose an inference procedure for deep convolutional neural networks (CNNs) when partial evidence is available. **Their method consists of a general feedback-based propagation approach (feedback-prop)** that boosts the prediction accuracy for an arbitrary set of unknown target labels **when the values for a non-overlapping arbitrary set of target labels are known**.
They present two variants of feedback-prop based on layer-wise and residual iterative updates, experiment with several multi-task models, and show that feedback-prop is effective in all of them. In conclusion, they found that by optimizing the intermediate representations for a given input sample during inference, with respect to a subset of the target variables, predictions for all target variables improve in accuracy. They proposed two variants of a feedback propagation inference approach to leverage this dynamic property of CNNs and showed their effectiveness for making predictions under partial evidence for general CNN models trained in a multi-label or multi-task setting.
Resources link:
- [Incorporating Feedback in Convolutional Neural](https://ccneuro.org/2019/proceedings/0000395.pdf)
- [Feedback-prop: Convolutional Neural Network ](https://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Feedback-Prop_Convolutional_Neural_CVPR_2018_paper.pdf)
- [Feedback Convolutional Neural Network for Visual Localization and Segmentation](https://www.computer.org/csdl/journal/tp/2019/07/08370896/13rRUwdIOTs)
- [Feedback based Neural Networks](https://web.stanford.edu/class/cs331b/2016/projects/wu_shen.pdf)
- [Feedback Convolutional Neural Network for Visual Localization and Segmentation](https://ieeexplore.ieee.org/document/8370896)
# 4) Do you have good resources/examples of AI in audio?
Artificial Intelligence (AI) is having a transformative effect on a huge range of industries, and the world of media and entertainment is no exception. Creators and machines are continuing to become more intertwined, with creative workflows taking on new shapes as AI-assistance gathers momentum.
> When it comes to audio workflows, there are three main areas where **AI is starting to have an impact: assisted mastering, assisted mixing and assisted composition**. All three are at slightly different points on the adoption scale.
> - For example, AI is already well established in the mastering process – despite this arguably being the most specialised area of music production. The goal of mastering is to make the listening experience consistent across all formats. The process varies across different formats (Spotify, CDs, movies, etc.) as each has different loudness constraints, making mastering extremely technical and potentially costly.
>
> There are very few skilled mastering engineers around, but AI is proving to be a viable and democratising alternative for many musicians. **By analysing data and learning from previous tracks, AI-powered tools enable less experienced engineers to quickly and easily achieve professional results**, albeit without the finesse of a human expert.
>
>Next, we come to assisted mixing which, although currently slightly behind mastering in terms of adoption, is developing fast. With so much content being created for OTT services such as Netflix and Amazon Prime, the volume of audio work happening in post is increasing dramatically. Facilities are therefore looking for ways to work faster and more cost-efficiently.
>
>Finally, there’s audio composition, another area of music production that is quickly realising the value of AI. More and more tools are using deep learning algorithms to identify patterns in huge amounts of source material and then utilising the insights generated to compose basic tunes and melodies.
> [source](https://www.psneurope.com/studio/ai-audio-workflows)
The development of AI in audio relates to several broad topics, such as:
- Human voice recognition and text-to-speech with deep learning architectures
  - [mozilla-text2speech](https://github.com/mozilla/TTS)
- Deep learning for music and audio classification
  - [Deep Learning for Music (DL4M) collection](https://github.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning)
- Generating music with GANs
  - [WaveGAN](https://github.com/mazzzystar/WaveGAN-pytorch)
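A common bridge between raw audio and the deep-learning models listed above is converting the waveform into a spectrogram, an image-like input that CNNs and RNNs can consume. A minimal NumPy sketch (the frame length and hop size are arbitrary choices):

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a windowed FFT over overlapping frames,
    the image-like representation most audio deep-learning models start from."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))   # one spectrum per frame
    return np.array(frames).T   # shape: (freq_bins, time_frames)

# A one-second 440 Hz tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
peak_bin = spec.mean(axis=1).argmax()   # strongest frequency bin (~440 Hz)
```

In practice libraries like librosa compute this (plus mel scaling) directly, but the principle is the same: time on one axis, frequency on the other, so image models apply.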
# 5) I don't know how to analyse discrete data such as credit records, money transactions, etc. There is a gap when I try to link such data to popular audio/image-processing models.
Since general artificial intelligence has not yet been fully realised, most models were created around specific problems and data formats. For example, CNNs are useful for image/video processing, RNNs are applied to text/audio processing, and many other machine learning models are designed for discrete data in the big field of data analysis.
In reality, combinations of these exist: records can be treated as sequential data (time series) and RNNs applied to predict what will happen next, or discrete data can be translated into images so that image-processing models apply:
- In the paper [`A Convolutional Transformation Network for Malware Classification`](https://arxiv.org/pdf/1909.07227.pdf), based on the [Microsoft Malware Classification Challenge](https://www.kaggle.com/c/malware-classification/leaderboard), the team introduces a method of casting the malware classification problem into an image classification task that surpasses all baselines and achieves 99.14% accuracy on the test set.
- For online shopping platforms, features extracted from product images can be combined with user information and product characteristics to build [an effective visual recommender system](http://cs231n.stanford.edu/reports/2017/pdfs/105.pdf) that surpasses traditional methods.
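The byte-plot idea behind the malware example, turning a raw binary into a grayscale image, can be sketched like this (this is the generic transformation only, not the paper's exact pipeline; the width and the fake input bytes are arbitrary choices):

```python
import numpy as np

def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape a raw byte stream into a 2D grayscale image: one byte
    becomes one pixel (0-255). Trailing bytes that do not fill a full
    row are dropped. The result can be fed to an image classifier."""
    arr = np.frombuffer(data, dtype=np.uint8)
    height = len(arr) // width
    return arr[:height * width].reshape(height, width)

# 8192 fake "binary" bytes standing in for a malware sample.
img = bytes_to_image(bytes(range(256)) * 32, width=64)
```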
The most widespread application of analysing discrete data like credit records and money transactions is credit scoring, which banks use to evaluate the potential risk of lending money and to identify the customers who are likely to bring in the most revenue. This problem has decades of research and development behind it, with many methods in use, from mathematical and statistical models to machine learning.
Let's say we have seven years of bank transaction records for a group of people. To analyse the records and rank their creditworthiness, we can first extract traditional variables like positive income shocks, balance returns, zero transactions, and relative cash expenditure, combine them with client characteristics (demographics) and loan-behaviour information, and then feed everything into a scoring/classification model. The model can be logistic regression, a support vector machine, a neural network, a decision tree, random forest, gradient boosting, Naive Bayes, etc.
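A toy end-to-end sketch of such a scoring model, using synthetic stand-ins for the variables above and plain logistic regression fitted by gradient descent (all data and weights here are invented; a real system would use real features and a proper library):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for four hand-crafted variables (income shocks,
# balance returns, zero-transaction rate, relative cash expenditure),
# with a default label loosely driven by an invented "true" weighting.
n = 500
X = rng.standard_normal((n, 4))
true_w = np.array([-1.0, -0.5, 1.5, 0.8])
p = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.random(n) < p).astype(float)        # 1 = client defaulted

# Logistic regression fitted by plain gradient descent on the log-loss.
w = np.zeros(4)
for _ in range(2000):
    pred = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (pred - y) / n          # gradient step

scores = 1.0 / (1.0 + np.exp(-(X @ w)))      # probability of default per client
accuracy = np.mean((scores > 0.5) == (y == 1))
```

The `scores` vector is the credit score in miniature: ranking clients by it orders them by estimated default risk, and the learned weights stay interpretable for the explanatory analysis mentioned below.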
The features can be selected with experts or with automatic tools like [Deep Feature Synthesis](https://innovation.alteryx.com/deep-feature-synthesis/) or an [autoencoder](http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/), which take all of the user's information and create new variables to improve the accuracy of the models. When variables and models become black boxes, explanatory tools like feature importance and information value help experts understand the meaning behind the features and develop new theory. To detect behavioural information like personal trends and preferences, an RNN over transaction data is a powerful tool.
[Source](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3616342)
# 6) Any example of the "Go" in chess? I mean, how does it work technically?
I am not sure what exactly you mean by "Go" in chess. If you mean `AlphaGo`, the AI that surpassed humanity at the game of Go in 2016, then we are talking about reinforcement learning, a third paradigm of machine learning alongside supervised and unsupervised learning.
Reinforcement learning (RL) enables an agent to learn in an interactive environment by trial and error, using feedback from its own actions and experiences. Unlike supervised learning, which tries to infer a fixed input-to-output mapping from given data, reinforcement learning uses rewards and punishments as signals for positive and negative behaviour. And unlike unsupervised learning, whose goal is to find similarities and differences between data points, reinforcement learning tries to find an action policy that maximises the agent's total cumulative reward.
To apply reinforcement learning, the following components must be well defined (like a game):
- Environment: the world or the field the agent acts in,
- State: the status/situation, described by the agent's parameters,
- Reward: feedback from the environment; the profit/loss when the agent does or changes something,
- Policy: the rule the agent follows to choose an action in a specific state,
- Value: what the agent expects to gain from acting in a specific state.
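The components above can be made concrete with tabular Q-learning on a toy one-dimensional world (a minimal sketch; AlphaGo itself combines deep networks with tree search, far beyond this illustration):

```python
import numpy as np

# Environment: a corridor of states 0..4, goal at state 4.
# State: the agent's position. Reward: +1 only at the goal.
# Policy: epsilon-greedy over Q. Value: the Q-table itself.
N_STATES, ACTIONS = 5, (-1, +1)       # actions: step left or step right
Q = np.zeros((N_STATES, len(ACTIONS)))
rng = np.random.default_rng(0)
alpha, gamma, eps = 0.5, 0.9, 0.2     # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Explore with probability eps, otherwise act greedily on Q.
        a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)   # move, stay in bounds
        r = 1.0 if s2 == N_STATES - 1 else 0.0           # reward at the goal only
        # Q-learning update: nudge Q toward reward + discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

greedy_policy = Q.argmax(axis=1)      # learned behaviour: always step right
```

After training, the greedy policy reads directly off the Q-table: in every non-goal state the agent has learned that stepping right (toward the reward) has the higher value.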
Since large amounts of data are a must, RL is most applicable in domains where simulated data is readily available, like gameplay and robotics:
- Building AI for playing computer games. AlphaGo was the first computer program to defeat a world champion at the ancient Chinese game of Go; AlphaGo Zero later surpassed it by learning from self-play alone. Other examples include Atari games, backgammon, etc.
- RL is used to enable robots to build efficient adaptive control systems for themselves, learning from their own experience and behaviour,
- Text summarisation engines, dialogue agents (text, speech) that learn from user interactions and improve over time, learning optimal treatment policies in healthcare, and agents for online stock trading.
[source](https://www.kdnuggets.com/2018/03/5-things-reinforcement-learning.html)