# HW4 Conceptual: Language Models
Due March 20th at 6PM
Answer the following questions, showing your work where necessary. Please explain your answers and work.
We encourage the use of $\LaTeX$ to typeset your answers, as it makes it easier for you and us, though you are not required to do so.
Do **NOT** include your name anywhere within this submission. Points will be deducted if you do so.
## Conceptual Questions
1. What are the dimensions of an embedding matrix? What do they represent?
2. Given the following sentences (any relation to any real words is purely coincidental), plot reasonable embeddings in 2D for “Spongerobert”, “ixcented”, “avention”, “firtorented”, and “lainsignom”. (Hint: A simple graph with some clusters is fine.)
   - Spongerobert firtorented at the avention.
   - Then Spongerobert ixcented at his lainsignom.
   - Vadimward firtorented at the avention.
3. What are some benefits of using RNNs over trigrams (or n-grams, generally speaking)?
4. What are LSTM cells? How are they different from vanilla RNNs, and why are they able to ‘remember’ information for longer timeframes than vanilla RNNs? (Hint: Your answer should, at minimum, address the concepts of gates and gradients.)
5. (Optional) Have feedback for this assignment? Found something confusing? We’d love to hear from you!
## Ethical Implications
ChatGPT is a deep learning model that generates human-like responses. To learn more about ChatGPT, check out [this link](https://www.assemblyai.com/blog/how-chatgpt-actually-works/). Deep learning models like ChatGPT have become increasingly accessible to the public. For this assignment, you will give ChatGPT prompts of your own to identify its flaws, then analyze the validity and consequences of the model's outputs.
1. While ChatGPT is a powerful tool for promoting creativity and summarizing texts, there are some things that the model has not been trained on. Play around with ChatGPT and find 1-2 inputs that result in misleading or incorrect outputs. Write out these prompts and responses. **Hint: think about what ChatGPT has not been trained on.**
2. Explain how these responses (or assumptions about ChatGPT's capabilities) can be detrimental. What steps has OpenAI taken to make its users aware of the model's pitfalls? (5-6 sentences)
Read [this article](https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html) by Kevin Roose of the New York Times, which describes his interaction with Microsoft's new AI-powered Bing chatbot, built on a model that is more complex than ChatGPT.
3. What is your reaction? Do you agree with the author's statement that "A.I. has crossed a threshold"? Why or why not? (3-4 sentences)
Finally, read [this short opinion piece](https://www.nytimes.com/2023/03/08/opinion/noam-chomsky-chatgpt-ai.html) by renowned linguist Noam Chomsky.
4. Do you agree with his take on whether or not large language models can be truly intelligent? Does it even matter? (3-4 sentences)
## 2470-only Questions
1. The Gated Recurrent Unit (GRU) is another recurrent network cell that can, like the LSTM, retain information over long sequences. How is it able to do this? Describe its architecture, and compare it with that of the LSTM.
2. While we have studied Convolutional Neural Networks (CNNs) in the context of 2D images, CNNs can also be used for 1D sequence modeling tasks, such as language modeling. Look up some papers that attempt this and compare CNN language models to RNN language models (cite the papers you read). What appears to be the general consensus on the pros and cons of the two approaches?
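
For orientation on question 2, here is a minimal sketch of what a 1D convolution over a token sequence computes. It only illustrates the operation and does not address the pros and cons the question asks about. The function name `causal_conv1d`, the toy inputs, and the left zero-padding scheme (so that each output sees only the current and earlier tokens) are illustrative assumptions, not taken from any particular paper or library.

```python
def causal_conv1d(x, kernel):
    """Slide a width-k filter over a sequence of token embeddings.

    x:      list of seq_len rows, each a list of d_in floats (one row per token).
    kernel: nested list indexed as kernel[j][i][o], with shape (k, d_in, d_out).
    Returns a list of seq_len rows, each with d_out floats.
    Output t depends only on tokens t-k+1 .. t (hypothetical causal setup).
    """
    k = len(kernel)
    d_in = len(kernel[0])
    d_out = len(kernel[0][0])
    # Left-pad with k-1 zero rows so no output position sees future tokens.
    padded = [[0.0] * d_in for _ in range(k - 1)] + [list(row) for row in x]
    out = []
    for t in range(len(x)):
        window = padded[t:t + k]  # the k most recent tokens, oldest first
        row = [
            sum(window[j][i] * kernel[j][i][o]
                for j in range(k) for i in range(d_in))
            for o in range(d_out)
        ]
        out.append(row)
    return out


# Toy input: 4 tokens with 2-dimensional embeddings (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
# Width-2 filter with one output channel: adds dimension 0 of the previous
# token to dimension 1 of the current token.
kernel = [[[1.0], [0.0]], [[0.0], [1.0]]]
y = causal_conv1d(x, kernel)
print(y)  # [[0.0], [2.0], [1.0], [1.0]]
```

Note that every output position is computed independently of the others, whereas an RNN must process tokens one after another; this may be a useful starting point when reading the papers.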