# Haub Kurz Summer Project

(see also the notes on [Transformers](https://hackmd.io/@alexhkurz/SJs_B0tlF) and [Semantic Parsing](https://hackmd.io/@alexhkurz/r1J-JZHBd))

What happens if we train an autoencoder on Attempto and then apply it to ordinary English?

Other work I have not yet looked at:

- [TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation](https://arxiv.org/pdf/1810.02720.pdf). 2018.
- https://arxiv.org/pdf/1806.07832.pdf
- http://attempto.ifi.uzh.ch/site/docs/syntax_report.html#Introduction
- [Grammar-based Neural Text-to-SQL Generation](https://arxiv.org/pdf/1905.13326.pdf). 2019.
- [Mapping Language to Code in Programmatic Context](https://arxiv.org/pdf/1808.09588.pdf). 2018.

A chain of papers 2018-2021 issuing from CNL 2018:

- [Rewriting simplified text into a controlled natural language](http://mural.maynoothuniversity.ie/13420/1/BD_cs_rewriting.pdf). In Proceedings [CNL 2018](https://ebooks.iospress.nl/volume/controlled-natural-language-proceedings-of-the-sixth-international-workshop-cnl-2018-maynooth-co-kildare-ireland-august-27-28-2018).
- [Knowledge Extraction from Simplified Natural Language Text](https://aran.library.nuigalway.ie/bitstream/handle/10379/15492/2019hazemsafwatphd.pdf). PhD Thesis, 2019.
- [Probing the Natural Language Inference Task with Automated Reasoning Tools](https://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS20/paper/view/18431/17545). 2020.
- [Fact-checking, False Narratives, and Argumentation Schemes](http://users.ics.forth.gr/~fafalios/KnOD2021/Knod2021_paper_9.pdf). 2021.

A survey on translating natural languages to programming languages from ACM Computing Surveys 2018:

- Allamanis et al. [A Survey of Machine Learning for Big Code and Naturalness](https://dl.acm.org/doi/pdf/10.1145/3212695) ...
  - "Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code."
  - page 11 has a brief discussion of translating to AST
  - "probabilistic context free grammars (PCFG) are not a good model of statistical dependencies between code tokens"

**Question:** Is it better to translate directly from English to ACE abstract syntax, or is it better to first translate English to ACE and then parse ACE?

- [The Natural Language Programming (NLPRO) Project: Turning Text into Executable Code](http://ceur-ws.org/Vol-2075/NLP4RE_paper10.pdf)

## Software

### AceWiki

    docker pull tkuhn/acewiki
    docker run -p 9077:9077 tkuhn/acewiki

### Stanford Part-Of-Speech Tagger

https://hub.docker.com/r/haub/stanford-postagger-socket

    docker pull haub/stanford-postagger-socket:version1.0
    docker run -p 2768:2768 --name postagger haub/stanford-postagger-socket

### Attempto Parser

https://hub.docker.com/r/haub/attempto-parser-socket

    docker run -p 2767:2767 --name ape_test haub/attempto-parser-socket

after that

    docker start ape_test
    python ./client_ape.py -text "John waits." -cdrsxml -csyntax
    docker stop ape_test

    docker pull frnkenstien/corenlp
    docker run -p 9000:9000 --name coreNLP --rm -i -t frnkenstien/corenlp

### Grammatical Framework

https://www.grammaticalframework.org/doc/gf-shell-reference.html

https://github.com/Attempto/ACE-in-GF/pull/14

https://github.com/inariksit/ACE-in-GF/tree/update-RGL/grammars/ace

## General References

- [constituency parsing](https://web.stanford.edu/~jurafsky/slp3/13.pdf), [dependency parsing](https://web.stanford.edu/~jurafsky/slp3/14.pdf) from [Jurafsky-Martin](https://web.stanford.edu/~jurafsky/slp3/)
- [coreference resolution](https://en.wikipedia.org/wiki/Coreference#Coreference_resolution), [anaphora resolution](https://en.wikipedia.org/wiki/Anaphora_(linguistics)#Anaphora_resolution_%E2%80%93_centering_theory) from Wikipedia

## Specific References

- Norvig: [On Chomsky and the Two Cultures of Statistical Learning](http://norvig.com/chomsky.html)
- Hochreiter and Schmidhuber: [Long Short-Term Memory](https://direct.mit.edu/neco/article/9/8/1735/6109/Long-Short-Term-Memory). 1997.
- Pasupat and Liang: [Compositional Semantic Parsing on Semi-Structured Tables](https://cs.stanford.edu/~pliang/papers/compositional-acl2015.pdf). 2015.
- notes on [Transformers](https://hackmd.io/@alexhkurz/SJs_B0tlF) (which seem to have replaced LSTMs)

## Various Notes

https://github.com/Attempto/ACE-in-GF

### Aug 12

https://github.com/danshaub/tranX

http://attempto.ifi.uzh.ch/site/docs/syntax_report.html#Introduction

- The [Zephyr ASDL Paper](https://www.cs.princeton.edu/~appel/papers/asdl97.pdf)
- The [Python `pickle` module](https://realpython.com/python-pickle-module/)
- What is [Zephyr ASDL](https://www.oilshell.org/blog/2016/12/11.html)?
- parse tree vs AST https://stackoverflow.com/questions/5026517/whats-the-difference-between-parse-tree-and-ast

#### Earlier Notes

https://developer.amazon.com/en-US/docs/alexa/ask-overviews/what-is-the-alexa-skills-kit.html

https://link.springer.com/content/pdf/10.1007/s42979-020-00424-4.pdf

https://dl.acm.org/doi/abs/10.1145/3411764.3445131

https://www.baeldung.com/cs/feature-vs-label
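
A small illustration of the parse tree vs AST distinction noted under Aug 12, using Python's standard `ast` module (CPython defines its AST in Zephyr ASDL). This is only a sketch: the helper `summarize` and the nested-tuple format are made up for this note and are not taken from tranX or ACE. The point is that the parentheses in the source are needed for parsing but leave no node in the AST; only the nesting survives.

```python
import ast


def summarize(node):
    """Render an arithmetic AST as a nested tuple (operator, left, right).

    Hypothetical helper for illustration; it handles only binary
    operations on constants.
    """
    if isinstance(node, ast.BinOp):
        return (type(node.op).__name__, summarize(node.left), summarize(node.right))
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"unsupported node: {type(node).__name__}")


source = "(1 + 2) * 3"
# mode="eval" parses a single expression; .body is its root node.
tree = ast.parse(source, mode="eval").body
result = summarize(tree)
print(result)
```

A parse tree for the same string would also keep the two parenthesis tokens and the intermediate grammar rules, which is one reason to generate the AST directly when translating from natural language.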