# Hyperreal Enterprises Whitepaper

Joseph Corneli

July 21, 2020 (draft)

# 1 Introduction

Our vision is to make meaningful and rewarding work in the knowledge economy possible for everyone. This would have easily recognisable economic, humanitarian, social, and ecological benefits [24]. While the Internet and Web 2.0 have not achieved this goal, they give hope that it might be possible. The advances of the 1990s and early 2000s have considerably broadened access and, in the process, generated a large pool of open data. Hyperreal Enterprises plans to use this data to bootstrap AI tools that support knowledge work. Our first product will be an AI tutor that helps people learn how to program and connects them with practical projects.

This whitepaper gives some background on this project, and summarises the technical state of play.

# 2 Background

PlanetMath was one of the early examples of “commons-based peer production” [3], launched by Aaron Krowne in 2000. For details, see Aaron Krowne’s Master’s thesis [13]. PlanetMath.org, Ltd., was incorporated as a nonprofit circa 2004. Further design considerations were developed in Joseph Corneli’s PhD thesis, which focused on building better support for “peer learning” on PlanetMath [5]. Since 2018, the contents of the PlanetMath encyclopedia have been archived on GitHub, and the site is online but not actively maintained.

In the mid-2000s, several PlanetMath contributors discussed a further iteration of the basic design that would go far beyond being a collaboratively written but largely static repository, and become a computationally meaningful and increasingly complete symbolic mathematics software system. Whereas the field of computer mathematics has focused primarily on representing mathematics in logical formalisms, we imagined a system that could interface directly with mathematical texts as they are written by mathematicians and students.
Success criteria would include passing preliminary exams, tutoring students, or writing original mathematics papers. Since this would effectively become a “simulation” of mathematics (much as mathematics itself might be thought of as a simulation of the real world), we called this largely notional project the Hyperreal Dictionary of Mathematics. Completing such an endeavour would require not only mathematical knowledge (and content), but also HCI, linguistics, AI, and organisational work.

On the back of this speculative design we imagined an organisation that would use computers to represent still broader forms of knowledge, ranging from learning materials in other disciplines, to logistics and exchange systems. We believed that such representations might be used to solve problems far beyond mathematics. We called this hypothetical organisation Hyperreal Enterprises.

In 2019, we officially incorporated a company in the UK under this name. The company will initially focus on “Research and experimental development on natural sciences and engineering.” More specifically, having observed that there is a large demand for technical talent and a corresponding under-production of the same, we decided to focus our spun-up enterprise on building software that can support technical training. Although “intelligent tutoring systems” is a long-established field, we have a new take on it, since we will build our tutoring system on top of a large collection of open data.

Although the application area is different (programming rather than mathematics), the technical facets of the project can be revisited under the high-level division mentioned above: HCI, linguistics, AI, mathematics, and organisation. The intervening years have seen considerable improvements in all of these areas, with both off-the-shelf and research-grade software at our disposal for commercial exploitation.

# 3 HCI

We plan to present the learning materials in the form of an interactive game.
We think that this, in itself, will be a considerable advance for new users when compared with learning on one’s own with only the debugger and written documentation, relying on competitors that mainly teach using videos and minimal interaction, or relying on Q&A, which is not friendly or useful for beginners.

## 3.1 Game engine

Game engines like Unity can support interaction and visualisation. Within a game engine, we plan several additional features to support users:

- Storytelling Module (this involves rewriting the core documentation of the language in a visual manner and integrating it into the engine)
- Interaction Module (e.g., notice when users get stuck and give hints)
- Visualisation Module (to illustrate programming challenges, including internals and an interactive “world”)
- Analytics Module (e.g., to track how many hints are used, and when, and to change the problem accordingly)

## 3.2 Content authoring

The engine can provide a nice look and feel, but ultimately it will only be as good as the content, which will need a well-thought-through learning design. The initial workflow for content authoring will rely on gathering relevant problems from around the internet or other sources, looking at how they depend on each other, and generating hints and links between content for each step where people could go wrong. All of this will then be saved as a playable file ready for the game engine.

# 4 Linguistics

Ultimately we want to be able to transform open source materials from sites like Stack Exchange and GitHub into instructional materials automatically. This is an ambitious proposal, but there are several recent precedents that could put it within reach over the next five years.

## 4.1 Named entity recognition and graph theory

We should stress that we do not need to fully parse and process content in order to get some benefit from tools integration.
Even simple named-entity recognition provides some useful affordances [10], towards exposing graph structures which can then be reasoned about computationally [14].

## 4.2 Argumentation theory

We applied ideas from the field of argumentation to model mathematical creativity. One branch of this work focused on constrained models of argumentative process with formally derivable features [18]. Another was more open-ended and sought to identify the actual dialogue moves used [7], and to model them computationally [6].

## 4.3 Vector-based and deep learning based language models

Given progress within computational linguistics, it is reasonable to expect that argumentation structures that we are able to extract by hand could be identified automatically [11] [26]. Recent years have seen considerable and well-publicised advances in deep learning based language models.

## 4.4 Linguistics of technical languages

In the current application we would be looking at discourse around computer programming rather than mathematical language. However, the kinds of language and reasoning used are broadly similar, so it is worth pointing to fairly recent advances in linguistics applied to mathematical knowledge [8], and to related work in language-aware mathematical problem solving [9].

# 5 AI

We are interested in understanding computer programs and supporting the process of learning how to program. Recent advances, largely based on the kind of AI that supports linguistics, have produced some impressive feats of code generation. However, current technologies are not yet good at code explanation.

## 5.1 Dataflow

Some of the ideas from recent advances in computational linguistics have also been applied to computer languages. Of particular interest are data flow models such as code2vec [1] and “Neural Code Comprehension” [2].

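
To give a concrete sense of what "data flow" means at the smallest scale, the following is a minimal sketch that extracts definition-to-use edges from a short Python program using only the standard `ast` module. It is not an implementation of code2vec or of Neural Code Comprehension, which learn vector representations from much richer program structure. The function name `def_use_edges` and the example program are hypothetical, and the sketch assumes straight-line, top-level code with plain assignments (no branches, loops, or augmented assignments).

```python
import ast
from typing import Dict, List, Tuple


def def_use_edges(source: str) -> List[Tuple[str, int, int]]:
    """Return (variable, defining_line, using_line) edges.

    Illustrative assumption: only top-level, straight-line statements are
    handled, so the most recent assignment to a name is the one that
    reaches any later use of that name.
    """
    last_def: Dict[str, int] = {}
    edges: List[Tuple[str, int, int]] = []
    for stmt in ast.parse(source).body:
        loads: List[str] = []
        stores: List[str] = []
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Load):
                    loads.append(node.id)
                elif isinstance(node.ctx, ast.Store):
                    stores.append(node.id)
        # Resolve uses before this statement's own definitions take effect.
        for name in loads:
            if name in last_def:
                edges.append((name, last_def[name], stmt.lineno))
        for name in stores:
            last_def[name] = stmt.lineno
    return edges


example = "x = 1\ny = x + 2\nx = y * x\nprint(x)\n"
for name, defined_at, used_at in def_use_edges(example):
    print(f"{name}: line {defined_at} -> line {used_at}")
```

Edges of this kind form a small graph over the lines of a program. A tutoring system could, under the same simplifying assumptions, use such a graph to point a learner at the line where a misbehaving value was last set.
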
## 5.2 Arxana and AtomSpace

Work begun in the mid-2000s on a framework for representing knowledge in computation-friendly formats resulted in the Arxana prototype (https://repo.or.cz/w/arxana.git). We explored using this together with our work on modelling the process of mathematical proofs [6]. Our work was inspired in part by the Conceptual Dependency diagrams of Schank (see [16]) and the Conceptual Graphs of Sowa [21]. AtomSpace (https://github.com/opencog/atomspace) is a broadly similar open source tool developed by the OpenCog Foundation [12]. These or similar graph technologies will be useful for representing domain knowledge and exposition in computational forms.

# 6 Mathematics

We are concerned with content that has an explicit computational interpretation (code), with accompanying expository text, and with the process of interaction with these materials. Some portion of these concerns can benefit from mathematical modelling, even when the contents are not themselves “mathematics” per se.

## 6.1 Category theory

Ologs give a simple graphical formalism for representing knowledge objects [22] (with similar affordances to Conceptual Graphs). They are computationally equivalent to a data query language which supports efficient data integration from multiple sources [4], and which has a recent industry-grade implementation (https://conexus.com/cql).

Monocl is a simple process modelling language based on category-theoretic diagrams [17, 20] that derive from data flow analysis. It was initially demonstrated as an abstraction layer over a collection of data science programs. In the present project, Monocl (or similar ideas) will be used to create a general purpose ontology of programs and programming.

## 6.2 Statistical analysis

We have demonstrated the application of techniques for semiparametric analysis [23] to understand properties of learning behaviour at scale [5, Chapter 6].
This and other techniques can be adapted within the Analytics Module mentioned above to provide real-time feedback for learners.

# 7 Organisation

We plan to build on open source content. Others have already created various interesting and impressive analyses and applications that can inform our work.

## 7.1 Stack Exchange

Many research papers have investigated Stack Exchange content. For our purposes, question difficulty estimation [15] and models of learning efficacy [25] are particularly interesting pieces of prior work. Another interesting line of work connected Stack Overflow to the IDE, automatically sourcing relevant questions [19]. A limited demonstration of code autocompletion based on Stack Overflow questions was published in 2016 (https://emilschutte.com/stackoverflow-autocomplete/), and in 2017 Microsoft released a bot that works with Stack Overflow contents and retrieves relevant questions or code based on a textual query (https://github.com/Microsoft/BotFramework-Samples/tree/master/StackOverflow-Bot).

## 7.2 Peeragogy

There is some overlap between the concerns of Hyperreal Enterprises and those of the Peeragogy project, an open source collaboration that has been gathering design patterns for peer learning (http://www.peeragogy.org/).

# 8 Conclusion

We are working to develop an immersive, engaging and adaptive gamified online learning experience for people upskilling in tech. While this may bring to mind existing players in the EdTech space (such as Udacity, Udemy, EdX, Hacker Rank, Codecademy, Grasshopper, Treehouse, Lambda School, c0d3.com, freecodecamp.com, and scrimba.com, among others), our solution is sufficiently different that the easiest way to explain it is that we are solving an essentially different problem. Although the organisations and tools mentioned support skill acquisition, they do not address the more challenging problem of modelling skilled performance, at scale and in detail.
Accordingly, it is more accurate to think of Hyperreal Enterprises as a competitor to Andela, a startup that prepares people to deliver high-level offshoring services. However, we will be able to offer a wider range of training and services, ultimately covering the whole tech landscape.

The advantages of our approach correspond to the disciplines we incorporate in our offering:

- HCI: When compared with the standard EdTech offerings we can deliver a much better price-to-performance ratio, through the use of engaging experiences built on top of meaningful models of code and programming practices.
- Linguistics: Wide and ultimately comprehensive coverage that, moreover, stays up to date as the field changes, without the need for expensive content development.
- AI: The potential to automate routine skill evaluation, supporting independent learning and/or an improved teacher-to-student ratio.
- Mathematics: A route to extending our models and learning materials to other domains of knowledge, across STEM and beyond.
- Organisation: The ability to certify skilled practice, including soft skills and collaboration ability, in a way that will be meaningful to employers.

Until recently, it would not have been possible to create a computational model of Stack Exchange and GitHub without countless years of hand-coding, such that the map would be out of date by the time it was finished. Thanks to the developments surveyed above, we are now in a position to bring to market an innovative interactive upskilling interface to the world of open source software.

To be sure, all of the areas touched on above will need further work, and realising our ambitions will require further research as well as a creative use of existing technologies. One purpose of this document is to help clarify the division of labour, and begin to outline some of the relevant interfaces between the technologies we depend on.

# References

[1] Uri Alon et al. “code2vec: Learning distributed representations of code”. In: Proceedings of the ACM on Programming Languages 3.POPL (2019), pp. 1–29.

[2] Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. “Neural code comprehension: A learnable representation of code semantics”. In: Advances in Neural Information Processing Systems. 2018, pp. 3585–3597.

[3] Yochai Benkler. “Coase’s Penguin, or, Linux and ‘The Nature of the Firm’”. In: The Yale Law Journal 112.3 (2002), pp. 369–446. issn: 00440094. url: http://www.jstor.org/stable/1562247.

[4] Kristopher S. Brown, David I. Spivak, and Ryan Wisnesky. “Categorical data integration for computational science”. In: Computational Materials Science 164 (2019), pp. 127–132. issn: 0927-0256. doi: 10.1016/j.commatsci.2019.04.002. url: http://www.sciencedirect.com/science/article/pii/S0927025619302046.

[5] Joseph Corneli. “Peer produced peer learning: A mathematics case study”. PhD thesis. The Open University, 2014. url: http://oro.open.ac.uk/40775/.

[6] Joseph Corneli et al. “Modelling the Way Mathematics is Actually Done”. In: Proceedings of the 5th ACM SIGPLAN International Workshop on Functional Art, Music, Modeling, and Design. FARM 2017. Oxford, UK: ACM, 2017, pp. 10–19. isbn: 978-1-4503-5180-5. doi: 10.1145/3122938.3122942. url: http://doi.acm.org/10.1145/3122938.3122942.

[7] Joseph Corneli et al. “Argumentation Theory for Mathematical Argument”. In: Argumentation 33.2 (June 2019), pp. 173–214. issn: 1572-8374. doi: 10.1007/s10503-018-9474-x. url: https://doi.org/10.1007/s10503-018-9474-x.

[8] Mohan Ganesalingam. “The language of mathematics”. In: The Language of Mathematics. Springer, 2013, pp. 17–38.

[9] Mohan Ganesalingam and William Timothy Gowers. “A fully automatic theorem prover with human-style output”. In: Journal of Automated Reasoning 58.2 (2017), pp. 253–291.

[10] Deyan Ginev and Joseph Corneli. “NNexus Reloaded”. In: Intelligent Computer Mathematics. Ed. by Stephen M. Watt et al. Vol. 8543. Lecture Notes in Computer Science. Springer International Publishing, 2014, pp. 423–426. isbn: 978-3-319-08433-6. doi: 10.1007/978-3-319-08434-3_31. url: http://dx.doi.org/10.1007/978-3-319-08434-3_31.

[11] Deyan Ginev and Bruce R. Miller. Scientific Statement Classification over arXiv.org. 2019. arXiv: 1908.10993 [cs.CL].

[12] Hendy Irawan and Ary Setijadi Prihatmanto. “Implementation of graph database for OpenCog artificial general intelligence framework using Neo4j”. In: 2015 4th International Conference on Interactive Digital Media (ICIDM). IEEE. 2015, pp. 1–6.

[13] Aaron Phillip Krowne. “An architecture for collaborative math and science digital libraries”. MA thesis. Virginia Tech, 2003.

[14] Pierre Raymond de Lacaze. BABAR: Wikipedia Knowledge Extraction.

[15] Jing Liu et al. “Question difficulty estimation in community question answering services”. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 2013, pp. 85–90.

[16] Steven L. Lytinen. “Conceptual dependency and its descendants”. In: Computers & Mathematics with Applications 23.2-5 (1992), pp. 51–73.

[17] Evan Patterson et al. “Teaching machines to understand data science code by semantic enrichment of dataflow graphs”. In: arXiv preprint arXiv:1807.05691 (2018).

[18] Alison Pease et al. “Lakatos-style collaborative mathematics through dialectical, structured and abstract argumentation”. In: Artificial Intelligence 246 (2017), pp. 181–219. issn: 00043702. doi: 10.1016/j.artint.2017.02.006. url: http://www.sciencedirect.com/science/article/pii/S0004370217300267.

[19] Luca Ponzanelli et al. “Mining StackOverflow to turn the IDE into a self-confident programming prompter”. In: Proceedings of the 11th Working Conference on Mining Software Repositories. 2014, pp. 102–111.

[20] Ioana Monica Baldini Soares et al. Generating semantic flow graphs representing computer programs. US Patent 10,628,282. Apr. 2020.

[21] John F. Sowa. Knowledge representation: logical, philosophical and computational foundations. Brooks/Cole Publishing Co., 1999.

[22] David I. Spivak and Robert E. Kent. “Ologs: A Categorical Framework for Knowledge Representation”. In: PLoS ONE 7.1 (Jan. 2012). Ed. by Chris Mavergames, e24274. issn: 19326203. doi: 10.1371/journal.pone.0024274. url: http://dx.doi.org/10.1371/journal.pone.0024274.

[23] Timothy Teravainen. “Semiparametric Estimation of a Gaptime-Associated Hazard Function”. PhD thesis. Columbia University, 2014.

[24] Roberto Unger et al. Imagination unleashed: Democratising the knowledge economy. url: https://www.nesta.org.uk/report/imagination-unleashed/.

[25] Utkarsh Upadhyay, Isabel Valera, and Manuel Gomez-Rodriguez. “Uncovering the Dynamics of Crowdlearning and the Value of Knowledge”. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. 2017, pp. 61–70.

[26] Amy X. Zhang, Bryan Culbertson, and Praveen Paritosh. “Characterizing online discussion using coarse discourse sequences”. In: Eleventh International AAAI Conference on Web and Social Media. 2017.
