Komal Upadhyay
1. ### Automated Source Code Generation and Auto-Completion Using Deep Learning: Comparing and Discussing Current Language Model-Related Approaches

   ##### Juan Cruz-Benito, Sanjay Vishwakarma, Francisco Martin-Fernandez and Ismael Faro

   The paper discusses the use of deep learning models for automated source code generation and auto-completion. It compares neural network architectures such as AWD-LSTMs, QRNNs, and Transformers, paired with different tokenization models, on a Python dataset to evaluate their effectiveness at generating and auto-completing source code, and it investigates how pre-trained models, transfer learning, and the choice of tokenization technique affect performance in programming-language contexts. Key findings include:

   1. **Deep neural networks and tokenization models**: The study used architectures such as AWD-LSTMs, QRNNs, and Transformers, and examined word-unigram, character, and Byte-Pair Encoding (BPE) tokenization models to measure their effect on model performance (a small sketch of the three schemes follows this summary).
   2. **Experimentation on a Python dataset**: The Python dataset from the "GitHub CodeSearchNet Challenge" was used for the code generation and auto-completion tasks, exploring how different combinations of network architecture and tokenization model affect success on each task.
   3. **Results and discussion**:
      - The character tokenization model, especially when combined with AWD-LSTMs and QRNNs, showed promising results on source code generation.
      - Transformer-based models, particularly GPT-2, did not achieve the highest accuracy but produced more coherent and contextually appropriate code.
      - Pre-trained models generally performed better, benefiting from transfer learning, except when word tokenization was used; this suggests a gap between the fixed vocabulary of human languages and the dynamic vocabulary of programming languages.
      - Auto-completion with Transformer models such as BERT and RoBERTa showed high accuracy but struggled to produce semantically correct completions.
   4. **Conclusions**: The choice of tokenization model and the use of pre-trained models significantly affect the performance of neural networks on code generation and auto-completion. The study also highlights the need for larger datasets and for evaluating the quality of generated source code beyond traditional accuracy metrics.

   This research offers insight into the evolving field of applying deep learning to software engineering, specifically to automating code generation and auto-completion, which could significantly enhance developer productivity and software quality.
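   Since tokenization choice drives the results above, here is a minimal, self-contained sketch of how the three schemes split the same line of Python. It is illustrative only: the `bpe_tokenize` helper is a toy that learns merges from the single input string, whereas real BPE learns its merge table from a training corpus.

   ```python
   # Toy comparison of word-unigram, character, and BPE tokenization.
   from collections import Counter

   snippet = "def add(a, b): return a + b"

   word_tokens = snippet.split()   # word unigram: split on whitespace
   char_tokens = list(snippet)     # character: one token per character

   def bpe_tokenize(text, num_merges=8):
       """Byte-Pair Encoding sketch: start from characters and repeatedly
       merge the most frequent adjacent pair into a single token."""
       tokens = list(text)
       for _ in range(num_merges):
           pairs = Counter(zip(tokens, tokens[1:]))
           if not pairs:
               break
           (a, b), _ = pairs.most_common(1)[0]
           merged, i = [], 0
           while i < len(tokens):
               if i < len(tokens) - 1 and tokens[i] == a and tokens[i + 1] == b:
                   merged.append(a + b)
                   i += 2
               else:
                   merged.append(tokens[i])
                   i += 1
           tokens = merged
       return tokens

   print("word:", word_tokens)
   print("char:", char_tokens)
   print("bpe: ", bpe_tokenize(snippet))
   ```

   Character tokenization yields a tiny vocabulary but long sequences; BPE sits in between, which is why it is a common default for code models.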
2. ### Source code auto-completion using various deep learning models under limited computing resources

   ##### Madhab Sharma · Tapas Kumar Mishra · Arun Kumar

   This paper presents a comprehensive study on improving source code auto-completion with deep learning models, targeting the Python and C# programming languages. It emphasizes the challenge of performing such tasks in resource-constrained environments and proposes methodologies that prioritize efficiency in model training and evaluation.

   #### Key findings and methodologies
   - **Deep learning models for auto-completion**: The study compares several architectures, including CodeGPT (Microsoft), RoBERTa (Hugging Face), and GPT-2, under different dataset strategies: treating the whole code file as a single line, using each line as an individual input, and tokenizing code snippets before model ingestion.
   - **Dataset and pre-processing**: Two main datasets are considered, one for Python and one for C#. The Python dataset, processed by a fine-tuned CodeGPT model, yields an overall accuracy of 71%. The C# dataset, trained on GPT-2, exhibits a perplexity (PPL) of 2.14 on the training set and 4.082 on the evaluation set (a sketch of how such PPL figures are computed follows this summary).
   - **Model training and evaluation**: The paper details training these models under limited computing resources, specifically on Google Colab, and the strategies used to manage computational overhead, such as dataset chunking and model parameter adjustments.
   - **Results**: A comparative analysis shows the strengths and weaknesses of each approach in real-world programming contexts and the trade-off between accuracy and computational efficiency, suggesting that fine-tuning pre-trained models (e.g., CodeGPT) yields substantial benefits for auto-completion.

   #### Conclusions and future directions
   The paper concludes that deep learning models, particularly those fine-tuned on a specific programming language, hold significant promise for source code auto-completion, while acknowledging the limits imposed by constrained computing resources and the need for efficient training and evaluation strategies. For future work, the authors suggest exploring abstract syntax trees and other structural and semantic models of source code to further improve prediction accuracy, which could help generalize auto-completion across multiple programming languages and lead to more robust, versatile tools for developers.
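   For context on the PPL figures above: perplexity is the exponential of a language model's average per-token cross-entropy. A minimal sketch using the Hugging Face `transformers` GPT-2 model (the snippet being scored is arbitrary; this is not the paper's evaluation script):

   ```python
   # Minimal perplexity computation for a causal language model.
   import torch
   from transformers import GPT2LMHeadModel, GPT2TokenizerFast

   tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
   model = GPT2LMHeadModel.from_pretrained("gpt2")
   model.eval()

   code = "for i in range(10):\n    print(i)"
   input_ids = tokenizer(code, return_tensors="pt").input_ids

   with torch.no_grad():
       # With labels=input_ids the model returns the mean next-token
       # cross-entropy over the sequence.
       loss = model(input_ids, labels=input_ids).loss

   print(f"perplexity = {torch.exp(loss).item():.2f}")  # lower is better
   ```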
3. ### LongCoder: A Long-Range Pre-trained Language Model for Code Completion

   ##### Daya Guo, Canwen Xu, Nan Duan, Jian Yin and Julian McAuley

   The paper introduces LongCoder, a pre-trained language model designed for code completion tasks that involve long code inputs, built on a sparse Transformer architecture. Key features include:

   1. **Sparse attention mechanism**: LongCoder uses sparse attention to handle long code sequences efficiently, reducing computational complexity from quadratic to linear and making it feasible to model longer code inputs effectively.
   2. **Sliding window mechanism**: Self-attention is restricted to a sliding window, letting the model focus on local context while maintaining an understanding of the entire code file (the attention pattern is sketched after this summary).
   3. **Globally accessible tokens**: LongCoder introduces two types of globally accessible tokens. Bridge tokens aggregate local information and facilitate global interactions within the code sequence; memory tokens highlight and remember important statements (such as package imports or function definitions) that may be needed later, so the model retains essential information spanning large codebases.
   4. **Experimental results**: LongCoder was tested on a specially constructed dataset of longer code contexts as well as the public CodeXGLUE benchmark, outperforming existing models on code completion without significantly increasing computational demands at inference time.
   5. **Contribution and impact**: The contributions include a new dataset (LCC) for long code modeling and sparse attention mechanisms informed by how programmers write code, opening new possibilities for code completion tools that can handle complex, project-level code structures.

   LongCoder represents a significant advance in AI-powered code completion, offering an efficient way to handle long-range code dependencies. This is particularly valuable for developers working with large codebases, improving productivity and code quality through more accurate suggestions and completions.
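   The following toy mask builder illustrates the sliding-window-plus-global-tokens pattern described above; it is not LongCoder's implementation, and the window size and global positions are arbitrary choices for the demo.

   ```python
   # Toy sparse attention mask: local sliding window plus global tokens.
   import numpy as np

   def sparse_attention_mask(seq_len, window, global_positions):
       """mask[i, j] = True means position i may attend to position j."""
       mask = np.zeros((seq_len, seq_len), dtype=bool)
       for i in range(seq_len):
           lo, hi = max(0, i - window), min(seq_len, i + window + 1)
           mask[i, lo:hi] = True      # local window around each position
       for g in global_positions:
           mask[g, :] = True          # a global token attends everywhere
           mask[:, g] = True          # and every position attends to it
       return mask

   mask = sparse_attention_mask(seq_len=12, window=2, global_positions=[0])
   print(mask.astype(int))
   # Each row holds roughly 2*window + 1 local entries plus the global
   # columns, so the number of allowed attention pairs grows linearly
   # with seq_len rather than quadratically.
   ```

   A causal completion model would additionally zero out entries with j > i; the point here is only how sparsity brings the attention cost down to linear.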
4. ### From Copilot to Pilot: Towards AI Supported Software Development

   ##### Rohith Pudari (University of Toronto) and Neil A. Ernst (University of Victoria)

   This paper evaluates the effectiveness and limitations of AI-supported code completion tools, with a specific focus on GitHub's Copilot. The authors survey the current landscape of AI in software development, present an exploratory study of how tools like Copilot handle Pythonic idioms and JavaScript code smells, and introduce a taxonomy of software abstraction hierarchies for assessing such tools across different levels of software development complexity.

   #### Introduction and background
   The increasing pressure on developers to produce code quickly has driven interest in AI-supported programming tools such as Copilot, which leverage large language models (LLMs) like OpenAI's Codex to provide code suggestions and completions inside integrated development environments (IDEs).

   #### Study design and results
   The study examines whether Copilot suggests code that adheres to Pythonic idioms and avoids JavaScript code smells. The findings indicate that while Copilot generates syntactically correct code, it often fails to follow language-specific idioms or avoid code smells without explicit guidance (an example of the kind of idiom gap involved appears after this summary).

   #### Taxonomy of software abstraction
   The authors propose a taxonomy that classifies AI-supported code completion tools by their ability to handle different levels of software abstraction, from basic syntax checking to the design of software architecture, highlighting the gap between current AI capabilities and the requirements for fully autonomous software development support.

   #### Implications for practitioners and researchers
   For practitioners, the study suggests that pre-training LLMs on high-quality, idiomatic, smell-free code could make these tools more effective, and it highlights their potential to save time by automating the more mundane aspects of coding. For researchers, it underscores the need for AI that can understand and apply higher-level programming concepts, design patterns, and architectural principles, moving beyond token-level suggestions to contextually aware, semantically rich recommendations.

   #### Conclusion
   AI-supported programming tools such as Copilot show promise in automating parts of code production and aiding developers, but significant challenges remain in extending them to more abstract and complex tasks such as design and architecture. The paper offers paths forward both for practitioners integrating these tools into their workflow and for researchers advancing the field.
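   To make "Pythonic idiom" concrete, here is the kind of contrast such a study examines; this particular pair is an illustration, not one of the paper's actual test cases.

   ```python
   # Both loops are syntactically correct; only the second is idiomatic.

   names = ["ada", "grace", "alan"]

   # Non-idiomatic: manual index bookkeeping.
   for i in range(len(names)):
       print(i, names[i])

   # Pythonic: enumerate yields (index, item) pairs directly.
   for i, name in enumerate(names):
       print(i, name)

   # Likewise, a list comprehension replaces an explicit accumulation loop.
   upper = [name.upper() for name in names]
   ```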
5. ### A Neural Network Based Intelligent Support Model for Program Code Completion

   ##### Md. Mostafizer Rahman, Yutaka Watanobe and Keita Nakamura

   #### Abstract
   - **Problem statement**: Manual compilation and debugging are time-intensive and error-prone; the paper addresses the need for an intelligent evaluation methodology that automates error detection and prediction without manual compilation.
   - **Objective**: The paper presents a neural network-based intelligent support model for code completion tasks, aimed especially at software engineering and programming education.
   - **Model design**: It uses a deep neural network, specifically a Long Short-Term Memory network combined with an attention mechanism (LSTM-AM), to detect errors in source code and predict the correct words for code completion.
   - **Accuracy**: The model achieves approximately 62% accuracy in error detection and correct-word prediction, and around 96% accuracy in source code classification.

   #### Proposed approach
   - **Model architecture**: The authors propose an LSTM enhanced with an attention mechanism, expected to outperform standard LSTMs at detecting and predicting errors in source code sequences (a sketch of this model family follows this summary).
   - **Performance advantages**: By focusing on long-term dependencies in the code, the LSTM-AM model retains longer sequences of source code input and generates more accurate output predictions.

   #### Experimental results
   - **Model training**: The model was trained on correct source code from the Aizu Online Judge (AOJ) system for problems such as greatest common divisor, insertion sort, and prime numbers.
   - **Hidden units and performance**: Among the configurations tested, the 200-unit LSTM-AM model had the lowest cross-entropy, indicating the best performance.
   - **Error detection and prediction**: Tested on erroneous source code sequences, the LSTM-AM model highlighted errors and suggested probable corrections effectively.

   #### Conclusion and future work
   - **Summary**: The LSTM-AM model shows a marked improvement in understanding and predicting code, which could significantly benefit programmers in debugging and educational contexts.
   - **Future directions**: The authors plan to explore bidirectional LSTM networks to better capture the semantic meaning of source code.
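   As a rough sketch of the model family described above (not the authors' exact architecture; the embedding size and the additive-attention pooling are assumptions), an LSTM over token embeddings whose hidden states are attention-pooled before a vocabulary-sized output layer:

   ```python
   # Sketch of an LSTM-with-attention next-token predictor (illustrative).
   import torch
   import torch.nn as nn

   class LSTMAttention(nn.Module):
       def __init__(self, vocab_size, embed_dim=64, hidden_dim=200):
           super().__init__()
           self.embed = nn.Embedding(vocab_size, embed_dim)
           self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
           self.attn = nn.Linear(hidden_dim, 1)  # score per time step
           self.out = nn.Linear(hidden_dim, vocab_size)

       def forward(self, token_ids):
           # token_ids: (batch, seq_len) integer token indices
           h, _ = self.lstm(self.embed(token_ids))       # (batch, seq, hidden)
           weights = torch.softmax(self.attn(h), dim=1)  # attention over time
           context = (weights * h).sum(dim=1)            # weighted sum of states
           return self.out(context)                      # next-token logits

   model = LSTMAttention(vocab_size=100)
   tokens = torch.randint(0, 100, (2, 10))  # two dummy token sequences
   print(model(tokens).shape)               # torch.Size([2, 100])
   ```

   The 200 hidden units mirror the configuration the paper reports as best; everything else here is a placeholder.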
