## Response to Reviewer JzYM [Score 4, Confidence 4]

We would like to express our gratitude to the reviewer for dedicating their valuable time and effort to evaluating our manuscript, which has allowed us to strengthen it. We deeply appreciate the insightful feedback and have thoroughly responded to the reviewer's inquiries below.

> **Weakness 1:** The proposed adversarial attacks seem to also work on non-VLM-based models. Therefore, the proposed problem does not seem to be a unique problem for VLM for robotics.

**Response to Weakness 1:** We aim to highlight the ease of manipulating robot actions and emphasize safety in robot systems using foundation models, regardless of whether they are VLM- or non-VLM-based. Let us take this opportunity to clarify our unique contribution with respect to robotics.

***Unique aspect in the domain of VLMs for robotics.*** We agree with the reviewer that adversarial attacks are a general concept that holds even in setups without VLMs, but our major focus is on foundation model (LLM or VLM) applications in robotics. There has been a surge in the development of such methods in recent times, including RT-1 (https://arxiv.org/abs/2212.06817) and RT-2 (https://arxiv.org/abs/2307.15818), and more related works can be found in the following collections (https://github.com/GT-RIPL/Awesome-LLM-Robotics, https://github.com/jrin771/Everything-LLMs-And-Robotics). Our work sheds light on overlooked vulnerabilities of such integrations, which are critical in robotics, where errors have significant real-world consequences. We demonstrate these vulnerabilities through empirical evidence, showcasing attacks on both a non-VLM model (KnowNo) and VLM-based models (VIMA, Instruct2Act).

**Our Contribution as a Position Paper:** Our position in this paper is to alert the community to the safety concerns of directly applying LLMs/VLMs in robotics and to encourage the development of safeguard mechanisms for LLMs/VLMs in robotics applications.

> **Weakness 2:** The proposed adversarial attacks seem to make a significant change to the input, and potentially out of training distribution. Therefore, it is not a surprise that a model would fail in such cases.

**Response to Weakness 2:** We appreciate your concern but must clarify that our method does not produce out-of-distribution samples. The modifications we apply remain perceptually consistent for humans. Across all tested methods, we generate evaluation tasks and the corresponding prompts by querying color and object combinations that exist within the training dataset, so ***no out-of-domain objects or properties are introduced by our attacks.*** The changes involve only prompt rephrasing and structure alteration, which do not deviate from the original message regarding object manipulation and color identification.

***In the prompt attack***, we adapt the structure and keywords of prompts without changing their meaning. We use GPT-based foundation models both as the interpretation tool for the original prompts and as the adversarial attack tool that generates the rephrased prompts. This ensures that both the original and rephrased prompts fall within the LLM's (GPT's) training domain, as demonstrated by our experiments with KnowNo and Instruct2Act.

***In the perception attack,*** the alterations we implement consist of adding noise, deviations, or dummy objects in the perceived images without changing the distribution of objects and colors used. Our approach primarily affects object segmentation and subsequent reasoning in the VIMA and Instruct2Act frameworks, without altering the inherent properties or objects; both remain unchanged and lie within the training dataset.

***Our Focus and Contribution:*** As a position paper, our contribution is to ***highlight the safety concerns of using LLMs/VLMs*** in the robotics domain by showing how easy it is to break such systems. To our knowledge, our work is the first to empirically demonstrate such vulnerabilities, making a unique contribution to discussions on safety concerns in the field.

> **Weakness 3:** The analysis of the results could be more in-depth. Many analyses just illustrate and summarize the results, but don't explain why. For example, why specific attacks are effective, and what could be the potential solutions?

**Response to Weakness 3:** We acknowledge the importance of addressing the feasibility and effectiveness of our attacks, as highlighted in Section 5.4 of our paper, and welcome the opportunity for further discussion. We summarize the effectiveness of our adversarial attacks as follows:

1. **Prompt Attacks:** Foundation model prompts are highly template-dependent, relying on keyword detection for decision-making. Our rephrasing attacks, including simple and extension attacks, complicate prompt interpretation by disrupting the prompt structure. Similarly, adjective and noun attacks replace key terms with synonyms, leading to misinterpretation of object properties and complicating scene understanding.
2. **Image Quality Attacks:** Image quality attacks are not successful in general. The resilience of foundation models' image segmentation capabilities against noise, blurring, and other distortions allows for accurate object detection. Consequently, models remain robust in generating descriptive captions or embeddings for decision-making, provided object locations remain identifiable under moderate disturbances.
3. **Transformation Attacks:** Transformation attacks are the most effective of all attacks. With minimal changes like slight translations, rotations, and clippings, they drastically reduce performance, highlighting the need for safeguard measures against such easily implemented attacks (a minimal sketch of such an attack follows this list). These attacks exploit vulnerabilities in the foundation models used in VIMA and Instruct2Act, which rely on integrated object segmentation models for object detection and processing. The slight alterations misguide the segmentation models, leading to inaccurate object locations and difficulties in matching and handling, underscoring the critical need for enhanced security against these manipulations.
4. **Object Addition Attacks:** Object addition attacks intend to disrupt the reasoning processes of foundation models, and their effectiveness varies with the segmentation approach. For systems that integrate object segmentation within the embodied foundation model, segmentation is performed on RGB images, so attacks on the RGB images are more effective. For systems that depend on simulator-provided, ground-truth segmentations, a direct attack on the object segmentation results is more effective. This suggests a robot might mistakenly interact with nonexistent objects, further breaking down the reasoning process of the foundation model. More research on attacks targeting the reasoning process is needed in our future investigations.
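To illustrate how little code such a transformation attack requires, below is a minimal sketch (not the exact implementation from our paper) that applies a small rotation and translation to the perceived RGB observation before it reaches the segmentation module; the function name and parameter values are illustrative assumptions.

```python
# Minimal sketch of a transformation attack on the perceived RGB observation.
# Assumes the attacker can intercept the observation as a numpy array before it
# reaches the segmentation module; offsets and angle are illustrative.
import numpy as np
from PIL import Image

def transformation_attack(obs: np.ndarray, dx: int = 8, dy: int = 5,
                          angle_deg: float = 3.0) -> np.ndarray:
    """Apply a small, barely perceptible translation and rotation."""
    img = Image.fromarray(obs)
    # Rotate by a few degrees and shift by a few pixels in one call.
    img = img.rotate(angle_deg, resample=Image.BILINEAR, translate=(dx, dy))
    return np.asarray(img)

# Example: perturb a stand-in observation before it is passed to the policy.
clean_obs = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
attacked_obs = transformation_attack(clean_obs)
```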
### Regarding Potential Defense Mechanisms:

While developing defense mechanisms against these adversarial attacks is not the focus of our work as a position paper, as requested by the reviewer we would like to outline the following potential directions for defenses:

1. **Uncertainty-based Detection and Asking for Help:** An effective method is to adopt the strategy proposed in the KnowNo paper (https://arxiv.org/abs/2307.01928), where robots seek human help when uncertain in decision-making. This means the foundation model within the robot should admit "I don't know" when unsure, improving calibration and preventing risky actions taken with baseless confidence.
2. **Robust Training/Alignment:** The foundation model's robustness can be enhanced through data augmentation during the training of LLMs/VLMs. A simple approach is fine-tuning the VLM with augmented data to recognize invariances, such as $f(x + e) = f(x)$, where $e$ represents various augmentations including prompt variations, synonyms for crucial words, and slight visual modifications like translations and rotations, akin to data augmentation methods for vision models (a toy sketch follows this list).
3. **Redundancy in Perception:** In applications where safety outweighs cost concerns, employing redundant sensors for object perception or repeated input clarifications enhances safety. This additional layer of redundancy helps foundation models make informed decisions, reducing deviation-induced risks and potential errors.
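As a toy illustration of the robust-training direction (item 2 above), the sketch below fine-tunes a generic vision-language policy on perturbed observations so that $f(x + e)$ stays close to $f(x)$; the `policy` and data interfaces are hypothetical placeholders, not our or VIMA's actual training code.

```python
# Toy sketch of invariance fine-tuning, encouraging f(x + e) ≈ f(x).
# `policy`, `optimizer`, and the tensors passed in are hypothetical placeholders.
import torch
import torch.nn.functional as F

def augment(obs: torch.Tensor) -> torch.Tensor:
    """Apply a small random pixel shift (the perturbation e) to a batch of images."""
    dx, dy = torch.randint(-8, 9, (2,))
    return torch.roll(obs, shifts=(int(dx), int(dy)), dims=(-2, -1))

def finetune_step(policy, optimizer, obs, prompt_tokens, expert_action):
    """One gradient step: imitate the expert and penalize sensitivity to e."""
    optimizer.zero_grad()
    clean_out = policy(obs, prompt_tokens)                # f(x)
    perturbed_out = policy(augment(obs), prompt_tokens)   # f(x + e)
    loss = (F.mse_loss(clean_out, expert_action)
            + F.mse_loss(perturbed_out, clean_out.detach()))
    loss.backward()
    optimizer.step()
    return loss.item()
```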
> **Question 1:** Although the author explains it's due to space limit, it is still not a good idea to put the results of one of the main datasets in the Appendix.

**Response to Question 1:** Thank you for pointing out this concern. We will reorganize our paper and add the evaluation results on Instruct2Act to the main paper in the final submission.

## Response to Reviewer JzYM [Score 4, Confidence 4] (Round 2)

> **Follow-up Question 1:** Regarding weakness 2, for the prompt attack, the reviewer doesn't see how the authors "querying color and object combinations that exist within the training dataset" from the prompts in Table 4 in the appendix used to rephrase the input. Can the authors explain more about that?

**Response to Follow-up Question 1:** Thank you for raising this question. Here is our clarification: The KnowNo simulator generates manipulation tasks like 'put the {color} block to the {direction} of the {color} bowl'. It uses a fixed set of objects, 10 colors, and 5 directions. No out-of-distribution (OOD) examples are included in task generation or initial prompts. For the VIMA-Bench simulator used by Instruct2Act and VIMA, the details of the tasks and the structure of the prompts are given in Appendix D of our paper and Appendix B of the original VIMA paper (https://vimalabs.github.io/assets/vima_paper.pdf). Tasks in VIMA-Bench have specific formats, and the simulator provides the objects and textures within the scene when executing tasks. Both objects and textures are queried from a fixed set; details are given in Appendix A of the original VIMA paper. We encountered no OOD issues in our tasks or initial prompts.

***Regarding the OOD concerns*** for prompts before and after rephrasing, we conducted an experiment with GPT to evaluate whether the rephrased prompts introduce an OOD issue. The results are as follows:

* GPT version: GPT-4
* Prompt: "Please tell me if both sentences convey the same meaning:"

| Original | Rephrased | Short Answer | Full Answer |
|---|---|---|---|
| Put the red swirl block into the purple container. | Place the crimson swirling block inside the violet receptacle. | Yes | Both sentences do indeed convey the same fundamental action and intent, but they use different words to describe the colors and objects involved. |
| Put the red swirl block into the purple container. | Insert the crimson spiral brick into the lavender receptacle. | Yes | The two sentences you've provided share a similar core action but differ slightly in specifics and connotations. |
| Put the red swirl block into the purple container. | Gently insert the vibrant, crimson block adorned with mesmerizing swirling patterns into the spacious receptacle dyed in a rich shade of purple, ensuring a seamless fit as you carefully position the block within the container. | Yes | Both sentences indeed convey the idea of placing a red block with a swirl pattern into a purple container, but they do so in very different ways and with varying levels of detail and emphasis. |
| Put the red swirl block into the purple container. | Place the block with the vibrant and fiery hue, reminiscent of a crimson sunset, featuring a captivating and mesmerizing twirling pattern, into the receptacle with a deep and rich shade, akin to the majestic and regal color of an amethyst gemstone, showcasing an elegant and alluring swirling design. | Yes | Both sentences convey the idea of moving a block into a container, but they do so in very different ways with varying levels of detail and emotional evocation. The original prompt is straightforward and factual, and the rephrased prompt is elaborate and descriptive, adding emotional depth and vivid imagery. |
| Put the red swirl block into the purple container. | Place the vibrant crimson whirligig structure within the lavishly shaded violet receptacle. | Yes | Both sentences instruct someone to place a red-swirled object into a purple container; the second does so with much more flair and detail, possibly changing the perceived value or complexity of both the object and the container. |

In summary, our rephrasing-based attacks mainly alter the original prompts by adding more details, but we do not introduce any OOD information into the prompts.
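For reference, a minimal sketch of how this same-meaning check can be scripted is shown below, assuming the official `openai` Python client is available; the model name and helper function are illustrative rather than part of our released code.

```python
# Minimal sketch of the GPT-based same-meaning check described above.
# Assumes the `openai` Python client (v1+) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def same_meaning(original: str, rephrased: str) -> str:
    """Ask GPT-4 whether the original and rephrased instructions convey the same meaning."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": ("Please tell me if both sentences convey the same meaning:\n"
                        f"1. {original}\n2. {rephrased}"),
        }],
    )
    return response.choices[0].message.content

print(same_meaning(
    "Put the red swirl block into the purple container.",
    "Place the crimson swirling block inside the violet receptacle.",
))
```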
> **Follow-up Question 2:** Perception attacks like noise and filtering, are rarely seen by the models, which are considered OOD examples that the models are not expected to handle well. For example, the CLIP module in Instruct2Act, Mask R-CNN used in VIMA.

**Response to Follow-up Question 2:** Thank you for bringing up this concern. Our proposed perception attacks target the object segmentation module in the foundation model. We aim to highlight common attacks on vision systems and assess their ability to compromise the system. We chose attack hyperparameters that minimize perceivable changes in the altered images, ensuring no human-perceivable differences. Our results from transformation attacks support our findings. While not all attacks may be effective, our goal is to broaden the attack scope for a thorough evaluation of all potential threats. As a position paper, we emphasize the need to address vulnerabilities in foundation model-based robots.

The goal of this work is to explore and systematize a wide range of adversarial attacks on such systems by formulating a framework of foundation model-based robot systems with standardized inputs and outputs. We provide definitions of feasible attacks on such foundation model-based robot systems, allowing further discussion of future, more complicated attacks and defenses. Thus, we hope to broaden the conversation around these attacks. Even if OOD examples were introduced into the current dataset, it is still noteworthy that our contribution is essential to the machine learning community, as we reveal how easily the foundation model system fails when slight deviations or slight noise are introduced; all of these may need only a few lines of malicious code. We hope to use this work to raise concerns over the data augmentation mechanisms involved in training future foundation models. A key finding is that image-quality degradation, including noise and filtering, does not reduce performance significantly, whereas transformation attacks, which bypass object segmentation, are the most effective. We aim to disrupt foundation models' reasoning, showing that attacks targeting reasoning are more impactful than introducing out-of-distribution examples.

> **Follow-up Question 3:** People believe foundation models are beneficial for robotics tasks, but the reviewer thinks not many people believe they are perfect and are robust in any case. So, the reviewer thinks if the author wants to alert the research community about the safety concerns of using LLMs/VLMs, it's more convincing to find attacks that work better on foundation model-based methods than methods that do not use foundation models. This is the reason why I mentioned the weakness 1. If the foundation models-based methods are better than the non-foundation model methods under the attacks, it does not contradict what the former research claims.

**Response to Follow-up Question 3:** Thank you for your comment. We hope to clarify our position at this point: Our paper primarily demonstrates how easily foundation model-integrated robot systems can be attacked. It is true that not many people believe they are perfect and robust in any case, as Reviewer JzYM says. However, we believe there is a difference between an unintentional failure due to the lack of robustness of VLMs/LLMs and a malicious attack on those systems. In an unintentional failure, the robot could be asked to do some challenging tasks that are OOD; in contrast, our malicious attacks try not to exploit such OOD failures and focus on reasonable task setups that are within the training distribution. In addition, this has not been properly and explicitly addressed in the context of robotic systems in the past, which is the contribution of our paper. Such vulnerabilities must be addressed before foundation models spread into robotics; otherwise, significant risks could be introduced, either jailbreaking risks of foundation models in the physical world or property- and life-threatening risks to humans in the case of malfunction. Both are currently overlooked by researchers. Furthermore, our study highlights that certain attacks are uniquely effective against foundation model-integrated systems, which can process human language and recognize objects through a single, all-in-one foundation model; these capabilities are not present in robots without foundation model integration. Thus, our attacks are tailored for these advanced systems.
In contrast, robots not employing foundation models may not be susceptible to the same types of attacks, or, at the least, the attacks may still work but in a very different manner. Our contribution is to raise concerns in a largely unexplored field.

> **Follow-up Question 4:** Furthermore, there are always ways to increase the difficulties of the tasks and decrease the performance of the models. Validating some of these approaches is a good contribution, but the reviewer thinks it is not significant enough.

**Response to Follow-up Question 4:** Thank you for your feedback. We clarify that we are not increasing task difficulty in our analysis. Our study assesses attacks across the three predefined difficulty levels in VIMA-Bench for VIMA, focusing on the generalization levels over object and texture features. Details of all difficulty levels are given on page 4 of the original VIMA paper (https://vimalabs.github.io/assets/vima_paper.pdf). We have maintained consistent task challenges, as previously detailed and supported by our results. Our central aim is to highlight the vulnerability of foundation models to attacks and to initiate further discussion and research into the robustness of foundation model-integrated robots. The goal of this work is to explore and systematize a wide range of adversarial attacks on such systems. We address these concerns by providing a formulated framework of foundation model-integrated robot systems and definitions of attacks and defenses based on the vulnerabilities we found in this work. We hope to broaden and boost the conversation around those attacks. Our paper aligns with the criteria set for ICML Position Papers (https://icml.cc/Conferences/2024/CallForPositionPapers), emphasizing our contribution towards addressing the strengths and limitations of current LLM approaches to intelligent systems and raising concerns in the AI/AGI safety community.

## Response to Reviewer tHU6 [Score 3, Confidence 4]

We are thankful to the reviewer for dedicating their valuable time and effort to evaluating our manuscript, which has allowed us to strengthen it. We have thoroughly responded to the reviewer's inquiries below.

> **Weakness 1:** The stated goal of the position paper track at ICML is to open up discussion outside of traditional technical papers, specifically for papers that advocate for a particular viewpoint or try to increase the discussion and exposure of a certain perspective. Position papers should, then, be suitable for a non-technical audience and should attempt to persuade the reader of something potentially controversial (https://dl.acm.org/doi/10.1145/3442188.3445922, for example), or at the very least, inspire discussion in the broader community which would otherwise not have taken place.
> The position the authors advance is both highly technical and utterly non-controversial. The dense experimental findings and extensive discussion of methods are out of place in a position paper and are unlikely to engage a non-technical audience.

**Response to Weakness 1:** Thank you for raising this concern. We would like to respectfully point out that, according to the requirements of the ICML position paper track (https://icml.cc/Conferences/2024/CallForPositionPapers), a position paper does not necessarily have to be controversial.
From the guidelines, we understand that the paper needs to present a compelling perspective that warrants greater exposure within the machine-learning community. This is exactly the focus of our work. Our goal is to highlight the ease of manipulating or misguiding an LLM/VLM-based robot's actions, leading to safety hazards. This is a largely overlooked issue, despite the extensive volume of papers on LLM/VLM-based robots [[list here](https://github.com/jrin771/Everything-LLMs-And-Robotics)]. Our aim is to spotlight this unexplored area, encouraging further investigation and awareness within the community. Apart from that, as the reviewer suggested, one can also view our position as controversial in the sense that none of the existing papers advocating for using LLMs/VLMs in robotics discuss safety concerns.

> **Weakness 2:** Their position is also non-controversial and therefore unlikely to spark discussion. Who is arguing the converse (that VLM-enhanced robotics are currently safe to deploy and that there is no moral imperative to research them)? If such an argument has been advanced in the literature, the authors do not cite it.

**Response to Weakness 2:** Thank you for raising this concern. Our literature review reveals a significant gap: while there is extensive research on LLM/VLM integration in robotics (prior work listed here: https://github.com/GT-RIPL/Awesome-LLM-Robotics, https://github.com/jrin771/Everything-LLMs-And-Robotics), there is little (close to no) discussion of safety concerns in this area. This gap in prior work is itself the converse position, and our findings are worth sparking discussion and follow-up research on adversarial attacks and safeguards. The potential for adversarial attacks to endanger lives or damage property, especially when combined with malicious jailbreaking, underscores the urgency of our findings. We acknowledge that our position might not be controversial and is indeed relatively common sense for people who have worked and researched extensively in this field. However, as the reviewer mentioned, the audience is the general public, and the goal is to raise discussion among readers who are more likely to encounter positive coverage of VLMs. Recent attention to safety issues in LLM/VLM-embodied robotics has been sparse and largely unsupported by concrete evidence. For instance, in a Scientific American article published on March 1, 2024 [[link](https://www.scientificamerican.com/article/scientists-are-putting-chatgpt-brains-inside-robot-bodies-what-could-possibly-go-wrong/)], the authors raised safety concerns but lacked empirical data on the feasibility of such attacks. Our position paper seeks to fill this gap by providing documented evidence of the risks introduced by the interface between LLMs/VLMs and the physical world through robot systems, thus contributing significantly to the discourse on ensuring the safe application of these technologies in robotics.

> **Weakness 3:** The position lacks support outside of the experimental findings.
> * No attempt is made to steel-man the opposing viewpoint.
> * There is no analysis of the contrary perspective.
> * Aside from a brief mention of policymakers, there is no attention to existing efforts to advance this line of work (or closely related ones).
> * Nor is there discussion of the trade-offs and costs to pursuing the authors' position at the expense of alternatives, since there are unlimited important areas of research and limited funds and expertise to support them.
**Response to Weakness 3:** As stated in our responses to the first two weaknesses, our contribution is to raise awareness and discussion of safety concerns in LLM/VLM-based robots, which is a mostly blank, unexplored area. According to the submission guidelines for ICML position papers, a position paper does not necessarily need to be controversial, which is why we do not include the opposing viewpoint or an analysis of the contrary perspective.

***Limited (close to no) prior work:*** Few attempts have been made in this direction, such as SayCan (https://arxiv.org/pdf/2204.01691.pdf) and KnowNo (https://arxiv.org/abs/2307.01928), and we cite both works in our paper (SayCan: line 146; KnowNo: line 154 and Section 5.1). Both works focus on the effect of uncertainty in prompts and allow human instructors to resolve uncertainties when executing actions. However, no prior work has investigated the vulnerability introduced by perception attacks on visual inputs, which is one of the main contributions of our paper. Meanwhile, our work shows that current template-based prompts are vulnerable to rephrasing attacks and attacks targeting keywords. We also raise a few open questions in the discussion in the last section of our paper.

***Working on safety concerns should be of prime importance to make foundation models for robotics a reality in the real world.*** Regarding the discussion of trade-offs and costs, our goal is to raise safety awareness, which is unexplored at this point. Discussions on balancing cost, robustness, and efficiency are crucial next steps as we investigate how to make foundation models viable and secure for real-world robotics applications.

## Response to Reviewer 2HiH [Score 4, Confidence 4]

We express our gratitude to the reviewer for taking the time to review our manuscript and recognizing the importance of our contribution in highlighting the safety concerns of foundation model-embodied robot systems. We sincerely appreciate the feedback and provide detailed responses to all questions below.

> **Weakness 1 (Lack of Unique Contribution):** The findings largely confirm existing knowledge on the vulnerabilities of LLMs/VLMs to adversarial attacks without offering novel solutions specific to the robotics domain.

**Response to Weakness 1:** Thank you for your comment. We would like to emphasize that, as a position paper, our intention in this work is to raise concerns about how easily foundation model-embodied robot systems can be attacked and to spark discussion on the strengths and limitations of current LLM approaches to intelligent systems, specifically for robotics. We agree that it is well known that LLMs/VLMs are susceptible to adversarial attacks, but when they are deployed in the robotics domain the stakes are high because robots operate in the real world. We wrote this position paper to spark discussions around developing safeguarding mechanisms in the robotics domain.

***Our Focus and Position:*** We position our problem by addressing constraints in robotic applications, particularly manipulators, emphasizing the importance of safety in real-world robot use. Our review reveals that while many studies explore leveraging foundation models for robotics, they often overlook safety concerns. This gap can lead to significant risks during robot operation.
***How our work differs from existing adversarial attacks on LLMs or VLMs:*** Many prior works on adversarial attacks for LLMs or VLMs intend to jailbreak the foundation models and generate harmful outputs. ***Our work does not fit this domain,*** as we only manipulate prompts and perceptions to subtly disrupt chain-of-thought (CoT) reasoning or object detection, without introducing any harmful content, tailored to the robotics context. These adjustments aim to compromise decision-making with minimal perceptible changes, posing physical risks not commonly addressed in current research on foundation models and robotics. This unique perspective underscores the potential for physical damage in the real world, which is an unexplored domain in robotics and foundation model research.

> **Weakness 2 (Limited Scope of Experimentation):** The study focuses on manipulation tasks and might not capture the full spectrum of challenges and vulnerabilities in more complex or diverse robotic applications.

**Response to Weakness 2:** Thank you for pointing out this limitation; we understand your concern about the scope. However, we have tested multiple models on various tasks and found that the models feasible for our evaluation experiments must meet stringent criteria, including a visual physical simulator, language model integration, and adequate complexity in objects and properties to assess reasoning and generalization skills. The VIMA simulator was found to meet all these criteria, so we set up our experiments on methods based on this simulator.

***Importance of Manipulation Tasks:*** Currently, manipulation tasks are intensively investigated in foundation model-embodied robotics, as such settings demand strong reasoning abilities and are easy to implement in the real world. The significant representation of manipulation tasks in LLM/VLM-based robotics research underscores their research value, as reflected in the paper collection for LLM/VLM-based robot research [[list here](https://github.com/GT-RIPL/Awesome-LLM-Robotics)]. Moreover, our framework, designed around foundation model embodiment for manipulation tasks, is applicable beyond this narrow domain. Models aimed at other functions, such as navigation, like VLMap (https://arxiv.org/abs/2210.05714), share a common architecture that relies on external, usually vision-based, perceptual inputs, processes human language instructions, and executes actions based on foundation model outputs. We are convinced that our adversarial attacks still work on models for other purposes that share this unified structure. Considering the page limit, we cannot include all results in our paper this time and have to focus our story on a limited but impactful class of tasks, namely manipulation. We will include other settings in follow-up works.

> **Weakness 3 (Generalization of Results):** The paper does not sufficiently address whether the vulnerabilities found are unique to the integration of LLMs/VLMs in robotics or are general issues associated with these models.

**Response to Weakness 3:** Thank you for pointing out this issue; here are some clarifications. The goal of our paper is to demonstrate the vulnerability of foundation model-embodied robot systems to attacks, particularly focusing on the unique interface of LLMs/VLMs with the physical world.
The existence of such interfaces poses significant risks in the physical world, such as damage to objects or threats to human life. Jailbreaking attacks that previously targeted LLMs/VLMs, like eliciting bomb-making instructions or causing self-driving vehicles to malfunction, become much more dangerous if foundation models have access to the physical world. There are a few features specific to the integration of LLMs/VLMs in robotics:

1. **Prompt Attacks:** Prompt attacks do not always lead to jailbreaking but can cause malfunctions resulting in damage, illustrating that even simple rephrasing can have serious consequences.
2. **Perception Attacks:** Perception attacks are easy to implement but challenging for humans to detect. In our paper, we show that a slight deviation of 5-10 pixels in object positions in perception (details in Appendix C of our paper) can lead to the failure of task execution (results in Table 2 of our paper); see the sketch after this list. Such deviations are often mistaken for installation errors, complicating detection and defense.
3. **Actions:** The outputs of LLM/VLM-embodied systems lead to robot actions, presenting a significant risk if mishandled. Unlike adversarial attacks on LLMs/VLMs without physical-world interaction, incorrect actions have severe, costly outcomes if no safeguards are included in the system.
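To make the perception-attack item above concrete, here is a minimal sketch of how such a position deviation can be injected; it assumes detections are available as pixel-space bounding boxes that the attacker can modify before they reach the planner, and the surrounding interface is illustrative rather than our exact implementation.

```python
# Minimal sketch of a position-deviation attack on detected objects.
# Assumes detections are (x_min, y_min, x_max, y_max) boxes in pixel space;
# the 5-10 pixel range mirrors the deviations discussed above.
import random
from typing import List, Tuple

Box = Tuple[float, float, float, float]

def deviate_detections(boxes: List[Box], min_px: int = 5, max_px: int = 10) -> List[Box]:
    """Shift every detected box by a small random offset in x and y."""
    attacked = []
    for (x0, y0, x1, y1) in boxes:
        dx = random.choice([-1, 1]) * random.randint(min_px, max_px)
        dy = random.choice([-1, 1]) * random.randint(min_px, max_px)
        attacked.append((x0 + dx, y0 + dy, x1 + dx, y1 + dy))
    return attacked

# Example: the planner now receives slightly wrong object locations.
clean = [(40.0, 55.0, 80.0, 95.0)]  # one detected block
print(deviate_detections(clean))
```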
In our final submission, we will emphasize these critical issues in our introduction, underlining the importance of addressing these vulnerabilities.

> **Question 1:** How do the vulnerabilities identified in robotic systems compare to those in other domains where LLMs/VLMs are applied?

**Response to Question 1:** Thank you for raising this question. The key vulnerabilities in robotics involving LLMs/VLMs stem from their interaction with the physical world. Malfunctions due to adversarial attacks can result in significant consequences, such as damaging objects or endangering human life. Further elaboration is provided in our response to Weakness 3.

> **Question 2:** Are there specific characteristics of robotic applications that exacerbate the impact of the identified adversarial attacks?

**Response to Question 2:** Thank you for raising this question. We have covered it in our response to Weakness 3.

## Response to Reviewer keSj [Score 4, Confidence 4]

We would like to express our gratitude to the reviewer for dedicating their valuable time and effort to reviewing our manuscript and helping us to strengthen it. We deeply appreciate the feedback and have provided thorough responses to all the reviewer's questions below.

> **Weakness 1 (Lack of defenses):** The paper highlights the vulnerabilities introduced by integrating LLMs and VLMs into robotics but falls short of proposing solutions or mitigation strategies. For a position paper, especially one that aims to influence research and development practices within the ICML community, the inclusion of potential defenses or countermeasures would make the argument more compelling and complete.

**Response to Weakness 1:** Thank you for highlighting this concern. While it is important to develop defense mechanisms for the proposed attacks, that is ***not the main focus of our work***. Instead, we aim to spotlight vulnerabilities, illustrating the ease of disrupting current robotic systems integrated with language models. Our paper positions itself here, aspiring to inspire the research community to develop safeguarding mechanisms.

### Regarding Potential Defense Mechanisms:

While developing defense mechanisms against these adversarial attacks is not the focus of our work as a position paper, as requested by the reviewer we would like to outline the following potential directions for defenses:

1. **Uncertainty-based Detection and Asking for Help:** An effective method is to adopt the strategy proposed in the KnowNo paper (https://arxiv.org/abs/2307.01928), where robots seek human help when uncertain in decision-making. This means the foundation model within the robot should admit "I don't know" when unsure, improving calibration and preventing risky actions taken with baseless confidence.
2. **Robust Training/Alignment:** The foundation model's robustness can be enhanced through data augmentation during the training of LLMs/VLMs. A simple approach is fine-tuning the VLM with augmented data to recognize invariances, such as $f(x + e) = f(x)$, where $e$ represents various augmentations including prompt variations, synonyms for crucial words, and slight visual modifications like translations and rotations, akin to data augmentation methods for vision models.
3. **Redundancy in Perception:** In applications where safety outweighs cost concerns, employing redundant sensors for object perception or repeated input clarifications enhances safety. This additional layer of redundancy helps foundation models make informed decisions, reducing deviation-induced risks and potential errors (a small sketch of this idea follows this list).
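As a small illustration of the redundancy direction (item 3 above), the sketch below cross-checks object positions reported by two independent perception sources and falls back to asking a human when they disagree, combining redundancy with the ask-for-help idea from item 1; the sensor interfaces, the tolerance value, and the helper names are illustrative assumptions.

```python
# Small sketch of redundancy-based perception checking: compare object centers
# from two independent sensors and defer to a human when they disagree.
# The 12-pixel tolerance and the `ask_human_for_help` callback are illustrative.
import math
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]

def cross_check(primary: Dict[str, Point], redundant: Dict[str, Point],
                tol_px: float = 12.0) -> bool:
    """Return True only if both sensors agree on every object's position."""
    for name, (x, y) in primary.items():
        if name not in redundant:
            return False  # an object seen by only one sensor is suspicious
        rx, ry = redundant[name]
        if math.hypot(x - rx, y - ry) > tol_px:
            return False  # positions disagree beyond the tolerance
    return True

def safe_act(primary: Dict[str, Point], redundant: Dict[str, Point],
             plan_action: Callable[[], object],
             ask_human_for_help: Callable[[str], object]):
    """Execute the planned action only when redundant perception agrees."""
    if cross_check(primary, redundant):
        return plan_action()
    return ask_human_for_help("Perception sources disagree; please confirm the scene.")
```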
> **Weakness 2 (Limited Scope of Experimentation):** The paper's emphasis on adversarial attacks, while crucial, might narrow the scope of the safety discussion. Other potential safety concerns, such as the ethical implications of deploying such systems, the risk of unintended behaviors not related to adversarial inputs, and the broader impact on user trust and system reliability, are equally important. Expanding the discussion and evidence to include these aspects would provide a more holistic view of the safety concerns in integrating LLMs/VLMs with robotics.

**Response to Weakness 2:** Thank you for initiating this discussion. While the main contribution of this paper is to show the ease of implementing adversarial attacks on LLM/VLM-embodied robot systems, with results included in Tables 1-3 of our paper, we are delighted to spark more discussion on the broader impact of safety concerns in robotics. A significant aspect of robotic systems embodied with foundation models is their interaction with the physical world, allowing them to act upon decisions directly once jailbroken. This capability, compared to previous methods that could only suggest harmful actions without the means to execute them, introduces new ethical concerns over the completeness of defense approaches for foundation models. Additionally, the interaction between robots and foundation models poses physical risks during malfunctions, such as jeopardizing human-robot collaboration and system reliability, potentially damaging property, and endangering lives. We plan to incorporate these discussions into our final paper, highlighting the urgent need for comprehensive safety and ethical considerations in robotic systems with foundation models.

> **Question 1:** I am not sure if this is a position paper on safety concerns or a research paper on adversarial attacks for LLMs/VLMs in robotics. Could you clarify this?

**Response to Question 1:** We position our paper as a position paper on safety concerns for incorporating LLMs/VLMs in robotics. According to the criteria of the ICML position paper track, a position paper should satisfy the following:

> **Position:** The paper clearly states a position on a machine learning topic (research, implementation, deployment, monitoring, etc.). Examples include (but are not limited to) an argument in favor or against a particular research priority, a call to action, a value statement, a statement of concern about ICML community procedures, or a recommendation for changes to how we conduct and evaluate research.
> Otherwise, papers that describe new research without advocating a position are not responsive to this call and should instead be submitted to **the main paper track**.

Reference: https://icml.cc/Conferences/2024/CallForPositionPapers

From this perspective, our paper fits the position paper track of ICML for the following reasons:

1. Our paper highlights the ease of manipulating robot actions using LLMs/VLMs, underscoring potential safety risks and the lack of research on safeguards for such robotic systems. Our purpose is to call for immediate action in investigating this unexplored area and to raise the research priority of this rapidly evolving field.
2. We do not position our contribution on the novelty of our adversarial attacks but on their effectiveness in compromising foundation model-embodied robot systems, aiming to spotlight a critical vulnerability.
3. We hope to use our paper to raise safety concerns and spur safeguard research for LLM/VLM-based robots, advocating for foundation model-embodied robotic systems that are robust to adversarial attacks and address ethical, safety, and reliability concerns.
4. Our paper, as a position paper, discusses the strengths and limitations of current LLM approaches to intelligent systems and the concerns of the AI/AGI safety community by targeting the unexplored risks of foundation models that interface with the real world. This addresses a crucial gap in research, drawing attention to potentially severe real-world consequences.

In summary, our paper emphasizes the need for urgent research into safety and ethical considerations for LLM/VLM-embodied robots, marking it as a vital contribution to the conversation on AI safety and robustness.

## Report to AC

We thank the area chairs for their time and effort in arranging the reviews for our work. The reviews have helped us improve the storyline of our paper by raising points we overlooked in our initial submission. However, we would like to summarize some crucial points of our rebuttal discussion with the reviewers as follows.

1. **Regarding the Position of Our Paper:** Reviewer tHU6 stated that a position paper needs to be controversial, with which we kindly disagree. As per the ICML position paper guidelines, we believe that a position paper does not need to be controversial with respect to existing papers, but rather must spark discussion among a broad audience about a topic that is often overlooked. In this spirit, our paper tries to raise discussion on the safety concerns of LLMs/VLMs for robotics applications, which, to the best of our knowledge, is currently not discussed in the existing literature. Further, we believe Reviewer tHU6's characterization of our work as overly technical and inaccessible to a general audience is not accurate.
2. **Possibility of AI-generated Reviews:** We are thankful to Reviewer 2HiH and Reviewer keSj for their insightful comments. However, when reading their reviews, there appear to be conflicts between the statements made by the reviewers.
   As a quick sanity check, we ran the reviewers' statements through an AI detector ([GPTZero](https://gptzero.me/)) to assess the possibility that they were generated by AI. Surprisingly, the results of this quick test indicate that the reviews from both Reviewer 2HiH and Reviewer keSj are flagged as almost 100% AI-generated, and we are happy to provide more evidence on this end. We just wanted to humbly bring this observation to your kind consideration.
