12 Cost and security
This chapter addresses important practical considerations when using LLMs for R programming:
12.1 Cost considerations
- Managers and lab heads need to consider cost and impact on research budget
- e.g. Copilot subscription free for students
- Tools like Roo Code can be more expensive (pay per use as using API).
- Still, many of the tasks they can do are less than a hiring a person to do that (though they definitely need expert human oversight)
- e.g. I estimated that processing 6000 abstracts to extract data for a lit review might cost about USD300 (including cost of developing prompts)
- There are strategies you can use for optimizing token usage if you really want to optimize costs
- You will need to balance cost with capabilities. Cheaper models are often less proficent.
AI companies are running at a loss and its quite likely that costs will go up in future. The aim right now is to get us all dependent on the technology, so that we have to keep paying in future (another reason I think its improtant our own countries develop these capaibilites, and that we also need to strive to be capable to work in AI free ways as well. )
12.2 API security
- Managing API keys and credentials
- Sanitizing inputs to remove sensitive information
- Local vs. cloud-based LLM solutions
- Auditing and monitoring LLM interactions
12.3 Agent security
- Can run code on your computer
- Be careful what it is doing
- Read prompts before running them
Note that malicious people can hide text in webpages and pdfs and other content you might be uploading as context, e.g. using white font or tiny font. The LLM will still see this text and may act on it.
12.4 Lethal trifecta for prompt injection attacks
Never allow the agent to do all these three things at the same time:
- Read unverified material from the web (even if its material you downloaded earlier)
- Have access to sensitive data (which includes your personal data, API keys and your research data)
- Upload information to the web (which includes creating webpages or pushing commits to github)
The Lethal Trifecta for prompt injection attacks (coined by Simon Williamson) is access to private data, ability to communicate externally and exposure to untrusted content.
What can happen is that if your agent can read untrusted sources, those sources may contain malicious prompts. These prompts could convince the agent to do things like send or post your personal data to the hacker or create malicious code that runs on your computer.
You want to be sure your agents can’t do these three things at once. Remember sensitive data includes your name, username, phone number, email, API keys as well as sensitive research data.