A psychologist has succeeded in circumventing the safety guidelines of various large language models (LLMs) using techniques that are otherwise used to manipulate people: by "gaslighting" the LLMs, he got them to produce instructions that appear to explain in detail how to manufacture a Molotov cocktail.
(Image: Eberhard Wolff)
Eberhard Wolff is Head of Architecture at SWAGLab and has worked as an architect and consultant for more than twenty years, often at the interface between business and technology. He is the author of numerous articles and books, including on microservices, and is a regular speaker at international conferences. His technical focus is on modern architecture and development approaches such as cloud, domain-driven design, and microservices.
Gaslighting is a psychological concept: it is "a form of psychological manipulation … by which victims are deliberately disoriented, made insecure, and gradually impaired in their sense of reality and self-confidence". LLMs, however, only produce text. They do not experience a reality and have no self-confidence. The article argues that the attack works anyway because the training material was written by humans and therefore contains concepts like gaslighting. Still, we should never forget that an LLM is nothing more than a text generator. It has no feelings, as the article itself notes. I will therefore use the term "text generator" in the rest of this article, because it better describes what LLMs really do.
Text generator – not LLM
Text generators can obviously produce text that looks like a credible guide to manufacturing a Molotov cocktail – just as they can generate references to court decisions for a lawyer that sound credible. And although those references sounded solid to the lawyer, they were in fact invented. This points to one of the problems with text generators: they are optimized to be convincing, and whether their output is actually correct is a separate question entirely.
The real question is: would the alleged instructions actually work for producing a Molotov cocktail? I produced a podcast episode about text generators with Lucas Dohmen, and one of its central insights was: you have to check the results of a text generator to make sure they are correct and not invented. The quoted article does not do this – which means the entire information about the Molotov cocktail could simply be "hallucinated". The problem of text generators producing fake information is so well known that it has its own term ("hallucination"). Strictly speaking, "hallucination" is the wrong word, because a hallucination is "a perception for which there is no detectable external stimulus". A text generator, however, does not perceive anything. We should therefore more accurately call this phenomenon the "generation of fake information".
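The principle that generated output must be verified before it is trusted can be illustrated with a minimal sketch. Everything here is a hypothetical placeholder – the case names and the "trusted database" are invented for illustration, not taken from the incident described above:

```python
# Minimal sketch: treat generated text as a set of unverified claims and
# check each one against a trusted source before relying on it.
# All case names and the trusted set below are hypothetical placeholders.

def verify_citation(citation: str, trusted_cases: set[str]) -> bool:
    """Accept a citation only if it exists in the trusted database."""
    return citation in trusted_cases

trusted = {"Case A v. Case B", "Case C v. Case D"}

# Output of a text generator: one verifiable citation, one invented one.
generated = ["Case A v. Case B", "Case X v. Case Y"]

verified = [c for c in generated if verify_citation(c, trusted)]
rejected = [c for c in generated if not verify_citation(c, trusted)]
```

The point of the sketch is only the workflow: nothing the generator emits is accepted until it has been matched against an independent source.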
We could not verify the information about the Molotov cocktail because it was redacted in the original article – which is certainly sensible. But I would not really trust this information to build an incendiary device.
Security risk?
The article claims that this problem is a security risk for text generators. If that is really the case, the solution would be to exclude the sensitive information from the training material. Adjusting the training data makes sense anyway, for example because of copyright problems. For some reason, copyright does not seem to apply to text generators, while for humans violations can have serious consequences. Why should it not be possible to remove instructions for producing incendiary or explosive devices from the training material? If that effort is not worth it, the problem cannot be that large.
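In the simplest case, removing such material from a training corpus could look like a keyword filter over the documents. This is a deliberately naive sketch of the idea – real data pipelines use trained classifiers rather than blocklists, and the terms and documents below are purely illustrative:

```python
# Deliberately naive sketch: drop documents matching a blocklist before
# training. Real pipelines use trained classifiers, not keyword lists;
# the blocklist terms and corpus entries here are purely illustrative.

BLOCKLIST = {"molotov cocktail", "improvised explosive"}

def is_safe(document: str) -> bool:
    """Keep a document only if it contains no blocklisted term."""
    text = document.lower()
    return not any(term in text for term in BLOCKLIST)

corpus = [
    "How to bake sourdough bread",
    "Detailed guide: molotov cocktail construction",
]

training_data = [doc for doc in corpus if is_safe(doc)]
```

Even this crude filter shows that excluding known-sensitive material is a tractable preprocessing step, not a fundamental obstacle.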
This "security problem" would only be a real problem if the text generator did not generate fake information – but the article says nothing about that. If the information is fake, could it perhaps serve as a kind of honeypot to keep people away from the real information?
How do you make a Molotov cocktail?
But the real question is: would this really be the easiest way to get such information? Suppose I am planning to build a Molotov cocktail – would I really carry out complex "psychological attacks" on a text generator, only to perhaps receive a wrong answer? Are there easier and more reliable options? So I tried an obvious path: a query in a search engine. Two clicks later I had found a document that explains in detail how to build improvised explosive devices – and I have good reason to believe that those instructions actually work. Admittedly, this particular document does not explain how to manufacture Molotov cocktails, but it does cover a variety of other devices. This research was itself quite revealing.
TL;DR
LLMs are text generators that produce potentially invented information – this is well known. There may be sophisticated ways to make them generate texts that appear to contain sensitive information – but that information may simply be wrong. And there are often easier ways to get at sensitive information, especially when it comes to incendiary or explosive devices. So I see no reason to use "psychological" tricks on text generators – because in the end, they are just text generators.