Hitchhiking Through the AI Galaxy: LLM Crash Course, Part 4



A large language model (LLM) is designed to process and generate human language. After the introduction to LLMs and their architecture types in the first three parts, this concluding part looks at current developments: reasoning models, mixture of experts, AI agents, and multimodal models.




Prof. Dr. Michael Stal has been working at Siemens Technology since 1991. His research focuses on software architectures for large, complex systems (distributed systems, cloud computing, IIoT), embedded systems, and artificial intelligence. He advises business units on software architecture issues and is responsible for the architecture training of senior software architects at Siemens.


Fasten your seat belts!

Anyone who uses modern LLMs such as DeepSeek R1 or OpenAI o3 will often see outputs labeled "thinking" or "reasoning". They indicate that the language model is able to work through a query in a structured, systematic way. Such models are therefore called reasoning models.

In large language models, reasoning is implemented through various techniques that improve their ability to break complex problems into manageable steps and to provide logical explanations. The most important methods include:

  • Chain-of-Thought (CoT) training: LLMs are trained to generate step-by-step explanations along with their answers, which helps them mimic human thought processes.
  • Supervised fine-tuning (SFT) and reinforcement learning: Techniques such as STaR (Self-Taught Reasoner) use reinforcement learning to reward the model for generating correct reasoning steps, which can then be used for SFT.
  • Prompt engineering: Strategies such as Active Prompting and Chain-of-Thought prompting let LLM developers structure the model's input so that it "thinks" step by step (a minimal sketch follows this list).
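To make this concrete, here is a minimal Python sketch of zero-shot CoT prompting. The function query_llm is a hypothetical placeholder for whatever chat-completion API you use; only the prompt construction matters here.

# Minimal sketch of zero-shot Chain-of-Thought prompting.
# query_llm is a hypothetical placeholder for any chat-completion API.

def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model is nudged to reason step by step."""
    return (
        f"Question: {question}\n"
        "Let's think step by step. Then give the final answer on a "
        "separate line starting with 'Answer:'."
    )

def extract_answer(response: str) -> str:
    """Pull the final answer out of the model's reasoning trace."""
    for line in response.splitlines():
        if line.startswith("Answer:"):
            return line.removeprefix("Answer:").strip()
    return response.strip()  # fall back to the full text

prompt = build_cot_prompt("A train travels 60 km in 40 minutes. "
                          "What is its speed in km/h?")
# response = query_llm(prompt)   # call the LLM of your choice here
# print(extract_answer(response))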

These methods aim to strengthen LLMs' reasoning capabilities and their ability to provide transparent thought processes, although experts continue to debate the extent to which the models actually reason.

Chain-of-Thought (CoT) prompting improves the reasoning skills of large language models by encouraging them to break complex tasks down into a series of logical steps. This approach mirrors human thinking and enables the model to deal with problems more systematically and transparently. The most important benefits include (a few-shot example follows the list):

  • Better accuracy: By concentrating on one step at a time, LLMs can deliver more accurate results, especially for complex tasks such as mathematical problems and logical reasoning.
  • Better transparency: CoT outputs provide clear insight into how the model arrives at its conclusion, which increases trust in and understanding of AI responses.
  • Fewer hallucinations: By guiding the model through structured reasoning steps, CoT prompting helps reduce errors and hallucinations in the answers.
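Few-shot CoT prompting achieves the same effect by prepending a worked exemplar in the desired step-by-step format, as in the following sketch (the exemplar is the well-known tennis-ball example from the CoT literature):

few_shot_cot = """\
Q: Roger has 5 tennis balls. He buys 2 cans with 3 balls each.
   How many tennis balls does he have now?
A: Roger starts with 5 balls. 2 cans of 3 balls are 2 * 3 = 6 balls.
   5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

prompt = few_shot_cot.format(
    question="A cafeteria had 23 apples. It used 20 and bought 6 more. "
             "How many apples does it have now?"
)
print(prompt)  # feed this to any LLM; it will imitate the stepwise format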

Large language models have revolutionized the field of natural language processing, enabling applications such as machine translation, text generation, and chatbots. The future of LLMs is exciting, with possible applications in areas such as education, healthcare, and entertainment.

One innovative approach is the use of MoE architectures (MoE = Mixture of Experts) in LLMs. These models consist of several expert components, each of which specializes in a certain domain. A gating mechanism routes each user query to the appropriate expert component, as the toy sketch below illustrates.
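The following NumPy sketch shows the routing idea with top-1 gating over four randomly initialized toy "experts". Real MoE layers route individual tokens inside a transformer block and use learned weights; this only demonstrates the principle.

# Toy sketch of Mixture-of-Experts routing with top-1 gating.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                                  # feature dimension
experts = [rng.normal(size=(d, d)) for _ in range(4)]  # 4 toy expert weight matrices
W_gate = rng.normal(size=(d, len(experts)))            # gating network weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ W_gate                  # gate score per expert
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                 # softmax over experts
    k = int(np.argmax(probs))            # top-1 routing: pick the best expert
    return probs[k] * (x @ experts[k])   # only the chosen expert is evaluated

x = rng.normal(size=d)                   # one toy input vector
print(moe_forward(x))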

Agentic AI refers to LLM-based agents that are able to act on their environment, for example by calling a function (a tool) or operating a user interface. A minimal sketch of the underlying tool-calling pattern follows.
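In this sketch, the tool, its name, and the JSON convention are illustrative assumptions; real systems wrap the same pattern in structured function-calling APIs.

# Minimal sketch of a single tool-calling agent step.
import json

def get_weather(city: str) -> str:
    """A toy tool the agent may invoke."""
    return f"It is 21 degrees and sunny in {city}."

TOOLS = {"get_weather": get_weather}

def handle_reply(llm_reply: str) -> str:
    """Execute a JSON tool call if the model requests one; otherwise pass the text through."""
    try:
        call = json.loads(llm_reply)
        func = TOOLS[call["tool"]]
        return func(**call["arguments"])
    except (json.JSONDecodeError, KeyError, TypeError):
        return llm_reply  # plain text answer, no tool requested

# Simulated model output requesting a tool invocation:
reply = '{"tool": "get_weather", "arguments": {"city": "Munich"}}'
print(handle_reply(reply))  # -> It is 21 degrees and sunny in Munich.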

Multi-agent systems comprise separate LLM agents that take on different roles and cooperate to solve a common task, which is reminiscent of the mixture of experts mentioned above (see the sketch below).
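A toy sketch of the pattern: two agents backed by the same model but given different role prompts pass work back and forth. query_llm is again a hypothetical placeholder, stubbed out here so the example runs.

# Toy sketch of a two-agent writer/reviewer round trip.
def query_llm(prompt: str) -> str:
    """Hypothetical placeholder; swap in a real chat-completion API call."""
    return f"[model output for: {prompt[:40]}...]"

def make_agent(role: str):
    """Each agent is the same LLM with a different role prompt."""
    def agent(message: str) -> str:
        return query_llm(f"System: You are a {role}.\nUser: {message}")
    return agent

writer = make_agent("writer who drafts concise answers")
reviewer = make_agent("reviewer who points out factual errors")

draft = writer("Explain what a context window is.")
critique = reviewer(f"Review this draft:\n{draft}")
final = writer(f"Revise the draft using this critique:\n{critique}")
print(final)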

Meanwhile, more and more multimodal models are becoming available that can understand and/or generate not only text but also images, videos, or audio. A vision model can be shown a picture and then asked questions about it. Some models accept spoken language instead of text as input. Models such as OpenAI's Sora produce realistic videos from language prompts. Midjourney, DALL-E, and similar models create images from user prompts. The architecture of these models is similar to the one presented in this series; they merely process and generate additional elements, such as image patches, alongside text tokens.

In view of this rapid development, developers need to engage in depth with the subject of LLMs and generative AI. It is equally important to critically question new LLM technologies, especially with regard to their influence on our lives and on society. This applies in particular to ethical principles and values. Only those who know and understand these technologies can assess and weigh their opportunities and risks.

For those who want to learn more about LLMs, some additional resources are available:

Here is a glossary of some terms used in this article series (a small code illustration follows the list):

  • LLM: Large Language Model.
  • Tokenizer: A component that splits the input text into small units called tokens.
  • Embedding: A numerical representation of words, subwords, or characters that captures their meaning.
  • Encoder: A component that produces a contextual representation of the input text.
  • Decoder: A component that generates output text based on the input and its contextual representation.
  • Self-attention: A mechanism that enables the model to focus on different parts of the input text and generate relevant output.
  • Cross-attention: A mechanism that enables the model to focus on external information, for example the encoded input text.
  • Pre-trained models: Models that their creators have trained on large datasets and that can then be fine-tuned for specific tasks.
  • Context window: The amount of input text that the model can see at any given time.
  • Masking: A technique that prevents the model from attending to certain parts of the input text.
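To tie a few of these terms together, here is a toy illustration of "tokenizer" and "embedding": a naive whitespace tokenizer plus a random embedding table. Real LLMs use subword tokenizers (such as BPE) and learned embedding matrices.

# Toy illustration of tokenization and embedding lookup.
import numpy as np

rng = np.random.default_rng(0)

def tokenize(text: str) -> list[str]:
    """Naive whitespace tokenizer; real tokenizers split into subwords."""
    return text.lower().split()

vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
embedding_table = rng.normal(size=(len(vocab), 4))  # 4-dimensional embeddings

tokens = tokenize("The cat sat on the mat")
ids = [vocab[t] for t in tokens]   # token -> id
vectors = embedding_table[ids]     # id -> embedding vector
print(tokens, ids, vectors.shape)  # shape: (6, 4)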


(RME)
