Unravel the mysteries of machine learning, Generative AI, LLMs, and so much more (and how it all works together) — in this comprehensive AI guide.
AI isn't just a futuristic concept anymore.
It's here, ready to transform how you approach your work by streamlining your daily tasks and unlocking new possibilities.
But to really maximize AI, you need to know the basics.
The essential AI knowledge in this article will demystify the rapidly changing AI landscape and equip you with empowering, practical know-how so you can start using AI effectively… and fast.
In this article:
NOTE: Because the content builds upon itself, we recommend reading this in sequential order so that you don't miss critical terminology and explanations.
‘AI’ and ‘ML’ get thrown around a lot… What do they mean?
Artificial intelligence is a broad term referring to a machine’s ability to perform tasks that would typically require human intelligence (e.g., speech recognition, language comprehension, and making decisions or predictions based on data).
What’s not considered AI?
An algorithm is a set of step-by-step instructions that guide machines in performing tasks and making decisions. Algorithms are used across the entire spectrum of AI models.
An AI model refers to a program or algorithm trained and programmed by a human on specific data to achieve an explicitly defined task.
A few types of AI models include:
Type of AI Model | Description |
---|---|
Rule-Based Systems | Machines perform tasks based on predetermined rules that have been hard-coded into them by humans, resulting in predefined outcomes using "if-then" coding statements |
Expert Systems | Machines perform tasks based on predetermined expertise that has been hard-coded into them by humans to simulate the judgment and behavior of a human expert |
Machine Learning | Machines learn to perform tasks and optimize performance from experience — without a human explicitly defining the rules |
While all ML models are considered AI models, not every AI model is an ML model. The key difference between rule-based and expert-based AI models and ML models is that the former follow rules explicitly hard-coded by humans, whereas ML models learn the rules themselves from data.
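To make the "if-then" approach from the table above concrete, here's a minimal sketch of a rule-based system in Python. The support-ticket scenario and keywords are hypothetical — the point is that every rule is written by a human and nothing is learned:

```python
# A minimal rule-based system: every rule is hard-coded by a human,
# and the outcome is fully predetermined -- no learning involved.
def route_support_ticket(message: str) -> str:
    """Route a ticket using explicit 'if-then' rules."""
    text = message.lower()
    if "refund" in text:
        return "billing"
    elif "password" in text or "login" in text:
        return "account-security"
    else:
        return "general"

print(route_support_ticket("I can't reset my password"))  # account-security
```

An ML model, by contrast, would infer routing rules like these on its own from examples of past tickets and their correct destinations.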
An AI system refers to the entire infrastructure and framework required for building and deploying AI.
While the AI model is a central component of an AI system, an AI system also includes data acquisition, hardware and software training resources, the user interface, and more.
Before we dive into more technical terminology, we think that it’s imperative for you to understand how AI is currently affecting — and could possibly affect — our world.
Many people are concerned about the rapid development of AI and ensuring that it is aligned with societal values and principles.
Some key terminology in this space includes:
Term | Definition |
---|---|
Alignment | Ensuring an AI system’s goals align with human values and interests |
Responsible AI | Ethical & responsible use of AI technology |
Explainability | Making AI models' decision-making process transparent |
Black Box | When an AI model's decision-making is not understood by humans |
Singularity | Hypothetical point when AI systems surpass human comprehension in a way that leads to unpredictable societal changes |
Alignment is the process of ensuring that an AI system’s goals align with human values and interests. It is a crucial aspect of the larger concept of responsible AI.
Responsible AI refers to the ethical and responsible use of AI technology — ensuring that AI systems are designed and implemented in a way that respects human rights, diversity, and privacy.
A critical approach to building responsible AI, explainability refers to making AI models — and how they make certain decisions — transparent and easy to understand.
A "black box" is when the internal workings and decision-making processes of an AI model are not easily understood or explained, even by the developers who created it.
"Black boxes" raise concerns related to trust and accountability.
Most responsible AI and alignment research focuses on ensuring that the integration of AI technology into society has a positive impact.
However, we must also plan for singularity, which is a hypothetical point in the future where AI systems become capable of designing and improving themselves without human intervention — surpassing human comprehension in a way that leads to rapid and unpredictable societal changes.
The continued progression of intelligence in AI systems is not only expected but, in some ways, is the goal.
So where is AI intelligence currently at… and, perhaps more importantly, where is it going?
Stage | Description |
---|---|
Artificial Narrow Intelligence (ANI) | We initially started with ANI, where AI systems were designed to perform specific tasks or sets of tasks, e.g., voice recognition or image classification. |
Artificial General Intelligence (AGI) | We are now arguably at AGI, where AI systems have human-level intelligence and can perform a wide range of tasks. |
Artificial Superintelligence (ASI) | We’re on our way to ASI, where AI systems surpass human intelligence and can perform tasks beyond human comprehension. Singularity falls into this category. |
Now that we understand AI’s broader societal implications, let’s explore the more technical aspects of how it all works.
To understand machine learning, it's helpful to think of it as a toolbox filled with different tools that each solve different problems.
Just like tools in a toolbox, there are various machine learning approaches (i.e., tools) — each with its own strengths and weaknesses. To get the result you desire, it’s crucial to use the right tool for the job.
Imagine you work for Amazon and need to build an AI model that recommends products to customers based on their past purchases.
To pull this off, you could choose any of the following machine learning approaches:
Approach | How the Model Learns |
---|---|
Supervised Learning | Learning via a labeled dataset, predetermined by humans |
Unsupervised Learning | Learning by identifying patterns in data without explicitly labeled outputs |
Reinforcement Learning | Learning via rewards or penalties based on the model's actions |
Deep Learning | Learning via a neural network (i.e., layered, interconnected nodes similar to the human brain) |
NOTE: These are the most common machine learning approaches, but they are not the only available options.
The most common approach is supervised learning.
Supervised learning involves teaching an ML model by providing a labeled dataset, predetermined by humans, so that the model can learn the relationship between inputs and the ideal outputs.
The model then makes predictions for new input data based on the patterns it learned from the labeled examples.
In our Amazon example, the inputs may be customer information and product ratings & reviews, and the output might be what the customer ended up purchasing. The model could then predict which products may interest a customer.
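Here's a minimal sketch of that idea using scikit-learn. The feature columns and labels are hypothetical stand-ins for the Amazon example (customer information in, "did they buy the recommended product?" out):

```python
# A minimal supervised-learning sketch: learn from human-labeled examples,
# then predict for unseen data. Features/labels are made up for illustration.
from sklearn.ensemble import RandomForestClassifier

# Each row: [age, past_purchases, avg_rating_given] -- labeled by humans
X_train = [[25, 3, 4.0], [34, 12, 4.5], [19, 1, 3.0], [45, 20, 4.8]]
y_train = [0, 1, 0, 1]  # 1 = customer bought the recommended product

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)               # learn the input -> output mapping

new_customer = [[30, 8, 4.2]]
print(model.predict(new_customer))        # predicted label for unseen data
print(model.predict_proba(new_customer))  # predicted probabilities
```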
Unsupervised learning is when a model identifies patterns in a dataset without any explicitly labeled outputs.
For example, using unsupervised learning, Amazon's recommendation engine could learn (on its own, without a human telling it) to group products that are frequently bought together or customers with similar purchasing patterns.
In other words, it could determine that customers who buy a laptop also tend to purchase a wireless mouse and a laptop case — even without you telling it to look for those patterns.
Unsupervised learning is particularly useful in situations where accurately labeling a large volume of diverse, intricate data would be a prohibitively time-consuming and expensive undertaking for a human to perform.
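A quick sketch of the idea with k-means clustering — the purchase counts below are invented, and notice that no labels are ever provided:

```python
# A minimal unsupervised-learning sketch: k-means discovers groupings
# in unlabeled purchase data. Feature columns are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [laptops_bought, accessories_bought] -- no labels given
purchases = np.array([[1, 3], [1, 2], [0, 0], [2, 4], [0, 1], [1, 3]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
print(kmeans.labels_)  # cluster assignments the model found on its own
```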
You could also use reinforcement learning, which refers to the model learning by receiving rewards or penalties based on its actions.
Reinforcement learning is akin to teaching the model to play a game in which it earns rewards for beneficial actions and incurs penalties for harmful ones, gradually learning a strategy that maximizes its total reward.
Reinforcement learning would be a good option if you have a clear feedback signal to learn from (in the Amazon example, customers clicking on or purchasing the recommended products).
There are multiple types of reinforcement learning, each with differences in how the model learns.
How It Learns | Human Involvement? | AI Model Example |
---|---|---|
Interacts with its environment | No | DeepMind's AlphaGo Zero |
Utilizes another model's feedback* | No | Anthropic's Claude |
A combination of traditional reinforcement learning and human guidance (called RLHF: Reinforcement Learning from Human Feedback)** | Yes | OpenAI's ChatGPT |
*By utilizing another AI model's feedback, the new model can benefit from the insights and experiences of the pre-existing model — leading to more accelerated learning and enhanced performance, particularly in tasks that require complex decision-making or understanding of intricate patterns in data.
**In tasks where human judgment is essential (such as natural language processing for chatbots), human guidance helps refine the quality of generated outputs.
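To make the reward-and-penalty loop tangible, here's a toy Q-learning sketch — the simplest form of reinforcement learning. The five-state "corridor" environment and all hyperparameters are illustrative inventions, not anything from a real recommendation system:

```python
# A toy reinforcement-learning sketch: tabular Q-learning on a tiny
# 5-state corridor where reaching the right end earns a reward.
import random

n_states, actions = 5, [-1, +1]      # actions: move left or move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(500):
    s = 0
    while s != n_states - 1:
        # Explore occasionally; otherwise take the best-known action
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else -0.01  # small step penalty
        # Q-learning update: nudge the estimate toward reward + future value
        best_next = max(Q[(s_next, b)] for b in actions)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the learned policy should say "move right" in every state
print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)})
```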
The most powerful of them all (and the approach Amazon has actually used to build its recommendation system) is deep learning.
Deep learning is the process of training a program called a neural network, which loosely mimics the human brain’s structure and function — consisting of many layers of interconnected nodes that work together to process data.
NOTE: We'll cover more about how neural networks work in the "Generative AI Architecture" section.
Now that we know the tools in the toolbox, let’s discuss how the tools can be used (i.e., the tasks the tools can perform).
While there are a growing number of ways machine learning is being used, the eight most common include:
Task | Description |
---|---|
Prediction | Predicts the likelihood of a certain outcome |
Classification | Categorizes data |
Natural Language Processing (NLP) | Processes language |
Computer Vision | Interprets visual data |
Speech Recognition | Transcribes speech into written text |
Anomaly Detection | Identifies unusual patterns |
Clustering | Identifies groupings in data |
Generation | Creates new content |
Fairly self-explanatory, prediction involves the AI model predicting the likelihood of a certain outcome, typically framed as probabilities.
Example Use Case: Social media ranking algorithms predict the probability that a user will click on a specific ad.
Classification refers to an AI model identifying patterns in the input data and then using those patterns to predict the category (or label) for new, unseen data points. It can be thought of as categorizing data.
For example, an AI model trained on images of different types of animals (labeled as "cat," "dog," "bird") could then be used to classify a new image and predict whether it shows a cat, dog, or bird.
Example Use Case: Classification powers many e-commerce tagging systems, which categorize products using keywords or labels.
Natural language processing (NLP) focuses on an AI model understanding and processing language.
NLP is a key function of large language models (LLMs), which are models that process and generate human language (such as ChatGPT). NOTE: We’ll cover more about LLMs in the “A Deep Dive Into LLMs” section.
Example Use Cases: NLP is often used in customer service to analyze sentiment of customer feedback and in industry research to analyze large volumes of text data to extract insights.
Computer vision refers to an AI model analyzing and interpreting visual data (e.g., images or videos) from cameras or sensors.
Example Use Cases: AI models can leverage computer vision to identify defects in products using visual data, enabling efficient quality control in manufacturing.
The term speech recognition is a bit misleading since it involves both recognizing speech and transcribing it into written text.
Example Use Case: Speech recognition is one of the most widely used consumer use cases of AI today… ahem, Siri.
Anomaly detection identifies unusual or abnormal patterns in data.
Example Use Case: AI models can detect & prevent cyber attacks by identifying unusual network activity.
Clustering involves identifying groupings and patterns in data without explicitly defining the criteria.
Example Use Case: Netflix leverages clustering to provide personalized movie recommendations.
Generation, often referred to as generative AI (or 'GenAI'), involves using AI to create new data or content.
Example Use Case: Creating unique graphics & designs based on existing patterns, styles, & more.
Generative AI is seeing an explosion of use cases — and it's particularly exciting when combined with other tasks.
NOTE: For these reasons, generative AI is the crux of what we’ll cover in the remainder of this article.
To understand generative AI, it's helpful to familiarize yourself with some essential terms.
Term | Description |
---|---|
Modality | The type of data being processed or generated |
Input | The data provided to an AI system to explain a problem, situation, or request |
Prompt | The interaction of a human providing a model with information |
Inference | The process of a model applying training data to generate a result |
Completion/Output | The response a model generates |
Generative AI can be applied across a variety of modalities, which refers to the type of data being processed or generated (e.g., text, image, video, code, speech, music, 3D model, and more).
An input is the data (e.g., text, images, sensor data, or many other types of relevant information) provided to an AI system to explain a problem, situation, or request.
Inputs are fundamental throughout the entire lifecycle of an AI model — from training to deployment and usage.
A prompt (which is a type of input) is an interaction between a human and an AI model that provides the model with sufficient information to generate the user’s intended output.
Prompts can take various forms (e.g., questions, code snippets, images, or videos). While the most common prompts today are text prompts, a prompt can be in any modality.
A few prompting-related terms you might hear include:
Term | Description |
---|---|
Text-to-Image | Generating an image from a text description (i.e., a text prompt) |
Text-to-Video | Generating a video from a text description (i.e., a text prompt) |
Image-to-Image | Generating an image using another image as the prompt |
An inference is the process of an AI model applying the information it learned during training to generate an actionable result (e.g., generating an image).
A completion (also called an ‘output’) refers to the response a model generates — whether that be text, an image, or other modality.
Generative AI models use tokens, vectors, and embeddings to understand inputs and generate completions/outputs.
Term | Description |
---|---|
Token | A fundamental unit of data that represents words, pixels, etc. |
Vector | A mathematical representation of a token |
Embedding | A 'supercharged' vector that captures meaning between tokens |
A token is the smallest unit of data used by AI models to process inputs (including, but not limited to, prompts) and generate outputs.
Tokens represent elements such as words or pixels, depending on the modality.
For example, in the sentence "Apple is a fruit", each word ("Apple," "is," "a," "fruit") is a token.
A vector is a mathematical representation of a token.
Each token gets its own set of numbers that represent its meaning and context — enabling the model to interpret the token in a 'language' it understands.
Embeddings are like supercharged vectors that not only represent tokens but also capture deep meanings and relationships between tokens.
Embeddings help AI models better understand the nuances and overall context of the data.
How Tokens, Vectors, & Embeddings Relate To Each Other
Breaking down complex data into small, manageable tokens and then converting them into numerical vectors that machine learning algorithms can more easily process enables AI models to analyze, comprehend, and generate content effectively.
While vectors are suitable for tasks where the focus is on numerical operations and straightforward data representation, embeddings are required for tasks in which the AI model needs to learn complex patterns or understand subtle nuances and relationships between data, such as natural language processing (NLP) and computer vision.
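Here's a minimal sketch of the whole token → vector → embedding pipeline. The tiny vocabulary and 3-dimensional embedding values are invented for illustration — real models learn embeddings with hundreds or thousands of dimensions:

```python
# Token -> vector -> embedding, end to end, on a toy vocabulary.
import numpy as np

sentence = "Apple is a fruit"
tokens = sentence.split()            # 1) break the text into tokens

vocab = {"Apple": 0, "is": 1, "a": 2, "fruit": 3, "banana": 4}
ids = [vocab[t] for t in tokens]     # 2) map each token to a number

# 3) look up each token's embedding (learned during training in practice)
embeddings = np.array([
    [0.9, 0.1, 0.8],   # Apple
    [0.1, 0.9, 0.1],   # is
    [0.1, 0.8, 0.2],   # a
    [0.8, 0.2, 0.9],   # fruit
    [0.9, 0.2, 0.7],   # banana
])
vectors = embeddings[ids]

# Tokens with similar meanings end up close together in embedding space
def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(embeddings[0], embeddings[4]))  # "Apple" vs "banana": high
print(cosine(embeddings[0], embeddings[1]))  # "Apple" vs "is": lower
```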
AI architecture involves the AI system’s underlying infrastructure needed to develop, train, deploy, use, and manage AI models — including:
Component | Examples |
---|---|
Hardware | E.g., GPUs & TPUs |
Software | Including ML frameworks & libraries |
Model Design | Such as neural network architecture |
Model Behavior | Involving distinct training objectives & data |
When creating an AI system, the first step after choosing an AI model (e.g., rule-based system or machine learning) is to acquire the computing hardware needed to run it.
Computing power is an important aspect of machine learning; it refers to the capability of hardware systems to efficiently perform the complex computations required for training and running machine learning models.
The amount of computing power plays a crucial role in the performance of machine learning algorithms — especially deep learning models, which rely on vast amounts of data and computations.
The availability of high-end computational resources enables the rapid advancement of deep learning models by facilitating parallel processing and faster computations.
High-end computational resources needed for machine learning include:

Resource | Purpose |
---|---|
Graphics Processing Units (GPUs) | Used for training & generating outputs |
Tensor Processing Units (TPUs) | Specialized chips used for speeding up training & enhancing performance |
Graphics Processing Units (GPUs)
GPUs are typically used to render realistic graphics and visuals in modern video gaming and virtual reality systems — but they are also highly prominent in AI development for rapidly processing large datasets and complex algorithms.
Someone building an AI model could either purchase their own GPUs or rent computing power from a cloud provider.
To help power OpenAI’s ChatGPT, Microsoft spent hundreds of millions of dollars building a massive supercomputer consisting of NVIDIA A100 chips — the most powerful chips on the market at the time.
Tensor Processing Units (TPUs)
Developed by Google, TPUs are specialized AI accelerators that increase the speed of training and deployment as well as the overall performance of machine learning models.
This boost of speed and performance makes them especially well-suited for generative AI tasks like image and text generation.
The two most common ways to interact with an AI model are through a web interface provided by a company or by running the model directly on your own computer.
In cases where you're using an AI model via a web interface provided by a company (e.g., ChatGPT, which is provided by the company, OpenAI), the only software you really need is a web browser (e.g., Chrome, Safari, etc.).
The web browser acts as the client software, allowing you to send queries and receive responses from ChatGPT without needing to install any additional machine learning libraries or frameworks on your local machine.
However, if you want to run an AI model locally and work with it directly, you’ll need to install additional software, such as a machine learning framework and its supporting libraries.
Why Would You Want To Work With An AI Model Directly?
For reasons such as privacy concerns, faster processing speeds, offline access, or the need to customize or fine-tune the model for specific requirements. By running the model locally, you have more control over the data — ensuring that sensitive information is not sent to external servers.
Open Source
Open source means that an AI model has been made freely available for anyone to use, modify, and distribute — allowing for greater collaboration, transparency, and innovation by enabling developers and researchers to access and build upon the existing model.
For context, Grok is an open-source model while ChatGPT is not (as of April 2024).
Machine Learning Framework
A machine learning framework is a tool that simplifies the development of AI models without requiring software developers, data scientists, and machine learning engineers to delve into the complex underlying algorithms.
Machine learning frameworks offer a range of functionalities to facilitate and streamline model building and training — catering to various needs and preferences.
Library
In the context of machine learning, a library refers to a collection of pre-written code that provides functions and tools for building and implementing machine learning models.
Prominent open-source machine learning frameworks include PyTorch (from Meta) and TensorFlow (from Google).
A popular library is Hugging Face's Transformers — a state-of-the-art machine learning library designed for PyTorch, TensorFlow, and more. It provides tools to easily download and train pre-trained models, reducing compute costs (i.e., computing power) and the time required to train models from scratch.
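To show how little code this takes, here's a minimal sketch using the Transformers library to download and run a small pre-trained model (GPT-2) locally. It assumes you've installed the `transformers` package plus a backend like PyTorch, and the first run downloads the model weights:

```python
# Run a small pre-trained language model locally via Hugging Face Transformers.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("The sky is", max_new_tokens=5)
print(result[0]["generated_text"])
```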
There are many ways to design a model — with one of the most prominent being the design of neural network architecture, which refers to the configuration of a neural network.
What's a Neural Network?
A neural network is a type of machine learning process that teaches machines to process data in a way that mimics the human brain.
By using adaptive systems encompassing interconnected nodes in a layered structure, neural networks enable machines to understand complex relationships and patterns as well as learn from their mistakes and improve.
Part of what makes neural networks so powerful is their ability to translate data into a numerical representation and then meaningfully interpret the data via embeddings (which we covered earlier in the “Key GenAI Terminology” section).
If you were creating an AI system, you may choose to go with a neural network architecture if you wanted the model to perform a complex task where relationships within the data might be non-linear and intricate, such as image recognition, language understanding, or personalized recommendations.
Transformers are currently one of the most discussed AI architectures. You may have heard of them because they are the “T” in GPT (Generative Pre-trained Transformer) — which powers OpenAI’s ChatGPT.
Introduced in 2017, a transformer is a type of neural network architecture that is based on the self-attention mechanism, which allows the model to pay attention to different parts of the input data simultaneously (as opposed to one element at a time) as it learns.
This ability helps it develop a deeper understanding of its training data by learning more complex relationships within the data. Due to their flexibility and power, transformers have been widely adopted for training large language models (LLMs) on extensive datasets.
NOTE: Transformers are themselves a type of neural network architecture, but their influence isn't limited to any single model design; they have become a foundational component of many state-of-the-art AI systems.
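To demystify "self-attention," here's a bare-bones NumPy sketch of the scaled dot-product attention at the heart of transformers. The shapes are toy-sized and the weights are random stand-ins for what a real model would learn:

```python
# Scaled dot-product self-attention on 3 toy tokens.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# 3 tokens, each represented by a 4-dimensional vector
X = np.random.rand(3, 4)

# In a real transformer, W_q, W_k, W_v are learned weight matrices
W_q, W_k, W_v = (np.random.rand(4, 4) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Every token attends to every other token simultaneously
scores = Q @ K.T / np.sqrt(K.shape[-1])  # relevance of token j to token i
weights = softmax(scores)                # each row sums to 1
output = weights @ V                     # weighted mix of all token values

print(weights.round(2))  # the attention pattern across the 3 tokens
```

This all-at-once attention pattern is exactly what lets the model weigh every part of the input simultaneously rather than reading one element at a time.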
Generative Adversarial Networks (GANs) are a type of algorithm that pits two neural networks against each other to improve the quality of the generated data (i.e., the two neural networks are trained simultaneously in a competitive manner).
The two networks include a generator, which creates synthetic data, and a discriminator, which evaluates whether a given piece of data is real or generated.
The networks iterate until the generator becomes adept at producing data that the discriminator struggles to distinguish from real data.
GANs can be used in a variety of applications (such as generating realistic images) and allow models to generate realistic data that can be used for training other machine learning models.
Why would we want a model to train another model? Training one machine with another can improve the overall performance and efficiency of the AI system. By using a well-trained machine to generate realistic data for training other models, we can potentially improve the accuracy and generalization capabilities of those models. Additionally, it can reduce the amount of manual effort required to collect and label large datasets, making the training process faster and more cost-effective.
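Here's a condensed sketch of one GAN training step in PyTorch. The network sizes and the "real" data distribution are placeholders — real GANs are far larger — but the two-step adversarial loop is the genuine pattern:

```python
# One adversarial training step: discriminator first, then generator.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))  # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

real_data = torch.randn(64, 2) + 3.0  # stand-in for "real" samples

# 1) Train the discriminator to tell real from fake
fake_data = G(torch.randn(64, 8)).detach()   # don't update G on this step
d_loss = loss_fn(D(real_data), torch.ones(64, 1)) + \
         loss_fn(D(fake_data), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) Train the generator to fool the discriminator
fake_data = G(torch.randn(64, 8))
g_loss = loss_fn(D(fake_data), torch.ones(64, 1))  # "claim the fakes are real"
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Repeating these two steps many times is the "competition": the discriminator keeps getting better at spotting fakes, which forces the generator to keep producing more convincing data.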
How Transformers & GANs Work Together
Many of today’s AI image generation models combine the strengths of transformers and GANs to generate images from text descriptions. The text input is first processed by a transformer, which then conditions the GAN to generate the corresponding image.
While a model may leverage transformer architecture and incorporate some GAN-like elements, it's the training process that shapes the model's specific behavior. To influence a model's behavior, it may be given distinct training objectives (what the model is optimized to do) and distinct training data (what the model learns from).
Constitutional AI and diffusion models are great examples of this.
Constitutional AI models are embedded with a set of ethical guidelines (such as avoiding harm, respecting preferences, and providing accurate information) within their functioning so that they produce harmless results.
Training Objective: May be trained to critique and revise its own responses so they comply with its set of guiding principles (its "constitution")
Training Data: May include feedback, often generated by another AI model, that rates responses against those principles
Anthropic’s Claude is a large language model (LLM) that’s powered by constitutional AI.
Currently the most popular option for image and video generation, diffusion models offer realistic outputs by degrading and reconstructing data systematically: noise is progressively added to the training data until it becomes unrecognizable, and the model learns to reverse that process, reconstructing realistic data from noise step by step.
Training Objective: May be trained to predict (and remove) the noise that was added to an image at each step
Training Data: May be trained on large datasets of images, often paired with text descriptions
DALL-E is an AI image generator that utilizes a diffusion model.
Training and deploying AI models happens in four phases:
Phase | Purpose |
---|---|
Phase 1: Pre-Training | Creates a foundation (or 'base') model |
Phase 2: Customization | Tailors a model to perform a specific task |
Phase 3: Deployment | Makes the model available for use in a real-world application |
Phase 4: Refinement | Improves the model's behavior & outputs |
Most generative AI models are pre-trained on large datasets of complex, unstructured data. Unstructured data refers to data that is not organized in a predefined manner.
Why train an AI model on unstructured data as opposed to structured data?
Unstructured data (such as raw text from websites, books, articles, or images from the internet) is more abundant and inherently diverse — providing a wealth of human knowledge, language, and visual information. This diversity is crucial for developing models with a broad understanding and the ability to generalize across a wide range of tasks.
This first level of training creates what’s called a “foundation model” (also referred to as a “base model”).
Foundation models are designed to be general-purpose (i.e., capable of performing a diverse array of tasks by encompassing a broad spectrum of knowledge and capabilities).
However, this inherent generality can render them less adept at specific tasks compared to models that are tailored for those particular functions.
For example, a foundation model may be good at predicting words, but it would not be good at performing the tasks we want it to do, such as following our instructions (aka prompts) or chatting with us (like ChatGPT).
To tailor an AI model to excel at specific tasks, it needs further training — which may entail specialized prompting, fine-tuning, and/or adjusting settings like temperature (each covered below).
NOTE: These training methods are not mutually exclusive. A model can be trained via only one method or with all methods. However, because training AI models is computationally expensive, typically only some parameters (i.e., variables within a model) are updated when introducing new data or optimizations — and even that “refresh” process can be quite expensive.
Specialized prompts guide a model's existing knowledge and capabilities toward a specific task or output format without altering the model's original state.
A few types of specialized prompting methods include zero-shot, few-shot, and chain-of-thought prompting.
AI models can also be further customized for specific tasks using a process called fine-tuning, which involves exposing the model to a new dataset.
As opposed to prompt-related training, where the model’s original state remains unchanged, fine-tuning does alter the model’s original state via additional training to improve the model’s performance on the new task.
However, fine-tuning runs a risk of overfitting the model — a common machine learning issue where an AI model becomes too specialized on a subset of the data it’s fed. This can result in a lack of generalizability and poor performance on new data.
Therefore, it’s crucial to strike a balance between customization and generalizability to ensure that the model works well for the specific use case while still being adaptable to new scenarios.
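For a sense of what fine-tuning looks like in practice, here's a hedged sketch using Hugging Face's Trainer API. The two-example sentiment dataset is a toy placeholder (real fine-tuning needs far more data), and running it downloads the `bert-base-uncased` foundation model:

```python
# Fine-tuning sketch: further training alters the model's original weights.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
examples = {"text": ["great product", "terrible quality"], "label": [1, 0]}
ds = Dataset.from_dict(examples).map(
    lambda b: tok(b["text"], truncation=True,
                  padding="max_length", max_length=16),
    batched=True)

# Start from a pre-trained foundation model...
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# ...then expose it to the new, task-specific dataset
args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=2)
Trainer(model=model, args=args, train_dataset=ds).train()
```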
Unlike fine-tuning, which alters a model’s underlying knowledge or capabilities, temperature is more about influencing the model's output behavior.
By design, generative AI models produce an output based on what is the most likely completion of the task. Temperature adds more randomness into the completion process — helping the model explore different possibilities and generate more creative and varied responses.
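Here's a minimal sketch of how temperature reshapes a model's next-word probabilities. The logits (raw scores) below are made-up values for four candidate words:

```python
# Temperature scaling: divide logits by T before the softmax.
import numpy as np

logits = np.array([4.0, 2.5, 1.0, 0.5])  # e.g., "blue", "clear", "vast", "angry"

def softmax_with_temperature(logits, temperature):
    scaled = logits / temperature
    e = np.exp(scaled - scaled.max())
    return e / e.sum()

print(softmax_with_temperature(logits, 0.5))  # low temp: top choice dominates
print(softmax_with_temperature(logits, 1.5))  # high temp: more varied output
```

At low temperature the most likely completion wins almost every time; at high temperature the probabilities flatten out, so sampling produces more surprising, creative responses.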
Once the model has been trained and customized, it is deployed, i.e., made available for use in a real-world application (such as through a web interface or an API).
Continuous monitoring and feedback loops track the model's performance and ensure that it remains effective over time.
This could involve refining algorithms, retraining the model with fresh data to keep it up-to-date, adding new features, implementing new techniques to handle specific tasks or scenarios better, and more.
While generative AI encompasses a broader range of tasks beyond language generation (including image and video generation, music composition, and more), large language models (LLMs) are specifically designed for tasks revolving around natural language generation and comprehension.
LLMs operate by using extensive datasets to learn patterns and relationships between words and phrases — enabling them to generate coherent and contextually relevant text outputs based on given prompts or inputs.
Despite their impressive capabilities, LLMs can still generate incorrect, outdated, or entirely fabricated information.
NOTE: To ensure accuracy, you should use credible sources to verify information generated by LLMs.
Although LLMs showcase a wide range of capabilities, they also face and present certain challenges.
Challenge | Description |
---|---|
Training Cost | Requires expensive computational resources |
Fine-Tuning Limits | Doesn't expand knowledge or improve understanding |
Data Shortage | Lack of access to high-quality data |
Limited Context Windows | Struggles to maintain coherence over larger amounts of data |
Hallucinations | Generation of incorrect or nonsensical information |
Latency & Cost Tradeoffs | Faster responses are more expensive |
Training Dates | Lack of up-to-date knowledge |
Because LLMs demand considerable computational resources, training cost is a significant concern.
Compounding this is the concern about a possible GPU shortage, which would make it even more challenging for organizations to secure the necessary hardware.
The availability of foundation models helps mitigate training costs. However, the costs of inference and fine-tuning these models for specific tasks still persist — creating a potential bottleneck for even the foundation model providers themselves.
While fine-tuning can improve an AI model’s performance at certain tasks (like summarizing), it's not as effective for expanding the AI's knowledge base with entirely new information or improving its understanding of facts.
Instead, adding more varied and targeted data during training (i.e., data augmentation) is currently seen as a better approach.
However, this is still an area where researchers are actively exploring and learning, so there aren't definitive answers yet.
Context and knowledge are essential for models to perform well — making sourcing more high-quality data a challenge that’s very top of mind.
It’s unclear where this higher-quality data will come from.
One theory is that richer data will come from state-of-the-art speech-to-text transcription tools (such as OpenAI's Whisper), which make it easier to analyze and utilize data from sources such as podcasts, interviews, speeches, and other audio recordings — enabling access to a wider range of data sources that were previously inaccessible or difficult to analyze.
The amount of text that an LLM can “pay attention” to at a time — known as its context window — is a major constraint that you may have experienced yourself when using an LLM.
Limited context windows increase the likelihood that the model struggles to maintain coherence over longer passages of text and forgets important details mentioned earlier in a conversation, affecting the relevance and memory of its outputs.
While context windows have been expanding, retrieval-based systems — which enable a model to search through a large external collection of documents or data — might be needed to give models access to a larger pool of relevant information to draw upon.
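Here's a toy sketch of the retrieval idea: embed a small document collection, find the passage most relevant to the user's question, and feed only that passage into the prompt. The documents and the 2-D "embeddings" are invented for illustration:

```python
# Toy retrieval: pick the most relevant document, then build the prompt.
import numpy as np

docs = ["Refund policy: 30 days.", "Shipping takes 5 days.", "We ship worldwide."]
doc_vecs = np.array([[0.9, 0.1], [0.1, 0.9], [0.2, 0.8]])  # pretend embeddings

query_vec = np.array([0.95, 0.05])  # pretend embedding of "How do refunds work?"

# Cosine similarity between the query and every document
sims = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
best = docs[int(np.argmax(sims))]

prompt = f"Answer using this context: {best}\n\nQuestion: How do refunds work?"
print(prompt)
```

Because only the most relevant snippet is inserted, the model can effectively draw on a knowledge base far larger than its context window.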
Hallucinations refer to large language models’ tendency to generate incorrect or nonsensical information in order to complete the task at hand (e.g., an LLM might generate sentences that don't make sense) — posing challenges to the reliability of LLMs.
NOTE: Because this is such a large problem, this is a very active area of research.
In the world of LLMs, latency (i.e., how quickly you get a response) and cost are directly opposed to each other — meaning that faster responses are more expensive, and reducing costs may lead to slower responses.
When companies use LLMs for critical parts of products, they may face a problem because slow responses can make their product ideas impractical.
For example, if a chatbot takes too long to respond, users may lose interest or go to a competitor's faster service.
AI model providers (like Google for Gemini) currently make these decisions for us. However, we can expect more granular control in the future, which would provide more flexibility for optimizing interactions with AI models. And, in fact, ChatGPT is already offering this — allowing users the ability to toggle features like web browsing and DALL-E options on and off.
Companies and products that need low latency (i.e., fast responses) typically get more control by opting to use an open-source, self-hosted model. This is because, by self-hosting, companies avoid latency issues that may arise from using third-party hosting services, and open-source software allows for greater flexibility to tailor systems to meet a company’s unique requirements.
Because of how LLMs are currently trained, the model is unlikely to be fully up to date — lacking knowledge of recent events, trends, or technological advancements, which limits their relevance and accuracy in certain applications.
However, some applications or systems built using LLMs (such as Gemini and Perplexity) may incorporate real-time information by integrating with external APIs or data sources. In such cases, the integration of real-time data sources is handled by the system built around the LLM rather than by the LLM itself.
LLMs are trained to learn general language patterns, grammar, and semantic relationships between words and phrases. Just like other generative AI, training an LLM involves pre-training on a massive dataset followed by customization for specific tasks.
During pre-training, the LLM is trained on an extensive dataset with the objective of predicting the next word in a sentence given the previous words.
For example, when given the input "The sky is", the LLM is trained to predict the next word: "blue".
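Here's a toy sketch of that pre-training objective in code. The vocabulary and probabilities are invented — the point is that the model is rewarded (via a low loss) when it assigns high probability to the true next word:

```python
# Next-word prediction as cross-entropy: low loss when the model is right.
import numpy as np

vocab = ["blue", "green", "loud"]
context = "The sky is"

predicted_probs = np.array([0.70, 0.25, 0.05])  # model's guess per word
target = vocab.index("blue")                    # the actual next word

loss = -np.log(predicted_probs[target])  # cross-entropy loss
print(f"P(next word = 'blue' | '{context}') = {predicted_probs[target]:.2f}")
print(f"training loss = {loss:.3f}")
```

Repeated over billions of sentences, minimizing this loss is what gradually teaches the model grammar, facts, and style.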
OpenAI’s GPT-3.5 was reportedly trained on books, Wikipedia, and a substantial portion of the internet (known as "Common Crawl") — illustrating the vastness and intricacy of its training data.
An LLM goes from predicting the next word in a sentence to understanding complex text and semantic meaning through further training and fine-tuning on a large dataset of text.
Additionally, LLMs are equipped with attention mechanisms that help them focus on the most relevant parts of the input text — enabling them to develop a deeper understanding of sophisticated language by learning more complex relationships within the text data.
This process of fine-tuning allows the model to learn patterns, relationships, and contextual information within the text — enabling it to generate coherent, human-like responses.
AI research teams like OpenAI are actively working to make LLMs as easy to use and helpful as possible. The progress from the first iteration of ChatGPT to GPT-4 clearly shows this.
The first version feels noticeably more like auto-complete, whereas GPT-4 offers a more interactive and conversational experience — making it feel as if you are having a helpful conversation with an assistant.
It’s undeniable that AI has revolutionized the way businesses operate and the way individuals interact with technology.
Two terms currently gaining prominence in AI are automation and agents. But before we delve into those, it’s essential to define the concepts of:
Term | Description |
---|---|
Low-Code/No-Code | A type of programming that requires zero (or very little) traditional coding knowledge |
Application Programming Interfaces (APIs) | A way for programs (like tools, software, or apps) to communicate with each other |
NOTE: While these next terms aren’t directly considered AI terminology, you’ll encounter them when learning to apply AI and will benefit from a base understanding of what they each mean and their relation to AI.
Low-Code/No-Code
Businesses often use low-code/no-code tools to quickly create custom applications without the need for a large development team while still having some level of customizability and control. A popular example is Zapier, which we'll touch on below.
Application Programming Interface (API)
APIs are the cornerstone of the low-code/no-code space because most of these tools (like Zapier) primarily connect various APIs and provide a simple user interface for businesses to integrate different applications and services into their own operations easily.
Automation is creating a new level of intelligent workflows that transform how your business operates — turning complex challenges into manageable solutions by delegating tasks to machines.
Automation tools streamline processes by linking data across various tools and needing minimal ongoing manual intervention once established. The automation tools available range from user-friendly low-code/no-code platforms to more sophisticated systems that offer extensive customization for experienced developers.
The key differentiator between each of the three types of automation lies in their independence and cognitive capabilities.
Type of Automation | How It Works | Level of Human Involvement | Tasks |
---|---|---|---|
Traditional Automation | Predetermined rules | High — needs explicit instructions | Menial, repetitive tasks |
AI Automation | Learns from data to make human-like decisions | Medium — moderate amount of setup | Data analysis & decision-making |
AI Agents | Makes decisions autonomously | Low — needs only an end goal | Complex problem-solving & interactions |
Traditional Automation: Traditional automation is designed to execute repetitive tasks based on explicit, predefined rules determined by humans and does not learn or evolve.
AI Automation: AI automation uses artificial intelligence capabilities like machine learning and natural language processing (NLP) to enable machines to learn from data and experiences, recognize patterns, and make human-like decisions. AI automations do adapt and evolve over time.
AI Agents: Similar to AI automation, AI agents are designed to perceive their environment and make human-like decisions. However, unlike AI automations, AI agents take autonomous actions (i.e., without needing any human input).
Automation involves six key concepts:
Concept | Description |
---|---|
Workflow | A sequence of steps to complete a task |
Triggers | Events that initiate the workflow |
Inputs | The data required for the automation to work |
Logic | The rules that determine what happens within a workflow |
Action | The steps taken by the automation |
Output | The ultimate result of an automation |
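To tie the six concepts together, here's a minimal sketch that maps each one onto a line of code. The "new customer email" scenario and field names are hypothetical:

```python
# One tiny workflow: trigger -> input -> logic -> action -> output.
def on_new_email(email):                     # Trigger: an event starts the flow
    subject = email["subject"]               # Input: data the automation needs
    if "refund" in subject.lower():          # Logic: rules that route the flow
        ticket = {"queue": "billing", "body": email["body"]}   # Action
    else:
        ticket = {"queue": "general", "body": email["body"]}   # Action
    return ticket                            # Output: the end result

# The whole function is the Workflow -- the full sequence of steps
print(on_new_email({"subject": "Refund request", "body": "Please refund order"}))
```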
APIs play a crucial role in automation by facilitating the real-time exchange of data between different software (such as customer databases, CRM platforms, and social media analytics tools) within an automation workflow.
Almost every automation tool on the market is essentially an API wrapper, which refers to software that provides a more user-friendly way to work with an API by removing the complexity of directly interacting with the API.
You can think of an API wrapper as a translator that enables you to use APIs without getting lost in the technical details or as a middleman that takes care of the complex interactions needed to use APIs.
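Here's a minimal sketch of that "translator" idea in Python using the `requests` library. The URL, endpoint, and response fields are hypothetical placeholders for whatever API you're wrapping:

```python
# A tiny API wrapper: one friendly method hides the URL, auth, and parsing.
import requests

class WeatherAPI:
    """Wraps a (hypothetical) weather API behind a simple interface."""
    BASE_URL = "https://api.example.com/v1"  # hypothetical API

    def __init__(self, api_key):
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def current_temperature(self, city):
        # The caller never touches endpoints, headers, or JSON parsing
        resp = requests.get(f"{self.BASE_URL}/weather",
                            params={"city": city}, headers=self.headers)
        resp.raise_for_status()
        return resp.json()["temperature"]

# weather = WeatherAPI("my-secret-key")
# print(weather.current_temperature("Austin"))  # no API details needed
```

Tools like Zapier do exactly this at scale: the complexity of each underlying API is hidden behind a point-and-click interface.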
Read Next | Description |
---|---|
Automation through AI | Delve deeper into how automation works (with examples!) |
AI Evolution | Explore the progression of AI — from its earliest foundations to tomorrow's innovations |
AI Glossary | Discover even more key AI terminology |