In the past decade, Artificial Intelligence (AI) has gone through a radical evolution. At the heart of this revolution are foundation models: large, general-purpose models that learn from massive datasets and can be adapted to many tasks with minimal extra training. The most famous examples are Large Language Models (LLMs), such as GPT-4, which have transformed the way we interact with machines through natural language.
What if we could apply that same idea, not only to language, but also to brain signals?
Welcome to the world of EEG foundation models!
What are EEG Foundation Models (EEG-FMs)?
Just as LLMs are trained on massive amounts of text to learn the structure of language, EEG foundation models (EEG-FMs) are trained on large-scale electroencephalography (EEG) data to learn the structure of cognition.
These models aim to develop a general understanding of EEG signals across different subjects, tasks, and mental states. Once trained, they can be fine-tuned for specific applications, like mental workload estimation, emotional state detection, attention tracking, or BCI control, with minimal calibration.
EEG-FMs are built using deep learning architectures similar to those used in Natural Language Processing (NLP), such as:
- Encoder models that turn raw EEG into compact, meaningful representations.
- Encoder-decoder models that enable reconstruction, translation between modalities, or temporal predictions.
- Contrastive learning strategies that align EEG with behavioral or contextual labels.
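A common first step shared by these encoder architectures is slicing the continuous multichannel signal into fixed-length patches and projecting each one into an embedding vector, much as text is tokenized before entering an LLM. Here is a minimal NumPy sketch of that patching step; the channel count, patch length, and embedding size are illustrative assumptions, not values from any specific model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 32 channels, 1000 samples (e.g. 4 s at 250 Hz)
n_channels, n_samples = 32, 1000
patch_len, embed_dim = 200, 64           # 200-sample patches -> 5 patches per channel

eeg = rng.standard_normal((n_channels, n_samples))

# 1. Slice each channel into non-overlapping patches ("tokens")
n_patches = n_samples // patch_len
patches = eeg[:, :n_patches * patch_len].reshape(n_channels, n_patches, patch_len)

# 2. Project every patch to an embedding with a linear map
#    (random here; a trained model would learn these weights)
W = rng.standard_normal((patch_len, embed_dim)) / np.sqrt(patch_len)
tokens = patches @ W                     # shape: (channels, patches, embed_dim)

print(tokens.shape)                      # (32, 5, 64)
```

The resulting token grid is what the attention layers of an EEG encoder would operate on.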
EEG-FMs vs. LLMs
EEG-FMs share core principles with LLMs. Both are built on the idea of pretraining on vast and diverse datasets to learn general-purpose representations. This foundational knowledge can then be fine-tuned or adapted for specific downstream tasks. Just as LLMs have become versatile tools for language understanding and generation, EEG-FMs have the potential to become “general-purpose brains”, capable of decoding, interpreting, and interacting with human cognitive states across a wide range of neurotechnological applications.
| Concept | LLMs | EEG-FMs |
| --- | --- | --- |
| Input | Text sequences | EEG time series |
| Learning | Language structure | Neural signal structure |
| Pre-training objective | Next-word prediction, masked modeling | Reconstruction, contrastive learning, self-supervised decoding |
| Output | Language embeddings | Brain state embeddings |
| Applications | Chatbots, assistants, search engines, translation, speech-to-text, text-to-speech | BCIs, cognitive state detection, neurofeedback, mental health and research tools |
Why are EEG-FMs a Big Deal?
EEG data is notoriously noisy, highly individual, and task-specific. Traditionally, every new application or user required training a new model from scratch, or at least collecting a good amount of user-specific data.
Foundation models hold the potential to change this game entirely by offering:
- Generalization across tasks and subjects
- Few-shot or zero-shot performance: adaptation with just a handful of labeled examples, or none at all
- Plug-and-play potential for neurotech devices
- Adaptive, real-time neurofeedback
- Personalized brain-aware applications
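The few-shot idea above boils down to freezing a pretrained encoder and fitting only a tiny task head on a handful of calibration trials. The sketch below illustrates this with a stand-in "encoder" (a fixed random projection) and a nearest-centroid classifier on synthetic two-class data; every dimension, offset, and function name here is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def encoder(x, W):
    """Stand-in for a frozen pretrained EEG encoder: linear map + mean pooling."""
    return np.tanh(x @ W).mean(axis=0)   # (time, channels) -> (embed_dim,)

embed_dim = 16
W = rng.standard_normal((64, embed_dim)) / 8.0   # "pretrained" weights (random here)

# Few-shot calibration: only 3 labeled trials per class instead of a full session.
# Classes are simulated as baseline noise vs. noise with a constant offset.
classes = {0: 0.0, 1: 3.0}               # class label -> synthetic signal offset
support = {c: [encoder(rng.standard_normal((100, 64)) + off, W) for _ in range(3)]
           for c, off in classes.items()}
centroids = {c: np.mean(embs, axis=0) for c, embs in support.items()}

def predict(trial):
    """Assign the class whose few-shot centroid is nearest in embedding space."""
    z = encoder(trial, W)
    return min(centroids, key=lambda c: np.linalg.norm(z - centroids[c]))

test_trial = rng.standard_normal((100, 64)) + 3.0   # drawn from class 1
print(predict(test_trial))
```

Swapping the random projection for a genuinely pretrained encoder is what would make this calibration step meaningful in practice.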
In essence, EEG-FMs pave the way for brain-aware systems that just work, systems that are robust, adaptive, and ready to use with minimal setup. In practice, this means eliminating many of the traditional hurdles in EEG-based applications: no more lengthy calibration sessions, no need for subject-specific training data, and no fragile classifiers that fall apart when the context or task changes.
For real-world use, this could dramatically improve brain-computer interfaces (BCIs), making them more accessible for assistive technologies, neurofeedback training, and even consumer wellness tools. BCI systems have long promised seamless interaction between brain and machine, but the reality has been slower and messier, largely due to the variability and fragility of EEG-based models. Foundation models could finally deliver on that promise. Imagine putting on a headband and instantly accessing attention tracking, mood assessment, or hands-free control, without having to retrain the system every time. This shift could unlock new applications in mental health, education, productivity, gaming, and accessibility, making EEG truly scalable and practical.
In research, EEG-FMs could accelerate experimentation by offering reliable, generalizable representations across subjects, datasets, and tasks. This could lead to more reproducible findings in cognitive neuroscience, faster development cycles in clinical studies, and new ways to study mental states in naturalistic settings.
Whether decoding mental workload, detecting early signs of neurological disease, or exploring the neural basis of consciousness, foundation models could become a cornerstone of modern neurotechnology.
What is the Current State of Research?
While this is still an emerging area, research labs are building shared EEG datasets to support pretraining, while benchmarks for generalization and cross-task performance are starting to emerge.
Some notable projects are already pushing the boundaries, and a few large-scale models have been proposed. Here are some examples:
Neuro-GPT [1]
Neuro-GPT combines a specialized EEG encoder with a GPT-style decoder. The model is first pre-trained on a large EEG dataset using a self-supervised task, where it learns to reconstruct masked segments of the signal (similar to how language models learn by predicting missing words).
To evaluate its performance in data-scarce settings, it was fine-tuned on motor imagery classification using data from just 9 participants. Despite the limited data, Neuro-GPT outperformed models trained from scratch, showing strong generalization and the ability to handle common challenges in EEG analysis, such as variability across subjects and limited labeled data.
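The masked-segment objective described above can be sketched in a few lines. This is a toy illustration of the loss computation, not the actual Neuro-GPT code: patches are randomly masked, a (dummy) reconstruction is produced, and the error is scored only at the masked positions, exactly as in masked language modeling:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy signal: 8 patches of 50 samples each (dimensions are illustrative)
patches = rng.standard_normal((8, 50))

# Randomly mask 2 of the 8 patches, as in masked-segment pretraining
mask = np.zeros(8, dtype=bool)
mask[rng.choice(8, size=2, replace=False)] = True
masked_input = patches.copy()
masked_input[mask] = 0.0                 # masked patches are zeroed out

# A real model would reconstruct the masked patches from surrounding context;
# here we fake a reconstruction by adding small noise to the input
reconstruction = masked_input + rng.standard_normal((8, 50)) * 0.1

# The loss is computed ONLY on the masked patches
loss = np.mean((reconstruction[mask] - patches[mask]) ** 2)
print(float(loss))
```

Because the model never sees the masked content, minimizing this loss forces it to learn how EEG segments depend on their temporal context.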
Criss-Cross Brain Foundation Model (CBraMod) [2]
CBraMod is a criss-cross transformer tailored for EEG data. In this architecture, “criss-cross” means that the model processes spatial and temporal information separately but in parallel, using two dedicated attention mechanisms, one for the spatial layout of EEG channels, and one for how brain activity unfolds over time. This design helps the model better capture the unique structure of EEG signals.
The model is pre-trained on a large collection of EEG recordings using a masked patch reconstruction task (similar to masked language modeling), and has been evaluated across 10 BCI tasks and 12 different datasets, demonstrating strong generalization and versatility.
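The two parallel attention passes behind the "criss-cross" idea can be pictured as follows. In this simplified NumPy sketch (single-head attention with no learned weights, so it is a shape-level illustration rather than CBraMod's actual layer), a spatial pass attends across channels at each time step, a temporal pass attends across time for each channel, and the two views are combined:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Plain single-head self-attention over the first axis of x."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)        # queries/keys/values are all x here
    return softmax(scores, axis=-1) @ x

# Toy token grid: 16 channels x 10 temporal patches x 32-dim embeddings
tokens = rng.standard_normal((16, 10, 32))

# Spatial pass: for each time step, attend across the 16 channels
spatial = np.stack([self_attention(tokens[:, t]) for t in range(10)], axis=1)

# Temporal pass: for each channel, attend across the 10 time steps
temporal = np.stack([self_attention(tokens[c]) for c in range(16)], axis=0)

# Criss-cross: combine the two parallel views (a simple sum in this sketch)
out = spatial + temporal
print(out.shape)                         # (16, 10, 32)
```

Factoring attention this way keeps the cost linear in channels plus time, instead of quadratic in their product as full attention over the whole grid would be.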
Large Cognition Model (LCM) [3]
LCM is a transformer-based foundation model designed to generalize across a wide range of datasets and tasks. By using large-scale self-supervised learning, it learns universal EEG representations that can be easily adapted for specific applications, such as decoding cognitive states, classifying neurological conditions, or powering neurofeedback systems. What sets LCM apart is its dual attention architecture, which combines temporal and spectral attention to capture both time-based dynamics and frequency-specific patterns directly from raw EEG data. This makes the model highly effective at extracting meaningful features without extensive preprocessing.
Evaluated across multiple benchmarks, LCM consistently outperforms existing EEG models, demonstrating strong generalization across both subjects and tasks, marking a significant step forward for scalable, brain-aware AI systems.
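One way to picture the dual temporal/spectral view is to derive a frequency-domain stream directly from the raw time-domain patches, so the model can attend over both. The sketch below shows only that feature-derivation step (via an FFT), not LCM's published architecture; the sampling rate, window length, and alpha-band example are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

fs = 250                                  # assumed sampling rate (Hz)
eeg = rng.standard_normal((8, fs * 2))    # 8 channels, 2 s of synthetic signal

# Temporal stream: four half-second patches per channel
temporal_tokens = eeg.reshape(8, 4, fs // 2)               # (8, 4, 125)

# Spectral stream: magnitude spectrum of each patch
spectrum = np.abs(np.fft.rfft(temporal_tokens, axis=-1))   # (8, 4, 63)
freqs = np.fft.rfftfreq(fs // 2, d=1 / fs)                 # 2 Hz bin spacing

# Example of a frequency-specific pattern: band power in the alpha range (8-12 Hz)
alpha = (freqs >= 8) & (freqs <= 12)
alpha_power = (spectrum[..., alpha] ** 2).mean(axis=-1)    # (8, 4)

print(temporal_tokens.shape, spectrum.shape, alpha_power.shape)
```

Feeding both streams to attention layers is what lets a model capture time-based dynamics and frequency-specific patterns without hand-crafted preprocessing.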
Remaining Challenges
Despite their promise, EEG-FMs still face several challenges. Unlike text corpora used to train language models, EEG datasets are typically smaller, more heterogeneous, and vary widely in quality and structure. The EEG signal itself is noisy and highly variable, making it difficult to extract clean and meaningful patterns, especially in real-world or mobile environments.
Additionally, there is a lack of standardization across EEG hardware, electrode configurations, and experimental protocols, which complicates model generalization. Even when models perform well, the interpretability and neuroscientific validity of the learned brain representations remain open questions.
Other hurdles include privacy concerns, especially with sensitive cognitive or clinical data, and the need for better benchmarks and community-shared tools to evaluate cross-task and cross-subject performance reliably.
Conclusion
Foundation models revolutionized the way we work with language. EEG foundation models could do the same for the brain.
They represent a critical step toward making neurotechnology accessible, reliable, and intelligent, pushing the boundaries of what brain-computer interfaces and brain-aware AI can achieve.
At BrainAccess, we are deeply committed to advancing EEG research and neurotechnology by embracing the power of foundation models. Our work focuses on building robust tools, accessible hardware, and intelligent software that accelerate innovation in brain-computer interfaces and cognitive monitoring. By supporting open research, standardized pipelines, and real-world applications, we aim to contribute to a future where EEG analysis is not only faster and more accurate but also more meaningful. We believe that foundation models can serve as a cornerstone for a unified, scalable understanding of brain activity, one that bridges individual variability, simplifies complex workflows, and unlocks new possibilities for research, healthcare, and human-computer interaction.
Written by Martina Berto, Research Engineer and Neuroscientist for BrainAccess at Neurotechnology.
References
[1] Cui, W., Jeong, W., Thölke, P., Medani, T., Jerbi, K., Joshi, A. A., & Leahy, R. M. (2024, May). Neuro-GPT: Towards a foundation model for EEG. In 2024 IEEE International Symposium on Biomedical Imaging (ISBI) (pp. 1-5). IEEE. https://doi.org/10.48550/arXiv.2311.03764
[2] Wang, J., Zhao, S., Luo, Z., Zhou, Y., Jiang, H., Li, S., … & Pan, G. (2024). CBraMod: A criss-cross brain foundation model for EEG decoding. arXiv preprint arXiv:2412.07236. https://doi.org/10.48550/arXiv.2412.07236
[3] Chen, C. S., Chen, Y. J., & Tsai, A. H. W. (2025). Large Cognition Model: Towards pretrained EEG foundation model. arXiv preprint arXiv:2502.17464. https://doi.org/10.48550/arXiv.2502.17464