[🏠Home](README.md)

Due to projects like [Explore the LLMs](https://llm.extractum.io/) specializing in model indexing, the custom list has been removed.

Published: Feb 1, 2026


Open LLM Models List


Noteworthy

| Model | Link | Description | Date added |
| --- | --- | --- | --- |
| BitNet b1.58 2B4T | https://huggingface.co/microsoft/bitnet-b1.58-2B-4T | the first native 1-bit LLM at the 2-billion parameter scale, achieving performance comparable to full-precision models of similar size but with better computational efficiency (memory, energy, latency) | 2025-04-25 |
| OpenHands-LM | https://huggingface.co/all-hands | openhands 1.5b, 7b and 32b coding models with verified strong performance on SWE-Bench using the OpenHands Coding-Agent | 2025-04-25 |
| OpenThinker2-32B | https://huggingface.co/open-thoughts/OpenThinker2-32B | fine-tuned version of Qwen2.5-32B-Instruct on the OpenThoughts2-1M dataset with increased quality compared to the base model | 2025-04-25 |
| Cogito-V1 | https://huggingface.co/collections/deepcogito/cogito-v1-preview-67eb105721081abe4ce2ee53 | Cogito model family with Qwen and Llama fine-tunes using Iterated Distillation and Amplification (IDA) to increase coding, STEM and IF quality compared to their base models | 2025-04-25 |
| DeepCoder-14B-Preview | https://huggingface.co/agentica-org/DeepCoder-14B-Preview | code reasoning LLM fine-tuned from DeepSeek-R1-Distilled-Qwen-14B using RL to scale up to long context lengths using DeepScaleR and GRPO+ | 2025-04-25 |
| ZR1-1.5B | https://huggingface.co/Zyphra/ZR1-1.5B | fine-tuned DeepSeek-R1-Distill-Qwen-1.5B trained extensively on both verified coding and mathematics problems with reinforcement learning | 2025-04-25 |
| Skywork-OR1 | https://huggingface.co/Skywork/Skywork-OR1-32B-Preview | 7B, 7B math and 32B reasoning models with open sourced weights, training data and training code | 2025-04-25 |
| GLM-4 | https://huggingface.co/collections/THUDM/glm-4-0414-67f3cbcb34dd9d252707cb2e | 3 32B models as general, reasoning and deep reasoning variants as well as a 9B SLM | 2025-04-25 |
| MAI-DS-R1 | https://huggingface.co/microsoft/MAI-DS-R1 | post-trained DeepSeek-R1 reasoning model by Microsoft AI that enhances responsiveness on blocked topics while maintaining strong reasoning capabilities | 2025-04-25 |
| Gemma 3 QAT | https://huggingface.co/google/gemma-3-27b-it-qat-q4_0-gguf | Quantization Aware Training models from Google, regaining bf16 quality in int4 quants and slashing memory footprint | 2025-04-25 |
| Sky-T1-7B-Mini | https://huggingface.co/NovaSky-AI/Sky-T1-mini | trained with simple RL applied to the DeepSeek-R1-Distill-Qwen-7B model, achieving close to OpenAI o1-mini performance on math benchmarks | 2025-02-21 |
| OmniParser-v2 | https://huggingface.co/microsoft/OmniParser-v2.0 | a VLM converting screenshots of phone and desktop UIs into a structured list of interactable elements for Computer-Use | 2025-02-21 |
| R1-1776 | https://huggingface.co/perplexity-ai/r1-1776 | Deepseek-R1 671B parameter model with Chinese Communist Party censorship removed | 2025-02-21 |
| Step-Audio-Chat | https://huggingface.co/stepfun-ai/Step-Audio-Chat | Multimodal Large Language Model with 130B parameters for speech recognition, semantic understanding, dialogue management, voice cloning, and speech generation | 2025-02-21 |
| Qwen2.5-VL | https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct | 3B, 7B and 72B Vision Text Multimodal Model with support for bounding boxes, structured output, OCR for tables, forms etc., long video understanding, agentic computer and phone use, visual and text understanding | 2025-02-21 |
| Arcee-Maestro-7B | https://huggingface.co/arcee-ai/Arcee-Maestro-7B-Preview | RL trained reasoning model based on DeepSeek-R1-Distill-Qwen-7B with further GRPO training for reasoning, math and coding | 2025-02-21 |
| Arcee-Blitz | https://huggingface.co/arcee-ai/Arcee-Blitz | Mistral-Small-24B-Instruct base distilled with DeepSeek-R1 for fast and efficient reasoning with 32k context | 2025-02-21 |
| OpenThinker-32B | https://huggingface.co/open-thoughts/OpenThinker-32B | fine-tuned reasoning model of Qwen/Qwen2.5-32B-Instruct on the DeepSeek-R1 distilled OpenThoughts-114k dataset | 2025-02-21 |
| MiniCPM-o | https://huggingface.co/openbmb/MiniCPM-o-2_6 | GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone | 2025-02-21 |
| DeepSeek-R1 | https://huggingface.co/deepseek-ai/DeepSeek-R1 | groundbreaking reasoning model from DeepSeek trained with a novel method to decrease RLHF efforts, with distilled variants of various sizes | 2025-02-01 |
| Sky-T1 | https://huggingface.co/NovaSky-AI/Sky-T1-32B-Preview | UC Berkeley's reasoning model with 32B parameters | 2025-01-15 |
| QwQ | https://huggingface.co/Qwen/QwQ-32B-Preview | Qwen's reasoning model with 32B parameters | 2025-01-15 |
| Moxin LLM | https://huggingface.co/moxin-org/moxin-llm-7b | fully open data, open training 7B base and chat fine-tuned model | 2024-12-20 |
| Bamba-9b | https://huggingface.co/blog/bamba | Hybrid Mamba2 model by IBM, Princeton, CMU, UIUC trained on open data with 2.5x throughput, available for vLLM, TRL, llama and transformers | 2024-12-20 |
| Command R7B | https://huggingface.co/CohereForAI/c4ai-command-r7b-12-2024 | open weights research 7B model with reasoning, summarization, question answering, coding, tool use and RAG capabilities | 2024-12-20 |
| DeepSeek-V2.5-1210-236B | https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210 | 1210 improvement over original V2.5 with Math, Coding and Reasoning improvements | 2024-12-20 |
| QwQ-32b | https://qwenlm.github.io/blog/qwq-32b-preview/ | Apache 2 licensed LLM from Alibaba Cloud's Qwen team, inspired by OpenAI's o1 reasoning model for test time compute via reasoning tokens to improve performance | 2024-12-02 |
| Sparse-Llama-3.1-8B-2of4 | https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/ | 2:4 Sparse Llama: Smaller Models for Efficient GPU Inference | 2024-12-02 |
| CursorCore | https://huggingface.co/collections/TechxGenus/cursorcore-series-6706618c38598468866b60e2 | Coding LLMs for use within CursorCore and CursorWeb | |
| ichigo | https://huggingface.co/homebrewltd | an open research project extending text-based llama3 to have native "listening" ability, using an early fusion technique, with improved multiturn capabilities and refusal to process inaudible queries | |
| Zamba2 | https://www.zyphra.com/post/zamba2-7b | a 7B SOTA SLM for running on-device with 25% faster time to first token and a 20% higher tokens-per-second rate compared to other architectures, using Mamba2 blocks interleaved with shared attention blocks and a LoRA shared MLP block | |
| reader-lm | https://jina.ai/news/reader-lm-small-language-models-for-cleaning-and-converting-html-to-markdown | Jina AI's LLM to convert HTML to Markdown, making heuristics, cleanup and content identification an LLM task | |
| Pixtral | https://huggingface.co/mistralai/Pixtral-12B-2409 | 12B LLM with a 400M vision encoder for multimodal image and text inference and 128k sequence length by Mistral | |
| llama-3.2 | https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/ | small and medium sized vision LLMs in 11b and 90b and text only 1b and 3b models by Meta | |
| gemma2 2b | https://huggingface.co/bartowski/gemma-2-2b-it-GGUF | 2b small language model by Google achieving SOTA performance for sub 3b models on LLM Leaderboard 2 | |
| DeepSeekCoderv2 | https://github.com/deepseek-ai/DeepSeek-Coder-V2?tab=readme-ov-file#2-model-downloads | 16b and 236b mixture of experts coding models with 128k context length | |
| codegemma | https://huggingface.co/google/codegemma-7b | Google's coding models from 2b base, 7b base and 7b instruct | |
| codeqwen1.5 | https://huggingface.co/Qwen/CodeQwen1.5-7B | base and chat models with 7B parameters and good quality | |
| Qwen2 | https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f | English and Chinese models from 0.5b, 1.5b, 7b, and 72b sizes with great performance and 128k context windows for the 7b and 72b models | |
| Phi | https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3 | Microsoft's small language and vision models with small and medium parameter sizes, short and long context lengths and great performance | |
| Yi-1.5 | [https://huggingface.co/01-ai/Yi-9B](https://huggingface.co/01-ai/Yi-1.5-34B-Chat) | 9b model focusing on multilingual text understanding, available as 9B and 34B variants | |
| InternLM2.5 | https://huggingface.co/internlm/internlm2_5-7b-chat | 7B base and chat models focusing on reasoning, math and tool use, with a 1M context window | |
| Mistral-Large | https://huggingface.co/mistralai/Mistral-Large-Instruct-2407 | a 123B sized model beating llama-3.1 and gpt-4o in several categories with a focus on multilinguality, coding, agentic tasks and reasoning | |
| Llama-3.1 | https://ai.meta.com/blog/meta-llama-3-1/ | Meta's most advanced model providing 8b, 70b and 405b base and instruction tuned models with a 128k context window and quality on par with current SOTA closed source models | |
| Nuextract | https://huggingface.co/numind/NuExtract | a structure extraction model based on phi-3-mini that is instructed with a JSON template, which the model fills from the unstructured text provided | |
| Mistral Nemo | https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407 | a 12B model by Mistral and NVIDIA with a 128k context window, offered as instruct and base models | |
| CodeGeeX4 | https://huggingface.co/THUDM/codegeex4-all-9b | 9B multilingual code generation model for chat and instruct with a 128k context length | |
| Mamba-Codestral | https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1 | by Mistral, based on the Mamba2 architecture, performing on par with SOTA transformer based code models | |
| Aya-23 | https://huggingface.co/CohereForAI/aya-23-35B | 8B and 35B instruction tuned multilingual model focusing on 23 languages | |
| Mistral-7b-instruct-v0.3 | https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 | with function calling, new tokenizer and 32k max context | |
| CodeStral-22B | https://huggingface.co/mistralai/Codestral-22B-v0.1 | coding model trained on 80+ languages with instruct and Fill in the Middle tasks, 32k max context | |
| Yuan2-M32 | https://huggingface.co/IEITYuan/Yuan2-M32-hf | Mixture of Experts with Attention Router, 32 experts, 2 active, total 40B parameters, 3.7B active and max length of 16K | |
| DeepSeek-V2 | https://github.com/deepseek-ai/DeepSeek-V2#2-model-downloads | 21B Strong, Economical, and Efficient Mixture-of-Experts Language Model | |
| Granite | https://huggingface.co/ibm-granite | family of code models from IBM with 3b, 8b, 20b, 34b, base and instruct models for code completion and chat | |
| GemMoE | https://huggingface.co/Crystalcareai/GemMoE-Base-Random | an 8x8 Mixture Of Experts based on Gemma | |
| wavecoder-ultra-6.7b | https://huggingface.co/microsoft/wavecoder-ultra-6.7b | covering four general code-related tasks: code generation, code summary, code translation, and code repair | |
| Mixtral-8x22B-Instruct-v0.1 | https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1 | an instruct fine-tuned version of the Mixtral-8x22B-v0.1 | |
| WizardLM-2-8x22B | https://huggingface.co/alpindale/WizardLM-2-8x22B | Microsoft's WizardLM 2 8x22B beating gpt-4-0314 on MT-Bench | |
| WizardLM-2-7B | https://huggingface.co/microsoft/WizardLM-2-7B | Microsoft's WizardLM 2 7B, release for 70B coming up, backup | |
| aiXcoder | https://huggingface.co/aiXcoder/aixcoder-7b-base | 7B Code LLM for code completion, comprehension, generation | |
| Mixtral-8x22B-v0.1 | https://huggingface.co/v2ray/Mixtral-8x22B-v0.1 | Sparse MoE model with 176B total and 44B active parameters, 65k context size | |
| grok-1 | https://huggingface.co/xai-org/grok-1 | 314b MoE model by xAI | |
| DBRX | https://huggingface.co/databricks/dbrx-base | base and instruct MoE models from Databricks with 132B total parameters and a larger number of smaller experts, supporting RoPE and 32K context size | |
| command-r-plus | https://huggingface.co/CohereForAI/c4ai-command-r-plus | a 104B model with highly advanced capabilities including RAG and tool use for English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese | |
| StarCoder2 | https://huggingface.co/bigcode/starcoder2-15b | 15B, 7B and 3B code completion models trained on The Stack v2 | |
| command-r | https://www.maginative.com/article/cohere-launches-command-r-scalable-ai-model-for-enterprise-rag-and-tool-use/ | 35B optimized for retrieval augmented generation (RAG) and tool use supporting Embed and Rerank methodology. model weights | |
| AI21 Jamba | https://huggingface.co/ai21labs/Jamba-v0.1 | production-grade Mamba-based hybrid SSM-Transformer model licensed under Apache 2.0 with 256K context and 52B MoE at 12B each | |
| Smaug-72B | https://huggingface.co/abacusai/Smaug-72B-v0.1 | based on Qwen-72B and MoMo-72B-Lora, then finetuned by Abacus.AI; the best performing open LLM on the HF leaderboard as of Feb-2024 | |
| SLIM Model Family | https://huggingface.co/llmware | Small Specialized Function-Calling Models for Multi-Step Automation, focused on enterprise RAG workflows | |
| aya-101 | https://huggingface.co/CohereForAI/aya-101 | 13b fine-tuned open access multilingual LLM from Cohere For AI | |
| seamlessM4T v2 | https://huggingface.co/docs/transformers/en/model_doc/seamless_m4t_v2 | Multimodal Audio and Text Translation between many languages | |
| SeaLLM | https://huggingface.co/SeaLLMs/SeaLLM-7B-v2 | multilingual LLM for Southeast Asian (SEA) languages 🇬🇧 🇨🇳 🇻🇳 🇮🇩 🇹🇭 🇲🇾 🇰🇭 🇱🇦 🇲🇲 🇵🇭 | |
| meditron | https://github.com/epfLLM/meditron | 7B and 70B Llama2 based LLMs adapted for the medical domain | |
| Mixtral of experts | https://mistral.ai/news/mixtral-of-experts/ | A high quality Sparse Mixture-of-Experts | |
| Poro | https://huggingface.co/LumiOpen/Poro-34B | SiloGen model checkpoints of a family of multilingual open source LLMs covering all official European languages and code, news | |
| deepseek-coder | https://github.com/deepseek-ai/DeepSeek-Coder | code language models, trained on 2T tokens, 87% code 13% English / Chinese, up to 33B with 16K context size, achieving SOTA performance on coding benchmarks | |
| openchat | https://github.com/imoneoi/openchat | Advancing Open-source Language Models with Mixed-Quality Data | |
| llmware RAG models | https://huggingface.co/llmware | small LLMs and sentence transformer embedding models specifically fine-tuned for RAG workflows | |
| HelixNet | https://huggingface.co/migtissera/HelixNet | Mixture of Experts with 3 Mistral-7B, LoRA, HelixNet-LMoE optimized version | |
| Mistral-7B-german-assistant-v3 | https://huggingface.co/flozi00/Mistral-7B-german-assistant-v3 | finetuned version for German instructions and conversations in the style of Alpaca ("### Assistant:" "### User:"), trained with a context length of 8k tokens. The dataset used is deduplicated and cleaned, with no code inside. The focus is on instruction following and conversational tasks | |
| WizardMath-70B-V1.0 | https://huggingface.co/WizardLM/WizardMath-70B-V1.0 | SOTA Mathematical Reasoning | |
| leo-hessianai-13b-chat-bilingual | https://huggingface.co/LeoLM/leo-hessianai-13b-chat-bilingual | based on llama-2 13b, a fine-tune of the base leo-hessianai-13b for chat | |
| em_german_leo_mistral | https://huggingface.co/jphme/em_german_leo_mistral | Mistral fine-tune of LeoLM with German instructions | |
| SauerkrautLM-13B-v1 | https://huggingface.co/VAGOsolutions/SauerkrautLM-13b-v1 | fine-tuned llama-2 13b on a mix of German data augmentation and translations; SauerkrautLM-7b-v1-mistral is a German SauerkrautLM-7b fine-tuned using QLoRA on 1 A100 80GB with Axolotl | |
| CodeShell | https://github.com/WisdomShell/codeshell/blob/main/README_EN.md | code LLM with 7b parameters trained on 500b tokens, context length of 8k, outperforming CodeLlama and Starcoder on humaneval, weights | |
| sqlcoder | https://github.com/defog-ai/sqlcoder | 15B parameter model that outperforms gpt-3.5-turbo for natural language to SQL generation tasks | |
| ChatGLM2-6B | https://github.com/THUDM/ChatGLM2-6B | v2 of the GLM 6B open bilingual EN/CN model | |
| baichuan-7b | https://github.com/baichuan-inc/baichuan-7B | Baichuan Intelligent Technology developed baichuan-7B, an open-source language model with 7 billion parameters trained on 1.2 trillion tokens. Supporting Chinese and English, it achieves top performance on authoritative benchmarks (C-EVAL, MMLU) | |
| salesforce/CodeT5 | https://github.com/salesforce/codet5 | code assistant, has released their codet5+ 16b and other model sizes | |
| VPGTrans | https://vpgtrans.github.io/ | Transfer Visual Prompt Generator across LLMs; the VL-Vicuna model is a novel VL-LLM. Paper, code | |
| replit-code | https://huggingface.co/replit/ | focused on code completion. The model has been trained on a subset of the Stack Dedup v1.2 dataset | |
| Visual-med-alpaca | https://github.com/cambridgeltl/visual-med-alpaca | fine-tuning llama-7b on self instruct for the biomedical domain. Models locked behind a request form | |
| Multimodal-GPT | https://github.com/open-mmlab/Multimodal-GPT | multi-modal visual/language chatbot, using llama with custom LoRA weights and openflamingo-9B | |
| mPLUG-Owl | https://github.com/X-PLUG/mPLUG-Owl | Multimodal finetuned model for visual/language tasks | |
| MOSS by Fudan University | https://github.com/OpenLMLab/MOSS | a 16b Chinese/English custom foundational model with additional models fine-tuned on sft and plugin usage | |
| BigCode | https://huggingface.co/bigcode | Open Scientific collaboration to train a coding LLM | |
| CodeGeeX 13B | https://huggingface.co/spaces/THUDM/CodeGeeX | Multi Language Code Generation Model | |
| RWKV | https://github.com/BlinkDL/RWKV-LM | Parallelizable RNN with Transformer-level LLM Performance | |
| GeoV/GeoV-9b | https://huggingface.co/GeoV/GeoV-9b | 9B parameter, in-progress training to 300B tokens (33:1) | |
| LAION OpenFlamingo | https://github.com/mlfoundations/open_flamingo | Multi Modal Model and training architecture | |
| Cerebras GPT-13b | https://huggingface.co/cerebras | (release notes) | |