MicroThinker-3B-Preview - AI Language Models Tool

Overview

MicroThinker-3B-Preview is a 3-billion-parameter fine-tuned language model from huihui-ai, built on the Llama-3.2-3B-Instruct-abliterated base. It is presented as an experimental research model targeting improved step-by-step reasoning and higher-quality short-to-medium text generation; the model card explicitly describes it as a reasoning-focused SFT (Supervised Fine-Tuning) effort. ([huggingface.co](https://huggingface.co/huihui-ai/MicroThinker-3B-Preview))

The model was fine-tuned on the FineQwQ-142k dataset (≈142k training examples) with a long token-length setting and 4-bit quantization during training; the card includes detailed, reproducible training steps and an example ollama command for inference. Users and downstream packagers have produced GGUF/quantized builds and community forks for smaller-footprint local inference. These details and the dataset card are documented on the Hugging Face model and dataset pages. ([huggingface.co](https://huggingface.co/huihui-ai/MicroThinker-3B-Preview))

Model Statistics

  • Downloads: 15
  • Likes: 1
  • Pipeline: text-generation
  • Parameters: 3.2B

License: apache-2.0

Model Details

Architecture and base: MicroThinker-3B-Preview is a 3B-parameter Llama-3.2-family model (lineage: meta-llama/Llama-3.2-3B-Instruct → huihui-ai/Llama-3.2-3B-Instruct-abliterated → MicroThinker-3B-Preview). The model card lists the model size as 3B parameters and the tensor type as BF16. ([huggingface.co](https://huggingface.co/huihui-ai/MicroThinker-3B-Preview))

Fine-tuning and training: The model was supervised fine-tuned (SFT) on the FineQwQ-142k dataset (142k examples). Training notes on the card state that the run used a single RTX 4090 (24 GB), LoRA-style adapters (lora_rank 8, lora_alpha 32), one training epoch, max_length set to 21,710 tokens, and quant_bits=4 in the fine-tuning pipeline. The card also provides the exact swift/ms-swift CLI commands used for reproducible SFT and for merging the LoRA adapters into a final model directory. ([huggingface.co](https://huggingface.co/huihui-ai/MicroThinker-3B-Preview))

Inference and quantizations: The model card shows an ollama usage example (ollama run huihui_ai/microthinker:3b) for local inference. Community contributors (for example, mradermacher) have created multiple GGUF/quantized builds (Q2–Q8 and F16) to support different hardware and performance/quality tradeoffs. ([huggingface.co](https://huggingface.co/huihui-ai/MicroThinker-3B-Preview))
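To give a feel for how lightweight the LoRA fine-tune described above is, the trainable-parameter count of a rank-8 adapter can be sketched with simple arithmetic. The rank value (8) comes from the model card; the 3072-dimensional projection used below is an illustrative assumption (typical of the Llama-3.2-3B hidden size), not a figure from the card:

```python
def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA leaves the frozen d_out x d_in weight untouched and learns two
    low-rank factors A (rank x d_in) and B (d_out x rank), so the number
    of trainable parameters per adapted matrix is rank * (d_in + d_out)."""
    return rank * (d_in + d_out)

# Illustrative: one square 3072x3072 projection adapted at lora_rank=8.
frozen = 3072 * 3072                     # parameters in the frozen weight
adapter = lora_params(3072, 3072, 8)     # trainable LoRA parameters
print(adapter)                           # 49152
print(f"{adapter / frozen:.2%}")         # ~0.52% of the frozen matrix
```

This is why a single 24 GB RTX 4090, as noted on the card, is enough: only a small fraction of the weights need gradients and optimizer state.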

Key Features

  • Fine-tuned from Llama-3.2-3B-Instruct-abliterated for improved step-by-step reasoning.
  • Trained on the FineQwQ-142k supervised dataset (≈142k examples) for instruction-following.
  • Configured with long-context training (max_length set to 21,710 tokens during SFT).
  • Designed for local inference via ollama; community GGUF quantizations available for compact runs.
  • Compact 3B footprint enables experimentation on consumer-class GPUs and quantized runtimes.
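The footprint claim in the last two bullets can be made concrete with a back-of-envelope estimate of weight-only memory at different quantization widths. This is a rough sketch: it ignores KV-cache, activations, and quantization-format overhead, and uses the 3.2B parameter count from the model page:

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory: n_params * bits / 8 bytes, in GiB.
    Ignores KV-cache, activations, and per-block quantization metadata."""
    return n_params * bits_per_weight / 8 / 2**30

N = 3.2e9  # parameter count shown on the Hugging Face page

for name, bits in [("F16/BF16", 16), ("Q8", 8), ("Q4", 4), ("Q2", 2)]:
    print(f"{name:>8}: ~{weight_memory_gib(N, bits):.1f} GiB")
# F16/BF16: ~6.0 GiB, Q8: ~3.0 GiB, Q4: ~1.5 GiB, Q2: ~0.7 GiB
```

These rough numbers explain the Q2–Q8 spread of community GGUF builds: a 4-bit build fits comfortably in commodity GPU or CPU memory where the BF16 weights would not.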

Example Usage

Example (python):

import subprocess

# Simple example: run the model via the ollama CLI and capture output.
# Requires ollama to be installed and the model pulled locally
# (model: huihui_ai/microthinker:3b).

prompt = "You are a helpful assistant. Explain in 3 bullet points how binary search works."

# `ollama run MODEL PROMPT` prints the completion to stdout and exits.
cmd = ["ollama", "run", "huihui_ai/microthinker:3b", prompt]

proc = subprocess.run(cmd, capture_output=True, text=True)

if proc.returncode != 0:
    print("Error running ollama:", proc.stderr)
else:
    print("Model output:")
    print(proc.stdout)

# Notes:
# - The model card includes the ollama example command: `ollama run huihui_ai/microthinker:3b`.
# - For local non-ollama inference, community GGUF/quantized builds are available (see community mirrors).
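As an alternative to shelling out to the CLI, ollama also serves a local HTTP API (by default on port 11434). The sketch below uses only the standard library; the endpoint and field names follow ollama's documented /api/generate interface, and the request is attempted only inside a try block so the script degrades gracefully when no server is running:

```python
import json
import urllib.error
import urllib.request

def build_generate_payload(model: str, prompt: str) -> dict:
    # Non-streaming request body for ollama's /api/generate endpoint.
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_generate_payload(
    "huihui_ai/microthinker:3b",
    "Explain in 3 bullet points how binary search works.",
)

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

try:
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.loads(resp.read())
        # The generated text is returned in the "response" field.
        print(body.get("response", ""))
except (urllib.error.URLError, OSError):
    print("ollama server not reachable on localhost:11434")
```

Setting "stream": False returns one JSON object instead of a stream of chunks, which keeps the parsing trivial for a quick local test.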

Benchmarks

Parameter count: 3B parameters (Source: https://huggingface.co/huihui-ai/MicroThinker-3B-Preview)

Training dataset size: 142k examples (FineQwQ-142k) (Source: https://huggingface.co/datasets/huihui-ai/FineQwQ-142k)

Max training context (max_length): 21,710 tokens (configured during SFT) (Source: https://huggingface.co/huihui-ai/MicroThinker-3B-Preview)

Tensor type (model card): BF16 (Source: https://huggingface.co/huihui-ai/MicroThinker-3B-Preview)

Downloads: 15 in the last month, as shown on the Hugging Face model page (Source: https://huggingface.co/huihui-ai/MicroThinker-3B-Preview)

Last Refreshed: 2026-03-03

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool