AxiomLogica
Category

AI & ML

All about AI and machine learning: the latest articles and advances in the field.

All articles

Should teams buy curated preference data or build an in-house curation pipeline?

Buying curated preference data reduces internal labeling and curation labor, but the trade-off is vendor dependency and less control over sampling and rubric design. In practice, purchasing is usually the cheapest path for experimentation, while building is the better path when a team needs domain-specific preference signals, auditability, or iterative rubric changes.

24 min read

How to merge multiple fine-tuned LLMs with mergekit: a practical tutorial

mergekit can run entirely on CPU, or with as little as 8 GB of VRAM, and still perform multi-model merges out of core, which makes low-cost experimentation feasible. Quality, however, still depends on choosing compatible checkpoints and the right merge method, not just averaging weights.

19 min read

How to build a fine-tuning dataset filtering pipeline with Setu and Hugging Face Datasets

Setu combines Spark-based document preparation, cleaning, flagging/filtering, and MinHashLSH deduplication with Hugging Face Datasets-style dataset handling, which is enough to scale noisy web, PDF, and speech corpora into SFT-ready training data. It still depends on a Linux/WSL-friendly setup with Java and Spark, and on a multi-stage quality gate before deduplication pays off.

20 min read

DeepSpeed vs Megatron-LM: which stack fits pre-training, fine-tuning, and checkpoint portability?

Megatron-LM is the stronger research and pre-training substrate, while DeepSpeed is the broader optimization layer with more turnkey distributed features and integrations. The real business cost difference, though, lies in checkpoint portability and operational complexity: Megatron Bridge and DeepSpeed↔Megatron integration reduce migration friction only if you standardize on compatible formats and workflows.

23 min read
