AxiomLogica
Category

AI & ML

All about AI and machine learning: the latest articles and advances in the domain.

All articles

AI & ML

The orchestration of multi-agent systems: how planning, policy, and communication fit together

A robust multi-agent control plane splits planning, policy, communication, memory, observability, evaluation, and governance into separate building blocks, which Microsoft’s reference architecture and A2A both position as the scalable way to coordinate specialized agents. The model deliberately stays framework-agnostic and caps connected-agent depth to avoid uncontrolled agent trees.

28 min read
AI & ML

Engineering the Quantized Johnson-Lindenstrauss (QJL) Transform for Distributed Inference

By utilizing the Quantized Johnson-Lindenstrauss (QJL) transform for KV cache compression, engineers can achieve a 5x reduction in VRAM utilization for long-context LLM inference without the overhead of storing traditional quantization constants, provided the implementation is tuned for the specific hardware-native CUDA kernel constraints.

18 min read
AI & ML

Architecting Scalable Agentic Workflows with FaaS-Hosted MCP Servers

By decoupling MCP server logic from the LLM orchestrator using distributed FaaS endpoints, engineers can reduce infrastructure idle costs by up to 40% compared to monolithic deployments, provided they implement sub-50ms gRPC/HTTP cold-start optimization strategies.

19 min read
AI & ML

Evaluating 3D Gaussian Splatting (3DGS) for Real-Time Robotics Navigation

By transitioning from implicit NeRF-based motion deblurring to 3D Gaussian Splatting with Bézier SE(3) trajectory modeling, robotics engineers can achieve real-time rendering speeds (30+ FPS) while simultaneously solving motion-blurred input artifacts, provided they can accommodate the integration of event camera streams for pose estimation.

15 min read
AI & ML

Architecting for Disaggregated LLM Inference: Prefill-Decode Isolation

By decoupling compute-bound prefill from memory-bound decode using llm-d architectures, engineers can achieve up to 4.5x improvement in goodput and significantly lower P99 TTFT, provided they account for the added network latency of KV-cache serialization over high-speed interconnects like EFA.

15 min read
