#mixture-of-experts article collection

Dev.to

Implementing Agentic AI on LongCat-2.0 with a 1.6T-Parameter MoE and 1M-Token Context

LongCat-2.0 & Agentic AI: Reshaping India's Tech by 2026

AI/ML · advanced · 51 min read · 2 days ago

Dev.to

Analysis of the Minimum 240GB VRAM Requirement and Infrastructure Costs for Running GLM 5.2

GLM 5.2 isn't free: not even my US$4,000 Spark can run it

AI/ML · intermediate · 14 min read · 3 days ago

Dev.to

GLM-5.2: Efficiently Separating 744B Parameter Scale from 40B Compute Cost

GLM-5.2 Becomes the Top Open-Weights Model: Active vs Total Parameters

AI/ML · intermediate · 18 min read · June 23, 2026

Dev.to

Agent Optimization Built on Kimi K2.7's 262k Context and Cloudflare's Unattended Deployment Infrastructure

262k tokens + agent deployment platforms level up

AI/ML · intermediate · 15 min read · June 23, 2026

Dev.to

Securing AI Reliability by Resolving Calibration Drift in MoE Soft Routing

Why Your AI Model's Confidence Score Is Probably Lying (And What To Do About It)

AI/ML · advanced · 24 min read · June 19, 2026

Dev.to

Cutting Inference Costs by 94.4% Through MoE Architecture and CUDA Optimization

Why Chinese AI Models Are 95% Cheaper — The Economics Explained

AI/ML · advanced · 22 min read · June 19, 2026

Hacker News

Securing a 1M Context with IndexShare and Reaching No. 1 in Open-Model Performance

GLM-5.2: The Most Powerful Open Model yet and the Brutal Reality of Running It

AI/ML · advanced · 15 min read · June 19, 2026

GeekNews

GLM-5.2 Takes No. 1 Among Open-Weights Models on Artificial Analysis

GLM-5.2: 744B-Parameter Open-Weights Model Reaches No. 1 on the Intelligence Index

AI/ML · advanced · 20 min read · June 18, 2026

Dev.to

Local Deployment and Availability of GLM-5.2, Built on a 744B MoE Architecture

Run GLM-5.2 Locally: The Open Model Nobody Can Ban

AI/ML · advanced · 26 min read · June 15, 2026

Dev.to

Achieving 70B-Class Performance at 14B-Class Compute Cost with an MoE Structure

Mixture of Experts (MoE): what it actually does under the hood, and when it pays off

AI/ML · advanced · 27 min read · June 13, 2026

Dev.to

A Diffusion-Based Text Generation Architecture Reaching 1,100 TPS on an H100

DiffusionGemma: How Google's New Open LLM Hits 1,000 Tokens/sec and Changes Inference Economics

AI/ML · advanced · 12 min read · June 12, 2026

The Register

Up to 4x Faster Local Text Generation by Adopting Diffusion Techniques

Google's new open-weights model brings image-generation tricks to AI text generation

AI/ML · advanced · 7 min read · June 11, 2026

GeekNews

DeepSeek V4 Pro Pulls Ahead of GPT-5.5 Pro on Precision

DeepSeek V4 Pro Scores 38.0, Overtaking GPT-5.5 Pro on Precision

AI/ML · intermediate · 24 min read · June 9, 2026

Dev.to

MoE-Based Specialized Models and an Accelerating Agent-Centric Ecosystem

Top 10 Apple intelligence features to look forward to:

AI/ML · intermediate · 16 min read · June 8, 2026

Dev.to

Building a Zero-Cloud Coding Agent Environment with a Local LLM on Ollama

Run Coding Agents on Local AI — Zero Cloud, Full Control

AI/ML · intermediate · 24 min read · June 7, 2026

Dev.to

Building an Independent AI Stack on a 35B MoE Reasoning Model and a 5B Code-Optimized Model

Microsoft MAI-Thinking-1 & MAI-Code-1-Flash: Developer Guide to 7 New MAI Models

AI/ML · advanced · 28 min read · June 3, 2026

Dev.to

An 80B Coding Agent with Ultra-Low $0.11/M-Token Pricing and MoE-Based Efficiency

Qwen3-Coder-Next review 2026: 80B params, 3B active, and the cheapest credible coding agent API

AI/ML · intermediate · 15 min read · June 2, 2026

Hugging Face Blog

Mellum2 Released: 2x Faster Inference with a 12B MoE Structure

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

AI/ML · intermediate · 8 min read · June 1, 2026

Dev.to

An MoE-Based 2T-Parameter Model Achieving 17B-Level Inference Efficiency

Llama 4: Meta's Latest — Scout, Maverick, and the MoE Revolution

AI/ML · intermediate · 7 min read · May 25, 2026

Dev.to

Sovereign AI Strategy and MoE Architecture Efficiency in the Shift to a Multi-Polar AI Ecosystem

Beyond the West: What Eastern AI Models Mean for Enterprises, Developers, and Digital Sovereignty

AI/ML · intermediate · 40 min read · May 25, 2026