#long-context article collection

Dev.to

Analysis of the MiniMax model, which handles ambiguity well and stays stable over long contexts

MiniMax: What It Actually Means to Run on This Model

AI/ML · intermediate · 4 min read · 3 days ago

Dev.to

80.3% on SWE-Bench Pro and a long-context-driven overhaul of agentic workflows

Claude Fable 5 (Mythos-Class) for Polymarket Trading Bots: The Long-Context Agentic Leap Developers Needed

AI/ML · advanced · 7 min read · June 24, 2026

Dev.to

MSA cuts 1M-context compute cost by 28.4x while securing coding performance

MiniMax M3 Explained: The Sparse Attention Breakthrough

AI/ML · advanced · 11 min read · June 24, 2026

Dev.to

An ultra-long-context reasoning engine built on a 2M context window and Deep Think mode

Gemini 3.5 Pro: 2M Context, Deep Think, and the Post-Fable-5 Frontier

AI/ML · advanced · 24 min read · June 20, 2026

Hugging Face Blog

IndexShare enables a 1M context and cuts per-token FLOPs by 2.9x

GLM-5.2: Built for Long-Horizon Tasks

AI/ML · advanced · 36 min read · June 17, 2026

Dev.to

Building autonomous research agents with an Orchestrated Reasoning Loop under flat-rate billing

The Future of Large Language Models

AI/ML · intermediate · 19 min read · June 16, 2026

Dev.to

Request-based pricing cuts long-context costs by up to 100x

LLM Trends and Future Outlook

AI/ML · intermediate · 13 min read · June 16, 2026

Dev.to

Fable 5 launches with Mythos-class performance and introduces a Safeguard Fallback structure

Claude Fable 5 Is Here. Here's What Actually Matters for Developers 👨🏾‍💻

AI/ML · intermediate · 4 min read · June 9, 2026

Hacker News

Mythos-class model launches with 60x faster code migration and token costs cut by more than 50%

Claude Fable 5

AI/ML · advanced · 44 min read · June 9, 2026

GeekNews

Show GN: How well do VLMs read Korean public-institution documents? KOLongDoc benchmark released

KOLongDoc benchmark released for long-document analysis of Korean public-institution documents

AI/ML · intermediate · 1 min read · June 4, 2026

Dev.to

Gemma 4's multimodal, tiered model lineup for building a local-first AI stack

Gemma 4 Is Not Just Another Open Model — It Changes What Developers Can Build Locally

AI/ML · intermediate · 20 min read · May 22, 2026

Dev.to

Symmetric Pooling speeds up the 512K-context forward pass by 21x and shortens training time by 30%

Lighthouse Attention: The Training-Time Hierarchy That Makes Quadratic Attention Practical Again

AI/ML · advanced · 10 min read · May 19, 2026

Dev.to

Adopting bfloat16 to handle a 64K context and reach 0.5M TPS

Is Brain Float (bf16) Worth it?

AI/ML · advanced · 23 min read · May 12, 2026

Dev.to

SubQ: sparse attention processes 12M tokens and cuts costs by 80%

SubQ Model: Can Subquadratic Make Long-Context AI More Efficient?

AI/ML · advanced · 31 min read · May 11, 2026

Dev.to

Flux Attention with a Layer Router cuts inference cost by 50% and speeds it up by as much as 2.8x

Flux Attention halves inference cost on long contexts

AI/ML · advanced · 6 min read · May 10, 2026

Dev.to

A linear-scaling architecture that cuts attention compute by 1,000x at 12M tokens

1,000x Claim, No Independent Proof: Subquadratic Architecture

AI/ML · advanced · 9 min read · May 8, 2026

Hugging Face Blog

Nemotron 3 Nano Omni delivers 9x higher throughput and omni-modal integration

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

AI/ML · advanced · 39 min read · April 28, 2026

Dev.to

Making a 1M-token context practical through 9x KV cache compression

DeepSeek V4: Million-Token Context That Actually Works

AI/ML · advanced · 9 min read · April 26, 2026

Dev.to

Analysis of Claude's quality decline: consistency under multiple constraints and long contexts down 22%

I Canceled Claude: I Measured the Quality Decline with My Own Benchmarks Before Leaving

AI/ML · intermediate · 24 min read · April 25, 2026

Dev.to

Kimi K2.6 released with 3x greater Agent Swarm scalability and 185% higher throughput

Kimi K2.6 Has Arrived: An Open-Weight Powerhouse for Agentic Work

AI/ML · advanced · 8 min read · April 21, 2026