Hugging Face redesigned the streaming backend of the datasets library, cutting initial requests by 100x, making data file resolution 10x faster, and doubling processing speed.
Streaming datasets: 100x More Efficient
LeRobot v0.4.0: Supercharging OSS Robot Learning
NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset
LeRobot goes to driving school: World’s largest open-source self-driving dataset
Open R1: Update #2
CinePile 2.0 - making stronger datasets with adversarial refinement
FineVideo: behind the scenes
Docmatix - a huge dataset for Document Visual Question Answering
Data Is Better Together: A Look Back and Forward
StarCoder2 and The Stack v2
Introducing DOI: the Digital Object Identifier to Datasets and Models