From Strings to Silicon: How I Optimized a Java Word Count by 7X 🚀
Dev.to
Backend

Optimizing a Java Word Count: 7x Faster Processing and Maximized Memory Efficiency

elias mohammadi | May 19, 2026 | 14 min | intermediate

Context

The task is to design a system that counts word frequencies in a 1.5 GB text file containing 200 million words. The initial naive implementation, built on Java NIO, loaded the entire file into memory and processed it on a single thread, so it suffered from both excessive memory usage and low throughput.
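The summary does not reproduce the baseline code, so the following is only a hypothetical sketch of what such a naive version might look like, assuming Java 11+ for `Files.readString`. The class name `NaiveWordCount` and its methods are illustrative, not taken from the article. It also times the run, since a baseline is only useful if it is measured.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical baseline: whole file in memory, one thread, regex split.
class NaiveWordCount {

    // Counts whitespace-separated words in text that is already fully in memory.
    static Map<String, Long> count(String text) {
        Map<String, Long> counts = new HashMap<>();
        for (String word : text.split("\\s+")) {   // regex-based split
            if (!word.isEmpty()) {                  // leading whitespace yields ""
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    // Reads the entire file into a single String before counting,
    // which is the memory-hungry step the article sets out to remove.
    static Map<String, Long> countFile(Path file) {
        try {
            return count(Files.readString(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java NaiveWordCount <file>");
            return;
        }
        long start = System.nanoTime();
        Map<String, Long> counts = countFile(Path.of(args[0]));
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(counts.size() + " distinct words in " + ms + " ms");
    }
}
```

With this shape, the file contents, the array returned by `split`, and every word `String` are all alive at the same time, which is one plausible reason the memory footprint grows well beyond the file size.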

Technical Solution

  • Replace whole-file loading with a Files.lines() Stream to minimize the memory footprint
  • Apply a divide-and-conquer strategy on a fixed pool of 8 threads to maximize CPU core utilization
  • Design thread-local aggregation to eliminate shared-memory contention and lock overhead
  • Try string-allocation optimizations to cut the internal regular-expression cost of String.split()
  • Use MappedByteBuffer and FileChannel for near zero-copy file I/O at the OS level
  • Optimize memory layout and data access patterns with hardware cache (L1/L2) efficiency in mind
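As a rough illustration of the fixed-pool and thread-local aggregation ideas, here is a minimal sketch rather than the author's code. It assumes the lines are already available as a `List<String>`; the class name `ThreadLocalWordCount` and the chunking scheme are hypothetical. Each task fills its own private `HashMap`, and the maps are merged on the calling thread only after the workers finish, so no lock or concurrent map is needed on the hot path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: divide the input, count per task, merge at the end.
class ThreadLocalWordCount {

    static Map<String, Long> count(List<String> lines, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Split the input into roughly one contiguous chunk per thread.
            int chunk = Math.max(1, (lines.size() + threads - 1) / threads);
            List<Future<Map<String, Long>>> futures = new ArrayList<>();
            for (int start = 0; start < lines.size(); start += chunk) {
                List<String> slice =
                        lines.subList(start, Math.min(lines.size(), start + chunk));
                futures.add(pool.submit(() -> {
                    // Private map: only this task ever touches it.
                    Map<String, Long> local = new HashMap<>();
                    for (String line : slice) {
                        for (String word : line.split(" ")) {
                            if (!word.isEmpty()) {
                                local.merge(word, 1L, Long::sum);
                            }
                        }
                    }
                    return local;
                }));
            }
            // Single-threaded merge of the per-task results.
            Map<String, Long> total = new HashMap<>();
            for (Future<Map<String, Long>> f : futures) {
                try {
                    f.get().forEach((w, c) -> total.merge(w, c, Long::sum));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "that is the question");
        System.out.println(count(lines, 8));
    }
}
```

The merge step costs time proportional to the number of distinct words per task, which is usually small next to the 200 million tokens being counted. That is the trade that makes a shared-nothing layout attractive here.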

1. When processing large files, consider a Stream or MappedByteBuffer instead of loading the whole file

2. When designing for multithreading, prefer shared-nothing, thread-local storage over lock-protected shared resources

3. To locate performance bottlenecks, measure a baseline with a proof of concept first, then optimize incrementally

4. Look for alternatives to expensive string operations such as regex-based split()
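Takeaways 1 and 4 can be combined in one sketch: map the file and tokenize the raw bytes by hand instead of calling `split()`. This is an illustration under stated assumptions, not the article's implementation. The class name `MappedWordCount` is hypothetical, words are assumed to be separated by ASCII space, tab, CR, or LF, and the file is assumed to fit in a single mapping (at most `Integer.MAX_VALUE` bytes, which holds for a 1.5 GB input).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: memory-mapped input plus a hand-written byte tokenizer.
class MappedWordCount {

    // Scans the buffer once, cutting words at ASCII whitespace. No regex and
    // no per-line String; one String is still allocated per word for the key.
    static Map<String, Long> count(ByteBuffer buf) {
        Map<String, Long> counts = new HashMap<>();
        byte[] word = new byte[64];   // reusable scratch space for the current word
        int len = 0;
        while (buf.hasRemaining()) {
            byte b = buf.get();
            if (b == ' ' || b == '\n' || b == '\r' || b == '\t') {
                if (len > 0) {
                    counts.merge(new String(word, 0, len, StandardCharsets.UTF_8),
                                 1L, Long::sum);
                    len = 0;
                }
            } else {
                if (len == word.length) {
                    word = Arrays.copyOf(word, len * 2);   // grow for long words
                }
                word[len++] = b;
            }
        }
        if (len > 0) {   // last word when the input does not end in whitespace
            counts.merge(new String(word, 0, len, StandardCharsets.UTF_8),
                         1L, Long::sum);
        }
        return counts;
    }

    // Maps the whole file read-only and delegates to the byte scanner.
    static Map<String, Long> countFile(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size > Integer.MAX_VALUE) {
                throw new IllegalArgumentException(
                        "file too large for one mapping; map it in several windows");
            }
            return count(ch.map(FileChannel.MapMode.READ_ONLY, 0, size));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Splitting on ASCII whitespace bytes is safe for UTF-8 input because those byte values never occur inside a multi-byte sequence. Removing the remaining per-word `String` allocation would need a custom hash table keyed on byte ranges, which is beyond this sketch.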
