Rust Deep Dive
How to Count Lines in a File Using Rust (The Right Way, and the Fast Way)
Count lines in a file using Rust — from .lines().count() to zero-allocation byte scanning. Covers the 8KB buffer trap, String allocation overhead, and concurrent multi-file processing with Rayon.
You have a 2GB log file. You need to count the lines.
Here are four Rust approaches:
| Method | Time shape | Allocation profile |
|---|---|---|
| `.lines().count()` | slowest | one owned `String` per line |
| `.lines().count()` with a larger reader buffer | better | still one `String` per line |
| `read_line()` plus buffer reuse | good | one reusable `String` |
| manual byte scan | fastest | fixed byte buffer |
The first method can be several times slower than the last on large logs. The reason is not mysterious: default buffering and allocation behavior matter.
This guide covers Rust line-counting methods, from the shortest snippet to zero-allocation byte scanning, plus Rayon-based parallel counting for directories of files.
If you only need a quick answer outside Rust, use the Line Counter tool. If you are building this into a CLI, service, or data pipeline, keep reading.
Quick Method Guide
| Method | Code size | Memory | Best use |
|---|---|---|---|
| `.lines().count()` | Tiny | allocation-heavy | Small files, quick scripts |
| `with_capacity(...).lines()` | Tiny | allocation-heavy | Medium files where simplicity wins |
| `read_line()` plus `clear()` | Medium | very low | General production default |
| manual byte scanning | More code | fixed buffer | Huge files and maximum speed |
| Rayon parallel counting | More code | controlled | Many files in a directory |
| `bytecount` crate | Small | whole file if read in one go | SIMD-accelerated counts for in-memory data |
For most Rust line-counting code, the practical default is a larger reader buffer plus read_line() reuse. For tools that count lines in large files, where raw speed matters, use manual byte scanning.
If you came from Go, the closest comparison is the Go line counting guide: both languages reward explicit buffer choices, but Rust's main trap is allocation behavior rather than a token-size error.
Method 1: .lines().count() - Simple but With Hidden Costs
The shortest version is:
use std::fs::File;
use std::io::{self, BufRead, BufReader};
fn count_lines_simple(filename: &str) -> io::Result<usize> {
let file = File::open(filename)?;
let reader = BufReader::new(file);
Ok(reader.lines().count())
}
This compiles, and it is fine for a quick experiment on a small trusted file.
It is not the best production answer. reader.lines().count() counts iterator items, including any Err item. If invalid UTF-8 or an I/O error appears, you do not get a clean io::Result failure.
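A small in-memory check makes the miscount concrete. This is a sketch, not part of the guide's counter: `count_items`, `first_error`, and `sample` are illustrative names, and a `Cursor` stands in for a file whose middle line is not valid UTF-8.

```rust
use std::io::{self, BufRead, Cursor};

/// Counts iterator items the way `.lines().count()` does: `Err` items included.
fn count_items<R: BufRead>(reader: R) -> usize {
    reader.lines().count()
}

/// Returns the first error the `lines()` iterator yields, if any.
fn first_error<R: BufRead>(reader: R) -> Option<io::Error> {
    reader.lines().find_map(|line| line.err())
}

/// Hypothetical sample input: three lines, the middle one invalid UTF-8.
fn sample() -> Cursor<&'static [u8]> {
    Cursor::new(&b"ok\n\xff\xfe\nend\n"[..])
}
```

Here `count_items(sample())` reports 3 even though only two lines decoded cleanly, while `first_error(sample())` surfaces the `InvalidData` error that `count()` silently absorbed.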
Use try_fold if you want to keep the .lines() style and propagate errors:
use std::fs::File;
use std::io::{self, BufRead, BufReader};
fn count_lines_checked(filename: &str) -> io::Result<usize> {
let file = File::open(filename)?;
let reader = BufReader::new(file);
reader.lines().try_fold(0usize, |count, line| {
line.map(|_| count + 1)
})
}
That is better, but three traps remain: two about performance and one about ownership.
Trap 1: the default reader buffer is currently 8 KiB
Rust's standard buffered reader documentation says the default capacity is currently 8 KiB. It also says this may change in the future, so code should not depend on the exact number.
For a large sequential file, 8 KiB is often conservative. A 1 GiB file read through 8 KiB chunks means roughly 131,000 buffer fills. A 1 MiB buffer reduces that to roughly 1,000 fills.
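The fill counts above are plain ceiling division. As a rough sketch (the helper name `buffer_fills` is made up here, and it assumes every fill returns a full buffer, which real reads do not guarantee):

```rust
/// Estimated number of buffer fills needed to read `file_size` bytes
/// through a buffer of `buf_size` bytes. Assumes each fill is full-sized.
/// Panics if `buf_size` is zero.
fn buffer_fills(file_size: u64, buf_size: u64) -> u64 {
    file_size.div_ceil(buf_size)
}
```

With a 1 GiB file (`1 << 30` bytes), `buffer_fills(1 << 30, 8 * 1024)` gives 131,072 and `buffer_fills(1 << 30, 1 << 20)` gives 1,024.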
Use BufReader::with_capacity when BufReader performance matters:
use std::fs::File;
use std::io::{self, BufRead, BufReader};
fn count_lines_buffered(filename: &str) -> io::Result<usize> {
let file = File::open(filename)?;
let reader = BufReader::with_capacity(1024 * 1024, file);
reader.lines().try_fold(0usize, |count, line| {
line.map(|_| count + 1)
})
}
A larger buffer is not automatically better forever. Benchmark values like 64 KiB, 256 KiB, and 1 MiB on your storage. The point is that the default capacity is chosen as a general-purpose default, not as the fastest setting for a single multi-GB scan.
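One way to try different capacities before reaching for a full benchmark harness is a small wall-clock helper. The sketch below is an assumption-laden illustration: `time_scan` is a hypothetical name, it counts newline bytes straight from the reader's buffer so that only buffering is measured, and a single timing like this is noisy (the OS page cache alone can swamp the difference).

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;
use std::time::{Duration, Instant};

/// Scans `path` through a `BufReader` of the given capacity and returns
/// the number of newline bytes seen plus the elapsed wall-clock time.
fn time_scan(path: &Path, capacity: usize) -> io::Result<(usize, Duration)> {
    let mut reader = BufReader::with_capacity(capacity, File::open(path)?);
    let mut newlines = 0usize;
    let start = Instant::now();
    loop {
        // `fill_buf` exposes the reader's internal buffer without copying.
        let chunk = reader.fill_buf()?;
        if chunk.is_empty() {
            break; // EOF
        }
        newlines += chunk.iter().filter(|&&b| b == b'\n').count();
        let consumed = chunk.len();
        reader.consume(consumed);
    }
    Ok((newlines, start.elapsed()))
}
```

Calling it in a loop over capacities such as `64 * 1024`, `256 * 1024`, and `1024 * 1024`, several times each, gives a first impression; Criterion remains the better tool for numbers you intend to publish.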
Trap 2: .lines() returns an owned String per line
The lines method consumes the reader and returns an iterator of io::Result<String>. Each line is an owned String.
That is convenient. It also means allocation churn:
for each line:
allocate or grow String
fill it with the line
return it
drop it after count() moves on
Rust has no garbage collector, so this is not GC pressure. It is allocator pressure: many heap allocations and frees. For a 100 million line log, that cost can dominate.
Important nuance: .lines().count() does not keep every line alive at once. The problem is not that all strings stay in memory. The problem is repeated allocation, UTF-8 validation, and ownership traffic for data you immediately discard.
Trap 3: .lines() consumes ownership
The lines(self) method takes ownership of self.
let file = File::open("data.txt")?;
let reader = BufReader::new(file);
let count = reader.lines().count();
// reader is moved. You cannot read from it again here.
That is often fine for a line counter, but it confuses beginners who try to count first and then keep using the same BufReader.
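If the reader is needed afterwards, one option is to call `lines()` on a mutable borrow, since `&mut R` also implements `BufRead`. A minimal sketch, with `skip_then_read` as a made-up helper name:

```rust
use std::io::{self, BufRead};

/// Skips `skip` lines through a temporary `lines()` iterator, then keeps
/// reading from the same reader and returns the next raw line.
fn skip_then_read<R: BufRead>(mut reader: R, skip: usize) -> io::Result<String> {
    // `lines()` consumes the `&mut` borrow here, not the reader itself.
    for line in (&mut reader).lines().take(skip) {
        line?; // propagate I/O or UTF-8 errors
    }
    let mut next = String::new();
    reader.read_line(&mut next)?; // the reader is still usable
    Ok(next)
}
```

The borrow ends with the `for` loop, so the final `read_line` call continues exactly where the iterator stopped.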
Use .lines() when:
- The file is small.
- You need the line strings anyway.
- You value the shortest code over speed.
Avoid it when counting lines in large files where you discard the line content.
Method 2: read_line() with Buffer Reuse - The Recommended Default
The more efficient Rust pattern is:
allocate one String
read one line into it
count
clear the String without freeing capacity
repeat
That is the core read_line() optimization.
use std::fs::File;
use std::io::{self, BufRead, BufReader};
fn count_lines_efficient(filename: &str) -> io::Result<usize> {
let file = File::open(filename)?;
let mut reader = BufReader::with_capacity(1024 * 1024, file);
let mut count = 0usize;
let mut line = String::new();
loop {
line.clear();
let bytes_read = reader.read_line(&mut line)?;
if bytes_read == 0 {
break;
}
count += 1;
}
Ok(count)
}
The read_line method appends to the provided String. The docs explicitly say previous content is preserved, so you need line.clear() before the next read if you want reuse without appending.
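The append behavior is easy to see with two reads into the same `String`. This is only an illustration; `read_twice` is a hypothetical helper and the `Cursor` stands in for a file.

```rust
use std::io::{self, BufRead, Cursor};

/// Reads two lines into the same `String`, optionally clearing it in between.
fn read_twice(input: &str, clear_between: bool) -> io::Result<String> {
    let mut reader = Cursor::new(input.as_bytes());
    let mut line = String::new();
    reader.read_line(&mut line)?;
    if clear_between {
        line.clear(); // drops the contents, keeps the allocation
    }
    reader.read_line(&mut line)?; // appends to whatever is already in `line`
    Ok(line)
}
```

For the input `"one\ntwo\n"`, skipping the `clear()` leaves `"one\ntwo\n"` in the buffer, while clearing between reads leaves only `"two\n"`.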
String::clear() resets length but keeps capacity:
let mut s = String::with_capacity(256);
s.push_str("hello");
assert_eq!(s.len(), 5);
assert!(s.capacity() >= 256);
s.clear();
assert_eq!(s.len(), 0);
assert!(s.capacity() >= 256);
That is why read_line() buffer reuse works: you keep one allocation and reuse it for every line. If one line is long, the buffer can grow, and later shorter lines reuse that capacity.
This method also counts the final line without a trailing newline. read_line() returns the number of bytes read; if it returns a positive number, a line was read. EOF is Ok(0).
Use this method when:
- You want a sensible default line-counting implementation.
- You need UTF-8 validation.
- You want low memory and readable code.
- You need to filter empty or comment lines.
Method 3: Manual Byte Scanning - Zero Allocation, Maximum Speed
If all you need is the number of newline bytes, skip text decoding and scan bytes.
use std::fs::File;
use std::io::{self, Read};
fn count_lines_fast(filename: &str) -> io::Result<usize> {
let mut file = File::open(filename)?;
let mut buf = [0u8; 64 * 1024];
let mut count = 0usize;
let mut saw_any_byte = false;
let mut last_byte = b'\n';
loop {
let bytes_read = file.read(&mut buf)?;
if bytes_read == 0 {
break;
}
saw_any_byte = true;
for &byte in &buf[..bytes_read] {
if byte == b'\n' {
count += 1;
}
}
last_byte = buf[bytes_read - 1];
}
if saw_any_byte && last_byte != b'\n' {
count += 1;
}
Ok(count)
}
This is the fastest pattern for counting lines in a large file because it avoids:
- per-line `String` allocation
- UTF-8 validation
- line-ending trimming
- iterator item construction
It treats the file as bytes. That is ideal for logs, CSV exports, and source files when line breaks are the only thing that matters.
Use read_line() instead when invalid UTF-8 should be an error.
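For unit tests it can help to make the byte scanner generic over `Read`, so it runs against in-memory bytes as well as files. The variant below is a sketch with an invented name (`count_lines_in`); it also retries reads that fail with `ErrorKind::Interrupted`, which is an addition on top of the version above rather than something this guide benchmarks.

```rust
use std::io::{self, ErrorKind, Read};

/// Counts lines in any byte source: newline bytes, plus one for an
/// unterminated final line.
fn count_lines_in<R: Read>(mut source: R) -> io::Result<usize> {
    let mut buf = [0u8; 64 * 1024];
    let mut count = 0usize;
    let mut last_byte = None;

    loop {
        let n = match source.read(&mut buf) {
            Ok(0) => break, // EOF
            Ok(n) => n,
            // An interrupted read is not a real failure; try again.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        count += buf[..n].iter().filter(|&&b| b == b'\n').count();
        last_byte = Some(buf[n - 1]);
    }

    // A non-empty input that does not end in '\n' still has a final line.
    if matches!(last_byte, Some(b) if b != b'\n') {
        count += 1;
    }
    Ok(count)
}
```

Because `&[u8]` implements `Read`, a test can call `count_lines_in("a\nb".as_bytes())` directly, and a file works the same way via `count_lines_in(File::open(path)?)`.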
Benchmark: Comparing the Methods
Criterion benchmark shape:
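# Cargo.toml
[dev-dependencies]
criterion = "0.5"
[[bench]]
name = "count_lines" # assumes the benchmark file is benches/count_lines.rs
harness = false      # required so criterion_main! can provide main()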
use criterion::{criterion_group, criterion_main, Criterion};
fn benchmark_count_lines(c: &mut Criterion) {
let filename = "benches/testdata/large.txt";
let mut group = c.benchmark_group("count_lines");
group.bench_function("lines_count_8kb_buf", |b| {
b.iter(|| count_lines_checked(filename).unwrap())
});
group.bench_function("lines_count_1mb_buf", |b| {
b.iter(|| count_lines_buffered(filename).unwrap())
});
group.bench_function("read_line_reuse", |b| {
b.iter(|| count_lines_efficient(filename).unwrap())
});
group.bench_function("manual_byte_scan", |b| {
b.iter(|| count_lines_fast(filename).unwrap())
});
group.finish();
}
criterion_group!(benches, benchmark_count_lines);
criterion_main!(benches);
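The benchmark expects `benches/testdata/large.txt` to exist. A possible way to generate such a file is sketched below; `write_test_file` and the log line shape are assumptions for illustration, and the parent directory must already exist because `File::create` does not create it.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Writes `lines` newline-terminated synthetic log lines to `path`
/// and returns the number of bytes written.
fn write_test_file(path: &Path, lines: usize) -> io::Result<u64> {
    let mut out = BufWriter::new(File::create(path)?);
    let mut bytes = 0u64;
    for i in 0..lines {
        // Hypothetical log shape; adjust it to match your real line lengths.
        let line = format!("2024-01-01T00:00:00Z INFO request id={i} status=200\n");
        out.write_all(line.as_bytes())?;
        bytes += line.len() as u64;
    }
    out.flush()?;
    Ok(bytes)
}
```

Line length affects the ranking between methods, so it is worth generating data that resembles the logs you actually process.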
Directional results for a 500MB newline-delimited file on SSD storage, release mode:
| Method | Time shape | Live memory | Main cost |
|---|---|---|---|
| `.lines()` with default buffer | slowest | low live memory, high allocation churn | 8 KiB buffer plus `String` allocation |
| `.lines()` with 1 MiB buffer | better | low live memory, high allocation churn | `String` allocation |
| `read_line` reuse with 1 MiB buffer | good | about one line buffer plus reader buffer | UTF-8 validation |
| manual byte scan | fastest | fixed 64 KiB buffer | raw I/O bandwidth |
The exact times depend on CPU, storage, allocator, line length, and file cache state. The ranking is the useful part:
- Larger buffering reduces repeated reads.
- `read_line` reuse removes per-line `String` allocation.
- Manual byte scanning removes both text decoding and line allocation.
That is the BufReader performance story in one paragraph.
Method 5: Parallel Line Counting with Rayon
Parallelism helps most when you have many files, not one sequential stream on one disk.
Use current crate versions:
[dependencies]
rayon = "1.12"
walkdir = "2.5"
Basic Rayon version:
use rayon::prelude::*;
use std::path::PathBuf;
use walkdir::WalkDir;
fn files_with_extension(dir: &str, extension: &str) -> Vec<PathBuf> {
WalkDir::new(dir)
.into_iter()
.filter_map(Result::ok)
.filter(|entry| entry.file_type().is_file())
.filter(|entry| {
entry
.path()
.extension()
.and_then(|ext| ext.to_str())
.map(|ext| ext == extension)
.unwrap_or(false)
})
.map(|entry| entry.path().to_owned())
.collect()
}
fn count_lines_parallel(dir: &str, extension: &str) -> usize {
let files = files_with_extension(dir, extension);
files
.par_iter()
.map(|path| {
count_lines_efficient(path.to_string_lossy().as_ref()).unwrap_or_else(|err| {
eprintln!("warning: {:?}: {}", path, err);
0
})
})
.sum()
}
This is the practical Rayon pattern for parallel line counting: collect paths, use par_iter(), count each file independently, then sum().
With progress:
use rayon::prelude::*;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
fn count_lines_parallel_with_progress(files: &[PathBuf]) -> usize {
let processed = Arc::new(AtomicUsize::new(0));
let total_files = files.len();
files
.par_iter()
.map(|path| {
let count = count_lines_efficient(path.to_string_lossy().as_ref()).unwrap_or(0);
let done = processed.fetch_add(1, Ordering::Relaxed) + 1;
if done % 100 == 0 || done == total_files {
eprintln!("progress: {}/{} files", done, total_files);
}
count
})
.sum()
}
Use Rayon-based parallel counting for directories, repositories, and rotated logs. For one huge file on one disk, manual byte scanning is usually more predictable.
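When adding a dependency is not an option, the same "count each file independently, then sum" idea can be approximated with `std::thread::scope`. To be clear, this is a standard-library substitute for Rayon, not Rayon itself: it splits the list into fixed chunks with no work stealing, and `count_file_lines` and `count_lines_scoped` are names invented for this sketch. Unlike the versions above, it propagates the first I/O error instead of counting a failed file as zero.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::thread;

/// Per-file counter: newline bytes, plus one for an unterminated final line.
fn count_file_lines(path: &Path) -> io::Result<usize> {
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; 64 * 1024];
    let mut count = 0usize;
    let mut last_byte = None;
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        count += buf[..n].iter().filter(|&&b| b == b'\n').count();
        last_byte = Some(buf[n - 1]);
    }
    if matches!(last_byte, Some(b) if b != b'\n') {
        count += 1;
    }
    Ok(count)
}

/// Splits `files` into at most `workers` chunks and counts each chunk
/// on its own scoped thread.
fn count_lines_scoped(files: &[PathBuf], workers: usize) -> io::Result<usize> {
    // `chunks(0)` would panic, so keep the chunk length at least 1.
    let chunk_len = files.len().div_ceil(workers.max(1)).max(1);
    thread::scope(|scope| {
        // Spawn every worker first, then join them.
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || -> io::Result<usize> {
                    let mut total = 0usize;
                    for path in chunk {
                        total += count_file_lines(path)?;
                    }
                    Ok(total)
                })
            })
            .collect();

        let mut total = 0usize;
        for handle in handles {
            total += handle.join().expect("worker thread panicked")?;
        }
        Ok(total)
    })
}
```

Static chunking works reasonably when files are similar in size; with a few very large files among many small ones, Rayon's work stealing balances the load better.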
Part 6: A Production-Ready Rust Line Counter
This module gives you both a text-aware path and a raw byte path.
use std::fs::File;
use std::io::{self, BufRead, BufReader, Read};
use std::path::Path;
#[derive(Debug, Clone)]
pub struct CountOptions {
pub buffer_size: usize,
pub skip_empty: bool,
pub skip_comments: bool,
}
impl Default for CountOptions {
fn default() -> Self {
Self {
buffer_size: 1024 * 1024,
skip_empty: false,
skip_comments: false,
}
}
}
pub fn count_file<P: AsRef<Path>>(path: P, opts: &CountOptions) -> io::Result<usize> {
let file = File::open(path)?;
let mut reader = BufReader::with_capacity(opts.buffer_size, file);
count_reader(&mut reader, opts)
}
pub fn count_reader<R: BufRead>(reader: &mut R, opts: &CountOptions) -> io::Result<usize> {
let mut count = 0usize;
let mut line = String::new();
loop {
line.clear();
let bytes = reader.read_line(&mut line)?;
if bytes == 0 {
break;
}
let trimmed = line.trim_end_matches('\n').trim_end_matches('\r');
if opts.skip_empty && trimmed.is_empty() {
continue;
}
if opts.skip_comments && trimmed.starts_with('#') {
continue;
}
count += 1;
}
Ok(count)
}
pub fn count_file_fast<P: AsRef<Path>>(path: P) -> io::Result<usize> {
let mut file = File::open(path)?;
let mut buf = vec![0u8; 64 * 1024];
let mut count = 0usize;
let mut saw_any_byte = false;
let mut last_byte = b'\n';
loop {
let n = file.read(&mut buf)?;
if n == 0 {
break;
}
saw_any_byte = true;
count += count_newlines(&buf[..n]);
last_byte = buf[n - 1];
}
if saw_any_byte && last_byte != b'\n' {
count += 1;
}
Ok(count)
}
#[inline]
fn count_newlines(buf: &[u8]) -> usize {
buf.iter().filter(|&&b| b == b'\n').count()
}
Usage:
let count = count_file("data.txt", &CountOptions::default())?;
let opts = CountOptions {
skip_empty: true,
skip_comments: true,
..Default::default()
};
let count = count_file("config.txt", &opts)?;
let count = count_file_fast("access.log")?;
Production checklist:
- Use a larger buffered reader capacity for large sequential files.
- Use `read_line` plus `clear()` instead of `.lines()` when discarding line content.
- Propagate errors with `?`, not `unwrap()`.
- Decide whether invalid UTF-8 should fail or be ignored.
- Handle final lines without trailing newline.
- Normalize CRLF when inspecting line content.
- Use Rayon for many files, not as a reflex for one file.
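On the invalid UTF-8 item: if such lines should be tolerated but you still want per-line filtering, one possible middle ground is `BufRead::read_until` with a reusable `Vec<u8>`, which never validates text. This is a different technique from the `read_line` path in the module above, shown as a sketch with a hypothetical name (`count_lines_raw`) and only the empty-line filter.

```rust
use std::io::{self, BufRead};

/// Counts lines as raw bytes, reusing one `Vec<u8>`. Invalid UTF-8 is
/// neither detected nor rejected. Optionally skips empty lines.
fn count_lines_raw<R: BufRead>(mut reader: R, skip_empty: bool) -> io::Result<usize> {
    let mut count = 0usize;
    let mut line: Vec<u8> = Vec::new();

    loop {
        line.clear(); // keep capacity, drop contents
        if reader.read_until(b'\n', &mut line)? == 0 {
            break; // EOF
        }
        // Strip one trailing LF, then one trailing CR, before inspecting.
        let mut content = line.as_slice();
        if content.last() == Some(&b'\n') {
            content = &content[..content.len() - 1];
        }
        if content.last() == Some(&b'\r') {
            content = &content[..content.len() - 1];
        }
        if skip_empty && content.is_empty() {
            continue;
        }
        count += 1;
    }
    Ok(count)
}
```

The trade-off is explicit: this path counts and filters lines that `read_line` would reject, so pick it only when rejecting bad input is not a requirement.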
Part 7: Special Scenarios
stdin
use std::io::{self, BufRead};
fn main() -> io::Result<()> {
let stdin = io::stdin();
let mut reader = stdin.lock();
let mut count = 0usize;
let mut line = String::new();
loop {
line.clear();
if reader.read_line(&mut line)? == 0 {
break;
}
count += 1;
}
println!("{}", count);
Ok(())
}
For a quick terminal alternative, use wc -l. Note that wc -l counts newline bytes, so it does not count a final line that lacks a trailing newline.
In-memory strings
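Rust's str::lines() is convenient: it splits on \n and \r\n, strips the terminator, and does not yield an extra empty item for a trailing newline. It does not treat a bare \r or Unicode separators such as U+2028 as line breaks: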
fn count_lines_str(s: &str) -> usize {
s.lines().count()
}
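Collecting the pieces instead of counting them shows the splitting rules directly. `split_lines` is just an illustrative wrapper around `str::lines()`:

```rust
/// Collects the pieces `str::lines()` yields so the splitting rules are visible.
fn split_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}
```

An empty string yields no lines, `"a\nb\n"` yields `["a", "b"]`, `"a\r\nb"` also yields `["a", "b"]`, `"a\n\n"` yields `["a", ""]`, and a bare carriage return stays inside the line, so `"a\rb"` yields a single item.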
If you want raw newline-byte behavior where "a\n" is one line and "a" is one line:
fn count_lines_str_bytes(s: &str) -> usize {
if s.is_empty() {
return 0;
}
let newlines = s.as_bytes().iter().filter(|&&b| b == b'\n').count();
if s.as_bytes().last() == Some(&b'\n') {
newlines
} else {
newlines + 1
}
}
To run the same read_line() loop over an in-memory string, wrap its bytes in a Cursor:
use std::io::{BufRead, Cursor};
fn count_lines_with_cursor(s: &str) -> std::io::Result<usize> {
let mut cursor = Cursor::new(s.as_bytes());
let mut count = 0usize;
let mut line = String::new();
loop {
line.clear();
if cursor.read_line(&mut line)? == 0 {
break;
}
count += 1;
}
Ok(count)
}
Windows line endings
The lines() iterator removes \n and CRLF from returned strings. read_line() keeps the line ending in the buffer.
That is why the production helper trims both:
let trimmed = line.trim_end_matches('\n').trim_end_matches('\r');
For pure counting, CRLF needs no special handling because each Windows line still contains one \n byte.
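One detail worth knowing when inspecting content: `trim_end_matches` removes every trailing match, not just one. If you need to strip exactly one LF or CRLF terminator, a stricter helper can be sketched with `strip_suffix` (the name `strip_line_ending` is made up for this example):

```rust
/// Removes exactly one trailing "\n" or "\r\n" and nothing else.
fn strip_line_ending(line: &str) -> &str {
    match line.strip_suffix('\n') {
        // Only look for '\r' when a '\n' was actually removed.
        Some(rest) => rest.strip_suffix('\r').unwrap_or(rest),
        None => line,
    }
}
```

For buffers filled by `read_line()` the two approaches agree on ordinary LF and CRLF lines. They differ on inputs like `"a\r\r\n"`, where this helper keeps the extra `\r` and the chained `trim_end_matches` calls remove it.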
bytecount crate
For in-memory data, the bytecount crate can be faster than a manual loop.
[dependencies]
bytecount = "0.6.9"
fn count_lines_bytecount(filename: &str) -> std::io::Result<usize> {
let data = std::fs::read(filename)?;
let newlines = bytecount::count(&data, b'\n');
let count = if !data.is_empty() && data.last() != Some(&b'\n') {
newlines + 1
} else {
newlines
};
Ok(count)
}
The bytecount docs describe bytecount::count as a fast byte occurrence counter. Optional features can enable runtime SIMD dispatch for SSE2 and AVX2 code paths. Use bytecount when the data already fits in memory or when a read-all approach is acceptable.
Which Rust Method Should You Use?
Need to count lines in Rust?
|
+-- Small trusted file?
| +-- .lines().try_fold(...)
|
+-- Large text file and want UTF-8 validation?
| +-- BufReader::with_capacity + read_line + clear
|
+-- Huge file and only newline bytes matter?
| +-- Manual byte scanning
|
+-- Data already in memory?
| +-- bytecount::count or str::lines
|
+-- Many files?
+-- Rayon par_iter + per-file counter
The shortest answer is .lines().count(). The right production answer is usually read_line() with buffer reuse. The fastest answer is manual byte scanning.
FAQ
How do I count lines in a file in Rust?
Use BufReader::with_capacity, then call read_line() in a loop while reusing a single String with clear().
What is the fastest way to count lines in Rust?
Manual byte scanning is usually fastest. It reads fixed-size byte chunks and counts b'\n' without allocating or validating UTF-8.
Why is .lines().count() slow in Rust?
The lines iterator returns owned String values, so it allocates per line. The default buffer is currently 8 KiB, which may be small for large sequential files.
How do I fix slow BufReader performance in Rust?
Use BufReader::with_capacity, commonly 1 MiB for large files, and prefer read_line() buffer reuse when you are discarding line content. This is the core BufReader performance fix.
How do I count lines without allocating in Rust?
Use manual byte scanning with Read::read into a fixed buffer and count b'\n'. Add one if a non-empty file does not end with a newline.
How do I count lines in parallel in Rust?
Use Rayon par_iter() over a collection of file paths. Parallel counting with Rayon is best for many files, not one ordinary sequential read.
How do I count lines in a String in Rust?
Use s.lines().count() for Rust string line semantics, or count newline bytes and compensate for a missing trailing newline if you need file-style physical line counts.
Does Rust handle Windows line endings when counting?
Yes for basic line counts. The lines iterator removes CRLF endings from returned strings. read_line() keeps the ending, so trim \n and then \r if you inspect content.
Sources Checked
- Rust `BufReader` documentation for default capacity and `with_capacity`: https://doc.rust-lang.org/std/io/struct.BufReader.html
- Rust `BufRead` documentation for `read_line`, `lines`, ownership, UTF-8 errors, and CRLF behavior: https://doc.rust-lang.org/std/io/trait.BufRead.html
- Rust forum discussion of `lines()` allocation cost versus `read_line()` buffer reuse: https://users.rust-lang.org/t/why-using-the-read-lines-iterator-is-much-slower-than-using-read-line/92815
- Rayon 1.12.0 release notes: https://docs.rs/crate/rayon/latest/source/RELEASES.md
- walkdir 2.5.0 documentation: https://docs.rs/walkdir/latest/walkdir/
- bytecount 0.6.9 documentation and SIMD feature notes: https://docs.rs/bytecount
Related Guides and Tools
- Go line counting (similar BufReader traps) for scanner buffers, gzip logs, and goroutines.
- Python line counting for scripts, CSVs, and large-file methods.
- C line counting for `fread`, `mmap`, `wc -l` internals, and the `fgets` long-line trap.
- Haskell lazy I/O and ByteString counting for `readFile`, strict handle scopes, and newline-byte fixes.
- Swift line counting for `FileHandle`, `URL.lines`, SwiftUI progress, and the `autoreleasepool` memory trap.
- Java line counting for JVM services and benchmarked file I/O.
- wc -l for shell workflows.
- Line Counter tool for no-code line counting.
Counting Lines Should Not Require a Cargo Project
If you need a quick line count on a log file, a CSV export, or any text file, paste it into the Line Counter. No Rust toolchain required, no allocation choices, just the number.