C++ Deep Dive
How to Count Lines in a File in C++ (And Two Classic Traps That Catch Everyone)
Count lines in a file in C++ using ifstream, istreambuf_iterator, fread, and mmap. Covers the while(!eof()) off-by-one trap, the single-pass istreambuf_iterator reset issue, and why newline-byte counters must fix the missing final newline case.
Two C++ line-counting bugs show up so often that they deserve to be treated as language lore.
The first one is the classic while(!eof()) loop:
while (!file.eof()) {
std::getline(file, line);
++count;
}
It looks reasonable. It is also the wrong way to count lines with std::ifstream, because eof() only becomes true after a read has already run into the end of the file, which is one iteration too late for the loop condition.
The second trap is more subtle. Many people switch to std::count plus std::istreambuf_iterator<char>:
auto count = std::count(
std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>(),
'\n'
);
That is a tidy modern answer to c++ istreambuf_iterator count lines, but it has two consequences:
- the stream position ends up at the end of the file
- it only counts newline bytes, not logical lines
So if you try to getline from the same stream afterward, you need to reset it. And if the file is non-empty but does not end with '\n', a plain newline counter undercounts by one.
Those are the real issues behind c++ count lines in file:
- correctness for the last line
- correct stream-state handling
- performance on large files
- portability between standard C++ and POSIX-only fast paths
This guide walks through the full ladder:
- std::getline for the correct, readable baseline
- std::istreambuf_iterator for compact byte scanning
- fread for the fastest practical cross-platform default
- mmap for a POSIX-only mapped-file fast path
If you only want the short answer:
- small or medium file, readable code first: while (std::getline(file, line))
- large file, count only, portable speed: fread with a 64 KiB buffer
- POSIX-only environment and benchmark-driven tuning: mmap
Quick Method Guide
| I want to... | Use this | Main warning |
|---|---|---|
| Write the simplest correct c++ count lines loop | while (std::getline(file, line)) | do not write while (!eof()) |
| Write concise modern byte-scanning code | std::count with std::istreambuf_iterator<char> | reset the stream before reusing it |
| Get the best cross-platform performance | fread plus a byte scan | fix the missing final newline case |
| Push for the POSIX fast path | mmap plus a byte scan | benchmark it and own the portability cost |
| Reuse the stream after counting | file.clear(); file.seekg(0); | required if a failed extraction already set failbit |
For most production c++ count lines in file work, fread is the most practical answer.
Method 1: std::getline - The Correct Basic Approach
This is the baseline every c++ ifstream count lines article should start with:
#include <fstream>
#include <stdexcept>
#include <string>
long count_lines_getline(const std::string& filename) {
std::ifstream file(filename);
if (!file.is_open()) {
throw std::runtime_error("Cannot open: " + filename);
}
long count = 0;
std::string line;
while (std::getline(file, line)) {
++count;
}
return count;
}
Why this remains the cleanest beginner answer:
- it counts logical lines, not just '\n' bytes
- it correctly handles a last line without a trailing newline
- it maps directly to how most people think about c++ count lines in file
The while(!eof()) trap
This is the classic broken while (!file.eof()) pattern:
int count = 0;
std::string line;
while (!file.eof()) {
std::getline(file, line);
++count;
}
Why it overcounts:
- eof() is checked before the extraction.
- The last successful getline reads the final line.
- The loop runs again because EOF has not been observed through a failed read yet.
- The next getline fails.
- You still increment count.
That is why correct c++ count lines code makes the read itself control the loop:
while (std::getline(file, line)) {
++count;
}
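To see the off-by-one concretely, here is a minimal sketch that runs both loop shapes over an in-memory std::istringstream instead of a file. The names count_with_eof_loop and count_with_getline_loop are made up for this illustration, and the sketch assumes a string stream sets its state bits the same way std::ifstream does for std::getline.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Buggy shape: tests eof() before the read that would discover it.
long count_with_eof_loop(std::istream& in) {
    long count = 0;
    std::string line;
    while (!in.eof()) {
        std::getline(in, line);  // the final call can fail, but is still counted
        ++count;
    }
    return count;
}

// Correct shape: the read itself is the loop condition.
long count_with_getline_loop(std::istream& in) {
    long count = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++count;
    }
    return count;
}
```

For the input "alpha\nbeta\n" the buggy loop returns 3 and the correct loop returns 2. For an empty input the buggy loop returns 1. When the input lacks a trailing newline ("alpha\nbeta"), the buggy loop happens to return the right answer, which is part of why the bug survives casual testing.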
Skipping empty lines or comment lines
If your real goal is "content lines" rather than raw line count, build from the same loop:
long count_nonempty_lines(const std::string& filename) {
std::ifstream file(filename);
std::string line;
long count = 0;
while (std::getline(file, line)) {
if (!line.empty()) {
++count;
}
}
return count;
}
long count_data_lines(const std::string& filename) {
std::ifstream file(filename);
std::string line;
long count = 0;
while (std::getline(file, line)) {
if (!line.empty() && line[0] != '#') {
++count;
}
}
return count;
}
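The two functions above share one loop and differ only in the filter. One hedged way to factor that out is a small template that takes any std::istream plus a predicate. The names count_lines_if and is_blank are hypothetical, and treating whitespace-only lines as blank is an assumption of this sketch, not something the functions above do.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Counts the lines for which keep(line) returns true.
template <typename Pred>
long count_lines_if(std::istream& in, Pred keep) {
    long count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (keep(line)) {
            ++count;
        }
    }
    return count;
}

// Assumption for this sketch: a line made only of whitespace counts as blank.
bool is_blank(const std::string& line) {
    return std::all_of(line.begin(), line.end(), [](unsigned char c) {
        return std::isspace(c) != 0;
    });
}
```

Passing a lambda such as `[](const std::string& l) { return !is_blank(l) && l[0] != '#'; }` reproduces the "data lines" rule while also ignoring whitespace-only lines. Taking std::istream& rather than a filename keeps the helper testable with std::istringstream.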
Method 2: std::istreambuf_iterator - Compact, Single-Pass, and Easy to Misuse
This is the tight modern one-liner many people reach for:
#include <algorithm>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
long count_newline_bytes(const std::string& filename) {
std::ifstream file(filename, std::ios::binary);
if (!file.is_open()) {
throw std::runtime_error("Cannot open: " + filename);
}
return std::count(
std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>(),
'\n'
);
}
It is concise, but it is not a complete c++ count lines in file answer yet.
A newline-byte counter has to answer two more questions:
- is the file empty?
- if it is non-empty, does it end with '\n'?
Here is the corrected version:
#include <algorithm>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
long count_lines_iter(const std::string& filename) {
std::ifstream file(filename, std::ios::binary);
if (!file.is_open()) {
throw std::runtime_error("Cannot open: " + filename);
}
file.seekg(0, std::ios::end);
const std::streamoff size = file.tellg();
if (size == 0) {
return 0;
}
file.seekg(0, std::ios::beg);
long count = std::count(
std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>(),
'\n'
);
file.clear();
file.seekg(-1, std::ios::end);
char last = '\n';
file.get(last);
if (last != '\n') {
++count;
}
return count;
}
That last-byte fix matters. A non-empty file containing:
alpha
beta
has two logical lines, even if there is only one newline byte.
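The rule is easier to check on an in-memory buffer than on a stream. The sketch below applies the same "newline bytes, plus one if the data does not end with '\n'" logic to a std::string_view. The name logical_lines is hypothetical, and the code assumes C++17 for std::string_view.

```cpp
#include <algorithm>
#include <string_view>

// Logical line count for an in-memory buffer:
// newline bytes, plus one if non-empty data does not end with '\n'.
long logical_lines(std::string_view data) {
    if (data.empty()) {
        return 0;
    }
    long count = static_cast<long>(std::count(data.begin(), data.end(), '\n'));
    if (data.back() != '\n') {
        ++count;
    }
    return count;
}
```

With this helper, "alpha\nbeta" and "alpha\nbeta\n" both yield 2, "alpha" yields 1, and an empty buffer yields 0.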
Trap 2: the stream is at the end after counting
This is the second classic bug:
std::ifstream file("data.txt");
auto count = std::count(
std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>(),
'\n'
);
std::string line;
while (std::getline(file, line)) {
// reads nothing
}
std::istreambuf_iterator is a single-pass input iterator. After that count finishes, the stream position is at the end.
If you want to read again immediately, reset the stream explicitly:
file.clear();
file.seekg(0);
That is the safest istreambuf_iterator seekg pattern.
Why both calls?
- seekg(0) is enough in the simple case where you reset the stream immediately after the iterator pass
- clear() becomes necessary once a later extraction has already failed and set failbit
So the robust habit is clear() first, then seekg().
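Here is a small sketch of the whole count-then-reread sequence in one function. It takes a generic std::istream so it can be exercised with std::istringstream. The name count_then_reread is hypothetical, and the sketch assumes the stream is seekable, which holds for regular files and string streams but not for pipes.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>

// Returns {newline bytes, lines read by getline after the reset}.
std::pair<long, long> count_then_reread(std::istream& in) {
    const long newlines = static_cast<long>(std::count(
        std::istreambuf_iterator<char>(in),
        std::istreambuf_iterator<char>(),
        '\n'));

    in.clear();   // drop any failbit/eofbit left by earlier reads
    in.seekg(0);  // move the read position back to the start

    long lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++lines;
    }
    return {newlines, lines};
}
```

For the input "alpha\nbeta\ngamma" this returns 2 newline bytes and 3 lines from the second pass, which also shows the newline-versus-logical-line gap in a single call.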
Do not use std::istream_iterator<char> here
For newline counting, std::istream_iterator<char> is the wrong tool:
std::istream_iterator<char> wrong_begin(file);
That iterator performs formatted extraction and normally skips whitespace, including newlines. For c++ count lines, use std::istreambuf_iterator<char> when you want raw byte traversal.
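The difference is easy to demonstrate side by side. In this sketch both helpers run std::count over the same text, once through each iterator type. The function names are made up for the comparison, and the string stream keeps its default skipws flag.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Formatted iterator: operator>> skips whitespace, so '\n' is never seen.
long newlines_via_istream_iterator(const std::string& text) {
    std::istringstream in(text);
    return static_cast<long>(std::count(
        std::istream_iterator<char>(in),
        std::istream_iterator<char>(),
        '\n'));
}

// Unformatted iterator: every byte, including '\n', is visited.
long newlines_via_istreambuf_iterator(const std::string& text) {
    std::istringstream in(text);
    return static_cast<long>(std::count(
        std::istreambuf_iterator<char>(in),
        std::istreambuf_iterator<char>(),
        '\n'));
}
```

For "a\nb\n" the first helper returns 0 and the second returns 2.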
Method 3: fread - The Best Cross-Platform Performance Default
If the job is only counting lines in a large file, you do not need a std::string allocation per line. That is where an fread-based counter becomes the workhorse.
#include <array>
#include <cstdio>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
long count_lines_fread(const std::string& filename) {
auto closer = [](FILE* fp) {
if (fp) {
std::fclose(fp);
}
};
std::unique_ptr<FILE, decltype(closer)>
fp(std::fopen(filename.c_str(), "rb"), closer);
if (!fp) {
throw std::runtime_error("Cannot open: " + filename);
}
constexpr std::size_t chunk_size = 65536;
std::array<unsigned char, chunk_size> buf{};
long count = 0;
bool saw_any = false;
unsigned char last = '\n';
std::size_t bytes = 0;
while ((bytes = std::fread(buf.data(), 1, buf.size(), fp.get())) > 0) {
saw_any = true;
for (std::size_t i = 0; i < bytes; ++i) {
if (buf[i] == '\n') {
++count;
}
}
last = buf[bytes - 1];
}
if (std::ferror(fp.get())) {
throw std::runtime_error("fread failed: " + filename);
}
if (saw_any && last != '\n') {
++count;
}
return count;
}
Why this is such a strong answer to c++ count lines large file:
- one fixed buffer
- no per-line std::string work
- one pass
- portable across Windows, Linux, and macOS
Open in "rb" mode when you want raw byte semantics, especially if you care about consistent Windows and Linux behavior.
If you want the same systems-level I/O reasoning in plain C, see C fread and mmap patterns.
Why fread often beats line-oriented C++ I/O
The core cost difference is simple:
- std::getline has to find delimiters and manage a mutable string object
- fread fills a fixed buffer and lets you scan bytes directly
That is why c++ fread count lines is usually the first serious performance step up from c++ ifstream count lines.
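The only state a chunked byte scan has to carry between reads is the running newline count, the last byte seen, and whether any data arrived at all. A hedged way to isolate that state is a small accumulator that accepts chunks from any source. ChunkLineCounter is a hypothetical name, and using std::count for the scan instead of a hand-written loop is a choice made for this sketch; whether it is faster is something to measure, not assume.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: accumulates a logical line count across chunks.
class ChunkLineCounter {
public:
    // Feed one chunk of raw bytes (for example, what one fread call returned).
    void feed(const char* data, std::size_t size) {
        if (size == 0) {
            return;
        }
        newlines_ += static_cast<long>(std::count(data, data + size, '\n'));
        last_ = data[size - 1];
        saw_any_ = true;
    }

    // Newline bytes, plus one if the data did not end with '\n'.
    long lines() const {
        return newlines_ + ((saw_any_ && last_ != '\n') ? 1 : 0);
    }

private:
    long newlines_ = 0;
    char last_ = '\n';
    bool saw_any_ = false;
};
```

Feeding "alpha\nbe", "ta\ngam", and "ma" as three chunks gives 3 lines, the same answer as scanning "alpha\nbeta\ngamma" in one piece, so chunk boundaries do not affect the result.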
Method 4: mmap - The POSIX Mapped-File Fast Path
On Linux and macOS, mapped-file scanning can be excellent:
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
long count_lines_mmap(const std::string& filename) {
const int fd = ::open(filename.c_str(), O_RDONLY);
if (fd == -1) {
throw std::runtime_error("open failed: " + filename);
}
struct stat st {};
if (::fstat(fd, &st) == -1) {
::close(fd);
throw std::runtime_error("fstat failed: " + filename);
}
if (st.st_size == 0) {
::close(fd);
return 0;
}
auto* data = static_cast<const char*>(
::mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0)
);
if (data == MAP_FAILED) {
::close(fd);
throw std::runtime_error("mmap failed: " + filename);
}
::madvise(const_cast<char*>(data), st.st_size, MADV_SEQUENTIAL);
long count = 0;
for (off_t i = 0; i < st.st_size; ++i) {
if (data[i] == '\n') {
++count;
}
}
if (data[st.st_size - 1] != '\n') {
++count;
}
::munmap(const_cast<char*>(data), st.st_size);
::close(fd);
return count;
}
This is the POSIX answer to "I want c++ count lines with the fewest copies possible."
But mmap is not free:
- it is not standard C++
- Windows needs a different mapping API
- you still do O(n) work because every byte must be examined
- gains over a tuned fread loop depend on the file, the kernel, and cache behavior
So for most teams, the real ladder is:
- start with std::getline for correctness and clarity
- move to fread when c++ count lines large file becomes the problem
- consider mmap only after measuring
Benchmark: Useful Ranking, Not a Universal Promise
I measured the methods above on this machine with a warm page cache and a synthetic 199 MiB text file containing 3.2 million fixed-width lines.
Environment:
- g++ 13.3.0
- Linux 6.17
- 199 MiB file in /tmp
- all methods verified to return the same logical line count
Directional results:
| Method | Time on this machine | What it means |
|---|---|---|
| std::getline loop | 175 ms | simple and correct, but still pays line-oriented parsing cost |
| std::istreambuf_iterator plus std::count | 319 ms | compact code, but not a guaranteed speed trick |
| fread with 64 KiB buffer | 145 ms | best cross-platform performance default |
| mmap plus byte scan | 134 ms | strong POSIX fast path, but only slightly ahead here |
Three important conclusions:
- the ranking matters more than the exact milliseconds
- std::istreambuf_iterator is elegant, but not automatically faster than std::getline
- fread is the safest serious answer when you need c++ count lines large file
I did not include while (!eof()) in the useful ranking because it is a correctness bug first and a benchmark candidate second.
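If you want to repeat the comparison on your own machine, a timing helper along these lines is enough to get directional numbers. This is not the harness used for the table above; best_time_ms is a hypothetical name, and taking the best of several runs is an assumption about how to reduce noise on a warm page cache.

```cpp
#include <chrono>

// Hypothetical helper: runs fn() `runs` times and returns the best
// wall-clock time in milliseconds (or -1.0 if runs is not positive).
template <typename Fn>
double best_time_ms(Fn&& fn, int runs = 5) {
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        fn();
        const std::chrono::duration<double, std::milli> elapsed =
            clock::now() - start;
        if (best < 0.0 || elapsed.count() < best) {
            best = elapsed.count();
        }
    }
    return best;
}
```

A call might look like `best_time_ms([&] { total = count_lines_fread(path); })`, where storing the result in a variable you later print keeps the compiler from discarding the work.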
Part 5: A Production-Ready C++17 Line Counter
If you want one small module that keeps the choices explicit, this is a good starting point:
#pragma once
#include <array>
#include <cstdio>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <memory>
#include <optional>
#include <string>
namespace lc {
struct Result {
long total = 0;
long empty = 0;
long nonempty = 0;
std::uintmax_t file_size = 0;
};
inline std::optional<Result> count_getline(
const std::filesystem::path& path
) {
std::ifstream file(path);
if (!file.is_open()) {
return std::nullopt;
}
Result result;
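std::error_code ec;  // non-throwing overload, so failures stay in the optional
result.file_size = std::filesystem::file_size(path, ec);
if (ec) {
return std::nullopt;
}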
std::string line;
while (std::getline(file, line)) {
++result.total;
if (line.empty()) {
++result.empty;
} else {
++result.nonempty;
}
}
return result;
}
inline std::optional<long> count_fast(
const std::filesystem::path& path
) {
auto closer = [](FILE* fp) {
if (fp) {
std::fclose(fp);
}
};
std::unique_ptr<FILE, decltype(closer)>
fp(std::fopen(path.string().c_str(), "rb"), closer);
if (!fp) {
return std::nullopt;
}
constexpr std::size_t chunk_size = 65536;
std::array<unsigned char, chunk_size> buf{};
long count = 0;
bool saw_any = false;
unsigned char last = '\n';
std::size_t bytes = 0;
while ((bytes = std::fread(buf.data(), 1, buf.size(), fp.get())) > 0) {
saw_any = true;
for (std::size_t i = 0; i < bytes; ++i) {
if (buf[i] == '\n') {
++count;
}
}
last = buf[bytes - 1];
}
if (std::ferror(fp.get())) {
return std::nullopt;
}
if (saw_any && last != '\n') {
++count;
}
return count;
}
inline void rewind_after_iterator(std::ifstream& file) {
file.clear();
file.seekg(0);
}
} // namespace lc
This gives you:
- a clean std::getline path for richer per-line logic
- a fast fread path when only the count matters
- an explicit helper for the istreambuf_iterator seekg reset case
Part 6: Edge Cases That Change the Answer
Empty file
Every method should return 0 for an empty file.
One line without a trailing newline
This file:
alpha
still has one logical line.
That means:
- std::getline counts it correctly
- pure newline-byte counting returns 0 unless you add the final-line fix
Windows CRLF line endings
When you count only '\n', Windows "\r\n" still works because each logical line contains one newline byte.
For byte-scanning methods, use binary mode:
std::ifstream file(filename, std::ios::binary);
std::fopen(filename.c_str(), "rb");
That keeps line-counting behavior predictable across Windows and Linux.
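CRLF does change one thing for std::getline-based filters. When a "\r\n" file is read without newline translation (binary mode, or any mode on Linux and macOS), each line still ends with '\r', so a blank line arrives as "\r" and fails an empty() check. The sketch below strips that byte before testing. Both function names are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Removes one trailing '\r', which getline leaves behind when a CRLF file
// is read without newline translation.
void strip_trailing_cr(std::string& line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
}

// Counts lines that still have content after the '\r' is stripped.
long count_nonempty_lines_crlf_aware(std::istream& in) {
    long count = 0;
    std::string line;
    while (std::getline(in, line)) {
        strip_trailing_cr(line);
        if (!line.empty()) {
            ++count;
        }
    }
    return count;
}
```

For "alpha\r\n\r\nbeta\r\n" this returns 2, and it returns the same 2 for the LF-only equivalent "alpha\n\nbeta\n". The total line count is unaffected either way; only content-based filters need the extra step.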
Reusing the same stream
If you count with std::istreambuf_iterator and then need std::getline afterward, reset the stream:
file.clear();
file.seekg(0);
If you never need the stream again, you can ignore the reset entirely and close the file normally.
Which C++ Method Should You Use?
The practical answer is shorter than the ecosystem of examples makes it seem:
- default readable c++ count lines in file code: std::getline
- default fast c++ count lines large file code: fread
- neat iterator-based code where you also understand the reset and last-line rules: std::istreambuf_iterator
- POSIX-only tuning where you have benchmark evidence: mmap
That is the real hierarchy.
FAQ
How do I count lines in C++?
Use std::ifstream plus while (std::getline(file, line)). It is the cleanest correct baseline for c++ ifstream count lines.
Why does while(!eof()) count one extra line?
Because eof() only flips after a read fails. That is why the while (!eof()) loop is such a common trap.
Why does getline read nothing after istreambuf_iterator?
Because the iterator already consumed the stream to the end. Use file.clear(); file.seekg(0); before trying to read again.
Is std::istreambuf_iterator faster than std::getline?
Not automatically. It is a more compact raw-byte iterator, not a guaranteed performance win. In my local benchmark for this article, it was slower than the plain std::getline loop.
What is the fastest portable way to count lines in C++?
For most workloads, c++ fread count lines is the strongest cross-platform default because it avoids per-line string management.
How do I count lines in a large C++ file?
Use fread or mmap and count '\n' bytes, then add one more line when the non-empty file does not end with '\n'.
Sources Checked
- cppreference on std::istreambuf_iterator, including its single-pass input-iterator model: https://en.cppreference.com/w/cpp/iterator/istreambuf_iterator.html
- cppreference on std::basic_istream::seekg, including the note that it clears eofbit before seeking: https://en.cppreference.com/w/cpp/io/basic_istream/seekg
- cppreference on std::istream_iterator, whose formatted extraction behavior is the reason it is wrong for newline counting: https://en.cppreference.com/w/cpp/iterator/istream_iterator.html
- Stack Overflow discussion of why while(!eof()) miscounts file lines in C++: https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong
- Stack Overflow discussion showing that stream reuse after iterator-based counting requires repositioning the stream: https://stackoverflow.com/questions/8399276/after-counting-lines-in-a-text-file-with-istreambuf-iterators-why-cant-i-read
- Linux mmap(2) manual page for mapped-file behavior: https://man7.org/linux/man-pages/man2/mmap.2.html
- Linux madvise(2) manual page for MADV_SEQUENTIAL: https://man7.org/linux/man-pages/man2/madvise.2.html
- The left404.com benchmark that makes the block-I/O versus tiny-call I/O gap concrete: https://left404.com/2011/03/17/disk-io-in-c-avoid-fgetcfputc/
Related Guides and Tools
- C fread and mmap patterns
- Rust BufReader vs byte scanning
- Python file I/O
- Java file line counting
- Line Counter tool
Frequently Asked Questions
How do I count lines in C++?
The simplest correct answer is a std::ifstream plus while (std::getline(file, line)) loop. For large files where only the count matters, fread with a medium-sized buffer is usually the best cross-platform default.
Why is while(!eof()) wrong in C++?
Because eof() only becomes true after a read fails. That means the loop body runs one time too many and often overcounts the last line.
Why does getline read nothing after istreambuf_iterator?
The iterator has already consumed the stream to the end. Reset the stream before reusing it, and if a failed extraction already set failbit, call clear before seekg.
What is the fastest way to count lines in C++?
For portable code, fread plus byte scanning is the most practical high-performance default. On POSIX systems, mmap can be competitive or slightly faster, but you should measure it on your workload.
How do I count lines in a large C++ file without loading it all into memory?
Use fread or mmap and count newline bytes in chunks or directly in mapped memory, then fix the missing-final-newline case.
Should I use istream_iterator or istreambuf_iterator?
Use istreambuf_iterator for raw byte scanning. istream_iterator<char> performs formatted extraction and normally skips whitespace, which makes it unsuitable for counting '\n'.