fgets() vs std::getline() performance

crhras
Posted: Fri Sep 15, 2006 9:01 am    Post subject: fgets() vs std::getline() performance



Wow! I just used two different file I/O methods and the performance
difference was huge. Is there something that I am doing wrong, or is
fgets() just that much faster than getline()?

Here's the code I used:

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
char cline[512];   // fgets() needs a char buffer, not a std::string
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(cline, 512, fp) != NULL)
{
}


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Earl Purple
Posted: Fri Sep 15, 2006 7:02 pm    Post subject: Re: fgets() vs std::getline() performance



crhras wrote:

Quote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

If you performed the fstream test first for a big file then immediately
did the FILE * test on the same machine with the same file, it is
likely that the O/S had cached something because it was a recently read
file. That might explain the difference in performance.


Pete Becker
Posted: Fri Sep 15, 2006 7:13 pm    Post subject: Re: fgets() vs std::getline() performance



crhras wrote:
Quote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?


It's easy to implement getline badly. Making it faster requires some
thought, but there's no good reason for getline to be significantly
slower than fgets. It should be within 10-20 percent.

--

-- Pete

Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." For more information about this book, see
www.petebecker.com/tr1book.

alex
Posted: Fri Sep 15, 2006 11:05 pm    Post subject: Re: fgets() vs std::getline() performance

crhras wrote:
Quote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

The code blocks above are not completely equivalent. You can re-test
with this block instead:

std::ifstream in(filename.c_str());
char line[512];
while (in.getline(line, 512))
{
}
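
One caveat with the in.getline(line, 512) loop: as far as I know, istream::getline sets failbit when the buffer fills before a '\n' is found, so that loop stops at the first line of 511 characters or more, whereas fgets simply hands back the rest of the line on the next call. A minimal sketch of a loop that tolerates long lines follows. The helper name count_lines_fixed_buffer and its caller-supplied buffer are assumptions made for illustration, not part of the original test.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts lines read through istream::getline into a
// caller-supplied buffer, without stopping at lines longer than the buffer.
std::size_t count_lines_fixed_buffer(std::istream& in, char* buf, std::streamsize size)
{
    assert(buf != nullptr && size > 1);
    std::size_t lines = 0;
    for (;;) {
        in.getline(buf, size);
        if (in) {
            ++lines;      // reached '\n', or read the last line of the input
        } else if (in.rdstate() == std::ios_base::failbit && in.gcount() == size - 1) {
            in.clear();   // buffer filled mid-line: keep reading the same line
        } else {
            break;        // end of input, or a real stream error
        }
    }
    return lines;
}
```

Called as count_lines_fixed_buffer(in, line, sizeof line) with the char line[512] buffer, this should do work comparable to the fgets loop even when some records exceed the buffer.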

BTW, specifying precise test conditions may somewhat clarify your point.


Alex


crhras
Posted: Sat Sep 16, 2006 4:48 pm    Post subject: Re: fgets() vs std::getline() performance

{ Quoted clc++m banner removed. -mod }

"Thomas Tutone" <Thomas8675309 (AT) yahoo (DOT) com> wrote in message
news:1158296928.437104.95290 (AT) m73g2000cwd (DOT) googlegroups.com...
Quote:
crhras wrote:

Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------


insert the following line here, then rerun your timing test:

std::ios::sync_with_stdio(false);

std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

Please report back on the result after making that change.

Best regards,

Tom

Thank you for your suggestion.
I inserted the line you suggested and it cut around 25% off of the test
time, but fgets() is still considerably faster than getline().

Here's the test program which I used :
------------------------------------------
time_t start, end;
double dif;

std::string line;
std::ifstream in("c:\\Data\\IVData.csv");

std::ios::sync_with_stdio(false);

time(&start);
while (std::getline(in, line,'\n'))
{
}
time(&end);
in.close();

dif = difftime (end,start);
cxMemo1->Lines->Add((AnsiString)"Test 1 has taken " + dif + " seconds.");

Result :
------------------
Test 1 has taken 159 seconds. (getline)
Test 2 has taken 4 seconds. (fgets)


crhras
Posted: Sat Sep 16, 2006 4:49 pm    Post subject: Re: fgets() vs std::getline() performance

{ Quoted clc++m banner removed. -mod }

"Earl Purple" <earlpurple (AT) gmail (DOT) com> wrote in message
news:1158307576.886768.308030 (AT) d34g2000cwd (DOT) googlegroups.com...
Quote:

crhras wrote:

Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

If you performed the fstream test first for a big file then immediately
did the FILE * test on the same machine with the same file, it is
likely that the O/S had cached something because it was a recently read
file. That might explain the difference in performance.

This is what I first thought but I ran the test twice to make sure that
caching wasn't a consideration.


crhras
Posted: Sat Sep 16, 2006 4:50 pm    Post subject: Re: fgets() vs std::getline() performance

{ Quoted clc++m banner removed. -mod }

"alex" <alex.shulgin (AT) gmail (DOT) com> wrote in message
news:1158333704.094313.148570 (AT) d34g2000cwd (DOT) googlegroups.com...
Quote:
crhras wrote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

The code blocks above are not completely equivalent. You can re-test
with this block instead:

std::ifstream in(filename.c_str());
char line[512];
while (in.getline(line, 512))
{
}

BTW, specifying precise test conditions may somewhat clarify you point.


Alex

Thanks for your response.
I tested your suggestion and it seemed to take about 25% less time. Based
on the results it seems to do the same as adding the line
std::ios::sync_with_stdio(false); which was a suggestion in another post.

crhras
Posted: Sat Sep 16, 2006 4:55 pm    Post subject: Re: fgets() vs std::getline() performance

{ Quoted clc++m banner removed. Wrapped lines fixed up. Because I had
the time. ;-) -mod }

"crhras" <crhras (AT) sbcglobal (DOT) net> wrote in message
news:TkoOg.1329$e66.993 (AT) newssvr13 (DOT) news.prodigy.com...
Quote:


Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

Thank you for the responses. I went back to the drawing board using
your suggestions. Some of the results are in this post and others will
be posted under the specific newsgroup response which I was testing.

First off, the data file used contains 3.5 million text records of
varying lengths terminated by '\n'.

I reran the tests with some timers to tell exactly how long each case is
taking. I ran each test twice in sequence to make sure that caching was
not responsible for the time difference.

-----------------------------------
// Test 1
while (std::getline(in, line,'\n')) { }
-----------------------------------
// Test 2
while (fgets(cline, 512, fp) != NULL) { }

Results :
Test 1 has taken 203 seconds.
Test 2 has taken 5 seconds.
Test 1 has taken 201 seconds.
Test 2 has taken 4 seconds.
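
For anyone wanting to repeat the measurement, here is a hedged sketch of a small harness. It is not the program used for the numbers above: the function names are invented for illustration, and it uses std::chrono::steady_clock rather than time(), since time() typically only resolves whole seconds. Both loops count lines so that the results can be checked against each other before the timings are trusted.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: line count via std::getline into a std::string.
std::size_t count_lines_getline(const std::string& path)
{
    std::ifstream in(path.c_str());
    std::string line;
    std::size_t lines = 0;
    while (std::getline(in, line))
        ++lines;
    return lines;
}

// Hypothetical helper: line count via fgets into a 512-byte buffer.
// A long line arrives in several pieces, so only pieces ending in '\n' count.
std::size_t count_lines_fgets(const std::string& path)
{
    std::FILE* fp = std::fopen(path.c_str(), "r");
    if (!fp)
        return 0;
    char buf[512];
    std::size_t lines = 0;
    bool mid_line = false;
    while (std::fgets(buf, sizeof buf, fp) != nullptr) {
        const std::size_t len = std::strlen(buf);
        mid_line = (len == 0 || buf[len - 1] != '\n');
        if (!mid_line)
            ++lines;
    }
    if (mid_line)
        ++lines;          // last line had no trailing '\n'
    std::fclose(fp);
    return lines;
}

// Wall-clock time of one call to f, in milliseconds.
template <class F>
double elapsed_ms(F&& f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Times both readers on the same file. The counts must agree, otherwise
// the two loops are not doing comparable work.
void compare_readers(const std::string& path)
{
    std::size_t n1 = 0, n2 = 0;
    const double t1 = elapsed_ms([&] { n1 = count_lines_getline(path); });
    const double t2 = elapsed_ms([&] { n2 = count_lines_fgets(path); });
    assert(n1 == n2);
    std::printf("getline: %zu lines in %.1f ms\n", n1, t1);
    std::printf("fgets:   %zu lines in %.1f ms\n", n2, t2);
}
```

Calling compare_readers twice in a row on the data file gives a rough check on caching effects, in the same spirit as running each test twice.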

ShaoVie@gmail.com
Posted: Sat Sep 16, 2006 5:00 pm    Post subject: Re: fgets() vs std::getline() performance

$ yes dddd > cui
$ ls -l cui
-rw-rw-r-- 1 ace ace 130027520 Sep 16 16:53 cui

My code (test.cpp):
------------------------------------------------------
#include <fstream>
#include <cstdio>

using namespace std;

int main()
{
#ifndef FILE_C
    // Default build: istream::getline into a fixed buffer.
    ifstream in("cui", ios::in);
    char ch[80];
    while (in.getline(ch, 80))
        ;
    in.close();
#else
    // Built with -DFILE_C: fgets into a buffer of the same size.
    FILE *fp = fopen("cui", "r");
    char ch[80];
    while (fgets(ch, 80, fp))
        ;
    fclose(fp);
#endif
}
------------------------------------------------------
istream::getline version:
$ g++ -o test test.cpp
$ time ./test
real 0m3.782s
user 0m3.650s
sys 0m0.140s
------------------------------------------------------
fgets version:
$ g++ -DFILE_C -o test test.cpp
$ time ./test
.....
real 0m4.539s
user 0m4.420s
sys 0m0.120s
------------------------------------------------------


Seungbeom Kim
Posted: Sat Sep 16, 2006 5:01 pm    Post subject: Re: fgets() vs std::getline() performance

crhras wrote:
Quote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

Strange. What environment did you run the tests on?
On Linux-2.6.8, gcc-3.3.5 and libstdc++5-3.3-dev, the fgets version gives

real 0m0.020s
user 0m0.016s
sys 0m0.005s

while the getline version gives

real 0m0.006s
user 0m0.004s
sys 0m0.002s

on a 3.6 MB test file. This may not always be true, but can
at least indicate that getline is not always slower than fgets.

In addition, adding std::ios_base::sync_with_stdio(false) at the
beginning did not alter the result significantly.

--
Seungbeom Kim

Jeff Koftinoff
Posted: Sat Sep 16, 2006 10:47 pm    Post subject: Re: fgets() vs std::getline() performance

Ulrich Eckhardt wrote:
Quote:
crhras wrote:
snip
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

These two snippets are not comparable:
1. std::getline() dynamically resizes the string to fit the input.
snip


This is an important feature... However, it can be a big problem that
std::getline() cannot be told what maximum length to allow.

In fact, on most platforms, feeding it a 2 gigabyte file with no '\n'
in it will cause the program to crash (with no exception thrown),
precisely because the string is resized dynamically to fit the input
with no upper bound.

Depending on where this code is used, it can be the basis of a security
hole - and in that sense fgets() could be a better solution!
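
One way to keep std::string while still bounding the line length is a small wrapper that stops storing characters past a limit. The sketch below is only an illustration: bounded_getline, its max_len parameter and its truncated flag are hypothetical names, and discarding the excess is just one possible policy (treating an overlong line as an error would be another). It reads one character at a time, so it is not meant to be fast.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one '\n'-terminated line into `line`, keeping at
// most max_len characters and discarding the rest of an overlong line.
// Returns false only when no characters at all could be read.
bool bounded_getline(std::istream& in, std::string& line,
                     std::size_t max_len, bool& truncated)
{
    assert(max_len > 0);
    line.clear();
    truncated = false;

    char c;
    bool got_any = false;
    while (in.get(c)) {
        got_any = true;
        if (c == '\n')
            return true;             // complete line, delimiter dropped
        if (line.size() < max_len)
            line.push_back(c);       // still within the limit
        else
            truncated = true;        // over the limit: skip this character
    }
    return got_any;                  // last line without '\n', or nothing read
}
```

Memory use is then bounded by max_len no matter what the input file contains.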

Jeff Koftinoff
www.jdkoftinoff.com


shakahshakah@gmail.com
Posted: Sat Sep 16, 2006 10:51 pm    Post subject: Re: fgets() vs std::getline() performance

{ Quoted clc++m banner removed. Yes, it's nice that the banner is so
popular. But could people please stop quoting it all the time? -mod }

Seungbeom Kim wrote:
Quote:
crhras wrote:
Wow ! I just used two different file IO methods and the performance
difference was huge. Is there something that I am doing wrong? or is
fgets() just that much faster than getline()?

Here's the code I used :

// SLOOOOOOW
// -----------------
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

Strange. What environment did you run the tests on?
On Linux-2.6.8, gcc-3.3.5 and libstdc++5-3.3-dev, the fgets version gives

real 0m0.020s
user 0m0.016s
sys 0m0.005s

while the getline version gives

real 0m0.006s
user 0m0.004s
sys 0m0.002s

on a 3.6 MB test file. This may not always be true, but can
at least indicate that getline is not always slower than fgets.

In addition, adding std::ios_base::sync_with_stdio(false) at the
beginning did not alter the result significantly.

FWIW, I get the following results consistently on Linux (fgets about
40% faster than getline):

jc@re1-dev:/tmp$ uname -a
Linux re1-dev 2.6.17-1.2145_FC5 #1 SMP Sat Jul 1 13:05:01 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

jc@re1-dev:/tmp$ g++ --version
g++ (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

jc@re1-dev:/tmp$ l /tmp/biggestfile
-rw-r--r-- 1 jc ccbrk 831993344 Sep 16 14:34 /tmp/biggestfile

jc@re1-dev:/tmp$ time ./tt /tmp/biggestfile fgets
file: /tmp/biggestfile
mode: fgets

7999936 lines read in 1559 msec

real 0m1.560s
user 0m0.984s
sys 0m0.576s
jc@re1-dev:/tmp$ time ./tt /tmp/biggestfile getline
file: /tmp/biggestfile
mode: getline

7999936 lines read in 2518 msec

real 0m2.520s
user 0m2.004s
sys 0m0.516s


crhras
Posted: Sun Sep 17, 2006 2:04 am    Post subject: Re: fgets() vs std::getline() performance

Quote:
Wow ! I just used two different file IO methods and the performance
difference was huge.

I should have mentioned that I'm using Borland Studio 2006 on Windows XP
Pro. At this point, I think it might be caused by a flaw in the way
getline( ) is implemented by Borland. I am going to post this question
at borland.public.cppbuilder.language.cpp and if I discover anything
there I'll post it here.

Thanks again for everyone's responses.


Carlos Moreno
Posted: Sun Sep 17, 2006 2:05 am    Post subject: Re: fgets() vs std::getline() performance

Jeff Koftinoff wrote:

Quote:
In actual fact on most platforms, giving a 2 Gigabyte file (with no
'\n' in it) will cause the program to crash (with no exception thrown),
precisely because of this feature of dynamically resizing the string to
fit the input with no upper bounds.

Depending where this code is used, it can be the basis of a security
hole - and in that sense fgets() could be a better solution!

Nice!! Don't remember the last time I saw an irony that compares to
this one!!

In fact, it could first cause the entire machine to almost-irreversibly
collapse, then crash the program (memory exhausted, intensive swap
memory use, etc.)

Carlos
--

Wu Yongwei
Posted: Sun Sep 17, 2006 3:53 pm    Post subject: Re: fgets() vs std::getline() performance

Jeff Koftinoff wrote:
Quote:
Ulrich Eckhardt wrote:
crhras wrote:
snip
std::string line;
std::ifstream in(filename.c_str());

while (std::getline(in, line,'\n'))
{
}

// FAAAAAASSSSSTTTTT
// ---------------------------
FILE * fp;
fp = fopen(filename.c_str(), "r");

while (fgets(line, 512, fp) != NULL)
{
}

These two snippets are not comparable:
1. std::getline() dynamically resizes the string to fit the input.
snip

This is an important feature... However it can be a big problem that
std::getline() function can not be told what maximum length to allow.

In actual fact on most platforms, giving a 2 Gigabyte file (with no
'\n' in it) will cause the program to crash (with no exception thrown),
precisely because of this feature of dynamically resizing the string to
fit the input with no upper bounds.

Why do you say `no exception thrown'? I would expect a std::bad_alloc,
and, when it is not caught, an abort().

Quote:

Depending where this code is used, it can be the basis of a security
hole - and in that sense fgets() could be a better solution!

Yes, this is a problem, but I do not see it as a security hole. Any
program can run out of memory on some kind of input, including Firefox
and Internet Explorer. I often see Firefox occupy more than 500 MB of
memory, which makes me feel it necessary to restart it after viewing a
big page. I am sure it is possible to craft a page big enough to crash
either of them.

When this problem could be a real issue, there are ways to work around
it, for example custom allocators. The point is that C++ does not force
an arbitrary limit on how long a line can be, and the system limitation
can be put somewhere other than the processing logic.
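
A rough sketch of the custom-allocator idea, under stated assumptions: CappedAllocator, kMaxBytes, CappedString and read_capped_line are all hypothetical names, and the 64 KB cap is arbitrary. As I read the standard's rules for unformatted input functions, an exception thrown while std::getline is reading is normally turned into badbit on the stream and rethrown only if in.exceptions() includes badbit, so the helper treats both outcomes as failure.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <new>
#include <sstream>
#include <string>

// Arbitrary cap, chosen only for illustration.
constexpr std::size_t kMaxBytes = 64 * 1024;

// Hypothetical allocator that refuses any single request above kMaxBytes.
template <class T>
struct CappedAllocator {
    using value_type = T;

    CappedAllocator() = default;
    template <class U>
    CappedAllocator(const CappedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        if (n > kMaxBytes / sizeof(T))
            throw std::bad_alloc();          // the limit lives here, not in the parsing code
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CappedAllocator<T>&, const CappedAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CappedAllocator<T>&, const CappedAllocator<U>&) noexcept { return false; }

using CappedString =
    std::basic_string<char, std::char_traits<char>, CappedAllocator<char>>;

// Returns true if a line was read and fit within the cap.
bool read_capped_line(std::istream& in, CappedString& line)
{
    try {
        const bool ok = static_cast<bool>(std::getline(in, line));
        assert(!ok || line.size() <= kMaxBytes);   // a successful read cannot exceed the cap
        return ok;                                 // false also covers badbit after a failed allocation
    } catch (const std::bad_alloc&) {
        return false;                              // reached only if in.exceptions() includes badbit
    }
}
```

The reading loop stays an ordinary std::getline loop; only the string type changes.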

Quote:

Jeff Koftinoff
www.jdkoftinoff.com

Best regards,

Yongwei

--
Wu Yongwei
URL: http://wyw.dcweb.cn/


All times are GMT