
How to read tsv file?

 
BCC
Guest





Posted: Fri Jan 30, 2004 4:25 am    Post subject: How to read tsv file?



Hi,

I have a tab separated value table like this:
header1 header2 header3
13.455 55.3 A string
4.55 5.66 Another string

I want to load this guy into a vector of vectors, since I do not know how
long it may be. I think I have to have a vector of vectors of strings, and
then extract the doubles later(?):
std::vector<std::vector<std::string> > m_data_vec;

I started off with this skeletal function, but I'm not sure how to parse the
line for tabs and newlines, and stuff the elements into the vector. Is it
better to read in the whole line then parse it? Can I parse it on the fly?
How?

void MyClass::ReadTSV(const char* filename)
{
    using namespace std;

    ifstream infile(filename);
    if (!infile) {
        cout << "unable to load file" << endl;
        return;
    }

    // Now what?
}

Thanks,
Bryan


Back to top
Victor Bazarov
Guest





Posted: Fri Jan 30, 2004 5:36 am    Post subject: Re: How to read tsv file?



"BCC" <a@b.c> wrote...
Quote:
I have a tab separated value table like this:
header1 header2 header3
13.455 55.3 A string
4.55 5.66 Another string

I want to load this guy into a vector of vectors, since I do not know how
long it may be. I think I have to have a vector of vectors of strings, and
then extract the doubles later(?):
std::vector<std::vector<std::string> > m_data_vec;

I started off with this skeletal function, but I'm not sure how to parse the
line for tabs and newlines, and stuff the elements into the vector. Is it
better to read in the whole line then parse it?

Oh, so much better...

Quote:
Can I parse it on the fly?

I don't know. Can you?

Quote:
How?

void MyClass::ReadTSV(const char* filename)
{
using namespace std;

ifstream infile(filename);
if (!infile) {
cout << "unable to load file" << endl;
}

// Now what?

If you know how many fields to expect, you could use get( ... , '\t') N-1
times and then get( ... , '\n') and then again and again.

Easier still to get one by one character and watch for '\t' and '\n'. But
I would still do the "get the whole line and then parse it" thing.

Quote:
}

V



Back to top
Sharad Kala
Guest





Posted: Fri Jan 30, 2004 6:01 am    Post subject: Re: How to read tsv file?

Maybe this gives you the basic idea.

I haven't tested it, and it has no real error checking.

<UNTESTED CODE>

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void ReadTSV(const char* filename)
{
    ifstream infile(filename);
    if (!infile) {
        cout << "unable to load file" << endl;
        return;
    }
    string str;

    vector<vector<string> > vvStr;
    while (getline(infile, str))
    {
        vector<string> vStr;               // fresh row for every line
        string::size_type pos1 = 0, pos2;
        // Search from pos1 so that each tab is found only once.
        while ((pos2 = str.find('\t', pos1)) != string::npos)
        {
            // substr takes (start, length), not (start, end).
            vStr.push_back(str.substr(pos1, pos2 - pos1));
            pos1 = pos2 + 1;               // continue just past the tab
        }
        vStr.push_back(str.substr(pos1));  // last field on the line
        vvStr.push_back(vStr);
    }
}

</UNTESTED CODE>

Best wishes,
Sharad



Back to top
Sharad Kala
Guest





Posted: Fri Jan 30, 2004 6:19 am    Post subject: Re: How to read tsv file?


"Sharad Kala" <no.spam_sharadk_ind (AT) yahoo (DOT) com> wrote

Quote:
while((pos2 = str.find('\t'))!= string::npos)
{
vStr.push_back(str.substr(pos1, pos2));

oops.. the second parameter of substr is a length, not an end position, so it should be pos2-pos1 (with pos1 pointing just past the previous tab).



Back to top
Jonathan Turkanis
Guest





Posted: Fri Jan 30, 2004 6:26 am    Post subject: Re: How to read tsv file?

Here's some code I wrote some time ago for splitting sequences of
characters and adding them to lists. I have used it a lot with Visual
C++. I don't guarantee its portability or efficiency, but it looks
generally okay.

Usage:

struct is_tab {
    bool operator()(char c) const { return c == '\t'; }
};

// Split s using tab as a separator character,
// adding segments to the end of a vector.
string s;
vector<string> vec;
Utility::split(s.begin(), s.end(), back_inserter(vec), is_tab(), false);

Here you could use any input iterators for the first and second
arguments; in particular, you should be able to use istream_iterators
or istreambuf_iterators.

Jonathan


---------------------
//
// File name: split.h
//
// Description: Contains template functions for splitting a string into
// a list.
//
// Author: Jonathan Turkanis
//
// Copyright: Jonathan Turkanis, July 29, 2002. See Readme.txt for
// license information.
//

#ifndef UT_SPLIT_H_INCLUDED
#define UT_SPLIT_H_INCLUDED

#include <iterator>
#include <locale>
#include <string>
#include <boost/bind.hpp>
#include <boost/ref.hpp>

namespace Utility {

//
// Function name: split.
//
// Description: Splits the given string into components.
//
// Template parameters:
// InIt - An input iterator type with any value type Elem.
// OutIt - An output iterator type with value type equal to
// std::basic_string<Elem>.
// Pred - A predicate with argument type Elem.
// Parameters:
// first - The beginning of the input sequence.
// last - The end of the input sequence.
// dest - Receives the terms in the generated list.
// sep - Determines where to split the input sequence.
// coalesce - true if sequences of consecutive elements satisfying sep
// should be treated as one. Defaults to true.
//
template<class InIt, class OutIt, class Pred>
void split(InIt first, InIt last, OutIt dest, Pred sep, bool coalesce = true);

//
// Function name: split_by_whitespace.
//
// Description: Splits the given string into components.
//
// Template parameters:
// InIt - An input iterator type with any value type Elem.
// OutIt - An output iterator type with value type equal to
// std::basic_string<Elem>.
// Parameters:
// first - The beginning of the input sequence.
// last - The end of the input sequence.
// dest - Receives the terms in the generated list.
//
template<class InIt, class OutIt>
void split_by_whitespace(InIt first, InIt last, OutIt dest)
{
    using namespace std;
    // typename is required because value_type is a dependent name.
    typedef typename iterator_traits<InIt>::value_type char_type;
    locale loc;
    split(first, last, dest,
          boost::bind(isspace<char_type>, _1, boost::ref(loc)));
}

template<class InIt, class OutIt, class Pred>
void split(InIt first, InIt last, OutIt dest, Pred sep, bool coalesce)
{
    using namespace std;
    typedef typename iterator_traits<InIt>::value_type char_type;
    typedef basic_string<char_type> string_type;

    bool prev = true; // True if prev char was a separator.
    string_type term;
    while (first != last) {
        char_type c = *first++;
        bool is_sep = sep(c);
        // Emit the current term at a separator; when coalescing, only at
        // the first separator of a run.
        if (is_sep && (!coalesce || !prev)) {
            *dest++ = term;
            term.clear();
        }
        if (!is_sep)
            term += c;
        prev = is_sep;
    }
    // Emit the final term unless it is empty (non-coalescing) or the input
    // ended with a separator (coalescing).
    if ((!coalesce && !term.empty()) || (coalesce && !prev))
        *dest++ = term;
}

} // namespace Utility

#endif // #ifndef UT_SPLIT_H_INCLUDED



Back to top
Jon Bell
Guest





Posted: Fri Jan 30, 2004 7:40 am    Post subject: Re: How to read tsv file?

In article <p1lSb.7633$uM.3791 (AT) newssvr29 (DOT) news.prodigy.com>, BCC <a@b.c> wrote:
Quote:
Hi,

I have a tab separated value table like this:
header1 header2 header3
13.455 55.3 A string
4.55 5.66 Another string

I want to load this guy into a vector of vectors,

Use getline() to read one line at a time, then use a stringstream to split
the line into tokens. Note you can specify some other line terminator
than '\n' for getline().

std::vector<std::vector<std::string> > m_data_vec;
std::string line;
while (std::getline (infile, line))
{
    std::istringstream linestream (line);
    std::string token;
    std::vector<std::string> row;
    while (std::getline (linestream, token, '\t'))
    {
        row.push_back (token);
    }
    m_data_vec.push_back (row);
}
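
The original question also asked about extracting the doubles later. Once the rows are stored as strings, one possible way is sketched below. `ParseDouble` and `ColumnAsDoubles` are hypothetical helper names, and the sketch assumes the first row is a header and that rows with a missing or non-numeric field should simply be skipped.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Converts one text field to a double. Returns false if the field is empty
// or contains characters that are not part of the number.
bool ParseDouble(const std::string& field, double& out)
{
    if (field.empty())
        return false;
    char* end = 0;
    out = std::strtod(field.c_str(), &end);
    return end == field.c_str() + field.size();
}

// Extracts one column as doubles, skipping the header row (row 0) and any
// row where the column is missing or not numeric.
std::vector<double> ColumnAsDoubles(
    const std::vector<std::vector<std::string> >& rows, std::size_t column)
{
    std::vector<double> result;
    for (std::size_t i = 1; i < rows.size(); ++i) {
        double value;
        if (column < rows[i].size() && ParseDouble(rows[i][column], value))
            result.push_back(value);
    }
    return result;
}
```

With the table from the question, `ColumnAsDoubles(m_data_vec, 0)` would give the first numeric column. Whether bad rows should be skipped or reported is a design choice this sketch does not settle.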

Actually, your example is easy to parse without a stringstream, if you
use a struct to represent a line, with appropriate member data types:

struct data_rec
{
double foo, bar;
std::string baz;
};

std::vector<data_rec> m_data_vec;
data_rec linedata;
while ((infile >> linedata.foo >> linedata.bar)
&& std::getline (infile, linedata.baz))
{
m_data_vec.push_back (linedata);
}

--
Jon Bell <jtbellm4h (AT) presby (DOT) edu> Presbyterian College
Dept. of Physics and Computer Science Clinton, South Carolina USA

Back to top
Chris Theis
Guest





Posted: Fri Jan 30, 2004 8:57 am    Post subject: Re: How to read tsv file?

There is an even easier way to obtain the vStr vector using stringstreams:


template <class T>
std::vector<T> StringToVector( const std::string& Str )
{
std::istringstream iss( Str );
return std::vector<T>( std::istream_iterator<T>(iss),
std::istream_iterator<T>() );
}

[OT]
Using VC++ 6.0 this solution has to be altered a little bit using copy and a
back_inserter 'cause the appropriate ctor of vector is not yet available in
that compiler version.
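
The copy/back_inserter variant mentioned for older compilers might look like the sketch below. `StringToVectorCopy` is a made-up name, and the code has not been checked against VC++ 6.0 itself.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Same idea as StringToVector, but fills the vector with std::copy
// instead of relying on vector's iterator-range constructor.
template <class T>
std::vector<T> StringToVectorCopy(const std::string& str)
{
    std::istringstream iss(str);
    std::vector<T> result;
    std::copy(std::istream_iterator<T>(iss), std::istream_iterator<T>(),
              std::back_inserter(result));
    return result;
}
```

For example, `StringToVectorCopy<double>(line)` would read every whitespace-separated number on a line.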

Regards
Chris



Back to top
Sharad Kala
Guest





Posted: Fri Jan 30, 2004 10:50 am    Post subject: Re: How to read tsv file?


"Chris Theis" <Christian.Theis (AT) nospam (DOT) cern.ch> wrote

Quote:
There is even an easier way to obtain the vStr vector using stringstreams:


template <class T>
std::vector<T> StringToVector( const std::string& Str )
{
std::istringstream iss( Str );
return std::vector<T>( std::istream_iterator<T>(iss),
std::istream_iterator<T>() );
}

How do you take care of the '\t' in the string?




Back to top
Chris Theis
Guest





Posted: Fri Jan 30, 2004 5:49 pm    Post subject: Re: How to read tsv file?


"Sharad Kala" <no.spam_sharadk_ind (AT) yahoo (DOT) com> wrote

[SNIP]
Quote:
There is even an easier way to obtain the vStr vector using
stringstreams:


template <class T>
std::vector<T> StringToVector( const std::string& Str )
{
std::istringstream iss( Str );
return std::vector<T>( std::istream_iterator<T>(iss),
std::istream_iterator<T>() );
}

How do you take care of the '\t' in the string?


This should be done by the istream_iterators (at least in the Dinkumware
implementation used under VC++). However, I have not yet tried it under another
compiler like g++.

Cheers
Chris



Back to top
David Harmon
Guest





Posted: Fri Jan 30, 2004 6:07 pm    Post subject: Re: How to read tsv file?

On Fri, 30 Jan 2004 16:20:30 +0530 in comp.lang.c++, "Sharad Kala"
<no.spam_sharadk_ind (AT) yahoo (DOT) com> was alleged to have written:
Quote:
template <class T>
std::vector<T> StringToVector( const std::string& Str )
{
std::istringstream iss( Str );
return std::vector<T>( std::istream_iterator<T>(iss),
std::istream_iterator<T>() );
}

How do you take care of the '\t' in the string?

istream_iterator<T> uses T's operator>> which in turn recognizes any
kind of whitespace as a delimiter.
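
A practical consequence for the data in this thread is that a text field such as "A string" gets split at the space as well. The sketch below contrasts the two behaviours. `WhitespaceTokens` and `TabFields` are made-up names for illustration.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Splits on any whitespace, as istream_iterator<std::string> does.
std::vector<std::string> WhitespaceTokens(const std::string& line)
{
    std::istringstream iss(line);
    return std::vector<std::string>(std::istream_iterator<std::string>(iss),
                                    std::istream_iterator<std::string>());
}

// Splits on tabs only, so spaces inside a field are preserved.
std::vector<std::string> TabFields(const std::string& line)
{
    std::istringstream iss(line);
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(iss, field, '\t'))
        fields.push_back(field);
    return fields;
}
```

For the line "13.455\t55.3\tA string", the first function yields four tokens and the second yields three fields, so the istream_iterator approach fits only when no field contains spaces.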

