C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Reading unicode text files

 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ language (comp.lang.c++)
View previous topic :: View next topic  
Author Message
Wx
Guest





PostPosted: Tue May 22, 2007 9:10 am    Post subject: Reading unicode text files Reply with quote



Hello.

I'm trying to read a textfile written by the NTBackup utility on
Windows 2003 SBS. The problem is that when i print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space prior to any charater. I know that
unicode characters uses two bytes, so... can be the problem related to
different charset?

If I try to read a new textfile, there are no problem.

This is the relevant portion of the code:


try {
ifstream infile(strLogFile.c_str());

if (infile.is_open()) {
string line;
while (getline(infile, line)) {
cout << line << endl;
}

infile.close();

} else {
cerr << "Impossibile aprire " << strLogFile << endl;
return false;
}

Excuse me for my english.
Thanks

Wx
Back to top
Alf P. Steinbach
Guest





PostPosted: Tue May 22, 2007 9:11 am    Post subject: Re: Reading unicode text files Reply with quote



* Wx:
Quote:

I'm trying to read a textfile written by the NTBackup utility on
Windows 2003 SBS. The problem is that when i print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space prior to any charater. I know that
unicode characters uses two bytes, so... can be the problem related to
different charset?

Yes. The "spaces" are, at least before they end up in your program,
zero bytes.


Quote:
If I try to read a new textfile, there are no problem.

This is the relevant portion of the code:


try {
ifstream infile(strLogFile.c_str());

Well, it doesn't help you to use a wide character stream, because they
simply convert to/from external narrow character data.

What you can do is open the file in binary mode.

Then read the contents as binary data and treat as a sequence of wchar_t
values (e.g., you can just store them in a std::wstring).

Essentially this means implementing the machinery that the standard
library provides for narrow character streams. Or, you can buy an
existing implementation or find one on the net (I doubt you'll find
one). I think Dinkumware offers such an implementation.

Note that handling wchar_t in Windows leads you into compiler-specific
territory, since e.g. MingW g++ 3.4.4 doesn't support wide character
streams.


--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ language (comp.lang.c++) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.