encoding of text files
Jorge Almeida
jalmeida
Tue May 24 16:20:07 PDT 2005
On Tue, 24 May 2005, Michael Hipp wrote:
> Jorge Almeida wrote:
>> I only need files in Portuguese. The problem is that they come in latin1
>> or UTF8...
>> It's all about html files. I want to write accented characters and then
>> filter the file through a script in order to replace them by html code.
>> The script uses Perl substitution operator. Example:
>> with "s/ç/\&ccedil;/g;" in the script,
>> "poço" in a latin1 encoded file will be substituted by "po&ccedil;o"
>> But if the file is UTF8 I get "poÃ?o". Nasty!
>> Using "recode UTF8..latin1 file" would solve the problem, but one has to
>> know that it's UTF8 encoded...
>
> Does the 'file' command do what you want?
>
"file -k myfile" yields "HTML document text\012- exported SGML document text"
I tried with a UTF8 file and with a latin1 file, and the outcome is the
same.
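Since 'file' does not tell the two apart here, one possible workaround is to
test whether the bytes are valid UTF8 and fall back to latin1 otherwise. This
is only a heuristic sketch, assuming the files are always one of those two
encodings. It works because an accented latin1 byte is almost never followed
by the continuation bytes that UTF8 requires. It is written in Python rather
than Perl, and the function names guess_decode and to_entities are made up for
illustration.

```python
from html.entities import codepoint2name


def guess_decode(raw):
    """Return (text, encoding), assuming raw is either UTF-8 or latin1."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1"), "latin-1"


def to_entities(text):
    """Replace non-ASCII characters by HTML entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")  # named, e.g. &ccedil;
        else:
            out.append("&#%d;" % cp)  # numeric fallback
    return "".join(out)


if __name__ == "__main__":
    for raw in (b"po\xe7o", b"po\xc3\xa7o"):  # latin1 bytes, then UTF-8 bytes
        text, enc = guess_decode(raw)
        print(enc, to_entities(text))
```

A pure ASCII file is reported as UTF8 by this check, which is harmless since
it contains nothing to replace.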
Jorge