Java Mailing List Archive

http://www.java2.5341.com/


Re: encoding

David Jashi

2008-09-29

Everything is fine; instead of UTF-8, Nutch is returning some kind of 16-bit representation.

It's OK. For some reason Nutch writes these characters as numeric character
references (decimal Unicode code points) instead of raw UTF-8. The text is
displayed normally either way.
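
As a rough illustration of why the page source and the rendered text differ: the sequences in the quoted message look like HTML numeric character references, i.e. decimal Unicode code points, which a browser turns back into the Georgian letters. The sketch below uses Python's standard library purely for illustration; it is not how Nutch itself produces its output, and the variable names are made up for the example.

```python
import html

# The references from the page source (the original post wrote them with
# "_" separators so that they would not be rendered).
escaped = "&#4318;&#4317;&#4314;"

# html.unescape replaces each numeric character reference with the
# character it names, which is what a browser does when rendering.
decoded = html.unescape(escaped)
print(decoded)

# Going the other way: encoding to ASCII with xmlcharrefreplace turns
# every non-ASCII character into a &#NNNN; reference.
re_escaped = decoded.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(re_escaped)

# The same text as raw UTF-8 bytes, for comparison.
utf8_bytes = decoded.encode("utf-8")
print(len(utf8_bytes))
```

Either form carries the same characters, so the rendered page looks identical; only the bytes in the page source differ.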

On Mon, Sep 29, 2008 at 1:04 PM, daut <misha-daut@(protected):
>
> hello,
> I've installed nutch-0.9 and run my first crawl. Then I ran a search on the
> search page. Everything seems OK: I can see all the result characters correctly
> (non-ASCII characters, Georgian language). But when I view the page source,
> instead of Georgian letters, for example პოლ, there are symbols like these:
> &_#_4_3_1_8;&_#_4_3_1_7;&_#_4_3_1_4; (without the "_" symbols :) ). Why does
> this happen? Is it normal?
> Best Rgds daut.
>
>
> --
> View this message in context: http://www.nabble.com/encoding-tp19720443p19720443.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>



--
with best regards,
David Jashi
Web development EO,
Caucasus Online
+995(32)970368
David@(protected)

Respectfully,
David Jashi
Director of Web Development
"Caucasus Online"
+995(32)970368
David@(protected)