Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.2.0, 1.2.1, 1.3.1
- None
- Ubuntu 10.04
Description
Greek text extraction error
I have a Greek PDF, but the extracted text is wrong:
a) After extraction, the Greek letter π comes out as the two Latin letters "pi".
For example:
Original text in the PDF:
"φυσικών προσώπων"
Extracted text:
"φυσικών piροσώpiων"
b) The Greek letter μ (U+03BC) is extracted as µ (U+00B5, the micro sign).
The two look the same on screen, but they are encoded differently, so a search for a typed μ does not find the extracted character (only the uppercase Μ is found).
If you copy the µ as displayed and search for that, the search works fine.
E.g. the word is displayed as "κλίµακας", but it does not match the typed word κλίμακα because of the letter μ.
Because of this problem, Solr is not indexing the documents correctly.
Is there any configuration I can make?
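To make problem (b) concrete, here is a minimal sketch in Python (not part of the extraction tool itself) that shows the two look-alike characters are different code points and that Unicode NFKC normalization maps the micro sign back to the Greek mu. The helper names `describe` and `normalize_extracted` are hypothetical, and running normalization as a post-processing step before indexing is an assumption about where a workaround could sit, not a confirmed fix. It does nothing for problem (a), where π has already been replaced by the Latin letters "pi".

```python
import unicodedata

MICRO_SIGN = "\u00b5"  # the character that appears in the extracted text
GREEK_MU = "\u03bc"    # the character produced by a Greek keyboard


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def normalize_extracted(text: str) -> str:
    """Hypothetical post-processing step: apply NFKC normalization.

    NFKC replaces compatibility characters with their preferred
    equivalents, which turns U+00B5 into U+03BC.
    """
    return unicodedata.normalize("NFKC", text)


# The word from the report, written with escapes so the difference is visible.
extracted = "\u03ba\u03bb\u03af\u00b5\u03b1\u03ba\u03b1\u03c2"  # contains MICRO SIGN
typed = "\u03ba\u03bb\u03af\u03bc\u03b1\u03ba\u03b1"            # contains GREEK MU

print(describe(MICRO_SIGN))
print(describe(GREEK_MU))
print(typed in extracted)                       # search on the raw extracted text
print(typed in normalize_extracted(extracted))  # search after normalization
```

Note that NFKC also rewrites other compatibility characters (ligatures, full-width forms, superscripts), so it is worth checking that this is acceptable for the indexed content. For searches to match, the same normalization would have to be applied to the query text as well.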