Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Version: 0.9.0-incubating
Environment:
Debian squeeze Linux 2.6.32-5-amd64 SMP x86_64
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Description
I'm trying to implement a tag suggestion feature in a document editing application. I'm using the Stanbol Enhancer to get EntityAnnotations for a piece of HTML.
This works well most of the time, but sometimes no results are returned. The text that yields results and the text that yields none sometimes differ by only a single character.
I was able to reduce one case down to an additional .
With the following text, the enhancer returns an EntityAnnotation for Syria, but not for CNN:
So, where does the Syria conflict stand now? CNN
With the following text, the enhancer returns EntityAnnotations for both Syria and CNN:
So, where does the Syria conflict stand now? CNN
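When two inputs look identical on screen but produce different results, it can help to list the exact characters in which they differ. The following is a minimal sketch of such a comparison; the helper names `char_names` and `describe_differences` are hypothetical, and the sample strings in the usage note are illustrative placeholders, not the actual contents of the test files.

```python
import difflib
import unicodedata


def char_names(s):
    """Return the Unicode name of each character, or its code point if unnamed."""
    return [unicodedata.name(c, f"U+{ord(c):04X}") for c in s]


def describe_differences(a, b):
    """List the non-equal regions between two strings.

    Each entry is (operation, names of characters in a, names of characters in b),
    so invisible characters show up by name.
    """
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, char_names(a[i1:i2]), char_names(b[j1:j2])))
    return diffs
```

For example, `describe_differences("a b", "a\u00a0b")` reports a single replaced character, named on each side, even though both strings render the same way in most terminals.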
I post the text with the following command (where @test refers to the file that contains the text):
curl -v -X POST -H "Accept: application/json" -H "Content-Type: text/html;charset=utf-8" --data-binary @test "http://localhost:8086/enhancer"
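For use from application code, the same request can be sketched with Python's standard library. This is only an illustration of the curl call above: the URL and headers are copied from that command, while the function names `build_enhancer_request` and `post_to_enhancer` are hypothetical.

```python
import urllib.request

# URL taken from the curl command above; adjust to the local setup.
ENHANCER_URL = "http://localhost:8086/enhancer"


def build_enhancer_request(html_bytes, url=ENHANCER_URL):
    """Build a POST request mirroring the curl invocation (same headers, raw body)."""
    return urllib.request.Request(
        url,
        data=html_bytes,
        headers={
            "Accept": "application/json",
            "Content-Type": "text/html;charset=utf-8",
        },
        method="POST",
    )


def post_to_enhancer(html_bytes, url=ENHANCER_URL):
    """Send the request and return the response body as text.

    Requires a running server at `url`.
    """
    with urllib.request.urlopen(build_enhancer_request(html_bytes, url)) as resp:
        return resp.read().decode("utf-8")
```

Reading the file in binary mode, for example `open("test", "rb").read()`, keeps the bytes unchanged, which matches the behaviour of `--data-binary`.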
I checked out Stanbol from svn at this revision:
$ svnversion .
1337074
and started it with the following command line:
java -Xmx1g -jar launchers/full/target/org.apache.stanbol.launchers.full-0.10.0-incubating-SNAPSHOT.jar -p 8086
I will try to work around this problem by simply converting everything to plain text.
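A minimal sketch of that workaround, using only the Python standard library, might look like the following. It assumes that dropping the markup, decoding character references, and collapsing all Unicode whitespace to single spaces is acceptable for the application; the names `_TextExtractor` and `html_to_plain_text` are hypothetical, and nothing here is confirmed as the cause of or fix for the behaviour described above.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and ignore tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes character references before handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Note: this also collects the contents of <script> and <style> elements.
        self.parts.append(data)


def html_to_plain_text(html):
    """Strip markup and normalize whitespace.

    Text nodes are joined without a separator, so words in adjacent block
    elements can run together if the source has no whitespace between them.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # str.split() with no argument splits on any Unicode whitespace,
    # including non-breaking spaces, so the result uses plain spaces only.
    return " ".join(text.split())
```

The resulting string would then presumably be posted with `Content-Type: text/plain` instead of `text/html`.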