Details
Improvement
Status: Open
Major
Resolution: Unresolved
1.6
None
Description
Right now, when I use the following code, I get the stack trace shown under stacktrace.log below. This seems to be because the Request URI is too large for the service request: the server responds with HTTP 414 (Request-URI Too Long). We need a mechanism within the call to Tika.translate which will, on a service-by-service basis, determine the maximum Request URI size that can be sent. I believe this should be handled on the Tika side, as how else am I meant to know the maximum request size?
translator.java
+    Translator translate = new MicrosoftTranslator();
+    ((MicrosoftTranslator) translate).setId("...");
+    ((MicrosoftTranslator) translate).setSecret("...");
     for (java.util.Map.Entry<Text, Parse> entry : parseResult) {
       Parse parse = entry.getValue();
       LOG.info("---------\nUrl\n---------------\n");
@@ -201,7 +207,7 @@
       System.out.print(parse.getData().toString());
       if (dumpText) {
         LOG.info("---------\nParseText\n---------\n");
-        System.out.print(parse.getText());
+        System.out.print(translate.translate(parse.getText(), "fr"));
       }
stacktrace.log
Exception in thread "main" java.lang.Exception: [microsoft-translator-api] Error retrieving translation : Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0... ...
    at com.memetix.mst.MicrosoftTranslatorAPI.retrieveString(MicrosoftTranslatorAPI.java:202)
    at com.memetix.mst.translate.Translate.execute(Translate.java:61)
    at com.memetix.mst.translate.Translate.execute(Translate.java:76)
    at org.apache.tika.language.translate.MicrosoftTranslator.translate(MicrosoftTranslator.java:104)
    at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:210)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:228)
Caused by: java.io.IOException: Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0%BE%D1%80%D1%83%D0%B... ...
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1675)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1673)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1671)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1244)
    at com.memetix.mst.MicrosoftTranslatorAPI.retrieveResponse(MicrosoftTranslatorAPI.java:178)
    at com.memetix.mst.MicrosoftTranslatorAPI.retrieveString(MicrosoftTranslatorAPI.java:199)
    ... 6 more
Caused by: java.io.IOException: Server returned HTTP response code: 414 for URL: http://api.microsofttranslator.com/V2/Ajax.svc/Translate?&from=&to=fr&text=%D0%A4%D0%BE... ...
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
    at com.memetix.mst.MicrosoftTranslatorAPI.retrieveResponse(MicrosoftTranslatorAPI.java:177)
    ... 7 more
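One possible shape for the per-service limit requested above is a small helper that splits the input into pieces whose URL-encoded form stays within a budget. This is only a sketch, not existing Tika code: the class name UriBudgetChunker, its methods, and the maxEncoded parameter are hypothetical, and the actual limit for any given service is not known here and would have to be supplied per translator. The encoded form is measured rather than the raw length because each Cyrillic character expands to six characters once percent-encoded (for example %D0%A4 in the URL above).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Tika): splits text so that every chunk,
// once URL-encoded as UTF-8, stays within a caller-supplied budget.
final class UriBudgetChunker {

    // Length of s after application/x-www-form-urlencoded encoding in UTF-8.
    static int encodedLength(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").length();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // maxEncoded is an assumed per-service budget for the encoded text.
    static List<String> split(String text, int maxEncoded) {
        // One code point encodes to at most 12 characters (4 bytes x "%XX").
        if (maxEncoded < 12) {
            throw new IllegalArgumentException("budget must be at least 12");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentLen = 0;
        // Split after each whitespace character so words stay intact
        // and the chunks concatenate back to the original text.
        for (String token : text.split("(?<=\\s)")) {
            int tokenLen = encodedLength(token);
            if (tokenLen <= maxEncoded) {
                if (currentLen + tokenLen > maxEncoded) {
                    chunks.add(current.toString());
                    current.setLength(0);
                    currentLen = 0;
                }
                current.append(token);
                currentLen += tokenLen;
                continue;
            }
            // A single word larger than the budget: fall back to code points.
            for (int i = 0; i < token.length(); ) {
                int cp = token.codePointAt(i);
                String piece = new String(Character.toChars(cp));
                int pieceLen = encodedLength(piece);
                if (currentLen + pieceLen > maxEncoded) {
                    chunks.add(current.toString());
                    current.setLength(0);
                    currentLen = 0;
                }
                current.append(piece);
                currentLen += pieceLen;
                i += Character.charCount(cp);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

With something like this in place, the caller (or the translator implementation itself) could pass each chunk to translate.translate(chunk, "fr") and concatenate the results. Translating chunks independently may cost some context at chunk boundaries, so splitting on sentence boundaries instead of whitespace might be preferable in practice.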