Description
Although our DefaultHTTPClient uses a "PoolingHttpClientConnectionManager", we cannot take advantage of it through parallelism, because the getActualDocumentIRI(), getContentType(), and getContentLength() methods are defined on the HTTP client itself rather than on a response object. By the time they are called, their values may already have been overwritten by a different request made through the same client. As a result, there is no safe way to execute calls in parallel using a single HTTP client.
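To illustrate the kind of design that would avoid this, here is a minimal sketch in which the per-request metadata travels with an immutable result object instead of living on the client. The names FetchResult, StatelessClient, and fetchAll are hypothetical and are not part of Any23's API, and fetch() is a stand-in that makes no network call. It shows the shape of the idea and is not a proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ResponseObjectSketch {

    /** Immutable per-request metadata (hypothetical type, not an Any23 class). */
    static final class FetchResult {
        final String actualDocumentIRI;
        final String contentType;
        final long contentLength;

        FetchResult(String actualDocumentIRI, String contentType, long contentLength) {
            this.actualDocumentIRI = actualDocumentIRI;
            this.contentType = contentType;
            this.contentLength = contentLength;
        }
    }

    /** Hypothetical client that keeps no response state in its own fields. */
    static final class StatelessClient {
        FetchResult fetch(String iri) {
            // Stand-in for a real HTTP request: the metadata is derived from
            // the IRI so the example runs without network access.
            return new FetchResult(iri, "text/html", iri.length());
        }
    }

    /** Fetches all IRIs concurrently through ONE shared client instance. */
    static List<FetchResult> fetchAll(List<String> iris) throws Exception {
        final StatelessClient client = new StatelessClient();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<FetchResult>> futures = new ArrayList<>();
            for (final String iri : iris) {
                Callable<FetchResult> task = () -> client.fetch(iri);
                futures.add(pool.submit(task));
            }
            // Results are collected in submission order; each one carries
            // the metadata of its own request only.
            List<FetchResult> results = new ArrayList<>();
            for (Future<FetchResult> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> iris = Arrays.asList(
                "http://example.org/a", "http://example.org/b", "http://example.org/c");
        for (FetchResult result : fetchAll(iris)) {
            System.out.println(result.actualDocumentIRI + " " + result.contentType
                    + " " + result.contentLength);
        }
    }
}
```

Because each FetchResult is created inside fetch() and never shared or mutated, a concurrent request cannot change the document IRI, content type, or content length that a caller sees for its own request.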
Background: I ran into this problem while trying to parallelize the online microdata tests (cf. ANY23-67) for speed, using a single Any23 instance to extract from multiple pages simultaneously. The tests usually passed, but they failed sporadically because the document IRI did not match the page the triples were extracted from. I had to work around this by using a different Any23 instance (and thus a different HTTP client) for every single request.