Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
What happens:
If no RobotRuleSet is in the cache for a host, we try to fetch the robots.txt.
If the HTTP response code is neither 200 nor 403 (for example, 404), we do "robotRules = EMPTY_RULES;" (line: 402).
EMPTY_RULES is a RobotRuleSet created with the default constructor.
With that constructor, tmpEntries and entries are both null and are never changed afterwards.
If we now try to fetch a page from that host, EMPTY_RULES is used and we call isAllowed on the RobotRuleSet.
In this case an NPE is thrown here, because tmpEntries is null when tmpEntries.size() is called:
if (entries == null) {
entries= new RobotsEntry[tmpEntries.size()];
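The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual Nutch source: the class name BrokenRuleSetSketch, the addPrefix method, and the use of plain String prefixes instead of RobotsEntry are all assumptions made for illustration. It only mirrors the lazy-initialization pattern described above.

```java
import java.util.ArrayList;

// Hypothetical, simplified stand-in for RobotRuleSet (not the real Nutch class).
class BrokenRuleSetSketch {
    private ArrayList<String> tmpEntries; // stays null until the first rule is added
    private String[] entries;             // built lazily from tmpEntries

    void addPrefix(String prefix) {
        if (tmpEntries == null) {
            tmpEntries = new ArrayList<>();
        }
        tmpEntries.add(prefix);
        entries = null; // force a rebuild on the next lookup
    }

    boolean isAllowed(String path) {
        if (entries == null) {
            // If no rule was ever added, tmpEntries is still null here,
            // so tmpEntries.size() throws a NullPointerException.
            entries = new String[tmpEntries.size()];
            entries = tmpEntries.toArray(entries);
        }
        for (String disallowed : entries) {
            if (path.startsWith(disallowed)) {
                return false;
            }
        }
        return true;
    }
}
```

Calling isAllowed on a freshly constructed instance, which is the situation EMPTY_RULES is in, hits the null dereference on the first lookup.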
Possible solution:
We can initialize tmpEntries by default and also remove the other null checks and lazy initializations of tmpEntries.
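One way the proposed fix might look is sketched below. This is an illustration under assumptions, not the committed patch: the class name RuleSetSketch, the addPrefix method, and the String-based entries are hypothetical simplifications of RobotRuleSet and RobotsEntry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for RobotRuleSet (not the real Nutch class).
class RuleSetSketch {
    // Initialized eagerly, so an empty rule set is a valid state
    // and no null checks on tmpEntries are needed anywhere.
    private final List<String> tmpEntries = new ArrayList<>();
    private String[] entries; // cached array view, rebuilt on demand

    void addPrefix(String prefix) {
        tmpEntries.add(prefix);
        entries = null; // invalidate the cached array
    }

    boolean isAllowed(String path) {
        if (entries == null) {
            // Safe even when no rules exist: yields an empty array.
            entries = tmpEntries.toArray(new String[0]);
        }
        for (String disallowed : entries) {
            if (path.startsWith(disallowed)) {
                return false;
            }
        }
        return true; // no matching rule, so everything is allowed
    }
}
```

With this shape, an instance created by the default constructor simply allows every path, which matches the intent of EMPTY_RULES.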