Details
Type: Bug
Status: Closed
Priority: Trivial
Resolution: Not A Bug
Affects Version/s: 2.4
None
None
Environment: Ubuntu MATE
Description
Nutch v2.4 does not crawl the whole HTML page. It crawls everything before the input tag named javax.faces.viewstate, but it cannot get past that tag, whose value contains a lot of special characters.
This page has several tabs. The crawler currently fetches information up to the date (Date Published: 06/30/2020 09:00 PM). After that it fails to fetch anything from the title, Assembly Bill No. 103, onward.
I am crawling this site: http://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=201920200AB103
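One way to narrow this down is to check whether the raw HTML that was fetched still contains the missing text after the ViewState tag, which would suggest the content is lost at parse time rather than at fetch time. The following is only a rough sketch in plain Python, not part of Nutch: the names ViewStateLocator and text_after_viewstate are hypothetical, and it assumes the fetched page is available as a string.

```python
from html.parser import HTMLParser


class ViewStateLocator(HTMLParser):
    """Hypothetical helper: records where the ViewState <input> tag starts."""

    def __init__(self):
        super().__init__()
        self.position = None  # (line, column) of the tag, or None if not seen

    def handle_starttag(self, tag, attrs):
        # Attribute values may be None for bare attributes, hence the "or".
        name = dict(attrs).get("name") or ""
        if (
            self.position is None
            and tag == "input"
            and name.lower() == "javax.faces.viewstate"
        ):
            # getpos() reports a 1-based line and 0-based column of the tag.
            self.position = self.getpos()


def text_after_viewstate(html, marker):
    """Return True if `marker` appears in `html` after the ViewState input tag."""
    locator = ViewStateLocator()
    locator.feed(html)
    if locator.position is None:
        return False
    line, col = locator.position
    # Convert (line, column) into a character offset; HTMLParser counts "\n".
    preceding = html.split("\n")[: line - 1]
    offset = sum(len(text) + 1 for text in preceding) + col
    return marker in html[offset:]
```

For example, calling text_after_viewstate(raw_html, "Assembly Bill No. 103") on the fetched content would show whether the title made it into the stored page at all.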
This is the output I get after crawling: