Selected publications

Automated extraction of structured information from a variety of web pages. Academic article, 2018.