Word Frequency Analyzer3 - From Web Pages

Last edited by Dorai Thodla 13 years, 2 months ago

Goal: Given a web page address (URL), perform a word frequency analysis on its content and print the results. Reuse the modules/packages developed in Word Frequency Analyzer2.



  1. Accept the following parameters: URL, noise file, output file
  2. Read the page at the URL
  3. Parse the page, remove the tags, and write all the text into a temporary file
  4. Invoke Word Frequency Analyzer2 with the temporary file, noise file, and output file
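
Steps 2 and 3 above could be sketched in Python roughly as follows. This is only one possible approach, using the standard library's urllib.request, html.parser, and tempfile modules. The names TextExtractor, extract_text, fetch_page, write_temp_file, and page_to_temp_file are made up for this illustration, and skipping the contents of script and style tags is an assumption that the project description does not require.

```python
import tempfile
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}  # assumption: their contents are not page text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Step 3a: remove the tags and return the remaining text, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_page(url):
    """Step 2: read the page at the URL and decode it to a string."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def write_temp_file(text):
    """Step 3b: write the text into a temporary file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(text)
        return tmp.name


def page_to_temp_file(url):
    """Steps 2 and 3 together; the returned path is the input for step 4."""
    return write_temp_file(extract_text(fetch_page(url)))
```

The path returned by page_to_temp_file would then be handed to Word Frequency Analyzer2 together with the noise file and output file. How that call looks depends on the interface you built in the previous project, so it is left out here.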


The skills you need for this project


  • Parsing HTML and extracting page content
  • Using dictionaries/hashes/maps to store noise words and count the frequency of occurrence of input words
  • Sorting
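
As a rough illustration of the last two skills, the sketch below stores noise words in a set, counts the remaining words in a dictionary, and sorts the result by frequency. The function names, the one-word-per-line noise file format, and the rule for what counts as a word (lowercase letters with an optional apostrophe part) are all assumptions made for this example. Your Word Frequency Analyzer2 may do these things differently.

```python
import re


def load_noise_words(lines):
    """Build a set of noise words from an iterable of lines (one word per line)."""
    return {line.strip().lower() for line in lines if line.strip()}


def count_words(text, noise_words):
    """Count each word in text that is not a noise word."""
    counts = {}
    # Assumed word rule: letters, optionally followed by an apostrophe part.
    for word in re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()):
        if word in noise_words:
            continue
        counts[word] = counts.get(word, 0) + 1
    return counts


def sort_by_frequency(counts):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


noise = load_noise_words(["the\n", "a\n", "\n"])
counts = count_words("The cat and the hat. A cat!", noise)
for word, count in sort_by_frequency(counts):
    print(word, count)
```

A set makes the noise word lookup fast, and sorting on the pair (-count, word) gives a stable, predictable order when several words occur equally often.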
