Word Frequency Analyzer3 - From Web Pages

Page history last edited by Dorai Thodla 16 years, 2 months ago

Goal: Given a web page address (URL), perform a word frequency analysis on the page content and print the results. Reuse the modules/packages developed in Word Frequency Analyzer-2.


 

 

  1. Accept the following parameters: URL, noise file, output file
  2. Read the page at the URL
  3. Parse the page, remove the tags, and write all the text into a temporary file
  4. Invoke Word Frequency Analyzer2 with the temporary file, noise file, and output file
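One possible way to approach steps 2 and 3 is sketched below in Python, using only the standard library. The names `TextExtractor`, `extract_text`, `fetch_page`, and `page_to_temp_file` are hypothetical and chosen for illustration; the sketch also assumes the page can be decoded as UTF-8. Step 4 is left out because the interface of Word Frequency Analyzer2 is not described on this page.

```python
import tempfile
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Step 3 (first half): remove the tags and return the remaining text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_page(url):
    """Step 2: read the page. Assumption: the content is UTF-8 text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def page_to_temp_file(url):
    """Step 3 (second half): write the text to a temporary file, return its path."""
    text = extract_text(fetch_page(url))
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(text)
        return tmp.name
```

The path returned by `page_to_temp_file` could then be passed, together with the noise file and output file, to whatever entry point Word Frequency Analyzer2 exposes.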

 

The skills you need for this project

 

  • Parsing HTML and extracting page content
  • Using dictionaries/hashes/maps to store noise words and count the frequency of occurrence of input words
  • Sorting
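As a rough illustration of the dictionary and sorting skills listed above, the Python sketch below counts words while skipping noise words, then sorts the result by descending frequency. The function names `count_words` and `sorted_frequencies` are hypothetical, and treating a word as a run of ASCII letters is a simplifying assumption; this is not the Word Frequency Analyzer2 code itself.

```python
import re
from collections import Counter


def count_words(text, noise_words):
    """Count each word in text, ignoring any word found in noise_words."""
    # Assumption: a "word" is a run of ASCII letters, compared in lowercase.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word not in noise_words)


def sorted_frequencies(counts):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


noise = {"the", "and"}  # a set gives fast membership tests
counts = count_words("The cat and the hat. The cat sat.", noise)
for word, count in sorted_frequencies(counts):
    print(word, count)
```

Storing the noise words in a set and the counts in a dictionary-like `Counter` keeps both the lookup and the update constant time on average, so the sort at the end dominates the cost.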
