|
Description: |
Hi,
I need someone to build an email crawler for the Swedish yellow pages. It should be written in PHP, Visual C++, C#, Delphi, Java, or Python.
http://gulasidorna.eniro.se/
The program/script should work like this:
1. First I enter a "Rubrik" (title) and an "Område" (area), then click Crawl.
2. All search hits should be crawled and the crawled data saved to an XML file.
3. The emails should be in text format. On Eniro they are shown as images, so they need to be converted back to text, either with some sort of captcha-solving algorithm or with a better method if you can find one.
This is how the output should look:
<com>
<name>The company name</name>
<phone>The company phone number</phone>
<email>The company email</email>
</com>
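For reference, here is a minimal Python sketch of how that output could be written, using only the standard library. It is an illustration, not a required design. The function names `build_xml` and `save_xml` and the `<companies>` root element are my own assumptions: a well-formed XML file needs a single root, so the `<com>` entries are wrapped in one here. The crawling and the image-to-text step are not covered.

```python
import xml.etree.ElementTree as ET


def build_xml(companies):
    """Return an XML string with one <com> element per company dict."""
    # Assumption: "companies" is a hypothetical root element, added only
    # because an XML document must have exactly one root.
    root = ET.Element("companies")
    for company in companies:
        com = ET.SubElement(root, "com")
        for field in ("name", "phone", "email"):
            # Missing fields become empty elements; special characters
            # such as "&" are escaped by ElementTree on serialization.
            ET.SubElement(com, field).text = company.get(field, "")
    return ET.tostring(root, encoding="unicode")


def save_xml(companies, path):
    """Write the crawled companies to an XML file at the given path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="utf-8"?>\n')
        f.write(build_xml(companies))
```

Calling `save_xml(results, "output.xml")` with a list of dicts keyed by `name`, `phone`, and `email` would produce one `<com>` block per search hit in the format shown above.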
Thanks,
Filip
|