|
Description: |
Hi everyone
This is a brief for a very simple Excel search tool and data cleaner, to be written in VB, Java, or anything else. Please view the attached PowerPoint first; it's only 8 slides and is very easy to understand.
I hope it interests you. I know it's a piece of cake, so realistic prices only, please!
------------------------------------
Here's the script from the PowerPoint, though it won't make much sense unless you also view the PowerPoint.
Each month I receive 25 separate reports, each one in Excel XLSX format. I gather all 25 spreadsheets into a single folder, screenshot below.
What's in the reports?
- Each of the 25 spreadsheets shows sales of books for a single category of publishing, e.g. one spreadsheet shows sales of psychology books and another shows sales of fiction books. All 25 spreadsheets always have the same header row.
What task must I accomplish?
- I have to write qualitative and quantitative reports that show the relative popularity of certain key words in each topic, for example gardening books vs cookery books.
 The data in the spreadsheets is not very “clean”. For example, you'd expect to find books about “cake-baking” only in the cookery spreadsheet, but sometimes you'll see books about cake-baking in the fiction spreadsheet, placed there by accident by the people who compile the spreadsheets.

Dirty data is bad, because if I want to be sure I've looked at every single cake-baking book I must look through 25 spreadsheets - and each spreadsheet has up to 5000 rows!
I need a simple search tool
- I'd like to be able to point my simple search tool at my folder of 25 xlsx spreadsheets, and then ask it to search that folder for one or more search terms, such as “cake-baking” + “wind-sailing”.

I'd also like to be able to search this way:

Find “cake-baking” and “barbeque foods” but ignore “Sea-food” (in case I wanted to ignore a book called “Cake-Baking with Sea-Food”).
What should the search results look like?
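The find-but-ignore rule above could be sketched roughly as follows. This is only an illustration under assumptions, not a finished tool: the function name row_matches is hypothetical, matching is treated as a case-insensitive substring test on each cell, and a row is skipped entirely whenever an ignored term appears anywhere in it.

```python
def row_matches(row, include_terms, exclude_terms=()):
    """Return True if the row contains any include term and no exclude term.

    Assumption: matching is a case-insensitive substring test on each cell.
    """
    cells = [str(cell).lower() for cell in row]

    def found(term):
        needle = term.lower()
        return any(needle in cell for cell in cells)

    # An ignored term anywhere in the row rules the whole row out.
    if any(found(term) for term in exclude_terms):
        return False
    return any(found(term) for term in include_terms)


include = ["cake-baking", "barbeque foods"]
exclude = ["Sea-food"]
print(row_matches(["Cake-Baking for Beginners", "Cookery"], include, exclude))  # True
print(row_matches(["Cake-Baking with Sea-Food", "Fiction"], include, exclude))  # False
```

Checking the ignored terms first matters because a search term can sit inside an ignored one, so a plain "contains" test on the search term alone would still pick the row up.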
- For every search term that is found, I would like the entire row of data containing the search term(s) copied into a brand new spreadsheet on my desktop, inserting the header row and preserving the original column order of data from the donor row.
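A rough sketch of that output step, under stated assumptions: it reads the .csv versions of the reports rather than the xlsx files, assumes UTF-8 text, and writes a .csv result as a stand-in for the spreadsheet. Writing real XLSX, and picking a particular tab, would need a third-party library such as openpyxl, which is not shown here. The name search_folder and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def search_folder(folder, include_terms, exclude_terms, out_path):
    """Copy every matching row from the CSV files in `folder` into one CSV.

    The header row is written once, taken from the first file read (the
    reports are assumed to share one header). Returns the number of data
    rows written.
    """
    include = [term.lower() for term in include_terms]
    exclude = [term.lower() for term in exclude_terms]
    out_path = Path(out_path)
    header_written = False
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in sorted(Path(folder).glob("*.csv")):
            if path.resolve() == out_path.resolve():
                continue  # never search the results file itself
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # empty file
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    cells = [cell.lower() for cell in row]
                    if any(term in cell for term in exclude for cell in cells):
                        continue  # row contains an ignored term
                    if any(term in cell for term in include for cell in cells):
                        writer.writerow(row)  # whole row, original column order
                        written += 1
    return written
```

It would be called with the folder of reports, the terms to find, the terms to ignore, and the path of the results file. Each matching row is copied unchanged, so the column order of the donor row is preserved.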

Things you should know
The xlsx spreadsheets I work with are always given to me with two tabs in each spreadsheet.

One tab contains the sales data I've shown you in slide 3; the other tab contains irrelevant data that should not be included in any searches.

 - The irrelevant tab is always called “Totals_2”. Please don't include it in the search.
 - The relevant tab in every spreadsheet is always called “Chart_1”
Last things
- I work on Macs and use MS Office 2008 for Mac. I would really, really like to use this on a Mac and not on Windows.
I do have an HP Laptop running Windows Vista and MS Office.
I will accept a Java programme that runs on either.
------------------------------------
I know this is a simple task for a good VB or Java programmer.
Thanks for reading, and good luck!
Additional Info (Added 2/9/2009 at 6:16 EST)...
Attached file: == 09 Politics and Gov-1.csv
File info: Sample files No.1
Additional Info (Added 2/9/2009 at 6:18 EST)...
Attached file: == Biography 01.csv
File info: Sample files No.2
Additional Info (Added 2/9/2009 at 6:36 EST)...
The spreadsheets to be analysed are also available in .csv format, not just xlsx format. Perhaps this makes things easier?
I've attached samples of .csv and .xlsx data to the project message board. One spreadsheet shows sales of politics books, the other shows sales of biography books. I will test your programme by searching through both of these example sheets for two separate terms while disregarding a third term, as detailed below:
Search for "Obama"
Search for "Kabul"
Disregard "Obamaland"
(Case insensitive)
I need the search results to be output to a *spreadsheet* because I have to do statistical analysis - it's no good if the search results are output to MS Word or Notepad or anything like that. It has to be Excel, very preferably XLSX format.
Additional Info (Added 2/9/2009 at 6:42 EST)...
Attached file: == 09 Politics and Gov-1.xlsx
Additional Info (Added 2/9/2009 at 6:42 EST)...
Attached file: == Biography 01.xlsx
|